How to Design, Build and Publish a Python package?

Last Updated on July 17, 2023 by Editorial Team

Author(s): Prithivee Ramalingam

Originally published on Towards AI.

With Poetry

What is a Package?

In Python, a package organizes related modules (Python files) into a single hierarchical structure. Packages allow you to easily manage and reuse code across multiple projects and make it easier to distribute and install your code for others.

A Python package is simply a directory that contains one or more Python modules, along with a special __init__.py file that tells Python that this directory should be treated as a package. Packages can also contain sub-packages, which are simply nested directories with their own __init__.py files. This creates a hierarchical structure of packages and sub-packages.
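
To make the directory layout concrete, here is a minimal sketch that builds a throwaway package with one sub-package in a temporary directory and then imports through the hierarchy. The names shapes, flat, and square_area are hypothetical and chosen only for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build this layout in a temporary directory (all names are hypothetical):
#   shapes/
#       __init__.py
#       flat/
#           __init__.py
#           square.py
root = Path(tempfile.mkdtemp())
sub_package = root / "shapes" / "flat"
sub_package.mkdir(parents=True)
(root / "shapes" / "__init__.py").write_text("")
(sub_package / "__init__.py").write_text("")
(sub_package / "square.py").write_text(
    "def square_area(side):\n    return side * side\n"
)

# Make the temporary directory importable, then import through the hierarchy.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
square = importlib.import_module("shapes.flat.square")

print(square.square_area(3))  # prints 9
```

The dotted name shapes.flat.square mirrors the nesting of the directories on disk.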

Image Source — Python Package Index — Wikipedia

What is PyPI?

PyPI (Python Package Index) is the official repository for Python packages. It is a large collection of software packages written in Python that are available for installation and use by Python developers. PyPI serves as a central hub where developers can publish their Python packages, making them easily discoverable and accessible to the wider Python community.

PyPI provides a platform for developers to share their code with others, facilitating the distribution, installation, and versioning of Python packages. It allows developers to publish their packages so that others can easily install them in their own projects using package managers like pip (Python’s package installer).

pip install <package_name>

Different ways to build a package

There are three common ways to build a package in Python.

  1. Using setuptools and setup.py
  2. Using Poetry
  3. Using Cookiecutter

The choice of method depends on our specific needs, the complexity of our package, and our preferred development workflow. In this article, we will build and publish the package with Poetry.

Poetry is a tool for managing Python packages and dependencies. It provides a comprehensive solution for packaging, dependency resolution, virtual environments, and project management. With Poetry, you can easily create, manage, and distribute Python packages.

In this article, we will be discussing the whole lifecycle of a Python package. We will see how to design a package, create a skeletal structure to build on top of it and finally publish the package to the open-source community. Additionally, we will also be looking at testing and versioning the created packages.

Synopsis

1. Poetry Installation

2. Creating a repository on GitHub

3. The Code

4. Create project structure using Poetry

5. Preparing for build

6. Testing the package

7. Building the package

8. Publishing the package

9. Testing the published package

10. Creating a new version of our package

1. Poetry Installation

You can find the Poetry installation guide here. As I have a Windows machine, I used the PowerShell command below.

(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -

After installation, we need to add Poetry to the path variable.

Image by Author — Poetry Installation

2. Creating a repository on GitHub

Version control is a very important aspect of building a package. We can use GitHub to host our code, tests, and documentation and version them accordingly. In addition, we need to run automated tests after building the package, and GitHub Actions is a capable tool for that.

Create a repository with a README.md file and license. After the repository is created, we need to clone it to our local system.

git clone https://github.com/Prithivee7/periodic-element-properties.git

Image By Author — Creating a new repository

3. The Code

Now that we have created the repository, it is time to go through the code that we will be packaging over the course of the article. We are going to build a simple package that will provide us with the properties of elements in the periodic table. The data for this package has been taken from the Periodic Table with 28 Features dataset on Kaggle.

For the sake of brevity, I have written only 3 functions, but a real package will have many more. You can find the code here.
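
As a rough idea of what such a module could look like, here is a minimal sketch of an elements.py file. This is not the author's actual code: the class name Functions comes from the article, but the three method names and the tiny in-memory table (standing in for the Kaggle dataset) are assumptions made for illustration.

```python
# elements.py (illustrative sketch, not the author's actual code)


class Functions:
    """Look up properties of chemical elements by symbol."""

    # A tiny stand-in for the Kaggle dataset; a real package would load
    # all elements and many more properties.
    _ELEMENTS = {
        "H": {"name": "Hydrogen", "atomic_number": 1, "atomic_mass": 1.008},
        "He": {"name": "Helium", "atomic_number": 2, "atomic_mass": 4.0026},
        "O": {"name": "Oxygen", "atomic_number": 8, "atomic_mass": 15.999},
    }

    def get_name(self, symbol):
        return self._lookup(symbol)["name"]

    def get_atomic_number(self, symbol):
        return self._lookup(symbol)["atomic_number"]

    def get_atomic_mass(self, symbol):
        return self._lookup(symbol)["atomic_mass"]

    def _lookup(self, symbol):
        # Fail with a clear message instead of a bare KeyError.
        try:
            return self._ELEMENTS[symbol]
        except KeyError:
            raise ValueError(f"Unknown element symbol: {symbol!r}") from None


print(Functions().get_name("O"))  # prints Oxygen
```

Keeping the lookups behind one small class gives the package a single, obvious entry point to export later.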

4. Create a project structure using Poetry

Poetry provides a basic skeletal structure for our project, and we can build on top of that. Go inside the cloned directory and open the command prompt. In the command prompt, type

poetry new <your_repository_name>

Once the command is executed, we will get to see the following folder structure.

Image by Author — Folder structure generated by Poetry

A new folder will be created inside our current folder with the name of the repository. It will have a pyproject.toml file and a README.md file at the root level. A new package will be created with an __init__.py file, and a separate folder will be created for writing unit tests.

The pyproject.toml file is a configuration file used by various tools in the Python ecosystem, including Poetry, to define project-specific settings and metadata. It is a human-readable file and we can use this file as a replacement for the setup.py file and requirements.txt file.

The pyproject.toml file has all the basic information about our package: the name, version, description, and author. Additionally, runtime dependencies are listed in the [tool.poetry.dependencies] section, and development-only dependencies in the [tool.poetry.dev-dependencies] section. The following is the basic structure of the pyproject.toml file.

[tool.poetry]
name = "periodic-element-properties"
version = "0.0.1"
description = ""
authors = ["Prithivee7 <[email protected]>"]
readme = "README.md"
packages = [{include = "periodic_element_properties"}]

[tool.poetry.dependencies]
python = "^3.7"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
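
Because pyproject.toml is plain TOML, any tool can read it. The sketch below parses a copy of the file above (with the authors line left out) using tomllib, which is in the standard library from Python 3.11; the fallback to the third-party tomli backport on older versions is an assumption about your environment.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: the third-party backport is installed

PYPROJECT = """
[tool.poetry]
name = "periodic-element-properties"
version = "0.0.1"
description = ""
readme = "README.md"
packages = [{include = "periodic_element_properties"}]

[tool.poetry.dependencies]
python = "^3.7"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
"""

config = tomllib.loads(PYPROJECT)
project = config["tool"]["poetry"]

print(project["name"], project["version"])
# "^3.7" is a caret constraint: any Python version >= 3.7 and < 4.0.
print(project["dependencies"]["python"])
# The build backend is what pip calls when it builds the package from source.
print(config["build-system"]["build-backend"])
```

Note that the distribution name uses hyphens while the importable package listed under packages uses underscores.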

5. Preparing for the build

We now add the files we created earlier to the package directory that Poetry generated, in this case periodic_element_properties. After adding the files, we need to update the __init__.py file so that the code inside the directory can be imported directly from the package. The “from” keyword is followed by the module path to import from: inside the “periodic_element_properties” package there is a file called elements.py, and from elements.py we import the class “Functions.”

__version__ = '0.2.0'
from periodic_element_properties.elements import Functions
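
The effect of that import line can be checked with a small experiment. The sketch below writes a throwaway package to a temporary directory with the same two-line __init__.py pattern; the package name demo_elements and the ping method are hypothetical stand-ins.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
package_dir = root / "demo_elements"  # hypothetical package name
package_dir.mkdir()

# The module that holds the real code.
(package_dir / "elements.py").write_text(
    "class Functions:\n"
    "    def ping(self):\n"
    "        return 'ok'\n"
)
# __init__.py re-exports the class, following the two-line pattern above.
(package_dir / "__init__.py").write_text(
    "__version__ = '0.2.0'\n"
    "from demo_elements.elements import Functions\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo_elements = importlib.import_module("demo_elements")

# Users no longer need to know that the class lives in elements.py.
print(demo_elements.Functions().ping())  # prints ok
print(demo_elements.__version__)         # prints 0.2.0
```

Without the import in __init__.py, users would have to write the longer path that includes the elements module.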

Once the logic is ready, we need an isolated environment in which to check for dependency issues. A dedicated environment spares us the pain of version conflicts with packages installed elsewhere on the system. Poetry comes in handy here: it can create the environment, and we then install the required dependencies into it to test the code locally. The following commands need to be executed in the command prompt to create the environment and install the dependencies.

# Activate the project's virtual environment (created if it does not exist)
poetry shell

# Install all the required dependencies
pip install pandas pytest

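The final step before building is updating the pyproject.toml file. We need to update the version of the package and add the dependencies here. Our package requires pandas at runtime, so it belongs in the [tool.poetry.dependencies] section; pytest is needed only for development, so it goes in the [tool.poetry.dev-dependencies] section (newer Poetry releases name this section [tool.poetry.group.dev.dependencies]). Running poetry add pandas instead of pip install records the dependency in pyproject.toml automatically. We can also specify the supported versions of Python based on our library’s compatibility.

For maintaining clean code, we can use libraries like Radon, Vulture, and Black.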

6. Testing the package

Once we are done with the building part, like any responsible developer, we move on to testing. We have used the Pytest library for executing our test cases. The testing logic will go inside the “tests” folder that Poetry created for us. We can define any number of test cases according to our requirements and execute them using pytest. To install pytest, you can use the below command.

pip install pytest

We need to go inside the folder that Poetry created for us and execute “pytest -v” in the command prompt. This command runs all the tests inside our folder, and we should publish the package only when all of them have passed. To run only the tests whose names match a given function, we can run “pytest -k <function_name>”.
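
For reference, a test file in the “tests” folder could look like the sketch below. It is an assumption-laden example: the stand-in Functions class and the get_atomic_number method are hypothetical, included only so the file runs on its own. pytest collects every function whose name starts with test_.

```python
# tests/test_elements.py (illustrative sketch)

# Stand-in for the packaged code so this file runs on its own. In the real
# project you would instead write:
#     from periodic_element_properties import Functions
# The method name get_atomic_number is hypothetical.
class Functions:
    _ATOMIC_NUMBERS = {"H": 1, "He": 2, "O": 8}

    def get_atomic_number(self, symbol):
        if symbol not in self._ATOMIC_NUMBERS:
            raise ValueError(f"Unknown element symbol: {symbol!r}")
        return self._ATOMIC_NUMBERS[symbol]


def test_atomic_number_of_known_element():
    assert Functions().get_atomic_number("O") == 8


def test_unknown_symbol_raises_value_error():
    try:
        Functions().get_atomic_number("Xx")
    except ValueError:
        pass  # expected: the symbol is not in the table
    else:
        raise AssertionError("expected ValueError for an unknown symbol")
```

With this file in place, “pytest -v” would report both tests, and “pytest -k unknown_symbol” would run only the second one.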

Image by Author — Testing the package using pytest

7. Building the package

After all our tests have run successfully, we can execute the “poetry check” command. If we get “all set” from this command, we can move on to the building phase.

Image by Author — Executing the check command to get go ahead for the build

The “poetry build” command builds a distributable package from your project. It creates distribution files that can be easily shared, installed, and published.

Image by Author — Performing the build

When you run poetry build, it performs the following tasks:

1. Collects Source Code: Poetry gathers your project’s modules, packages, and any other files listed for inclusion. Pure Python code is not compiled; it is copied into the distribution as-is.

2. Records Dependencies: Poetry writes the dependencies specified in the pyproject.toml file into the package metadata. The dependencies themselves are not bundled; pip reads the metadata and installs them alongside the package.

3. Generates Distribution Files: Poetry generates distribution files in different formats, such as source distributions (sdist) and wheels (bdist_wheel). These files contain the packaged code, metadata, and other necessary files required to install and use the package.

4. Stores Distribution Files: The generated distribution files are stored in the dist directory within your project’s root directory. Each distribution file has a specific naming convention based on the project name and version.

After running poetry build, you can find the generated distribution files in the dist directory. These files can be shared, distributed, or published to package indexes like PyPI for others to install and use your package.

For example, if you run poetry build for a project named “my-package” with version “0.1.0”, you might get distribution files like my-package-0.1.0.tar.gz (source distribution) and my_package-0.1.0-py3-none-any.whl (wheel distribution) in the dist directory.
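
A wheel is an ordinary zip archive, so its contents can be inspected with the standard library. The sketch below reads the Requires-Dist lines from a wheel's METADATA file, which is where a package's dependencies are recorded. To stay runnable without a real build, it first writes a small stand-in archive; the helper name wheel_requirements and the stand-in contents are assumptions for illustration.

```python
import tempfile
import zipfile
from pathlib import Path


def wheel_requirements(wheel_path):
    """Return the Requires-Dist lines recorded in a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_path) as wheel:
        metadata_name = next(
            name for name in wheel.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        metadata = wheel.read(metadata_name).decode("utf-8")
    return [
        line.split(":", 1)[1].strip()
        for line in metadata.splitlines()
        if line.startswith("Requires-Dist:")
    ]


# Stand-in wheel so the sketch runs without a real build; with a real project
# you would point wheel_requirements() at the .whl file inside dist/.
wheel_path = Path(tempfile.mkdtemp()) / "my_package-0.1.0-py3-none-any.whl"
with zipfile.ZipFile(wheel_path, "w") as wheel:
    wheel.writestr("my_package/__init__.py", "")
    wheel.writestr(
        "my_package-0.1.0.dist-info/METADATA",
        "Metadata-Version: 2.1\nName: my-package\nVersion: 0.1.0\n"
        "Requires-Dist: pandas\n",
    )

print(wheel_requirements(wheel_path))  # prints ['pandas']
```

The archive holds only the package's own files plus metadata; pandas is named as a requirement, not shipped inside the wheel.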

After the build is complete, we will push the source code to our repository. It is not a good practice to push the dist folder to GitHub. So we can mention that in our .gitignore file.

Image by Author — Looking inside the dist folder

8. Publishing the package

Publishing is the final step in the lifecycle of the package. We will be publishing our package in PyPI, which is the official repository for Python packages. We need to register an account with PyPI to publish all our packages. After this, we need to add an API token for authentication. After we have added the API token, we need to configure it with the following command.

poetry config pypi-token.pypi <enter_your_generated_pypi_token_here>

Once we have configured it, we can finally execute the “poetry publish” command. This will pick up the dist files generated from the “poetry build” command and publish them to PyPI.

Image by Author — Publishing the package

Congratulations. We have designed, built, and published our very own Python package. The package we have created is now available for public use, and people can install it and use it. The information written in our README.md file will be shown on the package’s PyPI project page.

Image by Author — Homepage of the published package in PyPI

9. Testing the published package

After publishing the package, the last step is to test it. We can install the package in our environment by running the command below.

pip install <package_name>

During installation, we must make sure that we have the required Python version installed in our system. If there is an incompatibility between the Python version installed and what is demanded by the package, we won’t be able to install the package.

It is always advisable to create a virtual environment when testing our new package, so that we don’t have to wrestle with multiple versions of the package. After installation, we can test it by importing the library and calling the functions to check whether it performs as expected.
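
A quick sanity check after installation can be scripted. The sketch below compares the running interpreter with the minimum Python version from pyproject.toml and asks importlib.metadata which version of the distribution is installed. The helper name installed_version is hypothetical, and importlib.metadata itself needs Python 3.8 or newer.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 7)  # matches python = "^3.7" in pyproject.toml


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


if sys.version_info < MIN_PYTHON:
    print("This interpreter is older than the package requires.")

version = installed_version("periodic-element-properties")
if version is None:
    print("Not installed in this environment; run pip install first.")
else:
    print(f"Installed version: {version}")
```

Running this inside the fresh virtual environment confirms that the version pip fetched is the one we just published.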

Image by Author — Testing the published package

10. Creating a new version of our package

Once we have published the package, we might identify some bugs or want to include some new functionality. We don’t have to create a new package for that; we can create a new version of the existing package and publish it. The steps are similar to what we did earlier. The first step is to update the code base for the new version. Then we need to update the version in the pyproject.toml file, because PyPI does not let us upload the same distribution files for a version that has already been published.

After we are satisfied with our test results, we run the “poetry build” and “poetry publish” commands. When installing, pip picks the latest version by default.
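
Choosing the next version number usually follows the MAJOR.MINOR.PATCH pattern. The function below is a minimal sketch of that bump logic, assuming a plain three-part version string; bump_version is a hypothetical helper, not part of Poetry. On the command line, Poetry offers the same idea through commands such as poetry version patch, which also rewrites pyproject.toml.

```python
def bump_version(version, part="patch"):
    """Return the next MAJOR.MINOR.PATCH version after bumping one part."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown part: {part!r}")


print(bump_version("0.0.1"))           # prints 0.0.2
print(bump_version("0.0.1", "minor"))  # prints 0.1.0
print(bump_version("0.0.1", "major"))  # prints 1.0.0
```

A bug fix would typically be a patch bump, while new backward-compatible functionality would be a minor bump.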

Conclusion:

In this article, we saw the whole lifecycle of designing, building, and publishing a Python package using Poetry. Poetry performs dependency management, packaging, publishing, maintaining project structure, handling dependency constraints, and versioning.

This article is the first of the 3 parts of my Python Package series. The second part will be talking about the documentation of the Python package, and the final part will focus on automating the process with GitHub Actions. Please feel free to connect with me on LinkedIn in case of queries or feedback. Cheers.

Reference

Python Package Development Tutorial (Design, Build and Publish), YouTube

Want to Connect?

If you have enjoyed this article, please follow me here on Medium for more stories about machine learning and computer science.

LinkedIn: Prithivee Ramalingam | LinkedIn


Published via Towards AI
