How to Design, Build and Publish a Python package?
Last Updated on July 17, 2023 by Editorial Team
Author(s): Prithivee Ramalingam
Originally published on Towards AI.
With Poetry
What is a Package?
In Python, a package organizes related modules (Python files) into a single hierarchical structure. Packages allow you to easily manage and reuse code across multiple projects and make it easier to distribute and install your code for others.
A Python package is simply a directory that contains one or more Python modules, along with a special __init__.py file that tells Python that this directory should be treated as a package. Packages can also contain sub-packages, which are simply nested directories with their own __init__.py files. This creates a hierarchical structure of packages and sub-packages.
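To make the directory-plus-`__init__.py` idea concrete, here is a minimal sketch that builds a throwaway package with one sub-package on disk and then imports from it. The names `shapes`, `flat`, and `circle` are made up purely for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package in a temporary directory (all names are hypothetical):
#
#   shapes/
#       __init__.py
#       flat/
#           __init__.py
#           circle.py
root = Path(tempfile.mkdtemp())
subpackage = root / "shapes" / "flat"
subpackage.mkdir(parents=True)
(root / "shapes" / "__init__.py").write_text("")
(subpackage / "__init__.py").write_text("")
(subpackage / "circle.py").write_text(
    "def area(radius):\n"
    "    return 3.14159 * radius ** 2\n"
)

# Make the temporary directory importable, then follow the hierarchy with dots.
sys.path.insert(0, str(root))
circle = importlib.import_module("shapes.flat.circle")

print(circle.__name__)  # shapes.flat.circle
print(circle.area(2))
```

The dotted name mirrors the folder layout: each dot steps one directory deeper, and the last part is the module file.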
What is PyPI?
PyPI (Python Package Index) is the official repository for Python packages. It is a large collection of software packages written in Python that are available for installation and use by Python developers. PyPI serves as a central hub where developers can publish their Python packages, making them easily discoverable and accessible to the wider Python community.
PyPI provides a platform for developers to share their code with others, facilitating the distribution, installation, and versioning of Python packages. Once a package is published, others can install it in their own projects using a package manager such as pip (Python's package installer).
pip install <package_name>
Different ways to build a package
There are three common ways to build a package in Python.
- Using setuptools and setup.py
- Using Poetry
- Using Cookiecutter
The choice of method depends on our specific needs, the complexity of our package, and our preferred development workflow. In this article, we will build and publish the package with Poetry.
Poetry is a tool for managing Python packages and dependencies. It provides a comprehensive solution for packaging, dependency resolution, virtual environments, and project management. With Poetry, you can easily create, manage, and distribute Python packages.
In this article, we will walk through the whole lifecycle of a Python package. We will see how to design a package, create a skeleton structure to build on, and finally publish the package to the open-source community. We will also look at testing and versioning the package.
Synopsis
1. Poetry Installation
2. Creating a repository on GitHub
3. The Code
4. Create a project structure using Poetry
5. Preparing for the build
6. Testing the package
7. Building the package
8. Publishing the package
9. Testing the published package
10. Creating a new version of our package
1. Poetry Installation
You can find the Poetry installation guide here. As I have a Windows machine, I used the PowerShell command below.
(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -
After installation, we need to add Poetry's install location to the PATH environment variable so that the poetry command is available in the terminal.
2. Creating a repository on GitHub
Version control is an important part of building a package. We can use GitHub to host our code, tests, and documentation and to version them together. We also want automated tests to run whenever the package is built, and GitHub Actions is well suited to that.
Create a repository with a README.md file and license. After the repository is created, we need to clone it to our local system.
git clone https://github.com/Prithivee7/periodic-element-properties.git
3. The Code
Now that we have created the repository, it is time to go through the code that we will package over the course of the article. We are going to build a simple package that provides the properties of elements in the periodic table. The data for this package comes from the Periodic Table with 28 Features dataset on Kaggle.
For the sake of brevity, I have written only 3 functions, but a real package will have many more. You can find the code here.
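The original code is linked rather than shown, so the following is only a hypothetical sketch of what `elements.py` could look like. The class name `Functions` comes from later in the article. The three method names, the field names, and the two-row table are assumptions made for illustration, and a plain dictionary stands in for the Kaggle dataset so that the sketch runs without pandas.

```python
class Functions:
    """Look up properties of chemical elements (illustrative stand-in only)."""

    # Tiny hard-coded sample; the article's package takes its data from the
    # Kaggle dataset instead. Field names here are hypothetical.
    _ELEMENTS = {
        "H": {"name": "Hydrogen", "atomic_number": 1, "atomic_mass": 1.008},
        "He": {"name": "Helium", "atomic_number": 2, "atomic_mass": 4.0026},
    }

    def _lookup(self, symbol):
        """Return the record for a symbol or raise a readable error."""
        try:
            return self._ELEMENTS[symbol]
        except KeyError:
            raise ValueError(f"Unknown element symbol: {symbol!r}") from None

    def get_name(self, symbol):
        return self._lookup(symbol)["name"]

    def get_atomic_number(self, symbol):
        return self._lookup(symbol)["atomic_number"]

    def get_atomic_mass(self, symbol):
        return self._lookup(symbol)["atomic_mass"]


functions = Functions()
print(functions.get_name("He"))           # Helium
print(functions.get_atomic_number("He"))  # 2
```

Keeping the lookups behind a small class gives users one obvious entry point to import once the code is packaged.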
4. Create a project structure using Poetry
Poetry provides a basic skeleton structure for our project, and we can build on top of that. Go inside the cloned directory and open the command prompt. In the command prompt, type
poetry new <your_repository_name>
Once the command is executed, we will get to see the following folder structure.
A new folder will be created inside our current folder with the name of the repository. It will have a pyproject.toml file and a README.md file at the root level. Inside it, a new package directory will be created with an __init__.py file, along with a separate tests folder for unit tests.
The pyproject.toml file is a configuration file used by various tools in the Python ecosystem, including Poetry, to define project-specific settings and metadata. It is a human-readable file and we can use this file as a replacement for the setup.py file and requirements.txt file.
The pyproject.toml file has all the basic information about our package: the name, version, description, and author. Runtime dependencies are listed in the [tool.poetry.dependencies] section, while development-only dependencies go in a separate section ([tool.poetry.group.dev.dependencies] in Poetry 1.2 and later, [tool.poetry.dev-dependencies] in older versions). The following is the basic structure of the pyproject.toml file.
[tool.poetry]
name = "periodic-element-properties"
version = "0.0.1"
description = ""
authors = ["Prithivee7 <[email protected]>"]
readme = "README.md"
packages = [{include = "periodic_element_properties"}]
[tool.poetry.dependencies]
python = "^3.7"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
5. Preparing for the build
Next, we add the files we wrote earlier to the package directory that Poetry created, which in this case is periodic_element_properties. After adding the files, we need to update the __init__.py file so that the code inside the directory can be imported from outside. The "from" keyword is followed by the module to import from. In our case, the package is "periodic_element_properties", it contains a file called elements.py, and from elements.py we import the class "Functions".
__version__ = '0.2.0'
from periodic_element_properties.elements import Functions
Once the logic is ready, we need an isolated environment in which to check for dependency issues. A dedicated virtual environment keeps this project's dependencies separate from those of other projects and avoids version conflicts. Poetry comes in handy here. We create the environment and install the required dependencies in it so that we can test the code locally. The following commands need to be executed in the command prompt to create an environment and install the dependencies.
# Create (if needed) and activate the project's virtual environment
poetry shell
# Install all the required dependencies
pip install pandas pytest
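The final step before building the package is updating the pyproject.toml file. We need to update the version of the package and declare the dependencies here. Our package requires pandas at runtime, so it belongs in the [tool.poetry.dependencies] section, while pytest is needed only for testing and belongs in the development section ([tool.poetry.group.dev.dependencies] in Poetry 1.2 and later, [tool.poetry.dev-dependencies] in older versions). Running poetry add pandas and poetry add --group dev pytest writes these entries for us. Also, based on our library's compatibility, we can specify the supported versions of Python.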
To keep the code clean, we can use tools like Radon, Vulture, and Black.
6. Testing the package
Once we are done with the building part, like any responsible developer, we move on to testing. We use the pytest library to execute our test cases. The testing logic goes inside the "tests" folder that Poetry created for us. We can define any number of test cases according to our requirements and execute them using pytest. If pytest is not already installed, you can install it with the command below.
pip install pytest
We need to go inside the folder that Poetry created for us and execute "pytest -v" in the command prompt. This command runs all the tests inside our folder, and we should publish the package only when all of them pass. To run only the tests whose names match a given function name, we can run "pytest -k <function_name>".
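A test module for the tests folder might look like the sketch below. The file name `test_elements.py`, the `get_name` method, and the stand-in `Functions` class are hypothetical. In the real project the class would be imported from `periodic_element_properties` rather than defined in the test file.

```python
# tests/test_elements.py (hypothetical file name)


class Functions:
    """Stand-in so the sketch runs on its own; real tests would instead use
    `from periodic_element_properties import Functions`."""

    _NAMES = {"H": "Hydrogen", "He": "Helium"}

    def get_name(self, symbol):
        return self._NAMES[symbol]


def test_get_name_returns_full_name():
    assert Functions().get_name("He") == "Helium"


def test_unknown_symbol_raises_key_error():
    try:
        Functions().get_name("Xx")
    except KeyError:
        return
    raise AssertionError("expected KeyError for an unknown symbol")
```

pytest discovers functions whose names start with `test_`, so `pytest -v` would run both of these, while `pytest -k get_name` would select only the first.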
7. Building the package
After all our tests have run successfully, we can execute the "poetry check" command, which validates the pyproject.toml file. If it reports "All set!", we can move on to the building phase.
The "poetry build" command builds a distributable package from your project. It creates distribution files that can be easily shared, installed, and published.
When you run poetry build, it performs the following tasks:
1. Packages Source Code: Poetry gathers your project's source code, including modules, packages, and any other files declared in pyproject.toml, into a format suitable for distribution. For a pure-Python project, nothing is compiled.
2. Records Dependencies: Poetry writes the dependencies specified in the pyproject.toml file into the package metadata. The dependencies themselves are not bundled into the distribution; instead, pip reads this metadata and installs them alongside the package.
3. Generates Distribution Files: Poetry generates distribution files in two formats, a source distribution (sdist) and a wheel. These files contain the packaged code, metadata, and other files required to install and use the package.
4. Stores Distribution Files: The generated distribution files are stored in the dist directory within your project's root directory. Each distribution file is named after the project name and version.
After running poetry build, you can find the generated distribution files in the dist directory. These files can be shared, distributed, or published to package indexes like PyPI for others to install and use your package.
For example, if you run poetry build for a project named "my-package" with version "0.1.0", you might get distribution files like my-package-0.1.0.tar.gz (source distribution) and my_package-0.1.0-py3-none-any.whl (wheel distribution) in the dist directory.
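To check what a build actually produced, a small helper like the one sketched below can list the dist directory and label each file by its extension. The function name `classify_artifacts` is an assumption for illustration, and the listing only runs if a dist folder exists in the current directory.

```python
from pathlib import Path


def classify_artifacts(dist_dir):
    """Map each file in a dist directory to 'sdist', 'wheel', or 'other'."""
    kinds = {}
    for path in sorted(Path(dist_dir).iterdir()):
        if path.name.endswith(".tar.gz"):
            kinds[path.name] = "sdist"
        elif path.suffix == ".whl":
            kinds[path.name] = "wheel"
        else:
            kinds[path.name] = "other"
    return kinds


# Only list artifacts when a build has already been run here.
if Path("dist").is_dir():
    for name, kind in classify_artifacts("dist").items():
        print(f"{kind:5}  {name}")
```

Seeing one sdist and one wheel for the expected version is a quick sanity check before publishing.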
After the build is complete, we push the source code to our repository. It is not good practice to push the dist folder to GitHub, so we add it to our .gitignore file.
8. Publishing the package
Publishing is the final step in the lifecycle of the package. We will publish our package to PyPI, which is the official repository for Python packages. We first need to register an account with PyPI and then create an API token for authentication. Once we have the API token, we configure Poetry to use it with the following command.
poetry config pypi-token.pypi <enter_your_generated_pypi_token_here>
Once we have configured it, we can finally execute the "poetry publish" command. This will pick up the dist files generated by the "poetry build" command and publish them to PyPI.
Congratulations. We have designed, built, and published our very own Python package. The package we created is now available for public use, and anyone can install and use it. The contents of our README.md file will be shown on the package's project page on PyPI.
9. Testing the published package
After publishing the package, the last step is to test it. We can install the package in our environment by running the command below.
pip install <package_name>
During installation, we must make sure that we have the required Python version installed on our system. If the installed Python version is incompatible with what the package requires, we won't be able to install the package.
It is always advisable to create a virtual environment when testing our new package, so that we don't have to wrestle with multiple installed versions of it. After installation, we can test the package by importing it and calling its functions to check whether it performs as expected.
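A post-install check can be as small as importing the package and confirming that the names we expect are present. The helper below is a sketch under that assumption. `smoke_test` is a hypothetical name, and the demonstration uses the standard-library `json` module as a stand-in so that it runs even where the published package is not installed.

```python
import importlib


def smoke_test(module_name, expected_attributes=()):
    """Import a module and return the expected attributes that are missing."""
    module = importlib.import_module(module_name)
    return [name for name in expected_attributes if not hasattr(module, name)]


# Stand-in demonstration with a standard-library module, so this runs anywhere.
# After `pip install`, one would call something like
# smoke_test("periodic_element_properties", ["Functions"]) instead.
missing = smoke_test("json", ["dumps", "loads"])
print("missing attributes:", missing)
```

An empty list means the import succeeded and every expected name was found. An ImportError instead points to a failed or incomplete installation.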
10. Creating a new version of our package
Once we have published the package, we might identify some bugs or want to add new functionality. For that, we don't have to create a new package; we can just create a new version of the existing package and publish it. The steps for publishing a new version are similar to what we did earlier. The first step is to update the code base for the new version. Then we need to update the version in the pyproject.toml file, because PyPI doesn't allow two uploads with the same version number.
After we are satisfied with our test results, we run the "poetry build" and "poetry publish" commands. When users install the package, the latest version will be picked by default.
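Choosing the next version number usually follows the MAJOR.MINOR.PATCH convention: bump the patch for bug fixes, the minor for new features, and the major for breaking changes. The function below is an illustrative sketch of that rule. `bump_version` is a hypothetical helper, not something Poetry provides as a Python function. Poetry's own `poetry version patch` command (also `minor` and `major`) applies the same kind of bump directly to pyproject.toml.

```python
def bump_version(version, part):
    """Return `version` (MAJOR.MINOR.PATCH) with the given part incremented."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"part must be 'major', 'minor', or 'patch', got {part!r}")


print(bump_version("0.0.1", "patch"))  # 0.0.2
print(bump_version("0.0.1", "minor"))  # 0.1.0
```

Note that bumping a higher part resets the lower ones to zero, which keeps version numbers strictly increasing.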
Conclusion:
In this article, we saw the whole lifecycle of designing, building, and publishing a Python package using Poetry. Poetry performs dependency management, packaging, publishing, maintaining project structure, handling dependency constraints, and versioning.
This article is the first of the 3 parts of my Python Package series. The second part will be talking about the documentation of the Python package, and the final part will focus on automating the process with GitHub Actions. Please feel free to connect with me on LinkedIn in case of queries or feedback. Cheers.
Reference
Python Package Development Tutorial (Design, Build and Publish), YouTube