A One-liner That Validates Input Types to Your Functions At Runtime
Last Updated on January 13, 2023 by Editorial Team
Author(s): Shangjie Lyu
Originally published on Towards AI.
with source code explanation and simplified implementation
Note: this article is part of my series "Simple Techniques and Tools for Data Science", which is a collection of complementary powerful tools. This article is based on mypy 0.991, pydantic 1.10.2, and beartype 0.11.0.
The one-liner to be discussed here is the validate_arguments decorator provided by pydantic. I will also explain the source code and implement a simplified version at the end!
1. Type hints
If you are already familiar with type hints, please feel free to skip this section; if not, I'll explain them with a simple example.
Most static languages like Java require argument and return types to be specified for functions.
Here in the addTwoNums Java function, we know that the two arguments, num1 and num2, and the return value are all of type int. In its Python equivalent, we can declare the types in the docstring or documentation.
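As a rough sketch of that Python equivalent: the name add_two_nums is an assumed Python-style rendering of addTwoNums, and the docstring layout is just one common convention, not necessarily the one the author used.

```python
def add_two_nums(num1, num2):
    """Add two numbers.

    Args:
        num1 (int): the first number.
        num2 (int): the second number.

    Returns:
        int: the sum of num1 and num2.
    """
    # The types live only in the docstring; nothing checks them.
    return num1 + num2
```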
However, there is a more native way to define them: type hints.
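A minimal sketch of the same function written with type hints (again assuming the hypothetical name add_two_nums):

```python
def add_two_nums(num1: int, num2: int) -> int:
    """Add two numbers."""
    # Hints document the expected types; Python does not enforce them.
    return num1 + num2
```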
As the above code shows, we can define the types of the arguments and the return value using type hints (: for arguments and -> for the return value), so we don't need to write them again in the docstring.
Type hinting is a relatively new feature, introduced in Python 3.5. There are debates around whether or not to use type hints because, after all, Python is a dynamically typed language, and the types are by no means enforced. Personally, I think it's good practice to add type hints: they not only improve code readability and make documentation easier but also allow for various type-related checks, as I will introduce below. I do use type hints in my work.
2. Validate input types against type hints
There are several tools that allow us to check the code based on type hints.
2.1. Static checker
As the name suggests, a static type checker checks your code statically, that is, without running it. In other words, it checks the input types of your functions only where those function calls already exist in your code base.
Mypy (14.4k stars on GitHub) is probably the most popular tool of this kind, with alternatives including Pyright, Pytype, and Pyre.
For example, suppose we have a script that calls a function with type hints but passes invalid input types:
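A small sketch of such a script follows. The function name and the file name example.py are hypothetical; saving it under that name and running `mypy example.py` should flag the call, while plain Python runs it without complaint.

```python
def add_two_nums(num1: int, num2: int) -> int:
    return num1 + num2


# Both arguments are str, not int: a static checker such as mypy is
# expected to report an incompatible argument type for this call.
result = add_two_nums("1", "2")

# Python itself ignores the hints, so the call still runs and simply
# concatenates the two strings.
print(result)  # 12
```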
Mypy is able to scan the scripts and detect the type errors. However, if the function is called at runtime, for example from an interactive session or with values that are only known when the program runs, mypy will not be able to check it.
For more information about static code checking and linting, please follow my series "Write Production-ready Code for Data Science", where I will write an article on this topic.
2.2. Runtime checker
Suppose we are developing a Python package for our team and want to validate the input types to our functions at runtime when colleagues use them. We can simply add the validate_arguments decorator from pydantic (12k stars on GitHub) to our function.
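A minimal sketch of how this might look with pydantic 1.10 (the function add_two_nums is a hypothetical example, not taken from the pydantic documentation):

```python
from pydantic import ValidationError, validate_arguments


@validate_arguments
def add_two_nums(num1: int, num2: int) -> int:
    return num1 + num2


print(add_two_nums(1, 2))  # valid inputs pass through to the function

try:
    add_two_nums("one", 2)  # "one" cannot be interpreted as an int
except ValidationError as error:
    print(error)
```

Note that pydantic 1.x tries to coerce values before rejecting them, so a numeric string such as "1" would be accepted and converted to 1. The value "one" is used here because it cannot be converted.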
The examples are quite self-explanatory, and we can see that the decorator checks the inputs against the type hints and raises a ValidationError if the types mismatch.
Pydantic is not the only library that offers this functionality; other decorators achieve the same thing, such as beartype (1.4k stars on GitHub).
However, in my view, beartype's error messages are not as readable as the ones from pydantic.
3. Allow arbitrary types with pydantic
By default, the validate_arguments decorator only supports types that pydantic knows how to validate, such as Python built-in types. Often, though, we might want to pass a Pandas DataFrame as an argument, and we can do so by allowing arbitrary types in the config as below:
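A sketch of that configuration, assuming pydantic 1.10 and pandas are installed. The function body is an assumption based on the description of get_dataframe_head in this article.

```python
import pandas as pd
from pydantic import ValidationError, validate_arguments


# arbitrary_types_allowed lets pydantic fall back to an isinstance
# check for types it has no built-in validator for.
@validate_arguments(config=dict(arbitrary_types_allowed=True))
def get_dataframe_head(df: pd.DataFrame, n: int = 5) -> pd.DataFrame:
    """Return the first n rows of a dataframe."""
    return df.head(n)


df = pd.DataFrame({"a": range(10)})
print(get_dataframe_head(df, n=3))  # a DataFrame is accepted

try:
    get_dataframe_head([1, 2, 3])  # a list is not a DataFrame
except ValidationError as error:
    print(error)
```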
This example function simply returns the first n rows of a dataframe, and we can see from the examples that the validate_arguments decorator can now validate the pd.DataFrame type as well.
Caveat: as the official documentation states, the validate_arguments decorator is in beta; it was added to pydantic in v1.5 on a provisional basis. It may change in future releases, and its interface will not be concrete until v2.
4. (Optional) How does it work
If you are an intermediate to advanced user of Python, then the last two sections are for you!
While a detailed breakdown of the source code is certainly beyond this article's scope, let's look at the two most fundamental pieces of the puzzle: type hints and user inputs.
4.1. Access type hints from the function's attribute
If we have a function with type hints, we can access the type information in the function's __annotations__ attribute. For example:
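A minimal illustration, using a hypothetical add_two_nums function:

```python
def add_two_nums(num1: int, num2: int) -> int:
    return num1 + num2


# Maps each annotated parameter name (plus "return") to its annotation.
print(add_two_nums.__annotations__)
# {'num1': <class 'int'>, 'num2': <class 'int'>, 'return': <class 'int'>}
```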
And this is used in the _typing_extra.get_type_hints function in pydantic.
4.2. Access type hints from the function's signature
We can also access the annotations from a function's signature with Python's built-in inspect module. For example:
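A short sketch with inspect.signature; the function and its default value are made up for illustration:

```python
from inspect import signature


def add_two_nums(num1: int, num2: int = 0) -> int:
    return num1 + num2


sig = signature(add_two_nums)
for name, parameter in sig.parameters.items():
    # Each Parameter carries the annotation and the default; a missing
    # default is reported as the sentinel inspect.Parameter.empty.
    print(name, parameter.annotation, parameter.default)

print(sig.return_annotation)
```

Unlike __annotations__, the signature also exposes defaults and parameter kinds, which is useful when matching a call's arguments to parameter names.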
This is also used in the ValidatedFunction class in pydantic.
4.3. Access user inputs in a decorator
This will be easy to understand if you have experience with Python decorators. In short, a decorator is (mostly) a function that takes another function as its input, does something with it, and then returns a function, usually a wrapper around the original. We can use the @ symbol to add a decorator to a function, but really it just means the decorator takes the function as its input, so the two forms, with or without the @ symbol, are equivalent.
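A minimal sketch of such a decorator and the two equivalent ways of applying it. The names my_decorator and greet follow the text; greet_plain and the exact messages are assumptions.

```python
def my_decorator(func):
    def wrapper(*args, **kwargs):
        # args and kwargs hold exactly what the caller passed in.
        print(f"Before {func.__name__}: args={args}, kwargs={kwargs}")
        result = func(*args, **kwargs)
        print(f"After {func.__name__}")
        return result

    return wrapper


# Form 1: apply the decorator with the @ symbol.
@my_decorator
def greet(name):
    return f"Hello, {name}!"


# Form 2: call the decorator explicitly on an undecorated function.
def greet_plain(name):
    return f"Hello, {name}!"


greet_plain = my_decorator(greet_plain)

print(greet("John"))
print(greet_plain("John"))
```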
In this simple example, my_decorator prints a message before and after executing the original greet function. Without diving into more details about decorators, the important takeaway here is that all the arguments to the original function (args and kwargs; in this example, "John") are also passed through the decorator's wrapper, and that's how we can get the user inputs.
5. (Advanced) Let's implement our own!
Having learned the fundamental parts of the validate_arguments decorator, why don't we implement a simplified version of our own?
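What follows is one possible sketch of such a simplified decorator, not the original listing. It assumes annotations are plain classes (no Optional, List[int], or string annotations) and skips parameters without an annotation. The add_two_nums function is a hypothetical example.

```python
from functools import wraps
from inspect import Parameter, signature


def validate_arguments(func):
    @wraps(func)  # keep the name, docstring, and annotations of func
    def wrapper(*args, **kwargs):
        # Expected parameter names and their annotated types.
        parameters = signature(func).parameters

        # Map this call's positional and keyword arguments to parameter
        # names, then fill in any defaults the caller omitted.
        bound = signature(func).bind(*args, **kwargs)
        bound.apply_defaults()

        for name, value in bound.arguments.items():
            expected = parameters[name].annotation
            # Assumption: unannotated parameters are not checked.
            if expected is Parameter.empty:
                continue
            if not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be of type {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@validate_arguments
def add_two_nums(num1: int, num2: int = 0) -> int:
    """Add two numbers."""
    return num1 + num2


print(add_two_nums(1, num2=2))  # 3

try:
    add_two_nums(1, num2="2")
except TypeError as error:
    print(error)  # num2 must be of type int, got str
```

Unlike pydantic, this sketch never coerces values: it relies purely on isinstance, so the string "2" is rejected rather than converted.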
Perfect! We have now implemented a validate_arguments decorator ourselves, and as we can see from the examples provided, it works exactly as we would expect. A few key points here:
- Firstly, we use signature(func).parameters to obtain the expected argument names and types.
- We then use signature(func).bind(*args, **kwargs) to read all parameter names and values of the function call, including positional arguments, keyword arguments, and default values (the defaults are filled in once apply_defaults() is called on the bound arguments).
- Finally, we check all the input values against the expected types and call the original function if all types are valid; otherwise, we raise a TypeError.
You might have noticed the @wraps decorator from functools (a built-in Python module), which also appears in the pydantic source code. It essentially copies the metadata (name, docstring, annotations, etc.) of the original function onto the wrapper so that we preserve the information of our original function after it has been decorated, which is good practice when developing decorators. As a result, the metadata of the decorated get_dataframe_head function, e.g. its __doc__ and __annotations__ attributes, is preserved rather than replaced by the metadata of the wrapper.
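To make the effect of @wraps concrete, here is a small sketch comparing a wrapper with and without it. All names in it are hypothetical and chosen only for this illustration.

```python
from functools import wraps


def without_wraps(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


def with_wraps(func):
    @wraps(func)  # copies __name__, __doc__, __annotations__, etc.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


@without_wraps
def first(n: int = 5) -> int:
    """Original docstring."""
    return n


@with_wraps
def second(n: int = 5) -> int:
    """Original docstring."""
    return n


# Without @wraps, the metadata describes the inner wrapper instead.
print(first.__name__, first.__doc__)
# With @wraps, the metadata of the original function is preserved.
print(second.__name__, second.__doc__, second.__annotations__)
```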
Of course, I don't expect my simplified implementation to beat pydantic's, which is far more detailed and has been tested with various edge cases, but nevertheless, I hope you found this helpful and have learned something new or gained some inspiration. (That said, our simple decorator actually works pretty well in most cases we would encounter; below is just one example of an edge case test, a classmethod, that pydantic's decorator passes and ours fails.)
Happy coding!
Thanks for reading and your feedback is more than welcome. You could also follow or connect with me on LinkedIn, where I created the hashtag #100ArticlesForDS (100 Articles for Data Scientists) to share data science articles that I find insightful and helpful, alongside my comments, thoughts, and additional practical tips.