Exploring Python’s zip() Function: Simplifying Iteration and Data Combination
Last Updated on July 24, 2023 by Editorial Team
Author(s): Muhammad Arham
Originally published on Towards AI.
A Beginner’s Guide to Streamlining Data Manipulation and Iteration with zip() in Python for Enhanced Efficiency and Productivity
Introduction
zip is a built-in Python function. It makes it easier to work with several iterables at once, such as lists and dictionaries.
In this article, we explore the syntax and behavior of zip and build a practical understanding of how to use it in real situations.
Syntax
The general function definition as per the Python documentation:
zip(*iterables, strict=False)
Given the definition, zip takes an arbitrary number of iterables through the variadic positional parameter *iterables. We can therefore pass any number of iterables to zip, and they are collected into a single tuple of arguments.
There is a ‘strict’ keyword argument that we will explore later.
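As a quick sketch of the signature in action, zip can be called with zero, one, or several iterables. The variable names below are illustrative and not part of the original article.

```python
# zip accepts any number of iterables as positional arguments.
letters = ['a', 'b', 'c']
numbers = [1, 2, 3]
flags = (True, False, True)

no_args = list(zip())                            # no iterables: nothing to yield
one_arg = list(zip(letters))                     # one iterable: 1-tuples
three_args = list(zip(letters, numbers, flags))  # three iterables: 3-tuples

print(no_args)     # []
print(one_arg)     # [('a',), ('b',), ('c',)]
print(three_args)  # [('a', 1, True), ('b', 2, False), ('c', 3, True)]
```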
Use Case Example 1
To see how zip works in practice, start with a simple sales calculation.
Suppose we have three lists: one contains the products, and the other two contain the price of each product and the quantity sold.
If we want to calculate the total sales for each product, we can do that with a simple for loop. However, we would have to index each list and handle edge cases when the sizes do not match. Moreover, that approach does not scale to an arbitrary number of iterables.
The zip function provides a simple interface to perform such tasks, as you can pass all iterables to the zip function.
This returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument iterables.
So, when iterating over the zipped lists in the example above, each product is combined with its price and quantity in a single tuple. This tuple can be unpacked into separate variables during iteration.
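A minimal sketch of this use case follows. It borrows the sample data that appears later in this article; the total_sales dictionary is an illustrative name, not something from the original.

```python
products = ['Apple', 'Banana', 'Cherry']
prices = [1.5, 0.75, 2.25]
quantities = [10, 15, 5]

# Each iteration yields one (product, price, quantity) tuple,
# which is unpacked directly in the for statement.
total_sales = {}
for product, price, quantity in zip(products, prices, quantities):
    total_sales[product] = price * quantity

print(total_sales)
# {'Apple': 15.0, 'Banana': 11.25, 'Cherry': 11.25}
```

No index variable is needed, and adding a fourth list only requires one more argument to zip and one more name in the for statement.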
Use Case Example 2
This is a harder example. Suppose we have a 2-dimensional array, and we want to average the values in each column.
A for loop iterates over the rows of a matrix. To average the columns, we would have to use nested loops to iterate over each column separately.
zip provides a workaround.
transposed = zip(*matrix)
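We can transpose a matrix simply by passing all of its rows as separate iterables to the zip function. The * operator unpacks the matrix so that each row becomes its own argument; zip then groups the i-th element of every row into one tuple, which is exactly the i-th column.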
column_averages = [sum(column) / len(column) for column in transposed]
We can then iterate over the columns yielded by zip and average each one.
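Putting the two lines together, a runnable sketch might look like this. The 3x3 matrix is made-up sample data chosen for illustration.

```python
# Hypothetical sample data: three rows, three columns.
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# zip(*matrix) is equivalent to zip(matrix[0], matrix[1], matrix[2]),
# so each tuple it yields is one column of the matrix.
transposed = zip(*matrix)

# Each column is a tuple, so sum() and len() both work on it.
column_averages = [sum(column) / len(column) for column in transposed]

print(column_averages)  # [4.0, 5.0, 6.0]
```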
How to Unzip Iterables
Now that we understand both zip and the * operator, we can combine them to reverse the zip function.
Once we have zipped some iterables together, we can recover them by unpacking the zipped tuples and zipping those tuples together again.
Going over it step by step:
products = ['Apple', 'Banana', 'Cherry']
prices = [1.5, 0.75, 2.25]
quantities = [10, 15, 5]
# Using zip() to combine the lists
sales = zip(products, prices, quantities)
We first unpack the zip object with the * operator, which passes each zipped tuple as a separate argument, much like the rows in the matrix transpose example above.
print(*sales)
# Output
# ('Apple', 1.5, 10) ('Banana', 0.75, 15) ('Cherry', 2.25, 5)
We can now zip these tuples together, so that all elements in the first position form the first iterable, all elements in the second position form the second iterable, and so on.
It’s important to note that a zip object is an iterator: once you iterate over it, it is exhausted and yields nothing more. The print(*sales) call above has already consumed sales, so we recreate it and then unzip it in a single line:
sales = zip(products, prices, quantities)
products, prices, quantities = zip(*sales)
Note that each recovered sequence is a tuple rather than a list.
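The round trip and the exhaustion behavior can be checked with a short sketch. The names prefixed with unzipped_, and the lazy_sales, first_pass and second_pass variables, are illustrative assumptions rather than part of the original example.

```python
products = ['Apple', 'Banana', 'Cherry']
prices = [1.5, 0.75, 2.25]
quantities = [10, 15, 5]

# Materialize the zip object as a list so it can be reused safely.
sales = list(zip(products, prices, quantities))

# Unzip: each original sequence comes back as a tuple.
unzipped_products, unzipped_prices, unzipped_quantities = zip(*sales)
print(unzipped_products)    # ('Apple', 'Banana', 'Cherry')
print(unzipped_prices)      # (1.5, 0.75, 2.25)
print(unzipped_quantities)  # (10, 15, 5)

# A bare zip object, by contrast, can only be consumed once.
lazy_sales = zip(products, prices, quantities)
first_pass = list(lazy_sales)   # three tuples
second_pass = list(lazy_sales)  # already exhausted
print(len(first_pass), second_pass)  # 3 []
```

Converting to a list trades away the laziness of zip, which is usually acceptable for small inputs that need to be traversed more than once.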
Strict Keyword Argument
By default, zip accepts iterables of different sizes. If the iterables passed are of different sizes, zipping stops at the end of the shortest one, and the remaining elements of the longer iterables are ignored.
list(zip(range(3), ['fee', 'fi', 'fo', 'fum']))
# Output [(0, 'fee'), (1, 'fi'), (2, 'fo')]
However, zip is mostly used with iterables of the same size. To enforce this, we can pass strict=True (available from Python 3.10). This raises a ValueError if there is a size mismatch between the iterable arguments.
Benefits of Using Zip
Memory Efficient
zip is lazy. The tuples are generated on the fly during iteration, so no new list is needed to store the zipped result.
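One way to see this laziness is to zip an infinite iterator with a finite list. The following sketch uses itertools.count from the standard library; the labels list is made-up sample data.

```python
from itertools import count

labels = ['first', 'second', 'third']

# count(1) never ends, but zip only pulls values on demand
# and stops as soon as the shorter iterable runs out.
numbered = zip(count(1), labels)

first_pair = next(numbered)  # only one tuple has been built so far
remaining = list(numbered)   # consume the rest

print(first_pair)  # (1, 'first')
print(remaining)   # [(2, 'second'), (3, 'third')]
```

If zip built its result eagerly, the call with count(1) would never finish.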
Flexibility
zip works with a wide range of iterables such as lists, dictionaries, tuples and strings. It also works with user-defined classes: all you have to do is implement the __iter__ dunder method. The following example illustrates this.
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    def __iter__(self):
        # Iterating over a Person yields the name first, then the age
        return iter([self.name, self.age])
# Create instances of the Person class
person1 = Person("John", 30)
person2 = Person("Alice", 25)
person3 = Person("Bob", 35)
# Zip the Person objects
zipped = zip(person1, person2, person3)
# Iterate over the zipped object and print the elements
for item in zipped:
    print(item)
# Output
# ('John', 'Alice', 'Bob')
# (30, 25, 35)
Conclusion
The zip function is a versatile function that allows simultaneous iteration over several iterables with a simplified code structure. It can be combined with the map, filter and reduce functions to achieve complex data manipulation in a few lines of code.
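As one possible illustration of that combination, the sketch below reuses the price and quantity data from earlier. The threshold of 12 and all variable names are arbitrary choices made for this example.

```python
from functools import reduce

prices = [1.5, 0.75, 2.25]
quantities = [10, 15, 5]

# map: multiply each (price, quantity) pair produced by zip.
line_totals = map(lambda pair: pair[0] * pair[1], zip(prices, quantities))

# filter: keep only the totals above an arbitrary threshold.
large_orders = filter(lambda total: total > 12, line_totals)

# reduce: add up whatever passed the filter.
revenue = reduce(lambda acc, total: acc + total, large_orders, 0.0)

print(revenue)  # 15.0
```

Every stage here is lazy until reduce consumes it, so no intermediate list is created.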
If you liked this article, follow me for more articles on Python and machine learning.