Try These Pandas Display Configurations in Your Next Analysis

Last Updated on June 26, 2022 by Editorial Team

Author(s): Hrishikesh Patel

Originally published on Towards AI.

Make your Jupyter notebook more presentable with these useful Pandas display customizations

While analyzing data using Pandas, you might have faced the following display-related issues:

1. Lengthy text is truncated, so you cannot see it in full. In the following image, the URLs are shortened.

Lengthy text gets truncated in Pandas (image by author)

2. By default, Pandas may show large floating-point numbers in scientific notation, e.g. 1,000,000.5 can be shown in a form like 1.000e+06.

Large floats are shown using scientific notation (image by author)

3. Float columns are displayed with inconsistent numbers of decimal places. E.g., in the following figure, col_1 has one digit after the decimal point whereas col_2 has three. Though this will not affect your analysis, it might not look good when you share your notebook with others.

Inconsistent precision among columns (image by author)

In this story, I am going to cover how to solve these issues using the following common Pandas display customizations.

List of contents

  1. Customize how many rows to display
  2. Customize how many columns to display
  3. Customize column width
  4. Make decimal place accuracy consistent among float columns
  5. Disable the scientific notation
  6. Bonus!

Note: These options only change how data is displayed. They do not affect the underlying data.

1. Customize how many rows to display

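When you print a large data frame, pandas displays only the first 5 and last 5 rows by default, as illustrated below. This truncation applies once the data frame has more rows than display.max_rows, which defaults to 60.

Pandas by default displays 10 rows (image by author)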

However, we can change how many rows are displayed by setting the display option display.max_rows with pd.set_option. Let’s set it to 4.

Displays 4 rows after setting ‘display.max_rows’ to 4 (image by author)

You can also reset the option using pd.reset_option("display.max_rows") to return to the default behavior.
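Since the original screenshots are not reproduced here, the following is a minimal sketch of the set-and-reset cycle described above. The 100-row data frame and all variable names are made up for illustration; only pd.set_option, pd.get_option, and pd.reset_option come from pandas.

```python
import pandas as pd

# Hypothetical 100-row frame, used only for illustration.
df = pd.DataFrame({"value": range(100)})

# Remember the current setting so we can compare after the reset.
default_max_rows = pd.get_option("display.max_rows")

# Show at most 4 rows when the frame is printed.
pd.set_option("display.max_rows", 4)
truncated_repr = repr(df)  # the same text that print(df) would show
print(truncated_repr)

# Return to the library default.
pd.reset_option("display.max_rows")
restored_max_rows = pd.get_option("display.max_rows")
```

The printed output should contain the first two and last two rows, separated by an ellipsis row, followed by a line with the frame's full dimensions.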

2. Customize how many columns to display

You can customize the number of columns to show while printing the data frame by setting display.max_columns.

Displays 6 columns by setting ‘display.max_columns’ to 6 (image by author)

As before, you can reset this option using pd.reset_option("display.max_columns") to return to the default behavior.
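As a rough illustration of the column limit, here is a small sketch that uses a made-up 10-column data frame (the names c0 to c9 are hypothetical). With the limit at 6, pandas is expected to keep the first three and last three columns and replace the middle ones with an ellipsis.

```python
import pandas as pd

# Hypothetical wide frame: 10 columns named c0..c9, 3 rows each.
wide_df = pd.DataFrame({f"c{i}": [i, i + 1, i + 2] for i in range(10)})

# Show at most 6 columns when the frame is printed.
pd.set_option("display.max_columns", 6)
wide_repr = repr(wide_df)  # the same text that print(wide_df) would show
print(wide_repr)

# Return to the library default.
pd.reset_option("display.max_columns")
```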

3. Customize column width

In the following image, we cannot see the full text for the first two rows as their character length exceeds 50.

Lengthy text gets shortened in Pandas (image by author)

However, after setting display.max_colwidth to 70, we can see the entire text. You can choose a different number based on your data.

Displays full text by setting ‘display.max_colwidth’ to 70

This option can also be reset by using pd.reset_option("display.max_colwidth").
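The effect of the column width can be sketched as follows. The URLs are placeholders invented for this example (the long one is 68 characters), so the exact values are assumptions rather than the data used in the screenshots.

```python
import pandas as pd

# Hypothetical data: one URL longer than 50 characters, one short URL.
long_url = "https://example.com/a/fairly/long/path/that/exceeds/fifty/characters"
urls_df = pd.DataFrame({"url": [long_url, "https://example.com/short"]})

# With the default width of 50, the long URL is cut off with "...".
truncated_repr = repr(urls_df)
print(truncated_repr)

# Allow up to 70 characters per cell, which is enough for the long URL.
pd.set_option("display.max_colwidth", 70)
full_repr = repr(urls_df)
print(full_repr)

# Return to the library default.
pd.reset_option("display.max_colwidth")
```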

4. Make decimal place accuracy consistent among float columns

Currently, col_1 and col_2 have inconsistent decimal place accuracy as depicted below.

Inconsistent decimal accuracy among float columns (image by author)

By setting display.float_format to "{:.2f}".format, we can make the format consistent. As shown in the image below, the option only affects the float columns, not the integer columns.

Making decimal place accuracy consistent among float columns by setting ‘display.float_format’ to “{:.2f}”.format (image by author)

This option can be reset using pd.reset_option("display.float_format").
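A minimal sketch of this setting, assuming a small made-up data frame with two float columns and one integer column (col_3 is added here purely to show that integers are left alone):

```python
import pandas as pd

# Hypothetical frame: two float columns with different precision, one int column.
floats_df = pd.DataFrame(
    {"col_1": [1.5, 2.5], "col_2": [1.123, 2.456], "col_3": [10, 20]}
)

# Render every float with exactly two decimal places.
pd.set_option("display.float_format", "{:.2f}".format)
formatted_repr = repr(floats_df)  # the same text that print(floats_df) would show
print(formatted_repr)

# Return to the library default (no custom formatter).
pd.reset_option("display.float_format")
```

Any callable that takes a float and returns a string can be used here; "{:.2f}".format is just a compact way to build one.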

5. Disable the scientific notation

By default, Pandas may show large float values in scientific notation.

Large floating-point numbers are displayed in scientific notation (image by author)

By setting display.float_format to "{:,.2f}".format, we can switch to fixed-point output, which avoids scientific notation, adds a thousands separator, and shows two decimal places.

Add thousands separator by setting “display.float_format” to “{:,.2f}”.format (image by author)

You can also add a $ sign before the numbers to show currency by setting display.float_format to "$ {:,.2f}".format.

Add $ before a number by setting “display.float_format” to “$ {:,.2f}”.format (image by author)
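Both formats from this section can be tried with a short sketch like the one below. The amounts are invented for illustration and are not the values from the screenshots.

```python
import pandas as pd

# Hypothetical frame with one large float value.
amounts_df = pd.DataFrame({"amount": [1000000.5, 2500.0]})

# Fixed-point output with a thousands separator and two decimals.
pd.set_option("display.float_format", "{:,.2f}".format)
separated_repr = repr(amounts_df)
print(separated_repr)

# Same format with a leading "$ " to indicate currency.
pd.set_option("display.float_format", "$ {:,.2f}".format)
currency_repr = repr(amounts_df)
print(currency_repr)

# Return to the library default.
pd.reset_option("display.float_format")
```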

6. Bonus

How can you find all such useful display options when you are working offline? The trick is to call pd.describe_option() with no arguments, which prints a list of all available options along with their descriptions.

However, if you are looking for a specific option, you can pass its name as an argument to pd.describe_option(). For example, pd.describe_option("max_rows") will print the description of the display.max_rows option.

Get the description of “display.max_rows” option using ‘pd.describe_option(“max_rows”)’ (image by author)
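In a notebook, calling pd.describe_option("max_rows") directly is enough. The sketch below additionally captures the printed text in a variable, which is an optional extra step assumed here for convenience (for example, to search the description programmatically).

```python
import contextlib
import io

import pandas as pd

# Capture whatever describe_option prints, instead of sending it to the screen.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    pd.describe_option("max_rows")  # matches every option whose name contains "max_rows"

description = buffer.getvalue()
print(description)
```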

Reference
Pandas Options and Settings

Before you go!

I hope you have enjoyed the story and found it useful.
