Here’s How to Create a Bar Chart Race in Minutes for Any Data
Last Updated on June 11, 2024 by Editorial Team
Author(s): Muhammad Saad Uddin
Originally published on Towards AI.
Before diving into this article, let me share a little story with you. I had two potential titles for this piece: “Building Dynamic Bar Chart Races for LLM Leaderboards in Python” and “Here’s How to Create a Bar Chart Race in Minutes for Any Data.” I found myself awkwardly torn between the two. So, I turned to GPT-4 for some much-needed advice (don’t judge me!). After weighing the pros and cons that GPT-4 provided, I ended up choosing the latter, even though my personal preference was for the former. This time, the LLM managed to convince me to go with the second option. Cheers to being open to constructive criticism and embracing suggestions, even from an AI!
The idea for this bar chart race was sparked by an eye-catching analysis I stumbled upon on LinkedIn, which in turn led to this post on X and some code inspiration from Karolina Stawicka. Motivated by these influences, I decided to create my own version, which ultimately inspired me to write this article. 😊
Before diving into the technical details, I want to give a shout-out to the sources of our data and tools. All the data used in this project is sourced from the LMSYS Leaderboard available on HuggingFace 🤗. For creating these stunning visuals, I used the bar_chart_race library from PyPI. Thanks to these resources, I was able to bring this project to life!
Setting Up
We begin by installing the bar_chart_race package. This package, available on PyPI, provides the tools necessary to create dynamic and engaging bar chart race visuals. To install it, simply use the following command in your terminal:
pip install bar_chart_race
Next, we import the necessary modules. These modules include the bar_chart_race package, as well as other essential libraries for data manipulation and visualization.
import bar_chart_race as bcr
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
import pandas as pd
import seaborn as sns
With these imports, we’re all set to begin building our bar chart race visualizations!
Data Preparation
Next, we import and read data from the LMSYS repository. The data available there is not uniform and has some data quality issues. I will not go into the details of the preprocessing step here, but if you want to know more about it, contact me directly and I will share the notebook with you.
df = pd.read_csv('/content/lmsys.csv',sep=';')
df['date'] = pd.to_datetime(df['date'], format='%Y%m%d')
# Pivot the DataFrame
df = df.drop_duplicates(subset=["date", "model"])
elo_pivot = df.pivot(index='date', columns='model', values='elo_rating_final')
elo_pivot = elo_pivot.replace('-', 0).fillna(0)
elo_pivot = elo_pivot.astype(float)
elo_pivot
We converted the cleaned and preprocessed CSV into a pivot table where the index consists of all dates, the columns are the model names, and the values contain the Elo ratings. The “Arena Elo rating” is a measurement system used within the Chatbot Arena platform to evaluate the relative skill levels of various LLMs based on their performance in anonymous, randomized battles.
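To make the reshaping step concrete, here is a minimal sketch on a tiny made-up table. The model names, dates, and ratings are invented purely for illustration and are not LMSYS data; only the column names (date, model, elo_rating_final) follow the code above.

```python
import pandas as pd

# Hypothetical long-format records (invented values, not real leaderboard data)
toy = pd.DataFrame({
    "date": ["20240101", "20240101", "20240201", "20240201", "20240201", "20240201"],
    "model": ["model-a", "model-b", "model-a", "model-b", "model-b", "model-c"],
    "elo_rating_final": [1000.0, 1100.0, 1050.0, 1120.0, 1120.0, 900.0],
})
toy["date"] = pd.to_datetime(toy["date"], format="%Y%m%d")

# pivot() raises on duplicate (date, model) pairs, so drop them first
toy = toy.drop_duplicates(subset=["date", "model"])

# One row per date, one column per model
toy_pivot = toy.pivot(index="date", columns="model", values="elo_rating_final")

# A model that is missing on a date shows up as NaN; fill with 0 as in the code above
toy_pivot = toy_pivot.fillna(0).astype(float)
print(toy_pivot)
```

Here model-c only appears on the second date, so its first-row value becomes 0 and its bar effectively enters the race later.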
Here’s a snapshot of what the DataFrame looks like now:
Creating the Bar Chart Race
Now, we can directly create a visualization by calling the bar_chart_race function. However, trust me, the default results aren't very impressive. For those who want to give it a try, here's the basic code:
#bcr.bar_chart_race(elo_pivot)
#bcr.bar_chart_race(elo_pivot, n_bars=10)
#bcr.bar_chart_race(elo_pivot, n_bars=10, steps_per_period=20, period_length=1000)
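If you do not have the leaderboard CSV at hand, a synthetic table is enough to try the default calls. The sketch below is only an illustration: make_demo_ratings and render_default are hypothetical helper names, and the ratings are random numbers, not real scores. The import of bar_chart_race is kept inside render_default so the data helper works even before the package is installed.

```python
import numpy as np
import pandas as pd


def make_demo_ratings(n_periods=12, models=("model-a", "model-b", "model-c", "model-d"), seed=0):
    """Return a wide DataFrame (dates x models) of made-up, slowly drifting ratings."""
    rng = np.random.default_rng(seed)
    dates = pd.date_range("2023-01-01", periods=n_periods, freq="MS")  # month starts
    # Random walk around 1000 so the ranking can change over time
    steps = rng.normal(loc=5.0, scale=20.0, size=(n_periods, len(models)))
    ratings = 1000.0 + steps.cumsum(axis=0)
    return pd.DataFrame(ratings, index=dates, columns=list(models))


def render_default(df, filename="demo_race.mp4", n_bars=None):
    """Render with the library defaults; imported lazily on purpose."""
    import bar_chart_race as bcr  # writing MP4 also needs ffmpeg on the system
    return bcr.bar_chart_race(df, filename=filename, n_bars=n_bars)


demo = make_demo_ratings()
print(demo.round(1).head())
# render_default(demo, n_bars=3)  # uncomment once bar_chart_race and ffmpeg are installed
```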
Next, I built some functions on top of the default setup to add customizations and make the visuals more dynamic and appealing. These enhancements include adjusting colors, fonts, labels, and other stylistic elements to create a more engaging and professional presentation. Here’s an example of how you can enhance the visualization:
def customize_bcr_chart(title):
    """
    This function initiates a bar chart with a given title. It sets up the figure and axes, applies styling to the plot,
    and returns the figure and axes objects for further customization and plotting.
    Parameters:
    title (str): The title of the bar chart.
    Returns:
    fig (Figure): The figure object for the plot.
    ax (Axes): The axes object for the plot.
    """
    # Set the font family to Helvetica
    plt.rcParams['font.family'] = 'Helvetica'
    # Initiate the figure and axes with specified size, background color, and resolution
    fig, ax = plt.subplots(figsize=(12,8), facecolor='white', dpi=80)
    ax.margins(0, 0.01)  # Set margins
    ax.set_axisbelow(True)  # Ensure gridlines are below other elements
    # Configure grid and tick parameters
    ax.grid(which='major', axis='x', linestyle='-', linewidth=0.2, color='dimgrey')
    ax.tick_params(axis='x', colors='dimgrey', labelsize=12, length=0)
    ax.tick_params(axis='y', colors='dimgrey', labelsize=12, length=0)
    # Customize the spines of the plot
    for pos in ['top', 'bottom', 'right', 'left']:
        if pos == 'top':
            ax.spines[pos].set_edgecolor('dimgrey')
        else:
            ax.spines[pos].set_edgecolor('white')
    # Format the x-axis to show numbers with commas
    ax.xaxis.set_major_formatter(ticker.StrMethodFormatter('{x:,.0f}'))
    ax.xaxis.set_ticks_position('top')  # Position x-axis ticks on top
    # Set the title of the plot
    ax.set_title(title, fontsize=18, color='dimgrey')
    return fig, ax
We then use it as follows:
palette = sns.color_palette("mako", 24).as_hex()
title = 'LLM Leaderboard 2023-24'
fig, ax = customize_bcr_chart(title)
# Note: '%-d' (day without leading zero) is not supported by strftime on Windows; use '%#d' there
bcr.bar_chart_race(df=elo_pivot,
                   n_bars=15,
                   fig=fig,
                   steps_per_period=25,
                   period_length=2500,
                   cmap=palette,
                   period_fmt='%b %-d, %Y',
                   filter_column_colors=True,
                   filename='elo_mako.mp4')
The steps_per_period and period_length values are deliberately high so that the transitions in the animation are smooth and detailed, making it visually appealing and informative. The palette can be changed via the cmap parameter, allowing you to customize the colors used in the bar chart race to fit your preferences or branding. Additionally, you can save the animation directly as an MP4 file by specifying the filename parameter in the bcr.bar_chart_race function (this requires ffmpeg to be installed), providing a convenient way to share or present your visualizations.
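To get a feel for how these two settings interact, the sketch below estimates the frame count, frame rate, and running time of a race. It assumes the documented meaning of the parameters (steps_per_period is the number of frames drawn between consecutive rows, period_length is the number of milliseconds each row lasts). The helper name animation_stats and the 30-row table are hypothetical, and the frame count is an estimate rather than a guarantee of what the library produces.

```python
def animation_stats(n_periods, steps_per_period, period_length_ms):
    """Estimate (frames, frames_per_second, duration_seconds) for a bar chart race."""
    transitions = n_periods - 1                    # gaps between consecutive rows
    frames = transitions * steps_per_period + 1    # interpolated frames plus the final row
    fps = steps_per_period * 1000 / period_length_ms
    duration_s = transitions * period_length_ms / 1000
    return frames, fps, duration_s


# Settings used in this article, applied to a hypothetical table with 30 dates
frames, fps, duration_s = animation_stats(30, steps_per_period=25, period_length_ms=2500)
print(frames, fps, duration_s)
```

Under these assumptions, raising steps_per_period without changing period_length increases the frame rate, while raising period_length alone slows the race down and lowers the frame rate.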
Enhancing with Dark Mode
Next, I thought, let’s improve the customization further and build a BCR visual in “dark mode” because all devs like it 🙂
def customize_bcr_chart(title):
    """
    This function initiates a bar chart with a given title. It sets up the figure and axes, applies styling to the plot,
    and returns the figure and axes objects for further customization and plotting.
    Parameters:
    title (str): The title of the bar chart.
    Returns:
    fig (Figure): The figure object for the plot.
    ax (Axes): The axes object for the plot.
    """
    # Set the font family to Helvetica
    plt.rcParams['font.family'] = 'Helvetica'
    # Initiate the figure and axes with specified size and resolution
    fig, ax = plt.subplots(figsize=(12,8), dpi=80)
    # Set the background color of the figure and axes to black
    fig.patch.set_facecolor('black')
    ax.set_facecolor('black')
    # Set margins and ensure gridlines are below other elements
    ax.margins(0, 0.01)
    ax.set_axisbelow(True)
    # Configure grid and tick parameters
    ax.grid(which='major', axis='x', linestyle='-', linewidth=0.2, color='dimgrey')
    ax.tick_params(axis='x', colors='white', labelsize=12, length=0)
    ax.tick_params(axis='y', colors='white', labelsize=12, length=0)
    # Customize the spines of the plot
    for pos in ['top', 'bottom', 'right', 'left']:
        if pos == 'top':
            ax.spines[pos].set_edgecolor('dimgrey')
        else:
            ax.spines[pos].set_edgecolor('black')
    # Format the x-axis to show numbers with commas
    ax.xaxis.set_major_formatter(ticker.StrMethodFormatter('{x:,.0f}'))
    ax.xaxis.set_ticks_position('top')  # Position x-axis ticks on top
    # Set the title of the plot
    ax.set_title(title, fontsize=18, color='white')
    return fig, ax
These are the improvements made in the revised customize_bcr_chart function:
- Background Color Adjustment: The background color of the figure and axes is set to black (fig.patch.set_facecolor('black') and ax.set_facecolor('black')).
- Color Contrast for Text and Grid: The tick labels and the title are changed to white so they remain readable against the black background (ax.tick_params(axis='x', colors='white', labelsize=12, length=0), ax.tick_params(axis='y', colors='white', labelsize=12, length=0), and ax.set_title(title, fontsize=18, color='white')).
- Consistency in Spine Colors: The spine colors are updated to match the background (ax.spines[pos].set_edgecolor('black')), except for the top spine, which is set to dimgrey. This gives the plot a consistent look on the black background.
- Removed Unnecessary Facecolor Setting: The facecolor parameter in plt.subplots is omitted since the background color is set later with fig.patch.set_facecolor and ax.set_facecolor. This reduces redundancy and potential conflicts.
These adjustments improve the visual appeal and readability of the chart in scenarios where a dark theme is preferable.
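Since the light and dark versions differ only in a handful of colors, one option is to fold them into a single function with a theme flag. The following is a sketch of that idea, not the code used for the videos in this article: make_race_axes and its dark parameter are hypothetical names, and the Helvetica font setting is left out so the example does not depend on which fonts are installed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker


def make_race_axes(title, dark=False):
    """Hypothetical helper: same styling as above, with the theme chosen by a flag."""
    background = "black" if dark else "white"
    text_color = "white" if dark else "dimgrey"

    fig, ax = plt.subplots(figsize=(12, 8), dpi=80)
    fig.patch.set_facecolor(background)
    ax.set_facecolor(background)
    ax.margins(0, 0.01)
    ax.set_axisbelow(True)
    ax.grid(which="major", axis="x", linestyle="-", linewidth=0.2, color="dimgrey")
    for axis in ("x", "y"):
        ax.tick_params(axis=axis, colors=text_color, labelsize=12, length=0)
    for pos in ("top", "bottom", "right", "left"):
        # Only the top spine stays visible; the others blend into the background
        ax.spines[pos].set_edgecolor("dimgrey" if pos == "top" else background)
    ax.xaxis.set_major_formatter(ticker.StrMethodFormatter("{x:,.0f}"))
    ax.xaxis.set_ticks_position("top")
    ax.set_title(title, fontsize=18, color=text_color)
    return fig, ax


fig_dark, ax_dark = make_race_axes("LLM Leaderboard 2023-24", dark=True)
fig_light, ax_light = make_race_axes("LLM Leaderboard 2023-24", dark=False)
```

Either figure can then be passed to bcr.bar_chart_race through the fig parameter, in the same way as the figures created above.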
Adding Period Summary
def period_summary_func(values, ranks):
    """
    This function summarizes the period information for the bar chart race.
    It reads the date of the current period and formats it for display.
    Parameters:
    values (Series): The values of the current period (one row of the pivot table).
    ranks (Series): The ranks of the current period.
    Returns:
    dict: A dictionary containing the formatted date and text properties for display.
    """
    # bar_chart_race calls this once per frame with a single row, so the period's
    # date is the row label (values.name); values.index holds the model names
    date = values.name
    return {'s': date.strftime('%b %-d, %Y'), 'x': .99, 'y': .05, 'ha': 'right',
            'size': 16, 'family': 'DejaVu Sans', 'weight': 'bold', 'color': 'white'}
The period_summary_func function creates a summary of the period information for a bar chart race. It is called with the values and ranks of the current period, reads that period's date from the label of the row it receives (values.name), formats the date, and returns a dictionary with the formatted date and text properties for display. The text properties include position, alignment, font details, and color, making the summary suitable for visual representation in the bar chart race.
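A summary callback is easy to sanity-check by calling it by hand on one row of a table before rendering anything. The sketch below does that with a made-up two-model table. It assumes the callback receives one row at a time as a pandas Series, which is how the library's documentation describes it. It also builds the date string without '%-d', since that directive is a platform-specific extension. The names format_period and portable_period_summary are hypothetical.

```python
import pandas as pd


def format_period(ts):
    """Format a timestamp like 'Jun 1, 2024' without the platform-specific '%-d'."""
    return f"{ts.strftime('%b')} {ts.day}, {ts.year}"


def portable_period_summary(values, ranks):
    """Hypothetical variant of the callback; assumes `values` is one row (a Series)."""
    return {"s": format_period(values.name), "x": 0.99, "y": 0.05,
            "ha": "right", "size": 16, "color": "white"}


# Made-up two-model table, just to call the callback by hand
demo = pd.DataFrame(
    {"model-a": [1000.0, 1050.0], "model-b": [1100.0, 1120.0]},
    index=pd.to_datetime(["2024-05-01", "2024-06-01"]),
)
row_values = demo.iloc[-1]               # last period's values
row_ranks = demo.rank(axis=1).iloc[-1]   # last period's ranks
summary = portable_period_summary(row_values, row_ranks)
print(summary["s"])
```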
Customizing Bar Labels
def customize_bar_labels(artists):
    """
    This function customizes the bar labels for the bar chart race.
    It adjusts the color of the text labels to white.
    Parameters:
    artists (iterable): Matplotlib artists to inspect; only Text objects are recolored.
    """
    for artist in artists:
        # The bars themselves are Rectangle patches without any text,
        # so skip everything that is not a Text object
        if isinstance(artist, plt.Text):
            artist.set_color('white')
The customize_bar_labels function customizes the appearance of bar labels in a bar chart race by setting their color to white. It only touches Text objects, so the bars themselves keep their colors.
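To see where label text lives in a Matplotlib figure, here is a small standalone sketch with made-up values. It assumes the labels are ordinary Text artists attached to the axes, which is an assumption about how a chart's labels are drawn and not something this article verifies for bar_chart_race itself.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.text import Text

fig, ax = plt.subplots()
values = [1000, 1100]  # made-up ratings
bars = ax.barh(["model-a", "model-b"], values)

# Label each bar with a Text artist placed at the end of the bar
for rect, value in zip(bars, values):
    ax.text(rect.get_width(), rect.get_y() + rect.get_height() / 2, f"{value:,}",
            va="center", color="black")

# The container's children are the bar rectangles; none of them is a Text
container_text = [a for a in bars.get_children() if isinstance(a, Text)]

# The labels themselves are found among the axes' text artists
for label in ax.texts:
    label.set_color("white")

print(len(container_text), [t.get_color() for t in ax.texts])
```

In this toy figure the bar container holds no Text objects at all, so recoloring has to go through ax.texts.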
Final Touches
Finally, we put everything together:
title = 'LLM Leaderboard 2023-24'
fig, ax = customize_bcr_chart(title)
bcr.bar_chart_race(
    df=elo_pivot,
    n_bars=15,
    fig=fig,
    steps_per_period=25,
    period_length=2500,
    cmap='ice_r',
    period_fmt='%b %-d, %Y',
    filter_column_colors=True,
    filename='llms_elo.mp4',
    period_label={'color': 'white', 'fontsize': 16, 'x': 0.99, 'y': 0.1, 'ha': 'right'},  # Corrected period label
    bar_kwargs={'alpha': 0.8, 'ec': 'white', 'lw': 1.5},  # Set bar transparency for better visibility of labels if needed
    bar_label_size=12,  # Increase the bar label size for better visibility
    tick_label_size=12,  # Increase tick label size for better visibility
    #bar_label_font = {'size': 12, 'family': 'Helvetica', 'color': '#7f7f7f'}
    period_summary_func=period_summary_func
)
# Modify the text properties for the bar labels
for container in ax.containers:
    customize_bar_labels(container.get_children())
Let’s see the changes:
Extending to MMLU Ratings
I did the same for the MMLU ratings as well. MMLU (Massive Multitask Language Understanding) is a benchmark for evaluating LLMs through multiple-choice questions. For scoring, MMLU averages each model’s performance per category (humanities, social science, STEM, and others) and then averages these four scores for a final score.
MMLU Purpose:
- Evaluation of LLMs: MMLU is used to assess the capability of LLMs to understand and process information across various domains.
- Diversity of Tasks: It covers a wide array of subjects, ensuring that models are tested on multiple facets of knowledge.
MMLU Significance:
- Comprehensive Evaluation: By spanning a wide range of subjects, MMLU ensures that LLMs are not only specialized in specific areas but have a broad understanding.
- Benchmarking Progress: MMLU helps track the progress and improvements in LLMs over time, providing a standard against which different models can be compared.
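The two-stage averaging described above can be sketched in a few lines. The per-subject accuracies below are invented for illustration, and the calculation simply follows the description in this article (mean per category, then mean of the four category scores).

```python
from statistics import mean

# Hypothetical per-subject accuracies, grouped by MMLU category
subject_scores = {
    "humanities": [0.70, 0.80],
    "social science": [0.90],
    "STEM": [0.50, 0.60, 0.70],
    "other": [0.80],
}

# Stage 1: average within each category
category_means = {cat: mean(scores) for cat, scores in subject_scores.items()}

# Stage 2: average the four category scores, each category weighted equally
final_score = mean(list(category_means.values()))
print(category_means, round(final_score, 4))
```

Because each category counts equally in the second stage, a category with many subjects does not dominate the final score.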
title = 'LLM Leaderboard 2023-24: MMLU'
fig, ax = customize_bcr_chart(title)
bcr.bar_chart_race(
    df=elo_pivot,  # pivot table to animate (for this chart it should hold MMLU scores rather than Elo ratings)
    n_bars=10,
    fig=fig,
    steps_per_period=15,
    period_length=2000,
    cmap='ice_r',  # lighter color on a dark background
    period_fmt='%b %-d, %Y',
    filter_column_colors=True,
    filename='mmlu_icr.mp4',
    period_label={'color': 'white', 'fontsize': 16, 'x': 0.99, 'y': 0.1, 'ha': 'right'},  # Corrected period label
    bar_kwargs={'alpha': 0.8, 'ec': 'white', 'lw': 1.5},  # Set bar transparency for better visibility of labels if needed
    bar_label_size=12,  # Increase the bar label size for better visibility
    tick_label_size=12,  # Increase tick label size for better visibility
)
# Modify the text properties for the bar labels
for container in ax.containers:
    for text in container.get_children():
        # Check if the object is an instance of Text
        if isinstance(text, plt.Text):
            text.set_color('white')
for label in ax.get_yticklabels():
    label.set_color("white")
    label.set_fontsize(12)
Let’s see how this one looks.
Amazing, isn’t it?
That’s it for today. I hope you enjoyed this article and are excited to create your own stunning visuals with BCR. Recently, I’ve been focusing on topics related to LLMs, which is why this article also includes information about them. If you’re interested in learning more about LLMs and how to build your own applications, whether it’s RAG, function calling, or agents, be sure to follow me. I will be sharing the latest trends in this domain and showing you how to work with them in the most engaging and straightforward ways. Stay tuned for more exciting projects and insightful guides! Furthermore:
- 👏 Clap for the story (50 claps) to help this article be featured
- 🔔 Follow Me: LinkedIn | Medium | Website
- 🌟 Need help in converting these prototypes to products? Contact Me!