Speed up EDA With the Intelligent Lux

Last Updated on January 6, 2023 by Editorial Team

Automate your visual data exploration with the new Python library, Lux 💡.

Have you ever grown tired of writing multiple lines of code for even a simple graph during EDA?

Have you ever wished for recommendation-based interactive graphs right inside the Jupyter notebook?

If the answer is a big yes, there is good news: we now have the Python library Lux.

This article is based on Doris Jung-Lin Lee’s session at WiCDS 2021.

Lux is a Python API for intelligent visual discovery that comes with a built-in interactive Jupyter widget.

Lux can act as an intelligent assistant that automates the visual aspects of exploratory data analysis.
With a single click, it surfaces recommended visualizations as soon as a data frame is displayed in the Jupyter notebook.
Lux also provides a rich language for expressing user intent.

The main goal of the Lux library is to make visualization as simple as loading a data frame.

The interactive Lux widget helps the user quickly browse through the data and view important trends and patterns. It provides recommendations for further analysis. Lux can also create visualizations for those parts of the data you have no clear idea about.

Lux works well with pandas, and you do not have to worry about modifying your existing code. In fact, Lux was designed to preserve pandas data frame semantics, so a data frame keeps behaving the way pandas users expect.

That’s awesome, right?

Let’s get started and bring in our intelligent visual assistant powered by Lux.

Installation requirements

1. Lux can be installed through PyPI.

pip install lux-api

2. If you use conda, Lux can be installed with:

conda install -c conda-forge lux-api

3. To set up the widget in the Jupyter notebook, you also need to install and enable the following extension.

jupyter nbextension install --py luxwidget
jupyter nbextension enable --py luxwidget

That’s it! We are ready to go…

Case Study

Let’s consider an example dataset to explore the features of the Lux library.

I will be using the Graduate Admission dataset from the Kaggle data repository.

This dataset contains several parameters that are considered important when applying for master’s programs.

Data Dictionary

GRE Scores ( out of 340 )
TOEFL Scores ( out of 120 )
University Rating ( out of 5 )
Statement of Purpose and Letter of Recommendation Strength ( out of 5 )
Undergraduate GPA ( out of 10 )
Research Experience ( either 0 or 1 )
Chance of Admit ( ranging from 0 to 1 )

1. Importing all the necessary libraries

Now that the package has been successfully installed, we just have to import the Lux library into our Jupyter notebook.
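The import step described above can be sketched as follows. The `try`/`except` guard and the `LUX_AVAILABLE` flag are illustrative additions, not part of the original notebook; they only keep the snippet usable in an environment where Lux has not been installed yet.

```python
import pandas as pd

try:
    # Importing lux is what enables the Lux widget for pandas data frames.
    import lux
    LUX_AVAILABLE = True
except ImportError:
    # Lux is not installed here; pandas itself keeps working as usual.
    LUX_AVAILABLE = False

print("pandas version:", pd.__version__)
print("Lux available:", LUX_AVAILABLE)
```

In a notebook where Lux is installed, the two plain `import` lines are all that is needed.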

2. Loading the data set and checking the concise summary

Let’s load the dataset and check the top 5 rows.

Checking the shape of the data set.

(400, 9)

There are a total of 400 rows and 9 columns.
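The loading and shape check described above could look like the sketch below. The real file is not bundled with this article, so the snippet reads three made-up sample rows from an in-memory string; the values are purely illustrative and the column names are assumed to match the data dictionary. With the actual Kaggle file you would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Made-up sample rows standing in for the real CSV file (illustration only).
csv_text = """Serial No.,GRE Score,TOEFL Score,University Rating,SOP,LOR,CGPA,Research,Chance of Admit
1,330,115,4,4.5,4.0,9.5,1,0.90
2,310,105,3,3.0,3.5,8.2,0,0.70
3,300,100,2,2.5,2.0,7.8,0,0.55
"""

# With the real dataset, replace io.StringIO(csv_text) with the path to the CSV.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # top rows (head() shows at most 5)
print(df.shape)    # (number of rows, number of columns)
```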

Removing the first column, Serial No., and checking the concise summary of the data set with info().

We observe that all 8 remaining columns in the dataset are numeric.
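A minimal sketch of this step, using a small made-up data frame in place of the real dataset (only a few of the columns are reproduced, and the values are illustrative):

```python
import pandas as pd

# Made-up stand-in for the admissions data (illustration only).
df = pd.DataFrame({
    "Serial No.": [1, 2, 3],
    "GRE Score": [330, 310, 300],
    "CGPA": [9.5, 8.2, 7.8],
    "Research": [1, 0, 0],
})

# Drop the identifier column; it carries no information for the analysis.
df = df.drop(columns=["Serial No."])

# Concise summary: column names, non-null counts, and data types.
df.info()
```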

3. Visual Data Exploration with Lux 💡

Let’s now display the data frame and explore the Lux widget.

When the data frame is displayed, Lux by default provides three tabs: Correlation, Distribution, and Occurrence.

Let’s get to know each of these.

1. Correlation

The correlation tab displays the relationship between the quantitative variables present in the dataset.

The charts are ordered from the most correlated pairs to the least correlated pairs.
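To get a feel for this ordering, the sketch below ranks column pairs by absolute Pearson correlation using plain pandas. This is an assumption about the general idea, not Lux's actual scoring code, and the data values are made up.

```python
import pandas as pd

# Made-up values for three quantitative columns (illustration only).
df = pd.DataFrame({
    "GRE Score": [300, 310, 320, 330, 340],
    "CGPA": [7.5, 8.0, 8.4, 9.1, 9.6],
    "SOP": [3.0, 2.5, 4.5, 3.5, 4.0],
})

# Absolute Pearson correlation between every pair of columns.
corr = df.corr().abs()

# Collect each unordered pair once, then sort from strongest to weakest.
cols = list(corr.columns)
pairs = []
for i, first in enumerate(cols):
    for second in cols[i + 1:]:
        pairs.append((first, second, corr.loc[first, second]))
pairs.sort(key=lambda pair: pair[2], reverse=True)

for first, second, strength in pairs:
    print(f"{first} vs {second}: {strength:.3f}")
```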

2. Distribution

The distribution tab displays the histograms of the quantitative variables in the dataset.

The histograms are ordered from the most skewed to the least skewed.
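A comparable ranking can be approximated in pandas with `DataFrame.skew()`. This is only a sketch of the idea with made-up values; how Lux computes its own ordering is not covered by this article.

```python
import pandas as pd

# Made-up values: one symmetric column and one with a long right tail.
df = pd.DataFrame({
    "GRE Score": [300, 310, 320, 330, 340],
    "SOP": [1.0, 1.0, 1.0, 1.5, 5.0],
})

# Rank columns by the magnitude of their sample skewness, largest first.
skew_rank = df.skew().abs().sort_values(ascending=False)
print(skew_rank)
```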

3. Occurrence

The occurrence tab displays the bar charts of the categorical attributes.

The bar charts are ordered from the most uneven distribution to the most even one.

Although our dataset did not contain any features with a categorical data type, Lux still recommended bar charts for those features it judged might be useful for our analysis.
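One plausible explanation, stated here as an assumption rather than as documented behavior, is that numeric columns with only a handful of distinct values (such as Research or University Rating) behave like categories. The sketch below counts distinct values per column on made-up data and tabulates the counts a bar chart would show.

```python
import pandas as pd

# Made-up values (illustration only).
df = pd.DataFrame({
    "University Rating": [1, 2, 3, 3, 3, 4, 5, 3],
    "Research": [1, 0, 0, 1, 1, 1, 0, 1],
    "CGPA": [9.5, 8.2, 7.8, 8.8, 9.1, 8.0, 7.4, 8.6],
})

# Columns with very few distinct values are natural bar-chart candidates.
distinct = df.nunique().sort_values()
print(distinct)

# The frequencies a bar chart of "Research" would display.
counts = df["Research"].value_counts()
print(counts)
```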

4. Visualizations and Recommendations based on user intent.

Let’s say you want to know more about a specific feature, or about several features together. You can get all the visualizations related to those attributes by setting the intent.

The Lux widget not only displays the visualization for the intended feature, it also provides additional recommendations for further analysis through the Enhance and Filter options.

1. Enhance

The Enhance feature of Lux adds one more attribute to the intended attributes specified by the user for visualization.

It lets the user compare the effect of the added attribute to the intended visualization. This is similar to adding a hue.

2. Filter

The filter lets the user visualize the intended attributes for different subsets of the data.
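The two ideas can be mimicked in plain pandas, as a rough analogy rather than a description of what Lux does internally: Enhance brings one more attribute into the comparison, while Filter looks at the same attribute on a subset of the rows. The data values below are made up.

```python
import pandas as pd

# Made-up values (illustration only).
df = pd.DataFrame({
    "CGPA": [9.5, 8.2, 7.8, 8.8, 9.1, 8.0],
    "Research": [1, 0, 0, 1, 1, 0],
})

# "Enhance"-style view: compare CGPA across an added attribute (Research).
cgpa_by_research = df.groupby("Research")["CGPA"].mean()
print(cgpa_by_research)

# "Filter"-style view: look at CGPA only for one subset of the data.
with_research = df[df["Research"] == 1]
without_research = df[df["Research"] == 0]
print(with_research["CGPA"].describe())
print(without_research["CGPA"].describe())
```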

Let’s understand better with the following examples.

Consider one attribute, CGPA:

df.intent = ["CGPA"]
df

1. Enhance Recommendations for one attribute

Enhance Tab recommendations when the intended attribute is CGPA, Source: Image by Author

When the input is the single feature “CGPA”, the Enhance tab fixes “CGPA” on the x-axis and gives us recommendations by comparing it with other attributes.

2. Filter Recommendations for one attribute

Filter Tab recommendations when the intended attribute is CGPA, Source: Image by Author

The Filter tab fixes the intended variable “CGPA” on the x-axis and gives us recommendations by showing it for different subsets of the data set.

Consider two attributes, “TOEFL Score” and “GRE Score”:

df.intent = ["TOEFL Score", "GRE Score"]
df

1. Enhance Recommendations for two attributes

Enhance Tab recommendations when the intended attributes are TOEFL Score and GRE Score, Source: Image by Author

When the input is the two attributes “TOEFL Score” and “GRE Score”, the Enhance tab fixes “TOEFL Score” on the x-axis and “GRE Score” on the y-axis. It then gives us recommendations by comparing them with other attributes.

2. Filter Recommendations for two attributes

Filter Tab recommendations when the intended attributes are TOEFL Score and GRE Score, Source: Image by Author

When the input is the two attributes “TOEFL Score” and “GRE Score”, the Filter tab fixes “TOEFL Score” on the x-axis and “GRE Score” on the y-axis. It then gives us recommendations by showing both together for different subsets of the data.

5. Exporting Visualizations.

Lux makes it very easy to share visualizations. To export them to a static HTML file, use the following command.

df.save_as_html("File name.html")

Conclusion

Lux, the new open-source Python library, makes data exploration a lot easier. This article demonstrated how Lux automated most of our visualizations with minimal code, and it explained some of the library’s prominent features.

Status of Project Lux: Currently, Lux is in its early development stage.

Resources

To learn more about the Lux library, you can find the details at lux-API.

You can also try their hands-on exercises or tutorials on Binder.

Hope you enjoyed reading this article!

Please feel free to check out my other articles at pranaviduvva on Medium.

Thanks for reading!

Speed up EDA With the Intelligent Lux was originally published in Towards AI on Medium.

Published via Towards AI


Author(s): Pranavi Duvva
