Speed up EDA With the Intelligent Lux
Data Analysis

Last Updated on January 6, 2023 by Editorial Team

Author(s): Pranavi Duvva

Image by Colin Behrens from Pixabay

Automate your visual data exploration with the new Python library, Lux 💡.

Have you ever been tired of writing multiple lines of code for even a simple graph during EDA?

Have you ever wished for recommendation-based interactive graphs within the Jupyter notebook itself?

If the answer is a big yes, then thankfully we now have the new Python library, Lux.

This article is based on Doris Jung-Lin Lee's session at WiCDS 2021.

Lux is a Python API for intelligent visual discovery that comes with a built-in interactive Jupyter widget.

  • Lux can act as an intelligent assistant that automates the visual aspects of exploratory data analysis.
  • It provides powerful visualization abstractions as soon as the dataframe is displayed in the Jupyter notebook, with just a click.
  • Lux provides a rich language for expressing user intent.

The main goal of the Lux library is to make visualizing data as simple as loading a dataframe.

The interactive Lux widget helps the user quickly browse through the data and view important trends and patterns. It provides recommendations for the user to analyze further. Lux can also create visualizations for parts of the data that you do not yet have a clear idea about.

Source: Image by Author

Lux works well with pandas, and you do not have to modify your existing code. In fact, Lux was designed to preserve pandas dataframe semantics, so a dataframe keeps responding to the usual pandas instructions in the same way.

That's awesome, right?

Let's get started and bring in our intelligent visual assistant powered by Lux.

Installation requirements

1. Lux can be installed through PyPI:

pip install lux-api

2. If you use conda, Lux can be installed with:

conda install -c conda-forge lux-api

3. To set up the widget in the Jupyter notebook, you also need to install and enable the following extension:

jupyter nbextension install --py luxwidget
jupyter nbextension enable --py luxwidget

That's it! We are ready to go.

Case Study

Let's consider an example dataset to explore the features of the Lux library.

I will be using the Graduate Admission dataset from the Kaggle data repository.

This dataset contains several parameters that are considered important when applying for Master's programs.

Data Dictionary

  1. GRE Scores (out of 340)
  2. TOEFL Scores (out of 120)
  3. University Rating (out of 5)
  4. Statement of Purpose and Letter of Recommendation Strength (out of 5)
  5. Undergraduate GPA (out of 10)
  6. Research Experience (either 0 or 1)
  7. Chance of Admit (ranging from 0 to 1)

1. Importing all the necessary libraries

Now that the package has been installed, we just have to import the Lux library into our Jupyter notebook.
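The original code cell is not reproduced in this copy of the article, so here is a minimal sketch of the import step. It assumes pandas is installed; the try/except guard is an addition for illustration, so that the cell still runs in an environment where Lux is missing.

```python
import pandas as pd

try:
    # Importing lux is what enables the Lux widget for pandas dataframes.
    import lux
except ImportError:
    # Assumption for this sketch: fall back to plain pandas if Lux is absent.
    lux = None

print("Lux available:", lux is not None)
```

In a notebook where Lux is installed, the guard is unnecessary and the two import lines are enough.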

2. Loading the dataset and checking the concise summary

Let's load the dataset and check the top 5 rows.

Source: Image by Author

Checking the shape of the dataset:

(400, 9)

There are a total of 400 rows and 9 columns.
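As a rough sketch of the loading step, the snippet below reads a small in-memory CSV instead of the real file. The six rows are made-up values for illustration only, the column names are assumed to match the ones used later in this article (for example CGPA, GRE Score, and TOEFL Score), and the file name in the comment is an assumption as well.

```python
import io

import pandas as pd

# Hypothetical stand-in for the Kaggle file: every value below is made up.
sample_csv = io.StringIO(
    "Serial No.,GRE Score,TOEFL Score,University Rating,SOP,LOR,CGPA,Research,Chance of Admit\n"
    "1,330,115,5,4.5,4.0,9.5,1,0.90\n"
    "2,315,105,3,3.5,3.0,8.4,0,0.70\n"
    "3,300,100,2,2.5,3.0,7.8,0,0.55\n"
    "4,325,112,4,4.0,4.5,9.0,1,0.85\n"
    "5,310,104,3,3.0,3.5,8.2,0,0.65\n"
    "6,320,110,4,4.0,3.5,8.8,1,0.80\n"
)

# With the real data this would be something like:
# df = pd.read_csv("Admission_Predict.csv")  # file name is an assumption
df = pd.read_csv(sample_csv)

print(df.head())  # first five rows
print(df.shape)   # (number of rows, number of columns)
```

With the full dataset, the same two calls produce the five-row preview and the (400, 9) shape reported above.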

Next, we remove the first column, Serial No., and check the concise summary of the dataset with info().

Source: Image by Author

We observe that all 8 remaining columns in the dataset are numeric.
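The clean-up step can be sketched as follows. The tiny dataframe is made up for illustration and only mimics the layout of the admission data; the final numeric check is an extra added here, not part of the original notebook.

```python
import pandas as pd

# Tiny made-up frame with the same kind of columns as the admission data.
df = pd.DataFrame(
    {
        "Serial No.": [1, 2, 3],
        "GRE Score": [330, 315, 300],
        "CGPA": [9.5, 8.4, 7.8],
        "Research": [1, 0, 0],
    }
)

# Drop the identifier column, which carries no information for the analysis.
df = df.drop(columns=["Serial No."])

# Concise summary: column names, non-null counts, and dtypes.
df.info()

# Programmatic check that every remaining column is numeric.
all_numeric = all(pd.api.types.is_numeric_dtype(dtype) for dtype in df.dtypes)
print(all_numeric)
```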

3. Visual Data Exploration with Lux 💡

Let's now display the dataframe and explore the Lux widget.

Source: Image by Author

When the dataframe is displayed, Lux provides three tabs by default: Correlation, Distribution, and Occurrence.

Let's get to know each of these.

1. Correlation

Source: Image by Author

The Correlation tab displays the relationships between the quantitative variables in the dataset.

The charts are ordered from the most correlated pairs of variables to the least correlated.

Source: Image by Author
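To make the ordering concrete, here is a plain-Python sketch that ranks column pairs by the absolute value of their Pearson correlation. It only illustrates the idea of a most-to-least-correlated ordering; it is not Lux's internal scoring code, and the values are made up.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up values, for illustration only.
columns = {
    "GRE Score": [300, 310, 320, 330, 340],
    "CGPA": [7.5, 8.0, 8.4, 9.1, 9.6],
    "Research": [0, 1, 0, 1, 1],
}

# Score every pair of columns and sort from most to least correlated.
ranked_pairs = sorted(
    combinations(columns, 2),
    key=lambda pair: abs(pearson(columns[pair[0]], columns[pair[1]])),
    reverse=True,
)

for a, b in ranked_pairs:
    print(f"{a} vs {b}: {pearson(columns[a], columns[b]):.2f}")
```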

2. Distribution

Source: Image by Author

The Distribution tab displays histograms of the quantitative variables in the dataset.

The histograms are ordered from the most skewed to the least skewed.

Source: Image by Author
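A most-to-least-skewed ordering can be sketched in plain Python as below. Using the absolute population skewness as the score is an assumption made for this illustration, not a statement about how Lux computes its ranking, and the values are made up.

```python
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mu, sigma = mean(values), pstdev(values)
    return mean([((v - mu) / sigma) ** 3 for v in values])


# Made-up values, for illustration only.
columns = {
    "Chance of Admit": [0.45, 0.50, 0.52, 0.55, 0.95],  # long right tail
    "TOEFL Score": [100, 105, 110, 115, 120],           # symmetric
    "CGPA": [7.0, 8.6, 8.8, 9.0, 9.1],                  # long left tail
}

# Sort column names from the most skewed distribution to the least skewed.
ranked = sorted(columns, key=lambda name: abs(skewness(columns[name])), reverse=True)

for name in ranked:
    print(f"{name}: skewness {skewness(columns[name]):+.2f}")
```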

3. Occurrence

Source: Image by Author

The Occurrence tab displays bar charts of the categorical attributes.

The bar charts are ordered from the most uneven distribution to the most even.

Although our dataset does not contain any features with a categorical data type, Lux still recommends bar charts for the features that it judges might be useful for our analysis.
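One simple way to think about "unevenness" is how far the category shares are from a perfectly uniform split. The sketch below uses that definition, which is an assumption chosen for illustration rather than Lux's own measure, on made-up values.

```python
from collections import Counter


def unevenness(values):
    """Largest gap between a category's share and the uniform share."""
    counts = Counter(values)
    uniform_share = 1 / len(counts)
    return max(abs(count / len(values) - uniform_share) for count in counts.values())


# Made-up values, for illustration only.
columns = {
    "University Rating": [1, 2, 3, 4, 1, 2, 3, 4],  # perfectly even
    "Research": [1, 1, 1, 1, 1, 1, 0, 0],           # 75% versus 25%
}

# Sort column names from the most uneven distribution to the most even.
ranked = sorted(columns, key=lambda name: unevenness(columns[name]), reverse=True)

for name in ranked:
    print(f"{name}: unevenness {unevenness(columns[name]):.2f}")
```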

4. Visualizations and recommendations based on user intent

Let's say you want to know more about a specific feature, or about multiple features together. You can get all the visualizations related to those attributes with the help of intent.

The Lux widget not only displays the visualization for the intended feature but also provides additional recommendations for further analysis through the Enhance and Filter tabs.

1. Enhance

The Enhance feature of Lux adds one more attribute to the attributes the user specified in the intent.

It lets the user see how the added attribute affects the intended visualization. This is similar to adding a hue.

2. Filter

The Filter feature lets the user visualize the intended attributes for different subsets of the data.
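The two ideas can be mimicked in a few lines of plain Python. This is only a conceptual sketch on made-up rows: it shows what "add one attribute" and "restrict to a subset" mean, not how Lux generates its recommendations.

```python
# Made-up rows, for illustration only.
rows = [
    {"CGPA": 9.5, "GRE Score": 330, "Research": 1},
    {"CGPA": 8.4, "GRE Score": 315, "Research": 0},
    {"CGPA": 7.8, "GRE Score": 300, "Research": 0},
    {"CGPA": 9.0, "GRE Score": 325, "Research": 1},
]

intent = ["CGPA"]
other_columns = [name for name in rows[0] if name not in intent]

# Enhance: keep the intent and add one extra attribute per candidate chart.
enhance_candidates = [intent + [extra] for extra in other_columns]

# Filter: keep the intent and restrict the rows to one value of another column.
filter_candidates = {
    f"Research = {value}": [row["CGPA"] for row in rows if row["Research"] == value]
    for value in sorted({row["Research"] for row in rows})
}

print(enhance_candidates)
print(filter_candidates)
```

Each entry in enhance_candidates corresponds to one chart with an extra attribute, while each entry in filter_candidates corresponds to one chart of CGPA drawn for a subset of the rows.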

Let's understand this better with the following examples.

Consider one attribute, CGPA:

df.intent = ["CGPA"]
df

1. Enhance recommendations for one attribute

Enhance tab recommendations when the intended attribute is CGPA, Source: Image by Author

When the intent is the single feature "CGPA", the Enhance tab fixes "CGPA" on the x-axis and recommends charts that compare it with each of the other attributes.

2. Filter recommendations for one attribute

Filter tab recommendations when the intended attribute is CGPA, Source: Image by Author

The Filter tab also fixes "CGPA" on the x-axis and recommends charts of it for different subsets of the dataset.

Consider two attributes, "TOEFL Score" and "GRE Score":

df.intent = ["TOEFL Score", "GRE Score"]
df

1. Enhance recommendations for two attributes

Enhance tab recommendations when the intended attributes are TOEFL Score and GRE Score, Source: Image by Author

When the intent contains the two attributes "TOEFL Score" and "GRE Score", the Enhance tab fixes "TOEFL Score" on the x-axis and "GRE Score" on the y-axis. It then recommends charts that bring in each of the other attributes for comparison.

2. Filter recommendations for two attributes

Filter tab recommendations when the intended attributes are TOEFL Score and GRE Score, Source: Image by Author

The Filter tab likewise fixes "TOEFL Score" on the x-axis and "GRE Score" on the y-axis. It then recommends charts of the pair for different subsets of the data.

Source: Image by Author

5. Exporting visualizations

Lux makes it very easy to share visualizations. To export them to a static HTML file, use the following command:

df.save_as_html("File name.html")

Conclusion

Lux, the new open-source Python library, makes data exploration a lot easier. This article has demonstrated how Lux automated most of our visualizations with very little code. It also explained some of the prominent features of the Lux library.

Status of Project Lux: Currently, Lux is in its early development stage.

Resources

To learn more about the Lux library, you can find the details at lux-API.

You can also try their hands-on exercises or tutorials on Binder.

I hope you enjoyed reading this article!

Please feel free to check out my other articles by pranaviduvva on Medium.

Thanks for reading!


Speed up EDA With the Intelligent Lux was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
