
Regression Feature Selection using the Kydavra LassoSelector

Last Updated on September 18, 2020 by Editorial Team

Author(s): Vasile Păpăluță

Machine Learning

This image was created by the public association “Sigmoid”

We all know Occam’s razor:

From a set of solutions, take the one that is the simplest.

This principle is applied in the regularization of linear models in Machine Learning. L1-regularisation (also known as LASSO) tends to shrink the weights of the linear model to 0, while L2-regularisation (known as Ridge) tends to keep the overall complexity as low as possible by minimizing the norm of the model’s weight vector. One of Kydavra’s selectors uses Lasso to select the best features. So let’s see how to apply it.
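For reference, the two penalties can be written out as objectives. The article does not state them, so the scaling of the squared-error term below is an assumption: it follows the convention used by scikit-learn's Lasso and Ridge estimators. Here X is the feature matrix, y the target, w the weight vector, n the number of samples, and α the regularization strength.

```latex
\begin{align*}
\text{Lasso (L1):}\quad & \min_{w}\; \frac{1}{2n}\,\lVert y - Xw \rVert_2^2 + \alpha\,\lVert w \rVert_1 \\
\text{Ridge (L2):}\quad & \min_{w}\; \lVert y - Xw \rVert_2^2 + \alpha\,\lVert w \rVert_2^2
\end{align*}
```

The L1 term penalizes every weight at the same rate no matter how small it already is, which is why Lasso can push weights to exactly 0, while the squared L2 term penalizes small weights less and less and so only shrinks them.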

Using Kydavra LassoSelector.

If you still haven’t installed Kydavra, just type the following in the command line.

pip install kydavra

Next, we need to import the model, create the selector, and apply it to our data:

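from kydavra import LassoSelector
# Create the selector with its default parameters
selector = LassoSelector()
# df is a pandas DataFrame; 'target' is the name of the target column
selected_cols = selector.select(df, 'target')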

The select function takes as parameters the pandas DataFrame and the name of the target column. It also has an optional parameter ‘cv’ (set to 5 by default), which is the number of folds used in cross-validation. LassoSelector() takes the following parameters:

  • alpha_start (float, default = 0) the starting value of alpha.
  • alpha_finish (float, default = 2) the final value of alpha. These two parameters define the search space of the algorithm.
  • n_alphas (int, default = 300) the number of alphas that will be tested during the search.
  • extend_step (int, default = 20) if the algorithm deduces that the optimal value of alpha is alpha_start or alpha_finish, it extends the search range by extend_step, so that it does not get stuck at the edge and eventually finds the optimal value.
  • power (int, default = 2) used in the formula 10^-power, which defines the largest value that is still treated as 0.

So after finding the optimal value of alpha, the algorithm simply checks which weights are higher than 10^-power (0.01 with the default power = 2).
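The search space and the final thresholding step described above can be sketched in plain Python. This is not Kydavra's actual code: the helper names `alpha_grid` and `keep_features`, the even spacing of the grid, the comparison on absolute values, and the example weights are all assumptions made for illustration.

```python
def alpha_grid(alpha_start=0.0, alpha_finish=2.0, n_alphas=300):
    """Return n_alphas evenly spaced candidates from alpha_start to alpha_finish.

    Even spacing is an assumption; requires n_alphas >= 2.
    """
    step = (alpha_finish - alpha_start) / (n_alphas - 1)
    return [alpha_start + i * step for i in range(n_alphas)]


def keep_features(weights, power=2):
    """Keep the features whose absolute weight is higher than 10 ** -power."""
    threshold = 10 ** -power
    return [name for name, w in weights.items() if abs(w) > threshold]


# Made-up weights, only to show the thresholding; not results from the article.
example_weights = {"type": 0.48, "year": 0.05, "Total Volume": 0.0, "4046": -0.0004}

grid = alpha_grid()
selected = keep_features(example_weights, power=2)
print(len(grid), selected)
```

With a larger power the threshold gets smaller, so more features with tiny weights survive the selection.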

Let’s see an example:

To show its performance, I chose the Avocado Prices dataset.

After a bit of cleaning, I trained a LinearRegression model on the following features:

'Total Volume', '4046', '4225', '4770', 'Small Bags', 'Large Bags', 'XLarge Bags', 'type', 'year'

This LinearRegression model has a mean absolute error (MAE) of 0.2409683103736682.

When LassoSelector is applied to this dataset, it chooses the following features:

'type', 'year'

Using only these features, we got an MAE of 0.24518692823037008.

Quite a good result (keep in mind, we are using only 2 features).
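To put the two numbers side by side, here is a small sketch of how MAE is computed and how much it grew after the selection. The function name `mean_absolute_error` mirrors the usual metric name but is defined here by hand; only the two MAE values come from the article.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# MAE values reported in the article
mae_all_features = 0.2409683103736682
mae_selected_features = 0.24518692823037008

# Relative increase of the error after dropping 7 of the 9 features
relative_increase = (mae_selected_features - mae_all_features) / mae_all_features
print(f"MAE increase: {relative_increase:.2%}")
```

The error grows by roughly 1.75% while the number of features drops from 9 to 2.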

Note: It is sometimes recommended to apply the lasso on scaled data. In this case, when applied to the scaled data, the selector didn’t throw away any features. You are invited to experiment with both scaled and unscaled values.
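A toy sketch of why scale matters for this kind of selection: it fits a lasso on a single feature, without an intercept, using the closed-form solution of the objective (1/(2n))·Σ(y − w·x)² + α·|w|. The data, the helper names, and the choice alpha = 1.0 are made up for illustration; this is not Kydavra's code and not the Avocado data.

```python
def soft_threshold(value, alpha):
    """Shrink value towards 0 by alpha, returning 0.0 if it lies within [-alpha, alpha]."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0


def lasso_weight_1d(x, y, alpha):
    """Closed-form lasso weight for one feature and no intercept."""
    n = len(x)
    covariance = sum(xi * yi for xi, yi in zip(x, y)) / n
    scale = sum(xi * xi for xi in x) / n
    return soft_threshold(covariance, alpha) / scale


y = [2.0, 4.0, 6.0, 8.0]
x_raw = [1000.0, 2000.0, 3000.0, 4000.0]  # large raw values, e.g. volumes
x_scaled = [v / 1000 for v in x_raw]      # same information, smaller units

threshold = 10 ** -2  # the 10^-power cut-off with power = 2
w_raw = lasso_weight_1d(x_raw, y, alpha=1.0)
w_scaled = lasso_weight_1d(x_scaled, y, alpha=1.0)

print(abs(w_raw) > threshold, abs(w_scaled) > threshold)
```

Both versions of the feature predict y equally well, but the raw one gets a weight of about 0.002, which falls under the 0.01 cut-off, while the rescaled one gets a weight of about 1.87 and is kept.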

Bonus.

This module also has a plotting function. After applying the select function, you can see why the selector kept some features and discarded others. To plot, just type:

selector.plot_process()
This is the plot created by Kydavra LassoSelector on the Avocado Prices dataset

The dotted lines are the features that were thrown away because their weights were too close to 0. The central vertical dotted line marks the optimal value of alpha found by the algorithm.

The plot_process() function has the following parameters:

  • eps (float, default = 5e-3) the length of the path.
  • title (string, default = ‘Lasso coef Plot’) the title of the plot.
  • save (boolean, default = False) if set to true, it will try to save the plot.
  • file_path (string, default = None) if the save parameter was set to true, it will save the plot using this path.

Conclusion

LassoSelector is a selector that uses the LASSO algorithm to select the most useful features. Scaling the features is sometimes useful, so we highly recommend trying both scaled and unscaled data.


If you tried kydavra, we invite you to share your impressions by filling out this form.

Made with ❤ by Sigmoid.



Regression Feature Selection using the Kydavra LassoSelector was originally published in Towards AI - Multidisciplinary Science Journal on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
