

Python Prior Machine Learning Part 2 & Data Analysis

Last Updated on January 6, 2023 by Editorial Team

Last Updated on July 16, 2022 by Editorial Team

Author(s): Gencay I.

Originally published on Towards AI.

Machine Learning Prior Part 2 & Data Analysis

Data Frame Analysis with Python

Photo by Markus Spiske on Unsplash
Content Table
· Introduction
∘ Installation
· How to gather your data?
∘ Example
· How long your data frame is? What are the column data types? How can I look at a little bit of my data?
∘ Info
∘ Shape
∘ Sample
∘ Head
∘ Tail
∘ Describe
∘ Value Counts
· How to select your pre-defined row?
· How to select multiple columns?
∘ First Two Columns
∘ Select Columns with Name
∘ Select Column with their Indexes
· How can I sort the values?
· How can I look at the mean/standard deviation/max of one column per its categories?
· How can I drop the NA Values?
· Conclusion

Introduction

Welcome to another machine learning tutorial. My goal here is to be brief, and having read many articles myself, I have thought about how to explain things as concisely as possible. In this one, I will explain the pandas library through questions and their answers.

Installation

Now let's begin with the installation process.

Here is the main page of the pandas library.

Use pip or conda, depending on your set-up.

pip install pandas
conda install pandas

Now it's time to import the package.

import pandas as pd

How to gather your data?

Now it is time to download your data.

CSV is the most commonly used file type when working with pandas.

url = " "
col = [ ]
df = pd.read_csv(url)
  • url: the URL your data will be downloaded from.
  • col: the columns you want to see.
  • df: your data frame.

Here is the documentation of this method, where you can find more code examples.
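As a minimal sketch of the template above, the following reads a small CSV and keeps only some of its columns. The CSV text and the column names are made up for illustration, and passing the column list through `usecols` is an assumption about how `col` is meant to be used; an in-memory string stands in for the URL so the example runs without a network connection.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would come from your URL or file.
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""

# Columns to keep (assumed names for this sketch).
col = ["sepal_length", "petal_length", "species"]

# read_csv accepts a URL, a file path, or a file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text), usecols=col)
print(df)
```

With a real data set, you would pass the URL string instead of the `StringIO` object.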

Example

Now let's look at a real-life example.

The Iris data set is a really famous one; you can download it by using the sklearn datasets module, seaborn, or a URL.

This is perhaps the best known database to be found in the pattern recognition literature. Fisher's paper is a classic in the field and is referenced frequently to this day. (See Duda & Hart, for example.) The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are NOT linearly separable from each other.

Predicted attribute: class of iris plant.

Here are the remaining details of this dataset.

Now, let's run our code on that dataset:

Image by Author

How long your data frame is? What are the column data types? How can I look at a little bit of my data?

Info

It gives your column data types.

df.info()
Your column data types.
Image by Author

Shape

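It gives the dimensions of your DataFrame as a (rows, columns) tuple. Note that shape is an attribute, not a method, so it is used without parentheses.

df.shape
Shape of your df.
Image by Author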

Sample

It gives "n" random samples from your DataFrame.

df.sample(5)
5 random samples of your df.
Image by Author

Head

Looks at the first "n" rows of your DataFrame.

df.head(5)
Looking at the first 5 rows of your df.
Image by Author

Tail

Looks at the last "n" rows of your DataFrame.

df.tail(5)
Looking at the last 5 rows of your df.
Image by Author

Describe

Shows summary statistics of the numerical features.

df.describe()
Shows a summary of the numerical features.
Image by Author

Value Counts

Counts how many times each unique value occurs in a column, which is useful for categorical columns.

df["Column"].value_counts()
Counts of each unique value in the column.
Image by Author
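To see these inspection methods side by side, here is a small sketch on a hand-made frame. The column names and values are invented for illustration and are not the real Iris data.

```python
import pandas as pd

# Tiny hand-made frame standing in for the Iris data (values are illustrative).
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

df.info()                             # column names, dtypes, non-null counts
print(df.shape)                       # (rows, columns); attribute, no parentheses
print(df.head(2))                     # first two rows
print(df.tail(2))                     # last two rows
print(df.sample(3, random_state=0))   # three random rows, seeded for repeatability
print(df.describe())                  # summary statistics of numeric columns

# Number of rows per category in the "species" column.
counts = df["species"].value_counts()
print(counts)
```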

How to select your pre-defined row?

By using the loc method.

Suppose you want to see the Iris-virginica class, and the Iris-virginica class only.

In addition to that, if you want the sepal length to be greater than five and the petal length to be less than five, your code will look like this:

Image by Author
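Since the original code was shown as an image, here is a hedged sketch of what such a selection could look like with `loc` and boolean conditions. The column names (`sepal_length`, `petal_length`, `class`) and the sample rows are assumptions made for this example.

```python
import pandas as pd

# Illustrative rows; column names are assumed, not taken from the original.
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 6.3, 5.8, 7.1],
    "petal_length": [1.4, 4.5, 6.0, 4.9, 5.9],
    "class": ["Iris-setosa", "Iris-versicolor", "Iris-virginica",
              "Iris-virginica", "Iris-virginica"],
})

# Rows of the Iris-virginica class only.
virginica = df.loc[df["class"] == "Iris-virginica"]

# Combine conditions with & and wrap each one in parentheses.
filtered = df.loc[
    (df["class"] == "Iris-virginica")
    & (df["sepal_length"] > 5)
    & (df["petal_length"] < 5)
]
print(virginica)
print(filtered)
```

Each condition needs its own parentheses because `&` binds more tightly than the comparison operators.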

How to select multiple columns?

First Two Columns

Here the first ":" means all rows, and 0:2 means start at the first column (index 0) and stop before the third column (index 2), so the third column is not selected.

Image by Author
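The slicing described above matches position-based indexing with `iloc`. The original screenshot is not available, so the following is a sketch under that assumption, with invented column names.

```python
import pandas as pd

df = pd.DataFrame({
    "sepal_length": [5.1, 7.0, 6.3],
    "sepal_width": [3.5, 3.2, 3.3],
    "petal_length": [1.4, 4.7, 6.0],
})

# ":" selects every row; 0:2 selects column positions 0 and 1 (2 is excluded).
first_two = df.iloc[:, 0:2]
print(first_two)
```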

Select Columns with Name

By passing a list of column names inside the indexing brackets, which gives two pairs of brackets.

Image by Author
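A minimal sketch of name-based selection, assuming illustrative column names: the outer brackets index the DataFrame, and the inner brackets form the list of names.

```python
import pandas as pd

df = pd.DataFrame({
    "sepal_length": [5.1, 7.0, 6.3],
    "sepal_width": [3.5, 3.2, 3.3],
    "class": ["Iris-setosa", "Iris-versicolor", "Iris-virginica"],
})

# A list of names returns a DataFrame with just those columns, in that order.
subset = df[["sepal_length", "class"]]
print(subset)
```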

Select Column with their Indexes

Selecting the first and third columns by their index positions:

Image by Author
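One way to do this, assuming position-based indexing with `iloc` and made-up column names, is to pass a list of column positions:

```python
import pandas as pd

df = pd.DataFrame({
    "sepal_length": [5.1, 7.0, 6.3],
    "sepal_width": [3.5, 3.2, 3.3],
    "petal_length": [1.4, 4.7, 6.0],
})

# Positions are zero-based: 0 is the first column and 2 is the third.
first_and_third = df.iloc[:, [0, 2]]
print(first_and_third)
```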

How can I sort the values?

Image by Author
  • by = the column you want to sort by

If you want the order to be reversed, then you should add the following argument:

Image by Author

For more, visit here
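The screenshots are missing here, so this sketch assumes the method in question is `sort_values` and that the extra argument is `ascending=False`. The data is invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "sepal_length": [5.8, 5.1, 7.1],
    "class": ["Iris-virginica", "Iris-setosa", "Iris-virginica"],
})

# Default order is ascending (smallest value first).
asc = df.sort_values(by="sepal_length")

# ascending=False reverses the order (largest value first).
desc = df.sort_values(by="sepal_length", ascending=False)
print(asc)
print(desc)
```

Note that `sort_values` returns a new DataFrame and leaves `df` unchanged.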

How can I look at the mean/standard deviation/max of one column per its categories?

Now, if you want to be a good programmer, you should start reading documentation today.

Here is the explanation:

A group by operation involves some combination of splitting the object, applying a function, and combining the results. This can be used to group large amounts of data and compute operations on these groups.

To look up the arguments of this method, visit here and start reading the documentation of this library.
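To make the split-apply-combine idea concrete, here is a small sketch that computes the mean, standard deviation, and maximum of one column per category. The column names and values are assumptions chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "class": ["Iris-setosa", "Iris-setosa",
              "Iris-virginica", "Iris-virginica"],
    "petal_length": [1.4, 1.6, 5.0, 6.0],
})

# Split by class, apply three aggregations to petal_length, combine into a table.
stats = df.groupby("class")["petal_length"].agg(["mean", "std", "max"])
print(stats)
```

The result has one row per class and one column per aggregation.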

How can I drop the NA Values?

There are many approaches to handling NA values.

If your dataset is small and you do not want to lose data, you can fill the NA values with the mean of the column instead. Otherwise, you can drop the rows that contain NA values:

df.dropna()

I cannot give a real-life example here because my dataset does not contain NA values; however, as I mentioned earlier, it is good practice to read the library documentation.

You can find other examples here, in the official documentation.
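Because the Iris data has no missing values, the following sketch uses a small invented frame to contrast the two approaches mentioned above: dropping rows with `dropna` and filling with the column mean via `fillna`.

```python
import pandas as pd

nan = float("nan")

# Invented data with one missing value in each column.
df = pd.DataFrame({
    "sepal_length": [5.0, nan, 7.0],
    "petal_length": [1.0, 4.0, nan],
})

# Option 1: drop every row that contains at least one NA value.
dropped = df.dropna()

# Option 2: keep all rows and replace each NA with its column's mean.
filled = df.fillna(df.mean())

print(dropped)
print(filled)
```

Dropping keeps only the first row here, while filling keeps all three rows, which is why filling can be preferable for small data sets.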

Conclusion

I tried to be as brief as I can.

Although there are many other methods that may help you along your machine learning journey, I think this prior knowledge is enough to launch your first machine learning model.

In addition, thank you for your support of my previous articles; your reactions really motivate me to keep writing tutorials and articles.

If you want to be notified about my upcoming articles via e-mail, subscribe here:

Get an email whenever Gencay I. publishes.

As I mentioned before, I am preparing an e-book. In it, I plan to explain all of these concepts in detail, not briefly this time, with real-life explanations and datasets.

“Machine learning is the last invention that humanity will ever need to make.” Nick Bostrom

Thanks.


Python Prior Machine Learning Part 2 & Data Analysis was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.


Published via Towards AI
