Boxes, Violins and Contours Conclude the Exploratory Data Analysis Process.
Author(s): Chandra Prakash Bathula
Originally published on Towards AI.
Photo by Stefany Andrade on Unsplash
Working with box plots, violin plots and contour plots reveals a lot about the data before machine learning modeling.
Welcome back to the wrap-up article on the prerequisites of ML modeling.
Previously in this series, we discussed plotting in the first article and densities and deviation in the second. Now we can continue with the rest of the concepts.
Let's start with an alternative to the mean.
In place of the mean we can use the "median," which is simply the middle value of the array once it is sorted.
print(np.median(iris_setosa["petal_length"]))
# Median with an outlier
print(np.median(np.append(iris_setosa["petal_length"], 50)))
print(np.median(iris_virginica["petal_length"]))
print(np.median(iris_versicolor["petal_length"]))
The median for Setosa is 1.5; with the outlier added, it is still 1.5. The mean, by contrast, changes significantly when the outlier is added. For Virginica and Versicolor, the medians are 5.55 and 4.35, which are close to their means.
Intuitively, the median is very similar to the mean as a measure of central tendency, but it has the nice property that one or a few extreme points do not significantly affect its value.
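To make the robustness claim concrete, here is a minimal sketch that compares how the mean and the median react to a single outlier. The five values below are hypothetical petal-length-like numbers chosen for illustration, not the actual Setosa data, and the outlier of 50 mirrors the one used above.

```python
import numpy as np

# Hypothetical measurements (assumption: illustrative values, not the iris data)
values = np.array([1.4, 1.5, 1.3, 1.5, 1.6])

# Same measurements with one extreme point appended
values_with_outlier = np.append(values, 50)

mean_before = np.mean(values)
mean_after = np.mean(values_with_outlier)
median_before = np.median(values)
median_after = np.median(values_with_outlier)

print("mean:  ", round(mean_before, 2), "->", round(mean_after, 2))
print("median:", median_before, "->", median_after)
```

With these values the median stays at 1.5, while the mean moves from about 1.46 to about 9.55, because the mean weighs every point by its magnitude and the median depends only on the ordering.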
Let's say we have seven observations, X = {1.1, 1.2, 1, 1.2, 1.6, 2.1, 1.8}; they could be any measurements, such as sepal length or petal length.
The median can be computed as follows:
Step 1:…
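As a rough sketch of the usual textbook definition (sort the values, then take the middle one, or the average of the two middle ones when the count is even), the median of X can be computed as below. The helper name median_by_sorting is hypothetical, and this is not necessarily the exact sequence of steps the article goes on to describe.

```python
def median_by_sorting(values):
    """Return the median by sorting and picking the middle element(s)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sequence is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle element is the median
        return ordered[mid]
    # Even count: average the two middle elements
    return (ordered[mid - 1] + ordered[mid]) / 2


# The seven observations from the text
X = [1.1, 1.2, 1, 1.2, 1.6, 2.1, 1.8]

print(sorted(X))
print(median_by_sorting(X))
```

Sorting X gives 1, 1.1, 1.2, 1.2, 1.6, 1.8, 2.1, so the fourth value, 1.2, is the median. In practice np.median performs the same computation on an array.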