The Misuse of Statistics

Last Updated on June 22, 2020 by Editorial Team

Statistics can be misused, because it is possible to lie with statistics

“Politicians use statistics in the same way that a drunk uses lamp-posts: for support rather than illumination” - A. Lang

Statistics is the primary tool for assessing relationships and evaluating study questions, since it can reveal the truth underlying unbiased data. Unfortunately, its tools are often misused, either inadvertently, through ignorance or lack of planning, or deliberately, to reach some particular target or result. In this era of Big Data, statistical methods are vital because they not only summarize and analyze data but also support interpretation and the prediction of future consequences. These consequences (social, economic, and so on) are one of the key reasons a politician or political party might distort data and statistical analysis for their own ends. A fact or conclusion can be altered in many ways by applying statistical techniques carelessly or with deliberate intent.

Consider a locality L that used to be ruled by party A. In a certain election, party B defeated party A, and party B has ruled L ever since. With the next election approaching, party B wants to show that its performance has been better than party A's, because it obviously does not want to lose. Party B wants the inhabitants of L to have a good impression of it before this vital election, so it will use data and statistics in whatever way turns out in its favor. The inhabitants of L were asked to mark the performances of party A and party B on a scale of 6 (on the real line), and each party was marked according to the inhabitants' satisfaction with its performance. Now suppose party B claims that people are more satisfied with it, so that the overall satisfaction level is higher for party B than for party A.

Here statistics comes in to assess the claim raised by party B. Let X denote the satisfaction mark given to party A by a randomly selected inhabitant of the locality, and let Y denote the mark the same inhabitant gives to party B. Because both marks come from the same person, (X, Y) forms paired data. Two standard tests, Fisher's t-test (the two-sample t-test for independent samples) and the paired t-test, can be used to test a hypothesis about the equality of means. The paired t-test is the usual choice for bivariate (paired) data, whereas Fisher's t-test is meant for the case where the two variables of interest are independent. In this context, therefore, the paired t-test is the appropriate test for the hypothesis. But what happens if one applies Fisher's t-test instead of the paired t-test to n data points on X and Y (a sample of size n)?

Suppose n = 10, and the ten pairs of data on (X, Y) are:

X: 1.77, 5.68, 0.07, 2.26, 2.60, 3.12, 3.56, 1.04, 2.68, 3.10

Y: 1.45, 3.93, -0.03, 1.01, 3.20, 2.02, 1.57, -0.61, 2.68, 2.43

Here we test H0: μ1 = μ2 against H1: μ1 > μ2, where μ1 and μ2 denote the mean satisfaction level of the inhabitants with the performance of party A and party B, respectively.
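For reference, the two test statistics can be written out as follows. This is a sketch under one assumption: that "Fisher's t-test" here means the pooled-variance two-sample t-test. That reading is consistent with the critical values used in this example, since 1.833 and 1.734 are the upper 5% points of the t distribution with 9 and 18 degrees of freedom. The symbols d_i, s_d, s_X, s_Y, s_p, and ρ are introduced only for this illustration.

```latex
\begin{aligned}
d_i &= X_i - Y_i, \qquad
\bar d = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad
s_d^2 = \frac{1}{n-1}\sum_{i=1}^{n} (d_i - \bar d)^2, \\
T_1 &= \frac{\bar d}{s_d/\sqrt{n}} \;\sim\; t_{n-1} \quad \text{under } H_0, \\
s_p^2 &= \frac{(n-1)\,s_X^2 + (n-1)\,s_Y^2}{2n-2} = \frac{s_X^2 + s_Y^2}{2}, \\
T_2 &= \frac{\bar X - \bar Y}{s_p\sqrt{2/n}} \;\sim\; t_{2n-2} \quad \text{under } H_0 \text{, if } X \text{ and } Y \text{ are independent}, \\
\operatorname{Var}(\bar d) &= \frac{\sigma_X^2 + \sigma_Y^2 - 2\rho\,\sigma_X\sigma_Y}{n}.
\end{aligned}
```

Both statistics have the same numerator, because the mean of the differences equals the difference of the means. They differ only in the denominator. When X and Y are positively correlated (ρ > 0), the pooled standard error ignores the term −2ρσ_Xσ_Y and therefore overstates the variability of the mean difference, which pulls T2 towards zero.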

We observe the following, where T1 and T2 denote the values of the test statistics for the paired t-test and Fisher's t-test respectively (T1 = 3.024135, T2 = 1.262702):

Test I, i.e., the paired t-test, rejects H0 if T1 > c1 = 1.833.

Since T1 = 3.024135 > 1.833, Test I rejects H0 (the null hypothesis) at the 5% level of significance.

Test II, i.e., Fisher's t-test, rejects H0 if T2 > c2 = 1.734.

Since T2 = 1.262702 < 1.734, Test II fails to reject H0 at the 5% level of significance.
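The two statistics can be reproduced from the ten data pairs with a few lines of Python. This is a minimal sketch using only the standard library, and it assumes that Fisher's t-test is the pooled-variance two-sample t-test. The function names `paired_t`, `pooled_t`, and `pearson_r` are illustrative; the critical values are the ones quoted in the text rather than computed.

```python
import math
import statistics

# Satisfaction marks from the example: X for party A, Y for party B.
x = [1.77, 5.68, 0.07, 2.26, 2.60, 3.12, 3.56, 1.04, 2.68, 3.10]
y = [1.45, 3.93, -0.03, 1.01, 3.20, 2.02, 1.57, -0.61, 2.68, 2.43]


def paired_t(a, b):
    """Paired t statistic: mean of the differences over its standard error."""
    d = [ai - bi for ai, bi in zip(a, b)]
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (assumes independent samples)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * statistics.variance(a)
           + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))


def pearson_r(a, b):
    """Sample correlation between the paired marks."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    num = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    den = math.sqrt(sum((ai - ma) ** 2 for ai in a) * sum((bi - mb) ** 2 for bi in b))
    return num / den


C1, C2 = 1.833, 1.734  # 5% critical values quoted in the text

t1 = paired_t(x, y)
t2 = pooled_t(x, y)
r = pearson_r(x, y)

print(f"paired t  T1 = {t1:.3f}, reject H0: {t1 > C1}")  # T1 = 3.024, reject H0: True
print(f"pooled t  T2 = {t2:.3f}, reject H0: {t2 > C2}")  # T2 = 1.263, reject H0: False
print(f"correlation between X and Y = {r:.2f}")
```

The marks given by the same inhabitant are strongly positively correlated, which is exactly the situation in which treating the two samples as independent inflates the standard error and hides a real difference.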

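The appropriate analysis, the paired t-test, finds significant evidence that mean satisfaction with party A exceeds that with party B, which contradicts party B's claim. This may induce party B to publish the conclusion based on Fisher's t-test instead, simply because Fisher's t-test does not find enough evidence to deny its claim. The people living in locality L will then be digesting a false fact.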

Although the trick described above twists the truth, it does not involve the worst factor, data manipulation. The harsh truth about real-world data is that they are manipulated even in the medical sector, social projects, environmental projects, and so on, for political and other benefits. Sometimes tricks are also played through graphical displays such as bar diagrams, histograms, and time series plots.

For time-series data, the choice of time interval is important for understanding the true trend of the variable of interest. To capture the trend of the employment rate, for example, the data must be seasonally adjusted. Several such small twists can produce 'overestimates' or 'underestimates', and there are always people ready to use estimates obtained by statistically wrong means.
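The effect of the chosen interval can be illustrated with a small sketch. The series below is synthetic and purely hypothetical: a constant level of 100 with a 12-month seasonal swing and no underlying trend at all. The helper `slope` and the simple month-average adjustment are assumptions made for illustration, not a standard seasonal-adjustment procedure.

```python
import math


def slope(values):
    """Least-squares slope of the values against their index 0..n-1."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


# Synthetic monthly series over 3 years: flat level plus a seasonal cycle.
series = [100 + 10 * math.sin(2 * math.pi * m / 12) for m in range(36)]

# Crude seasonal adjustment: remove each calendar month's average deviation
# from the overall mean.
overall = sum(series) / len(series)
month_means = [sum(series[m::12]) / 3 for m in range(12)]
adjusted = [v - (month_means[i % 12] - overall) for i, v in enumerate(series)]

rising = slope(series[0:4])      # a window chosen on the seasonal upswing
falling = slope(series[3:10])    # a window chosen on the seasonal downswing
full = slope(series)             # all 36 months, unadjusted
flat = slope(adjusted)           # all 36 months, seasonally adjusted

print(f"months 0-3 slope:   {rising:+.2f}")
print(f"months 3-9 slope:   {falling:+.2f}")
print(f"full series slope:  {full:+.2f}")
print(f"adjusted slope:     {flat:+.2f}")
```

The same trendless series appears to rise or fall depending on which months are shown, while the seasonally adjusted series shows essentially no slope.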

Thus we see that statistics, this body of scientific methods, can trick the casual observer into believing something other than what the data really show when it is used in a misleading fashion. This gives rise to a statistical fallacy, which occurs when a statistical argument asserts a falsehood. This kind of cheap handling of data should be abolished immediately in this century of data, and that can happen if we become more aware of statistics and understand it better.

“False Facts are highly injurious to the progress of science, for they often long endure; but false views, if supported by some evidence, do little harm, as everyone takes a salutary pleasure in proving their falseness; and when this is done, one path towards error is closed, and the road to truth is often at the same time opened” - Charles Darwin, The Descent of Man

The Misuse of Statistics was originally published in Towards AI — Multidisciplinary Science Journal on Medium, where people are continuing the conversation by highlighting and responding to this story.

Author(s): Arghyamalya Biswas
