CAPTCHAs vs. MACHINES: A Bitter Rivalry?

Last Updated on June 23, 2021 by Editorial Team

Author(s): Daksh Trehan

Machine Learning, Cybersecurity

And how to crack CAPTCHA using Machine Learning!

I am kind of amazed by technology. Sometimes it hooks me on weird-yet-interesting short videos; other times it asks me to prove, ‘I’m a human!’

You book flight tickets, you face a CAPTCHA. You create accounts, you face a CAPTCHA. You check your article for plagiarism, CAPTCHA again!

Sometimes, I want to yell, YES! I am a robot. (Well, obviously I am a human 🙄)

Other times, I wonder who gets all the mountains/bikes/fire hydrants/cycles right on the first pass?

What is CAPTCHA? And why do we use it? Are they getting harder?

CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart.

In the early 21st century, when Yahoo! was booming, the company feared that users would one day write code to create millions of fake accounts and use them to send spam. To stop the spammers, it needed a mechanism to differentiate human users from automated scripts.

The required mechanism had to be a test that computers can’t pass, but that computers can still grade. I told you technology is weird-yet-interesting.

Back then, because machines were less powerful and Machine Learning and Python were far less widely used, computers were weak at recognizing text. We humans, on the other hand, are experts at text recognition; after all, we read text all day long.

Luis von Ahn developed CAPTCHA: the computer picks a random piece of text, so it already knows the answer, and shows it as a warped image that is difficult for other computers to read.
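
To make the "a computer can grade it but can't pass it" idea concrete, here is a minimal sketch of the grading side only. The alphabet, the default length, and the function names `new_challenge` and `grade` are assumptions made for illustration, not part of any real CAPTCHA service, and the image-warping step is left out.

```python
import secrets
import string

# Hypothetical alphabet: look-alike characters (0/O, 1/I) are left out.
ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "0O1I"
)


def new_challenge(length: int = 6) -> str:
    """Pick a random answer. The server stores it and would show the user
    only a warped image of this text (rendering is not shown here)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def grade(expected: str, response: str) -> bool:
    """Grade a response, ignoring case and surrounding whitespace."""
    return expected.upper() == response.strip().upper()
```

Grading is trivial because the server generated the answer itself; the hard part for a bot is reading the warped image, which this sketch does not attempt.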

Photo by Marija Zaric on Unsplash

The test helped to differentiate between humans and bots. But it didn’t hold up in the long run; computers soon learned to read the warped text and kept getting better at it.

The same problem arose again: computers had become smart enough to bypass the test, and with traffic increasing, a more robust mechanism was required.

Re-CAPTCHA

It was very similar to CAPTCHA, but instead of one piece of text, the challenge now contained two words.

For the first word, the computer knows the answer, but the second word is pulled at random from an article or book. The assumption is that if a human gets the first word right, there is a high probability that their answer for the second word is right too.

To settle the second word, the same CAPTCHA is sent to many users and the majority answer is taken as correct. But this method soon ran its course as well, and computers learned to crack Re-CAPTCHA too.
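
The majority check for the unknown word can be sketched as a simple vote count. This is only an illustration: the function name `consensus` and the thresholds `min_votes` and `min_share` are hypothetical, and the sketch assumes the transcriptions come only from users who already typed the known word correctly.

```python
from collections import Counter
from typing import Iterable, Optional


def consensus(
    transcriptions: Iterable[str],
    min_votes: int = 3,      # hypothetical: minimum matching answers
    min_share: float = 0.6,  # hypothetical: minimum share of all answers
) -> Optional[str]:
    """Return the majority transcription of the unknown word, or None
    if there is not enough agreement yet."""
    votes = Counter(t.strip().lower() for t in transcriptions if t.strip())
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    total = sum(votes.values())
    if count >= min_votes and count / total >= min_share:
        return word
    return None
```

For example, `consensus(["morning", "Morning", "morning ", "mourning"])` returns `"morning"`, because three of the four normalized answers agree.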

They got so good at it that, according to a test conducted by Google, humans solved Re-CAPTCHA only 33% of the time, while AI solved it with an accuracy of 99.8%.

Re-CAPTCHA(v2)

This time the approach was different: humans were expected to teach machines about real-world entities.

Photo by dedy kurniawan on Unsplash

We all remember the fire hydrant, bus, cycle, and bike tests, right?

When we choose the correct images, we are teaching the machine what a real-world entity looks like. Our input is recorded and used to help self-driving cars better understand these entities.

But, guess what? AI is getting better at it too!

Re-CAPTCHA(v3)

By this time, humans had lost all hope, and their temper, of creating a robust test.

Now, the user’s identity is verified based on their behavior. This is an invisible test that users are unaware of. It runs quietly behind your web pages to determine whether you’re a human or a bot.

Privacy is a myth, for sure! 🙂

The test can track your clicks, your typing speed, and your workflow, and it judges you based on them. If you show unusual behavior, such as typing hundreds of words in a second or clicking very frequently, it will prompt Re-CAPTCHA(v2) and ask you to verify.
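
As a toy illustration of behavior-based scoring, the sketch below turns typing and click rates into a score between 0 and 1 and falls back to a challenge when the score is low. It is not how Re-CAPTCHA(v3) actually works internally; the `Session` fields, the two rate limits, and the 0.5 threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Session:
    chars_typed: int
    clicks: int
    seconds: float


# Hypothetical limits on what a human can plausibly do.
MAX_CHARS_PER_SECOND = 15.0
MAX_CLICKS_PER_SECOND = 5.0


def human_score(session: Session) -> float:
    """Return a score in (0, 1]; higher means more human-like."""
    seconds = max(session.seconds, 1e-6)  # avoid division by zero
    typing_ratio = (session.chars_typed / seconds) / MAX_CHARS_PER_SECOND
    click_ratio = (session.clicks / seconds) / MAX_CLICKS_PER_SECOND
    worst = max(typing_ratio, click_ratio)
    if worst <= 1.0:
        return 1.0  # within both limits
    return 1.0 / worst  # the further over the limit, the lower the score


def needs_challenge(session: Session, threshold: float = 0.5) -> bool:
    """Decide whether to fall back to a visible challenge."""
    return human_score(session) < threshold
```

A session that types 600 characters in one second is 40 times over the typing limit, so it scores 0.025 and would be sent to the visible challenge.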

How did Machine Learning crack CAPTCHA?

By now, you must have understood that cracking CAPTCHA with Machine Learning isn’t a big deal. All you need to do is build a simple OCR model with the required data.

The training data can be found on GitHub.

The dataset consists of 1040 images.

Visualizing the data
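
Before plotting anything, it helps to summarize the labels. The sketch below assumes that each image's filename (without the extension) is its label, which is a common layout for CAPTCHA datasets but is not confirmed here; the helper names and the example paths are hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def labels_from_filenames(paths: List[str]) -> List[str]:
    """Assumption: 'some/dir/ab12c.png' is an image whose text is 'ab12c'."""
    return [Path(p).stem for p in paths]


def summarize(labels: List[str]) -> Dict[str, object]:
    """Collect the basic facts an OCR model needs about the labels."""
    return {
        "num_images": len(labels),
        "label_lengths": sorted({len(label) for label in labels}),
        "characters": sorted(set("".join(labels))),
    }


def encode(label: str, characters: List[str]) -> List[int]:
    """Map each character to its index in the sorted vocabulary."""
    char_to_index = {c: i for i, c in enumerate(characters)}
    return [char_to_index[c] for c in label]
```

With real data you would build `paths` from the image directory, for example with `sorted(str(p) for p in Path("data").glob("*.png"))`, where the `data` directory name is again an assumption.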

The Model

Training our model

Predicting output
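
The original prediction code is not reproduced here. As a rough sketch, assuming the OCR model outputs one probability vector per timestep with an extra "blank" class (as CTC-trained OCR models commonly do), the prediction can be turned into text with greedy decoding: take the most likely class at each step, collapse repeats, and drop blanks. The function name and argument layout are illustrative assumptions.

```python
from typing import List, Sequence


def greedy_ctc_decode(
    probs: Sequence[Sequence[float]],  # shape: [timesteps][num_classes]
    characters: Sequence[str],         # index -> character
    blank: int,                        # index of the blank class
) -> str:
    """Greedy CTC decoding: argmax per step, merge repeats, remove blanks."""
    best = [max(range(len(step)), key=lambda i: step[i]) for step in probs]
    decoded: List[str] = []
    previous = None
    for index in best:
        # Emit a character only when the class changes and is not blank.
        if index != previous and index != blank:
            decoded.append(characters[index])
        previous = index
    return "".join(decoded)
```

A blank between two identical characters is what lets the decoder output a doubled letter: the per-step classes `a, a, blank, a, b` decode to `aab`, not `ab`.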

The code can be found at: Solving CAPTCHA using ML

If you like this article, please consider subscribing to my newsletter: Daksh Trehan’s Weekly Newsletter.

Conclusion

Hopefully, this article has given you an insight into CAPTCHAs.

This work was created as an academic/fun project and is not intended to be used for harmful/malicious purposes.

References:

[1] OCR Model for reading CAPTCHA.

Find me on Web: www.dakshtrehan.com

Follow me at LinkedIn: www.linkedin.com/in/dakshtrehan

Read my Tech blogs: www.dakshtrehan.medium.com

Connect with me at Instagram: www.instagram.com/_daksh_trehan_

Want to learn more?

How is YouTube using AI to recommend videos?
Detecting COVID-19 Using Deep Learning
The Inescapable AI Algorithm: TikTok
GPT-3 Explained to a 5-year old.
Tinder+AI: A perfect Matchmaking?
An insider’s guide to Cartoonization using Machine Learning
How Google made “Hum to Search?”
One-line Magical code to perform EDA!
Give me 5-minutes, I’ll give you a DeepFake!

Cheers


CAPTCHAs vs. MACHINES: A Bitter Rivalry? was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
