The Curse of Dimensionality: Why More Isn’t Always Better in Machine Learning
Last Updated on September 2, 2024 by Editorial Team
Author(s): Souradip Pal
Originally published on Towards AI.
In the world of machine learning, you’re often knee-deep in datasets. These datasets could be anything — a collection of housing prices, handwritten digits, or even details about the passengers on the Titanic. To make accurate predictions, you rely on features or dimensions within these datasets. But here’s the kicker: sometimes, having too many features can be a real headache. That’s where the “Curse of Dimensionality” comes into play.
Photo by Cederic Vandenberghe on Unsplash
Now, before you start thinking this curse belongs in a Harry Potter book, let me assure you: it’s very much grounded in reality. The term “Curse of Dimensionality” was coined by Richard Bellman back in 1957. Essentially, it describes how the space your data lives in grows exponentially as you add more features (or dimensions) to your dataset, so the same number of samples ends up spread ever more thinly across it. More dimensions might sound like a good thing, but trust me, it’s not always that simple.
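To make the exponential growth concrete, here is a minimal sketch (not from the original article) that scatters a fixed number of random points into a unit hypercube and measures what fraction of equally sized grid cells contain at least one point. The function name `occupied_fraction`, the choice of 10 bins per axis, and the 1,000-point sample size are all assumptions made purely for illustration.

```python
import random


def occupied_fraction(n_points, n_dims, bins_per_dim=10, seed=0):
    """Fraction of grid cells that hold at least one uniformly sampled point.

    Hypothetical helper for illustration: the unit hypercube is split into
    bins_per_dim ** n_dims equal cells, so the cell count grows exponentially
    with the number of dimensions while n_points stays fixed.
    """
    rng = random.Random(seed)
    occupied = set()
    for _ in range(n_points):
        # rng.random() lies in [0, 1), so each index falls in 0..bins_per_dim - 1
        cell = tuple(int(rng.random() * bins_per_dim) for _ in range(n_dims))
        occupied.add(cell)
    total_cells = bins_per_dim ** n_dims
    return len(occupied) / total_cells


# Same 1,000 points, increasing number of features (illustrative values)
for d in (1, 2, 3, 6):
    frac = occupied_fraction(n_points=1000, n_dims=d)
    print(f"{d} dimension(s): {10 ** d} cells, {frac:.4%} occupied")
```

With the sample size held constant, the occupied share can never exceed `n_points / bins_per_dim ** n_dims`, so it shrinks rapidly as dimensions are added. That sparsity is one common way the curse is described.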
Let’s break this down with a simple analogy. Imagine you’re a student, heading to class, and suddenly you realize you’ve lost your wallet (ugh, the worst). Now, you have three options for where to search: a one-dimensional road, a two-dimensional field, and a three-dimensional college building…