How Machines Understand Meaning: A Simple Guide to Embeddings.

Last Updated on October 18, 2025 by Editorial Team

Author(s): Deepak Chahal

Originally published on Towards AI.

Have you ever wondered how ChatGPT knows that “a car” and “a bike” are related but “a car” and “a human” aren’t?
Or how a word can have different meanings in different sentences, like “a fly on the wall” vs “I’ll fly an aircraft”?

The answer lies in something called embeddings. Let’s break down what embeddings actually are.

What are Embeddings?

In layman’s terms, embeddings are a way to organise words (tokens, to be precise) so that words with similar meanings are placed close to each other in a kind of semantic space.

But how does a machine actually figure out whether two words have similar meaning or not?

As we know, machines can’t understand human language the way we do. To solve that problem, they convert words into numbers (called vectors). Once each word is represented as a vector, machines can compare those vectors to see how similar they are.
A popular way to measure this similarity is cosine similarity — it tells us how similar two vectors are based on the angle between them.
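Cosine similarity is simple enough to compute by hand. Here is a minimal sketch using NumPy: the dot product of the two vectors divided by the product of their lengths, which yields 1.0 for vectors pointing the same way and 0.0 for perpendicular ones.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = orthogonal."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity([1, 2], [2, 4]))  # 1.0 — same direction
print(cosine_similarity([1, 0], [0, 1]))  # 0.0 — perpendicular
```

Note that cosine similarity ignores vector length and looks only at direction, which is why it works well for comparing embeddings of different magnitudes.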

Simple Example:
Let’s take four words: dog, cat, mobile, and tablet. If we plot their embeddings in a 2D space, you’ll notice that dog and cat sit close to each other (both animals), while mobile and tablet form another small cluster (both electronic devices).
The words within each pair also have high cosine similarity, meaning they’re semantically related, while the animal and gadget clusters are far apart.
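We can reproduce this with a tiny sketch. The 2D vectors below are hypothetical, hand-picked purely so that the animals and the gadgets form two clusters; real embeddings would come from a trained model.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 2D embeddings chosen so animals and gadgets cluster separately.
emb = {
    "dog":    np.array([0.90, 0.80]),
    "cat":    np.array([0.85, 0.75]),
    "mobile": np.array([0.10, -0.70]),
    "tablet": np.array([0.15, -0.65]),
}

print(cosine(emb["dog"], emb["cat"]))     # near 1.0 — same cluster
print(cosine(emb["dog"], emb["mobile"]))  # much lower — different clusters
```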

Word embeddings visualised in 2D space showing cosine similarities.

Beyond Two Dimensions

So far, we have visualised embeddings in 2D for simplicity.
In reality, embeddings exist in hundreds or thousands of dimensions, with each dimension representing some hidden feature or characteristic.

Suppose I ask how similar or different a cow and a tiger are.

In our minds, we’d compare them on different characteristics: both are animals and have four legs, so on those characteristics they’re similar. But a cow is herbivorous while a tiger is carnivorous, which is very different.
Their embeddings reflect the same thing, with some dimensions aligning and others differing depending on the features.

That’s why vectors are represented in a large number of dimensions, each capturing a different hidden feature that helps machines understand relationships.

The table below illustrates how different features contribute to the similarities and differences between two animals.

Please note: real embeddings are high-dimensional numeric vectors, and their dimensions don’t explicitly represent human-interpretable features like these.

Feature Cow Tiger Difference Analysis
----------------------------------------------------------------------
animal 1.000 1.000 0.000 SIMILAR
living 1.000 1.000 0.000 SIMILAR
furry 0.300 1.000 0.700 VERY DIFFERENT
domestic 1.000 0.000 1.000 VERY DIFFERENT
wild 0.000 1.000 1.000 VERY DIFFERENT
predator 0.000 1.000 1.000 VERY DIFFERENT
herbivore 1.000 0.000 1.000 VERY DIFFERENT
carnivore 0.000 1.000 1.000 VERY DIFFERENT
large 1.000 1.000 0.000 SIMILAR
dangerous 0.100 1.000 0.900 VERY DIFFERENT
farm_animal 1.000 0.000 1.000 VERY DIFFERENT
Cow and tiger comparison based on a few features
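Treating each row of the table as one dimension, we can compute an overall cosine similarity between the two animals. The feature values are copied from the (illustrative, hand-assigned) table above; the point is only that partially overlapping features yield a moderate similarity, not 0 and not 1.

```python
import numpy as np

# Feature order: animal, living, furry, domestic, wild, predator,
# herbivore, carnivore, large, dangerous, farm_animal (values from the table).
cow   = np.array([1.0, 1.0, 0.3, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.1, 1.0])
tiger = np.array([1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0])

similarity = float(np.dot(cow, tiger) / (np.linalg.norm(cow) * np.linalg.norm(tiger)))
print(f"cow vs tiger cosine similarity: {similarity:.3f}")  # a moderate value
```

The shared dimensions (animal, living, large) pull the score up, while the opposed ones (domestic vs wild, herbivore vs carnivore) pull it down.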

Contextual Embeddings

Earlier models like Word2Vec and GloVe generate static embeddings, i.e., a single vector per word regardless of context. For example, in our example at the start, the word “fly” in “a fly on the wall” vs “I’ll fly an aircraft” has two different meanings, and a static embedding can’t capture that.

Modern models like BERT, GPT, and ELMo solve that problem with contextual embeddings, meaning the same word can have different embeddings based on the context.
For example, the word “fly” in “a fly on the wall” and “I’ll fly an aircraft” would have different embeddings because the surrounding context changes its meaning.
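Real contextual models use transformer attention, but the core idea can be sketched in a few lines: blend a word’s static vector with the vectors of its context, so the same word ends up with different vectors in different sentences. All vectors below are hypothetical toy values, and the blending rule is a deliberate simplification, not how BERT actually works.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical static vectors; "fly" starts out identical in both sentences.
static = {
    "fly":      np.array([0.5, 0.5]),
    "wall":     np.array([0.9, 0.1]),  # insect-ish context
    "aircraft": np.array([0.1, 0.9]),  # travel-ish context
}

def contextual(word, context_word, alpha=0.5):
    """Toy contextualisation: blend the word's vector toward its context."""
    return (1 - alpha) * static[word] + alpha * static[context_word]

fly_insect = contextual("fly", "wall")      # "a fly on the wall"
fly_travel = contextual("fly", "aircraft")  # "I'll fly an aircraft"

# The same word now has two different vectors, pulled toward each context.
print(cosine(fly_insect, fly_travel))  # noticeably below 1.0
```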

Static vs Contextual Word Embeddings

Different Embedding Models

Now that we have seen what embeddings are, let’s look at a few popular models that are used to generate embeddings.

  1. Word2Vec, GloVe: These are traditional static embedding models that generate a fixed vector for each word, capturing semantic relationships in a continuous vector space. These are great for simple word similarity and clustering-related tasks.
  2. BERT, GPT, ELMo: These are modern contextual models that generate different vectors for the same word based on the context.
  3. Sentence Transformers: These models convert whole sentences into fixed-length numerical vectors (embeddings) that capture their semantic meaning, making them great for document similarity and semantic search.

Embeddings play a role in almost every modern NLP task. Here are a few common ones.

Real World Use Cases

  1. Text Classification: In text classification, embeddings are often used for spam detection and topic categorisation.
  2. Named Entity Recognition (NER): In NER, word embeddings are used to identify and classify different entities like names, places, etc., in text.
  3. Word Analogy: Embeddings can be used to capture relationships between words, like a classical example of how “king” is to “queen” as “man” is to “woman”.
  4. Chatbots & Q&As: In Q&A systems and chatbots, embeddings convert the user query into a numerical representation; semantic search over that representation is then used to find the most appropriate answer.
  5. Recommendation Engines: By comparing embeddings, systems can suggest products, movies, or content that are semantically similar to what users like.
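The chatbot and recommendation cases share one mechanism: embed everything, then return the nearest neighbour by cosine similarity. A minimal sketch, assuming a tiny hand-written FAQ — in practice both the stored vectors and the query vector would come from a sentence-embedding model, not be typed in by hand.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical precomputed embeddings for a tiny FAQ.
faq = {
    "How do I reset my password?":         np.array([0.90, 0.10, 0.00]),
    "What payment methods do you accept?": np.array([0.10, 0.90, 0.10]),
    "How do I delete my account?":         np.array([0.70, 0.00, 0.60]),
}

def best_answer(query_vec):
    """Return the stored question whose embedding is closest to the query."""
    return max(faq, key=lambda q: cosine(query_vec, faq[q]))

# A query about forgotten passwords lands nearest the password-reset entry.
query = np.array([0.85, 0.05, 0.10])
print(best_answer(query))  # "How do I reset my password?"
```

A recommendation engine works the same way, with product or movie embeddings in place of FAQ entries.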

Conclusion

To sum up, embeddings are how machines represent language as numbers, capturing the meaning of and relationships between words, sentences, and even entire documents.

Whether it’s a recommendation algorithm suggesting your next movie or a chatbot answering your question, embeddings play a major role behind the scenes to make things run smoothly.
