Inside LinkedIn’s Embedding Architecture Powering its Job Search Capabilities
Last Updated on November 5, 2023 by Editorial Team
Author(s): Jesus Rodriguez
Originally published on Towards AI.
Embeddings have become one of the most important components of large language model (LLM) applications in recent months. Entire market segments, such as vector databases, have emerged to power embedding architectures. However, embedding architectures are still at a very early stage, and only a handful of organizations have successfully implemented them at scale. That is why it is so important to learn from the best practices and techniques those organizations use. Recently, LinkedIn published some details about its use of Embedding-Based Retrieval (EBR) technology to transform its search and recommendation systems. If you’ve ever come across the “Jobs You Might Be Interested In” feature or noticed the tailored content in your LinkedIn Feed and Notifications, you’ve seen EBR in action.
So, what’s EBR? In simple terms, it’s a technique used in the early stages of recommendation systems. It scans a vast array of items (like job postings or feed articles) and identifies those that are most relevant based on their similarity to a given request. Think of it as finding items that are “nearby” in a vector space, where both the items and the request are represented as embeddings. Once these items are identified, another AI model ranks them to present the most pertinent ones to the user.
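To make the two stages concrete, here is a minimal Python sketch of retrieve-then-rank. Everything in it is an assumption for illustration: the job names, the three-dimensional embeddings, the exhaustive cosine-similarity scan (production systems typically use approximate nearest-neighbor indexes), and the FRESHNESS table, which merely stands in for a separate ranking model. It is not LinkedIn's implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy item embeddings (hypothetical values, not real data).
JOB_EMBEDDINGS = {
    "ml_engineer": [0.9, 0.1, 0.0],
    "data_scientist": [0.8, 0.3, 0.1],
    "accountant": [0.0, 0.2, 0.9],
    "backend_engineer": [0.6, 0.7, 0.0],
}

# Hypothetical stand-in for the scores a second-stage ranking model would produce.
FRESHNESS = {
    "ml_engineer": 0.2,
    "data_scientist": 0.9,
    "accountant": 0.1,
    "backend_engineer": 0.5,
}

def retrieve(request_embedding, item_embeddings, k):
    """Stage 1: return the k items nearest to the request (exhaustive scan)."""
    scored = [
        (cosine(request_embedding, emb), item)
        for item, emb in item_embeddings.items()
    ]
    scored.sort(reverse=True)
    return [item for _, item in scored[:k]]

def rank(candidates, ranking_score):
    """Stage 2: reorder the retrieved candidates with a separate scorer."""
    return sorted(candidates, key=ranking_score, reverse=True)

request = [1.0, 0.2, 0.0]  # hypothetical embedding of the member's request
candidates = retrieve(request, JOB_EMBEDDINGS, k=3)
ranked = rank(candidates, lambda item: FRESHNESS[item])
print(candidates)
print(ranked)
```

The point of the split is cost: the first stage is cheap and narrows a huge catalog to a few candidates, so the more expensive ranking model only has to score that short list.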
To streamline the use of EBR, LinkedIn has rolled out several new tools and features:
1) Composite and Multi-Task Learning Models
LinkedIn now supports the creation of composite models, which consolidate various objectives into one model. This speeds up the learning process and enhances transfer learning. For instance, an embedding might capture a user’s interests based on their profile and interactions. This embedding then informs the search and recommendation systems, ensuring users see content that truly resonates with them.
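As a rough illustration of what "several objectives in one model" can look like, the sketch below runs a forward pass through a shared encoder with two task heads and combines the per-task losses into one weighted objective. The weights, the two task names (job_apply and feed_click), and the loss weighting are all made up for the example; the source does not describe LinkedIn's actual model structure.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def linear(weights, x):
    """Dense layer without bias: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def log_loss(p, label):
    """Binary cross-entropy for a single prediction."""
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

# Shared encoder: maps 3 raw member features to a 2-dimensional embedding
# that every task reuses (hypothetical weights).
SHARED_ENCODER = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]

# Task-specific heads (hypothetical tasks) that all read the same embedding.
HEADS = {
    "job_apply": [0.7, -0.4],
    "feed_click": [-0.2, 0.9],
}

def forward(features):
    """Return the shared embedding and one probability per task."""
    embedding = linear(SHARED_ENCODER, features)
    preds = {
        task: sigmoid(sum(w * e for w, e in zip(head, embedding)))
        for task, head in HEADS.items()
    }
    return embedding, preds

def composite_loss(preds, labels, task_weights):
    """Single training objective: weighted sum of the per-task losses."""
    return sum(task_weights[t] * log_loss(preds[t], labels[t]) for t in preds)

features = [1.0, 0.5, 2.0]
embedding, preds = forward(features)
loss = composite_loss(
    preds,
    labels={"job_apply": 1, "feed_click": 0},
    task_weights={"job_apply": 1.0, "feed_click": 0.5},
)
print(embedding, preds, loss)
```

Because both heads backpropagate through the same encoder during training, signal from one task shapes the embedding the other task uses, which is where the transfer-learning benefit comes from.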
2) Feature Cloud Platform
LinkedIn has launched a platform named “Feature Cloud” that merges offline and real-time embedding generation. This platform taps into existing services and orchestrates various tasks, preparing embeddings for use in EBR indexes and feature stores. Feature Cloud can serve many different kinds of embeddings across LinkedIn’s search applications.
3) Upgraded Hosted Search System
LinkedIn’s search system, compatible with Lucene, now supports automated embedding version management and a range of EBR algorithms. This adaptability matters because EBR techniques are still evolving quickly. The search system is tightly integrated with the Feature Cloud platform.
4) Automated Embedding Version Management
Ensuring that content matches a search query’s intent is crucial. But managing versions of embeddings can be complex. For instance, if a team updates an embedding model, the embeddings it produces may no longer be comparable with the existing item embeddings, even if the dimensions remain the same. LinkedIn’s Feature Cloud supports native versioning for embeddings, which makes their lifecycle easier to manage.
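The mismatch problem is easy to reproduce with toy data. In the sketch below, every embedding carries the version of the model that produced it, and the index only compares a query against items from the same version. The Embedding and VersionedIndex names, the version labels, and the vectors are all hypothetical; this is one simple way to enforce version consistency, not a description of Feature Cloud's API.

```python
from typing import List, NamedTuple

class Embedding(NamedTuple):
    version: str        # model version that produced the vector
    vector: List[float]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class VersionedIndex:
    """Stores item vectors in one partition per embedding version."""

    def __init__(self):
        self._partitions = {}

    def add(self, item_id, embedding):
        self._partitions.setdefault(embedding.version, {})[item_id] = embedding.vector

    def search(self, query, k=1):
        """Compare the query only against items from the same version."""
        items = self._partitions.get(query.version)
        if items is None:
            raise LookupError(f"no items indexed for version {query.version!r}")
        ranked = sorted(
            items, key=lambda item_id: dot(query.vector, items[item_id]), reverse=True
        )
        return ranked[:k]

index = VersionedIndex()
# Same dimensions in both versions, but the retrained model (v2) laid out
# its space differently, so v1 and v2 vectors are not comparable.
index.add("job_a", Embedding("v1", [1.0, 0.0]))
index.add("job_b", Embedding("v1", [0.0, 1.0]))
index.add("job_a", Embedding("v2", [0.0, 1.0]))
index.add("job_b", Embedding("v2", [1.0, 0.0]))

query_vector = [0.1, 0.9]  # produced by the v2 model, intended to match job_a
matched = index.search(Embedding("v2", query_vector))
mismatched = index.search(Embedding("v1", query_vector))  # wrong version label
print(matched, mismatched)
```

With the correct version label the query finds job_a. Scoring the same vector against the v1 partition silently returns job_b, which is exactly the kind of error that is hard to spot when the dimensions still line up.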
5) Model Cloud for Streamlined Inference Graph Orchestration
LinkedIn’s “Model Cloud” now supports inference graphs using embeddings powered by Ray Serve. This simplifies the execution of inference graphs, reduces the need for complex workflows, and ensures version consistency. The result? A streamlined system where AI experts can focus more on enhancing user experience and less on managing infrastructure.
6) Enhancing Job Search Precision
Search tasks, especially job searches, demand a precise blend of user data, query, and context. Before EBR, LinkedIn’s Job Search mainly relied on text matching. While it provided results, it lacked depth in personalization and semantic matching. With the introduction of EBR and the new tools, LinkedIn has elevated its matching capabilities, offering users a more personalized and accurate job search experience.
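A small example shows why text matching alone falls short. In the sketch below, a keyword match finds nothing for the query "python developer" because no job title shares a word with it, while an embedding comparison still surfaces the closest role. The job titles, the two-dimensional vectors, and the blend function (a plain linear interpolation between a query embedding and a member embedding) are assumptions chosen for illustration, not LinkedIn's method.

```python
import math

def keyword_overlap(query, title):
    """Text matching: number of lowercase words the query and title share."""
    return len(set(query.lower().split()) & set(title.lower().split()))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def blend(query_vec, member_vec, alpha=0.7):
    """Mix query intent with member context (hypothetical linear blend)."""
    return [alpha * q + (1 - alpha) * m for q, m in zip(query_vec, member_vec)]

# Toy job embeddings (hypothetical values).
JOBS = {
    "Software Engineer": [0.9, 0.1],
    "Registered Nurse": [0.1, 0.9],
}

QUERY_TEXT = "python developer"
QUERY_EMBEDDING = [0.8, 0.2]   # assumed output of a query encoder
MEMBER_EMBEDDING = [1.0, 0.0]  # assumed signal from the member's profile

# Text matching: no shared words, so nothing is returned.
text_hits = [title for title in JOBS if keyword_overlap(QUERY_TEXT, title) > 0]

# Semantic matching: compare the blended request against every job embedding.
request = blend(QUERY_EMBEDDING, MEMBER_EMBEDDING)
semantic_best = max(JOBS, key=lambda title: cosine(request, JOBS[title]))
print(text_hits, semantic_best)
```

Changing alpha shifts the balance between what the member typed and what their profile suggests, which is one simple way to think about personalization in the retrieval step.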
In essence, LinkedIn is pushing the boundaries of AI and EBR to ensure users get the most relevant content, whether it’s job recommendations, feed content, or notifications. The platform’s commitment to innovation is evident in its continuous efforts to refine and enhance the user experience.