

The LLM Advantage: Transforming E-commerce Search

Last Updated on November 6, 2023 by Editorial Team

Author(s): Maithri Vm

Originally published on Towards AI.

Photo by Nadine Shaabana on Unsplash

The exceptional abilities of LLMs have addressed a multitude of challenges across business domains. The advanced knowledge-discovery outcomes built on the RAG (Retrieval Augmented Generation) method have flooded the web with myriad interesting use cases and implementation approaches. While all of that matures toward production, pairing creative problem-solving with engineering, it is equally important to explore the niche possibilities where LLMs can reimagine user experiences that have remained the status quo for a long time. E-commerce product search is one such case, and it is the one this blog delves into.

In my blog series, I have covered how the impressive capabilities of LLMs can be harnessed for product search, and how context at the epicenter of it drives customer value and impacts business outcomes:

LLM-Powered Product Discovery : A Leap Beyond Hybrid Search


In this post, we shall explore some of these ideas and possible implementation approaches, along with code samples. Here are a few cases where LLMs could catalyze product discovery and thereby increase customer engagement in an e-commerce product:

  1. Producing comprehensive product summaries with LLM’s generative ability to capture a broader context for the product.
  2. Leveraging the power of generalization and reasoning ability of LLMs to overlap the contexts — both query and products for better product discovery driven by shared understanding.
  3. Harnessing the power of code /query generation ability of the LLMs to produce backend queries to cater to the advanced search filters and segments from the generic expressions given by the user in natural language form.
  4. Employing the conversational interface for stateful product discovery instead of discrete search queries alongside filters and paginated scrolls. The chat interface goes a long way in harnessing deep contextual signals expressed by the user over multi-level interactions, which only strengthens the context over time. LLM’s innate ability in chat-context propagation and reasoning the impending user action aligns perfectly with the concept of conversational discovery!
1. Generating comprehensive product summaries with LLMs: Product descriptions are generally targeted towards ads or presented to the user as part of the product details page. While it is obvious to have LLMs generate personalized ads and creative descriptions, the idea here is geared more towards serving search.

In general, product descriptions in natural language form cater to very limited attributes like title and description, which often may not be rich in context. Having the LLM enrich the product description in an expressive manner helps build a rich context for the product. Let us call this a product summary (for disambiguation); it is the very data against which user queries are matched to retrieve the best results. The value of this grows leaps and bounds when these auto-generated summaries are extended to:

a. Explicit features: Product attributes explicitly available in the form of metadata, for example static attributes like product category, size, color, applicable gender, etc. This can be extended further by auto-captioning product images/demo videos using computer vision models.

b. Implicit features: Descriptions enriched with the objective of being more expressive about the inherent product details. These could take the form of implicit features like tags, utility, unique features, the problems the product is meant to solve, etc.

c. Crowd-sourced features: The overall product sentiment and its most compelling features can also be mined from customer reviews, a worthy addition to an expansive product summary.

Comprehensive product info aggregated from these various dimensions is best utilized when it is also a constituent of the dense embeddings, hence yielding better generalization for retrieving the best matching results for a given query. This approach also greatly reduces the dependencies on product SKUs to have comprehensive product metadata.

Note: While adding product metadata to the dense embeddings can enrich the relevance of the retrieved results, other dynamic attributes like price, avg_ratings, stock availability, shipping charges, etc. may not be viable parts of dense embeddings due to their continuously changing nature. Updating those embeddings repeatedly would not be an efficient approach.
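As a plain-Python sketch of that separation (hypothetical field names, no vector db involved): keep the stable, descriptive attributes in the text that gets embedded, and keep the volatile attributes in a small metadata record that can be re-indexed cheaply on its own.

```python
# Hypothetical split of a product record: stable fields feed the dense
# embedding, volatile fields stay in metadata that can be updated cheaply.
STABLE_FIELDS = {"product_name", "description", "category", "colors", "tags"}
VOLATILE_FIELDS = {"price", "avg_ratings", "in_stock", "shipping_charge"}

def split_product_record(record: dict) -> tuple:
    """Return (text_to_embed, metadata) for one product."""
    text = " ".join(str(record[f]) for f in sorted(STABLE_FIELDS & record.keys()))
    metadata = {f: record[f] for f in VOLATILE_FIELDS & record.keys()}
    return text, metadata

product = {
    "product_name": "Teal formal shirt",
    "description": "Slim-fit cotton shirt",
    "category": "shirts",
    "price": 39.99,
    "avg_ratings": 4.2,
}
text, metadata = split_product_record(product)
# When price or ratings change, only the metadata record needs rewriting;
# the embedding built from `text` stays valid.
```

When the price or rating changes, only the metadata record is rewritten, so the expensive embedding step is skipped entirely.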

Let us look at a code sample that implements the auto-generation of a rich product summary encompassing all the above-listed attributes by leveraging the text-bison LLM.

from vertexai.preview.language_models import TextEmbeddingModel, TextGenerationModel

embedding_model = TextEmbeddingModel.from_pretrained("textembedding-gecko")

def senti_extract(reviews, temperature: float = .2):
    """Summarise overall sentiment and highlights from product reviews."""
    parameters = {
        "temperature": temperature,
        "max_output_tokens": 200,
        "top_p": .8,
        "top_k": 2,
    }
    model = TextGenerationModel.from_pretrained("text-bison@001")
    prompt = """Summarise the overall product summary from the list of different user reviews given below.
Also highlight important good attributes without being too explicit.

{}""".format(",\n ".join(reviews))
    response = model.predict(prompt, **parameters)
    return response.text

# Implicit signals mined from crowd-sourced reviews
product_df["prod_highlights"] = product_df["reviews"].apply(lambda x: senti_extract(x))
# Combine relevant explicit signals with implicit features into 'prod_summary'
product_df['prod_summary'] = (product_df['product_name'] + product_df['description']
                              + product_df['pattern'] + product_df['colors']
                              + product_df['category'] + product_df['size']
                              + product_df['gender_category'] + product_df['tags']
                              + product_df["prod_highlights"])

# Other metadata to be indexed separately to achieve efficient reindexing
selected_columns = ['price', 'avg_ratings']
product_df['metadata'] = product_df[selected_columns].apply(
    lambda row: {col: row[col] for col in selected_columns}, axis=1)

# Collect a dense embedding for each product summary
dense_embeds = []
for descriptor in product_df['prod_summary']:
    dense_embeds.append(embedding_model.get_embeddings([descriptor])[0].values)

# <code to index metadata and dense_embeds in choice of vector db goes here>

2. Leveraging the generalization and reasoning ability of LLMs to overlap the contexts of query and products. Embeddings generated by LLMs trained on a large corpus of text from a wide variety of sources capture worldly context very well. This context is essential not only for understanding the user's query intent and choosing the next action amongst a variety of options, but also for reasoning about and evaluating the relevance of the results retrieved from the database.

The sample below demonstrates response evaluation and synthesis with an LLM call.

response_prompt = """
You are a response synthesiser. You are to transform certain data into a human-readable form.
Firstly, you must evaluate the relevance of the search results to the query given by the user. Only include the relevant results while generating the response.

You will receive the `query` and some `data` using which the query must be answered.
The data comes from a machine search, which the user cannot see, and may not be accurate or reflect the user's needs.
List results based on relevance to the query and filter out irrelevant results accordingly.

You can think of `data` as your knowledge base. Cross-reference the `data` to answer the `query` using the rules provided.
Refer to the source and metadata information to cite your responses from the data.

Make sure to say there are no results if matches are empty. Do not spit out garbage.

List out only the key products. The response must be polite and professional, and guide the customer to make a purchase!

Please remember you are nice, cheerful, and helpful. You will also try to ensure the user has found what they need.
Nudge the user to a specific recommendation as necessary.

query: {query}
search results: {data}
Always correlate responses with the results!

synthesised response:
"""

text_model = TextGenerationModel.from_pretrained("text-bison@001")

def synth(query, data):
    response = text_model.predict(
        response_prompt.format(query = query, data = data),
        max_output_tokens = 1024,
        temperature = 1)
    return response

Please note that an example of query intent identification is included in the next section's code block, along with query auto-generation.

3. Harnessing the power of code/query generation with LLMs: While it is obvious to use the user query directly to retrieve the top-k relevant content from the vector db, users often explicitly narrow the target dataset with filters and preferences. Product category, size, brand, and price range are the most commonly applied filters, and sorting results by ratings is another popular usage pattern. Though these hassles of online shopping have been the accepted norm, it is now time to blend the best of both worlds: dense and sparse retrieval alongside metadata filtering, for an effective hybrid search. Applying these filters and sorts before or after retrieval serves the purpose, but it would be a far better experience if we could generalize further by having LLMs extract the features and generate the filter queries purely from natural language context gathered over multiple turns.

query_prompt = """
You are a vector-database retrieval system that writes queries to filter metadata on Pinecone.
The user will input a natural language query in English that will have to be transformed into the corresponding filter.
You must only return the dictionary for the filter.
The filter follows MongoDB's query format, except it does not use any regexes.
The user may keep giving input to narrow down the query. Update the filter every user message to reflect their most recent change.
Make sure that filters only refer to attributes that exist in the data source.
Make sure that filters take into account the descriptions of attributes and only make comparisons that are feasible given the type of data being stored.
This is the corresponding schema for the data available:
'features': string - All other text to be embedded; this is only a plain string and never a regex or other filter.
'price': float - The price of the product. Cheap is less than $50, expensive is more than $200.
'product_ratings': float - Product rating from range 0.0 to 5.0. 5.0 is best and 0.0 is worst. Great ratings are >3.8.
Do not ever insert features not in the schema.
There is only $gt or $lt.
Just include the entire term for features apart from price and ratings.
$regex should not be included; this is Pinecone.

natural language query: {search_term}

DO NOT FORGET THE SCHEMA. DO NOT VIOLATE THE SCHEMA. ONLY USE features, price, and product_ratings, if available.
"""

import pinecone

text_model = TextGenerationModel.from_pretrained("text-bison@001")
# Exemplifying the case with a Pinecone index named 'ecommerce'
index = pinecone.Index('ecommerce')

def search(search_term):
    prompt = query_prompt.format(search_term = search_term)
    res = text_model.predict(prompt, temperature = 0)
    try:
        filter = eval(f'{res.text}'.replace('```', ''))
    except Exception:
        # Fall back to an unfiltered search if the generated filter is malformed
        filter = {}
    print('filters : ', filter)
    embedding = [0] * 768

    # Use only the "features" part to generate the embedding, as it captures the user's intent
    if "features" in filter:
        features = filter.pop("features")
        if isinstance(features, dict):  # e.g. {'$eq': 'blue partywear dress'}
            features = features.get("$eq", "")
        # embedding_model is the textembedding-gecko model initialised earlier
        embedding = embedding_model.get_embeddings([features])[0].values

    # The remaining keys are applied as metadata filters on the Pinecone index
    results = index.query(embedding, top_k = 6, include_metadata = True, filter = filter)
    answers = [r['metadata'] for r in results['matches']]
    for i in range(len(results['matches'])):
        answers[i]['id'] = results['matches'][i]['id']
    return answers

The sample output below shows the LLM's impressive ability. You can develop this idea further to appreciate how the LLM can use the entire chat history to build these contextual filters over time.

# note that only the value under 'features' is considered for the query embedding
search("blue partywear dress with great ratings")
filters : {'features': {'$eq': 'blue partywear dress'}, 'avg_ratings': {'$gt': 4.0}}

search("teal formal shirt men reasonable price")
filters : {'features': {'$eq': 'formal shirt in teal'}, 'price': {'$lt': 50}}

4. Conversational, natural language interface for product discovery: Having elaborated on "context is king" in my previous posts, it is quintessential to reimagine and redefine the overall search experience in light of LLMs. The new-age search experience can be contiguous, conversational, and fully (or mostly) in natural language, with the user's query context extended in both length and depth. This would not only remove the friction caused by discrete search queries, filters, and quirky paginated scrolls, but also give users a genuine sense of being understood. Multimodal search with new-age vector databases takes the experience a level further by offering a common interface for heterogeneous formats (text and image) as well.

The code below exemplifies a conversational interface that ties together all the above ideas using the chat-bison LLM.

from vertexai.preview.language_models import ChatModel, ChatMessage

chat_model = ChatModel.from_pretrained("chat-bison@001")
history = list()
context = """
You are a product discovery system responsible for extracting a search query from a conversation with a shopping site chatbot.
A search query should be generated so that it can search a vector database, using the entire message context of the conversation.
Exclude grammar and language; only extract traits like type of product, pricing details, rating requirement, and so on.
It should be a succinct query consisting of only the required words, like what one might send to a search engine.

The user can make adjustments to their queries. Denote this appropriately, and do not discard such things if the relevant context exists.
However, also note the user changing to a completely new product; ignore context in that case.

Help the user make a decision based on their preferences. Don't forget to check if the user is disruptive.
Do not forget to piece together the intent through the whole conversation.
"""

addendum = "----\nYour response WILL ALWAYS be in the form of a search query based on the contextual information for product discovery. Always remember to refer to the entire conversation."

def conversational_discovery(user_chat):
    global history
    if len(history):
        chat = chat_model.start_chat(context = context, message_history = history, examples = [])
    else:
        chat = chat_model.start_chat(context = context, examples = [])
    response = chat.send_message(user_chat + addendum)
    search_results = search(response.text)
    history = chat.message_history[:-1] + [ChatMessage(
        content = synth(query = response.text, data = search_results.copy()).text,
        author = 'bot')]
    return history[-1].content, search_results

The steps outlined above demonstrate remarkable outcomes for the search experience in the age of LLMs. There are, however, important aspects that need to be considered to improve reliability.

  1. The choice of product attribute management between metadata and the dense index: while retaining attributes as metadata ensures higher precision, adding them to the dense embedding helps with better generalization. For example, the product category, dress size, etc., are crucial for capturing the user's stated or learned preferences; it would be counterproductive if the LLM's generic embedding-based retriever failed to strictly abide by them. Hence they are ideal candidates to be added as metadata and applied as explicit filters at the db level to ensure reliability. Meanwhile, attributes like color and style are loosely defined, and users generally tend to express them vaguely, making such attributes better suited for dense representation.
  2. Auto-query generation may fail at times due to hallucination. Hence, few-shot examples and repeated checks and balances should be employed to ensure queries are error-free before they are interfaced with the database.
  3. Strictness with query-sense disambiguation: though foundational LLMs perform well at understanding user intent and generating queries from a given context, they have not yet reached the accuracy needed for production systems. This can be improved by utilizing knowledge graphs or by exploiting user query-intent mapping records to influence decision-making. Prompt tuning or fine-tuning with such data is another reliable option teams are pursuing to improve overall search accuracy.
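For point 2, one cheap check (a sketch assuming the field and operator names from the earlier query prompt) is to validate the generated filter against the schema before it ever reaches the database, falling back to an unfiltered search on any violation:

```python
# Allowed names are taken from the query prompt's schema; adjust to your own.
ALLOWED_FIELDS = {"features", "price", "product_ratings"}
ALLOWED_OPS = {"$eq", "$gt", "$lt"}

def validate_filter(filter_dict):
    """Return filter_dict if it respects the schema, else an empty filter."""
    if not isinstance(filter_dict, dict):
        return {}
    for field, condition in filter_dict.items():
        if field not in ALLOWED_FIELDS:
            return {}  # hallucinated attribute
        if field == "features" and isinstance(condition, str):
            continue   # plain-string features are fine per the schema
        if not isinstance(condition, dict) or not set(condition) <= ALLOWED_OPS:
            return {}  # hallucinated operator such as $regex
    return filter_dict

good = {"price": {"$lt": 50}, "features": {"$eq": "formal shirt"}}
bad = {"price": {"$regex": ".*"}}   # hallucinated operator
validate_filter(good)  # passes through unchanged
validate_filter(bad)   # returns {}
```

This kind of guardrail pairs well with the few-shot examples above: the examples reduce how often the LLM strays from the schema, and the validator guarantees a stray output can never crash the query path.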

To wrap up, LLM-enhanced search is reshaping how we discover products online. Share your thoughts and observations on how you are employing Generative AI to redefine traditional search systems.


Published via Towards AI
