Introduction
Many e-commerce websites are interested in enhancing their recipe search experiences. Semantic search, when applied correctly, allows customers to quickly find the necessary ingredients based on more natural queries, such as "something for Valentine's Day" or "Thanksgiving meals."
In this article, we will demonstrate how to use Elasticsearch to implement a semantic search that supports such queries. We will configure an index to store the catalog of ingredients and products from a supermarket and demonstrate how this index can be used to improve recipe searches. Throughout the article, we will explain how to create this data structure and apply natural language processing techniques to provide relevant results aligned with the customer's intent.
All the code presented in this article was developed in Python and is available on GitHub. You can access the repository to review the source code, make adjustments as needed, and implement the solutions directly in your development environment.
Starting the implementation of semantic search
To begin implementing semantic search, we first need to define the natural language model. Elastic provides its own model, ELSER, but also offers support for integrating NLP models from various providers, such as Hugging Face. This flexibility allows you to choose the option that best suits your needs.
In this article, we will use ELSER, which reduces the complexity of deploying and managing NLP models. Elastic also offers the semantic_text field type, which automates embedding generation. You only need to define an inference endpoint and specify, in your index mapping, the field that will receive the embeddings. During document indexing, the embeddings are generated and automatically associated with that field.
Setup steps
Below are the steps to create an index with semantic search support. By following these instructions, you will have an index configured and ready for semantic searches:
- Create the inference endpoint.
- Create the index, setting the description field as semantic_text so that it can receive the embeddings.
- Index the data into the grocery-catalog index, which will store a product catalog. This catalog was obtained from the Kaggle grocery store dataset listed in the References.
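The three steps above can be sketched as plain request bodies. This is a minimal sketch under stated assumptions, not the code from the article's repository: the endpoint name `elser-grocery`, the service settings, and the use of `copy_to` to feed a `description_embedding` field from `description` are all assumptions chosen to match the field names used by the queries later in the article.

```python
from typing import Any, Dict, Iterable, Iterator

INFERENCE_ID = "elser-grocery"  # hypothetical endpoint name
INDEX_NAME = "grocery-catalog"


def elser_endpoint_body() -> Dict[str, Any]:
    """Body for PUT _inference/sparse_embedding/<INFERENCE_ID>.

    The allocation and thread counts are assumed values, not tuned ones.
    """
    return {
        "service": "elser",
        "service_settings": {"num_allocations": 1, "num_threads": 1},
    }


def catalog_mappings(inference_id: str = INFERENCE_ID) -> Dict[str, Any]:
    """Mapping with a semantic_text field that receives the embeddings.

    Assumption: `description` stays a plain text field for lexical search
    and is copied into `description_embedding` for semantic search.
    """
    return {
        "properties": {
            "name": {"type": "text"},
            "description": {
                "type": "text",
                "copy_to": "description_embedding",
            },
            "description_embedding": {
                "type": "semantic_text",
                "inference_id": inference_id,
            },
        }
    }


def bulk_actions(
    rows: Iterable[Dict[str, str]], index: str = INDEX_NAME
) -> Iterator[Dict[str, Any]]:
    """Turn catalog rows into bulk actions, keeping only the indexed fields.

    Each row is assumed to be a dict with `name` and `description` keys.
    """
    for row in rows:
        yield {
            "_index": index,
            "_source": {
                "name": row["name"],
                "description": row["description"],
            },
        }
```

With a live cluster and the official Python client, the mapping could be passed to `client.indices.create(index=INDEX_NAME, mappings=catalog_mappings())` and the actions to `elasticsearch.helpers.bulk(client, bulk_actions(rows))`. The inference endpoint has to exist before the index is created, since the mapping refers to it by ID.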
Application of semantic search in grocery stores
Now that the index is populated with grocery store product data, we can test and validate queries to see how semantic search improves the results. Our goal is a smarter search experience that understands context and user intent and delivers more relevant, accurate results.
Challenges solved by semantic search
Based on the product catalog, let's explore how semantic search can transform the search experience in grocery stores by addressing vocabulary and context issues that traditional lexical search often struggles with.
1. Interpretation of culinary intentions
Problem 01: A customer might search for "seafood for grilling," but a lexical search system may not fully understand the intent behind the query. It could fail to identify all the seafood products suitable for grilling, returning only those with the exact term "seafood" or "grill" in the product title.
First, we will perform a lexical search and analyze the results. Then, we will do the same with a semantic search, comparing the results for the same search term.
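The side-by-side comparison described above could be wrapped in a small helper. The sketch below is an illustration rather than the article's own code: the function names are hypothetical, `client` is assumed to be an Elasticsearch Python client, and the index and field names are taken from the queries used in this article.

```python
from typing import Any, Dict, List, Tuple


def lexical_query(text: str) -> Dict[str, Any]:
    """Keyword match over the product name and description."""
    return {"multi_match": {"query": text, "fields": ["name", "description"]}}


def semantic_query(text: str) -> Dict[str, Any]:
    """Semantic match against the semantic_text field."""
    return {"semantic": {"field": "description_embedding", "query": text}}


def top_hits(
    client: Any, index: str, query: Dict[str, Any], size: int = 5
) -> List[Tuple[str, float]]:
    """Run one search and return (product name, score) pairs."""
    response = client.search(
        index=index,
        size=size,
        source_excludes="description_embedding",
        query=query,
    )
    return [
        (hit["_source"]["name"], hit["_score"])
        for hit in response["hits"]["hits"]
    ]


def compare(
    client: Any, text: str, size: int = 5
) -> Dict[str, List[Tuple[str, float]]]:
    """Run the same search term through both search types."""
    return {
        "Lexical": top_hits(client, "grocery-catalog", lexical_query(text), size),
        "Semantic": top_hits(
            client, "grocery-catalog-elser", semantic_query(text), size
        ),
    }
```

Calling `compare(client, "seafood for grilling")` would return one list of name and score pairs per search type, which maps directly onto the result tables in this article.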
Query lexical search
response = client.search(
    index="grocery-catalog",
    size=5,
    source_excludes="description_embedding",
    query={
        "multi_match": {
            "query": "seafood for grilling",
            "fields": ["name", "description"],
        }
    },
)
Results:
Search Type | Name | Score |
---|---|---|
Lexical | Northwest Fish Alaskan Bairdi Snow Crab | 10.453125 |
Lexical | Mr. Yoshida's, Sauce Original Gourmet | 7.2289705 |
Lexical | Premium Seafood Variety Pack - 20 pcs | 7.1924105 |
Lexical | American Red Snapper - Whole, Head-On, Cleaned | 6.998647 |
Lexical | Lobster Claws & Arms, Sustainable Wild Caught | 6.438654 |
Lexical search returned some seafood items suitable for grilling, such as American Red Snapper and Northwest Fish Alaskan Bairdi Snow Crab. However, it also ranked less relevant products near the top of the list, such as Mr. Yoshida's sauce, which is a meat sauce rather than a seafood item. This suggests that the lexical algorithm struggled to understand the context of "for grilling."
Semantic search solution
We use a query that combines the term "seafood" with preparation contexts like "grilling" to return a comprehensive list of options, such as fish fillets, shrimp, and scallops, which are ideal for grilling, even if the words "grill" or "seafood" don't appear directly in the product name. This ensures the search results align more closely with the customer's intent.
Query semantic search:
response = es_client.search(
    index="grocery-catalog-elser",
    size=5,  # same number of results as the lexical query
    source_excludes="description_embedding",
    query={
        "semantic": {
            "field": "description_embedding",
            "query": "seafood for grilling",
        }
    },
)
Results:
Search Type | Name | Score |
---|---|---|
Semantic | Whole Head On, Cleaned Branzino Fish | 16.175909 |
Semantic | Alaska Black Cod (Sable Fish) | 15.855331 |
Semantic | American Red Snapper - Whole, Head-On | 15.454779 |
Semantic | Northwest Fish Alaskan Bairdi Snow Crab | 15.855331 |
Semantic | American Red Snapper - Whole, Head-On | 15.3892355 |
Semantic search not only returned products directly related to the term "seafood," but also understood the context of "grilling," bringing up whole fish and fillets suitable for grilling. The key here is the precision of the results, which included whole fish options like Branzino and Alaska Black Cod, both commonly used for grilling.
Problem 02: Many customers search for quick and easy dinner solutions after a long day at work, using terms like "easy weeknight meals." Traditional lexical search may not fully capture the concept of quick meals, often focusing only on products that include the word "easy" in their name.
As we did in the previous problem, we will begin by performing a lexical search. After that, we will apply a solution using semantic search.
Query lexical search
response = client.search(
    index="grocery-catalog",
    size=5,
    source_excludes="description_embedding",
    query={
        "multi_match": {
            "query": "easy weeknight meals",
            "fields": ["name", "description"],
        }
    },
)
Results:
Search Type | Name | Score |
---|---|---|
Lexical | Avery Easy Peel Address Labels, 4200-count | 8.017723 |
Lexical | Omeals Self Heating Emergency/Portable Meals 32 | 6.592727 |
Lexical | Coastal Seafood Yellowfin Tuna Cubed Poke | 5.836883 |
Lexical | Hefty Super Weight 12 oz Foam | 5.8116536 |
Lexical | Vanity Fair Everyday Napkin, 2-Ply, 110-count | 5.752989 |
The lexical search returned far less relevant results, including items completely unrelated to meals, such as Avery Easy Peel Address Labels and Vanity Fair Everyday Napkins. The search did return one useful product (Omeals Self Heating Emergency Meals), but results like the napkins and labels only matched words from the query, such as "easy," without addressing the user's intent to find a fast meal solution.
Semantic search solution
We implemented a query that understands the intent behind quick and easy meals. It associates products that can be prepared quickly, such as pre-cooked meats, frozen pasta, or meal kits, even if they don't explicitly include the word "easy" in the name. This approach ensures that customers find the most suitable options for fast weeknight dinners, addressing the need for convenience.
Query semantic search
response = es_client.search(
    index="grocery-catalog-elser",
    size=5,  # same number of results as the lexical query
    source_excludes="description_embedding",
    query={
        "semantic": {
            "field": "description_embedding",
            "query": "easy weeknight meals",
        }
    },
)
Results:
Search Type | Name | Score |
---|---|---|
Semantic | Omeals Self Heating Emergency/Portable Meals 32 | 14.610006 |
Semantic | Nissin, Cup Noodles, Shrimp, 2.5 oz | 13.751424 |
Semantic | Namaste Gluten Free Waffle & Pancake Mix | 13.73376 |
Semantic | Idaho Spuds, Golden Grill Hashbrown Potatoes | 12.549422 |
Semantic | Nissin, Cup Noodles, Chicken, 24-Count | 12.034527 |
The semantic search returned products clearly related to quick and convenient meals, such as instant noodles (Cup Noodles), hashbrown potatoes, and pancake mixes, which are typical choices for easy weeknight dinners. This demonstrates that semantic search can grasp the concept behind the phrase "easy weeknight meals," capturing the user's intent to find fast and convenient meals. Products from other categories, such as "soda," may also appear when they are relevant to the context (for example, beverages to accompany meals).
2. Regional terms and vocabulary variations
Problem: A customer might search for "soda," while another customer could use "pop" for the same product. Traditional lexical search fails to recognize that both terms refer to the same item.
Query lexical search
response = client.search(
    index="grocery-catalog",
    size=5,
    source_excludes="description_embedding",
    query={
        "multi_match": {
            "query": "refreshing pop drink low sugar",
            "fields": ["name", "description"],
        }
    },
)
Results:
Search Type | Name | Score |
---|---|---|
Lexical | Prime Hydration+ Sticks Electrolyte Drink Mix | 14.492869 |
Lexical | Capri Sun, 100% Juice, Variety Pack | 12.340851 |
Lexical | Joyburst Energy Drink, Frose Rose, 12 | 11.839179 |
Lexical | Kellogg’s Pop-Tarts, Frosted Brown Sugar Cinnamon | 9.97788 |
Lexical | Kind Mini Bars, Variety Pack, 0.7 | 9.336912 |
Lexical search focuses on exact word matches. While it returned products like Prime Hydration and Capri Sun, the direct match on the term "pop" also led to irrelevant results, such as Kellogg's Pop-Tarts, which are a snack and not a beverage. This highlights how lexical search becomes less effective when a term is ambiguous or has multiple meanings.
Semantic search solution
Semantic queries overcome the vocabulary variations that lexical search fails to address. Because the search terms are expanded, results are based on contextual meaning rather than exact wording, which makes them more relevant and comprehensive.
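One way to picture this term expansion is as a weighted bag of related terms on both the query side and the document side, scored by the terms they share. The toy sketch below only illustrates that idea. The terms and weights are invented for the example and are not ELSER output; a real model uses its own vocabulary and learned weights.

```python
from typing import Dict


def sparse_score(query_weights: Dict[str, float], doc_weights: Dict[str, float]) -> float:
    """Dot product over the terms shared by the query and the document."""
    return sum(
        weight * doc_weights[term]
        for term, weight in query_weights.items()
        if term in doc_weights
    )


# Hypothetical expansion of a query about "pop": a related term such as
# "soda" receives weight even though the user never typed it.
query = {"pop": 1.0, "soda": 1.0, "drink": 0.5, "sugar": 0.5}

# Hypothetical expansions of two product descriptions.
soda_doc = {"soda": 2.0, "drink": 1.0, "sparkling": 0.5}
pastry_doc = {"pop": 0.5, "pastry": 2.0, "sugar": 0.5}

soda_score = sparse_score(query, soda_doc)
pastry_score = sparse_score(query, pastry_doc)
```

With these made-up weights the soda product scores higher than the pastry, even though only the pastry contains the literal word "pop". That is the effect the results in this section illustrate.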
Query:
response = es_client.search(
    index="grocery-catalog-elser",
    size=5,  # same number of results as the lexical query
    source_excludes="description_embedding",
    query={
        "semantic": {
            "field": "description_embedding",
            "query": "refreshing pop drink low sugar",
        }
    },
)
Results:
Search Type | Name | Score |
---|---|---|
Semantic | Olipop 12 oz Prebiotics Soda Variety | 14.776867 |
Semantic | Bai Antioxidant Cocofusion, Variety Pack, 18 | 14.663253 |
Semantic | Monster Energy Drink, Zero Ultra, 24 | 14.486348 |
Semantic | Joyburst Energy Variety, 12 fl oz | 14.007214 |
Semantic | Joyburst Energy Drink, Frose Rose, 12 | 13.641038 |
The semantic search returns products that match the concept of "pop" as a synonym for "soda" (such as Olipop Prebiotics Soda), even though the exact term "pop" may not be present in the product name. The search understood the user's intent, a refreshing, low-sugar drink, and returned relevant products, including prebiotic sodas (Olipop) and sugar-free energy drinks (Monster Energy Drink).
Conclusion
The implementation of semantic search in the context of grocery stores has proven to be highly effective in understanding complex queries like "seafood for grilling" and "easy weeknight meals." This approach allowed us to interpret user intent more accurately, returning highly relevant products.
By using Elasticsearch and simplifying the process with ELSER, we were able to apply semantic search quickly and efficiently, significantly improving search results and providing a more agile and targeted shopping experience. This not only optimized the search process, but also increased the relevance of the results offered to customers.
References
Model ELSER:
https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/infer-service-elser.html
Semantic text:
https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-text.html
Dataset:
https://www.kaggle.com/datasets/bhavikjikadara/grocery-store-dataset?select=GroceryDataset.csv
Semantic search:
https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search-semantic-text.html