Building a recipe search with Elasticsearch

Learn how to build a recipe search based on semantic search with Elasticsearch.

Introduction

Many e-commerce websites are interested in enhancing their recipe search experiences. Semantic search, when applied correctly, allows customers to quickly find the necessary ingredients based on more natural queries, such as "something for Valentine's Day" or "Thanksgiving meals."

In this article, we will demonstrate how to use Elasticsearch to implement a semantic search that supports such queries. We will configure an index to store the catalog of ingredients and products from a supermarket and demonstrate how this index can be used to improve recipe searches. Throughout the article, we will explain how to create this data structure and apply natural language processing techniques to provide relevant results aligned with the customer's intent.

All the code presented in this article was developed in Python and is available on GitHub. You can access the repository to review the source code, make adjustments as needed, and implement the solutions directly in your development environment.

To begin implementing semantic search, we first need to define the natural language model. Elastic provides its own model, ELSER, but also offers support for integrating NLP models from various providers, such as Hugging Face. This flexibility allows you to choose the option that best suits your needs.

In this article, we will use ELSER, which reduces the complexity of deploying and managing NLP models. Elastic also offers the semantic_text field type, which automates embedding generation. You only need to define an inference endpoint and specify, in your index mapping, the field that will receive the embeddings. During document indexing, the embeddings are generated and associated with that field automatically.

Setup steps

Below are the steps to create an index with semantic search support. By following these instructions, you will have an index configured and ready for semantic searches:

  1. Create the inference endpoint.
  2. Create the index, setting the description field as semantic_text so that it can receive the embeddings.
  3. Index the data into the grocery-catalog index, which will store a product catalog. This catalog comes from the Kaggle grocery store dataset listed under References.
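
The three steps above can be sketched as plain request bodies. This is a minimal sketch under stated assumptions, not the article's actual code: the endpoint name `elser_embeddings`, the allocation settings, the `copy_to` from `description` into a separate `description_embedding` field, and the CSV column names `name` and `description` are all illustrative choices. The sample row in the demo is made up.

```python
import csv
import io

INFERENCE_ID = "elser_embeddings"  # hypothetical endpoint name
INDEX_NAME = "grocery-catalog"     # index name from step 3


def inference_endpoint_request(inference_id=INFERENCE_ID):
    """Step 1: describe the REST call that creates an ELSER inference endpoint."""
    return {
        "method": "PUT",
        "path": f"/_inference/sparse_embedding/{inference_id}",
        "body": {
            "service": "elser",
            # Allocation and thread counts are illustrative values.
            "service_settings": {"num_allocations": 1, "num_threads": 1},
        },
    }


def index_mappings(inference_id=INFERENCE_ID):
    """Step 2: mappings with a semantic_text field that receives the embeddings."""
    return {
        "properties": {
            "name": {"type": "text"},
            # Assumption: keep the raw text for lexical search and copy it
            # into the semantic_text field used by the semantic queries.
            "description": {"type": "text", "copy_to": "description_embedding"},
            "description_embedding": {
                "type": "semantic_text",
                "inference_id": inference_id,
            },
        }
    }


def bulk_actions(csv_file, index_name=INDEX_NAME):
    """Step 3: turn CSV rows into bulk-indexing actions, one per product."""
    for row in csv.DictReader(csv_file):
        yield {
            "_index": index_name,
            "_source": {"name": row["name"], "description": row["description"]},
        }


if __name__ == "__main__":
    # Made-up sample row standing in for the real catalog file.
    sample = io.StringIO(
        "name,description\n"
        "Alaska Black Cod,Wild caught fish fillets\n"
    )
    print(inference_endpoint_request()["path"])
    print(index_mappings()["properties"]["description_embedding"])
    print(list(bulk_actions(sample)))
```

With the official Python client, the mappings could be passed to `client.indices.create(index=INDEX_NAME, mappings=index_mappings())` and the actions to `elasticsearch.helpers.bulk(client, bulk_actions(f))`. The inference endpoint request can be sent from Kibana Dev Tools or through the client's inference API.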

Application of semantic search in grocery stores

Now that the index is populated with grocery store product data, we can test and validate queries to see how semantic search improves the results. Our goal is to provide a smarter search experience that understands context and user intent, delivering more relevant and accurate results.

Based on the product catalog, let's explore how semantic search can transform the search experience in grocery stores by addressing vocabulary and context issues that traditional lexical search often struggles with.

1. Interpretation of culinary intentions

Problem 01: A customer might search for "seafood for grilling" but a lexical search system may not fully understand the intent behind the query. It could fail to identify all the seafood products suitable for grilling, returning only those with the exact term "seafood" or "grill" in the product title.

First, we will perform a lexical search and analyze the results. Then, we will do the same with a semantic search, comparing the results for the same search term.

Query lexical search

response = client.search(
    index="grocery-catalog",
    size=5,
    source_excludes="description_embedding",
    query={
        "multi_match": {
            "query": "seafood for grilling",
            "fields": ["name", "description"],
        }
    },
)

Results:

Search Type  Name                                            Score
Lexical      Northwest Fish Alaskan Bairdi Snow Crab         10.453125
Lexical      Mr. Yoshida's, Sauce Original Gourmet           7.2289705
Lexical      Premium Seafood Variety Pack - 20 pcs           7.1924105
Lexical      American Red Snapper - Whole, Head-On, Cleaned  6.998647
Lexical      Lobster Claws & Arms, Sustainable Wild Caught   6.438654

Lexical search returned some seafood items suitable for grilling, such as American Red Snapper and Northwest Fish Alaskan Bairdi Snow Crab. However, it also ranked a less relevant product near the top of the list: Mr. Yoshida's sauce, which is a meat sauce rather than a seafood item. This suggests that the lexical algorithm matched individual words without understanding the context of "for grilling."

Semantic search solution

We use a query that combines the term "seafood" with preparation contexts like "grilling" to return a comprehensive list of options, such as fish fillets, shrimp, and scallops, which are ideal for grilling, even if the words "grill" or "seafood" don't appear directly in the product name. This ensures the search results align more closely with the customer’s intent.

Query semantic search:

response = es_client.search(
    index="grocery-catalog-elser",
    size=5,  # top five hits, matching the result tables in this article
    source_excludes="description_embedding",
    query={
        "semantic": {
            "field": "description_embedding",
            "query": "seafood for grilling",
        }
    },
)

Results:

Search Type  Name                                     Score
Semantic     Whole Head On, Cleaned Branzino Fish     16.175909
Semantic     Alaska Black Cod (Sable Fish)            15.855331
Semantic     American Red Snapper - Whole, Head-On    15.454779
Semantic     Northwest Fish Alaskan Bairdi Snow Crab  15.855331
Semantic     American Red Snapper - Whole, Head-On    15.3892355

Semantic search not only returned products directly related to the term "seafood," but also understood the context of "grilling," bringing up whole fish and fillets suitable for grilling. The key here is the precision of the results, which included whole fish options like Branzino and Alaska Black Cod, both commonly used for grilling.
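
The side-by-side comparison used in this section can be scripted with a small helper. The sketch below is one possible way to do it, not the article's own code: the function names are hypothetical, and `client` is assumed to be an already connected Elasticsearch Python client. The index and field names are the ones used in the queries above.

```python
def lexical_query(text):
    """Build the multi_match query used for the lexical searches."""
    return {"multi_match": {"query": text, "fields": ["name", "description"]}}


def semantic_query(text):
    """Build the semantic query against the semantic_text field."""
    return {"semantic": {"field": "description_embedding", "query": text}}


def top_hits(response):
    """Extract (name, score) pairs from a search response."""
    return [
        (hit["_source"]["name"], hit["_score"])
        for hit in response["hits"]["hits"]
    ]


def compare(client, text, size=5):
    """Run the same text through both search types and collect the top hits."""
    lexical = client.search(
        index="grocery-catalog",
        size=size,
        source_excludes="description_embedding",
        query=lexical_query(text),
    )
    semantic = client.search(
        index="grocery-catalog-elser",
        size=size,
        source_excludes="description_embedding",
        query=semantic_query(text),
    )
    return {"lexical": top_hits(lexical), "semantic": top_hits(semantic)}
```

Calling `compare(client, "seafood for grilling")` would return the two ranked lists in one dictionary, which makes it easy to reuse the same comparison for the other problems in this article.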

Problem 02: Many customers search for quick and easy dinner solutions after a long day at work, using terms like "easy weeknight meals." Traditional lexical search may not fully capture the concept of quick meals, often focusing only on products that include the word "easy" in their name.

As we did in the previous problem, we will begin by performing a lexical search. After that, we will apply a solution using semantic search.

Query lexical search

response = client.search(
    index="grocery-catalog",
    size=5,
    source_excludes="description_embedding",
    query={
        "multi_match": {
            "query": "easy weeknight meals",
            "fields": ["name", "description"],
        }
    },
)

Results:

Search Type  Name                                              Score
Lexical      Avery Easy Peel Address Labels, 4200-count        8.017723
Lexical      Omeals Self Heating Emergency/Portable Meals 32   6.592727
Lexical      Coastal Seafood Yellowfin Tuna Cubed Poke         5.836883
Lexical      Hefty Super Weight 12 oz Foam                     5.8116536
Lexical      Vanity Fair Everyday Napkin, 2-Ply, 110-count     5.752989

The lexical search returned far less relevant results, including items unrelated to meals, such as Avery Easy Peel Address Labels and Vanity Fair Everyday Napkins. Only one result (Omeals Self Heating Emergency Meals) is useful. The others, like the napkins and labels, only matched the words "easy" or "weeknight" in their names or descriptions, without addressing the user's need for a fast meal.

Semantic search solution

We implemented a query that understands the intent behind quick and easy meals. It associates products that can be prepared quickly, such as pre-cooked meats, frozen pasta, or meal kits, even if they don't explicitly include the word "easy" in the name. This approach ensures that customers find the most suitable options for fast weeknight dinners, addressing the need for convenience.

Query semantic search

response = es_client.search(
    index="grocery-catalog-elser",
    size=5,  # top five hits, matching the result tables in this article
    source_excludes="description_embedding",
    query={
        "semantic": {
            "field": "description_embedding",
            "query": "easy weeknight meals",
        }
    },
)

Results:

Search Type  Name                                              Score
Semantic     Omeals Self Heating Emergency/Portable Meals 32   14.610006
Semantic     Nissin, Cup Noodles, Shrimp, 2.5 oz               13.751424
Semantic     Namaste Gluten Free Waffle & Pancake Mix          13.73376
Semantic     Idaho Spuds, Golden Grill Hashbrown Potatoes      12.549422
Semantic     Nissin, Cup Noodles, Chicken, 24-Count            12.034527

The semantic search returned products clearly related to quick and convenient meals, such as instant noodles (Cup Noodles), pre-cooked potatoes, and pancake mixes, which are typical choices for easy weeknight dinners. This demonstrates that semantic search can grasp the concept behind the phrase "easy weeknight meals," capturing the user's intent to find fast and convenient meals. Interestingly, products from other categories, such as "soda," may also be included when relevant in the context (e.g., beverages to accompany meals).

2. Regional terms and vocabulary variations

Problem: A customer might search for "soda," while another customer could use "pop" for the same product. Traditional lexical search fails to recognize that both terms refer to the same item.

Query lexical search

response = client.search(
    index="grocery-catalog",
    size=5,
    source_excludes="description_embedding",
    query={
        "multi_match": {
            "query": "refreshing pop drink low sugar",
            "fields": ["name", "description"],
        }
    },
)

Results:

Search Type  Name                                               Score
Lexical      Prime Hydration+ Sticks Electrolyte Drink Mix      14.492869
Lexical      Capri Sun, 100% Juice, Variety Pack                12.340851
Lexical      Joyburst Energy Drink, Frose Rose, 12              11.839179
Lexical      Kellogg’s Pop-Tarts, Frosted Brown Sugar Cinnamon  9.97788
Lexical      Kind Mini Bars, Variety Pack, 0.7                  9.336912

Lexical search focuses on exact word matches. While it returned products like Prime Hydration and Capri Sun, the direct match with the term "pop" also led to irrelevant results, such as Kellogg’s Pop-Tarts, which is a snack and not a beverage. This highlights how lexical search can be less effective when a term has multiple meanings or can be ambiguous.

Semantic search solution

In semantic queries, we can overcome the issue of vocabulary variations that lexical search fails to address. By expanding the search terms, we are able to achieve results based on contextual meaning, providing more relevant and comprehensive responses.

Query:

response = es_client.search(
    index="grocery-catalog-elser",
    size=5,  # top five hits, matching the result tables in this article
    source_excludes="description_embedding",
    query={
        "semantic": {
            "field": "description_embedding",
            "query": "refreshing pop drink low sugar",
        }
    },
)

Results:

Search Type  Name                                          Score
Semantic     Olipop 12 oz Prebiotics Soda Variety          14.776867
Semantic     Bai Antioxidant Cocofusion, Variety Pack, 18  14.663253
Semantic     Monster Energy Drink, Zero Ultra, 24          14.486348
Semantic     Joyburst Energy Variety, 12 fl oz             14.007214
Semantic     Joyburst Energy Drink, Frose Rose, 12         13.641038

The semantic search returns products that directly match the concept of "pop" as a synonym for "soda" (such as Olipop Prebiotics Soda), even though the exact term "pop" may not be present in the product name. The search understood the user’s intent (a refreshing, low-sugar drink) and returned relevant products, including prebiotic sodas (Olipop) and sugar-free energy drinks (Monster Energy Drink).
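
To illustrate why a query containing "pop" can reach a product described as "soda," it helps to picture how a sparse model such as ELSER represents text: as a set of weighted terms that can include related words absent from the original text. The toy sketch below is only an illustration of that idea. The tokens and weights are made up and are not real ELSER output, and the scoring is a plain dot product over shared tokens.

```python
def sparse_dot(query_weights, doc_weights):
    """Score a document by summing weight products over shared tokens."""
    return sum(
        weight * doc_weights[token]
        for token, weight in query_weights.items()
        if token in doc_weights
    )


# Hypothetical expansion of "refreshing pop drink low sugar":
# the related term "soda" appears even though the user never typed it.
query = {"pop": 1.2, "soda": 0.9, "drink": 1.0, "sugar": 0.6}

# Hypothetical expansions of two product descriptions.
soda_doc = {"soda": 1.4, "drink": 0.8, "prebiotic": 1.1}
pastry_doc = {"pop": 0.5, "tart": 1.3, "cinnamon": 0.9}

print("soda product:", sparse_dot(query, soda_doc))
print("pastry product:", sparse_dot(query, pastry_doc))
```

In this toy setup the soda product scores higher than the pastry because it shares the expanded terms "soda" and "drink" with the query, while the pastry only shares the literal word "pop."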

Conclusion

The implementation of semantic search in the context of grocery stores has proven to be highly effective in understanding complex queries like "seafood for grilling" and "easy weeknight meals." This approach allowed us to interpret user intent more accurately, returning highly relevant products.

By using Elasticsearch and simplifying the process with ELSER, we were able to apply semantic search quickly and efficiently, significantly improving search results and providing a more agile and targeted shopping experience. This not only optimized the search process, but also increased the relevance of the results offered to customers.

References

Model ELSER:

https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/infer-service-elser.html

Semantic text:

https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-text.html

Dataset:

https://www.kaggle.com/datasets/bhavikjikadara/grocery-store-dataset?select=GroceryDataset.csv

Semantic search:

https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search-semantic-text.html
