Improving multilingual embedding model relevancy with hybrid search reranking


Introduction

In the last part of this series, we walked through deploying Elastic's pre-trained E5 model (as well as other multilingual text embedding models from Hugging Face) and dived into generating dense vector embeddings from your text data using Elasticsearch and Kibana. In this blog, we will examine the results of these embeddings and highlight the significant advantages of leveraging a multilingual model.

Now that we have our index coco_multilingual, a search against it returns documents in multiple languages, each with an “en” field that we can use as an English reference.

Performing a search in English

Let’s try to perform the search in English and see how well it does:

Here, even though the query looks deceptively simple, under the hood we are comparing the numerical embedding of the word ‘kitty’ against the embeddings of all documents in all languages. And because this is a vector search, it semantically matches words related to ‘kitty’: “cat”, “kitten”, “feline”, “gatto” (Italian), “mèo” (Vietnamese), “고양이” (Korean), “猫” (Chinese), and so on. As a result, even though the query is in English, it retrieves content in other languages too. For example, searching for a kitty lying on something also returns documents in Italian, Dutch, or Vietnamese. Talk about efficiency!
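As a minimal sketch of what such a query could look like, the snippet below builds the JSON body of a kNN search that lets Elasticsearch embed the query text at search time. The field names `description`, `language`, and `description_vector`, and the model id `.multilingual-e5-small`, are assumptions for illustration; substitute whatever your own mapping and model deployment use.

```python
import json

# Assumed names for illustration; only the index name and the "en" field
# come from this post.
INDEX = "coco_multilingual"
VECTOR_FIELD = "description_vector"   # hypothetical dense_vector field
MODEL_ID = ".multilingual-e5-small"   # assumed model deployment id


def build_knn_search(query_text, k=10, num_candidates=100):
    """Build a kNN search body that embeds query_text at search time."""
    return {
        "size": k,
        # "description" and "language" are hypothetical source fields.
        "_source": ["description", "language", "en"],
        "knn": {
            "field": VECTOR_FIELD,
            "k": k,
            "num_candidates": num_candidates,
            # Elasticsearch runs the model on the query text and uses the
            # resulting vector for the nearest-neighbor lookup.
            "query_vector_builder": {
                "text_embedding": {
                    "model_id": MODEL_ID,
                    "model_text": query_text,
                }
            },
        },
    }


knn_body = build_knn_search("kitty")
print(json.dumps(knn_body, ensure_ascii=False, indent=2))
```

The printed JSON can be sent as the body of `GET coco_multilingual/_search`, for example from the Kibana Dev Tools console.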

Performing a search for content in other languages

Similarly, searching for the keyword “cat” in Korean (“고양이”) also gives back meaningful results. What’s spectacular here is that the index doesn’t contain a single document in Korean!

This works because the embedding model represents meaning in a shared semantic space, allowing retrieval of relevant images even with a query in a different language than the indexed captions.
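To make the idea of a shared semantic space concrete, here is a toy sketch of how nearest-neighbor ranking by cosine similarity behaves. The three-dimensional vectors are hand-made for illustration only; they are not real E5 embeddings, which have hundreds of dimensions and are produced by the model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made toy vectors (an assumption, NOT model output). In a shared
# semantic space, texts with similar meaning land close together
# regardless of the language they are written in.
query_vector = [0.9, 0.1, 0.2]  # stands in for the Korean query "고양이"
caption_vectors = {
    "un gatto sul divano": [0.8, 0.2, 0.3],    # Italian: a cat on the sofa
    "a red bus on a street": [0.1, 0.9, 0.2],  # unrelated English caption
}

# Rank captions by similarity to the query, most similar first.
ranked = sorted(
    caption_vectors.items(),
    key=lambda item: cosine_similarity(query_vector, item[1]),
    reverse=True,
)

for caption, vector in ranked:
    print(f"{cosine_similarity(query_vector, vector):.3f}  {caption}")
```

With these toy values the Italian cat caption ranks above the unrelated English one, even though the query shares no words with either caption.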

Increasing relevant search results with hybrid search and reranking

We are happy that the relevant results showed up as expected. But in the real world, say in ecommerce or in RAG applications that need to narrow the results down to the 5-10 most applicable ones, we can use a rerank model to prioritize the most relevant results.

Here, performing a query that asks “what color is the cat?” in Vietnamese will yield a lot of results, but the top 1 or 2 might not be the most relevant.
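One possible way to express a hybrid query is to combine a lexical `match` retriever and a kNN retriever with reciprocal rank fusion (RRF). The sketch below only builds the request body. The choice of RRF, the field names `description` and `description_vector`, the model id, and the Vietnamese phrasing of the question are all assumptions for illustration.

```python
import json


def build_hybrid_search(query_text,
                        text_field="description",           # hypothetical text field
                        vector_field="description_vector",  # hypothetical vector field
                        model_id=".multilingual-e5-small",  # assumed model id
                        size=10,
                        rank_window_size=50):
    """Build a search body that fuses lexical and vector results with RRF."""
    # Lexical side: a plain full-text match on the caption text.
    lexical = {"standard": {"query": {"match": {text_field: query_text}}}}
    # Semantic side: kNN over embeddings, with the query embedded at search time.
    semantic = {
        "knn": {
            "field": vector_field,
            "k": rank_window_size,
            "num_candidates": max(100, rank_window_size),
            "query_vector_builder": {
                "text_embedding": {"model_id": model_id, "model_text": query_text}
            },
        }
    }
    return {
        "size": size,
        "retriever": {
            "rrf": {
                "retrievers": [lexical, semantic],
                "rank_window_size": rank_window_size,
            }
        },
    }


# Assumed Vietnamese phrasing of "what color is the cat?"
hybrid_body = build_hybrid_search("con mèo màu gì?")
print(json.dumps(hybrid_body, ensure_ascii=False, indent=2))
```

RRF merges the two ranked lists by rank position rather than by raw score, so the lexical and vector scores do not need to be on the same scale.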

The results all mention a cat, or some form of color:

So let’s improve that! Let’s integrate Cohere’s multilingual rerank model so that the top results better match our question.
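A rough sketch of that integration, under stated assumptions: first an inference endpoint for the Cohere rerank service is created, then a first-stage retriever is wrapped in a `text_similarity_reranker`. The endpoint id `my-cohere-rerank`, the field name `description`, the model name `rerank-multilingual-v3.0`, and the Vietnamese question text are hypothetical choices, and the simple `match` retriever stands in for whatever first-stage query you actually use.

```python
import json

INFERENCE_ID = "my-cohere-rerank"  # hypothetical endpoint id


def build_rerank_endpoint(api_key, model_id="rerank-multilingual-v3.0"):
    """Body for PUT _inference/rerank/<INFERENCE_ID> using the Cohere service."""
    return {
        "service": "cohere",
        "service_settings": {"api_key": api_key, "model_id": model_id},
    }


def wrap_with_reranker(inner_retriever, query_text, field,
                       size=5, rank_window_size=50):
    """Rerank the top rank_window_size hits of inner_retriever against query_text."""
    return {
        "size": size,
        "retriever": {
            "text_similarity_reranker": {
                "retriever": inner_retriever,
                "field": field,                # text the reranker reads
                "inference_id": INFERENCE_ID,
                "inference_text": query_text,  # the question to score against
                "rank_window_size": rank_window_size,
            }
        },
    }


question = "con mèo màu gì?"  # assumed phrasing of "what color is the cat?"
# Placeholder first stage; a hybrid or kNN retriever could be used instead.
first_stage = {"standard": {"query": {"match": {"description": question}}}}

endpoint_body = build_rerank_endpoint(api_key="<COHERE_API_KEY>")
rerank_body = wrap_with_reranker(first_stage, question, field="description")
print(json.dumps(rerank_body, ensure_ascii=False, indent=2))
```

The reranker only rescoring the first `rank_window_size` candidates is a deliberate trade-off: the cheaper first stage casts a wide net, and the more expensive rerank model is applied to a small window before the top `size` hits are returned.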

Now, with the top results, our application can confidently answer that the kitten’s color is black or brown with stripes. What’s even more interesting here is that our vector search actually caught an omission in the English caption in the original dataset. It’s able to find the brown striped cat even though the reference English translation missed that detail. This is the power of vector search.

Conclusion

In this blog, we have walked through the utility of a multilingual embedding model, how to use Elasticsearch to integrate such models and generate embeddings, and how to improve relevance and accuracy with hybrid search and a reranker. You can create a Cloud cluster of your own to try multilingual semantic search using our out-of-the-box E5 model on the language and dataset of your choice.
