How to migrate your Ruby app from OpenSearch to Elasticsearch

A guide to migrating a Ruby codebase from the OpenSearch client to the Elasticsearch client.

The OpenSearch Ruby client was forked from the Elasticsearch Ruby Client in version 7.x, so the codebases are relatively similar. This means when migrating a Ruby codebase from OpenSearch to Elasticsearch, the code from the respective client libraries will look very familiar. In this blog post, I'm going to show an example Ruby app that uses OpenSearch and the steps to migrate this code to Elasticsearch.

Both clients are released under the Apache License 2.0, so they're open source and free software. Elasticsearch's license was recently updated: since version 8.16, the core of Elasticsearch and Kibana is published under the AGPL, an OSI-approved open source license.

Versions

One consideration when migrating is which version of Elasticsearch to use. We recommend the latest stable release, which at the time of writing is 8.17.0. The Elasticsearch Ruby Client minor versions follow the Elasticsearch minor versions, so for Elasticsearch 8.17.x you can use version 8.17.x of the Ruby gem.
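The version pairing described above can be expressed as a version constraint. The following is a minimal sketch using RubyGems' own Gem::Version and Gem::Requirement classes; the helper name client_requirement_for is hypothetical and only illustrates how a pessimistic constraint keeps the gem on the same minor series as the server.

```ruby
require 'rubygems' # Gem::Version and Gem::Requirement ship with Ruby

# Hypothetical helper: given an Elasticsearch server version, build a
# pessimistic constraint that keeps the gem on the same minor series.
def client_requirement_for(server_version)
  major, minor = Gem::Version.new(server_version).segments
  Gem::Requirement.new("~> #{major}.#{minor}.0")
end

requirement = client_requirement_for('8.17.0')
puts requirement                                            # ~> 8.17.0
puts requirement.satisfied_by?(Gem::Version.new('8.17.3'))  # true
puts requirement.satisfied_by?(Gem::Version.new('8.18.0'))  # false
```

In a Gemfile, the same constraint would be written as gem 'elasticsearch', '~> 8.17.0'.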

OpenSearch was forked from Elasticsearch 7.10.2, so the APIs may have diverged since then and each engine may offer features the other lacks. That's out of scope for this post, though: I'm only going to look into the most common operations in an example app.

For Ruby on Rails, you can use the official Elasticsearch client or the Rails integration libraries. We recommend migrating to the latest stable version of both Elasticsearch and the client. The elasticsearch-rails gem version 8.0.0 supports Rails 6.1, 7.0 and 7.1 and Elasticsearch 8.x.

The code

For this example, I followed the steps to install OpenSearch from a tarball. After downloading and extracting the tarball, I needed to set an initial admin password which I'm going to use later to instantiate the client.

I created a directory with a Gemfile that looks like this:

source 'https://rubygems.org'

gem 'opensearch-ruby'

After running bundle install, the gem is installed for my project. This installed opensearch-ruby version 3.4.0 and the version of OpenSearch I'm running is 2.18.0. I wrote the code in an example_code.rb file in the same directory. The initial code in this file is the instantiation of an OpenSearch client:

require 'opensearch'

client = OpenSearch::Client.new(
  host: 'https://localhost:9200',
  user: 'admin',
  password: ENV['OPENSEARCH_INITIAL_ADMIN_PASSWORD'],
  transport_options: { ssl: { verify: false } }
)

The transport option ssl: { verify: false } is passed, as per the user guide, to make testing easier. In production, this should be configured according to your OpenSearch deployment.

Since version 2.12.0 of OpenSearch, the OPENSEARCH_INITIAL_ADMIN_PASSWORD environment variable must be set to a strong password when running the install script. Following the steps to install OpenSearch from a tarball, I exported the variable in my console and now it's available for my Ruby script.
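A minimal sketch of how the two settings above could be gathered in one place: it fails early when the password variable is missing and only disables certificate verification outside production. The helper name opensearch_client_options and the production: flag are assumptions for illustration, not part of the client library.

```ruby
# Hypothetical helper: builds the option hash passed to OpenSearch::Client.new.
# `env` defaults to the process environment; a plain Hash can be passed in tests.
def opensearch_client_options(env: ENV, production: false)
  password = env.fetch('OPENSEARCH_INITIAL_ADMIN_PASSWORD') do
    raise KeyError, 'Export OPENSEARCH_INITIAL_ADMIN_PASSWORD before running the script'
  end

  options = { host: 'https://localhost:9200', user: 'admin', password: password }
  # Skipping certificate verification is only acceptable for local testing.
  options[:transport_options] = { ssl: { verify: false } } unless production
  options
end

# Example call with a placeholder password instead of the real environment.
options = opensearch_client_options(env: { 'OPENSEARCH_INITIAL_ADMIN_PASSWORD' => 'example-only' })
pp options
```

The resulting hash could then be handed to the client with OpenSearch::Client.new(opensearch_client_options).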

A simple way to make sure the client is connecting to OpenSearch is to call the cluster.health API:

puts 'HEALTH:'
pp client.cluster.health

And indeed it works:

$ bundle exec ruby example_code.rb
HEALTH:
{"cluster_name"=>"opensearch",
 "status"=>"yellow",
 "timed_out"=>false,
 "number_of_nodes"=>1,
 "number_of_data_nodes"=>1,
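A single-node install typically reports yellow because replica shards have no second node to be assigned to, as in the output above. Here is a minimal sketch of turning that response into a pass/fail check; the method name cluster_usable? is hypothetical, and it works on the hash-like body returned by client.cluster.health.

```ruby
# Hypothetical helper: accepts the hash-like body returned by
# client.cluster.health and reports whether the cluster can serve requests.
def cluster_usable?(health)
  !health['timed_out'] && %w[green yellow].include?(health['status'])
end

# Sample data copied from the output above.
health = { 'cluster_name' => 'opensearch', 'status' => 'yellow', 'timed_out' => false }
puts cluster_usable?(health)                           # true
puts cluster_usable?(health.merge('status' => 'red'))  # false
```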

I tested some of the common examples we have on the Elasticsearch Ruby client documentation, and they work as expected:

index = 'books'
puts 'Creating index'
response = client.indices.create(index: index)
puts response
# Creating index
# {"acknowledged"=>true, "shards_acknowledged"=>true, "index"=>"books"}

puts 'Indexing a document'
document = { title: 'The Time Machine', author: 'H. G. Wells', year: 1895 }
response = client.index(index: index, body: document, refresh: true)
puts response
# Indexing a document
# {"_index"=>"books", "_id"=>"esalT5MB4vnuJz5TtqOc", "_version"=>1, "result"=>"created", "forced_refresh"=>true, "_shards"=>{"total"=>2, "successful"=>1, "failed"=>0}, "_seq_no"=>0, "_primary_term"=>1}

id = response['_id']
puts 'Getting document'
response = client.get(index: index, id: id)
puts response
# Getting document
# {"_index"=>"books", "_id"=>"esalT5MB4vnuJz5TtqOc", "_version"=>1, "_seq_no"=>0, "_primary_term"=>1, "found"=>true, "_source"=>{"title"=>"The Time Machine", "author"=>"H. G. Wells", "year"=>1895}}

puts "Does an index exist?"
puts client.indices.exists(index: 'imaginary_index')
# Does an index exist?
# false

puts 'Processing Bulk request'
body = [
  { index: { _index: 'books', data: { name: 'Leviathan Wakes', author: 'James S.A. Corey', release_date: '2011-06-02', page_count: 561 } } },
  { index: { _index: 'books', data: { name: 'Hyperion', author: 'Dan Simmons', release_date: '1989-05-26', page_count: 482 } } },
  { index: { _index: 'books', data: { name: 'Dune', author: 'Frank Herbert', release_date: '1965-06-01', page_count: 604 } } },
  { index: { _index: 'books', data: { name: 'Dune Messiah', author: 'Frank Herbert', release_date: '1969-10-15', page_count: 331 } } },
  { index: { _index: 'books', data: { name: 'Children of Dune', author: 'Frank Herbert', release_date: '1976-04-21', page_count: 408 } } },
  { index: { _index: 'books', data: { name: 'God Emperor of Dune', author: 'Frank Herbert', release_date: '1981-05-28', page_count: 454 } } },
  { index: { _index: 'books', data: { name: 'Consider Phlebas', author: 'Iain M. Banks', release_date: '1987-04-23', page_count: 471 } } },
  { index: { _index: 'books', data: { name: 'Pandora\'s Star', author: 'Peter F. Hamilton', release_date: '2004-03-02', page_count: 768 } } },
  { index: { _index: 'books', data: { name: 'Revelation Space', author: 'Alastair Reynolds', release_date: '2000-03-15', page_count: 585 } } },
  { index: { _index: 'books', data: { name: 'A Fire Upon the Deep', author: 'Vernor Vinge', release_date: '1992-06-01', page_count: 613 } } },
  { index: { _index: 'books', data: { name: 'Ender\'s Game', author: 'Orson Scott Card', release_date: '1985-06-01', page_count: 324 } } },
  { index: { _index: 'books', data: { name: '1984', author: 'George Orwell', release_date: '1949-06-08', page_count: 328 } } },
  { index: { _index: 'books', data: { name: 'Fahrenheit 451', author: 'Ray Bradbury', release_date: '1953-10-15', page_count: 227 } } },
  { index: { _index: 'books', data: { name: 'Brave New World', author: 'Aldous Huxley', release_date: '1932-06-01', page_count: 268 } } },
  { index: { _index: 'books', data: { name: 'Foundation', author: 'Isaac Asimov', release_date: '1951-06-01', page_count: 224 } } },
  { index: { _index: 'books', data: { name: 'The Giver', author: 'Lois Lowry', release_date: '1993-04-26', page_count: 208 } } },
  { index: { _index: 'books', data: { name: 'Slaughterhouse-Five', author: 'Kurt Vonnegut', release_date: '1969-06-01', page_count: 275 } } },
  { index: { _index: 'books', data: { name: 'The Hitchhiker\'s Guide to the Galaxy', author: 'Douglas Adams', release_date: '1979-10-12', page_count: 180 } } },
  { index: { _index: 'books', data: { name: 'Snow Crash', author: 'Neal Stephenson', release_date: '1992-06-01', page_count: 470 } } },
  { index: { _index: 'books', data: { name: 'Neuromancer', author: 'William Gibson', release_date: '1984-07-01', page_count: 271 } } },
  { index: { _index: 'books', data: { name: 'The Handmaid\'s Tale', author: 'Margaret Atwood', release_date: '1985-06-01', page_count: 311 } } },
  { index: { _index: 'books', data: { name: 'Starship Troopers', author: 'Robert A. Heinlein', release_date: '1959-12-01', page_count: 335 } } },
  { index: { _index: 'books', data: { name: 'The Left Hand of Darkness', author: 'Ursula K. Le Guin', release_date: '1969-06-01', page_count: 304 } } },
  { index: { _index: 'books', data: { name: 'The Moon is a Harsh Mistress', author: 'Robert A. Heinlein', release_date: '1966-04-01', page_count: 288 } } }
]
puts client.bulk(body: body, refresh: true)
# Processing Bulk request
# {"took"=>38, "errors"=>false, "items"=>[{"index"=>{"_index"=>"books", "_id"=>" ...

query = { query: { multi_match: { query: 'dune', fields: ['name'] } } }
puts 'Search results'
response = client.search(index: index, body: query)
puts response['hits']['hits']
# Search results
# {"_index"=>"books", "_id"=>"oEawT5MBOXHuGXdEu5Wu", "_score"=>2.2886353, "_source"=>{"name"=>"Dune", "author"=>"Frank Herbert", "release_date"=>"1965-06-01", "page_count"=>604}}
# {"_index"=>"books", "_id"=>"oUawT5MBOXHuGXdEu5Wu", "_score"=>1.8893257, "_source"=>{"name"=>"Dune Messiah", "author"=>"Frank Herbert", "release_date"=>"1969-10-15", "page_count"=>331}}
# {"_index"=>"books", "_id"=>"okawT5MBOXHuGXdEu5Wu", "_score"=>1.6086557, "_source"=>{"name"=>"Children of Dune", "author"=>"Frank Herbert", "release_date"=>"1976-04-21", "page_count"=>408}}
# {"_index"=>"books", "_id"=>"o0awT5MBOXHuGXdEu5Wu", "_score"=>1.40059, "_source"=>{"name"=>"God Emperor of Dune", "author"=>"Frank Herbert", "release_date"=>"1981-05-28", "page_count"=>454}}

puts 'Updating document'
document = { title: 'Walkaway', author: 'Cory Doctorow', release_date: '2017' }
response = client.index(index: index, body: document, refresh: true)
id = response['_id']
response = client.update(index: index, id: id, body: { doc: { release_date: '2017-04-26' } })
puts response
# Updating document
# {"_index"=>"books", "_id"=>"degnZJMBIGr4X0Yim55L", "_version"=>2, "result"=>"updated", "_shards"=>{"total"=>2, "successful"=>1, "failed"=>0}, "_seq_no"=>26, "_primary_term"=>1}

puts 'Retrieving multiple documents'
response = client.search(index: index, body: { query: { match_all: {} }, size: 3, stored_fields: '_id' })
# Collect the document IDs from the search hits
ids = response['hits']['hits'].map { |hit| hit['_id'] }
response = client.mget(index: index, body: { ids: ids })
puts response
# Retrieving multiple documents
# {"docs"=>[{"_index"=>"books", "_id"=>"qeg2ZJMBIGr4X0YiiqD2", "_version"=>1, "_seq_no"=>0, "_primary_term"=>1, "found"=>true, "_source"=>{"title"=>"The Time Machine", "author"=>"H. G. Wells", "year"=>1895}}, {"_index"=>"books", "_id"=>"q-g2ZJMBIGr4X0Yii6Ah", "_version"=>1, "_seq_no"=>1, "_primary_term"=>1, "found"=>true, "_source"=>{"name"=>"Leviathan Wakes", "author"=>"James S.A. Corey", "release_date"=>"2011-06-02", "page_count"=>561}}, {"_index"=>"books", "_id"=>"rOg2ZJMBIGr4X0Yii6Ah", "_version"=>1, "_seq_no"=>2, "_primary_term"=>1, "found"=>true, "_source"=>{"name"=>"Hyperion", "author"=>"Dan Simmons", "release_date"=>"1989-05-26", "page_count"=>482}}]}

puts "Count #{client.count(index: index)['count']}"
puts 'Deleting by query'
response = client.delete_by_query(index: index, body: { query: { match: { author: 'Robert A. Heinlein' } } }, refresh: true)
puts response
puts "Count #{client.count(index: index)['count']}"
# Count 26
# Deleting by query
# {"took"=>16, "timed_out"=>false, "total"=>2, "deleted"=>2, "batches"=>1, "version_conflicts"=>0, "noops"=>0, "retries"=>{"bulk"=>0, "search"=>0}, "throttled_millis"=>0, "requests_per_second"=>-1.0, "throttled_until_millis"=>0, "failures"=>[]}
# Count 24

puts 'Deleting document'
response = client.delete(index: index, id: id)
puts response
# Deleting document
# {"_index"=>"books", "_id"=>"nEawT5MBOXHuGXdEu5WA", "_version"=>2, "result"=>"deleted", "_shards"=>{"total"=>2, "successful"=>1, "failed"=>0}, "_seq_no"=>25, "_primary_term"=>1}

puts 'Deleting index'
response = client.indices.delete(index: index)
puts response
# Deleting index
# {"acknowledged"=>true}
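The Bulk API reports per-item failures inside an otherwise successful response, so the "errors" flag shown in the bulk output above is worth checking before and after a migration. Here is a minimal sketch that collects failed items from a bulk response; bulk_failures is a hypothetical helper and the sample response is made up for illustration.

```ruby
# Hypothetical helper: returns the entries of a bulk response that failed.
# Each item is a single-key hash such as { "index" => { ... } }.
def bulk_failures(response)
  return [] unless response['errors']

  response['items'].filter_map do |item|
    action, result = item.first
    { action: action, id: result['_id'], reason: result.dig('error', 'reason') } if result['error']
  end
end

# Made-up sample response with one success and one failure.
sample = {
  'errors' => true,
  'items' => [
    { 'index' => { '_index' => 'books', '_id' => '1', 'status' => 201 } },
    { 'index' => { '_index' => 'books', '_id' => '2', 'status' => 400,
                   'error' => { 'type' => 'mapper_parsing_exception', 'reason' => 'failed to parse' } } }
  ]
}
pp bulk_failures(sample)
```

With the real client, the helper would be called as bulk_failures(client.bulk(body: body, refresh: true)).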

Migrating to Elasticsearch

The first step is to add the elasticsearch gem to the Gemfile. After running bundle install, the Elasticsearch Ruby client will be installed. If you want to test your code before fully migrating, you can leave the opensearch-ruby gem in place for now.

The next important step is going to be the client instantiation. This is going to depend on how you're running Elasticsearch. To keep a similar approach for these examples, I am following the steps in Download Elasticsearch and running it locally.

When running bin/elasticsearch, Elasticsearch will start with security features automatically configured. Make sure you copy the password for the elastic user (if you lose it, you can reset it by running bin/elasticsearch-reset-password -u elastic). If you're following this example, stop OpenSearch before starting Elasticsearch, since both listen on the same port.
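Because both servers use port 9200 in this example, it can help to confirm the port is free before starting Elasticsearch. Here is a minimal sketch using Ruby's standard socket library; the helper name port_in_use? is hypothetical.

```ruby
require 'socket'

# Hypothetical helper: true when something accepts TCP connections on host:port.
def port_in_use?(host, port, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, IOError
  # Connection refused, unreachable or timed out: nothing usable is listening.
  false
end

if port_in_use?('localhost', 9200)
  puts 'Port 9200 is taken: stop OpenSearch before starting Elasticsearch'
else
  puts 'Port 9200 is free'
end
```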

At the beginning of example_code.rb, I commented out the OpenSearch client instantiation and added the instantiation for an Elasticsearch client:

# require 'opensearch'

# client = OpenSearch::Client.new(
#   host: 'https://localhost:9200',
#   user: 'admin',
#   password: ENV['OPENSEARCH_INITIAL_ADMIN_PASSWORD'],
#   transport_options: { ssl: { verify: false } }
# )

require 'elasticsearch'

client = Elasticsearch::Client.new(
  host: 'https://localhost:9200',
  user: ENV['ELASTICSEARCH_USER'],
  password: ENV['ELASTICSEARCH_PASSWORD'],
  transport_options: { ssl: { verify: false } }
)

As you can see, the code is almost identical in this testing scenario. It will differ according to the deployment of Elasticsearch and how you decide to connect and authenticate with it. The same security caveat applies here as with OpenSearch: disabling SSL verification is just for testing purposes and should not be used in production.
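While both gems are installed, the two instantiations can sit behind one switch so the same script can be pointed at either server. This is a minimal sketch: the BACKENDS table and the build_client helper are assumptions for illustration, and disabling certificate verification is for local testing only.

```ruby
# Hypothetical lookup table: which gem to load and which credentials to read.
BACKENDS = {
  'opensearch' => {
    library: 'opensearch', class_name: 'OpenSearch::Client',
    user: -> { 'admin' },
    password: -> { ENV['OPENSEARCH_INITIAL_ADMIN_PASSWORD'] }
  },
  'elasticsearch' => {
    library: 'elasticsearch', class_name: 'Elasticsearch::Client',
    user: -> { ENV['ELASTICSEARCH_USER'] },
    password: -> { ENV['ELASTICSEARCH_PASSWORD'] }
  }
}.freeze

# Hypothetical factory: returns a client for the named backend.
def build_client(backend)
  config = BACKENDS.fetch(backend) { raise ArgumentError, "Unknown backend: #{backend}" }
  require config[:library] # load the gem only when that backend is selected
  Object.const_get(config[:class_name]).new(
    host: 'https://localhost:9200',
    user: config[:user].call,
    password: config[:password].call,
    transport_options: { ssl: { verify: false } } # local testing only
  )
end
```

The script could then pick the backend at run time, for example with client = build_client(ENV.fetch('SEARCH_BACKEND', 'elasticsearch')), where SEARCH_BACKEND is an assumed variable name. Once the migration is complete, the table and the opensearch-ruby gem can be removed.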

Once the client was set up, I ran the code again with bundle exec ruby example_code.rb, and everything just works!

Debugging

Depending on the APIs your application is using, you may receive an error when running your code against Elasticsearch where its APIs diverge from OpenSearch's. The REST APIs documentation is an essential reference for detailed information on how to use the APIs. Make sure to check the documentation for the version of Elasticsearch that you're using. You can also refer to the Elasticsearch::API reference.

Some errors you may encounter from Elasticsearch could be:

  • ArgumentError: Required argument '<ARGUMENT>' missing - This is a Client error and it will be raised when a request is missing a required parameter.
  • Elastic::Transport::Transport::Errors::BadRequest: [400] {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"request [/example/_doc] contains unrecognized parameter: [test]"}]... This error comes from Elasticsearch and it means the client code is using a parameter that Elasticsearch doesn't recognize for the API being used.

The Elasticsearch client raises errors from Elasticsearch with the detailed error message sent by the server. So even for unsupported parameters or endpoints, the error should tell you what is different.
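Since the server's JSON body is embedded in the exception message, the reason can be pulled out when logging a failed call. Here is a minimal sketch, assuming the message keeps the "[status] {json}" shape shown above; error_reason is a hypothetical helper and the sample message is a shortened, made-up completion of that example.

```ruby
require 'json'

# Hypothetical helper: extracts the HTTP status and first root-cause reason
# from a message shaped like '[400] {"error":{...}}'. Returns nil otherwise.
def error_reason(message)
  match = message.match(/\A\[(\d+)\]\s*(\{.*\})\z/m)
  return nil unless match

  body = JSON.parse(match[2])
  { status: match[1].to_i, reason: body.dig('error', 'root_cause', 0, 'reason') }
rescue JSON::ParserError, TypeError
  nil
end

# Made-up sample message following the shape of the error listed above.
message = '[400] {"error":{"root_cause":[{"type":"illegal_argument_exception",' \
          '"reason":"request [/example/_doc] contains unrecognized parameter: [test]"}]},"status":400}'
pp error_reason(message)
```

In application code this could be called from a rescue Elastic::Transport::Transport::Errors::BadRequest => e clause, passing e.message.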

Conclusion

As we demonstrated with this example code, the migration of a Ruby app from OpenSearch to Elasticsearch is not too complex from the Ruby side of things. You need to be aware of the versioning and any potential divergent APIs between the search engines. But for the most common actions, the main change when migrating clients is in the instantiation. They're both similar in that respect, but the way the host and credentials are defined varies in relation to how the Stack is being deployed. Once the client is set up, and you verify it's connecting to Elasticsearch, you can replace the OpenSearch client seamlessly with the Elasticsearch client.
