Migrating data
If you want to move your existing Elasticsearch data into your new infrastructure, check out the migration options. You’ll find instructions to guide you through:
- Migrating data from its original source
- Reindexing data from a remote Elasticsearch cluster
- Restoring data into a new deployment
- Restoring data from a snapshot
Migrate your Elasticsearch data
You might have switched to Elastic Cloud Enterprise for any number of reasons, and you’re likely wondering how to get your existing Elasticsearch data into your new infrastructure. In addition to creating as many new deployments with Elasticsearch clusters as you need, you have several options for moving your data over. Choose the option that works best for you:
- Index your data from the original source, which is the simplest method and provides the greatest flexibility for the Elasticsearch version and ingestion method.
- Reindex from a remote cluster, which rebuilds the index from scratch.
- Restore from a snapshot, which copies the existing indices.
One of the many advantages of Elastic Cloud Enterprise is that you can spin up a deployment quickly, try out something, and then delete it if you don’t like it. This flexibility provides the freedom to experiment while your existing production cluster continues to work.
Before you begin
Depending on which option you choose, you might have limitations or need to do some preparation beforehand.
- Index from the source
- The new cluster must be the same size as your old one, or larger, to accommodate the data.
- Reindex from a remote cluster
- The new cluster must be the same size as your old one, or larger, to accommodate the data. Depending on your security settings for your old cluster, you might need to temporarily allow TCP traffic on port 9243 for this procedure.
- Restore from a snapshot
- The new cluster must be the same size as your old one, or larger, to accommodate the data. The new cluster must also be an Elasticsearch version that is compatible with the old cluster (check Elasticsearch snapshot version compatibility for details). If you have not already done so, you will need to set up snapshots for your old cluster using a repository that can be accessed from the new cluster.
Before you migrate your Elasticsearch data, define your index mappings on the new cluster. Index mappings are not migrated during reindex operations.
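As a rough illustration of defining mappings ahead of time, the sketch below builds a create-index request body in Python and prints it in Console form. The field names and types are hypothetical placeholders, and the typeless `mappings.properties` layout assumes Elasticsearch 7.0 or later; older versions expect a mapping type level. The helper name `build_create_index_body` is also an assumption made for this example.

```python
import json


def build_create_index_body(properties):
    """Return a create-index request body that declares explicit mappings."""
    return {"mappings": {"properties": properties}}


# Hypothetical fields, for illustration only. Copy the real definitions from
# the old cluster, for example from the output of GET INDEX_NAME/_mapping.
create_body = build_create_index_body(
    {
        "timestamp": {"type": "date"},
        "message": {"type": "text"},
        "status": {"type": "keyword"},
    }
)

# Print the request in Console form: the request line, then the JSON body.
print("PUT INDEX_NAME")
print(json.dumps(create_body, indent=2))
```

The printed request can be pasted into the API Console or the Kibana Console app after replacing INDEX_NAME and the placeholder fields.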
Index from the source
If you still have access to the original data source, outside of your old Elasticsearch cluster, you can load the data from there. This might be the simplest option, allowing you to choose the Elasticsearch version and take advantage of the latest features. You can use any ingestion method that you want: Logstash, Beats, the Elasticsearch clients, or whatever works best for you.
If the original source isn’t available or has other issues that make it non-viable, there are still two more migration options: getting the data from a remote cluster or restoring from a snapshot.
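For the index-from-source option, one possible approach is to send the records through the Elasticsearch bulk API. The sketch below only builds the newline-delimited request body; sending it (for example as a POST to `_bulk` with the `application/x-ndjson` content type) is left out. The index name `my-index`, the sample records, and the helper name `build_bulk_payload` are hypothetical.

```python
import json


def build_bulk_payload(index_name, documents):
    """Serialize documents into the newline-delimited body used by _bulk."""
    lines = []
    for doc in documents:
        # Each document is preceded by an action line naming the target index.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


# Hypothetical records standing in for rows read from the original source.
records = [
    {"user": "alice", "status": "active"},
    {"user": "bob", "status": "inactive"},
]
payload = build_bulk_payload("my-index", records)
print(payload)
```

Logstash, Beats, and the official clients handle this batching for you; the sketch is only meant to show what ends up on the wire.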
Reindex from a remote cluster
Through the Elasticsearch reindex API, available in version 5.x and later, you can connect your new Elastic Cloud Enterprise deployment remotely to your old Elasticsearch cluster. This pulls the data from your old cluster and indexes it into your new one. Reindexing essentially rebuilds the index from scratch, so it can be resource intensive to run.
- Log in to the Cloud UI.
- Select a deployment or create one.
- If the old Elasticsearch cluster is on a remote host (any type of host accessible over the internet), you need to make sure that the host can be accessed. Access is determined by the Elasticsearch reindex.remote.whitelist user setting.
Domains matching the pattern ["*.io:*", "*.com:*"] are allowed by default, so if your remote host URL matches that pattern you do not need to explicitly define reindex.remote.whitelist.
Otherwise, if your remote endpoint is not covered by the default settings, adjust the setting to add the remote Elasticsearch cluster as an allowed host:
- From your deployment menu, go to the Edit page.
- In the Elasticsearch section, select Manage user settings and extensions. For deployments with existing user settings, you may have to expand the Edit elasticsearch.yml caret for each node type instead.
- Add the following reindex.remote.whitelist: [REMOTE_HOST:PORT] user setting, where REMOTE_HOST is a pattern matching the URL for the remote Elasticsearch host that you are reindexing from, and PORT is the host port number. Do not include the https:// prefix.
Note that if you override the parameter, it replaces the defaults: ["*.io:*", "*.com:*"]. If you still want these patterns to be allowed, you need to specify them explicitly in the value.
For example:
reindex.remote.whitelist: ["*.us-east-1.aws.found.io:9243", "*.com:*"]
- Save your changes.
- From the API Console or in the Kibana Console app, create the destination index on Elastic Cloud Enterprise.
- Copy the index from the remote cluster:
POST _reindex
{
  "source": {
    "remote": {
      "host": "https://REMOTE_ELASTICSEARCH_ENDPOINT:PORT",
      "username": "USER",
      "password": "PASSWORD"
    },
    "index": "INDEX_NAME",
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "INDEX_NAME"
  }
}
- Verify that the new index is present:
GET INDEX_NAME/_search?pretty
- You can remove the reindex.remote.whitelist user setting that you added previously.
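The reindex steps above can be sketched in Python as two small helpers: one that estimates whether a remote HOST:PORT is already covered by the default allowlist patterns, and one that assembles the `_reindex` request body. The wildcard check uses Python's `fnmatch` as an approximation; Elasticsearch's own pattern matching may differ in edge cases. The host `old-cluster.example.org` and both helper names are hypothetical.

```python
from fnmatch import fnmatchcase

# Default patterns quoted in the procedure above.
DEFAULT_WHITELIST = ["*.io:*", "*.com:*"]


def is_host_allowed(host_port, patterns=DEFAULT_WHITELIST):
    """Approximate the allowlist check with shell-style wildcard matching."""
    return any(fnmatchcase(host_port, pattern) for pattern in patterns)


def build_reindex_body(endpoint, username, password, source_index, dest_index):
    """Assemble the body for POST _reindex with a remote source."""
    return {
        "source": {
            "remote": {
                "host": endpoint,
                "username": username,
                "password": password,
            },
            "index": source_index,
            "query": {"match_all": {}},
        },
        "dest": {"index": dest_index},
    }


# Hypothetical remote host, used only to exercise the helpers.
remote = "old-cluster.example.org:9243"
needs_setting = not is_host_allowed(remote)
reindex_body = build_reindex_body(
    "https://" + remote, "USER", "PASSWORD", "INDEX_NAME", "INDEX_NAME"
)
print("reindex.remote.whitelist change needed:", needs_setting)
```

Note that the allowlist entry omits the `https://` prefix while the `host` value in the request body includes it, which is why the sketch adds the scheme only when building the body.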
Restore from a snapshot
If you cannot connect to a remote index for whatever reason, for example because it’s in a non-working state, you can try restoring from the most recent working snapshot.
- On your old Elasticsearch cluster, choose an option to get the name of your snapshot repository bucket:
GET /_snapshot
GET /_snapshot/_all
- Get the snapshot name:
GET /_snapshot/NEW-REPOSITORY-NAME/_all
The output for each entry provides a "snapshot": value which is the snapshot name.
{
  "snapshots": [
    {
      "snapshot": "scheduled-1527616008-instance-0000000004",
      ...
    },
    ...
  ]
}
- From the Cloud UI of the new Elasticsearch cluster, add the snapshot repository. For details about configuring snapshot repositories on Amazon Web Services (AWS), Google Cloud Storage (GCS), or Azure Blob Storage, check Manage snapshot repositories.
If you’re migrating searchable snapshots, the repository name must be identical in the source and destination clusters.
- Start the Restore process.
For deployments with Elastic Stack version 7.2 and higher:
- Open Kibana and go to Management > Snapshot and Restore.
- Under the Snapshots tab, you can find the available snapshots from your newly added snapshot repository. Select any snapshot to view its details, and from there you can choose to restore it.
- Select Restore.
- Select the indices you wish to restore.
- Configure any additional index settings.
- Select Restore snapshot to begin the process.
For deployments with Elastic Stack version 7.1 and lower:
- Open the API Console or the Kibana Console app of the new Elasticsearch cluster and restore the snapshot:
POST /_snapshot/REPOSITORY_NAME/SNAPSHOT_NAME/_restore?pretty
{
  "indices": "*",
  "ignore_unavailable": true,
  "include_global_state": true
}
- Verify that the new index is restored in your Elastic Cloud Enterprise deployment with this query:
GET INDEX_NAME/_search?pretty
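As a sketch of the snapshot steps above, the following Python picks the most recent successful snapshot from a `GET /_snapshot/REPOSITORY_NAME/_all` response and builds the matching restore body. It assumes each entry carries the `state` and `start_time_in_millis` fields alongside `snapshot`; the sample response, its second and third snapshot names, and both helper names are hypothetical.

```python
def latest_successful_snapshot(response):
    """Return the name of the newest snapshot whose state is SUCCESS, or None."""
    candidates = [
        entry
        for entry in response.get("snapshots", [])
        if entry.get("state") == "SUCCESS"
    ]
    if not candidates:
        return None
    newest = max(candidates, key=lambda entry: entry.get("start_time_in_millis", 0))
    return newest["snapshot"]


def build_restore_body(indices="*", include_global_state=True):
    """Assemble the body for POST /_snapshot/REPOSITORY_NAME/SNAPSHOT_NAME/_restore."""
    return {
        "indices": indices,
        "ignore_unavailable": True,
        "include_global_state": include_global_state,
    }


# Hypothetical, trimmed-down response for illustration only.
sample_response = {
    "snapshots": [
        {
            "snapshot": "scheduled-1527616008-instance-0000000004",
            "state": "SUCCESS",
            "start_time_in_millis": 1527616008000,
        },
        {
            "snapshot": "scheduled-1527702408-instance-0000000004",
            "state": "FAILED",
            "start_time_in_millis": 1527702408000,
        },
        {
            "snapshot": "scheduled-1527529608-instance-0000000004",
            "state": "SUCCESS",
            "start_time_in_millis": 1527529608000,
        },
    ]
}

snapshot_name = latest_successful_snapshot(sample_response)
restore_body = build_restore_body()
print("POST /_snapshot/REPOSITORY_NAME/" + snapshot_name + "/_restore?pretty")
print(restore_body)
```

Filtering on the snapshot state before picking the newest entry mirrors the advice to restore from the most recent working snapshot rather than simply the latest one.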