Upgrading self-managed deployments

We highly recommend that all Enterprise Search users keep their deployments up to date with the latest available version to have access to new features, security updates and performance improvements. This guide is designed to help customers through the upgrade process, to minimize the impact of an upgrade on production environments, and to ensure data safety during an upgrade. Finally, the guide should help Enterprise Search users troubleshoot any issues that may occur during an upgrade.

When upgrading Enterprise Search, you must also upgrade Elasticsearch. Both services must be running the same minor version.
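The same-minor-version requirement can be checked before a rollout. The following is a minimal sketch, not part of Enterprise Search: the helper names are hypothetical, and it assumes both services report versions in the usual major.minor.patch form.

```python
from __future__ import annotations

import re


def major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '8.11.3'."""
    match = re.match(r"^(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_compatible(enterprise_search: str, elasticsearch: str) -> bool:
    """True when both services share the same major and minor version."""
    return major_minor(enterprise_search) == major_minor(elasticsearch)
```

Patch versions are deliberately ignored here, since only the minor version has to match.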

If you’re upgrading from App Search 7.6 or earlier, follow Migrating from self-managed App Search.

Before you begin

Before attempting an upgrade of an Enterprise Search cluster to a newer version, you need to take a few steps to guarantee your data safety and increase the probability of a successful upgrade:

  • Before you upgrade production servers, test the upgrades in a development environment to familiarize yourself with the process. Using an Elasticsearch snapshot created from a production deployment may be the best option to completely test the upgrade process without risking your service availability or data consistency.
  • Stop writing to your Elasticsearch cluster. Enable read-only mode to guarantee a consistent snapshot of your data.
  • Back up your data with Elasticsearch snapshots. To roll back to an earlier version of Enterprise Search, you must have a backup of your data stored in Elasticsearch.
  • Review breaking changes for the new version of Enterprise Search (see Breaking changes by version and product).
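As an illustration of the backup step, here is a sketch that builds a request for the Elasticsearch create snapshot API (`PUT /_snapshot/<repository>/<snapshot>`). It assumes a snapshot repository is already registered. The function name, repository name, and snapshot name are placeholders, and authentication is left out because it depends on how your cluster is secured.

```python
import urllib.request
from urllib.parse import quote


def build_snapshot_request(es_url: str, repository: str, snapshot: str) -> urllib.request.Request:
    """Build (but do not send) a create-snapshot request.

    wait_for_completion=true makes Elasticsearch respond only once the
    snapshot has finished, so the caller knows the backup exists.
    """
    url = (
        f"{es_url.rstrip('/')}/_snapshot/"
        f"{quote(repository, safe='')}/{quote(snapshot, safe='')}"
        "?wait_for_completion=true"
    )
    return urllib.request.Request(url, method="PUT")


# To send it, add your own authentication headers first, then:
# with urllib.request.urlopen(build_snapshot_request(...)) as response:
#     print(response.read().decode("utf-8"))
```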

Upgrade process overview

How you upgrade Enterprise Search is going to depend on many factors, such as:

  • Your ability to create Elasticsearch snapshots and recover from them
  • Available resources. Can you spin up a new Elasticsearch cluster and/or new Enterprise Search instances while the old infrastructure is running?

We will outline different scenarios for the upgrade process below, but first, here is an overview of the Enterprise Search upgrade process to help you navigate the different approaches outlined further in this guide:

  • Enterprise Search uses Elasticsearch as its only data store. The Elasticsearch cluster is used for control layer data (engines, settings, etc), documents, search indexes, API logs and analytics events.
  • Every time a new instance of Enterprise Search starts, it checks every index within the Elasticsearch cluster to see if it needs upgrading to a newer version. When the structure of an index changes between versions, the data stored in that index has to be reindexed.
  • All upgrades are done in a non-destructive way and are performed automatically. Existing indexes are never deleted and we only create new indexes with a copy of the data.
  • If there are multiple Enterprise Search instances starting against a single Elasticsearch cluster, they will use a distributed locking mechanism to coordinate the upgrade process and ensure it is done in a consistent way. We create an installation lock record in Elasticsearch to do this. Until the upgrade process is complete, the new version of Enterprise Search will not begin to serve API requests.
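The behavior described above can be pictured with a toy in-memory model. This is only an illustration of the two ideas (one lock holder performs the upgrade, and data is copied into a new index while the old index is left alone). It is not how Enterprise Search is implemented, and every name in it, including the index names used in the usage note, is hypothetical.

```python
class InMemoryStore:
    """Stand-in for the Elasticsearch cluster: index name -> list of documents."""

    def __init__(self):
        self.indexes = {}
        self.lock_owner = None  # plays the role of the installation lock record

    def try_acquire_lock(self, owner):
        """Take the lock if nobody holds it; report whether we got it."""
        if self.lock_owner is None:
            self.lock_owner = owner
            return True
        return False

    def release_lock(self, owner):
        if self.lock_owner == owner:
            self.lock_owner = None


def start_instance(store, instance_id, old_name, new_name, transform):
    """Simulate instance startup; returns 'upgraded', 'already-upgraded' or 'waiting'."""
    if new_name in store.indexes:
        return "already-upgraded"
    if not store.try_acquire_lock(instance_id):
        return "waiting"  # another instance is performing the upgrade
    try:
        # Non-destructive: copy each document into a new index and
        # leave the old index exactly as it was.
        store.indexes[new_name] = [transform(dict(doc)) for doc in store.indexes[old_name]]
        return "upgraded"
    finally:
        store.release_lock(instance_id)
```

For example, with `store.indexes["engines_v1"] = [{"name": "docs"}]`, the first call to `start_instance(store, "a", "engines_v1", "engines_v2", ...)` returns `"upgraded"` and a second instance calling it afterwards gets `"already-upgraded"`, while `engines_v1` still holds the original documents.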

Given the details of the upgrade process described above, there are a few different options available for performing an upgrade of an Enterprise Search deployment based on your ability to handle Enterprise Search downtime during an upgrade and to change Enterprise Search client configuration after an upgrade:

  • Simple in-place upgrade (with downtime, no client-side changes) — replace a set of Enterprise Search instances with a new set running a newer version. In other words, you would shut down the instance, upgrade the packages, then start it back up and let the instances migrate the data if needed.
  • Snapshot-based upgrades (no downtime, client-side changes needed) — create a snapshot of an Elasticsearch cluster used by Enterprise Search, restore the snapshot to a new Elasticsearch cluster, and then start Enterprise Search on the new cluster.
  • In-place upgrades in read-only mode (write downtime, no client-side changes) — switch a cluster to read-only mode, start new Enterprise Search instances to perform the upgrade, shut down old instances and remove read-only mode.

Unfortunately, we cannot prescribe the best upgrade path for your specific situation. It will depend on your requirements for Enterprise Search service availability, on the capabilities of the platform you are using to manage the deployment, and on available resources, etc. See the details below to better understand pros and cons of each upgrade path.
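To make the trade-offs easier to compare, the constraints listed above can be written as a rough lookup. This sketch only lists candidate paths whose stated requirements you can meet. It is not a recommendation, and the function and parameter names are hypothetical.

```python
from __future__ import annotations


def candidate_upgrade_paths(
    can_tolerate_downtime: bool,
    can_pause_writes: bool,
    can_provision_new_infrastructure: bool,
    can_change_client_config: bool,
) -> list[str]:
    """List the upgrade paths from this guide whose requirements are satisfied."""
    paths = []
    if can_tolerate_downtime:
        paths.append("in-place upgrade with downtime")
    # Snapshot-based upgrades need a second cluster and a client switch
    # (a stable proxy endpoint in front of both deployments also counts).
    if can_pause_writes and can_provision_new_infrastructure and can_change_client_config:
        paths.append("snapshot-based upgrade")
    if can_pause_writes:
        paths.append("in-place upgrade with read-only mode")
    return paths
```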

In-place upgrade with downtime

If you are able to handle a downtime of the Enterprise Search service, the easiest upgrade method is an in-place upgrade. We recommend this method to all customers who are able to schedule a maintenance window for their service or for non-mission-critical applications based on Enterprise Search.

This upgrade method has a few characteristics that make it uniquely suitable for many deployments:

  • No need for new infrastructure — everything is done in-place and you do not need to provision any new instances of Elasticsearch or Enterprise Search.
  • No need to change client configurations — all of your API clients can use the same API endpoints before and after the upgrade since new Enterprise Search instances will simply replace the existing ones.

The disadvantages of this method:

  • Downtime required — you have to shut down all of your Enterprise Search instances to perform the upgrade and your service will not be available until the upgrade is complete. With proper planning, the downtime period can be reduced significantly, but downtime is still unavoidable.
  • Harder to roll back — if your upgrade fails for any reason, we do not recommend rolling back to the older version of Enterprise Search due to potential issues with the partially migrated dataset in Elasticsearch. You would have to restore from a backup to get back to your original state, which would prolong the downtime.

Here are the steps you need to take:

  1. (Add to this process any steps described in the breaking changes for the new version of Enterprise Search. See Breaking changes by version and product.)
  2. Stop ALL of your Enterprise Search instances.
  3. Back up your data from Elasticsearch, using snapshots.
  4. Upgrade Enterprise Search packages on your servers, or change your container image tags to point to the latest version if you use Docker or Kubernetes. Retain your existing configuration files or merge them with the configuration files from the new packages.
  5. Start the new version of Enterprise Search. The new instances coordinate and perform the upgrade before they start accepting your API traffic.
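Because the new instances do not serve requests until the upgrade is complete, a deployment script may want to wait for them before ending the maintenance window. The following is a minimal polling sketch. The helper names are hypothetical, and which URL to probe is an assumption you need to fill in for your deployment (for example, an endpoint that returns HTTP 200 only once the instance is serving traffic).

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str) -> Callable[[], bool]:
    """Build a probe that reports True once `url` answers with HTTP 200."""
    def probe() -> bool:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                return response.status == 200
        except OSError:
            # Covers refused connections, timeouts, and HTTP error statuses
            # (urllib.error.URLError and HTTPError are OSError subclasses).
            return False
    return probe


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `probe` until it succeeds (True) or the timeout elapses (False)."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A call such as `wait_until_ready(http_probe(url))` blocks until the instance responds or ten minutes pass. The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.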

Snapshot-based upgrades

For situations where Enterprise Search downtime is not acceptable or in cases where you want to ensure a rollback is possible — no matter what happens during an upgrade — the safest way to perform the upgrade is through snapshot-based cloning of a deployment.

This process is considerably more involved than an in-place upgrade, but it guarantees data consistency and lets you perform a migration without downtime of the Search API. Here are the advantages of this method:

  • No Search API downtime required — both the old Enterprise Search deployment and the new one are able to handle search traffic throughout the upgrade process, meaning your clients should not notice the migration at all.
  • Easy to roll back — if you notice any issues with the new deployment (B) of Enterprise Search, you can retry the migration as many times as you need to, since your original deployment is still functional.

The method has a few important disadvantages:

  • The need for additional infrastructure — you need to provision a new Elasticsearch cluster and deploy a set of new Enterprise Search instances during the upgrade. This requires some coordination and additional compute resources during the upgrade process.
  • The need to change the client configuration — your API clients need to be switched to the new deployment endpoint after you perform the upgrade. Please note: You could proxy your traffic through a load-balancer (ELB, ALB, etc) or a CDN to keep your API endpoint stable while you replace the Enterprise Search cluster behind the proxy.
  • For some deployments it may be possible to scale down Elasticsearch and Enterprise Search clusters before using the freed-up hardware to provision new instances, but the details of that operation are beyond the scope of this guide.

Here are the steps you need to take:

  1. (Add to this process any steps described in the breaking changes for the new version of Enterprise Search. See Breaking changes by version and product.)
  2. Stop writes into your Enterprise Search deployment (A).
  3. Use the Enterprise Search read-only mode feature to block the deployment and ensure a consistent migration.
  4. Create a backup of your Elasticsearch cluster (A) using snapshots.
  5. Create a new Elasticsearch cluster (B) and restore data from the latest snapshot.
  6. Deploy a new set of Enterprise Search instances (B) using the new Elasticsearch cluster (B) as the data store. Re-use the Enterprise Search configuration (config/enterprise-search.yml) from your original deployment (A).
  7. At this point you should have two separate Enterprise Search deployments both serving the same data. You should spot-check to make sure the new deployment (B) looks correct to you.
  8. Switch API traffic from the original Enterprise Search deployment (A) to the new one (B).
  9. Shut down the old Enterprise Search and Elasticsearch clusters.
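The spot-check in step 7 can be partly automated by sending the same search query to both deployments and comparing the returned document IDs. The sketch below only compares two already-parsed JSON responses. It assumes the App Search search response shape, where each entry in `results` carries its ID under `id.raw`, so adjust it if your responses look different. The function names are hypothetical.

```python
from __future__ import annotations


def result_ids(search_response: dict) -> list[str]:
    """Pull document IDs out of a parsed search response (assumed shape)."""
    return [hit["id"]["raw"] for hit in search_response.get("results", [])]


def compare_deployments(response_a: dict, response_b: dict) -> dict:
    """Summarize how the results from deployments A and B differ."""
    ids_a, ids_b = result_ids(response_a), result_ids(response_b)
    return {
        "same_order": ids_a == ids_b,
        "only_in_a": sorted(set(ids_a) - set(ids_b)),
        "only_in_b": sorted(set(ids_b) - set(ids_a)),
    }
```

Small ranking differences between versions are possible, so an empty `only_in_a` and `only_in_b` with `same_order` set to false is a reason to look closer, not necessarily a failure.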

In-place upgrade with read-only mode

Finally, for situations where you do not have the ability to provision new infrastructure during an upgrade, but scheduling Enterprise Search downtime is not an option, you can perform an in-place upgrade using Enterprise Search’s read-only mode.

This method has the following advantages:

  • No Search API downtime required — both old and new Enterprise Search instances are able to handle search traffic throughout the upgrade process, meaning clients should be unaffected by the migration.
  • No need to change client configurations — all of your API clients can use the same API endpoints before and after the upgrade, since new Enterprise Search instances will simply replace the existing ones.
  • No need for new infrastructure — everything can be done in-place and you do not need to provision any new instances of Elasticsearch or Enterprise Search.

The only comparative disadvantage of this upgrade method is that it makes it harder to roll back if your upgrade fails for any reason. If you experience any issues during the upgrade, we do not recommend enabling writes on the older version of Enterprise Search due to potential issues with the partially migrated dataset in Elasticsearch. You would have to restore from a backup to return to your original state, which may require downtime.

Here are the steps you need to take:

  1. (Add to this process any steps described in the breaking changes for the new version of Enterprise Search. See Breaking changes by version and product.)
  2. Stop writes to your Enterprise Search deployment. If possible, stop your indexing requests at the source (disable indexing jobs, etc). Enable read-only mode on your Enterprise Search cluster.
  3. Back up your data from Elasticsearch (using snapshots).
  4. Start a new set of Enterprise Search instances running the latest version of the product. You can do a rolling upgrade, stopping, upgrading and starting one instance at a time. Or you could provision a full new fleet of Enterprise Search instances. In any case, re-use the Enterprise Search configuration (config/enterprise-search.yml) from your original deployment.
  5. New Enterprise Search instances will perform an upgrade and then become available to serve your Search API traffic.
  6. Once new Enterprise Search instances are running, stop your old Enterprise Search instances.
  7. Finally, remove read-only mode from the cluster and re-enable your indexing jobs. You may need to retry any indexing requests that failed during the upgrade.
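Read-only mode (steps 2 and 7) can be toggled from a script as well as from the admin interface. The sketch below builds a request for the Enterprise Search read-only mode API. The path `/api/ent/v1/internal/read_only_mode` and the `{"enabled": ...}` body reflect that API as I understand it, so treat both as assumptions and verify them against the API reference for your version. The function name and the use of basic authentication are also assumptions.

```python
import base64
import json
import urllib.request


def build_read_only_request(
    base_url: str, enabled: bool, username: str, password: str
) -> urllib.request.Request:
    """Build (but do not send) a request that switches read-only mode on or off."""
    # Assumed endpoint and body; check the read-only mode API docs for your version.
    url = f"{base_url.rstrip('/')}/api/ent/v1/internal/read_only_mode"
    body = json.dumps({"enabled": enabled}).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Enable before the upgrade, disable once the new instances are serving traffic:
# urllib.request.urlopen(build_read_only_request(url, True, user, password))
# urllib.request.urlopen(build_read_only_request(url, False, user, password))
```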

Troubleshooting Enterprise Search upgrades

As with any software upgrade, there is always a chance that things will not go according to plan. This means you need to plan and prepare for the possibility of your Enterprise Search upgrade failing. Review the comparison of upgrade methods above and make sure you are comfortable with the risks of your preferred method before starting the upgrade process.

If you experience an upgrade failure, the information below should help you identify the causes of the issue and allow you to retry the upgrade if needed. Enterprise Search should be able to recover from many potential failures, meaning you should be able to retry an upgrade if it fails.

If your Enterprise Search upgrade fails, please do the following:

  1. Check the Enterprise Search app server log for the new deployment — app-server.log — to see the details of what happened.
  2. If the upgrade failed due to an issue with Elasticsearch that you can fix, fix it and then retry the upgrade by starting a new Enterprise Search instance.
  3. If the process consistently fails, roll back using a method specific to your preferred upgrade path. Capture the app-server.log file and file a support case with Elastic if you need further help troubleshooting the upgrade process.
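When reviewing app-server.log, a quick first pass is to pull out lines that look like failures. The sketch below is a plain substring filter. The marker strings are guesses, not a documented log format, so adjust them to what your log actually contains. The function names are hypothetical.

```python
from __future__ import annotations

from typing import Iterable


def suspicious_lines(
    lines: Iterable[str],
    markers: tuple[str, ...] = ("ERROR", "FATAL", "Exception"),
) -> list[str]:
    """Return the lines that contain any of the (assumed) failure markers."""
    return [line.rstrip("\n") for line in lines if any(m in line for m in markers)]


def scan_log(path: str) -> list[str]:
    """Apply the filter to a log file, tolerating undecodable bytes."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return suspicious_lines(handle)
```

The output is only a starting point for reading the surrounding log context, and it is worth attaching the full file to a support case.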