Migrating to 9.0

This section discusses the changes that you need to be aware of when migrating your application to Elasticsearch 9.0.

See also What’s new in 9.0 and Release notes.

Coming in 9.0.0.

Breaking changes

The following changes in Elasticsearch 9.0 might affect your applications and prevent them from operating normally. Before upgrading to 9.0, review these changes and take the described steps to mitigate the impact.

Elasticsearch 9.0 has no notable breaking changes, but it does include the less critical breaking changes listed below.

Analysis changes

Snowball stemmers have been upgraded

Details
Lucene 10 ships with an upgrade of its Snowball stemmers. For details see https://github.com/apache/lucene/issues/13209. Users of Snowball stemmers who experience changes in search behaviour on existing data are advised to reindex.

Impact
The upgrade should generally provide improved stemming results. Small changes in token analysis can lead to mismatches with previously indexed data, so existing indices using Snowball stemmers as part of their analysis chain should be reindexed.

The "german2" snowball stemmer is now an alias for the "german" stemmer

Details
Lucene 10 has merged the improved "german2" snowball language stemmer with the "german" stemmer. For Elasticsearch, "german2" is now a deprecated alias for "german". This may result in slightly different tokens being generated for terms with umlaut substitution (such as "ue" for "ü").

Impact
Replace usages of "german2" with "german" in analysis configuration. Old indices that use the "german" stemmer should be reindexed if possible.
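
As a rough illustration of the replacement described above, the following Python sketch rewrites "german2" to "german" in an analysis settings dictionary. The function name and the dictionary layout are hypothetical; the sketch assumes the stemmer is selected through a language or name parameter on a token filter definition, and it only touches the filter section.

```python
import copy


def replace_german2(analysis):
    """Return a copy of `analysis` whose filters use "german" instead of "german2"."""
    updated = copy.deepcopy(analysis)
    for definition in updated.get("filter", {}).values():
        # Assumption: the stemmer language is given under "language" or "name".
        for key in ("language", "name"):
            value = definition.get(key)
            if isinstance(value, str) and value.lower() == "german2":
                # Drop the trailing "2" and keep the original capitalisation.
                definition[key] = value[:-1]
    return updated
```

The input is left unmodified, so the old and new configurations can be compared before new indices are created with the updated settings.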

The persian analyzer now applies stemming by default

Details
Lucene 10 has added a final stemming step to its PersianAnalyzer, which Elasticsearch exposes as the persian analyzer. Existing indices will keep the old non-stemming behaviour, while new indices will see the updated behaviour with added stemming. Users who wish to maintain the non-stemming behaviour need to define their own analyzer as outlined in https://www.elastic.co/guide/en/elasticsearch/reference/8.15/analysis-lang-analyzer.html#persian-analyzer. Users who wish to use the new stemming behaviour for existing indices will have to reindex their data.

Impact
Indexing with the persian analyzer will produce slightly different tokens. Users should check if this impacts their search results. If they wish to maintain the legacy non-stemming behaviour they can define their own analyzer equivalent as explained in https://www.elastic.co/guide/en/elasticsearch/reference/8.15/analysis-lang-analyzer.html#persian-analyzer.
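
The following Python sketch shows what such a custom, non-stemming analyzer definition might look like as an index settings body. It is modelled on the custom-analyzer recipe in the 8.15 reference linked above and should be checked against that page, which remains the authoritative definition. The names rebuilt_persian, persian_stop, and zero_width_spaces are illustrative labels, not required identifiers.

```python
import json

# Assumed equivalent of the pre-9.0 persian analyzer: no stemming step.
LEGACY_PERSIAN_SETTINGS = {
    "analysis": {
        "char_filter": {
            # Replace zero-width non-joiners with ordinary spaces.
            "zero_width_spaces": {
                "type": "mapping",
                "mappings": ["\\u200C=>\\u0020"],
            }
        },
        "filter": {
            "persian_stop": {"type": "stop", "stopwords": "_persian_"}
        },
        "analyzer": {
            "rebuilt_persian": {
                "tokenizer": "standard",
                "char_filter": ["zero_width_spaces"],
                "filter": [
                    "lowercase",
                    "decimal_digit",
                    "arabic_normalization",
                    "persian_normalization",
                    "persian_stop",
                ],
            }
        },
    }
}

if __name__ == "__main__":
    # Print the body that would be sent as the "settings" of a new index.
    print(json.dumps(LEGACY_PERSIAN_SETTINGS, indent=2))
```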

The Korean dictionary for Nori has been updated

Details
Lucene 10 ships with an updated Korean dictionary (mecab-ko-dic-2.1.1). For details see https://github.com/apache/lucene/issues/11452. Users experiencing changes in search behaviour on existing data are advised to reindex.

Impact
The change is small and should generally provide better analysis results. Existing indices for full-text use cases should be reindexed though.

Cluster and node setting changes

Minimum shard balancer threshold is now 1.0

Details
Earlier versions of Elasticsearch accepted any non-negative value for cluster.routing.allocation.balance.threshold, but values smaller than 1.0 do not make sense and have been ignored since version 8.6.1. From 9.0.0 these nonsensical values are now forbidden.

Impact
Do not set cluster.routing.allocation.balance.threshold to a value less than 1.0.

Remove cluster.routing.allocation.disk.watermark.enable_for_single_data_node setting

Details
Prior to 7.8, whenever a cluster had only a single data node, the watermarks would not be respected. In order to change this in 7.8+ in a backwards compatible way, we introduced the cluster.routing.allocation.disk.watermark.enable_for_single_data_node node setting. The setting was deprecated in 7.14 and was made to accept only true in 8.0. It has now been removed.

Impact
No known end user impact.

Remove deprecated xpack.searchable.snapshot.allocate_on_rolling_restart setting

Details
The xpack.searchable.snapshot.allocate_on_rolling_restart setting was created as an escape-hatch just in case relying on the cluster.routing.allocation.enable=primaries setting for allocating searchable snapshots during rolling restarts had some unintended side-effects. It has been deprecated since 8.2.0.

Impact
Remove xpack.searchable.snapshot.allocate_on_rolling_restart from your settings if present.

Remove unsupported legacy value for discovery.type

Details
Earlier versions of Elasticsearch had a discovery.type setting which permitted values that referred to legacy discovery types. From v9.0.0 onwards, the only supported values for this setting are multi-node (the default) and single-node.

Impact
Remove any legacy value for discovery.type from your elasticsearch.yml configuration file. Only multi-node (the default) and single-node remain valid.
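
The setting changes in this section can be checked before upgrading with a small script. The Python sketch below is one possible approach, assuming the settings have already been loaded into a flat dictionary keyed by dotted setting names; the function name and the message wording are hypothetical.

```python
REMOVED_SETTINGS = (
    "cluster.routing.allocation.disk.watermark.enable_for_single_data_node",
    "xpack.searchable.snapshot.allocate_on_rolling_restart",
)
SUPPORTED_DISCOVERY_TYPES = {"multi-node", "single-node"}
THRESHOLD_KEY = "cluster.routing.allocation.balance.threshold"


def find_setting_problems(settings):
    """Return a list of messages for settings that 9.0 rejects or no longer knows."""
    problems = []
    for key in REMOVED_SETTINGS:
        if key in settings:
            problems.append(f"{key} has been removed")
    # Values below 1.0 were ignored since 8.6.1 and are forbidden from 9.0.0.
    if THRESHOLD_KEY in settings and float(settings[THRESHOLD_KEY]) < 1.0:
        problems.append(f"{THRESHOLD_KEY} must be at least 1.0")
    discovery = settings.get("discovery.type")
    if discovery is not None and discovery not in SUPPORTED_DISCOVERY_TYPES:
        problems.append(f"discovery.type={discovery!r} is not supported")
    return problems
```

An empty list means that none of the settings covered in this section need attention.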

Ingest changes

Remove ecs option on user_agent processor

Details
The user_agent ingest processor no longer accepts the ecs option. (It was previously deprecated and ignored.)

Impact
Users should stop using the ecs option when creating instances of the user_agent ingest processor. The option will be removed from existing processors stored in the cluster state on upgrade.

Remove ignored fallback option on GeoIP processor

Details
The option fallback_to_default_databases on the geoip ingest processor has been removed. (It was deprecated and ignored since 8.0.0.)

Impact
Users should remove the no-op fallback_to_default_databases option from any geoip ingest processors.
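
For pipeline definitions kept in source control, the two removed options can be stripped with a short script. The Python sketch below is a minimal example, assuming a pipeline body shaped like {"processors": [{"geoip": {...}}, ...]}. The function name is hypothetical, and only top-level processors are handled, not processors nested in on_failure blocks or other processors.

```python
import copy

# Options removed in 9.0, keyed by processor type.
REMOVED_OPTIONS = {
    "user_agent": ("ecs",),
    "geoip": ("fallback_to_default_databases",),
}


def strip_removed_options(pipeline):
    """Return a copy of `pipeline` without the options removed in 9.0."""
    cleaned = copy.deepcopy(pipeline)
    for processor in cleaned.get("processors", []):
        # Each processor is a one-key object: {processor_type: config}.
        for kind, config in processor.items():
            for option in REMOVED_OPTIONS.get(kind, ()):
                config.pop(option, None)
    return cleaned
```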

Mapping changes

Remove support for type, fields, copy_to and boost in metadata field definition

Details
The type, fields, copy_to and boost parameters are no longer supported in metadata field definitions.

Impact
Users providing type, fields, copy_to or boost as part of a metadata field definition should remove them from their mappings.

REST API changes

Apply more strict parsing of actions in bulk API

Details
Previously, the following classes of malformed input were deprecated but not rejected in the action lines of a bulk request: a missing closing brace; additional keys after the action (which were ignored); additional data after the closing brace (which was ignored). They are now considered errors and rejected.

Impact
Users must provide well-formed input when using the bulk API. (They can request REST API compatibility with v8 to get the previous behaviour back as an interim measure.)
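
Client code can check action lines for these three problems before sending a request. The Python sketch below is a simplified client-side check, not the parser Elasticsearch itself uses; the function name and messages are hypothetical. It relies on Python's json module, which rejects both a missing closing brace and trailing data after the object.

```python
import json

BULK_ACTIONS = {"index", "create", "update", "delete"}


def check_action_line(line):
    """Return a description of the problem with a bulk action line, or None."""
    try:
        # Fails on a missing closing brace and on data after the closing brace.
        parsed = json.loads(line)
    except json.JSONDecodeError as exc:
        return f"not well-formed JSON: {exc.msg}"
    # Additional keys after the action are no longer ignored.
    if not isinstance(parsed, dict) or len(parsed) != 1:
        return "expected an object with exactly one action key"
    (action,) = parsed
    if action not in BULK_ACTIONS:
        return f"unknown action {action!r}"
    return None
```

The check applies to action lines only; the document source lines that follow index, create, and update actions are ordinary JSON objects and are not covered here.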

Error JSON structure has changed when detailed errors are disabled

Details
This change modifies the JSON format of error messages returned to REST clients when detailed messages are turned off. Previously, when an exception occurred and http.detailed_errors.enabled: false was set, the returned JSON consisted of just a single "error" text field with some basic information. Setting http.detailed_errors.enabled: true (the default) changed this field to an object with more detailed information. With this change, non-detailed errors now have the same structure as detailed errors. "error" will now always be an object with, at a minimum, a "type" and "reason" field. Additional fields are included when detailed errors are enabled. To use the previous structure for non-detailed errors, use the v8 REST API.

Impact
If you have set http.detailed_errors.enabled: false (the default is true), the structure of the JSON returned when an exception occurs now matches the structure used when detailed errors are enabled. To use the previous structure for non-detailed errors, use the v8 REST API.
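
Clients that must work against both 8.x and 9.0 clusters during a rolling migration may want to tolerate both shapes. The Python sketch below shows one way to do that, assuming the response body has already been decoded into a dictionary; the function name is hypothetical.

```python
def error_reason(body):
    """Extract a human-readable reason from either error structure."""
    error = body.get("error")
    if isinstance(error, dict):
        # 9.0 structure: "error" is always an object with "type" and "reason".
        return error.get("reason")
    # Pre-9.0 non-detailed structure: "error" is a plain text field.
    return error
```

For a body without an "error" field the function returns None.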

Remove cluster state from /_cluster/reroute response

Details
The POST /_cluster/reroute API no longer returns the cluster state in its response. The ?metric query parameter to this API now has no effect and its use will be forbidden in a future version.

Impact
Cease usage of the ?metric query parameter when calling the POST /_cluster/reroute API.

Remove deprecated local attribute from alias APIs

Details
The following APIs no longer accept the ?local query parameter: GET /_alias, GET /_aliases, GET /_alias/{name}, HEAD /_alias/{name}, GET /{index}/_alias, HEAD /{index}/_alias, GET /{index}/_alias/{name}, HEAD /{index}/_alias/{name}, GET /_cat/aliases, and GET /_cat/aliases/{alias}. This parameter has been deprecated and ignored since version 8.12.

Impact
Cease usage of the ?local query parameter when calling the listed APIs.

Remove legacy params from range query

Details
The deprecated range query parameters to, from, include_lower, and include_upper are no longer supported.

Impact
Users should use lt, lte, gt, and gte query parameters instead.
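
The mapping from the legacy parameters to the supported ones can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that include_lower and include_upper both defaulted to true when omitted and that a null from or to meant an unbounded side. It operates on the parameter object of a single field, for example the {...} in {"range": {"age": {...}}}.

```python
LEGACY_KEYS = ("from", "to", "include_lower", "include_upper")


def modernize_range(params):
    """Translate legacy range parameters into gt, gte, lt, and lte."""
    # Keep every parameter that is still supported (boost, format, and so on).
    updated = {k: v for k, v in params.items() if k not in LEGACY_KEYS}
    lower = params.get("from")
    upper = params.get("to")
    if lower is not None:
        # Assumption: bounds were inclusive unless explicitly disabled.
        updated["gte" if params.get("include_lower", True) else "gt"] = lower
    if upper is not None:
        updated["lte" if params.get("include_upper", True) else "lt"] = upper
    return updated
```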

Remove support for deprecated force_source highlighting parameter

Details
The deprecated highlighting force_source parameter is no longer supported.

Impact
Users should remove usages of the force_source parameter from their search requests.
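
Where search bodies are built or stored as JSON, the parameter can be removed programmatically. The Python sketch below is one way to do it, assuming the body is a decoded dictionary with a top-level highlight section; the function name is hypothetical, and highlight sections nested elsewhere in the request are not handled.

```python
import copy


def drop_force_source(search_body):
    """Return a copy of `search_body` without force_source in its highlight section."""
    cleaned = copy.deepcopy(search_body)

    def strip(node):
        # Caveat: a highlighted field literally named "force_source" would
        # also be removed by this simple recursive walk.
        if isinstance(node, dict):
            node.pop("force_source", None)
            for child in node.values():
                strip(child)
        elif isinstance(node, list):
            for child in node:
                strip(child)

    strip(cleaned.get("highlight"))
    return cleaned
```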

Deprecations

The following functionality has been deprecated in Elasticsearch 9.0 and will be removed in a future version. While this won’t have an immediate impact on your applications, we strongly encourage you to take the described steps to update your code after upgrading to 9.0.

To find out if you are using any deprecated functionality, enable deprecation logging.

Mapping deprecations

Deprecate _source.mode in mappings

Details
Configuring _source.mode in mappings is deprecated and will be removed in a future version. Use the index.mapping.source.mode index setting instead.

Impact
Use the index.mapping.source.mode index setting instead.
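
For index creation bodies or templates that still set the mode in mappings, the move can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes a body with optional top-level mappings and settings objects and that the setting is written as a flat dotted key.

```python
import copy


def move_source_mode(body):
    """Return a copy of `body` with _source.mode moved into the index settings."""
    updated = copy.deepcopy(body)
    source = updated.get("mappings", {}).get("_source")
    if not isinstance(source, dict) or "mode" not in source:
        return updated
    mode = source.pop("mode")
    updated.setdefault("settings", {})["index.mapping.source.mode"] = mode
    # Drop the _source object if mode was its only entry.
    if not source:
        del updated["mappings"]["_source"]
    return updated
```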

REST API deprecations

Document _type deprecated on simulate pipeline API

Details
Passing a document with a _type property is deprecated in the /_ingest/pipeline/{id}/_simulate and /_ingest/pipeline/_simulate APIs.

Impact
Users should already have stopped using mapping types, which were deprecated in Elasticsearch 7. This deprecation warning will fire if they specify mapping types on documents passed to the simulate pipeline API.

[Inference API] Deprecate elser service

Details
The elser service of the inference API will be removed in an upcoming release. Please use the elasticsearch service instead.

Impact
In the current version there is no impact. In a future version, users of the elser service will no longer be able to use it, and will be required to use the elasticsearch service to access elser through the inference API.