WARNING: Version 2.3 of Elasticsearch has passed its EOL date.
This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.
Mapping changes
editMapping changes
editA number of changes have been made to mappings to remove ambiguity and to ensure that conflicting mappings cannot be created.
One major change is that dynamically added fields must have their mapping confirmed by the master node before indexing continues. This is to avoid a problem where different shards in the same index dynamically add different mappings for the same field. These conflicting mappings can silently return incorrect results and can lead to index corruption.
This change can make indexing slower when frequently adding many new fields. We are looking at ways of optimising this process but we chose safety over performance for this extreme use case.
Conflicting field mappings
editFields with the same name, in the same index, in different types, must have
the same mapping, with the exception of the copy_to
, dynamic
,
enabled
, ignore_above
, include_in_all
, and properties
parameters, which may have different settings per field.
PUT my_index { "mappings": { "type_one": { "properties": { "name": { "type": "string" } } }, "type_two": { "properties": { "name": { "type": "string", "analyzer": "english" } } } } }
Elasticsearch will not start in the presence of conflicting field mappings. These indices must be deleted or reindexed using a new mapping.
The ignore_conflicts
option of the put mappings API has been removed.
Conflicts can’t be ignored anymore.
Fields cannot be referenced by short name
editA field can no longer be referenced using its short name. Instead, the full path to the field is required. For instance:
PUT my_index { "mappings": { "my_type": { "properties": { "title": { "type": "string" }, "name": { "properties": { "title": { "type": "string" }, "first": { "type": "string" }, "last": { "type": "string" } } } } } } }
Previously, the two title
fields in the example above could have been
confused with each other when using the short name title
.
Type name prefix removed
editPreviously, two fields with the same name in two different types could sometimes be disambiguated by prepending the type name. As a side effect, it would add a filter on the type name to the relevant query. This feature was ambiguous — a type name could be confused with a field name — and didn’t work everywhere e.g. aggregations.
Instead, fields should be specified with the full path, but without a type
name prefix. If you wish to filter by the _type
field, either specify the
type in the URL or add an explicit filter.
The following example query in 1.x:
GET my_index/_search { "query": { "match": { "my_type.some_field": "quick brown fox" } } }
would be rewritten in 2.0 as:
Field names may not contain dots
editIn 1.x, it was possible to create fields with dots in their name, for instance:
PUT my_index { "mappings": { "my_type": { "properties": { "foo.bar": { "type": "string" }, "foo": { "properties": { "bar": { "type": "string" } } } } } } }
You can no longer create fields with dots in the name.
Type names may not start with a dot
editIn 1.x, Elasticsearch would issue a warning if a type name included a dot,
e.g. my.type
. Now that type names are no longer used to distinguish between
fields in different types, this warning has been relaxed: type names may now
contain dots, but they may not begin with a dot. The only exception to this
is the special .percolator
type.
Type names may not be longer than 255 characters
editMapping type names may not be longer than 255 characters. Long type names will continue to function on indices created before upgrade, but it will not be possible create types with long names in new indices.
Types may no longer be deleted
editIn 1.x it was possible to delete a type mapping, along with all of the documents of that type, using the delete mapping API. This is no longer supported, because remnants of the fields in the type could remain in the index, causing corruption later on.
Instead, if you need to delete a type mapping, you should reindex to a new index which does not contain the mapping. If you just need to delete the documents that belong to that type, then use the delete-by-query plugin instead.
Type meta-fields
editThe meta-fields associated with had configuration options removed, to make them more reliable:
-
_id
configuration can no longer be changed. If you need to sort, use the_uid
field instead. -
_type
configuration can no longer be changed. -
_index
configuration can no longer be changed. -
_routing
configuration is limited to marking routing as required. -
_field_names
configuration is limited to disabling the field. -
_size
configuration is limited to enabling the field. -
_timestamp
configuration is limited to enabling the field, setting format and default value. -
_boost
has been removed. -
_analyzer
has been removed.
Importantly, meta-fields can no longer be specified as part of the document
body. Instead, they must be specified in the query string parameters. For
instance, in 1.x, the routing
could be specified as follows:
PUT my_index { "mappings": { "my_type": { "_routing": { "path": "group" }, "properties": { "group": { "type": "string" } } } } } PUT my_index/my_type/1 { "group": "foo" }
This 1.x mapping tells Elasticsearch to extract the |
|
This indexing request uses a |
In 2.0, the routing must be specified explicitly:
_timestamp
and _ttl
deprecated
editThe _timestamp
and _ttl
fields are deprecated, but will remain functional
for the remainder of the 2.x series.
Instead of the _timestamp
field, use a normal date
field and set
the value explicitly.
The current _ttl
functionality will be replaced in a future version with a
new implementation of TTL, possibly with different semantics, and will not
depend on the _timestamp
field.
Analyzer mappings
editPreviously, index_analyzer
and search_analyzer
could be set separately,
while the analyzer
setting would set both. The index_analyzer
setting has
been removed in favour of just using the analyzer
setting.
If just the analyzer
is set, it will be used at index time and at search time. To use a different analyzer at search time, specify both the analyzer
and a search_analyzer
.
The index_analyzer
, search_analyzer
, and analyzer
type-level settings
have also been removed, as it is no longer possible to select fields based on
the type name.
The _analyzer
meta-field, which allowed setting an analyzer per document has
also been removed. It will be ignored on older indices.
Date fields and Unix timestamps
editPreviously, date
fields would first try to parse values as a Unix timestamp — milliseconds-since-the-epoch — before trying to use their defined date
format
. This meant that formats like yyyyMMdd
could never work, as values
would be interpreted as timestamps.
In 2.0, we have added two formats: epoch_millis
and epoch_second
. Only
date fields that use these formats will be able to parse timestamps.
These formats cannot be used in dynamic templates, because they are indistinguishable from long values.
Default date format
editThe default date format has changed from date_optional_time
to
strict_date_optional_time
, which expects a 4 digit year, and a 2 digit month
and day, (and optionally, 2 digit hour, minute, and second).
A dynamically added date field, by default, includes the epoch_millis
format to support timestamp parsing. For instance:
mapping.date.round_ceil
setting
editThe mapping.date.round_ceil
setting for date math parsing has been removed.
Boolean fields
editBoolean fields used to have a string fielddata with F
meaning false
and T
meaning true
. They have been refactored to use numeric fielddata, with 0
for false
and 1
for true
. As a consequence, the format of the responses of
the following APIs changed when applied to boolean fields: 0
/1
is returned
instead of F
/T
:
In addition, terms aggregations use a custom formatter for boolean (like for
dates and ip addresses, which are also backed by numbers) in order to return
the user-friendly representation of boolean fields: false
/true
:
"buckets": [ { "key": 0, "key_as_string": "false", "doc_count": 42 }, { "key": 1, "key_as_string": "true", "doc_count": 12 } ]
index_name
and path
removed
editThe index_name
setting was used to change the name of the Lucene field,
and the path
setting was used on object
fields to determine whether the
Lucene field should use the full path (including parent object fields), or
just the final name
.
These setting have been removed as their purpose is better served with the
copy_to
parameter.
Murmur3 Fields
editFields of type murmur3
can no longer change doc_values
or index
setting.
They are always mapped as follows:
{ "type": "murmur3", "index": "no", "doc_values": true }
Attachment Fields
editFields of type attachment
used to index their content in the "main"
multi-field — a sub-field with the same name as the main field, e.g.
my_attachment.my_attachment
. This sub-field will be renamed to
my_attachment.content
. Any mapping settings (e.g. analyzer
) on the old
my_attachment.my_attachment
field will be lost. In this case, the index
needs to be reindexed in 2.x.
The attachment
sub-fields do not support copy_to
.
Mappings in config files not supported
editThe ability to specify mappings in configuration files has been removed. To specify default mappings that apply to multiple indexes, use index templates instead.
Along with this change, the following settings have been removed:
-
index.mapper.default_mapping_location
-
index.mapper.default_percolator_mapping_location
Fielddata formats
editNow that doc values are the default for fielddata, specialized in-memory formats have become an esoteric option. These fielddata formats have been removed:
-
fst
on string fields -
compressed
on geo points
The default fielddata format will be used instead.
Posting and doc-values codecs
editIt is no longer possible to specify per-field postings and doc values formats in the mappings. This setting will be ignored on indices created before 2.0 and will cause mapping parsing to fail on indices created on or after 2.0. For old indices, this means that new segments will be written with the default postings and doc values formats of the current codec.
It is still possible to change the whole codec by using the index.codec
setting. Please however note that using a non-default codec is discouraged as
it could prevent future versions of Elasticsearch from being able to read the
index.
Compress and compress threshold
editThe compress
and compress_threshold
options have been removed from the
_source
field and fields of type binary
. These fields are compressed by
default. If you would like to increase compression levels, use the new
index.codec: best_compression
setting instead.
position_offset_gap
editThe position_offset_gap
option is renamed to position_increment_gap. This was
done to clear away the confusion. Elasticsearch’s position_increment_gap now is
mapped directly to Lucene’s position_increment_gap
The default position_increment_gap
is now 100. Indexes created in Elasticsearch
2.0.0 will default to using 100 and indexes created before that will continue
to use the old default of 0. This was done to prevent phrase queries from
matching across different values of the same term unexpectedly. Specifically,
100 was chosen to cause phrase queries with slops up to 99 to match only within
a single value of a field.
copy_to and multi fields
editA copy_to within a multi field is ignored from version 2.0 on. With any version after 2.1 or 2.0.1 creating a mapping that has a copy_to within a multi field will result in an exception.