Discover in Kibana uses the fields API in 7.12
With Elastic 7.12, Discover now uses the fields API by default. Reading from _source is still supported through a switch in the Advanced Settings. This change stems from updates made to Elasticsearch in 7.11 with the extension of the Search API to include the new fields parameter.
The new search parameter uses both a document’s raw source and the index mappings to load and return values. Because it makes use of the mappings, fields has some advantages over referencing _source directly: it accepts multi-fields and field aliases, and it formats field values such as dates in a consistent way.
In short, the fields option simplifies all the nuances of _source and the other places where values are stored by dealing with multi-fields and aliases, type coercion, doc values, and other edge cases.
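As a rough illustration, a search request that asks for values through fields rather than _source might be assembled as follows. The `query`, `fields`, and `_source` keys belong to the Search API; the helper name `build_fields_request` and the field names are hypothetical, chosen only for this sketch.

```python
import json


def build_fields_request(field_names, formats=None, include_source=False):
    """Build a search body that retrieves values via the `fields` parameter.

    The helper itself is hypothetical; only the keys of the returned
    dictionary are part of the Elasticsearch Search API.
    """
    formats = formats or {}
    fields = [
        # The object form lets a field (for example a date) request a format.
        {"field": name, "format": formats[name]} if name in formats else name
        for name in field_names
    ]
    return {
        "query": {"match_all": {}},
        "fields": fields,
        # Setting `_source` to False skips returning the raw document.
        "_source": include_source,
    }


body = build_fields_request(
    ["user.id", "@timestamp"],
    formats={"@timestamp": "epoch_millis"},
)
print(json.dumps(body, indent=2))
```

The printed body could be sent to a `_search` endpoint with any HTTP client; the sketch stops short of that so it runs without a cluster.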
What are the benefits of the new search parameter?
More and more use cases arise where Elasticsearch is treated as a data store. To make retrieving fields simpler, without requiring expertise in areas like mappings, doc_values, and stored fields, we decided to extend the existing API.
Some benefits of the new parameter:
- If a non-standard field like a field alias, multi-field, or constant_keyword is specified in fields, the mappings are used to find and return the right value.
- The fields are returned in a flat list, as opposed to structured JSON.
- Each value is returned in a 'canonical' format. For example, if a field is mapped as an integer, it is returned as an integer even if it was specified as a string in the _source.
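To make the difference in response shape concrete, here is a small sketch comparing how a value is read from a hit's structured _source with how it is read from the flat fields section. The sample hit is hand-written for illustration, not captured from a real cluster, and both helper names are hypothetical. It assumes `bytes` is mapped as an integer but was ingested as a string.

```python
def from_source(hit, path):
    """Walk the structured `_source` object along a dotted path."""
    value = hit["_source"]
    for key in path.split("."):
        value = value[key]
    return value


def from_fields(hit, path):
    """Look the dotted path up directly; `fields` values are always lists."""
    return hit["fields"][path]


# Hand-written sample hit (an assumption for illustration only).
hit = {
    "_source": {"user": {"id": "kimchy"}, "bytes": "1024"},
    "fields": {"user.id": ["kimchy"], "bytes": [1024]},
}

print(from_source(hit, "user.id"), from_fields(hit, "user.id"))
print(repr(from_source(hit, "bytes")), from_fields(hit, "bytes"))
```

With _source the caller has to know the nesting and gets the value as ingested; with fields the lookup is a single key and the value follows the mapped type.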
What should I expect?
The most meaningful change is the support for runtime fields. When a runtime field is defined in a mapping, it will show up in Discover like any other field type. The next sections expand on all of the improvements and changes.
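As a sketch of what defining a runtime field in a mapping can look like, the snippet below builds a mapping update and a search body as plain Python dictionaries. The index name `my-index`, the field name `day_of_week`, and the presence of an `@timestamp` date field are all assumptions made for the example; the script follows the style of the Elasticsearch runtime fields documentation.

```python
import json

# Hypothetical runtime field added under the mapping's "runtime" section.
runtime_mapping = {
    "runtime": {
        "day_of_week": {
            "type": "keyword",
            "script": {
                # Painless script; assumes the index has an @timestamp date field.
                "source": "emit(doc['@timestamp'].value.dayOfWeekEnum"
                          ".getDisplayName(TextStyle.FULL, Locale.ROOT))"
            },
        }
    }
}

# The runtime field is requested through `fields` like any mapped field.
search_body = {
    "query": {"match_all": {}},
    "fields": ["day_of_week", "@timestamp"],
    "_source": False,
}

print("PUT /my-index/_mapping")
print(json.dumps(runtime_mapping, indent=2))
print("GET /my-index/_search")
print(json.dumps(search_body, indent=2))
```

Nothing in the search body marks `day_of_week` as a runtime field, which is why it can appear in Discover alongside ordinary fields.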
Default column is now called Document
If there are no columns configured, the default column changes from _source to Document.
Existing saved searches with a _source column will be changed to Document. Unlike _source, the Document column cannot be combined with other columns in the grid.
Although it is formatted similarly to the _source column, it is a collection of fields as returned in an Elasticsearch response, so combining it with other columns would not make much sense.
Multi-fields are grouped together
A multi-field is a field that is mapped in different ways for different purposes. For instance, a string field could be mapped as a text field for full-text search, and as a keyword field for sorting or aggregations.
Due to the changes in the API, the sub-fields of multi-fields will now have values where before they didn’t. To avoid suddenly increasing the number of fields that are shown, we’ve decided to group these sub-fields under the root field so they are still usable as columns, but not as prominent.
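The grouping can be pictured with a short sketch that walks index mapping properties and collects each multi-field's sub-fields under its root. This only illustrates the idea and is not Kibana's actual implementation; the function name and the sample mapping are assumptions.

```python
def group_multi_fields(properties, prefix=""):
    """Return {root_field: [sub_field, ...]} for every multi-field in a mapping."""
    groups = {}
    for name, definition in properties.items():
        path = f"{prefix}{name}"
        # Sub-fields of a multi-field live under the "fields" key of its mapping.
        subs = definition.get("fields", {})
        if subs:
            groups[path] = [f"{path}.{sub}" for sub in subs]
        # Object fields keep their children under "properties"; recurse into them.
        children = definition.get("properties")
        if children:
            groups.update(group_multi_fields(children, prefix=f"{path}."))
    return groups


# Hypothetical mapping properties used only for this example.
mapping = {
    "title": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
    "manager": {
        "properties": {
            "age": {"type": "integer"},
            "name": {"type": "text", "fields": {"raw": {"type": "keyword"}}},
        }
    },
}

print(group_multi_fields(mapping))
```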
Differences in how object fields are displayed
Object fields usually consist of one or more leaf fields. Here is an example of an object field:
"manager": {
  "age": 30,
  "name": {
    "first": "John",
    "last": "Smith"
  }
}
Before the introduction of the fields API, Discover would show the object roots, together with the leaf fields.
With the introduction of the fields API, object roots won't be shown in the field list anymore (unless there is a previously saved search containing one as a column). Leaf fields will now be correctly detected as having data, where previously they were not. This will be the case even when the document has an array of objects. Leaf fields will work for all objects, and the values of all objects in that array will be shown flattened.
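A minimal sketch of that flattening, assuming nothing about Elasticsearch internals: the hypothetical helper below turns an object, or an array of objects, into dotted leaf names, each holding a list of every value found. The second manager entry is made-up data added to show the array case.

```python
def flatten_leaves(value, prefix="", out=None):
    """Collect leaf values under dotted paths, merging arrays of objects."""
    if out is None:
        out = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flatten_leaves(child, f"{prefix}.{key}" if prefix else key, out)
    elif isinstance(value, list):
        # Every element of an array contributes to the same leaf paths.
        for item in value:
            flatten_leaves(item, prefix, out)
    else:
        out.setdefault(prefix, []).append(value)
    return out


doc = {
    "manager": [
        {"age": 30, "name": {"first": "John", "last": "Smith"}},
        {"age": 41, "name": {"first": "Jane", "last": "Doe"}},  # made-up entry
    ]
}

print(flatten_leaves(doc))
```

The object root `manager` never appears as a key of its own; only the leaf paths do, which mirrors why object roots drop out of the field list.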
Differences in how nested fields are displayed
The nested data type is used for indexing arrays of objects when there is a need to maintain the independence of each object in the array. Nested fields need to be specified as a type in the mapping; otherwise they are treated as the object data type.
Before the introduction of the fields API, Discover would show a root field if at least one loaded document had an array of objects for that field. Otherwise, the root field wouldn’t be shown. Leaf fields would be shown if they appeared in at least one document. Leaf fields that did not appear in any document, or that appeared only in documents with arrays of objects for this object field, would not appear in the field list unless "Hide missing fields" was switched off.
In the table, root fields as columns would show the JSON of the field’s content only if that document contained an array of objects, and would be empty otherwise.
Let’s say we have a mapping as follows:
"products": {
  "type": "nested",
  "properties": {
    "name": { "type": "keyword" },
    "price": { "type": "double" }
  }
}
And the following documents:
POST /discover_test/_doc
{
  "products": [
    { "name": "Kibana", "price": 42.23 },
    { "name": "Faber-Castell Polychromos", "price": 29.95 }
  ]
}

POST /discover_test/_doc
{
  "products": { "name": "Product name", "price": 123.21 }
}
In this case, when using _source, Discover would only correctly display the JSON of the products object in the first document, since it is an array. The second document, however, holds a single object, so it would display as empty.
Leaf fields as columns would only show values if the document did not contain an array of objects, and would stay empty otherwise.
With the introduction of the fields API, the root of a nested field can be added from the field list and will show the structured JSON of that field in the document. Nested field columns now show the JSON correctly for all documents.
In the case of an existing saved search with a nested leaf field as a column, that column will now show as empty.
We want to have this behavior going forward since we assume the nested mapping type is only used when an array of objects is indexed.
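One way to picture the uniform treatment is a tiny normalization step that always yields a list of objects, whether a document stored a single object or an array. This is only a sketch of the idea using the two example documents; the helper name is hypothetical and it does not reproduce the exact shape of an Elasticsearch response.

```python
def as_nested_entries(value):
    """Return a list of objects whether the source held one object or an array."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


first_doc = {
    "products": [
        {"name": "Kibana", "price": 42.23},
        {"name": "Faber-Castell Polychromos", "price": 29.95},
    ]
}
second_doc = {"products": {"name": "Product name", "price": 123.21}}

for doc in (first_doc, second_doc):
    # Both documents now produce a list that can be rendered the same way.
    print(as_nested_entries(doc.get("products")))
```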
Unmapped fields won’t show up if field filters are configured
When a previously unseen field is found in a document, Elasticsearch will add the new field to the type mapping (unless the dynamic parameter is set to false). Discover will continue showing these fields as it did previously, except when field filters are configured. In that case, unmapped fields will not show up.
Limitations of _source
The _source field contains the raw information from the document, exactly as it was ingested. This means users lose the benefit of features built on top of _source, like runtime fields, multi-fields, date formatting, and alias fields. Also, the whole _source object must be loaded and parsed even if only a small number of fields are requested.
With the introduction of the fields parameter, developers can retrieve the field values after the logic from these features is applied. This abstraction removes the need for developers to know implementation details about a field. Whether or not the field is a runtime field, it is always accessed in the same way.
Here is an example. Let’s say we have a mapping defined as follows:
PUT /my-index
{
  "mappings": {
    "properties": {
      "xlong": { "type": "long" },
      "xfloat": { "type": "float" }
    }
  }
}

POST /my-index/_doc/1
{ "xlong": "1.5", "xfloat": "1.5" }

POST /my-index/_doc/2
{ "xlong": "2", "xfloat": "2" }
The field xlong is mapped as long, which means that internally its value will be rounded towards zero. In that case, 1.5 becomes 1. When executing the following search in Discover,
xlong > 1 and xlong < 2
the search will not return any values, which is expected. However, when Discover reads fields from _source, the value is displayed exactly as it was ingested, i.e., 1.5.
This used to create a lot of confusion. With the fields option enabled, the actual mapped value (1) is displayed.
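The mismatch can be reproduced in a few lines. The sketch below mimics the truncation toward zero described above with Python's `int(float(...))` and evaluates the query against the truncated values; the helper names are hypothetical, and the field names are assumed to be lowercase as in the mapping.

```python
def coerce_long(raw):
    """Mimic coercion for a `long` field: drop the fraction, toward zero."""
    return int(float(raw))


def matches_query(value):
    """The Discover query from the text: xlong > 1 and xlong < 2."""
    return 1 < value < 2


# Raw values as ingested, keyed by document ID.
docs = {"1": {"xlong": "1.5"}, "2": {"xlong": "2"}}

for doc_id, source in docs.items():
    raw = source["xlong"]
    indexed = coerce_long(raw)
    print(doc_id, "source:", raw, "indexed:", indexed,
          "matches:", matches_query(indexed))
```

Neither document matches, because the query runs against the truncated values 1 and 2 rather than the raw string "1.5" kept in _source.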
Wrap up
We are excited to bring the power and flexibility of runtime fields to Discover, including the switch to using the fields API. If you are currently not on Elastic 7.12, but would still like to test out these updates, starting a free Elasticsearch Service trial on Elastic Cloud is a fast and easy way to do so.
To learn more about other recent product updates, be sure to read our Elastic 7.12 release blog.