Elasticsearch highlights
This list summarizes the most important enhancements in Elasticsearch 7.6.2. For the complete list, go to Elasticsearch release highlights.
New histogram field type
A new histogram field type has been added. The new histogram field accepts pre-aggregated histograms, which can later be used directly in the percentiles and percentile_ranks aggregations.
This allows users to pre-aggregate histogram data locally and only send the final
data structure, saving storage and network bandwidth while retaining the ability to
aggregate it like any other data.
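As a minimal sketch of that workflow, the Python below pre-aggregates raw samples locally and builds the request bodies involved. The field name latency_histo, the bucket width, and the choice to label each bucket by its lower edge are assumptions made for illustration; the values/counts document shape is the one the histogram field expects (values in ascending order, one count per value).

```python
import json
from collections import Counter

# Hypothetical field name used throughout this sketch.
FIELD = "latency_histo"

# Index mapping that declares the field as a histogram.
MAPPING = {"mappings": {"properties": {FIELD: {"type": "histogram"}}}}


def build_histogram_doc(samples, bucket_width):
    """Pre-aggregate raw samples into the values/counts shape of a histogram field."""
    # Assign each sample to a bucket index; the bucket is labelled by its lower edge.
    buckets = Counter(int(s // bucket_width) for s in samples)
    keys = sorted(buckets)  # values must be in ascending order
    return {
        FIELD: {
            "values": [float(k * bucket_width) for k in keys],
            "counts": [buckets[k] for k in keys],
        }
    }


# Search body that runs a percentiles aggregation directly on the histogram field.
PERCENTILES_QUERY = {
    "size": 0,
    "aggs": {
        "latency_percentiles": {
            "percentiles": {"field": FIELD, "percents": [50, 95, 99]}
        }
    },
}

doc = build_histogram_doc([12, 15, 27, 31, 33, 48], bucket_width=10)
print(json.dumps(doc))
```

Only the compact values/counts document is sent to the cluster, rather than one document per raw sample.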
Optimized sorting on long field types
Sorting on a long field now internally rewrites into a Lucene DistanceFeatureQuery.
This lets Elasticsearch skip non-competitive hits, which often improves query speed.
In benchmarking tests, this sped up sorts on long fields by 10x.
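To illustrate what skipping non-competitive hits means, here is a toy Python sketch. It is not Lucene's DistanceFeatureQuery implementation; it assumes, purely for illustration, that values are stored in blocks with a precomputed per-block maximum, and that a top-k descending sort can skip any block whose maximum cannot beat the current k-th best value. The names build_blocks and top_k_desc are hypothetical.

```python
import heapq


def build_blocks(values, block_size):
    """Group values into blocks and record each block's maximum (done at 'index time')."""
    blocks = []
    for i in range(0, len(values), block_size):
        chunk = values[i:i + block_size]
        blocks.append((max(chunk), chunk))
    return blocks


def top_k_desc(blocks, k):
    """Return the k largest values and how many values were actually examined."""
    heap = []      # min-heap holding the current top k; heap[0] is the weakest of them
    examined = 0
    for block_max, chunk in blocks:
        # If the heap is full and nothing in this block can beat the weakest
        # competitive value, the whole block is non-competitive and is skipped.
        if len(heap) == k and block_max <= heap[0]:
            continue
        for v in chunk:
            examined += 1
            if len(heap) < k:
                heapq.heappush(heap, v)
            elif v > heap[0]:
                heapq.heapreplace(heap, v)
    return sorted(heap, reverse=True), examined


values = list(range(1000, 0, -1))  # 1000, 999, ..., 1
top, examined = top_k_desc(build_blocks(values, block_size=100), k=10)
print(top, examined)
```

In this deliberately favorable input, only the first block of 100 values is examined out of 1000. The saving in practice depends on how the data is distributed.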
Simplifying and operationalizing machine learning
With the release of 7.6, the Elastic Stack delivers an end-to-end machine learning pipeline that provides the path from raw data to building, testing, and deploying machine learning models in production. Until now, machine learning in the Elastic Stack focused primarily on unsupervised techniques, using sophisticated pattern recognition to build time series models for anomaly detection. With the new data frame analytics, you can now use labelled data to train and test your own models, store those models as Elasticsearch indices, and use inference to add predicted values to the indices based on your trained models.
One packaged model that we are releasing in 7.6 is language identification. If you have documents or sources that come in a variety of languages, language identification can be used to determine the language of text so you can improve the overall search relevance. Language identification is a trained model that can provide a prediction of the language of any text field.
Cross-cluster search in transforms
Transforms can now use cross-cluster search (CCS) for the source index. Now you can have separate clusters (for example, project clusters) build entity-centric or feature indices against a primary cluster.
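A minimal sketch of such a transform configuration follows, built as a Python dictionary. The remote cluster alias primary_cluster, the index names, and the customer_id and price fields are all hypothetical; the part that matters is the cluster:index syntax in the source index, which assumes the remote cluster is already configured on the cluster that runs the transform.

```python
import json


def build_transform_config(remote_cluster, source_index, dest_index):
    """Build a pivot transform body whose source index lives on a remote cluster."""
    return {
        # Cross-cluster search syntax: <remote cluster alias>:<index name>
        "source": {"index": [f"{remote_cluster}:{source_index}"]},
        # The destination index is created on the local cluster.
        "dest": {"index": dest_index},
        "pivot": {
            # One output document per customer (entity-centric index).
            "group_by": {"customer_id": {"terms": {"field": "customer_id"}}},
            "aggregations": {"total_spent": {"sum": {"field": "price"}}},
        },
    }


config = build_transform_config("primary_cluster", "orders", "customer_totals")
print(json.dumps(config, indent=2))
```

The resulting body would be sent to the create transform API on the local (project) cluster, which then reads from the primary cluster and writes the summarized index locally.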