Elasticsearch 7.14 released
We’re pleased to announce Elasticsearch 7.14.0, based on Apache Lucene 8.9.0.
Version 7.14 is the latest stable release of Elasticsearch and is now available for deployment via Elasticsearch Service on Elastic Cloud or via download for use in your own environment(s).
Ready to roll up your sleeves and get started? We have the links you need:
- Start Elasticsearch on Elastic Cloud
- Download Elasticsearch
- Elasticsearch 7.14.0 release notes
- Elasticsearch 7.14.0 breaking changes
Swing for the geofences with Elasticsearch 7.14
When searching data, “where” is a common and important question. Whether you’re determining if an outage is local or global, locating adversaries attacking your network, or simply tracking where your food delivery is, Elasticsearch and the Elastic Stack can answer the “where” questions with geospatial data. Elasticsearch 7.14 includes several updates that make the Elastic Stack even more of a geospatial powerhouse.
Query geo shapes by height, width, and centroid with runtime fields
Geometric properties of a geo shape, such as its height, width, and centroid, are not normally available within the Lucene index, but they can be computed from the geo_shape value. With the new Painless script support for the geo_shape field type in Elasticsearch 7.14, you can define runtime fields that expose these geometric characteristics, then query and aggregate on those fields.
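As a rough illustration, a search request could define runtime fields whose Painless scripts read the shape's doc values. This is only a sketch under assumptions: the field name `geometry`, the runtime field names, and the Painless accessors `getMercatorWidth()` and `getCentroid()` are not taken from this post, so check them against the Painless API reference for your version before relying on them.

```python
import json

# Hypothetical search body: "geometry" is an assumed geo_shape field name.
# The Painless accessors below are assumptions to verify for your version.
search_body = {
    "runtime_mappings": {
        "shape_width": {
            "type": "double",
            "script": {"source": "emit(doc['geometry'].getMercatorWidth())"},
        },
        "shape_centroid": {
            "type": "geo_point",
            "script": {
                "source": "def c = doc['geometry'].getCentroid(); "
                          "emit(c.getLat(), c.getLon())"
            },
        },
    },
    # Query and aggregate on the computed width like on any numeric field.
    "query": {"range": {"shape_width": {"gte": 1000}}},
    "aggs": {"avg_width": {"avg": {"field": "shape_width"}}},
}

print(json.dumps(search_body, indent=2))
```

The body would be sent to the `_search` endpoint of the index that holds the shapes.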
Geotile grid aggregation over geo_shape is 15% faster
At Elastic, we are committed to making our software easier to use, lowering the storage footprint, and of course improving performance. In Elasticsearch 7.14, geotile grid aggregation over geo_shape is roughly 15% faster. The gain comes from first computing the range of tiles that needs to be added to the geotile aggregation and then adding the corresponding tiles with a simple iteration.
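The idea of computing a tile range and then iterating over it can be sketched with the standard Web Mercator tile formula. This is an illustration of the general technique, not Elasticsearch's actual implementation. The function names are hypothetical, and the sketch works on a bounding box only, without checking which tiles the shape itself intersects.

```python
import math

def tile_xy(lon, lat, zoom):
    """Map a longitude/latitude pair to tile x/y at the given zoom level."""
    n = 1 << zoom  # number of tiles per axis at this zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the map edge still land in a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)

def tiles_for_bbox(min_lon, min_lat, max_lon, max_lat, zoom):
    """List every (zoom, x, y) tile covering a bounding box."""
    x_min, y_min = tile_xy(min_lon, max_lat, zoom)  # top-left corner
    x_max, y_max = tile_xy(max_lon, min_lat, zoom)  # bottom-right corner
    return [
        (zoom, x, y)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    ]

tiles = tiles_for_bbox(-10.0, -10.0, 10.0, 10.0, zoom=1)
print(len(tiles), "tiles:", tiles)
```

Computing the corner tiles once and looping over the range avoids testing every tile at the zoom level individually.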
Update and reload custom databases at runtime
The geoip processor adds information about the geographical location of an IPv4 or IPv6 address. By default, the processor uses the GeoLite2 City, GeoLite2 Country, and GeoLite2 ASN GeoIP2 databases from MaxMind. Elasticsearch can also use custom databases, such as other city, country, and ASN GeoIP2 databases. In 7.14, you can update and reload custom databases at runtime, which lets you test a custom database and fall back to the default databases if there is an issue.
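For context, an ingest pipeline that points the geoip processor at a custom database might look like the sketch below. The pipeline ID, the field names, and the database file name `my-custom-city.mmdb` are hypothetical placeholders. How the custom file is supplied to the cluster is not covered here.

```python
import json

PIPELINE_ID = "geoip-custom"  # hypothetical pipeline name

pipeline_body = {
    "description": "Enrich documents with location data from a custom database",
    "processors": [
        {
            "geoip": {
                "field": "source.ip",          # assumed field holding the IP
                "target_field": "source.geo",  # where the lookup result goes
                "database_file": "my-custom-city.mmdb",  # hypothetical file
                "ignore_missing": True,
            }
        }
    ],
}

def put_pipeline_request(pipeline_id, body):
    """Return the HTTP method, path, and JSON payload for creating a pipeline."""
    return "PUT", f"/_ingest/pipeline/{pipeline_id}", json.dumps(body)

method, path, payload = put_pipeline_request(PIPELINE_ID, pipeline_body)
print(method, path)
print(payload)
```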
Elasticsearch is 10% smaller, using new field type match_only_text
Elasticsearch 7.14 introduces a new field type called match_only_text. It can be used as a drop-in replacement for the text field type in logging datasets and leads to a 10% reduction of storage requirements. The trade-offs made to achieve this significant cost reduction are typically irrelevant for log data: the relevance score is calculated by the number of matching terms (but who does relevance ranking on logs?), span queries are not supported, and phrase and intervals queries are slower. None of these is something that users of log indices will miss. Read our dedicated blog post to learn more about this new field type.
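A minimal sketch of a log index mapping that uses the new type is shown below. The index name `my-logs` and the field names are assumptions chosen for illustration.

```python
import json

INDEX_NAME = "my-logs"  # hypothetical index name

index_body = {
    "mappings": {
        "properties": {
            "@timestamp": {"type": "date"},
            # Where a text field would normally hold the log line.
            "message": {"type": "match_only_text"},
        }
    }
}

print(f"PUT /{INDEX_NAME}")
print(json.dumps(index_body, indent=2))
```

Because the type of an existing field cannot be changed in place, the new type would typically take effect in newly created indices, for example at the next rollover.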
New API to migrate node attributes to node roles
Since the introduction of data tiers and node roles, there has been a need for an API to migrate old node attributes to node roles. Elasticsearch 7.14 introduces a new API that migrates existing indices and ILM policies to use node roles (specialized data tiers) for allocation routing.
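The request for the migration API can be sketched as follows. The attribute name `data` and the template name are hypothetical, and the helper function is not part of any client library. Starting with a dry run is a cautious way to preview what would change.

```python
import json

def migrate_to_data_tiers_request(node_attribute, legacy_template=None, dry_run=True):
    """Build method, path, and body for the migrate to data tiers API."""
    body = {"node_attribute": node_attribute}
    if legacy_template:
        # Optional legacy index template to delete during the migration.
        body["legacy_template_to_delete"] = legacy_template
    path = "/_ilm/migrate_to_data_tiers"
    if dry_run:
        path += "?dry_run=true"  # report the changes without applying them
    return "POST", path, body

method, path, body = migrate_to_data_tiers_request(
    "data", legacy_template="global-template"
)
print(method, path)
print(json.dumps(body, indent=2))
```

For a real (non dry run) migration, ILM needs to be stopped first and restarted afterwards.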
New developments to search in Elasticsearch
New range aggregation
With the new range aggregation over histogram fields, you can calculate the number of occurrences that fall within a range of values stored in a histogram. Ranges are a useful way to analyze statistical data, especially in use cases like observability and security. For example, it would be helpful to know the number of high-CPU occurrences for an application, whether within a single histogram or across a two-week span, in order to identify performance issues.
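The sketch below pairs a range aggregation request with a small local function that mimics the counting on one pre-aggregated histogram. The field name `cpu_usage_histogram` and the sample data are made up. The local function only illustrates the semantics (lower bound inclusive, upper bound exclusive) and is not how Elasticsearch computes the result.

```python
import json

# Request body: a range aggregation targeting a histogram field (assumed name).
range_agg_body = {
    "size": 0,
    "aggs": {
        "cpu_ranges": {
            "range": {
                "field": "cpu_usage_histogram",
                "ranges": [{"to": 50}, {"from": 50, "to": 90}, {"from": 90}],
            }
        }
    },
}

def count_in_ranges(values, counts, ranges):
    """Sum histogram counts per range; 'from' is inclusive, 'to' is exclusive."""
    totals = []
    for r in ranges:
        low = r.get("from", float("-inf"))
        high = r.get("to", float("inf"))
        totals.append(sum(c for v, c in zip(values, counts) if low <= v < high))
    return totals

# A histogram field stores parallel arrays of values and their counts.
histogram = {"values": [10, 40, 60, 95], "counts": [3, 7, 5, 2]}
ranges = range_agg_body["aggs"]["cpu_ranges"]["range"]["ranges"]
print(count_in_ranges(histogram["values"], histogram["counts"], ranges))
print(json.dumps(range_agg_body))
```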
Aggregation performance enhancements
In addition to faster performance in geotile grid aggregations, we have also optimized single-bucket date_histograms. When you have a date_histogram aggregation with a single bucket and no sub-aggregations, you’ll now receive results approximately 10 times faster. When the whole index falls within one bucket, we simply rely on the metadata of the index instead of spending time scanning it, and these conditions are surprisingly common. Depending on your use case, you might see this performance boost when buckets are larger than a day if you are using daily indices.
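The shortcut can be pictured with a small sketch. It assumes a fixed interval in milliseconds, UTC bucket boundaries, and a request that matches every document in the index. The function name and inputs are hypothetical, and this is not the actual Elasticsearch code path.

```python
DAY_MS = 24 * 60 * 60 * 1000

def single_bucket_shortcut(min_ts_ms, max_ts_ms, doc_count, interval_ms):
    """Return the lone bucket from index metadata, or None if a scan is needed.

    min_ts_ms / max_ts_ms stand in for the smallest and largest timestamps
    recorded in the index metadata; doc_count is the number of documents.
    """
    first_bucket = min_ts_ms // interval_ms
    last_bucket = max_ts_ms // interval_ms
    if first_bucket == last_bucket:
        # Every document rounds down to the same bucket key.
        return {first_bucket * interval_ms: doc_count}
    return None  # timestamps span several buckets, fall back to scanning

# A daily index queried with a one-day interval collapses to one bucket.
print(single_bucket_shortcut(10 * DAY_MS + 1_000, 10 * DAY_MS + 5_000, 42, DAY_MS))
```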
Faster performance is always great, but how about lowering the consumption of heap memory? In 7.14 we are improving the heap memory usage of composite aggregation by avoiding global ordinals. Composite aggregations no longer need global ordinals, which reduces resource consumption for batch-like jobs such as transforms. Elastic’s machine learning transforms help create a data set by converting existing Elasticsearch indices into entity-centric indices that summarize the behavior of the data you are interested in.
If you are interested in learning more about recent aggregation optimizations, check out our blog posts on the new aggregations released in 7.13 and date_histogram aggregations in 7.11.
Search enhancements
When entering data into an app, getting help or autocomplete suggestions can be a life saver. The terms enum API can be used to discover terms in an index that match a partial string, which is useful for implementing autocomplete. Unlike other completion and suggestion APIs that attempt typo correction or use specialized indices with boosts, the new API is focused on raw speed. The result set can be partial and is presented in alphabetical order. The API is designed to provide excellent performance on very large data sets, so results appear while the user types. Imagine you are working in Kibana and you can’t quite think of the value of a field you are entering. Autocomplete can help you fill out your field values.
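A request to the terms enum API could be assembled as in this sketch. The index name, field name, and helper function are hypothetical. The defaults chosen here for `size` and `case_insensitive` are this example's choices, not the API's own defaults.

```python
import json

def terms_enum_request(index, field, prefix, size=10, case_insensitive=True):
    """Build method, path, and body for a terms enum lookup."""
    body = {
        "field": field,       # the field whose terms are enumerated
        "string": prefix,     # the partial string typed so far
        "size": size,         # maximum number of terms to return
        "case_insensitive": case_insensitive,
    }
    return "POST", f"/{index}/_terms_enum", body

method, path, body = terms_enum_request("my-logs", "host.name", "web-")
print(method, path)
print(json.dumps(body))
```

The response carries a `terms` array plus a `complete` flag that indicates whether the result set is partial.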
Elasticsearch machine learning
Transforms: Support for top metrics
Transforms now support the top_metrics aggregation. This improves performance when grouping by many fields. If these fields are descriptive and have the same cardinality (for example, a customer's first and last name describe their customer_id), then using top metrics significantly reduces the work the aggregations need to do. It is also a usability improvement when configuring a top or last value, which previously would have required a scripted metric.
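As a sketch of the pattern described above, the transform below groups by `customer_id` only and uses top_metrics to carry the descriptive name fields along, rather than adding them as extra group_by fields. All index, field, and transform names are assumptions for illustration.

```python
import json

TRANSFORM_ID = "customers-latest"  # hypothetical transform name

transform_body = {
    "source": {"index": "orders"},          # assumed source index
    "dest": {"index": "customers-latest"},  # assumed destination index
    "pivot": {
        # Group by the ID alone instead of ID plus both name fields.
        "group_by": {"customer_id": {"terms": {"field": "customer_id"}}},
        "aggregations": {
            "latest": {
                "top_metrics": {
                    "metrics": [
                        {"field": "customer_first_name"},
                        {"field": "customer_last_name"},
                    ],
                    # Take values from the most recent order per customer.
                    "sort": {"order_date": "desc"},
                }
            }
        },
    },
}

print(f"PUT /_transform/{TRANSFORM_ID}")
print(json.dumps(transform_body, indent=2))
```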
Anomaly detection: Reset job API
The reset job API makes it easier to start anomaly detection jobs again from scratch. It puts a job back into the state it was in immediately after creation, which is equivalent to deleting and recreating it, but without the need to remember the configuration. It also simplifies support, as users will be able to reset their job with a single click.
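Calling the API directly might look like the sketch below. The job ID and the helper function are hypothetical.

```python
def reset_job_request(job_id, wait_for_completion=True):
    """Build the method and path for resetting an anomaly detection job."""
    flag = "true" if wait_for_completion else "false"
    return "POST", f"/_ml/anomaly_detectors/{job_id}/_reset?wait_for_completion={flag}"

method, path = reset_job_request("cpu-anomalies")
print(method, path)
```

A job needs to be closed before it can be reset.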
That's all folks…
7.14 is another great release for Elasticsearch, and we couldn't cover all of it within this blog. Be sure to check out more in the release highlights.
With today’s release, our Elastic Enterprise Search, Elastic Observability, and Elastic Security solutions also received updates. To learn more, check out our main Elastic 7.14 release blog or read the Elastic Enterprise Search blog, Elastic Observability blog, and Elastic Security blog for more details.
Ready to get your hands dirty and try some of the new functionality? Spin up a free 14-day trial of Elastic Cloud or download Elasticsearch today. If you’re new to Elastic Cloud, take a look at our Quick Start guides (bite-sized training videos to get you started quickly) or our free fundamentals training courses. Try it out and be sure to let us know what you think on Twitter (@elastic), in our forum, or on our community Slack channel.
The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all.