Elastic Cloud Enterprise 2.12.0

The following changes are included in this release.

Features

Allow APM and Fleet Server for Agent. Agent now supports APM, which allows upgrades from legacy to managed.

Feedback link to the Help menu. Added a Give feedback link to the Cloud console header in the Help menu. The link opens a feedback page in a new tab.

Force enroll on container restart. Introduced ephemeral Hosted Agent on container restart.

Migrate to plan level settings, if possible, when editing. When editing a deployment in the UI, user settings are moved from the topology element level to the deployment resource level if it is possible to do so safely.

ECE upgrade to fix 7.14 APM during upgrade to 2.11. As of ECE version 2.11, the allocator uses a different HTTP endpoint for health checks of APM instances. The new endpoint is hosted by the Agent process. Previous versions of ECE do not configure the Agent, so the ECE upgrade reconfigures all APM instances of version 7.14 and later in a way that lets the 2.11 allocator update the container and start the Agent.

Enable operator privileges. Enable operator privileges for Elasticsearch clusters. This feature limits access to functionality that should only be used by the operator of the cluster infrastructure.

Enhancements

Validate deployment against template min-version. Additional API validation: Ensure that the stack version used in the deployment is at least the minimum version required by the deployment template.
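
The check described above can be sketched as a simple version comparison. This is only an illustration, not the ECE implementation: the function names are hypothetical, and it assumes plain dotted numeric versions such as 7.15.0 with no pre-release suffixes.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '7.15.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_min_version(stack_version: str, template_min_version: str) -> bool:
    """Return True when the stack version is at least the template minimum.

    Hypothetical helper: comparing tuples avoids the string-comparison trap
    where '7.9.0' would sort after '7.10.0'.
    """
    return parse_version(stack_version) >= parse_version(template_min_version)
```

Comparing integer tuples rather than raw strings matters here, because a lexical comparison would rank 7.9.0 above 7.10.0.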

Improve output of PreInstanceRemovalSafetyChecks. Improved details on activity page when the step Making sure it is safe to remove instances from the cluster fails.

Validate deployment templates: Check referenced observability deployment and snapshot repository. Added validations when creating and updating deployment templates:

  • Observability settings: Make sure the referenced deployments for logging and metrics exist.
  • Snapshot repository settings: Make sure the referenced snapshot repository name is configured.

Updating docs and adding task manager monitoring config. Adds support and docs for:

  • xpack.task_manager.monitored_stats_health_verbose_log.enabled Kibana Task Manager configuration
  • xpack.task_manager.monitored_stats_health_verbose_log.warn_delayed_task_start_in_seconds Kibana Task Manager configuration

Record boot failures. It is now possible to detect boot failure patterns from the Elasticsearch boot log.

Update default deployment name. The default deployment name is now My deployment.

Shorten Azure Private Link endpoint name. Improves the handling of long Azure Private Link endpoint names. With this release, Azure Private Link traffic filters work regardless of the length of their endpoint names.

Make EnsureStorageResources step conditional. The step Preparing to store snapshots will run only when needed.

Consider allocator status when terminating. Plans that terminate deployments will no longer stall out during the step Waiting until instances are terminated if any nodes are on disconnected allocators.

Add support and docs for Task Manager ephemeral tasks configuration. Adds support and docs for:

  • xpack.task_manager.ephemeral_tasks.enabled Kibana Task Manager configuration
  • xpack.task_manager.ephemeral_tasks.request_capacity Kibana Task Manager configuration
  • xpack.alerting.maxEphemeralActionsPerAlert Kibana Alerting configuration

Rework conditional logic for SuspendSnapshotting. The step Disabling snapshots will no longer run during plans that aren’t stopping nodes. If it fails, the plan is still allowed to continue in some cases.

On-demand heap dump capture UI. Adds on-demand heap dump capture capabilities. Heap dumps can now be captured on-demand from Elasticsearch instances and downloaded from the Cloud UI.

Remove CPU overcommit. ECE no longer supports overriding the CPU overcommit factor for individual deployments, because CPU overcommitting could lead to unpredictable performance.

Include chain status in TLS endpoint responses. GET /api/v1/platform/configuration/security/tls/{service_name} responses now include a status JSON object with information on the time to live of the service_name certificate chain. Check the TLS certificate.
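
As a rough sketch of how a client might call this endpoint, the snippet below builds the GET request and pulls the status object out of a response body. The base URL, the ApiKey authorization header, and the helper names are assumptions for illustration. The fields inside status are not listed in these notes, so the code treats the object as opaque.

```python
import json
import urllib.request
from typing import Any, Optional
from urllib.parse import quote

TLS_PATH = "/api/v1/platform/configuration/security/tls/"


def build_tls_chain_request(
    base_url: str, service_name: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) the GET request for a service's TLS chain.

    base_url and api_key are placeholders; the ApiKey header scheme is an
    assumption about how the installation authenticates API calls.
    """
    url = base_url.rstrip("/") + TLS_PATH + quote(service_name, safe="")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"ApiKey {api_key}"},
        method="GET",
    )


def extract_status(response_body: str) -> Optional[Any]:
    """Return the 'status' object from a JSON response body, or None."""
    return json.loads(response_body).get("status")
```

To send the request, pass the result to urllib.request.urlopen and feed the decoded body to extract_status.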

Turn on OOM heap dumps by default. Stop uploading heap dumps. Heap dumps are now taken by default on OOM and are available to download through the API. We’ve stopped uploading heap dumps to S3 and GCS snapshot repositories and heap dump URLs will no longer be returned by the system alerts API.

Add stricter validation to create or update deployment templates. When creating or updating a Deployment template, if it specifies node_roles in the Elasticsearch plan, the template must contain all resource types and all Elasticsearch tiers.

Increase APM and Fleet size to 1GB RAM. We have recently added Fleet Server to the APM component and, as part of that change, we also updated the deployment templates to set the initial size of the APM and Fleet component to 1GB by default.

Make plan snapshots conditional. A deployment plan no longer creates a snapshot as a first step, unless the plan includes removing data nodes or doing major stack version upgrades.
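
The rule above reduces to a small predicate. The sketch below is hypothetical: the PlanChange type and its fields are invented for illustration and do not mirror the real plan model, and it assumes a major upgrade means the first version component increases.

```python
from dataclasses import dataclass


@dataclass
class PlanChange:
    """Hypothetical summary of what a deployment plan is about to do."""

    current_version: str
    target_version: str
    removes_data_nodes: bool


def is_major_upgrade(current_version: str, target_version: str) -> bool:
    """True when the major (first) version component increases."""
    return int(target_version.split(".")[0]) > int(current_version.split(".")[0])


def needs_initial_snapshot(change: PlanChange) -> bool:
    """Snapshot first only for data node removal or a major upgrade."""
    return change.removes_data_nodes or is_major_upgrade(
        change.current_version, change.target_version
    )
```

Under these assumptions, a minor upgrade such as 7.14.0 to 7.15.0 that keeps all data nodes would skip the initial snapshot.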

Elasticsearch: Add settings to enable compression for ingest data transfer. Configure Elasticsearch to use data compression when ingesting data to reduce data transfer rates.

Set use_for_peer_recovery in snapshot settings for 7.15+. The use_for_peer_recovery setting is enabled on snapshot repositories for clusters running stack versions 7.15 or higher. This aims to reduce data transfer costs by using existing snapshots for shard migrations and instance recovery.

Failed autoscaling notification. An email notification has been added to inform you when autoscaling fails to adjust the size of a deployment. This failure will require you to fix your deployment’s health before further autoscaling changes will be attempted.

Added duration or number support for xpack.reporting.queue.timeout. Added the ability to configure xpack.reporting.queue.timeout as a number value (milliseconds) or a string (duration).
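
To illustrate the two accepted forms, the sketch below normalizes either one to milliseconds. It is not Kibana's parser: the helper name is hypothetical, and the supported unit suffixes (ms, s, m, h, d) are an assumption that covers only a subset of possible duration strings.

```python
import re
from typing import Union

# Assumed unit suffixes and their size in milliseconds.
_UNIT_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}
_DURATION = re.compile(r"^(\d+)(ms|s|m|h|d)$")


def timeout_to_ms(value: Union[int, str]) -> int:
    """Normalize a timeout given as a number (ms) or a duration string."""
    if isinstance(value, int):
        # A bare number is already expressed in milliseconds.
        return value
    match = _DURATION.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_MS[unit]
```

With this sketch, the number 120000 and the strings "120s" and "2m" all describe the same two-minute timeout.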

Respect maintenance mode in autoscaling service. The constructor autoscaling service will now respect maintenance mode. When a constructor is in maintenance mode it will not process any incoming autoscaling requests.

Skip snapshot steps in termination handler. Terminating an Elasticsearch cluster while using the Skip snapshots option will no longer execute the suspend-snapshotting step. This makes it easier to terminate a cluster that is in an unhealthy state.

Make plan handler LogContext more consistent. Constructor logs will be a bit more consistent in which log attributes are present. Notably, termination plans will now have the is_termination: true log attribute.

Add and default to plan-level user settings editing. For new deployments, or existing deployments without user settings, new user settings will be stored once at the deployment resource level, instead of within each topology element.

Run EnsureRepository less often. The step Ensuring snapshot repository exists now runs less often, because it is skipped when it is not needed.

Use restored snapshot name as default. When a deployment is created from a restored snapshot, the snapshot name is used as a placeholder for the new deployment name.

Improve error output for failed snapshot restores. Failures during the Restoring snapshot step that are due to Elasticsearch errors will now have details of the error on the Activity page.

Allow the Kibana setting server.maxPayload. Kibana renamed the server.maxPayloadBytes setting to server.maxPayload in 7.13.0. server.maxPayloadBytes will be removed in 8.0.0. With this Elastic Cloud Enterprise release, deployments can start using the new setting.

Warn of upcoming certificate expirations. Include Platform warnings in the Cloud UI when ui/api or proxy TLS chains are about to become invalid due to any of the certificates expiring.

Remove dependencies on python templating for ProxyV2 configuration. The Proxy configuration will be generated by the proxy during startup.

Support Enterprise Search heap dump download and capture in UI. Extends support for on-demand heap dump capture and download to Enterprise Search instances.

Bug fixes

Fixed a bug in ApplyMonitoringConfig. Fixed a bug that could cause monitoring configuration to not be applied after a major version upgrade.

Fixed logs and metrics links when deployment does not have a Kibana instance. Fixed an issue where the logs and metrics links either were disabled or didn’t render when the source or destination deployment did not have a Kibana instance.

Fixed a UI rendering bug. Fixes a bug in the UI where new plan attempts would occasionally render incorrectly. This was usually fixed with a simple refresh.

Propagates custom headers to Elastic Agent. Adds support for custom headers that was added in Beats PR #26275. This PR modifies the fleet-setup config so these headers are propagated to the Elastic Agent, similar to APM.

Issues logging in with SSO to Kibana. When logging into Kibana from the Cloud UI, single sign-on previously required pressing the "Login with Elastic Cloud" button. It now sends you directly into Kibana.

CTS - Gracefully handle returning keys of non-existent object. The CTS framework implements some JSON-parsing utility methods. One of them, GetKeys returns the keys for a JSON object. This method causes a panic if the JSON object itself is non-existent. This PR fixes this panic by checking for the existence of the JSON object first. If it isn’t found, it returns an empty slice of keys.

Apply enterprise license during 7.8.1+ upgrades. Fixes a bug that prevented clusters from getting immediate access to new enterprise features while upgrading to 7.8.1 or higher (from version 7.8.0 or lower).

Hide terminate button for Kibana. Removes a misleading Terminate button that users couldn’t click.

Add breadcrumbs to Getting started page. Adds header breadcrumbs to the Getting started page.

Validate that there is only one resource for each type. Makes sure that it is not possible to create two resources of the same type in the same deployment (for example two Elasticsearch clusters), as this is not supported.
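
A minimal sketch of this kind of check is shown below. The resource shape (a list of dicts with a "type" key) and the function name are assumptions for illustration and do not reflect the actual deployment payload.

```python
from collections import Counter
from typing import Dict, List


def duplicate_resource_types(resources: List[Dict[str, str]]) -> List[str]:
    """Return the resource types that appear more than once, sorted.

    Hypothetical helper: an empty result means the deployment passes
    the one-resource-per-type rule.
    """
    counts = Counter(resource["type"] for resource in resources)
    return sorted(kind for kind, seen in counts.items() if seen > 1)
```

A request containing two entries of type elasticsearch would yield ["elasticsearch"], and could be rejected with that list in the error message.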

Add role "kibana_system" to EntSearch 7.14+. Add role "kibana_system" to the service user for Enterprise Search at versions 7.14+.

Remove legacy and buggy v0.1 certificates routes. We are removing the deprecated /api/v0.1/regions/local-1/certificates/<proxy|admin-console>/*.pem routes after having detected bugs in their implementation and noticed that they are not used anymore.

Route allocator service traffic to tiebreakers (ECE only). Fixed an issue where heap dumps could not be captured, listed, or downloaded from Elasticsearch tiebreaker nodes.

Breaking changes

Recovery from snapshot is now enabled by default. Starting in Elastic Cloud Enterprise version 2.12 with Elasticsearch versions 7.15 and above, clusters by default use snapshots to repopulate data in new instances, for example when nodes are vacated or moved to allow greater capacity. You should disable this configuration in your cluster if your snapshot repository has associated download costs. This change can be made per cluster by changing the cluster setting indices.recovery.use_snapshots.
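
As a sketch of how the setting could be turned off, the helper below builds the JSON body for an Elasticsearch cluster update settings request (PUT _cluster/settings). The helper name is hypothetical, and choosing persistent over transient scope is an assumption. How the request reaches the cluster depends on your environment.

```python
import json


def use_snapshots_settings_body(enabled: bool, persistent: bool = True) -> str:
    """Build the request body that sets indices.recovery.use_snapshots.

    Send the result as the body of PUT _cluster/settings on the cluster.
    """
    scope = "persistent" if persistent else "transient"
    return json.dumps({scope: {"indices.recovery.use_snapshots": enabled}})
```

Calling use_snapshots_settings_body(False) produces a body that disables recovery from snapshots for that cluster.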

Turn on OOM heap dumps by default. Stop uploading heap dumps. Heap dumps are now taken by default on OOM and are available to download through the API. We’ve stopped uploading heap dumps to S3 and GCS snapshot repositories and heap dump URLs are no longer returned by the system alerts API.