Increase the disk capacity of data nodes
Disk capacity pressure can cause indexing failures, unassigned shards, and cluster instability.
Elasticsearch uses disk-based shard allocation watermarks to manage disk space on nodes, which can block allocation or indexing when nodes run low on disk space. Refer to Watermark errors for additional details on how to address this situation.
To increase the disk capacity of the data nodes in your cluster, complete these steps:
The following steps explain how to retrieve the current disk watermark configuration of the cluster and how to check the current disk usage on the nodes.
Retrieve the relevant disk thresholds that indicate how much space should be available. The relevant thresholds are the high watermark for all tiers except the frozen tier, and the frozen flood stage watermark for the frozen tier. The following example demonstrates a disk shortage in the hot tier, so only the high watermark is retrieved:
```console
GET _cluster/settings?include_defaults&filter_path=*.cluster.routing.allocation.disk.watermark.high*
```

The response looks like this:
```json
{
  "defaults": {
    "cluster": {
      "routing": {
        "allocation": {
          "disk": {
            "watermark": {
              "high": "90%",
              "high.max_headroom": "150GB"
            }
          }
        }
      }
    }
  }
}
```

This means that to resolve the disk shortage, disk usage must drop below 90%, or more than 150GB of disk space must be available. Read more on how this threshold works here.
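If the shortage is in the frozen tier instead, retrieve the frozen flood stage watermark the same way. A minimal sketch, assuming the default setting names:

```console
GET _cluster/settings?include_defaults&filter_path=*.cluster.routing.allocation.disk.watermark.flood_stage.frozen*
```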
Find the current disk usage, which indicates how much extra space is required. For simplicity, this example has one node, but you can apply the same check to every node above the relevant threshold.
```console
GET _cat/allocation?v&s=disk.avail&h=node,disk.percent,disk.avail,disk.total,disk.used,disk.indices,shards
```

The response looks like this:
```console
node                disk.percent disk.avail disk.total disk.used disk.indices shards
instance-0000000000 91           4.6gb      35gb       31.1gb    29.9gb       111
```
In this scenario, the high watermark configuration indicates that the disk usage needs to drop below 90%, while the current disk usage is 91%.
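If disk pressure has already left shards unassigned, the cluster allocation explain API can confirm that the watermark is the blocker. A minimal sketch, assuming at least one shard is currently unassigned:

```console
GET _cluster/allocation/explain
```

The node allocation decisions in the response should mention the disk watermark if disk space is the cause.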
Here are the most common ways to increase disk capacity:
- You can expand the disk space of the existing nodes. This is typically achieved by replacing your nodes with higher-capacity ones.
- You can add additional data nodes to the data tier that is short of disk space, increasing the overall capacity of that tier and potentially improving performance by distributing data and workload across more resources.
When you add another data node, the cluster doesn't recover immediately; it can take some time for shards to relocate to the new node. You can check the progress with the following API call:

```console
GET /_cat/shards?v&h=state,node&s=state
```

If any shard in the response is in the RELOCATING state, shards are still moving. Wait until all shards reach STARTED.
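As an optional alternative (a sketch, not part of the documented steps), the cluster health API can block until relocation finishes, assuming a timeout that suits your environment:

```console
GET _cluster/health?wait_for_no_relocating_shards=true&timeout=60s
```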
In ECE, resizing is limited by your allocator capacity.
To increase the disk capacity of the data nodes in your cluster:
- Log in to the Elastic Cloud console or ECE Cloud UI.
- On the home page, find your deployment and select Manage.
- Go to Actions > Edit deployment and check that autoscaling is enabled. Adjust the Enable Autoscaling for dropdown menu as needed and select Save.
- If autoscaling is successful, the cluster returns to a healthy status. If the cluster is still short on disk space, check whether autoscaling has reached its set limits and update your autoscaling settings accordingly.
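After autoscaling adds capacity, you can re-run the allocation check from the diagnosis steps above to confirm that disk usage has dropped below the watermark:

```console
GET _cat/allocation?v&s=disk.avail&h=node,disk.percent,disk.avail,disk.total,disk.used,disk.indices,shards
```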
You can also add more capacity by adding more nodes to your cluster, targeting the data tier that is short on disk space. For more information, refer to Add a preferred data tier to a deployment.
To increase data node capacity in your cluster, you can add more nodes, or increase the disk capacity of existing nodes. Disk expansion procedures depend on your operating system and storage infrastructure, and are outside the scope of Elastic support. In practice, this is often achieved by removing a node from the cluster and reinstalling it with a larger disk, as sketched below.
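Before you take a node out for reinstallation, it is a good idea to drain its shards first. A minimal sketch, assuming a node with the hypothetical name `node-to-replace`:

```console
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.exclude._name": "node-to-replace"
  }
}
```

Once `GET _cat/allocation` shows the node holding no shards, it can be removed safely. Remember to reset the exclusion to `null` after the replacement node rejoins the cluster.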
To increase the disk capacity of data nodes in your Elastic Cloud on Kubernetes cluster, you can either add more data nodes or increase the storage size of existing nodes.
Option 1: Add more data nodes
Update the `count` field in your data node NodeSet to add more nodes:

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 9.2.3
  nodeSets:
  - name: data-nodes
    count: 5 # Increase from previous count
    config:
      node.roles: ["data"]
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
```
Apply the changes:

```sh
kubectl apply -f your-elasticsearch-manifest.yaml
```

ECK automatically creates the new nodes, and Elasticsearch relocates shards to balance the load. You can monitor the progress using:
```console
GET /_cat/shards?v&h=state,node&s=state
```
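You can also watch the new Pods come up on the Kubernetes side. A sketch, assuming the `quickstart` cluster from the manifest above:

```sh
kubectl get pods -l elasticsearch.k8s.elastic.co/cluster-name=quickstart
```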
Option 2: Increase storage size of existing nodes
If your storage class supports volume expansion, you can increase the storage size in the `volumeClaimTemplates`:

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 9.2.3
  nodeSets:
  - name: data-nodes
    count: 3
    config:
      node.roles: ["data"]
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 200Gi # Increased from previous size
```
Apply the changes. If the volume driver supports `ExpandInUsePersistentVolumes`, the filesystem is resized online without restarting Elasticsearch. Otherwise, you may need to manually delete the Pods after the resize so that they can be recreated with the expanded filesystem.
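To check up front whether your storage class allows expansion, inspect its `allowVolumeExpansion` field. A sketch, assuming a hypothetical storage class named `standard`:

```sh
kubectl get storageclass standard -o jsonpath='{.allowVolumeExpansion}'
```

If this prints `true`, in-place expansion is supported; otherwise, fall back to adding more data nodes.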
For more information, refer to Update your deployments and Volume claim templates.