Decrease the disk usage of data nodes
editDecrease the disk usage of data nodes
editIn order to decrease the disk usage in your cluster without losing any data, you can try reducing the replicas of indices.
Reducing the replicas of an index can potentially reduce search throughput and data redundancy. However, it can quickly give the cluster breathing room until a more permanent solution is in place.
Use Kibana
- Log in to the Elastic Cloud console.
-
On the Elasticsearch Service panel, click the name of your deployment.
If the name of your deployment is disabled your Kibana instances might be unhealthy, in which case please contact Elastic Support. If your deployment doesn’t include Kibana, all you need to do is enable it first.
- Open your deployment’s side navigation menu (placed under the Elastic logo in the upper left corner) and go to Stack Management > Index Management.
-
In the list of all your indices, click the
Replicas
column twice to sort the indices based on their number of replicas starting with the one that has the most. Go through the indices and pick one by one the index with the least importance and higher number of replicas.Reducing the replicas of an index can potentially reduce search throughput and data redundancy.
-
For each index you chose, click on its name, then on the panel that appears click
Edit settings
, reduce the value of theindex.number_of_replicas
to the desired value and then clickSave
. - Continue this process until the cluster is healthy again.
In order to estimate how many replicas need to be removed, first you need to estimate the amount of disk space that needs to be released.
-
First, retrieve the relevant disk thresholds that will indicate how much space should be released. The relevant thresholds are the high watermark for all the tiers apart from the frozen one and the frozen flood stage watermark for the frozen tier. The following example demonstrates disk shortage in the hot tier, so we will only retrieve the high watermark:
response = client.cluster.get_settings( include_defaults: true, filter_path: '*.cluster.routing.allocation.disk.watermark.high*' ) puts response
GET _cluster/settings?include_defaults&filter_path=*.cluster.routing.allocation.disk.watermark.high*
The response will look like this:
{ "defaults": { "cluster": { "routing": { "allocation": { "disk": { "watermark": { "high": "90%", "high.max_headroom": "150GB" } } } } } } }
The above means that in order to resolve the disk shortage we need to either drop our disk usage below the 90% or have more than 150GB available, read more on how this threshold works here.
-
The next step is to find out the current disk usage; this will indicate how much space should be freed. For simplicity, our example has one node, but you can apply the same for every node over the relevant threshold.
response = client.cat.allocation( v: true, s: 'disk.avail', h: 'node,disk.percent,disk.avail,disk.total,disk.used,disk.indices,shards' ) puts response
GET _cat/allocation?v&s=disk.avail&h=node,disk.percent,disk.avail,disk.total,disk.used,disk.indices,shards
The response will look like this:
node disk.percent disk.avail disk.total disk.used disk.indices shards instance-0000000000 91 4.6gb 35gb 31.1gb 29.9gb 111
- The high watermark configuration indicates that the disk usage needs to drop below 90%. Consider allowing some padding, so the node will not go over the threshold in the near future. In this example, let’s release approximately 7GB.
-
The next step is to list all the indices and choose which replicas to reduce.
The following command orders the indices with descending number of replicas and primary store size. We do this to help you choose which replicas to reduce under the assumption that the more replicas you have the smaller the risk if you remove a copy and the bigger the replica the more space will be released. This does not take into consideration any functional requirements, so please see it as a mere suggestion.
response = client.cat.indices( v: true, s: 'rep:desc,pri.store.size:desc', h: 'health,index,pri,rep,store.size,pri.store.size' ) puts response
GET _cat/indices?v&s=rep:desc,pri.store.size:desc&h=health,index,pri,rep,store.size,pri.store.size
The response will look like:
health index pri rep store.size pri.store.size green my_index 2 3 9.9gb 3.3gb green my_other_index 2 3 1.8gb 470.3mb green search-products 2 3 278.5kb 69.6kb green logs-000001 1 0 7.7gb 7.7gb
-
In the list above we see that if we reduce the replicas to 1 of the indices
my_index
andmy_other_index
we will release the required disk space. It is not necessary to reduce the replicas ofsearch-products
andlogs-000001
does not have any replicas anyway. Reduce the replicas of one or more indices with the index update settings API:Reducing the replicas of an index can potentially reduce search throughput and data redundancy.
PUT my_index,my_other_index/_settings { "index.number_of_replicas": 1 }