Cluster

Shards Allocation

Shards allocation is the process of allocating shards to nodes. This can happen during initial recovery, replica allocation, rebalancing, or when nodes are added or removed.
The following settings may be used:
cluster.routing.allocation.allow_rebalance
Controls when rebalancing will happen, based on the total state of all the index shards in the cluster. Can be set to always, indices_primaries_active, or indices_all_active. Defaults to indices_all_active to reduce chatter during initial recovery.

cluster.routing.allocation.cluster_concurrent_rebalance
Controls how many concurrent shard rebalances are allowed cluster wide. Defaults to 2.

cluster.routing.allocation.node_initial_primaries_recoveries
Controls how many initial recoveries of primary shards are allowed per node. Since the local gateway is used most of the time, these recoveries should be fast, and more of them can run per node without creating load. Defaults to 4.

cluster.routing.allocation.node_concurrent_recoveries
How many concurrent recoveries are allowed to happen on a node. Defaults to 2.

cluster.routing.allocation.enable
Controls shard allocation for all indices, by allowing specific kinds of shard to be allocated. It can be changed on a live cluster, as shown in the sketch after this list. Can be set to:

- all - (default) Allows shard allocation for all kinds of shards.
- primaries - Allows shard allocation only for primary shards.
- new_primaries - Allows shard allocation only for primary shards of new indices.
- none - No shard allocations of any kind are allowed for any index.

cluster.routing.allocation.same_shard.host
Enables a check that prevents multiple instances of the same shard from being allocated on a single host, based on host name and host address. Defaults to false, meaning that no check is performed by default. This setting only applies if multiple nodes are started on the same machine.

indices.recovery.concurrent_streams
The number of streams to open (at the node level) to recover a shard from a peer shard. Defaults to 3.

indices.recovery.concurrent_small_file_streams
The number of streams to open (at the node level) for small files (under 5mb) to recover a shard from a peer shard. Defaults to 2.
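For example, cluster.routing.allocation.enable is commonly set to none before restarting a node, so that shards are not shuffled around while the node is down. A minimal sketch using the cluster update settings API (transient settings are reset on a full cluster restart; re-enable with all afterwards):

curl -XPUT localhost:9200/_cluster/settings -d '{
  "transient" : {
    "cluster.routing.allocation.enable" : "none"
  }
}'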
Shard Allocation Awareness

Cluster allocation awareness lets you configure shard and replica allocation across generic attributes associated with the nodes. Let's explain it through an example:

Assume we have several racks. When we start a node, we can configure an attribute called rack_id (any attribute name works), for example, here is a sample config:
node.rack_id: rack_one
The above sets an attribute called rack_id for the relevant node, with a value of rack_one. Now, we need to configure the rack_id attribute as one of the awareness allocation attributes (set it in the config of all master-eligible nodes):
cluster.routing.allocation.awareness.attributes: rack_id
The above means that the rack_id attribute will be used to do awareness-based allocation of a shard and its replicas. For example, let's say we start 2 nodes with node.rack_id set to rack_one, and deploy a single index with 5 shards and 1 replica. The index will be fully deployed on the current nodes (5 shards and 1 replica each, a total of 10 shards).
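For reference, such an index could be created like this (the index name test is illustrative):

curl -XPUT localhost:9200/test -d '{
  "settings" : {
    "number_of_shards" : 5,
    "number_of_replicas" : 1
  }
}'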
Now, if we start two more nodes, with node.rack_id set to rack_two, shards will relocate to even out the number of shards across the nodes, but a shard and its replica will not be allocated to nodes with the same rack_id value. The only exception to the latter is when the cluster is left with fewer rack_id values than shard copies. If your cluster spans two racks and one of them is lost, it is still worthwhile to spread replicas across the nodes in the remaining rack (see Forced Awareness for an alternative).
The awareness attributes can hold several values, for example:
cluster.routing.allocation.awareness.attributes: rack_id,zone
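A node participating in such a cluster would then carry a value for each attribute in its config, for example (the values are illustrative):

node.rack_id: rack_one
node.zone: zone1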
NOTE: When using awareness attributes, shards will not be allocated to nodes that don’t have values set for those attributes.
NOTE: The number of copies of a shard (primary plus replicas) allocated to a specific group of nodes with the same awareness attribute value is determined by the number of attribute values. When the groups contain unbalanced numbers of nodes and there are many replicas, replica shards may be left unassigned.
Forced Awareness

Sometimes, we know in advance the number of values an awareness attribute can have, and moreover, we would like never to have more replicas than needed allocated on a specific group of nodes with the same awareness attribute value. For that, we can force awareness on specific attributes.

For example, let's say we have an awareness attribute called zone, and we know we are going to have two zones, zone1 and zone2. Here is how we can force awareness on a node:
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2
cluster.routing.allocation.awareness.attributes: zone
Now, let's say we start 2 nodes with node.zone set to zone1 and create an index with 5 shards and 1 replica. The index will be created, but only the 5 primary shards will be allocated (with no replicas). Only when we start more nodes with node.zone set to zone2 will the replicas be allocated.
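To see which replicas are left unassigned while only zone1 nodes are running, the cat shards API can be used; a minimal sketch:

curl -XGET 'localhost:9200/_cat/shards?v'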
Automatic Preference When Searching / GETing

When executing a search, or doing a get, the node receiving the request will prefer to execute the request on shards that exist on nodes that have the same attribute values as the executing node. This only happens when the cluster.routing.allocation.awareness.attributes setting has been set to a value.
Realtime Settings Update
The settings can be updated using the cluster update settings API on a live cluster.
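For example, the forced awareness settings above could be applied to a running cluster like this (a sketch; persistent settings, unlike transient ones, survive a full cluster restart):

curl -XPUT localhost:9200/_cluster/settings -d '{
  "persistent" : {
    "cluster.routing.allocation.awareness.attributes" : "zone",
    "cluster.routing.allocation.awareness.force.zone.values" : "zone1,zone2"
  }
}'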
Shard Allocation Filtering

Shard allocation filtering allows control over the allocation of indices on nodes based on include/exclude filters. The filters can be set both at the index level and at the cluster level. Let's start with an example of setting them at the index level:
Let's say we have 4 nodes, each with a specific attribute called tag associated with it (the name of the attribute can be any name). Each node has a specific value associated with tag. Node 1 has the setting node.tag: value1, Node 2 the setting node.tag: value2, and so on. We can create an index that will only deploy on nodes that have tag set to value1 or value2 by setting index.routing.allocation.include.tag to value1,value2. For example:
curl -XPUT localhost:9200/test/_settings -d '{
  "index.routing.allocation.include.tag" : "value1,value2"
}'
On the other hand, we can create an index that will be deployed on all nodes except for nodes with a tag of value value3 by setting index.routing.allocation.exclude.tag to value3. For example:
curl -XPUT localhost:9200/test/_settings -d '{
  "index.routing.allocation.exclude.tag" : "value3"
}'
index.routing.allocation.require.* can be used to specify a number of rules, all of which MUST match in order for a shard to be allocated to a node. This is in contrast to include, which will include a node if ANY rule matches.
The include, exclude and require values can have generic simple matching wildcards, for example, value1*. A special attribute name called _ip can be used to match on node IP addresses. In addition, the _host attribute can be used to match on either the node's hostname or its IP address. Similarly, the _name and _id attributes can be used to match on node name and node id respectively.
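For instance, here is a sketch that keeps an index off any node whose name matches a wildcard pattern (the pattern old_node* is illustrative):

curl -XPUT localhost:9200/test/_settings -d '{
  "index.routing.allocation.exclude._name" : "old_node*"
}'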
Obviously a node can have several attributes associated with it, and both the attribute name and value are controlled in the setting. For example, here is a sample of several node configurations:
node.group1: group1_value1
node.group2: group2_value4
In the same manner, include, exclude and require can work against several attributes, for example:
curl -XPUT localhost:9200/test/_settings -d '{
  "index.routing.allocation.include.group1" : "xxx",
  "index.routing.allocation.include.group2" : "yyy",
  "index.routing.allocation.exclude.group3" : "zzz",
  "index.routing.allocation.require.group4" : "aaa"
}'
The provided settings can also be updated in real time using the update settings API, allowing you to "move" indices (shards) around in real time.
Cluster-wide filtering can also be defined, and updated in real time using the cluster update settings API. This setting can come in handy for things like decommissioning nodes (even if the replica count is set to 0). Here is a sample of how to decommission a node based on its _ip address:
curl -XPUT localhost:9200/_cluster/settings -d '{
  "transient" : {
    "cluster.routing.allocation.exclude._ip" : "10.0.0.1"
  }
}'
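Once the node has been drained of shards and shut down, the exclusion can be lifted; a sketch, assuming that setting the value to an empty string clears the filter:

curl -XPUT localhost:9200/_cluster/settings -d '{
  "transient" : {
    "cluster.routing.allocation.exclude._ip" : ""
  }
}'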