Zen Discovery
Zen discovery is the built-in, default discovery module for Elasticsearch. It provides unicast and file-based discovery, and can be extended to support cloud environments and other forms of discovery via plugins.
Zen discovery is integrated with other modules; for example, all communication between nodes is done using the transport module.
It is separated into several submodules, which are explained below:
Ping
This is the process where a node uses the discovery mechanisms to find other nodes.
Seed nodes
Zen discovery uses a list of seed nodes to start the discovery process. At startup, or when electing a new master, Elasticsearch tries to connect to each seed node in its list, and holds a gossip-like conversation with them to find other nodes and to build a complete picture of the cluster. By default there are two methods for configuring the list of seed nodes: unicast and file-based. It is recommended that the list of seed nodes comprises the list of master-eligible nodes in the cluster.
Unicast
Unicast discovery configures a static list of hosts for use as seed nodes. These hosts can be specified as hostnames or IP addresses; hosts specified as hostnames are resolved to IP addresses during each round of pinging. Note that if you are in an environment where DNS resolutions vary with time, you might need to adjust your JVM security settings.
The list of hosts is set using the discovery.zen.ping.unicast.hosts static setting. This is either an array of hosts or a comma-delimited string. Each value should be in the form of host:port or host (where port defaults to the setting transport.profiles.default.port, falling back to transport.tcp.port if not set). Note that IPv6 hosts must be bracketed. The default for this setting is 127.0.0.1, [::1].
Additionally, the discovery.zen.ping.unicast.resolve_timeout setting configures the amount of time to wait for DNS lookups on each round of pinging. This is specified as a time unit and defaults to 5s.
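The accepted value forms (an array or a comma-delimited string, `host` or `host:port`, bracketed IPv6) can be illustrated with a small parser. This is only a sketch of the format described above, not Elasticsearch's own parsing code; the function names are hypothetical, and the fixed fallback port of 9300 stands in for whatever the transport port settings resolve to.

```python
from typing import List, Tuple, Union

# Assumption: 9300 stands in for the resolved transport port setting.
DEFAULT_TRANSPORT_PORT = 9300


def parse_host_entry(entry: str, default_port: int = DEFAULT_TRANSPORT_PORT) -> Tuple[str, int]:
    """Split one `host` or `host:port` value; IPv6 literals must be bracketed."""
    entry = entry.strip()
    if entry.startswith("["):
        # Bracketed IPv6 literal, optionally followed by ":port".
        host, closing, rest = entry[1:].partition("]")
        if not closing:
            raise ValueError(f"missing closing bracket: {entry!r}")
        if rest == "":
            return host, default_port
        if not rest.startswith(":"):
            raise ValueError(f"unexpected text after bracket: {entry!r}")
        return host, int(rest[1:])
    if entry.count(":") > 1:
        raise ValueError(f"IPv6 hosts must be bracketed: {entry!r}")
    host, sep, port = entry.partition(":")
    return host, int(port) if sep else default_port


def parse_unicast_hosts(value: Union[str, List[str]]) -> List[Tuple[str, int]]:
    """Accept either an array of hosts or a comma-delimited string."""
    entries = value.split(",") if isinstance(value, str) else value
    return [parse_host_entry(e) for e in entries if e.strip()]


# The default value of the setting, written as a comma-delimited string.
print(parse_unicast_hosts("127.0.0.1, [::1]"))
```

An unbracketed IPv6 address such as `::1` is rejected here because it cannot be told apart from a `host:port` pair.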
Unicast discovery uses the transport module to perform the discovery.
File-based
In addition to hosts provided by the static discovery.zen.ping.unicast.hosts setting, it is possible to provide a list of hosts via an external file.
Elasticsearch reloads this file when it changes, so that the list of seed nodes
can change dynamically without needing to restart each node. For example, this
gives a convenient mechanism for an Elasticsearch instance that is run in a
Docker container to be dynamically supplied with a list of IP addresses to
connect to for Zen discovery when those IP addresses may not be known at node
startup.
To enable file-based discovery, configure the file hosts provider as follows:
discovery.zen.hosts_provider: file
Then create a file at $ES_PATH_CONF/unicast_hosts.txt in the format described below. Any time a change is made to the unicast_hosts.txt file, the new changes will be picked up by Elasticsearch and the new hosts list will be used.
Note that the file-based discovery plugin augments the unicast hosts list in elasticsearch.yml: if there are valid unicast host entries in discovery.zen.ping.unicast.hosts then they will be used in addition to those supplied in unicast_hosts.txt.
The discovery.zen.ping.unicast.resolve_timeout setting also applies to DNS lookups for nodes specified by address via file-based discovery. This is specified as a time unit and defaults to 5s.
The format of the file is to specify one node entry per line. Each node entry consists of the host (host name or IP address) and an optional transport port number. If the port number is specified, it must come immediately after the host (on the same line), separated by a colon (:). If the port number is not specified, a default value of 9300 is used.
For example, this is a unicast_hosts.txt for a cluster with four nodes that participate in unicast discovery, some of which are not running on the default port:
10.10.10.5
10.10.10.6:9305
10.10.10.5:10005
# an IPv6 address
[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:9301
Host names are allowed instead of IP addresses (similar to discovery.zen.ping.unicast.hosts), and IPv6 addresses must be specified in brackets with the port coming after the brackets.
It is also possible to add comments to this file. All comments must appear on their own lines, starting with # (i.e. comments cannot start in the middle of a line).
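To make the file format concrete, the following sketch reads text in the unicast_hosts.txt format and returns (host, port) pairs. It is an illustration of the rules stated above, not the code Elasticsearch uses; `read_hosts_file` is a hypothetical name, and skipping blank lines is an assumption that the text does not confirm.

```python
from typing import List, Tuple

DEFAULT_PORT = 9300  # default stated in the format description


def read_hosts_file(text: str) -> List[Tuple[str, int]]:
    """Parse one node entry per line, ignoring whole-line comments."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            # Assumption: blank lines are skipped like comment lines.
            continue
        if line.startswith("["):
            # Bracketed IPv6 address; the port, if any, follows the bracket.
            host, _, tail = line[1:].partition("]")
            port = tail[1:] if tail.startswith(":") else ""
        else:
            host, _, port = line.partition(":")
        hosts.append((host, int(port) if port else DEFAULT_PORT))
    return hosts


SAMPLE = """\
10.10.10.5
10.10.10.6:9305
10.10.10.5:10005
# an IPv6 address
[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:9301
"""

for host, port in read_hosts_file(SAMPLE):
    print(host, port)
```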
Master Election
As part of the ping process, a master of the cluster is either elected or joined. This is done automatically. The discovery.zen.ping_timeout setting (which defaults to 3s) determines how long the node will wait before deciding on starting an election or joining an existing cluster. Three pings will be sent over this timeout interval. If no decision can be reached after the timeout, the pinging process restarts. In slow or congested networks, three seconds might not be enough for a node to become aware of the other nodes in its environment before making an election decision. Increasing the timeout should be done with care in that case, as it will slow down the election process. Once a node decides to join an existing formed cluster, it will send a join request to the master, with a timeout (discovery.zen.join_timeout) that defaults to 20 times the ping timeout.
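As a worked example of the relationship between the two timeouts, the sketch below computes the default join timeout from a ping timeout given in seconds. The helper name is hypothetical and the function only restates the "20 times the ping timeout" rule from the text.

```python
def default_join_timeout(ping_timeout_s: float = 3.0) -> float:
    """Default join timeout in seconds: 20 times the ping timeout."""
    if ping_timeout_s <= 0:
        raise ValueError("ping timeout must be positive")
    return 20 * ping_timeout_s


# With the default 3s ping timeout the join timeout defaults to 60 seconds.
print(default_join_timeout())
# Raising the ping timeout to 5s raises the default join timeout to 100 seconds.
print(default_join_timeout(5.0))
```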
When the master node stops or has encountered a problem, the cluster nodes start pinging again and will elect a new master. This pinging round also serves as a protection against (partial) network failures where a node may unjustly think that the master has failed. In this case the node will simply hear from other nodes about the currently active master.
If discovery.zen.master_election.ignore_non_master_pings is true, pings from nodes that are not master eligible (nodes where node.master is false) are ignored during master election; the default value is false.
Nodes can be excluded from becoming a master by setting node.master to false.
The discovery.zen.minimum_master_nodes setting defines the minimum number of master eligible nodes that need to join a newly elected master in order for an election to complete and for the elected node to accept its mastership. The same setting controls the minimum number of active master eligible nodes that should be a part of any active cluster. If this requirement is not met, the active master node will step down and a new master election will begin.
This setting must be set to a quorum of your master eligible nodes. It is recommended to avoid having only two master eligible nodes, since a quorum of two is two. Therefore, a loss of either master eligible node will result in an inoperable cluster.
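The quorum arithmetic behind this recommendation can be sketched as follows. A quorum is taken here to be a strict majority, (n / 2) + 1 with integer division; the helper names are hypothetical.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest strict majority of the master eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master eligible node")
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """How many master eligible nodes can be lost while keeping a quorum."""
    return master_eligible_nodes - quorum(master_eligible_nodes)


for n in (1, 2, 3, 5):
    print(n, quorum(n), tolerated_failures(n))
```

With two master eligible nodes the quorum is two and no failure is tolerated, which is why three is the smallest count that survives the loss of one node.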
Fault Detection
There are two fault detection processes running. The first is run by the master, which pings all the other nodes in the cluster to verify that they are alive. The second is run by each node, which pings the master to verify that it is still alive or whether an election process needs to be initiated.
The following settings control the fault detection process using the discovery.zen.fd prefix:
Setting | Description |
---|---|
ping_interval | How often a node gets pinged. Defaults to 1s. |
ping_timeout | How long to wait for a ping response. Defaults to 30s. |
ping_retries | How many ping failures / timeouts cause a node to be considered failed. Defaults to 3. |
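The role of the retry count can be sketched with a small counter. This is a simplified model, not Elasticsearch's implementation: the class name is hypothetical, the value 3 is used for illustration, and resetting the count after a successful ping is an assumption.

```python
from dataclasses import dataclass


@dataclass
class FaultDetector:
    ping_retries: int = 3  # illustrative value for the retry limit
    failures: int = 0

    def record(self, ping_succeeded: bool) -> bool:
        """Record one ping result; return True once the node counts as failed."""
        if ping_succeeded:
            # Assumption: a successful ping resets the failure count.
            self.failures = 0
        else:
            self.failures += 1
        return self.failures >= self.ping_retries


detector = FaultDetector()
outcomes = [detector.record(ok) for ok in (False, False, True, False, False, False)]
print(outcomes)
```

In this model only the third failure in a row marks the node as failed; the earlier two failures are forgotten once a ping succeeds.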
Cluster state updates
The master node is the only node in a cluster that can make changes to the cluster state. The master node processes one cluster state update at a time, applies the required changes and publishes the updated cluster state to all the other nodes in the cluster. Each node receives the publish message and acknowledges it, but does not yet apply it. If the master does not receive acknowledgement from at least discovery.zen.minimum_master_nodes nodes within a certain time (controlled by the discovery.zen.commit_timeout setting, which defaults to 30 seconds), the cluster state change is rejected.
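The commit rule in this paragraph can be modelled as a simple threshold check. The sketch below is a simplified illustration under stated assumptions: acknowledgements are represented by their arrival times in seconds since publishing started, the function name is hypothetical, and whether the master's own acknowledgement counts is not modelled.

```python
from typing import Iterable


def publish_outcome(
    ack_times_s: Iterable[float],
    minimum_master_nodes: int,
    commit_timeout_s: float = 30.0,
) -> str:
    """Return "commit" if enough acknowledgements arrived before the timeout."""
    in_time = sum(1 for t in ack_times_s if t <= commit_timeout_s)
    return "commit" if in_time >= minimum_master_nodes else "reject"


# Two of three acknowledgements arrive within 30 seconds: the change commits.
print(publish_outcome([0.1, 0.4, 45.0], minimum_master_nodes=2))
# Only one acknowledgement arrives in time: the change is rejected.
print(publish_outcome([0.1, 45.0, 50.0], minimum_master_nodes=2))
```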
Once enough nodes have responded, the cluster state is committed and a message is sent to all the nodes. The nodes then proceed to apply the new cluster state to their internal state. The master node waits for all nodes to respond, up to a timeout, before going ahead with processing the next updates in the queue. The discovery.zen.publish_timeout is set by default to 30 seconds and is measured from the moment the publishing started. Both timeout settings can be changed dynamically through the cluster update settings API.
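As an illustration of changing both timeouts dynamically, the sketch below builds a request body for the cluster update settings API (`PUT /_cluster/settings`). It only constructs and prints the JSON; sending it is left to whichever HTTP client is in use. The 60s values and the choice of `transient` rather than `persistent` are assumptions made for the example.

```python
import json


def timeout_settings_body(commit_timeout: str, publish_timeout: str) -> str:
    """Build a JSON body that updates both timeouts as transient settings."""
    body = {
        "transient": {
            "discovery.zen.commit_timeout": commit_timeout,
            "discovery.zen.publish_timeout": publish_timeout,
        }
    }
    return json.dumps(body, indent=2)


# Hypothetical values: double both timeouts from their 30 second defaults.
print("PUT /_cluster/settings")
print(timeout_settings_body("60s", "60s"))
```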
No master block
For the cluster to be fully operational, it must have an active master and the number of running master eligible nodes must satisfy the discovery.zen.minimum_master_nodes setting, if set. The discovery.zen.no_master_block setting controls which operations should be rejected when there is no active master.
The discovery.zen.no_master_block setting has two valid options:
Option | Description |
---|---|
all | All operations on the node (both reads and writes) will be rejected. This also applies to API cluster state read or write operations, like the get index settings, put mapping and cluster state APIs. |
write | (default) Write operations will be rejected. Read operations will succeed, based on the last known cluster configuration. This may result in partial reads of stale data, as this node may be isolated from the rest of the cluster. |
The discovery.zen.no_master_block setting does not apply to nodes-based APIs (for example, the cluster stats, node info and node stats APIs). Requests to these APIs will not be blocked and can run on any available node.
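The effect of the two options, `all` and `write`, can be summarised as a small decision function. This is a simplified model of the behaviour described in this section, with a hypothetical function name; it reduces every request to a plain "read" or "write" and does not model the nodes-based APIs, which are never blocked.

```python
def is_rejected(operation: str, no_master_block: str = "write") -> bool:
    """Decide whether an operation is rejected while there is no active master."""
    if operation not in ("read", "write"):
        raise ValueError(f"unknown operation: {operation!r}")
    if no_master_block == "all":
        return True  # both reads and writes are rejected
    if no_master_block == "write":
        return operation == "write"  # reads still succeed, possibly on stale data
    raise ValueError(f"unknown no_master_block value: {no_master_block!r}")


for block in ("all", "write"):
    for op in ("read", "write"):
        print(block, op, is_rejected(op, block))
```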
Single-node discovery
The discovery.type setting specifies whether Elasticsearch should form a multiple-node cluster. By default, Elasticsearch discovers other nodes when forming a cluster and allows other nodes to join the cluster later. If discovery.type is set to single-node, Elasticsearch forms a single-node cluster. For more information about when you might use this setting, see Bootstrap checks.