- Elasticsearch Guide: other versions:
- Getting Started
- Setup
- Breaking changes
- API Conventions
- Document APIs
- Search APIs
- Search
- URI Search
- Request Body Search
- Search Template
- Search Shards API
- Aggregations
- Min Aggregation
- Max Aggregation
- Sum Aggregation
- Avg Aggregation
- Stats Aggregation
- Extended Stats Aggregation
- Value Count Aggregation
- Percentiles Aggregation
- Percentile Ranks Aggregation
- Cardinality Aggregation
- Geo Bounds Aggregation
- Top hits Aggregation
- Scripted Metric Aggregation
- Global Aggregation
- Filter Aggregation
- Filters Aggregation
- Missing Aggregation
- Nested Aggregation
- Reverse nested Aggregation
- Children Aggregation
- Terms Aggregation
- Significant Terms Aggregation
- Range Aggregation
- Date Range Aggregation
- IPv4 Range Aggregation
- Histogram Aggregation
- Date Histogram Aggregation
- Geo Distance Aggregation
- GeoHash grid Aggregation
- Facets
- Suggesters
- Multi Search API
- Count API
- Search Exists API
- Validate API
- Explain API
- Percolator
- More Like This API
- Indices APIs
- Create Index
- Delete Index
- Get Index
- Indices Exists
- Open / Close Index API
- Put Mapping
- Get Mapping
- Get Field Mapping
- Types Exists
- Delete Mapping
- Index Aliases
- Update Indices Settings
- Get Settings
- Analyze
- Index Templates
- Warmers
- Status
- Indices Stats
- Indices Segments
- Indices Recovery
- Clear Cache
- Flush
- Refresh
- Optimize
- Upgrade
- cat APIs
- Cluster APIs
- Query DSL
- Queries
- Match Query
- Multi Match Query
- Bool Query
- Boosting Query
- Common Terms Query
- Constant Score Query
- Dis Max Query
- Filtered Query
- Fuzzy Like This Query
- Fuzzy Like This Field Query
- Function Score Query
- Fuzzy Query
- GeoShape Query
- Has Child Query
- Has Parent Query
- Ids Query
- Indices Query
- Match All Query
- More Like This Query
- More Like This Field Query
- Nested Query
- Prefix Query
- Query String Query
- Simple Query String Query
- Range Query
- Regexp Query
- Span First Query
- Span Multi Term Query
- Span Near Query
- Span Not Query
- Span Or Query
- Span Term Query
- Term Query
- Terms Query
- Top Children Query
- Wildcard Query
- Minimum Should Match
- Multi Term Query Rewrite
- Template Query
- Filters
- And Filter
- Bool Filter
- Exists Filter
- Geo Bounding Box Filter
- Geo Distance Filter
- Geo Distance Range Filter
- Geo Polygon Filter
- GeoShape Filter
- Geohash Cell Filter
- Has Child Filter
- Has Parent Filter
- Ids Filter
- Indices Filter
- Limit Filter
- Match All Filter
- Missing Filter
- Nested Filter
- Not Filter
- Or Filter
- Prefix Filter
- Query Filter
- Range Filter
- Regexp Filter
- Script Filter
- Term Filter
- Terms Filter
- Type Filter
- Queries
- Mapping
- Analysis
- Analyzers
- Tokenizers
- Token Filters
- Standard Token Filter
- ASCII Folding Token Filter
- Length Token Filter
- Lowercase Token Filter
- Uppercase Token Filter
- NGram Token Filter
- Edge NGram Token Filter
- Porter Stem Token Filter
- Shingle Token Filter
- Stop Token Filter
- Word Delimiter Token Filter
- Stemmer Token Filter
- Stemmer Override Token Filter
- Keyword Marker Token Filter
- Keyword Repeat Token Filter
- KStem Token Filter
- Snowball Token Filter
- Phonetic Token Filter
- Synonym Token Filter
- Compound Word Token Filter
- Reverse Token Filter
- Elision Token Filter
- Truncate Token Filter
- Unique Token Filter
- Pattern Capture Token Filter
- Pattern Replace Token Filter
- Trim Token Filter
- Limit Token Count Token Filter
- Hunspell Token Filter
- Common Grams Token Filter
- Normalization Token Filter
- CJK Width Token Filter
- CJK Bigram Token Filter
- Delimited Payload Token Filter
- Keep Words Token Filter
- Keep Types Token Filter
- Classic Token Filter
- Apostrophe Token Filter
- Character Filters
- ICU Analysis Plugin
- Modules
- Index Modules
- Testing
- Glossary of terms
WARNING: Version 1.4 of Elasticsearch has passed its EOL date.
This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.
Store
editStore
editThe store module allows you to control how index data is stored.
The index can either be stored in-memory (no persistence) or on-disk (the default). In-memory indices provide better performance at the cost of limiting the index size to the amount of available physical memory.
When using a local gateway (the default), file system storage with no in memory storage is required to maintain index consistency. This is required since the local gateway constructs its state from the local index state of each node.
Another important aspect of memory based storage is the fact that Elasticsearch supports storing the index in memory outside of the JVM heap space using the "Memory" (see below) storage type. It translates to the fact that there is no need for extra large JVM heaps (with their own consequences) for storing the index in memory.
Store Level Throttling
editThe way Lucene, the IR library elasticsearch uses under the covers, works is by creating immutable segments (up to deletes) and constantly merging them (the merge policy settings allow to control how those merges happen). The merge process happens in an asynchronous manner without affecting the indexing / search speed. The problem though, especially on systems with low IO, is that the merge process can be expensive and affect search / index operation simply by the fact that the box is now taxed with more IO happening.
The store module allows to have throttling configured for merges (or
all) either on the node level, or on the index level. The node level
throttling will make sure that out of all the shards allocated on that
node, the merge process won’t pass the specific setting bytes per
second. It can be set by setting indices.store.throttle.type
to
merge
, and setting indices.store.throttle.max_bytes_per_sec
to
something like 5mb
. The node level settings can be changed dynamically
using the cluster update settings API. The default is set
to 20mb
with type merge
.
If specific index level configuration is needed, regardless of the node
level settings, it can be set as well using the
index.store.throttle.type
, and
index.store.throttle.max_bytes_per_sec
. The default value for the type
is node
, meaning it will throttle based on the node level settings and
participate in the global throttling happening. Both settings can be set
using the index update settings API dynamically.
File system storage types
editFile system based storage is the default storage used. There are
different implementations or storage types. The best one for the
operating environment will be automatically chosen: mmapfs
on
Windows 64bit, simplefs
on Windows 32bit, and default
(hybrid niofs
and mmapfs
) for the rest.
This can be overridden for all indices by adding this to the
config/elasticsearch.yml
file:
index.store.type: niofs
It can also be set on a per-index basis at index creation time:
curl -XPUT localhost:9200/my_index -d '{ "settings": { "index.store.type": "niofs" } }';
The following sections lists all the different storage types supported.
Simple FS
editThe simplefs
type is a straightforward implementation of file system
storage (maps to Lucene SimpleFsDirectory
) using a random access file.
This implementation has poor concurrent performance (multiple threads
will bottleneck). It is usually better to use the niofs
when you need
index persistence.
NIO FS
editThe niofs
type stores the shard index on the file system (maps to
Lucene NIOFSDirectory
) using NIO. It allows multiple threads to read
from the same file concurrently. It is not recommended on Windows
because of a bug in the SUN Java implementation.
MMap FS
editThe mmapfs
type stores the shard index on the file system (maps to
Lucene MMapDirectory
) by mapping a file into memory (mmap). Memory
mapping uses up a portion of the virtual memory address space in your
process equal to the size of the file being mapped. Before using this
class, be sure your have plenty of virtual address space.
See Virtual memory
Hybrid MMap / NIO FS
editThe default
type stores the shard index on the file system depending on
the file type by mapping a file into memory (mmap) or using Java NIO. Currently
only the Lucene term dictionary and doc values files are memory mapped to reduce
the impact on the operating system. All other files are opened using Lucene NIOFSDirectory
.
Address space settings (Virtual memory) might also apply if your term
dictionaries are large.
Memory
editThe memory
type stores the index in main memory, using Lucene’s
RamIndexStore
.
On this page