WARNING: Version 5.0 of Elasticsearch has passed its EOL date.
This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.
Analyzers
Elasticsearch ships with a wide range of built-in analyzers, which can be used in any index without further configuration:
- Standard Analyzer: The standard analyzer divides text into terms on word boundaries, as defined by the Unicode Text Segmentation algorithm. It removes most punctuation, lowercases terms, and supports removing stop words.
- Simple Analyzer: The simple analyzer divides text into terms whenever it encounters a character which is not a letter. It lowercases all terms.
- Whitespace Analyzer: The whitespace analyzer divides text into terms whenever it encounters any whitespace character. It does not lowercase terms.
- Stop Analyzer: The stop analyzer is like the simple analyzer, but also supports removal of stop words.
- Keyword Analyzer: The keyword analyzer is a “noop” analyzer that accepts whatever text it is given and outputs the exact same text as a single term.
- Pattern Analyzer: The pattern analyzer uses a regular expression to split the text into terms. It supports lower-casing and stop words.
- Language Analyzers: Elasticsearch provides many language-specific analyzers like english or french.
- Fingerprint Analyzer: The fingerprint analyzer is a specialist analyzer which creates a fingerprint which can be used for duplicate detection.
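Any of these analyzers can be tried out directly with the _analyze API. The request below is a minimal sketch that runs the standard analyzer over an arbitrary sample sentence (the text itself is only illustrative):

```
POST _analyze
{
  "analyzer": "standard",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
```

The response lists the terms produced, in this case lowercased tokens with most punctuation removed.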
Custom analyzers

If you do not find an analyzer suitable for your needs, you can create a custom analyzer which combines the appropriate character filters, tokenizer, and token filters.
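A custom analyzer is defined under the index's analysis settings. The sketch below is one possible combination, not a prescribed one; the index name my_index and analyzer name my_custom_analyzer are placeholders. It pairs the html_strip character filter with the standard tokenizer and the lowercase and asciifolding token filters:

```
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "char_filter": [ "html_strip" ],
          "tokenizer": "standard",
          "filter": [ "lowercase", "asciifolding" ]
        }
      }
    }
  }
}
```

The analyzer can then be referenced by name in a field mapping for that index, or tested against the index with the _analyze API.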