ICU transform token filter
Transforms are used to process Unicode text in many different ways, such as case mapping, normalization, transliteration and bidirectional text handling.
You can define which transformation to apply with the id parameter (defaults to Null), and specify the text direction with the dir parameter, which accepts forward (default) for LTR and reverse for RTL. Custom rulesets are not yet supported.
For example:
PUT icu_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "latin": {
            "tokenizer": "keyword",
            "filter": [
              "myLatinTransform"
            ]
          }
        },
        "filter": {
          "myLatinTransform": {
            "type": "icu_transform",
            "id": "Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC"
          }
        }
      }
    }
  }
}

GET icu_sample/_analyze
{
  "analyzer": "latin",
  "text": "你好"
}

GET icu_sample/_analyze
{
  "analyzer": "latin",
  "text": "здравствуйте"
}

GET icu_sample/_analyze
{
  "analyzer": "latin",
  "text": "こんにちは"
}
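The myLatinTransform filter transliterates characters to Latin (Any-Latin), separates accents from their base characters (NFD), removes the accents ([:Nonspacing Mark:] Remove), and then recomposes the remaining text (NFC), leaving an unaccented form.
Each of the three _analyze requests returns its input text transliterated to unaccented Latin.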
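The last three steps of the rule chain in the example (NFD, removal of nonspacing marks, NFC) can be approximated outside Elasticsearch to see what they do to already-Latin text. The following is a minimal sketch using only Python's standard unicodedata module. It is not ICU and does not perform the Any-Latin transliteration step; the function name strip_nonspacing_marks is hypothetical, and the sketch assumes that ICU's [:Nonspacing Mark:] set corresponds to the Unicode general category Mn.

```python
import unicodedata


def strip_nonspacing_marks(text: str) -> str:
    """Approximate "NFD; [:Nonspacing Mark:] Remove; NFC" (assumption: not ICU itself)."""
    # NFD: split each accented character into its base character plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Remove: drop every character whose general category is Mn (nonspacing mark).
    without_marks = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # NFC: recompose whatever can still be composed.
    return unicodedata.normalize("NFC", without_marks)


if __name__ == "__main__":
    # U+01D0 and U+01CE are i and a with caron, as in accented pinyin.
    print(strip_nonspacing_marks("n\u01d0 h\u01ceo"))  # ni hao
    print(strip_nonspacing_marks("caf\u00e9"))  # cafe
```

Text that contains no combining marks passes through these three steps unchanged, so in the filter above they only affect accents introduced by the transliteration step or already present in the input.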
For more documentation, see the ICU Transform user guide.