kuromoji_readingform token filter
The kuromoji_readingform token filter replaces the token with its reading form in either katakana or romaji. It accepts the following setting:

- use_romaji
  Whether the romaji reading form should be output instead of katakana. Defaults to false.

When using the pre-defined kuromoji_readingform filter, use_romaji is set to true. The default when defining a custom kuromoji_readingform, however, is false. The only reason to use the custom form is if you need the katakana reading form:
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "romaji_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "filter": [ "romaji_readingform" ]
          },
          "katakana_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "filter": [ "katakana_readingform" ]
          }
        },
        "filter": {
          "romaji_readingform": {
            "type": "kuromoji_readingform",
            "use_romaji": true
          },
          "katakana_readingform": {
            "type": "kuromoji_readingform",
            "use_romaji": false
          }
        }
      }
    }
  }
}

GET kuromoji_sample/_analyze
{
  "analyzer": "katakana_analyzer",
  "text": "寿司"
}

GET kuromoji_sample/_analyze
{
  "analyzer": "romaji_analyzer",
  "text": "寿司"
}
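The choice between the pre-defined filter and a custom one can also be made when the request body is built in client code. The following is a minimal sketch, not an official client API: the function readingform_settings, the analyzer name reading_analyzer, and the filter name katakana_readingform are hypothetical names chosen for illustration. It only builds the JSON body for the index-creation request and does not contact a cluster.

```python
import json


def readingform_settings(use_romaji: bool) -> dict:
    """Build an index-creation body with one reading-form analyzer.

    Hypothetical helper for illustration only. It relies on the behaviour
    described above: the pre-defined kuromoji_readingform filter outputs
    romaji, so a custom filter is defined only for katakana output.
    """
    analysis = {
        "analyzer": {
            "reading_analyzer": {  # hypothetical analyzer name
                "tokenizer": "kuromoji_tokenizer",
                "filter": [],
            }
        }
    }
    if use_romaji:
        # Reference the pre-defined filter by name; no custom definition needed.
        filter_name = "kuromoji_readingform"
    else:
        # Define a custom filter with use_romaji set to false for katakana.
        filter_name = "katakana_readingform"  # hypothetical filter name
        analysis["filter"] = {
            filter_name: {"type": "kuromoji_readingform", "use_romaji": False}
        }
    analysis["analyzer"]["reading_analyzer"]["filter"].append(filter_name)
    return {"settings": {"index": {"analysis": analysis}}}


if __name__ == "__main__":
    # Print the body that would be sent with PUT <index-name>.
    print(json.dumps(readingform_settings(use_romaji=False), indent=2))
```

The resulting dictionary can be serialized and sent as the body of the PUT request with any HTTP client. Keeping the romaji case on the pre-defined filter avoids an extra custom filter definition in the index settings.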