- Plugins and Integrations: other versions:
- Introduction to plugins
- Plugin management
- API extension plugins
- Analysis plugins
- ICU analysis plugin
- Japanese (kuromoji) analysis plugin
kuromoji
analyzerkuromoji_iteration_mark
character filterkuromoji_tokenizer
kuromoji_baseform
token filterkuromoji_part_of_speech
token filterkuromoji_readingform
token filterkuromoji_stemmer
token filterja_stop
token filterkuromoji_number
token filterhiragana_uppercase
token filterkatakana_uppercase
token filterkuromoji_completion
token filter
- Korean (nori) analysis plugin
- Phonetic analysis plugin
- Smart Chinese analysis plugin
- Stempel Polish analysis plugin
- Ukrainian analysis plugin
- Discovery plugins
- Mapper plugins
- Snapshot/restore repository plugins
- Store plugins
- Integrations
- Creating an Elasticsearch plugin
kuromoji_baseform token filter
editkuromoji_baseform
token filter
editThe kuromoji_baseform
token filter replaces terms with their
BaseFormAttribute. This acts as a lemmatizer for verbs and adjectives. Example:
PUT kuromoji_sample { "settings": { "index": { "analysis": { "analyzer": { "my_analyzer": { "tokenizer": "kuromoji_tokenizer", "filter": [ "kuromoji_baseform" ] } } } } } } GET kuromoji_sample/_analyze { "analyzer": "my_analyzer", "text": "飲み" }
which responds with:
{ "tokens" : [ { "token" : "飲む", "start_offset" : 0, "end_offset" : 2, "type" : "word", "position" : 0 } ] }
Was this helpful?
Thank you for your feedback.