kuromoji_number token filter
The kuromoji_number token filter normalizes Japanese numbers (kansūji) to regular Arabic decimal numbers in half-width characters. For example:
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "filter": [
              "kuromoji_number"
            ]
          }
        }
      }
    }
  }
}

GET kuromoji_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "一〇〇〇"
}
Which results in:
{
  "tokens" : [
    {
      "token" : "1000",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "word",
      "position" : 0
    }
  ]
}
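To make the normalization more concrete, the following Python sketch converts a simple kansūji string into half-width Arabic digits. It is only an illustration of the idea, not the filter's actual implementation: `normalize_kansuji` is a hypothetical helper, and it assumes the input contains nothing but the digits 〇 to 九 and the units 十, 百 and 千. The real filter runs inside the analysis chain on tokens produced by kuromoji_tokenizer and may handle many more forms than this sketch does.

```python
# Hypothetical helper for illustration only; not part of Elasticsearch or Lucene.
KANJI_DIGITS = {
    "〇": 0, "一": 1, "二": 2, "三": 3, "四": 4,
    "五": 5, "六": 6, "七": 7, "八": 8, "九": 9,
}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}


def normalize_kansuji(text: str) -> str:
    """Convert a simple kansūji string to half-width Arabic digits.

    Assumption: only the digits and units defined above are supported.
    """
    if not text:
        raise ValueError("empty input")

    total = 0       # sum of completed digit-times-unit groups
    pending = None  # digits read since the last unit, if any

    for ch in text:
        if ch in KANJI_DIGITS:
            # Consecutive digits are read positionally, e.g. 一〇〇〇 -> 1000.
            pending = (pending or 0) * 10 + KANJI_DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A unit multiplies the digits before it; a bare unit counts as 1.
            multiplier = 1 if pending is None else pending
            total += multiplier * SMALL_UNITS[ch]
            pending = None
        else:
            raise ValueError(f"unsupported character: {ch!r}")

    return str(total + (pending or 0))


print(normalize_kansuji("一〇〇〇"))  # positional notation, as in the example above
print(normalize_kansuji("三千二百"))  # unit notation
```

For the sample text above, the sketch returns the same string as the token emitted by the filter, "1000".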