IMPORTANT: No additional bug fixes or documentation updates
will be released for this version. For the latest information, see the
current release documentation.
kuromoji_readingform token filter
The kuromoji_readingform token filter replaces the token with its reading
form in either katakana or romaji. It accepts the following setting:
use_romaji
    Whether romaji reading form should be output instead of katakana.
    Defaults to false.
When using the pre-defined kuromoji_readingform filter, use_romaji is set
to true. The default when defining a custom kuromoji_readingform, however,
is false. The only reason to use the custom form is if you need the
katakana reading form:
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "romaji_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "filter": ["romaji_readingform"]
          },
          "katakana_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "filter": ["katakana_readingform"]
          }
        },
        "filter": {
          "romaji_readingform": {
            "type": "kuromoji_readingform",
            "use_romaji": true
          },
          "katakana_readingform": {
            "type": "kuromoji_readingform",
            "use_romaji": false
          }
        }
      }
    }
  }
}

GET kuromoji_sample/_analyze
{
  "analyzer": "katakana_analyzer",
  "text": "寿司"
}

This request returns スシ.

GET kuromoji_sample/_analyze
{
  "analyzer": "romaji_analyzer",
  "text": "寿司"
}

This request returns sushi.
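
The pre-defined filter can also be tried without creating an index, by passing the tokenizer and filter names directly to the _analyze API. The request below is a minimal sketch that assumes the analysis-kuromoji plugin is installed on the node handling the request. It references the filter by its built-in name, kuromoji_readingform, rather than a custom definition:

```console
GET _analyze
{
  "tokenizer": "kuromoji_tokenizer",
  "filter": ["kuromoji_readingform"],
  "text": "寿司"
}
```

Because the pre-defined filter sets use_romaji to true, the token should come back in romaji, matching what romaji_analyzer produces for the same text.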