New

The executive guide to generative AI

Read more
IMPORTANT: This documentation is no longer updated. Refer to Elastic's version policy and the latest documentation.

Lowercase Tokenizer

edit

The lowercase tokenizer, like the letter tokenizer breaks text into terms whenever it encounters a character which is not a letter, but it also lowercases all terms. It is functionally equivalent to the letter tokenizer combined with the lowercase token filter, but is more efficient as it performs both steps in a single pass.

Example output

edit
POST _analyze
{
  "tokenizer": "lowercase",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}

The above sentence would produce the following terms:

[ the, quick, brown, foxes, jumped, over, the, lazy, dog, s, bone ]

Configuration

edit

The lowercase tokenizer is not configurable.

Was this helpful?
Feedback