Normalization Token Filter

Several token filters are available that normalize the special characters of a particular language. The supported filters are listed below, followed by a sketch of how one is wired into a custom analyzer.

Arabic: arabic_normalization
German: german_normalization (added in 1.3.0)
Hindi: hindi_normalization (added in 1.3.0)
Indic: indic_normalization (added in 1.3.0)
Kurdish (Sorani): sorani_normalization (added in 1.3.0)
Persian: persian_normalization
Scandinavian: scandinavian_normalization (added in 1.3.0), scandinavian_folding (added in 1.3.0)
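
These filters are used like any other token filter, most commonly inside a custom analyzer defined in the index settings. The following is a minimal illustrative sketch: the index name my_index and the analyzer name german_normalized are placeholders, and any filter from the list above could be substituted for german_normalization.

PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "german_normalized": {
          "tokenizer": "standard",
          "filter": ["lowercase", "german_normalization"]
        }
      }
    }
  }
}

The resulting analyzer can then be inspected with the _analyze API, for example:

GET /my_index/_analyze?analyzer=german_normalized&text=Schön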