WARNING: Version 0.90 of Elasticsearch has passed its EOL date.
This documentation is no longer being maintained and may be removed. If you are running this version, we strongly advise you to upgrade. For the latest information, see the current release documentation.
Snowball Analyzer
editSnowball Analyzer
editAn analyzer of type snowball
that uses the
standard
tokenizer, with
standard
filter,
lowercase
filter,
stop
filter, and
snowball
filter.
The Snowball Analyzer is a stemming analyzer from Lucene that is originally based on the snowball project from snowball.tartarus.org.
Sample usage:
{ "index" : { "analysis" : { "analyzer" : { "my_analyzer" : { "type" : "snowball", "language" : "English" } } } } }
The language
parameter can have the same values as the
snowball
filter and defaults to English
. Note that not all the language
analyzers have a default set of stopwords provided.
The stopwords
parameter can be used to provide stopwords for the
languages that has no defaults, or to simply replace the default set
with your custom list. A default set of stopwords for many of these
languages is available from for instance
here
and
here.
A sample configuration (in YAML format) specifying Swedish with stopwords:
index : analysis : analyzer : my_analyzer: type: snowball language: Swedish stopwords: "och,det,att,i,en,jag,hon,som,han,på,den,med,var,sig,för,så,till,är,men,ett,om,hade,de,av,icke,mig,du,henne,då,sin,nu,har,inte,hans,honom,skulle,hennes,där,min,man,ej,vid,kunde,något,från,ut,när,efter,upp,vi,dem,vara,vad,över,än,dig,kan,sina,här,ha,mot,alla,under,någon,allt,mycket,sedan,ju,denna,själv,detta,åt,utan,varit,hur,ingen,mitt,ni,bli,blev,oss,din,dessa,några,deras,blir,mina,samma,vilken,er,sådan,vår,blivit,dess,inom,mellan,sådant,varför,varje,vilka,ditt,vem,vilket,sitta,sådana,vart,dina,vars,vårt,våra,ert,era,vilkas"