Analyze API

Analyze Request

An AnalyzeRequest contains the text to analyze, and one of several options to specify how the analysis should be performed.

The simplest version uses a built-in analyzer:

AnalyzeRequest request = new AnalyzeRequest();
request.text("Some text to analyze", "Some more text to analyze");  
request.analyzer("english");    

The text to include. Multiple strings are treated as a multi-valued field.

A built-in analyzer

You can configure a custom analyzer:

AnalyzeRequest request = new AnalyzeRequest();
request.text("<b>Some text to analyze</b>");
request.addCharFilter("html_strip");                
request.tokenizer("standard");                      
request.addTokenFilter("lowercase");                

Map<String, Object> stopFilter = new HashMap<>();
stopFilter.put("type", "stop");
stopFilter.put("stopwords", new String[]{ "to" });  
request.addTokenFilter(stopFilter);                 

Configure char filters

Configure the tokenizer

Add a built-in token filter

Configuration for a custom token filter

Add the custom token filter
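
To make the effect of such a chain concrete, the following plain-Java sketch mimics the four stages on the same input. It does not use Elasticsearch: the class AnalysisChainSketch and its regex-based stages are hypothetical, heavily simplified stand-ins for html_strip, the standard tokenizer, lowercase, and a stop filter, all of which handle far more cases in practice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the analysis chain; not Elasticsearch code.
public class AnalysisChainSketch {

    // Char filter stage: crude approximation of html_strip (replaces anything tag-like with a space).
    static String stripHtml(String text) {
        return text.replaceAll("<[^>]*>", " ");
    }

    // Tokenizer stage: crude approximation of the standard tokenizer
    // (splits on runs of characters that are neither letters nor digits).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Token filter stages: lowercase every token, then drop stopwords.
    static List<String> analyze(String text, Set<String> stopwords) {
        List<String> result = new ArrayList<>();
        for (String token : tokenize(stripHtml(text))) {
            String lowered = token.toLowerCase(Locale.ROOT);
            if (!stopwords.contains(lowered)) {
                result.add(lowered);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [some, text, analyze]
        System.out.println(analyze("<b>Some text to analyze</b>", Set.of("to")));
    }
}
```

The order of the stages matters: char filters see the raw string, the tokenizer splits it, and token filters then run over each token in the order they were added.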

You can also build a custom normalizer by including only char filters and token filters:

AnalyzeRequest request = new AnalyzeRequest();
request.text("<b>BaR</b>");
request.addTokenFilter("lowercase");
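
Because no tokenizer is configured, the input is not split into separate terms. The plain-Java sketch below illustrates that idea under simplifying assumptions; NormalizerSketch and normalize are hypothetical names that are not part of the client, and each stage is modelled as a plain function over the whole string.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical illustration of a normalizer; not Elasticsearch code.
public class NormalizerSketch {

    // Applies each stage to the whole input; the text is never split into tokens.
    static String normalize(String text, List<UnaryOperator<String>> stages) {
        String current = text;
        for (UnaryOperator<String> stage : stages) {
            current = stage.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        UnaryOperator<String> lowercase = s -> s.toLowerCase(Locale.ROOT);
        // Only a lowercase stage is configured, so the markup is kept.
        // prints <b>bar</b>
        System.out.println(normalize("<b>BaR</b>", List.of(lowercase)));
    }
}
```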

You can analyze text using an analyzer defined in an existing index:

AnalyzeRequest request = new AnalyzeRequest();
request.index("my_index");              
request.analyzer("my_analyzer");        
request.text("some text to analyze");

The index containing the mappings

The analyzer defined on this index to use

Or you can use a normalizer:

AnalyzeRequest request = new AnalyzeRequest();
request.index("my_index");                  
request.normalizer("my_normalizer");        
request.text("some text to analyze");

The index containing the mappings

The normalizer defined on this index to use

You can analyze text using the mappings for a particular field in an index:

AnalyzeRequest request = new AnalyzeRequest();
request.index("my_index");
request.field("my_field");
request.text("some text to analyze");

Optional arguments

The following arguments can also optionally be provided:

request.explain(true);                      
request.attributes("keyword", "type");      

Setting explain to true will add further details to the response

Setting attributes allows you to return only token attributes that you are interested in

Synchronous Execution

AnalyzeResponse response = client.indices().analyze(request, RequestOptions.DEFAULT);

Asynchronous Execution

The asynchronous execution of an analyze request requires both the AnalyzeRequest instance and an ActionListener instance to be passed to the asynchronous method:

client.indices().analyzeAsync(request, RequestOptions.DEFAULT, listener);

The asynchronous method does not block and returns immediately. Once the request completes, the ActionListener is called back using the onResponse method if the execution completed successfully, or using the onFailure method if it failed.

A typical listener for AnalyzeResponse looks like:

ActionListener<AnalyzeResponse> listener = new ActionListener<AnalyzeResponse>() {
    @Override
    public void onResponse(AnalyzeResponse analyzeTokens) {
        // Called when the execution completes successfully.
    }

    @Override
    public void onFailure(Exception e) {
        // Called when the execution fails.
    }
};
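
The callback contract described above can be tried out without a cluster. The following plain-Java sketch is an illustration only: Listener, runAsync, and callAndWait are hypothetical names, Listener merely copies the two-method shape of the listener shown above, and an ExecutorService stands in for the client's own threading, which may work differently.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration of the callback pattern; not Elasticsearch code.
public class AsyncListenerSketch {

    // Stand-in with the same two-method shape as the listener shown above.
    interface Listener<T> {
        void onResponse(T response);
        void onFailure(Exception e);
    }

    // Hands the work to the executor and returns immediately.
    // The listener is notified later, from the executor's thread.
    static <T> void runAsync(ExecutorService executor, Callable<T> work, Listener<T> listener) {
        executor.execute(() -> {
            T result;
            try {
                result = work.call();
            } catch (Exception e) {
                listener.onFailure(e);
                return;
            }
            listener.onResponse(result);
        });
    }

    // Runs one call and waits for its callback (for demonstration only).
    static String callAndWait(Callable<String> work) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<String> outcome = new AtomicReference<>();
        try {
            runAsync(executor, work, new Listener<String>() {
                @Override
                public void onResponse(String response) {
                    outcome.set("response: " + response);
                    done.countDown();
                }

                @Override
                public void onFailure(Exception e) {
                    outcome.set("failure: " + e.getMessage());
                    done.countDown();
                }
            });
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and fall through with whatever was recorded.
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdown();
        }
        return outcome.get();
    }

    public static void main(String[] args) {
        // prints "response: 3 tokens"
        System.out.println(callAndWait(() -> "3 tokens"));
        // prints "failure: index missing"
        System.out.println(callAndWait(() -> { throw new IllegalStateException("index missing"); }));
    }
}
```

Blocking on a latch defeats the purpose of an asynchronous call in real code; it is used here only so that the example can print the outcome before exiting.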

Analyze Response

The returned AnalyzeResponse allows you to retrieve details of the analysis as follows:

List<AnalyzeResponse.AnalyzeToken> tokens = response.getTokens();   

AnalyzeToken holds information about the individual tokens produced by analysis
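
As an illustration of the kind of per-token information an analysis typically reports, the plain-Java sketch below records a term, its character offsets, and its position for each token. The Token class and the whitespace tokenizer are hypothetical and are not the client's AnalyzeToken; the exact accessors available on AnalyzeToken should be checked against the client's Javadoc.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of per-token information; not Elasticsearch code.
public class TokenInfoSketch {

    // Hypothetical token holder: the term plus where it was found in the input.
    static final class Token {
        final String term;
        final int startOffset; // index of the first character (inclusive)
        final int endOffset;   // index just past the last character (exclusive)
        final int position;    // ordinal of the token in the stream, starting at 0

        Token(String term, int startOffset, int endOffset, int position) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
            this.position = position;
        }

        @Override
        public String toString() {
            return term + " [" + startOffset + "," + endOffset + ") @" + position;
        }
    }

    // Splits on whitespace and records offsets and positions for each token.
    static List<Token> tokenize(String text) {
        List<Token> tokens = new ArrayList<>();
        Matcher matcher = Pattern.compile("\\S+").matcher(text);
        int position = 0;
        while (matcher.find()) {
            tokens.add(new Token(matcher.group(), matcher.start(), matcher.end(), position++));
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token token : tokenize("some text to analyze")) {
            System.out.println(token);
        }
    }
}
```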

If explain was set to true, then information is instead returned from the detail() method:

DetailAnalyzeResponse detail = response.detail();                   

DetailAnalyzeResponse holds more detailed information about tokens produced by the various substeps in the analysis chain.