Field Collapsing

Field collapsing lets you collapse search results based on field values. Collapsing selects only the top sorted document per collapse key. For instance, the query below retrieves the best tweet for each user and sorts the results by number of likes.

GET /twitter/tweet/_search
{
    "query": {
        "match": {
            "message": "elasticsearch"
        }
    },
    "collapse" : {
        "field" : "user" 
    },
    "sort": ["likes"], 
    "from": 10 
}

collapse.field: collapses the result set using the "user" field

sort: sorts the top documents by number of likes

from: defines the offset of the first collapsed result

The total number of hits in the response indicates the number of matching documents without collapsing. The total number of distinct groups is unknown.
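The collapsing behaviour described above can be illustrated with a small in-memory simulation. This is only a sketch of the semantics, not how Elasticsearch implements it: the function `collapse_hits`, its parameters, and the sample tweets are hypothetical, and the sketch assumes a descending sort on likes so that "best" means most liked.

```python
from typing import Any, Dict, List, Tuple

Doc = Dict[str, Any]

def collapse_hits(matches: List[Doc], field: str, sort_key: str,
                  descending: bool = True, offset: int = 0,
                  size: int = 10) -> Tuple[int, List[Doc]]:
    """Return (total, hits): total counts every matching document,
    hits holds the top sorted document per distinct value of `field`."""
    ranked = sorted(matches, key=lambda d: d[sort_key], reverse=descending)
    seen = set()
    collapsed = []
    for doc in ranked:
        key = doc[field]
        if key in seen:
            continue  # a better-ranked document already represents this key
        seen.add(key)
        collapsed.append(doc)
    # `offset` plays the role of "from": it skips collapsed results, not raw documents
    return len(matches), collapsed[offset:offset + size]

# Hypothetical sample data standing in for documents matching the query
tweets = [
    {"user": "kimchy", "likes": 12, "message": "elasticsearch rocks"},
    {"user": "kimchy", "likes": 3, "message": "more elasticsearch"},
    {"user": "alice", "likes": 7, "message": "trying elasticsearch"},
    {"user": "bob", "likes": 9, "message": "elasticsearch tips"},
    {"user": "alice", "likes": 1, "message": "elasticsearch again"},
]

total, hits = collapse_hits(tweets, field="user", sort_key="likes")
print(total)                      # 5: all matching documents are counted
print([h["user"] for h in hits])  # ['kimchy', 'bob', 'alice']
```

Note that `total` is 5 even though only three collapsed hits come back, which mirrors the statement that the hit total ignores collapsing.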

The field used for collapsing must be a single-valued keyword or numeric field with doc_values enabled.
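As a rough illustration of that requirement, the following sketch checks a field mapping fragment before it is used as a collapse key. The helper `looks_collapsible` is hypothetical, the list of numeric types is an assumption to verify against your version's mapping documentation, and the sketch assumes doc_values is enabled when the mapping does not mention it. It cannot tell whether a field is single-valued, because mappings do not record that.

```python
from typing import Any, Dict

# Numeric field types assumed for this sketch; verify against your version.
NUMERIC_TYPES = {"long", "integer", "short", "byte",
                 "double", "float", "half_float", "scaled_float"}

def looks_collapsible(field_mapping: Dict[str, Any]) -> bool:
    """Hypothetical helper: True if the mapping is keyword or numeric
    and doc_values has not been switched off."""
    field_type = field_mapping.get("type")
    is_supported_type = field_type == "keyword" or field_type in NUMERIC_TYPES
    # Assumption: doc_values counts as enabled when the key is absent.
    doc_values_on = field_mapping.get("doc_values", True) is not False
    return is_supported_type and doc_values_on

print(looks_collapsible({"type": "keyword"}))                    # True
print(looks_collapsible({"type": "text"}))                       # False
print(looks_collapsible({"type": "long", "doc_values": False}))  # False
```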

The collapsing is applied to the top hits only and does not affect aggregations.

Expand collapse results

It is also possible to expand each collapsed top hit with the inner_hits option.

GET /twitter/tweet/_search
{
    "query": {
        "match": {
            "message": "elasticsearch"
        }
    },
    "collapse" : {
        "field" : "user", 
        "inner_hits": {
            "name": "last_tweets", 
            "size": 5, 
            "sort": [{ "date": "asc" }] 
        },
        "max_concurrent_group_searches": 4 
    },
    "sort": ["likes"]
}

collapse.field: collapses the result set using the "user" field

inner_hits.name: the name used for the inner hit section in the response

inner_hits.size: the number of inner_hits to retrieve per collapse key

inner_hits.sort: how to sort the documents inside each group

max_concurrent_group_searches: the number of concurrent requests allowed to retrieve the inner_hits per group

See inner hits for the complete list of supported options and the format of the response.
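What the inner_hits options select for a single group can be sketched in plain Python. This is a simplified in-memory model under assumptions: `expand_group` and the sample tweets are hypothetical, and the returned list does not reproduce the actual response format.

```python
from typing import Any, Dict, List

Doc = Dict[str, Any]

def expand_group(matches: List[Doc], field: str, key: Any,
                 sort_key: str, size: int) -> List[Doc]:
    """Return up to `size` documents sharing `key`, sorted ascending by `sort_key`."""
    group = [d for d in matches if d[field] == key]
    return sorted(group, key=lambda d: d[sort_key])[:size]

# Hypothetical documents matching the query
tweets = [
    {"user": "kimchy", "likes": 12, "date": "2017-01-03"},
    {"user": "kimchy", "likes": 3, "date": "2017-01-01"},
    {"user": "alice", "likes": 7, "date": "2017-01-02"},
    {"user": "kimchy", "likes": 5, "date": "2017-01-02"},
]

# The collapsed top hit for one user; its "user" value is the collapse key
collapsed_hit = {"user": "kimchy", "likes": 12, "date": "2017-01-03"}

# Mirrors "size": 5 and "sort": [{ "date": "asc" }] from the request above
last_tweets = expand_group(tweets, "user", collapsed_hit["user"], "date", size=5)
print([t["date"] for t in last_tweets])
# ['2017-01-01', '2017-01-02', '2017-01-03']
```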

The expansion of the group is done by sending an additional query for each collapsed hit returned in the response. The max_concurrent_group_searches request parameter can be used to control the maximum number of concurrent searches allowed in this phase. The default is based on the number of data nodes and the default search thread pool size.
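The effect of capping concurrent group searches can be sketched with a bounded thread pool. This is only an analogy under stated assumptions: `search_group` is a hypothetical stand-in for the additional per-hit query, `expand_all` and the sample data are invented for illustration, and Elasticsearch's own scheduling is not shown.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List

Doc = Dict[str, Any]

# Hypothetical documents matching the query
TWEETS: List[Doc] = [
    {"user": "kimchy", "date": "2017-01-03"},
    {"user": "kimchy", "date": "2017-01-01"},
    {"user": "alice", "date": "2017-01-02"},
]

def search_group(user: str) -> List[Doc]:
    """Stand-in for the extra query sent for one collapsed hit."""
    group = [t for t in TWEETS if t["user"] == user]
    return sorted(group, key=lambda t: t["date"])[:5]

def expand_all(collapse_keys: List[str],
               max_concurrent_group_searches: int) -> Dict[str, List[Doc]]:
    """Run one search per collapse key, with at most
    `max_concurrent_group_searches` running at the same time."""
    with ThreadPoolExecutor(max_workers=max_concurrent_group_searches) as pool:
        # pool.map returns results in the same order as collapse_keys
        results = list(pool.map(search_group, collapse_keys))
    return dict(zip(collapse_keys, results))

expanded = expand_all(["kimchy", "alice"], max_concurrent_group_searches=4)
print({user: len(docs) for user, docs in expanded.items()})
# {'kimchy': 2, 'alice': 1}
```

A lower limit reduces the load this phase places on the cluster at the cost of a longer expansion when many collapsed hits are returned.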

collapse cannot be used in conjunction with scroll, rescore or search after.