Scrolling
editScrolling
editThe Scrolling functionality of Elasticsearch is used to paginate over many documents in a bulk manner, such as exporting all the documents belonging to a single user. It is more efficient than regular search because it doesn’t need to maintain an expensive priority queue ordering the documents.
Scrolling works by maintaining a "point in time" snapshot of the index which is then used to page over.
This window allows consistent paging even if there is background indexing/updating/deleting. First, you execute a search
request with scroll
enabled. This returns a "page" of documents, and a scroll_id which is used to continue
paginating through the hits.
More details about scrolling can be found in the Link: reference documentation.
This is an example which can be used as a template for more advanced operations:
$client = ClientBuilder::create()->build(); $params = [ "scroll" => "30s", // how long between scroll requests. should be small! "size" => 50, // how many results *per shard* you want back "index" => "my_index", "body" => [ "query" => [ "match_all" => new \stdClass() ] ] ]; // Execute the search // The response will contain the first batch of documents // and a scroll_id $response = $client->search($params); // Now we loop until the scroll "cursors" are exhausted while (isset($response['hits']['hits']) && count($response['hits']['hits']) > 0) { // ** // Do your work here, on the $response['hits']['hits'] array // ** // When done, get the new scroll_id // You must always refresh your _scroll_id! It can change sometimes $scroll_id = $response['_scroll_id']; // Execute a Scroll request and repeat $response = $client->scroll([ "scroll_id" => $scroll_id, //...using our previously obtained _scroll_id "scroll" => "30s" // and the same timeout window ] ); }