- .NET Clients: other versions:
- Introduction
- Installation
- Breaking changes
- API Conventions
- Elasticsearch.Net - Low level client
- NEST - High level client
- Troubleshooting
- Search
- Query DSL
- Full text queries
- Term level queries
- Exists Query Usage
- Fuzzy Date Query Usage
- Fuzzy Numeric Query Usage
- Fuzzy Query Usage
- Ids Query Usage
- Prefix Query Usage
- Date Range Query Usage
- Long Range Query Usage
- Numeric Range Query Usage
- Term Range Query Usage
- Regexp Query Usage
- Term Query Usage
- Terms Set Query Usage
- Terms List Query Usage
- Terms Lookup Query Usage
- Terms Query Usage
- Wildcard Query Usage
- Compound queries
- Joining queries
- Geo queries
- Specialized queries
- Span queries
- NEST specific queries
- Aggregations
- Metric Aggregations
- Average Aggregation Usage
- Boxplot Aggregation Usage
- Cardinality Aggregation Usage
- Extended Stats Aggregation Usage
- Geo Bounds Aggregation Usage
- Geo Centroid Aggregation Usage
- Geo Line Aggregation Usage
- Max Aggregation Usage
- Median Absolute Deviation Aggregation Usage
- Min Aggregation Usage
- Percentile Ranks Aggregation Usage
- Percentiles Aggregation Usage
- Rate Aggregation Usage
- Scripted Metric Aggregation Usage
- Stats Aggregation Usage
- String Stats Aggregation Usage
- Sum Aggregation Usage
- T Test Aggregation Usage
- Top Hits Aggregation Usage
- Top Metrics Aggregation Usage
- Value Count Aggregation Usage
- Weighted Average Aggregation Usage
- Bucket Aggregations
- Adjacency Matrix Usage
- Auto Date Histogram Aggregation Usage
- Children Aggregation Usage
- Composite Aggregation Usage
- Date Histogram Aggregation Usage
- Date Range Aggregation Usage
- Diversified Sampler Aggregation Usage
- Filter Aggregation Usage
- Filters Aggregation Usage
- Geo Distance Aggregation Usage
- Geo Hash Grid Aggregation Usage
- Geo Tile Grid Aggregation Usage
- Global Aggregation Usage
- Histogram Aggregation Usage
- Ip Range Aggregation Usage
- Missing Aggregation Usage
- Multi Terms Aggregation Usage
- Nested Aggregation Usage
- Parent Aggregation Usage
- Range Aggregation Usage
- Rare Terms Aggregation Usage
- Reverse Nested Aggregation Usage
- Sampler Aggregation Usage
- Significant Terms Aggregation Usage
- Significant Text Aggregation Usage
- Terms Aggregation Usage
- Variable Width Histogram Usage
- Pipeline Aggregations
- Average Bucket Aggregation Usage
- Bucket Script Aggregation Usage
- Bucket Selector Aggregation Usage
- Bucket Sort Aggregation Usage
- Cumulative Cardinality Aggregation Usage
- Cumulative Sum Aggregation Usage
- Derivative Aggregation Usage
- Extended Stats Bucket Aggregation Usage
- Max Bucket Aggregation Usage
- Min Bucket Aggregation Usage
- Moving Average Ewma Aggregation Usage
- Moving Average Holt Linear Aggregation Usage
- Moving Average Holt Winters Aggregation Usage
- Moving Average Linear Aggregation Usage
- Moving Average Simple Aggregation Usage
- Moving Function Aggregation Usage
- Moving Percentiles Aggregation Usage
- Normalize Aggregation Usage
- Percentiles Bucket Aggregation Usage
- Serial Differencing Aggregation Usage
- Stats Bucket Aggregation Usage
- Sum Bucket Aggregation Usage
- Matrix Aggregations
- Metric Aggregations
Retries
editRetries
editBy default, NEST will retry a request as many times as there are nodes in the cluster, that the client knows about.
Retries still respects the request timeout however, meaning if you have a 100 node cluster and a request timeout of 20 seconds, the client will retry as many times as it can before giving up at the request timeout of 20 seconds.
Retry behaviour can be demonstrated using NEST’s Virtual cluster test framework. In the following example, a ten node cluster is defined that always fails on all client calls, except on port 9209
var audit = new Auditor(() => VirtualClusterWith .Nodes(10) .ClientCalls(r => r.FailAlways()) .ClientCalls(r => r.OnPort(9209).SucceedAlways()) .StaticConnectionPool() .Settings(s => s.DisablePing()) );
The trace of a client call shows that a bad response is received from nodes 9200 to 9208, finally returning a healthy response from the node on port 9209
audit = await audit.TraceCall( new ClientCall { { BadResponse, 9200 }, { BadResponse, 9201 }, { BadResponse, 9202 }, { BadResponse, 9203 }, { BadResponse, 9204 }, { BadResponse, 9205 }, { BadResponse, 9206 }, { BadResponse, 9207 }, { BadResponse, 9208 }, { HealthyResponse, 9209 } } );
Maximum number of retries
editWhen you have a 100 node cluster for example, you might want to ensure that retries occur only
a fixed number of times. This can be done using MaximumRetries(n)
on ConnectionSettings
the actual number of requests is initial attempt + set number of retries
var audit = new Auditor(() => VirtualClusterWith .Nodes(10) .ClientCalls(r => r.FailAlways()) .ClientCalls(r => r.OnPort(9209).SucceedAlways()) .StaticConnectionPool() .Settings(s => s.DisablePing().MaximumRetries(3)) ); audit = await audit.TraceCall( new ClientCall { { BadResponse, 9200 }, { BadResponse, 9201 }, { BadResponse, 9202 }, { BadResponse, 9203 }, { MaxRetriesReached } } );
Set the maximum number of retries to 3 |
|
The client call trace returns an |
In our previous example we simulated very fast failures, but in the real world, a call might take upwards of a second.
In this next example, we simulate a particularly heavy search that takes 10 seconds to fail, and set a request timeout of 20 seconds. We see that the request is tried twice and gives up before a third call is attempted, since the call takes 10 seconds and thus can be tried twice (initial call and one retry) before the request timeout.
var audit = new Auditor(() => VirtualClusterWith .Nodes(10) .ClientCalls(r => r.FailAlways().Takes(TimeSpan.FromSeconds(10))) .ClientCalls(r => r.OnPort(9209).SucceedAlways()) .StaticConnectionPool() .Settings(s => s.DisablePing().RequestTimeout(TimeSpan.FromSeconds(20))) ); audit = await audit.TraceCall( new ClientCall { { BadResponse, 9200 }, { BadResponse, 9201 }, { MaxTimeoutReached } } );
Maximum retry timeout
editIf you set a smaller request timeout you might not want it to also affect the retry timeout.
In cases like this, you can configure the MaxRetryTimeout
separately.
Here we simulate calls taking 3 seconds, a request timeout of 2 seconds and a max retry timeout of 10 seconds.
We should see 5 attempts to perform this query, testing that our request timeout cuts the query off short and that
our max retry timeout of 10 seconds wins over the configured request timeout
var audit = new Auditor(() => VirtualClusterWith .Nodes(10) .ClientCalls(r => r.FailAlways().Takes(TimeSpan.FromSeconds(3))) .ClientCalls(r => r.OnPort(9209).FailAlways()) .StaticConnectionPool() .Settings(s => s.DisablePing().RequestTimeout(TimeSpan.FromSeconds(2)).MaxRetryTimeout(TimeSpan.FromSeconds(10))) ); audit = await audit.TraceCall( new ClientCall { { BadResponse, 9200 }, { BadResponse, 9201 }, { BadResponse, 9202 }, { BadResponse, 9203 }, { BadResponse, 9204 }, { MaxTimeoutReached } } );
If your retry policy expands beyond the number of available nodes, the client won’t retry the same node twice
var audit = new Auditor(() => VirtualClusterWith .Nodes(2) .ClientCalls(r => r.FailAlways().Takes(TimeSpan.FromSeconds(3))) .ClientCalls(r => r.OnPort(9209).SucceedAlways()) .StaticConnectionPool() .Settings(s => s.DisablePing().RequestTimeout(TimeSpan.FromSeconds(2)).MaxRetryTimeout(TimeSpan.FromSeconds(10))) ); audit = await audit.TraceCall( new ClientCall { { BadResponse, 9200 }, { BadResponse, 9201 }, { MaxRetriesReached }, { FailedOverAllNodes } } );
This makes setting any retry setting on a single node connection pool a no-op by design! Connection pooling and failover is all about trying to fail sanely whilst still utilizing the available resources and not giving up on the fail fast principle; It is NOT a mechanism for forcing requests to succeed.
var audit = new Auditor(() => VirtualClusterWith .Nodes(10) .ClientCalls(r => r.FailAlways().Takes(TimeSpan.FromSeconds(3))) .ClientCalls(r => r.OnPort(9209).SucceedAlways()) .SingleNodeConnection() .Settings(s => s.DisablePing().MaximumRetries(10)) ); audit = await audit.TraceCall( new ClientCall { { BadResponse, 9200 } } );
On this page