Processing and performance


This documentation refers to the standalone (legacy) method of running APM Server. This method of running APM Server will be deprecated and removed in a future release. Please consider upgrading to the Elastic APM integration. If you’ve already upgraded, please see Processing and performance instead.

APM Server performance depends on a number of factors: memory and CPU available, network latency, transaction sizes, workload patterns, agent and server settings, versions, and protocol.

Let’s look at a simple example that makes the following assumptions:

  • The load is generated in the same region as where APM Server and Elasticsearch are deployed.
  • We’re using the default settings in Elastic Cloud.
  • A small number of agents are reporting.

This leaves us with relevant variables like payload and instance sizes. See the table below for approximations. As a reminder, events are transactions and spans.

Transaction/Instance                                        512 MB Instance     2 GB Instance        8 GB Instance

Small transactions (5 spans with 5 stack frames each)       600 events/second   1200 events/second   4800 events/second
Medium transactions (15 spans with 15 stack frames each)    300 events/second   600 events/second    2400 events/second
Large transactions (30 spans with 30 stack frames each)     150 events/second   300 events/second    1400 events/second

In other words, a 512 MB instance can process ~3 MB per second, while an 8 GB instance can process ~20 MB per second.

APM Server is CPU bound, so it scales better from 2 GB to 8 GB than it does from 512 MB to 2 GB. This is because larger instance types in Elastic Cloud come with much more computing power.
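The scaling difference can be checked directly against the approximate figures in the table above. The following is a minimal sketch in Python; the `THROUGHPUT` dictionary and the `scaling_factor` helper are illustrative names, not part of APM Server, and the numbers are the table's approximations rather than guarantees.

```python
# Approximate events/second from the table above, keyed by instance memory in MB.
# These are the documented approximations, not measured guarantees.
THROUGHPUT = {
    "small": {512: 600, 2048: 1200, 8192: 4800},
    "medium": {512: 300, 2048: 600, 8192: 2400},
    "large": {512: 150, 2048: 300, 8192: 1400},
}


def scaling_factor(transaction_size, smaller_mb, larger_mb):
    """Return how many times more events/second the larger instance handles."""
    rates = THROUGHPUT[transaction_size]
    return rates[larger_mb] / rates[smaller_mb]


if __name__ == "__main__":
    for size in THROUGHPUT:
        low_step = scaling_factor(size, 512, 2048)
        high_step = scaling_factor(size, 2048, 8192)
        print(f"{size}: 512 MB -> 2 GB = {low_step:.1f}x, 2 GB -> 8 GB = {high_step:.1f}x")
```

Memory grows fourfold in both steps, but throughput only doubles from 512 MB to 2 GB, while it grows at least fourfold from 2 GB to 8 GB.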

Remember that APM Server is stateless: multiple running instances do not need to know about each other. This means that with a properly sized Elasticsearch instance, APM Server scales out linearly.
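Because scale-out is linear, a rough instance count can be estimated by dividing the target load by the per-instance throughput. The sketch below assumes Elasticsearch is not the bottleneck; the `instances_needed` function and its `headroom` parameter are hypothetical helpers for illustration, not an APM Server feature.

```python
import math


def instances_needed(target_events_per_sec, per_instance_events_per_sec, headroom=0.0):
    """Estimate how many APM Server instances a target load requires.

    Assumes linear scale-out, which holds only if Elasticsearch is sized
    to keep up. `headroom` is an assumed fraction of each instance's
    capacity kept in reserve (for example, 0.2 reserves 20%).
    """
    if per_instance_events_per_sec <= 0:
        raise ValueError("per-instance throughput must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")
    usable = per_instance_events_per_sec * (1 - headroom)
    return math.ceil(target_events_per_sec / usable)


if __name__ == "__main__":
    # 5000 events/second of small transactions on 512 MB instances
    # (about 600 events/second each, per the table).
    print(instances_needed(5000, 600))
    # The same load while keeping 20% of each instance in reserve.
    print(instances_needed(5000, 600, headroom=0.2))
```

Treat the result as a starting point only: real throughput depends on the other factors listed at the top of this page, such as network latency and agent settings.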

RUM deserves special consideration. The RUM agent runs in browsers, so many thousands of agents may be reporting to an APM Server, each with highly variable network latency.