Potential Abuse of Resources by High Token Count and Large Response Sizes

Detects potential resource exhaustion or data breach attempts by monitoring for users who consistently generate high input token counts, submit numerous requests, and receive large responses. This behavior could indicate an attempt to overload the system or extract an unusually large amount of data, possibly revealing sensitive information or causing service disruptions.

Rule type: esql
Rule indices:

Rule Severity: medium
Risk Score: 47
Runs every: 10m
Searches indices from: now-60m
Maximum alerts per execution: 100
References:

Tags:

Domain: LLM
Data Source: AWS Bedrock
Data Source: Amazon Web Services
Data Source: AWS S3
Use Case: Potential Overload
Use Case: Resource Exhaustion
Mitre Atlas: LLM04
Resources: Investigation Guide

Version: 6
Rule authors:

Elastic

Rule license: Elastic License v2

Investigation guide

Possible investigation steps

Identify the user account that used high prompt token counts and whether it should perform this kind of action.
Investigate large response sizes and the number of requests made by the user account.
Investigate other alerts associated with the user account during the past 48 hours.
Consider the time of day. If the user is a human (not a program or script), did the activity take place during a normal time of day?
Examine the account's prompts and responses in the last 24 hours.
If you suspect the account has been compromised, scope potentially compromised assets by tracking Amazon Bedrock model access, prompts generated, and responses to the prompts by the account in the last 24 hours.
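
The time-of-day check above can be sketched in a few lines. This is a minimal illustration, not part of the rule: it assumes the account's invocation timestamps have already been exported as ISO 8601 strings, and the function names and the 08:00 to 18:00 working window are hypothetical placeholders to adjust for your environment.

```python
from collections import Counter
from datetime import datetime


def invocations_by_hour(timestamps: list[str]) -> Counter:
    """Count invocations per hour of day (0-23) from ISO 8601 timestamps."""
    return Counter(datetime.fromisoformat(ts).hour for ts in timestamps)


def outside_hours(timestamps: list[str], start: int = 8, end: int = 18) -> int:
    """Count invocations outside [start, end), using each timestamp's own offset."""
    by_hour = invocations_by_hour(timestamps)
    return sum(n for hour, n in by_hour.items() if not start <= hour < end)


# Hypothetical sample: two invocations at 02:xx and one at 14:xx (UTC).
sample = [
    "2024-05-01T02:15:00+00:00",
    "2024-05-01T02:47:10+00:00",
    "2024-05-01T14:05:30+00:00",
]
print(dict(invocations_by_hour(sample)))
print(outside_hours(sample))
```

A high share of off-hours invocations from an account that belongs to a human user is a prompt for closer review, not proof of compromise; scheduled jobs and users in other time zones produce the same pattern.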

False positive analysis

Verify that the user account responsible for the high prompt token counts and large response sizes has a business justification for its heavy usage of the system.

Response and remediation

Initiate the incident response process based on the outcome of the triage.
Disable or limit the account during the investigation and response.
Identify the possible impact of the incident and prioritize accordingly; the following actions can help you gain context:
- Identify the account role in the cloud environment.
- Identify whether the attacker is moving laterally and compromising other Amazon Bedrock services.
- Identify any regulatory or legal ramifications related to this activity.
- Identify potential resource exhaustion and impact on billing.
Review the permissions assigned to the implicated user group or role behind these requests. Confirm that they are authorized and expected to access Amazon Bedrock, and that the principle of least privilege is being followed.
Determine the initial vector abused by the attacker and take action to prevent reinfection via the same vector.
Using the incident response data, update logging and audit policies to improve the mean time to detect (MTTD) and the mean time to respond (MTTR).

Rule Query

from logs-aws_bedrock.invocation-*

// keep token usage data
| keep
  user.id,
  gen_ai.usage.prompt_tokens,
  gen_ai.usage.completion_tokens

// Aggregate usage metrics
| stats
    Esql.ml_usage_prompt_tokens_max = max(gen_ai.usage.prompt_tokens),
    Esql.ml_invocations_total_count = count(*),
    Esql.ml_usage_completion_tokens_avg = avg(gen_ai.usage.completion_tokens)
  by
    user.id

// Filter for suspicious usage patterns
| where
  Esql.ml_usage_prompt_tokens_max > 5000
  and Esql.ml_invocations_total_count > 10
  and Esql.ml_usage_completion_tokens_avg > 500

// Calculate a custom risk factor
| eval Esql.ml_risk_score =
    (Esql.ml_usage_prompt_tokens_max / 1000) *
    Esql.ml_invocations_total_count *
    (Esql.ml_usage_completion_tokens_avg / 500)

// Filter on risk score
| where Esql.ml_risk_score > 10

// sort high risk users to top
| sort Esql.ml_risk_score desc
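
To make the scoring arithmetic concrete, the following Python sketch reproduces the two `where` filters and the `eval` formula from the query on hand-picked values. It is an illustration only: the `UsageStats` class, the function names, and the sample numbers are hypothetical, and it assumes true division, whereas ES|QL may truncate the `/ 1000` step if the token field is integer-typed.

```python
from dataclasses import dataclass


@dataclass
class UsageStats:
    """Per-user aggregates, mirroring the three `stats` outputs in the query."""
    prompt_tokens_max: int
    invocations_total_count: int
    completion_tokens_avg: float


def risk_score(stats: UsageStats) -> float:
    # Same formula as the `eval` step (true division is an assumption here).
    return (
        (stats.prompt_tokens_max / 1000)
        * stats.invocations_total_count
        * (stats.completion_tokens_avg / 500)
    )


def is_flagged(stats: UsageStats) -> bool:
    # First `where`: all three thresholds must be exceeded.
    exceeds_thresholds = (
        stats.prompt_tokens_max > 5000
        and stats.invocations_total_count > 10
        and stats.completion_tokens_avg > 500
    )
    # Second `where`: the combined score must also exceed 10.
    return exceeds_thresholds and risk_score(stats) > 10


heavy = UsageStats(8000, 12, 750.0)  # 8 * 12 * 1.5
light = UsageStats(6000, 4, 900.0)   # large prompts, but only 4 requests
print(risk_score(heavy), is_flagged(heavy))  # 144.0 True
print(is_flagged(light))                     # False
```

By this arithmetic, any user that passes the first filter already scores above 55 (more than 5, times at least 11, times more than 1), so the `> 10` risk filter only starts to matter if the individual thresholds are tuned downward.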
		
	
