Using Machine Learning and Elasticsearch for Security Analytics: A Deep Dive
Editor's Note (August 3, 2021): This post uses deprecated features. Please reference the map custom regions with reverse geocoding documentation for current instructions.
Introduction
In the previous post in our multi-part series on integrating Elasticsearch with ArcSight SIEM, we used X-Pack alerting features to detect a successful brute force login attack, and we hinted that we were excited about the pending arrival of machine learning features in X-Pack.
Well, the time has come: X-Pack machine learning features are here. Now we want to walk through what it means to use machine learning to detect anomalies associated with cyber threat behaviors in log data living in Elasticsearch.
Math, not Magic
Before we jump into this, it's probably a good idea to add some context. A common misperception in cybersecurity is that machine learning is a magic box of algorithms that you let loose on your data and that then starts producing nuggets of brilliant cyber insight for you.
A more enlightened understanding of machine learning in cybersecurity sees it as an arsenal of "algorithmic assistants" to help the security team automate the analysis of security-relevant log data by looking for potentially incriminating anomalies and patterns -- but under the direction of human security experts.
Threat Monitoring or Threat Hunting - A Role for Machine Learning
X-Pack machine learning features can be used for interactive investigation of threat-related anomalies. An "anomaly swimlane" visualization in Kibana is often used as the starting point for threat hunting expeditions. From there, details about each detected anomaly show the security analyst why the behavior was anomalous, how unusual it was, how it relates to the elementary attack behavior the job attempts to detect, and which entities in the data were influential on the attack behavior.
Because the X-Pack machine learning features are tightly integrated into the Elastic Stack, the alerting techniques we described in our Integrating Elasticsearch with ArcSight SIEM Part 2 and Part 4 posts can now be applied to a new source of insight, the machine learning results index, whose index pattern is called ml-anomalies-*. In this way, results produced by these algorithmic assistants can be used to trigger alerts for ongoing threat monitoring.
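As a rough sketch of what such an alert condition might search for, the Python snippet below builds an Elasticsearch search body that selects recent, high-scoring record-level results for a single job. The field names job_id, result_type, record_score, and timestamp are taken from the machine learning results documentation and may differ between versions. The job name dns_tunneling, the score threshold of 75, and the 15-minute lookback are assumptions made purely for illustration.

```python
import json


def build_anomaly_query(job_id, min_score, lookback="now-15m"):
    """Build a search body for recent high-scoring anomaly records."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Restrict to one machine learning job.
                    {"term": {"job_id": job_id}},
                    # Record-level results describe individual anomalies.
                    {"term": {"result_type": "record"}},
                    # Keep only anomalies at or above the chosen score.
                    {"range": {"record_score": {"gte": min_score}}},
                    # Look only at results from the recent time window.
                    {"range": {"timestamp": {"gte": lookback}}},
                ]
            }
        },
        "sort": [{"record_score": {"order": "desc"}}],
        "size": 10,
    }


# Hypothetical job name and threshold, chosen only for this example.
body = build_anomaly_query("dns_tunneling", 75)
print(json.dumps(body, indent=2))
```

A search body like this could be sent to the ml-anomalies-* index pattern as the input of an alert, with the alert firing whenever the search returns at least one hit.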
Fig. 1 X-Pack machine learning features integrated with Elastic Stack
Machine Learning "Recipes" for Threat Detection
Threshold-based event notification is powerful; one example is triggering a notification when a successful login is preceded by multiple unsuccessful logins. Still, the ability to automate the detection of anomalous behavior without having to define specific data conditions simplifies the experience for the security analyst.
That said, as we mentioned above, we're talking about mathematics, not magic, so the machine learning engine must be given its marching orders as settings in its job configuration. Since the engine can model any type of time-series data - numerical or categorical - the types of machine learning jobs that can be configured are unlimited. While this is flexible, it can be a bit too much for a security analyst who really just wants to find threats.
Here we introduce the concept of machine learning "recipes" for security use cases. Recipes describe how to configure machine learning jobs, so that we can use automated anomaly detection to uncover elementary attack behaviors that can be difficult to detect using other means. Elementary attack behaviors include activities such as DNS tunneling, web data exfiltration, suspicious endpoint process execution, and more.
Fig. 2 Example machine learning security use case recipe sheets
Each recipe is contained in a short document that includes sections for theory of operation, description, and specific recipe steps for modeling and observing results. Recipe steps include feature selection, modeling approach, detection target, comparison set, candidate influencers, analysis time period, and results interpretation.
We've introduced four machine learning security use case examples with the launch of V5.4. Available now in this GitHub repo, each example contains a security use case recipe sheet and various configurations, data, and scripts to allow you to try them out.
A Machine Learning Recipe to Detect DNS Tunneling
As an example of a recipe, let's review how to use machine learning to detect DNS tunneling activity.
As a quick background, DNS tunneling refers to any activity whereby the Domain Name System (DNS) internet protocol is used in an attempt to transfer non-DNS information into and/or out of an organization's network. Required by all internet-connected IT infrastructures, DNS network traffic isn't generally blocked by firewall policies and has therefore become an attractive channel for sending unauthorized and/or malicious communication, tunneled under an organization's existing security defenses. For example, the FrameworkPOS malware uses this technique to exfiltrate stolen cardholder data from retail point of sale terminals.
Let's walk through the machine learning security use case recipe known in Elastic security circles as DNS-EAB02. "DNS" indicates that the type of logs to be analyzed in this recipe is "DNS". "EAB" indicates that this recipe detects elementary attack behaviors. "02" is a unique identifier for this recipe to distinguish it from other DNS recipes. The recipe includes a number of sections:
Theory: An unusual amount of entropy (called "information content") present in the subdomain field of DNS Query Requests can be an indication of exfiltration of data over the DNS protocol.
Please note that there are many ways to detect DNS data exfiltration. This job uses just one such mechanism, and it has proven to be effective in real enterprise environments. If your security team prefers a different or additional method, you can simply clone this job, modify it to your preference, and run it! Or run both of them to compare, or even combine, results.
Description: This use case recipe identifies domains to which DNS query requests containing unusually high values of "information content" are sent, and IP addresses that generate these anomalous requests.
It may not be obvious at first that detecting domains to which requests contain an unusual amount of entropy in the subdomain portion of the field is the right thing to look for. But if we think of it from a data modeling perspective, we need to identify the data feature that is likely to have unusual characteristics over the course of the analysis. For this analysis, the domain field in the DNS query log is the feature we want to model, and the characteristic is the amount of information content contained in its subdomain field.
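To make "information content" a little more concrete, here is a minimal Python sketch that scores subdomain strings using Shannon entropy multiplied by string length. This is a stand-in chosen for illustration: the X-Pack machine learning engine may compute information content differently, and the sample subdomains are invented for this example.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    # Sum p * log2(1/p) over the distinct characters in the string.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def information_content(text):
    """Approximate the total information in text as entropy times length."""
    return shannon_entropy(text) * len(text)


# Invented samples: two ordinary labels and one long, encoded-looking label.
samples = ["www", "mail", "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6"]
for subdomain in samples:
    print(subdomain, round(information_content(subdomain), 1))
```

Short, repetitive labels such as "www" score at or near zero, while long labels that pack many distinct characters score much higher. That gap is the kind of signal this recipe asks the anomaly detection job to model per domain.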
Effectiveness: This use case recipe is provided as a basic example of how automated anomaly detection can be used to detect DNS data exfiltration. Other recipes, based upon alternative or more complex approaches, may produce more effective detection results.
Remember that this recipe is an example. By applying your team's subject matter expertise to this type of detection, you may be able to improve upon its effectiveness.
Use Case Type: Elementary Attack Behavior (EAB) - This use case detects anomalies associated with elementary attack behaviors. Each detected anomaly is assigned a normalized Anomaly Score, and is annotated with values of other fields in the data that have statistical influence on the anomaly. Elementary attack behaviors that share common statistical Influencers are often related to a common attack progression.
A little context here: this recipe does not perform meta-analysis of risk factors. Rather, it detects elementary attack behaviors that can be correlated with other attack behaviors, using X-Pack alerting, to detect cyber attack progressions.
Use Case Data Source: DNS query logs (from client to DNS Server)
This is a reminder of the kind of data we'll need for this recipe.
Use Case Recipe:
For: DNS query requests (filtered for question types: A, AAAA, TXT)
Model: Information content within the subdomain string
Detect: Unusually high amounts of information content
Compared to: Population of all (highest registered) domains in query results
Partition by: None
Exclude: domains that occur frequently in the analysis
Duration: Run analysis on DNS queries from period of 2 weeks or longer
Related recipes: Run this EAB use case by itself, or along with DNS-EAB01 DNS DGA Activity
Results: Influencer hosts are likely sources of DNS Tunneling activity
Here's where we pull the recipe together in plain language. This section specifies which log messages we'll pull data from, which features of the data we'll be modeling, what unusual behavior we're trying to detect, in comparison to what, whether or not we want to partition the analysis, whether or not we want to exclude frequent values that might dominate the results, how much data is required to produce a good result, which other jobs might be good to run along with this one, and finally, how to look for and interpret the results of the analysis.
Additional configuration parameters:
These sections (Example Elasticsearch Index Patterns, Example Elasticsearch Query, and Machine Learning Analysis / Detector Config) provide the technical configuration details to make sure the machine learning job works like we intend it to. These details correspond directly to settings in the job configuration view.
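For readers who want to see how the plain-language recipe might map onto job settings, the following Python sketch assembles a job configuration and a datafeed query as dictionaries. The detector keys (function, field_name, over_field_name, exclude_frequent) and the high_info_content function come from the machine learning job configuration. The index field names (subdomain, highest_registered_domain, src_ip, question_type, @timestamp), the 5-minute bucket span, and the choice of influencers are assumptions for illustration, not the exact values shipped in the recipe.

```python
import json

# Illustrative job configuration for the DNS-EAB02 recipe.
# All index field names below are hypothetical placeholders.
job_config = {
    "description": "DNS-EAB02: DNS tunneling (illustrative sketch)",
    "analysis_config": {
        "bucket_span": "5m",  # assumed value, tune to your data volume
        "detectors": [
            {
                # Detect: unusually high amounts of information content
                "function": "high_info_content",
                # Model: information content within the subdomain string
                "field_name": "subdomain",
                # Compared to: population of highest registered domains
                "over_field_name": "highest_registered_domain",
                # Exclude: domains that occur frequently in the analysis
                "exclude_frequent": "all",
            }
        ],
        # Candidate influencers: the domain and the requesting host
        "influencers": ["highest_registered_domain", "src_ip"],
    },
    "data_description": {"time_field": "@timestamp"},
}

# For: DNS query requests filtered for question types A, AAAA, TXT
datafeed_query = {
    "bool": {
        "filter": [
            {"terms": {"question_type": ["A", "AAAA", "TXT"]}}
        ]
    }
}

print(json.dumps(job_config, indent=2))
print(json.dumps(datafeed_query, indent=2))
```

Note that the detector has no partition field, which matches the "Partition by: None" line of the recipe. The configurations published with the recipe remain the reference for actual field names and values.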
Once a machine learning job is configured with this recipe and started, X-Pack machine learning features will begin indexing and analyzing DNS request logs coming from client workstations, creating baselines of normal characteristics of DNS requests sent from each client, and detecting when these characteristics are anomalous.
As we mentioned above, detected anomalies are stored in an Elasticsearch index whose default index pattern is ml-anomalies-*.
We've prepared a few short videos to let you see how X-Pack machine learning features perform in action, but here's a sample of what the X-Pack machine learning Anomaly Explorer view looks like in V5.4 (beta) when viewing results of a DNS Tunneling job.
Conclusion
X-Pack machine learning is making machine learning technology accessible to security analysts and engineers who have security-related log data living in Elasticsearch. The basic element of X-Pack machine learning operation is the anomaly detection job. Security analytics use case recipes describe how to configure jobs to detect attack behaviors. Without any programming, you can become the leader of your army of algorithmic assistants to help in detecting threats and improving overall security coverage.
Where to Learn More
- Check out our Machine Learning Lab Videos: Video 1, Video 2, Video 3.
- Check out more Machine Learning recipe examples for detecting attack behaviors.
- Download a free trial of X-Pack and try it out.
- Get a full product tour in the webinar.