TLS Fingerprint Confusion: How Proxying Hides Clients

Introduction

Akamai has the privilege of carrying a substantial amount of Internet traffic. As such, we're in the unique position to observe a wide variety of patterns and behaviors across the Internet.

About a year ago, in the course of TLS fingerprinting security research, Akamai noticed that certain fingerprints were being "shared" across too many known clients. Initially, it looked like bots trying to impersonate popular browsers. However, the behavioral pattern of the traffic looked more benign than exploitative. Indeed, these TLS fingerprints seemed to be agnostic to the particular clients, proxying for anyone to anywhere.
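
For readers unfamiliar with how such fingerprints are derived, one widely used approach (a JA3-style hash, not necessarily the exact method used here) condenses a handful of ClientHello fields into a compact identifier. The Python sketch below illustrates the idea with placeholder values rather than parameters captured from a real client.

    import hashlib

    def ja3_style_fingerprint(tls_version, ciphers, extensions, curves, point_formats):
        # Concatenate the ClientHello parameters in JA3 order, then hash them
        # into a compact, comparable fingerprint string.
        fields = [
            str(tls_version),
            "-".join(str(c) for c in ciphers),
            "-".join(str(e) for e in extensions),
            "-".join(str(c) for c in curves),
            "-".join(str(p) for p in point_formats),
        ]
        return hashlib.md5(",".join(fields).encode()).hexdigest()

    # Illustrative values only; not taken from a real ClientHello.
    print(ja3_style_fingerprint(771, [4865, 4866, 4867], [0, 11, 10], [29, 23], [0]))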

The Case of Mega-Proxies

One oddity was a consistent fingerprint (eventually traced to Java's JDK signature) appearing across large CIDR blocks with unusually large volumes of both traffic and hosts.

As it turns out, one example of these mega-proxies is embedded inside the viewers of popular cloud-based email platforms. A typical scenario goes like this: you get an email with a link, and you open it in the cloud-based email application. The application proxies the request through itself before serving the content to the user inline, echoing the client's original User-Agent while presenting the TLS fingerprint of the email service instead.
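
To make the mechanics concrete, here is a hypothetical sketch of such a link-fetching proxy, built with Flask and requests (neither is named in the post, and no specific vendor's implementation is implied). The handler copies the end user's User-Agent onto the outbound request, but the TLS handshake is performed by the proxy's own stack, so the destination sees the service's fingerprint paired with the user's User-Agent.

    from flask import Flask, request, Response
    import requests

    app = Flask(__name__)

    @app.route("/preview")
    def preview():
        # Echo the end user's User-Agent on the outbound request; the TLS
        # handshake (and therefore the fingerprint) belongs to this proxy,
        # not to the user's browser. Simplified for illustration only.
        target = request.args.get("url", "")
        upstream = requests.get(
            target,
            headers={"User-Agent": request.headers.get("User-Agent", "")},
            timeout=10,
        )
        return Response(
            upstream.content,
            status=upstream.status_code,
            content_type=upstream.headers.get("Content-Type", "text/html"),
        )

    if __name__ == "__main__":
        app.run()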

Unusual Chrome TLS Fingerprints

It is well known that Chrome emits highly distinctive TLS fingerprints, owing to its implementation of GREASE (Generate Random Extensions And Sustain Extensibility).

However, after filtering out bot impersonators, we noticed benign Chrome traffic that was not using GREASE fingerprints. Instead, it was using a small set of alternative fingerprints.
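
GREASE values follow a fixed pattern (0x0a0a, 0x1a1a, ... 0xfafa) that Chrome injects, at random, into its cipher and extension lists, while most proxy TLS stacks do not. A rough sketch of how their presence could be checked, or stripped before fingerprinting, is shown below; it is illustrative and not Akamai's detection logic.

    # GREASE values (0x0a0a, 0x1a1a, ..., 0xfafa) that Chrome randomly
    # injects into cipher suite and extension lists.
    GREASE_VALUES = {0x0A0A + 0x1010 * i for i in range(16)}

    def contains_grease(values):
        """True if any cipher/extension/curve value is a GREASE placeholder."""
        return any(v in GREASE_VALUES for v in values)

    def strip_grease(values):
        """Remove GREASE values before computing a stable fingerprint."""
        return [v for v in values if v not in GREASE_VALUES]

    # Example: ciphers as they might appear in a Chrome ClientHello (illustrative).
    ciphers = [0x2A2A, 0x1301, 0x1302, 0x1303]
    print(contains_grease(ciphers))  # True -> handshake looks like native Chrome
    print(strip_grease(ciphers))     # [4865, 4866, 4867]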

After some investigation, including reverse DNS lookups of the IPs and large-scale User-Agent correlation, it became clear that cloud-based secure web gateways, enterprise portals, and other security services were proxying Chrome traffic under their own fingerprints.

A typical deployment of this type of service requires trusted intermediate SSL certificates (which it is not unusual for an enterprise to mandate on its users' devices). With such a trusted intermediary in place, an enterprise can implement in-line security services that funnel all traffic originating inside the organization and use the service provider's own agent to proxy outbound traffic to the wider Internet.

Typically, these agents are charged with traffic inspection for audit and compliance, firewall duties, or data exfiltration prevention. Standing outside the organization, an observer cannot help but notice a large diversity of clients all under one fingerprint. Again, most implementations of these services must echo the original User-Agent (and most other HTTP headers) in order not to break requests, while substituting their own TLS fingerprint.
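
One hedged way to surface this pattern in traffic logs is to compare the browser family claimed by the User-Agent with the fingerprints that family is known to produce. The mapping and parsing below are placeholders for illustration, not an actual ruleset.

    # Hypothetical mapping of browser families to the fingerprint hashes they
    # are known to emit; real values would come from observed, labeled traffic.
    KNOWN_FINGERPRINTS = {
        "Chrome": {"fp_chrome_1", "fp_chrome_2"},
        "Firefox": {"fp_firefox_1"},
    }

    def browser_family(user_agent):
        # Crude User-Agent parsing, for illustration only.
        if "Firefox/" in user_agent:
            return "Firefox"
        if "Chrome/" in user_agent:
            return "Chrome"
        return "Other"

    def looks_proxied(user_agent, tls_fingerprint):
        """Flag requests whose TLS fingerprint does not match the claimed browser."""
        expected = KNOWN_FINGERPRINTS.get(browser_family(user_agent))
        return expected is not None and tls_fingerprint not in expected

    # A Chrome User-Agent arriving under a gateway's fingerprint gets flagged.
    print(looks_proxied("Mozilla/5.0 ... Chrome/79.0 ...", "fp_gateway_x"))  # True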

Odd Headers From Typical User-Agents

The next curiosity was easy to explain, given the mounting evidence of secure gateways and other legitimate man-in-the-middle proxies.

Many proxying security services act as funnels or NATs for multiple clients. One way they keep track of which response belongs to which internal client is to use special X-Headers. These headers are so uncommon for popular browsers like Chrome, Firefox, and Safari, and seemingly so tightly bound to specific CIDRs or ASNs (i.e., specific enterprises), that they can only indicate the presence of secure proxying. By exploiting common implementation details of these special headers (such as embedding the internal network's IP address), shared TLS fingerprints can easily be detected and traced to the enterprises using such services, including governmental, educational, or financial organizations.
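
The post does not name the headers involved, but X-Forwarded-For-style headers carrying private (RFC 1918) addresses are a familiar example. The sketch below shows one way such telltale headers could be spotted; the header names are illustrative.

    import ipaddress

    # Header names are illustrative; real gateways use various vendor-specific X-Headers.
    PROXY_HINT_HEADERS = ("X-Forwarded-For", "X-Real-IP", "X-Client-IP")

    def reveals_internal_client(headers):
        """True if any proxy-style header carries a private (RFC 1918) address."""
        for name in PROXY_HINT_HEADERS:
            for token in headers.get(name, "").split(","):
                token = token.strip()
                try:
                    if ipaddress.ip_address(token).is_private:
                        return True
                except ValueError:
                    continue
        return False

    print(reveals_internal_client({"X-Forwarded-For": "10.12.34.56, 203.0.113.7"}))  # True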

Multiplexing Tokens Under A Single Fingerprint

Another idiosyncrasy was seeing multiple personally identifiable tokens (such as login sessions or security tokens) being used concurrently through a single IP and under a single TLS fingerprint. Even after controlling for botnets commanding multiple cracked accounts, it was rather unusual to see benign, logged-in traffic for multiple sites under a single fingerprint.

This points to plain-vanilla, run-of-the-mill behavior for many proxies. During peak traffic hours, it is quite common for multiple users within an organization to use Internet services like Google or Facebook accounts at the same time, all exiting through the common enterprise gateway under a single, static fingerprint.
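
A rough way to quantify this, sketched below against made-up log records, is to count the distinct session tokens seen concurrently for each source IP and fingerprint pair; a high count on an otherwise benign, static fingerprint suggests a shared egress gateway rather than a single user.

    from collections import defaultdict

    def token_counts(records):
        """Count distinct session tokens per (source IP, TLS fingerprint) pair."""
        seen = defaultdict(set)
        for r in records:
            seen[(r["src_ip"], r["fingerprint"])].add(r["session_token"])
        return {key: len(tokens) for key, tokens in seen.items()}

    # Made-up records standing in for parsed access logs.
    records = [
        {"src_ip": "198.51.100.10", "fingerprint": "fp_gateway_x", "session_token": "alice"},
        {"src_ip": "198.51.100.10", "fingerprint": "fp_gateway_x", "session_token": "bob"},
        {"src_ip": "198.51.100.10", "fingerprint": "fp_gateway_x", "session_token": "carol"},
    ]
    print(token_counts(records))  # {('198.51.100.10', 'fp_gateway_x'): 3}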

Conclusion

TLS fingerprints come from more than just browsers, and they vary across versions and operating systems.

A significant share of TLS fingerprints comes from popular development frameworks such as Java's Apache HttpClient, Python's requests, and various Ruby libraries. While these frameworks often double as bot development tools, dismissing them outright could block legitimate, proxied traffic.
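
As a small illustration of that dual-purpose nature: the snippet below sends a Chrome User-Agent from Python's requests, yet the handshake is still the one produced by requests/urllib3 and the local OpenSSL build, so a server-side fingerprint identifies the framework rather than Chrome. The fingerprint-echo URL is a placeholder, not a real endpoint.

    import requests

    # The User-Agent claims Chrome, but the ClientHello (and therefore the TLS
    # fingerprint) is whatever requests/urllib3 and the local OpenSSL produce.
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                             "AppleWebKit/537.36 (KHTML, like Gecko) "
                             "Chrome/79.0.3945.88 Safari/537.36"}

    # Placeholder endpoint that echoes the fingerprint it observed; substitute a real one.
    resp = requests.get("https://fingerprint-echo.example/json", headers=headers, timeout=10)
    print(resp.text)  # Would show a requests/OpenSSL fingerprint, not Chrome's.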

Shared TLS fingerprints come in all shapes and sizes, and they are an important part of the larger Internet ecosystem. Defenders should acknowledge their presence and avoid limiting allowed traffic to just the fingerprints of common browsers.

Security Research at Akamai continues to improve the security posture of all customers of its security products. One of the latest improvements is to allow for shared fingerprints in order to lower false positives, while at the same time blocking impersonators that try to abuse those same fingerprints.

Article Link: https://blogs.akamai.com/sitr/2019/12/tls-fingerprint-confusion-how-proxying-hides-clients.html