Phishing detection via analytic networks

As mentioned in previous Akamai blogs, phishing is an ecosystem of mostly framework developers and buyers who purchase kits to harvest credentials and other sensitive information. Like many framework developers, those focusing on phishing kits want to create an efficient attack flow on their framework, from opening an email or clicking a link on a social media post, to visiting the phishing website, to completing the attack by sharing information, such as passwords.

As phishing has evolved over the years, criminals have learned that technical markers, like browser identification, geo-location, and operating system, can help adjust the phishing website's visibility, and enable more granular targeting. In order to evaluate these metrics, kit developers use third-party analytic products, such as those developed by Google, Bing, or Yandex, to gather the necessary details.

In this blog, we review the use of analytics in phishing and discuss how Akamai is using these identifiers to detect new phishing campaigns.


Today, 56.1% of all Internet websites use web analytics, with Google Analytics as the leading platform. Most websites use analytics to generate reports on user behavior, page views, and visitors' journeys through the site. These statistics also offer detailed technical metrics about users, such as OS type, geo-location, and browser type.

Figure 1: Analytics usage distribution by platform (according to BuiltWith)

Analytic networks are tied to the back-end server, which receives events from every page and summarizes them into reports presented to the customer. In order to identify each customer, a unique identifier (UID) is used. In the case of Google Analytics, an example of the UID can be seen in Figure 2, as marked by UA-XXXXX-Y.

Figure 2: Example Google Analytics code, which is embedded on all monitored webpages

The UID is composed of two main parts. One part is the unique analytics network account ID (XXXXX), and the other is the view, or property, number (Y). This post is based on Google IDs only, but other analytic networks use similar UIDs for tracking. The only difference is how the ID is extracted from the source code, which in some cases can lead to false positives in detection if the UID is a general string.
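The extraction step described above can be sketched in a few lines of Python. This is a minimal illustration, not Akamai's detection code: the regular expression assumes the common UA-XXXXX-Y shape (digits only, with the length limits chosen here as an assumption), and the function name and sample snippet are hypothetical.

```python
import re

# Assumed shape of a Google Analytics UID: "UA-<account>-<property>".
# The digit-length limits are an assumption made for this sketch.
UA_PATTERN = re.compile(r"\bUA-(\d{4,10})-(\d{1,4})\b")


def extract_ga_uids(html):
    """Return a sorted list of unique (account_id, property_number) pairs."""
    return sorted({(m.group(1), m.group(2)) for m in UA_PATTERN.finditer(html)})


# Hypothetical page source containing a placeholder UID.
sample = """
<script>
  ga('create', 'UA-12345-1', 'auto');
  ga('send', 'pageview');
</script>
"""

print(extract_ga_uids(sample))
```

The word-boundary anchors are one way to reduce the false positives mentioned above, since they keep the pattern from matching inside a longer, unrelated string.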

Phishing Analytics

Akamai scanned 62,627 active phishing URLs, of which 54,261 were non-blank pages belonging to 28,906 unique domains. We discovered 874 domains with UIDs, and 396 of the UIDs were unique Google Analytics accounts. Moreover, 75 of the UIDs were used on more than one website.

By analyzing the source code of these websites, we've concluded that the analytic identifiers' presence could be related to one of the following reasons:

  1. Phishing re-used UID: While attempting to duplicate the original website, the developers used copying tools such as HTTrack or wget to download the source code, reusing the analytic ID shipped with the original code.

  2. Phishing kit UID: Analytic IDs set by the framework developer to monitor the victim's movement through the phishing website.

  3. Legitimate UIDs: Phishing websites that were sinkholed by the targeted company and now redirect to the original website.

These results led to the discovery of various phishing campaigns as well as lists of new domains using the same UID.
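Grouping scanned pages by UID is what surfaces these shared identifiers. The sketch below shows one possible way to do it, assuming the scan results are available as (URL, UID) pairs. The function names, the sample URLs, and the placeholder UIDs are all hypothetical, and hostnames stand in for registered domains to keep the example short.

```python
from collections import defaultdict
from urllib.parse import urlparse


def cluster_by_account(observations):
    """Map each analytics account ID to the set of hostnames where it was seen.

    observations: iterable of (url, uid) pairs, with uid shaped like "UA-12345-1".
    """
    clusters = defaultdict(set)
    for url, uid in observations:
        host = urlparse(url).hostname
        parts = uid.split("-")
        # Skip entries without a hostname or without the expected "UA-" prefix.
        if host is None or len(parts) < 2 or parts[0] != "UA":
            continue
        clusters[parts[1]].add(host)
    return dict(clusters)


def shared_accounts(clusters, min_hosts=2):
    """Keep only the accounts seen on at least min_hosts distinct hostnames."""
    return {acct: sorted(hosts) for acct, hosts in clusters.items()
            if len(hosts) >= min_hosts}


# Hypothetical scan results using reserved example domains.
observations = [
    ("http://login.example.com/signin", "UA-11111-1"),
    ("http://secure.example.net/verify", "UA-11111-2"),
    ("http://shop.example.org/", "UA-22222-1"),
]

print(shared_accounts(cluster_by_account(observations)))
```

Clustering on the account portion rather than the full UID is a design choice in this sketch: it links pages that sit under different property numbers of the same account.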

For example, UA-3242811 is an old analytics ID related to LinkedIn. It was also used recently to target LinkedIn users, between April and July this year.

The campaign registered many misleading domains to lure its victims, but each domain hosted a different variation of the phishing kit's source code, making it hard to detect them all without the Google ID.

Figure 3: Examples of the LinkedIn phishing campaign with multiple pages linked by UID

UA-2725447 appears in a campaign targeting AirBnB logins, but this campaign has a twist. It uses generated subdomains to evade blacklist detection (the primary domain is benign, and anyone can generate custom subdomains), but all of them carry the same UID, which just so happens to be the original AirBnB ID. The criminals' use of the original UID makes their phishing attempts stand out like a beacon and helps get the malicious domains pulled down more quickly.
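A brand's own UID appearing on a host the brand does not control is a simple signal to check for. The following is a rough sketch under stated assumptions: the defender maintains a mapping from their own analytics account IDs to their legitimate domains, and the account ID, domains, and function names shown here are placeholders rather than real values.

```python
def is_brand_host(host, brand_domains):
    """True if host is one of the brand's domains or a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in brand_domains)


def flag_reused_uid(host, account_id, brand_accounts):
    """Flag a host that carries a known brand account ID but is not a brand host.

    brand_accounts: dict mapping account ID -> set of legitimate domains.
    """
    domains = brand_accounts.get(account_id)
    if domains is None:
        return False  # Unknown account: this check has nothing to say.
    return not is_brand_host(host, domains)


# Hypothetical inventory: one placeholder account tied to one placeholder domain.
brand_accounts = {"99999": {"brand.example"}}

print(flag_reused_uid("www.brand.example", "99999", brand_accounts))
print(flag_reused_uid("brand-login.freehost.example", "99999", brand_accounts))
```

Matching on a leading dot before the brand domain avoids treating a lookalike such as notbrand.example as legitimate.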


How Analytics Help Defenders

Understanding the full reach of a given phishing campaign is a known problem when it comes to detection. Relying on intelligence feeds and other advanced types of detection isn't enough, as those can be affected by resource issues such as sample size. As we've previously described in our blog, analytics can assist in understanding the full scale of a given phishing attack in some cases.

In addition, tracking UIDs can help cluster the campaigns, which makes locating and tracking them easier for defenders. If an attacker uses a target's actual UID, as mentioned, they'll stand out like a beacon, but the real leverage comes from spotting the UIDs used by the criminals themselves.

Analytics help criminals focus on victims and narrow their attack to a given area or device type. It isn't at all uncommon to see a phishing attack target iOS devices while, for example, ignoring Android; sometimes this is due to the fact that the criminal has been tracking the most common users to their page and knows that Android users are less likely to be victimized. But when a criminal uses their own UID, they do so across all of their kits, so not only is it possible to track a single phishing campaign, it is sometimes possible to track multiple campaigns at once and tune defenses accordingly.


Using analytics can help you understand the full scale of a phishing campaign, and defenders can use this data to compare with internal signatures, for a more rounded detection and remediation process. Analytical data also helps understand domain targeting approaches. At the same time, analytics are just another brick in the phishing industry wall, representing the operational side used by developers to improve kits, and gather stats on campaign effectiveness. Overall, what we've shown here is another instance where criminals abuse legitimate services for malicious purposes.
