A recent report by Thales revealed that almost half of all businesses have experienced a cloud-based data breach within the last year, up 5% from the previous year. More recently, it was reported that the attacker responsible for the 2019 Capital One hack took advantage of misconfigured AWS S3 buckets, which exposed personal information from 100 million credit applications.
Lacework has also observed this trend in cloud storage targeting. AWS S3 can be considered low-hanging fruit for attackers and is often one of the first cloud resources they target. While this activity is often just noise, it can represent a real risk to your data if your buckets are misconfigured. Recently, Lacework Labs observed an uptick in reports of reconnaissance and likely word-list brute forcing of customer S3 buckets (T1580). Typically, S3 probing is done anonymously (Anonymous Principal); however, we have also seen this activity from unknown or “rogue” AWS accounts.
This blog examines recent S3 reconnaissance tactics, including those involving both rogue AWS accounts and Anonymous Principal. We also look at trends in targeted S3 buckets.
Bucket names can often be easily guessed, so it’s important to know which names are used most frequently, avoid them, and limit exposure to S3 probing.
As reported in Lacework Labs’ 2021 Cloud Threat Report (Volume 2), the most common thread among S3 recon is that the majority originates from Tor exit nodes (T1090.003). In a six-month sample, approximately 59% originated from Tor. While Tor traffic is not unexpected in general scanning, its share is notably higher here than in other types of reconnaissance. Other interesting trends in S3 recon source traffic were the use of various Google App Engine hosts (23% of traffic) as well as several hosts with passive DNS records for Synology domains (*.synology.me and *.quickconnect.to). Many Synology Network Attached Storage (NAS) devices are vulnerable to attacks, so these hosts may have been recruited by botnets for scanning and exploitation (T1584).
User-Agents within S3 Recon
The high occurrence of Tor can make source IPs a lousy indicator for analysis. Fortunately, there are other indicators in CloudTrail, such as user agents. For S3 reconnaissance seen with Anonymous Principal, the majority was observed with only a handful of user agents. The most prolific of these, aws-sdk-go/1.35.28 (go1.17; linux; amd64), belongs to the internet scanner Censys, which uses its own infrastructure in lieu of Tor. Interestingly, Censys typically self-identifies its scanners in the user agent; however, this was not the case for its S3 scanning hosts. The following are the most frequently observed user agents over the past 6 months, which were observed in 3% or more of sampled environments.
APIs can also be good behavioral indicators. Frequently observed APIs for S3 recon include GetBucketAcl and ListObjects (DS0010). Put APIs such as PutObject and PutBucketAcl are less common but can be considered more invasive: they involve write operations, so they are more likely to be associated with malicious activity than with vulnerability scans.
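As a rough illustration of using these indicators together, the sketch below summarizes anonymous S3 activity from CloudTrail records that have already been parsed into Python dictionaries. It assumes anonymous requests carry the value ANONYMOUS_PRINCIPAL in userIdentity.accountId, and the read/write groupings only cover the APIs named in this section. The function names are hypothetical, not part of any Lacework or AWS tooling.

```python
from collections import Counter

# Assumption: anonymous S3 requests are logged with this accountId value.
ANONYMOUS = "ANONYMOUS_PRINCIPAL"
# Illustrative groupings limited to the APIs discussed here; not exhaustive.
READ_APIS = {"GetBucketAcl", "ListObjects"}
WRITE_APIS = {"PutObject", "PutBucketAcl"}


def is_anonymous(record):
    """Return True if the CloudTrail record came from an anonymous caller."""
    return record.get("userIdentity", {}).get("accountId") == ANONYMOUS


def classify(record):
    """Label a record as a read, write, or other S3 operation."""
    name = record.get("eventName", "")
    if name in WRITE_APIS:
        return "write"
    if name in READ_APIS:
        return "read"
    return "other"


def summarize_anonymous_s3(records):
    """Count API kinds and user agents across anonymous S3 events."""
    api_kinds = Counter()
    user_agents = Counter()
    for record in records:
        if record.get("eventSource") != "s3.amazonaws.com":
            continue
        if not is_anonymous(record):
            continue
        api_kinds[classify(record)] += 1
        user_agents[record.get("userAgent", "unknown")] += 1
    return api_kinds, user_agents
```

A nonzero "write" count from anonymous callers is probably the first thing worth reviewing, since the read-only calls are more likely to be routine scanning.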
Unknown Account IDs
In addition to Anonymous Principal activity, Lacework observed random AWS accounts poking at numerous environments. Some of these can be attributed through a Google or GitHub search on the account ID; however, others remain a mystery and have been dubbed “rogue accounts.” Figure 1 shows the top rogue accounts over the past 6 months that have been observed in 5% or more of monitored environments. The noisiest of these accounts is 596369610332, which was seen in nearly 40% of all monitored environments. These accounts are also being tracked in the Lacework Labs GitHub. While there is not much behavioral difference between recon performed from Anonymous Principal and recon from rogue accounts, rogue accounts may be more conspicuous, especially if you use anomaly detection.
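One way to surface such accounts in your own logs is to list every caller account ID that is not on an allow-list of accounts you recognize. The following is a minimal sketch under the assumption that CloudTrail records are already parsed into dictionaries and that the bucket name is available in requestParameters.bucketName. The function name and the allow-list are hypothetical.

```python
from collections import defaultdict


def external_account_activity(records, known_accounts):
    """Map each unrecognized caller account ID to the set of buckets it touched."""
    seen = defaultdict(set)
    for record in records:
        if record.get("eventSource") != "s3.amazonaws.com":
            continue
        caller = record.get("userIdentity", {}).get("accountId")
        # Skip records with no account ID, anonymous callers, and known accounts.
        if not caller or caller == "ANONYMOUS_PRINCIPAL" or caller in known_accounts:
            continue
        # requestParameters can be null in some records, hence the fallback.
        params = record.get("requestParameters") or {}
        seen[caller].add(params.get("bucketName", "unknown"))
    return dict(seen)
```

Account IDs returned by this function can then be compared against a published list of rogue accounts, or searched for individually as described above.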
S3 Bucket Name Brute Forcing
More insight into the S3 attack surface was gleaned by examining bucket names targeted by rogue accounts. To query an S3 bucket, the endpoint (S3 bucket name + .s3.amazonaws.com) must be identified first. In many cases, these are easily discoverable with Google Dorks or normal web searching. Other scanners may attempt to brute force names using combinations of words commonly used in bucket names (T1595.003). (Brute forcing in this context refers to guessing whether a bucket exists, not to credential brute forcing.) A large portion (28%) of bucket names were also fully qualified domain names (FQDNs), meaning subdomain enumeration with passive DNS (T1596.001) could also be used in the discovery process.
Figure 2 shows the most frequently used terms in targeted bucket names, as observed by Lacework Labs. Based on our analysis, words such as assets, prod, and test are among the most common. While many buckets are intended to be public facing, it’s recommended to avoid these words as well as FQDNs for private buckets.
Figure 2. Common words in targeted buckets
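The naming guidance above can be turned into a quick check for private buckets. The sketch below flags names that contain the three words called out in this section (assets, prod, and test) or that look like an FQDN. The word list is only a starting point, the FQDN pattern is a rough heuristic, and both function names are hypothetical.

```python
import re

# Words this article reports as common in targeted bucket names; extend as needed.
COMMON_WORDS = ("assets", "prod", "test")
# Rough FQDN heuristic (assumption): dot-separated labels ending in an alphabetic TLD.
FQDN_PATTERN = re.compile(r"^(?:[a-z0-9-]+\.)+[a-z]{2,}$")


def audit_bucket_name(name):
    """Return a list of reasons the bucket name may be easy to guess."""
    lowered = name.lower()
    # Substring matching is deliberately broad, so "latest" also matches "test".
    findings = [f"contains '{word}'" for word in COMMON_WORDS if word in lowered]
    if FQDN_PATTERN.match(lowered):
        findings.append("looks like an FQDN")
    return findings


def endpoint_for(name):
    """Build the endpoint a scanner would probe for this bucket name."""
    return f"{name}.s3.amazonaws.com"
```

An empty result does not mean a bucket is safe. It only means the name avoids the patterns listed here, so access controls still need to be configured correctly.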
This blog covered various S3 reconnaissance methods, all of which begin with identifying buckets and inspecting access controls. The good news is that, despite all these attempts, your cloud storage is most likely safe from opportunistic attacks, provided it is properly configured and follows best practices. Given the volume of data in the cloud, cloud storage technologies will most likely continue to be targeted in the future. And with Amazon’s dominance in the CSP market, S3 sits at the top of the target list. Refer to the Lacework Labs GitHub for data included in this blog. To see more content like this, follow Lacework Labs on LinkedIn, Twitter, and YouTube and stay up to date on our latest research!
Article Link: Recent trends in S3 targeting - Lacework