Our audit of PyPI

By William Woodruff

This is a joint post with the PyPI maintainers; read their announcement here!

This audit was sponsored by the Open Tech Fund as part of their larger mission to secure critical pieces of internet infrastructure. You can read the full report in our Publications repository.

Late this summer, we performed an audit of Warehouse and cabotage, the codebases that power and deploy PyPI, respectively. Our review uncovered a number of findings that, while not critical, could compromise the integrity and availability of both. These findings reflect a broader trend in large systems: security issues largely correspond to places where services interact, particularly where those services have insufficiently specified or weak contracts.

PyPI

PyPI is the Python Package Index: the official and primary packaging index and repository for the Python ecosystem. It hosts half a million unique Python packages uploaded by 750,000 unique users and serves over 26 billion downloads every single month. (That’s over three downloads for every human on Earth, each month, every month!)

Consequently, PyPI’s hosted distributions are essentially the ground truth for just about every program written in Python. Moreover, PyPI is extensively mirrored across the globe, including in countries with limited or surveilled internet access.

Before 2018, PyPI was a large and freestanding legacy application with significant technical debt that accumulated over nearly two decades of feature growth. An extensive rewrite was conducted from 2016 to 2018, culminating in the general availability of Warehouse, the current codebase powering PyPI.

Warehouse has gained several significant features since then, including scoped API tokens, TOTP- and WebAuthn-based MFA, organization accounts, secret scanning, and Trusted Publishing.

Our audit and findings

Under the hood, PyPI is built out of multiple components, including third-party dependencies that are themselves hosted on PyPI. Our audit focused on two of its most central components:

  • Warehouse: PyPI’s “core” back end and front end, including the majority of publicly reachable views on pypi.org, as well as the PEP 503 index, public REST and XML-RPC APIs, and administrator interface
  • cabotage: PyPI’s continuous deployment infrastructure, enabling GitOps-style deployment by the PyPI administrators

Warehouse

We performed a holistic audit of Warehouse’s codebase, including the relatively small amount of JavaScript served to browser clients. Some particular areas of focus included:

  • The “legacy” upload endpoint, which is currently the primary upload mechanism for package submission to PyPI;
  • The administrator interface, which allows admin-privileged users to perform destructive and sensitive operations on the production PyPI instance;
  • All user and project management views, which allow their respectively privileged users to perform destructive and sensitive operations on PyPI user accounts and project state;
  • Warehouse’s AuthN, AuthZ, permissions, and ACL schemes, including the handling and adequate permissioning of different credentials (e.g., passwords, API tokens, OIDC credentials);
  • Third-party service integrations, including integrations with GitHub secret scanning, the PyPA Advisory Database, email delivery and state management through AWS SNS, and external object storage services (Backblaze B2, AWS S3);
  • All login and authentication flows, including TOTP and WebAuthn-based MFA flows as well as account recovery and password reset flows.

During our review, we uncovered a number of findings that, while not critical, could potentially compromise Warehouse’s availability, integrity, or the integrity of its hosted distributions. We also uncovered a finding that would allow an attacker to disclose ordinarily private account information. Following a post-audit fix review, we believe that each of these findings has been mitigated sufficiently or does not pose an immediate risk to PyPI’s operations.

Findings of interest include:

  • TOB-PYPI-2, wherein weak signature verification could allow an attacker to manipulate PyPI’s AWS SNS integration, including topic subscriptions and bounce/complaint notices against individual user emails.
  • TOB-PYPI-5, wherein an attacker could use an unintentional information leak on the upload endpoint as a reconnaissance oracle, determining account validity without triggering ordinary login attempt events.
  • TOB-PYPI-14, wherein an attacker with access to one or more of PyPI’s object storage services could cause cache poisoning or confusion due to weak cryptographic hashes.
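
To illustrate the class of issue behind TOB-PYPI-14, here is a minimal sketch of checking a stored object against a collision-resistant digest. It is not Warehouse's actual code: the function names are hypothetical, and we assume only that the service already knows a trusted SHA-256 digest for each object. This post does not say which weak hash was involved, so the sketch simply shows the stronger alternative.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of data."""
    return hashlib.sha256(data).hexdigest()


def verify_object(data: bytes, expected_sha256: str) -> bool:
    """Check fetched bytes against a trusted SHA-256 digest.

    Hypothetical helper: the expected digest must come from a source the
    attacker cannot influence (e.g. the service's own database), not from
    the object store that served the bytes.
    """
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(sha256_hex(data), expected_sha256.lower())


if __name__ == "__main__":
    blob = b"example distribution contents"
    trusted = sha256_hex(blob)
    print(verify_object(blob, trusted))            # matching contents
    print(verify_object(blob + b"\x00", trusted))  # tampered contents
```

The design point is that the digest used as a cache key or integrity check should be one for which an attacker cannot feasibly construct a second object with the same value.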

Our overall evaluation of Warehouse is reflected in our report: Warehouse’s design and development practices are consistent with industry-standard best practices, including the enforcement of ordinarily aspirational practices such as 100% branch coverage, automated quality and security linting, and dependency updates.

cabotage

As with Warehouse, our audit of cabotage was holistic. Some particular areas of focus included:

  • The handling of GitHub webhooks and event payloads, including container and build dispatching logic based on GitHub events;
  • Container and image build and orchestration;
  • Secrets handling and log filtering;
  • The user-facing cabotage web application, including all form and route logic.

During our review, we uncovered a number of findings that, while not critical, could potentially compromise cabotage’s availability and integrity, as well as the availability and integrity of the containers that it builds and deploys. We also uncovered two findings that could allow an attacker to circumvent ordinary access controls or log filtering mechanisms. Following a post-audit fix review, we believe that these findings have been mitigated sufficiently or do not pose an immediate risk to PyPI’s operations (or other applications deployed through cabotage).

Findings of interest include:

  • TOB-PYPI-17, wherein an attacker with build privileges on cabotage could potentially pivot into backplane control of cabotage itself through command injection.
  • TOB-PYPI-19, wherein an attacker with build privileges on cabotage could potentially pivot into backplane control of cabotage itself through a crafted hosted application Procfile.
  • TOB-PYPI-20, wherein an attacker with deployment privileges on cabotage could potentially deploy a legitimate-looking-but-inauthentic image due to GitHub commit impersonation.
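
Command injection of the kind described in TOB-PYPI-17 generally arises when untrusted input is interpolated into a shell command string. The sketch below is a generic illustration, not cabotage's code: `run_build_step` and the `tag` parameter are hypothetical, and the child process is just a Python one-liner that echoes its argument. It shows the usual mitigation of passing untrusted values as discrete argv elements so that no shell ever parses them.

```python
import subprocess
import sys


def run_build_step(tag: str) -> str:
    """Run a child process with an untrusted value as a single argument.

    Hypothetical helper: the child simply prints the argument it received.
    Because argv is a list and shell=False (the default), characters such
    as ';' or '$(...)' in tag are never interpreted by a shell.
    """
    argv = [sys.executable, "-c", "import sys; print(sys.argv[1])", tag]
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.rstrip("\n")


if __name__ == "__main__":
    # The would-be injection arrives in the child as inert text.
    print(run_build_step("v1.0; echo pwned"))
```

By contrast, building a string such as `f"build {tag}"` and running it with `shell=True` would let the same input terminate the intended command and start a new one.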

From the report, our overall evaluation is that cabotage’s codebase is not as mature as Warehouse’s. In particular, our evaluation reflects operational deficiencies that are not shared with Warehouse: cabotage has a single active maintainer, has limited public documentation, does not have a complete unit test suite, and does not use a CI/CD system to automatically run tests or evaluate code quality metrics.

Takeaways

Unit testing, automated linting, and code scanning are all necessary components in a secure software development lifecycle. At the same time, as our full report demonstrates, they cannot guarantee the security of a system or design: manual code review remains invaluable for catching interprocedural and systems-level flaws.

We worked closely with the PyPI maintainers and administrators throughout the audit and would like to thank them for sharing their extensive knowledge and expertise, as well as for actively triaging reports submitted to them. In particular, we would like to thank Mike Fiedler, the current PyPI Safety & Security Engineer, for his documentation and triage efforts before, during, and after the engagement period.

Article Link: Our audit of PyPI | Trail of Bits Blog