Pwn2Own Qualcomm DSP

Research By: Slava Makkaveev

Introduction

Snapdragon is a suite of system on a chip (SoC) semiconductor products for mobile devices designed and marketed by Qualcomm Technologies Inc. A single SoC may include multiple CPU cores, an Adreno graphics processing unit (GPU), a Snapdragon wireless modem, a Hexagon Digital Signal Processor (DSP), a Qualcomm Spectra Image Signal Processor (ISP) and other hardware.

Snapdragon product tiers are differentiated by scalable computing resources for the CPU, GPU, and DSP. The lowest tiers might contain only a single Hexagon DSP, whereas the premium tier contains up to four Hexagon DSPs dedicated to specific use cases. For example, the Snapdragon 855 (SM8150) SoC, which is embedded in mobile phones such as Pixel 4, Samsung S10, Xiaomi Mi 9, LG G8, and OnePlus 7, includes a Kryo CPU, an Adreno 640 GPU, and four separate DSPs, each devoted to a specific application space: sensor (sDSP), modem (mDSP), audio (aDSP), and compute (cDSP).

In this blog we examine two DSPs:

  • the cDSP, which is intended for compute-intensive tasks such as image processing, computer vision, neural network-related calculations and camera streaming.
  • the aDSP, which is intended for low-power processing of audio and voice data.

For the purposes of this research, we treat the cDSP and aDSP as a single processing unit (DSP). The security issues that we discovered apply to both.

Communication between the CPU and DSP

FastRPC is the Qualcomm-proprietary Remote Procedure Call (RPC) mechanism used to enable remote function calls between the CPU and DSP. The FastRPC framework follows a typical proxy pattern.

Figure 1: FastRPC flow.

Figure 1 shows the interaction of the FastRPC components:

  1. The user mode process (client) initiates the remote invocation. For example, the native code of an Android application calls one of the stub functions.
  2. The stub is auto-generated code that converts the function call to an RPC message. Generally, the stub code is compiled as a separate native library, which is then linked with the client. The stub code uses the libadsprpc.so and libcdsprpc.so libraries to invoke the DSP RPC driver (/dev/adsprpc-smd or /dev/cdsprpc-smd) on the applications processor (AP) through the relevant ioctls.
  3. The DSP RPC kernel driver receives the remote message invocations, sends the queued message to the DSP RPC framework on the DSP through the Shared Memory Driver (SMD) channel, and then waits for the response.
  4. The DSP RPC framework removes the messages from the queue and dispatches them for processing by a skeleton dynamic library.
  5. The skel is an auto-generated library that unmarshals parameters and calls the target method implementation.
  6. The target method (object) is logic provided by Qualcomm or OEMs that is designed to run on a DSP.

Who can run their own code on DSP?

For security reasons, the DSP is licensed for programming by OEMs and by a limited number of third-party software vendors. The code running on the DSP is signed by Qualcomm. A regular Android application has no permission to execute its own code on the DSP. The exceptions are the Snapdragon 855 and 865 SoCs, on which Qualcomm permits the execution of low-rights, signature-free dynamic shared objects on the cDSP.

It should be noted that Google protects Pixel devices through an SELinux policy that prevents third-party apps and the adb shell from accessing the DSP RPC drivers.

The publicly available Hexagon SDK compiles the C/C++ source code of DSP objects into Hexagon (QDSP6) machine code for execution on the DSP. Stub and skel code are generated automatically based on Interface Definition Language (IDL) modules prepared by the developer. Qualcomm IDL is used to define interfaces across memory protection and processor boundaries. IDL exposes only what an object does, not where it resides or the programming language in which it is implemented.

An Android application developer can implement a custom library for the DSP, but cannot run it there without restrictions. Only prebuilt DSP libraries can be freely invoked by an Android app.

Who manages the DSP?

QuRT is a Qualcomm-proprietary multithreaded Real Time OS (RTOS) managing the Hexagon DSP. The integrity of QuRT is trusted by Qualcomm’s Secure Execution Environment (QSEE). The QuRT executable binary (separate for aDSP and cDSP) is signed and split into several files in the same way as any other trusted application on Qualcomm devices. Its default location is the /vendor/firmware directory.

For each Android process initiating the remote invocation, QuRT creates a separate process on the DSP. The special shell process (/vendor/dsp/fastrpc_shell_0 for aDSP and /vendor/dsp/fastrpc_shell_3 for cDSP) is loaded on the DSP when a user process is spawned. The shell is responsible for invocation of the skeleton and object libraries. In addition, it implements the DSP RPC framework providing the API that may be required for the skel and object libraries.

The DSP software architecture provides different protection domains (PD) to ensure the stability of the kernel software. There are three protection domains in the DSP:

  • Kernel – Has access to all memory of all PDs.
  • Guest OS – Has access to the memory of its own PD, the memory of the User PD, and some system registers.
  • User – Has access only to the memory of its own PD.

Signature-free dynamic shared objects are run inside an Unsigned PD, which is the user PD limited in its access to underlying DSP drivers and thread priorities. An Unsigned PD is designed to support only general computing applications.

Object libraries as well as the FastRPC shell are run in the User PD.

Skipping stub code from the FastRPC flow

libadsprpc.so and libcdsprpc.so libraries are responsible for communication with DSP RPC drivers. These libraries export two functions that are interesting for research:

  • int remote_handle_open(const char* name, remote_handle *ph). This function opens a remote session between the caller process on AP and a new FastRPC shell process on the DSP. This session is used for communication with a skeleton library indicated as the first argument.
  • int remote_handle_invoke(remote_handle h, uint32_t scalars, remote_arg *pra). This function invokes the skeleton library’s exported methods. The session handle should be passed as the first argument.

Using these two functions, a client can execute the DSP methods implemented in any skeleton library. The stub code provided by Qualcomm or OEMs can be skipped from the chain.

Figure 2: Invoking the DSP directly.

Let’s take a look at the second and third arguments of the remote_handle_invoke function, which encode the target method and its arguments.

scalars is a word that contains the following metadata information:

  • Method index and attribute (the highest byte, 0xFF000000 mask).
  • Number of input arguments (0x00FF0000 mask).
  • Number of output arguments (0x0000FF00 mask).
  • Number of input and output handles (0x000000FF mask, four bits for the input and four bits for the output). On modern phones, a DSP invocation fails if this byte is not equal to zero.
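The byte layout listed above can be sketched as a pair of small helpers. This is only an illustration built from the masks in the list: the names scalars_pack, scalars_unpack and scalars_fields are hypothetical, and the top byte is kept as a single "method" field because the text does not describe how the index and attribute share it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of the scalars word, one field per byte. */
typedef struct {
    uint8_t method;     /* method index and attribute (0xFF000000) */
    uint8_t n_in;       /* number of input arguments  (0x00FF0000) */
    uint8_t n_out;      /* number of output arguments (0x0000FF00) */
    uint8_t n_handles;  /* handle counts, 4 + 4 bits  (0x000000FF) */
} scalars_fields;

/* Build a scalars word. The handle byte is left at zero because,
 * per the text, modern phones reject any other value. */
static uint32_t scalars_pack(unsigned method, unsigned n_in, unsigned n_out)
{
    assert(method <= 0xFF && n_in <= 0xFF && n_out <= 0xFF);
    return ((uint32_t)method << 24) |
           ((uint32_t)n_in << 16) |
           ((uint32_t)n_out << 8);
}

/* Split a scalars word back into its four bytes. */
static scalars_fields scalars_unpack(uint32_t scalars)
{
    scalars_fields f;
    f.method    = (uint8_t)((scalars & 0xFF000000u) >> 24);
    f.n_in      = (uint8_t)((scalars & 0x00FF0000u) >> 16);
    f.n_out     = (uint8_t)((scalars & 0x0000FF00u) >> 8);
    f.n_handles = (uint8_t)(scalars & 0x000000FFu);
    return f;
}
```

For instance, method 8 with two input and two output arguments packs to 0x08020200, the value used later in the fuzzing input example.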

pra is a pointer to an array of arguments (remote_arg entries) of the target method. The order of the arguments is the following: input arguments, output arguments, input handles, and output handles.

Each input and output argument is converted to a universal remote_buf entry.

It should be noted that if we prepare more remote_arg array entries than the target method requires, the extra entries are simply ignored by the skeleton library.
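A minimal sketch of how such an argument array might be laid out is shown here. The remote_buf and remote_arg definitions are simplified stand-ins (an assumption covering only the buffer case, with a pointer field named pv), and pra_fill is a hypothetical helper, not an SDK function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the SDK types (assumption: buffer case only). */
typedef struct {
    void  *pv;    /* start of the buffer */
    size_t nLen;  /* length of the buffer in bytes */
} remote_buf;

typedef union {
    remote_buf buf;  /* input or output buffer */
    uint32_t   h;    /* handle, unused in this sketch */
} remote_arg;

/* Hypothetical helper: fill pra[] in the order the text describes,
 * input buffers first and output buffers after them.
 * Returns the number of entries written. */
static size_t pra_fill(remote_arg *pra,
                       const remote_buf *in, size_t n_in,
                       const remote_buf *out, size_t n_out)
{
    size_t i, k = 0;
    assert(pra != NULL);
    for (i = 0; i < n_in; i++)
        pra[k++].buf = in[i];
    for (i = 0; i < n_out; i++)
        pra[k++].buf = out[i];
    return k;
}
```

The caller is expected to size pra for at least n_in + n_out entries; handle entries would follow the buffers but are left out here.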

scalars and pra parameters are transferred “as is” through the DSP RPC driver and DSP RPC framework, and are used as the first and the second arguments of the special invoke function provided by each skeleton library. For example, libfastcvadsp_skel.so library provides the fastcvadsp_skel_invoke invoke function. The invoke function is only responsible for calling appropriate skel methods by their index. Each skel method by itself verifies received remote arguments, unmarshals the remote_bufs to regular types, and calls the object method.

As you can see, to invoke a method from a skel library, you only need to know its index and wrap each argument in a remote_buf structure. Because we do not have to provide the name of the invoked function or the types and number of its arguments to perform the call, skeleton libraries are a very convenient target for fuzzing.

Downgrade vulnerability

There are a lot of skeleton libraries pre-installed by Qualcomm on Android phones. The vast majority of them are proprietary. However, there are open source examples like libdspCV_skel.so and libhexagon_nn_skel.so.

Many skeleton libraries such as libfastcvadsp_skel.so and libscveBlobDescriptor_skel.so can be found on almost all Android devices. However, libraries like libVC1DecDsp_skel.so and libsysmon_cdsp_skel.so are present only on modern Snapdragon SoCs.

There are libraries implemented by OEMs and only used on devices of specific vendors. For example, libedge_smooth_skel.so can be found on Samsung S7 Edge, and libdepthmap_skel.so is on OnePlus 6T devices.

Generally, all skel libraries are located in the /dsp, /vendor/dsp, or /vendor/lib/rfsa/adsp directories. By default, the remote_handle_open function scans exactly these paths. In addition, there is an environment variable, ADSP_LIBRARY_PATH, to which a new search path can be added.

As mentioned previously, all DSP libraries are signed and cannot be patched. However, any Android application can bundle a Qualcomm-signed skeleton library in its assets, extract it to the app’s data directory, add the path to the beginning of ADSP_LIBRARY_PATH, and then open a remote session. The library is successfully loaded on the DSP because its signature is correct.
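The search-path step can be sketched as a small string helper. This is an assumption-laden illustration: prepend_search_dir is a hypothetical name, and the use of ';' as the separator between ADSP_LIBRARY_PATH entries is assumed rather than stated in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "<dir>;<old>" (or just "<dir>" when there is
 * no previous value) into out. Assumption: ';' separates path entries.
 * Returns 0 on success, -1 if the result does not fit in out. */
static int prepend_search_dir(char *out, size_t out_len,
                              const char *dir, const char *old)
{
    int n;
    assert(out != NULL && dir != NULL);
    if (old != NULL && old[0] != '\0')
        n = snprintf(out, out_len, "%s;%s", dir, old);
    else
        n = snprintf(out, out_len, "%s", dir);
    return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}
```

A process would then pass the result to setenv("ADSP_LIBRARY_PATH", value, 1) before opening the remote session, so that its own directory is searched first.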

Because no version check is performed when a skeleton library is loaded, it is possible to run a very old skel library with a known 1-day vulnerability on the DSP. Even if an updated skeleton library already exists on the device, the old version can be loaded simply by placing its location in ADSP_LIBRARY_PATH before the path of the original file. In this way, an attacker can bypass any DSP patch. In addition, by analyzing DSP software patches, an attacker can identify an internally fixed vulnerability in a library and then exploit it by loading the unpatched version.

Because there is no list of skeleton libraries that are approved or denied for a given device, it is possible to run a library intended for one device (for example, Sony Xperia) on any other device (for example, Samsung). This means that a vulnerability discovered in one of the OEM libraries compromises all Qualcomm-based Android devices.

Feedback-based fuzzing of Hexagon libraries

DSP libraries are proprietary Hexagon ELFs. The easiest way to instrument a Hexagon executable is to use the open-source Quick Emulator (QEMU). Hexagon instruction set support was added to QEMU only at the end of 2019. We fixed a lot of bugs to be able to run real DSP libraries in the user mode of the emulator.

American fuzzy lop (AFL) in combination with QEMU was used to fuzz the skeleton and object DSP libraries on an Ubuntu PC.

To execute library code on the emulator, we prepared a simple program (a Hexagon ELF binary) which is responsible for the following:

  1. Parse a data file received as the first command line parameter into the scalars word and remote_arg array.
  2. dlopen a skeleton library specified in the second command line parameter. The library may depend on other skeleton and object libraries. For example, libfastcvadsp_skel.so depends on libapps_mem_heap.so, libdspCV_skel.so and libfastcvadsp.so. All these libraries can be extracted from firmware or pulled from a real device.
  3. Call the invoke function by its address by providing scalars and a pointer to remote_arg array as arguments. For example, fastcvadsp_skel_invoke is the start point for fuzzing of libfastcvadsp_skel.so library.

We used the following input file format for our program:

  1. scalars value (4 bytes). In the example presented in Figure 3, scalars is equal to 0x08020200, which means calling method number 8 with two input and two output arguments.
  2. Size of the input arguments (4 bytes for each argument): 0x10 and 0x20.
  3. Size of the output arguments (4 bytes for each argument): 0x80200 and 0x1000.
  4. Value of the input arguments. In the example, the value of the first argument is 0x10 bytes of 0x11 and the value of the second argument is 0x20 bytes of 0x22.

Figure 3: An input data file for fuzzing DSP libraries.

For each output argument, we allocate memory of the indicated size and fill it with the value 0x1F.
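The file format described above can be parsed roughly as follows. This is a sketch under stated assumptions, not the harness used in the research: the 4-byte fields are assumed to be little-endian, MAX_ARGS is an arbitrary cap, and fuzz_case and parse_case are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 16  /* hypothetical cap for this sketch */

typedef struct {
    uint32_t scalars;
    uint32_t n_in, n_out;
    uint32_t len[MAX_ARGS];        /* sizes: inputs first, then outputs */
    unsigned char *buf[MAX_ARGS];  /* heap buffers; the caller frees them */
} fuzz_case;

/* Read a 32-bit value, assuming little-endian byte order. */
static uint32_t rd32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 on a malformed or truncated file. */
static int parse_case(const unsigned char *data, size_t size, fuzz_case *c)
{
    size_t off = 4, i, total;
    assert(data != NULL && c != NULL);
    memset(c, 0, sizeof *c);
    if (size < 4)
        return -1;
    c->scalars = rd32le(data);
    c->n_in  = (c->scalars >> 16) & 0xFF;
    c->n_out = (c->scalars >> 8) & 0xFF;
    total = (size_t)c->n_in + c->n_out;
    if (total > MAX_ARGS || size - off < total * 4)
        return -1;
    for (i = 0; i < total; i++, off += 4)
        c->len[i] = rd32le(data + off);
    for (i = 0; i < total; i++) {
        if (i < c->n_in && size - off < c->len[i])
            goto fail;                      /* input data is truncated */
        c->buf[i] = malloc(c->len[i] ? c->len[i] : 1);
        if (c->buf[i] == NULL)
            goto fail;
        if (i < c->n_in) {                  /* input: copy from the file */
            memcpy(c->buf[i], data + off, c->len[i]);
            off += c->len[i];
        } else {                            /* output: prefill with 0x1F */
            memset(c->buf[i], 0x1F, c->len[i]);
        }
    }
    return 0;
fail:
    for (i = 0; i < total; i++) {
        free(c->buf[i]);
        c->buf[i] = NULL;
    }
    return -1;
}
```

The resulting buffers and lengths map directly onto remote_arg entries, with scalars passed through unchanged to the invoke function.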

Most skeleton libraries make heavy use of the DSP framework and system calls. Our simple program cannot handle such requests. Therefore, we had to load QuRT on the emulator before executing the rest of the code. The easiest way to do so is to use not the real QuRT OS but its “lite” version, runelf.pbn, which Qualcomm adapted for execution on a Hexagon simulator and included in the Hexagon SDK.

The AFL fuzzer mutates the content of the data file and triggers execution of runelf.pbn on the emulator. QuRT loads the prepared ELF binary, which then calls a target skeleton library. QEMU returns a code coverage matrix to AFL after execution of the test case.

Figure 4: DSP library fuzzing scheme.

We were surprised by the fuzzing result. Crashes were found in all DSP libraries that we chose to fuzz. Hundreds of unique crashes were detected in the libfastcvadsp_skel.so library alone.

Interestingly, most issues were discovered in the skeleton libraries rather than in the object libraries. This means that the Hexagon SDK produces vulnerable code.

Automatically generated code

Let’s take a look at the open source hexagon_nn library, which is part of the Hexagon SDK 3.5.1. This library exports a lot of functions intended for neural network-related calculations.

The Hexagon SDK automatically generates the hexagon_nn_stub.c stub and hexagon_nn_skel.c skel modules when the library is compiled. Some security issues can be easily detected by manually reviewing these modules. We will show only two of them.

Marshaling a string (char *) argument

int hexagon_nn_op_name_to_id(const char* name, unsigned int* node_id) function requires one input (name) and one output (node_id) argument. The following stub code is generated by the SDK for marshaling these two arguments:

We can see that in addition to the existing two arguments, the third remote_arg entry was created at the beginning of the _pra array. This special _pra[0] argument holds the length of the name string.

The name itself is saved in the second remote_arg entry (_praIn[0]), where its length will be stored again, but this time in the _praIn[0].buf.nLen field.

The skel code extracts both these lengths and compares them as signed int values. This is the bug. An attacker can ignore the stub code and write a negative value (greater than or equal to 0x80000000) into the first remote_arg entry, bypassing this validation. This fake length is then used as a memory offset and causes a crash (read out of the heap boundary).

The same code is generated for all object functions that require string arguments.
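The effect of comparing the two lengths as signed values can be illustrated with a short sketch. It is not the generated skel code: the function names are hypothetical, and the example only reproduces the class of check described above next to an unsigned variant.

```c
#include <assert.h>
#include <stdint.h>

/* Bug class: both lengths are cast to signed int before the comparison.
 * (Converting values >= 0x80000000 to a signed type is
 * implementation-defined in C; on the usual two's-complement targets
 * it yields a negative number.) */
static int length_ok_signed(uint32_t buf_len, uint32_t claimed_len)
{
    return (int32_t)buf_len >= (int32_t)claimed_len;
}

/* Unsigned comparison keeps 0x80000000 and above as very large lengths. */
static int length_ok_unsigned(uint32_t buf_len, uint32_t claimed_len)
{
    return buf_len >= claimed_len;
}

/* Small self-check showing the difference for a 16-byte buffer. */
static void length_check_demo(void)
{
    assert(length_ok_signed(16, 0x80000000u));     /* bogus length accepted */
    assert(!length_ok_unsigned(16, 0x80000000u));  /* bogus length rejected */
    assert(length_ok_unsigned(16, 16));            /* honest length accepted */
}
```

With the signed variant, a claimed length of 0x80000000 looks smaller than any real buffer, so the validation passes even though the value is later used as an offset.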

Marshaling an in-out buffer

Let’s take a look at the int hexagon_nn_snpprint(hexagon_nn_nn_id id, unsigned char* buf, int bufLen) function that requires a buffer and its length as arguments. The buffer is used for both input and output data. Therefore, it is split into two separate buffers (the input and the output buffers) in the stub code. Once again, lengths of both buffers (_in1Len and _rout1Len) are stored in the additional remote_arg entry (_pra[0]).

The skel function copies (using _MEMMOVEIF macro) the input buffer to the output buffer before calling the object function. The size of data to be copied is the length of the input buffer that was held in the special remote_arg entry (_pra[0]).

An attacker controls this value. All verification checks can simply be bypassed by using a negative input buffer’s length.

Casting to the signed int type when checking buffer boundaries is a bug that leads to a heap overflow.
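For contrast, a bounds check for the in-to-out copy that keeps the lengths unsigned might look like the sketch below. The function copy_in_to_out is a hypothetical stand-in for the copy step, not the SDK's _MEMMOVEIF macro.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copies the input part of an in-out buffer into the output part.
 * Returns 0 on success, -1 if in_len does not fit into out_cap.
 * The comparison is unsigned on purpose: a length such as 0xFFFFFFF0
 * is rejected here, whereas a signed comparison would see it as a
 * negative number and let it through to the copy. */
static int copy_in_to_out(unsigned char *out, uint32_t out_cap,
                          const unsigned char *in, uint32_t in_len)
{
    assert(out != NULL && in != NULL);
    if (in_len > out_cap)
        return -1;
    memmove(out, in, in_len);
    return 0;
}
```

The only difference from the vulnerable pattern is the type used in the comparison, which is why the issue is easy to miss in generated code.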

To summarize, the automatically generated code injects vulnerabilities into the libraries of Qualcomm, OEMs and all other third-party developers who use the Hexagon SDK. Dozens of DSP skeleton libraries pre-installed on Android smartphones are vulnerable due to serious bugs in the SDK.

Exploiting a DSP vulnerability

Let’s take a look at one of many vulnerabilities discovered in proprietary DSP skeleton libraries and try to prepare “read-what-where” and “write-what-where” primitives.

The libfastcvadsp_skel.so library can be found on most Android devices. In the example below, we use version 1.7.1 of the library, extracted from the Sony Xperia XZ Premium device. A malicious Android application can cause the libfastcvadsp_skel.so library to crash by providing specially crafted arguments to the remote_handle_invoke function. The data file in Figure 5 shows an example of such crafted arguments.

Figure 5: The data file to cause the libfastcvadsp_skel.so to crash

As you can see, the 0x3F method is called and provided with one input and three output arguments. The content of the input argument begins with byte 0x14 and contains the following major fields:

  • Red 0x02 shows how many half-words to read (the size).
  • Yellow 0x44332211 shows what to read (the source). This value is the offset relative to the beginning of the first output argument in the DSP heap. Using this offset, we control the start address for reading. The offset can be as large as we want and can even be negative.
  • Cyan 0x04 shows where to read (the destination). The value is also the offset.

The crash occurs because the source address is invalid.

Figure 6: The crash dump.

The abbreviated POC code for the read primitive is presented below.

Input arguments are always located in the DSP heap right after output arguments. Therefore, in the writing primitive, we need to shift the source address according to the length of the first output argument (all other arguments are empty).

An attacker can manipulate source and destination offsets for reading and writing in the address space of a DSP process (User PD). The offset between the first output argument and libfastcvadsp_skel.so library in memory is a constant value. It is easy to find a pointer in a data segment of a skel or object library to trigger a call. For security reasons, we will not publish the rest of the POC of code execution in the DSP process.

Summary of DSP user domain research

During this security research of skeleton and object libraries that are part of the Qualcomm DSP user domain, we discovered two global security issues:

  • Lack of version control of DSP libraries. This allows a malicious Android application to perform a downgrade attack and run vulnerable libraries on the DSP.
  • Bugs in the Hexagon SDK. These led to hundreds of hidden vulnerabilities in code owned by Qualcomm and by mobile vendors. Almost all DSP skeleton libraries embedded in Snapdragon-based smartphones are vulnerable to attack due to issues in the Hexagon SDK.

We reported around 400 unique crashes in dozens of DSP libraries to Qualcomm, including the following:

  • libfastcvadsp_skel.so
  • libdepthmap_skel.so
  • libscveT2T_skel.so
  • libscveBlobDescriptor_skel.so
  • libVC1DecDsp_skel.so
  • libcamera_nn_skel.so
  • libscveCleverCapture_skel.so
  • libscveTextReco_skel.so
  • libhexagon_nn_skel.so
  • libadsp_fd_skel.so
  • libqvr_adsp_driver_skel.so
  • libscveFaceRecognition_skel.so
  • libthread_blur_skel.so

To demonstrate, we exploited one of the discovered vulnerabilities and obtained the ability to execute unsigned code on DSP of Snapdragon-based devices, including Samsung, Pixel, LG, Xiaomi, OnePlus, HTC and Sony mobile phones.

An Android application that has access to the user domain of a DSP gains the following capabilities:

  • Trigger a DSP kernel panic and reboot the mobile device.
  • Hide malicious code. Antiviruses do not scan the Hexagon instruction set.
  • Intercept camera data. The cDSP is responsible for preprocessing streaming video from camera sensors, so an attacker can take over this flow.
  • Access DSP kernel drivers. A vulnerability in a driver can expand the app’s privileges to the rights of the guest OS or DSP kernel.

DSP drivers

QuRT OS implements its own device driver model called QuRT Driver Invocation (QDI). QDI is inaccessible from the Android API. As in POSIX, QDI device drivers operate with higher privileges than the user code that requests driver services. QDI provides a simple driver invocation API that hides all implementation details associated with privileged mode.

The libqurt.a library, which is part of Hexagon SDK, contains the QDI infrastructure. The FastRPC shell is linked statically with the library.

Dozens of QDI drivers can be found in the QuRT executable binary. They are usually named as /dev/.., /qdi/.., /power/.., /drv/.., /adsp/.. or /qos/... The int qurt_qdi_open(const char* drv) function can be used to gain access to a QDI driver. A small integer device handle is returned. This is a direct parallel to the POSIX file descriptors.

QDI exposes a single macro as its essential user-visible API. This qurt_qdi_handle_invoke macro is responsible for all generic driver operations. In fact, qurt_qdi_open is just a special case of this macro. These are the macro arguments:

  1. QDI handle or one of the predefined constant values.
  2. Method number that defines the requested action. In the SDK header files, we see that:
    • Methods 1 and 2 are reserved for name registration and name lookup.
    • 3 – 31 are reserved for POSIX-type operations on open handles.
    • 32 – 127 are reserved for the QDI infrastructure.
    • 128 – 255 are reserved for the use of automatically generated methods such as might be generated by an IDL.
    • 256 and higher are private method numbers. Drivers can use these methods as they wish.
  3. Zero to nine optional 32-bit arguments.
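The method number ranges listed above can be captured in a small classifier. This is a sketch with hypothetical names (qdi_range, qdi_classify); only the numeric boundaries come from the list.

```c
#include <assert.h>

/* Hypothetical labels for the method number ranges listed above. */
typedef enum {
    QDI_RANGE_INVALID,  /* zero or negative: not covered by the list */
    QDI_RANGE_NAME,     /* 1 and 2: name registration and lookup */
    QDI_RANGE_POSIX,    /* 3 to 31: POSIX-type operations */
    QDI_RANGE_INFRA,    /* 32 to 127: QDI infrastructure */
    QDI_RANGE_IDL,      /* 128 to 255: automatically generated methods */
    QDI_RANGE_PRIVATE   /* 256 and higher: driver-private methods */
} qdi_range;

static qdi_range qdi_classify(int method)
{
    if (method == 1 || method == 2)
        return QDI_RANGE_NAME;
    if (method >= 3 && method <= 31)
        return QDI_RANGE_POSIX;
    if (method >= 32 && method <= 127)
        return QDI_RANGE_INFRA;
    if (method >= 128 && method <= 255)
        return QDI_RANGE_IDL;
    if (method >= 256)
        return QDI_RANGE_PRIVATE;
    return QDI_RANGE_INVALID;
}

/* Boundary examples for each range. */
static void qdi_classify_demo(void)
{
    assert(qdi_classify(2) == QDI_RANGE_NAME);
    assert(qdi_classify(31) == QDI_RANGE_POSIX);
    assert(qdi_classify(127) == QDI_RANGE_INFRA);
    assert(qdi_classify(255) == QDI_RANGE_IDL);
    assert(qdi_classify(256) == QDI_RANGE_PRIVATE);
}
```

The private range (256 and higher) is where driver-specific logic lives, since drivers are free to define those methods as they wish.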

The qurt_qdi_handle_invoke macro calls the relevant driver invocation function, which implements the main driver logic, and passes it the specified method number and optional arguments.

This is an example of a QDI driver invocation from the user PD code:

A QDI driver uses the int qurt_qdi_devname_register(const char *name, qurt_qdi_obj_t *opener) API function to register itself in the QuRT. The driver provides its name and a pointer to an opener object as arguments.

The first field of the opener object is the driver invocation function. QuRT calls this function to handle driver requests from the user PD or another driver, and provides the following arguments:

  • QDI handle which represents the client that sent the QDI request.
  • The opener object on which this QDI request is made.
  • QDI method provided by the caller.
  • Nine optional arguments provided by the caller.

In general, a driver invocation function is a switch statement over the QDI method ID. Each method can use a different number of the provided arguments. The argument type is qurt_qdi_arg_t.
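The general shape of such an invocation function can be sketched with mock types. Everything here is a hypothetical stand-in: qdi_arg_t, struct qdi_obj, the DEMO_* method IDs and demo_invoke are invented for illustration, and the nine optional arguments are passed as an array for brevity rather than as the real QDI calling convention.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for qurt_qdi_arg_t: a number or a pointer. */
typedef union {
    void *ptr;
    int   num;
} qdi_arg_t;

struct qdi_obj;
typedef int (*qdi_invoke_fn)(int client, struct qdi_obj *obj, int method,
                             const qdi_arg_t args[9]);

/* Mock opener object: the invocation function is the first field. */
struct qdi_obj {
    qdi_invoke_fn invoke;
    int counter;          /* hypothetical driver state */
};

enum { DEMO_GET = 256, DEMO_ADD = 257 };  /* hypothetical private methods */

/* A driver invocation function as a switch over the method ID. */
static int demo_invoke(int client, struct qdi_obj *obj, int method,
                       const qdi_arg_t args[9])
{
    (void)client;                 /* the client handle is unused here */
    assert(obj != NULL && args != NULL);
    switch (method) {
    case DEMO_GET:                /* uses none of the arguments */
        return obj->counter;
    case DEMO_ADD:                /* uses only the first argument */
        obj->counter += args[0].num;
        return 0;
    default:                      /* unknown method ID */
        return -1;
    }
}
```

Because dispatch depends only on the numeric ID, a caller can reach any case without knowing method names or argument types.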

Note that the driver invocation function is a good target for fuzzing-based vulnerability research because the methods are identified by ID but not by name, and the caller does not need to know the exact number of arguments and their actual type to invoke the driver method.

Feedback-based fuzzing of QDI drivers

To fuzz QDI drivers on an Ubuntu PC, we used the same combination of QEMU Hexagon and AFL as for fuzzing DSP libraries. However, instead of the skel_loader program, we implemented another Hexagon ELF binary qdi_exec which is responsible for these actions:

  1. Parse a data file received as the first command line parameter into the QDI method ID and an array of nine arguments for the driver invocation function.
  2. Call the driver invocation function by its address, which is specified in the second command line parameter, and provide the QDI method ID and the arguments decoded from the data file.

We used the following input file format for the qdi_exec program:

  • The header (4 bytes). It contains three valuable fields:
    • QDI method ID (10 low bits). In the example in Figure 7, it is 0x01.
    • Number of arguments (4 bits). In the example, only one argument was used. The remaining eight arguments are considered to be zero.
    • Mask of argument types (9 bits). Each argument is either a number or a pointer to a buffer. In the mask, each argument is represented by one bit. A value of zero means that the argument is a number, and a value of one means that the argument is a buffer.
  • Size of the buffer arguments (4 bytes for each argument). In the example, the /dev/diag string with a length of 0x0A is used as the argument.
  • Content of the buffer arguments.

Figure 7: An input data file for fuzzing QDI drivers.
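The header fields described above can be decoded as in the following sketch. The text only fixes the method ID in the 10 low bits; placing the argument count in the next 4 bits and the type mask in the 9 bits after that is an assumption, and qdi_header and its helpers are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoded view of the 4-byte header. */
typedef struct {
    unsigned method;    /* QDI method ID, bits 0 to 9 */
    unsigned n_args;    /* number of arguments, bits 10 to 13 (assumed) */
    unsigned buf_mask;  /* argument type mask, bits 14 to 22 (assumed) */
} qdi_header;

static qdi_header qdi_header_decode(uint32_t h)
{
    qdi_header d;
    d.method   = h & 0x3FFu;
    d.n_args   = (h >> 10) & 0xFu;
    d.buf_mask = (h >> 14) & 0x1FFu;
    return d;
}

/* Returns 1 if argument i (0 to 8) is a buffer, 0 if it is a number. */
static int qdi_arg_is_buffer(qdi_header d, unsigned i)
{
    assert(i < 9);
    return (int)((d.buf_mask >> i) & 1u);
}
```

Under this assumed layout, a header of 0x4401 would describe method 0x01 with one argument, and that argument is a buffer.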

QDI drivers are implemented as part of the QuRT ELF. They were not included by Qualcomm in the runelf.pbn version of QuRT which we ran on the emulator along with our program. Therefore, we had to patch the runelf.pbn ELF file as follows:

  1. Append program segments of a QuRT ELF that is intended for a real device in the runelf.pbn. We used the aDSP binary extracted from the Pixel 4 device.
  2. Redirect malloc and memcpy kernel functions used by QDI drivers to their user-mode implementation. Kernel memory functions limit some transfers between user and kernel spaces.

Figure 8: QDI driver fuzzing scheme.

The AFL fuzzer mutates the content of the data file and triggers the execution of the patched runelf.pbn on the emulator. The runelf.pbn loads our qdi_exec program, which directly calls a QDI driver invocation function.

We found the starting addresses of QDI driver invocation functions manually by reverse-engineering the QuRT binary. The opener object is located in the code next to the driver name.

The fuzzer found many crashes in a dozen QDI drivers built into the Snapdragon 855 aDSP. Most of them are applicable for the cDSP as well.

Exploiting vulnerabilities in QDI drivers

Any failure in a QDI driver can be used to cause a DSP kernel panic and reboot the mobile device. For example, each of the lines of code below causes a DSP panic and can be used for a DoS attack on the device.

For research purposes, we successfully exploited several arbitrary kernel read and write vulnerabilities in the /dev/i2c QDI driver and two code execution vulnerabilities in the /dev/glink QDI driver. For security reasons, we cannot publish the POC code, but we do note that the exploitation is quite simple. This is an example of the reading primitive:

A malicious Android application can use the discovered vulnerabilities in QDI drivers along with the described vulnerabilities in DSP libraries of the user PD to execute custom code in the context of the DSP guest OS.

Requesting Android services from the guest OS PD

What happens if we try to open an Android-related file from the DSP guest OS code? The answer is that QuRT redirects our request to a special Android daemon. As you can see in Figure 9, on Snapdragon 855 devices, there are two aDSP daemons and one cDSP daemon that operate with different privileges.

Figure 9: DSP Android daemons.

On a Pixel 4 device, startup commands for these daemons can be found in the init.sm8150.rc file.

Figure 10: Pixel 4 init.sm8150.rc init file.

These highly privileged vendor.adsprpcd and vendor.cdsprpcd daemons handle DSP guest OS requests. They operate as the system user but at the same time they are very limited by SELinux. u:r:adsprpcd:s0 and u:r:cdsprpcd:s0 contexts have access only to DSP-related directories and objects.

Conclusion

The aDSP and cDSP subsystems are very promising areas for security research. First, the DSP is accessible for invocations from third-party Android applications. Second, the DSP processes personal information such as video and voice data that passes through the device’s sensors. Third, there are many security issues in the DSP components, as we presented in this blog.

Qualcomm assigned CVE-2020-11201, CVE-2020-11202, CVE-2020-11206, CVE-2020-11207, CVE-2020-11208 and CVE-2020-11209 to the disclosed DSP vulnerabilities. For the vulnerabilities discovered in QDI drivers, Qualcomm decided not to assign CVEs. All issues have been successfully fixed with the November 2020 Qualcomm Security Patch.

For research purposes, we exploited a few of the discovered vulnerabilities and gained the ability to execute privileged code on aDSP and cDSP of all Snapdragon-based mobile devices.

//research.checkpoint.com/wp-content/uploads/2021/04/DSP.mp4

The post Pwn2Own Qualcomm DSP appeared first on Check Point Research.
