Understanding the Root Cause of CVE-2021-21220 – A Chrome Bug from Pwn2Own 2021

In this second blog in the series, ZDI Vulnerability Researcher Hossein Lotfi looks at the root cause of CVE-2021-21220. This bug was used during Pwn2Own Vancouver 2021 to exploit both Chrome and Edge (Chromium) to earn $100,000 at the event. Today’s blog starts with a look at how to trigger the vulnerability and goes on to describe why the bug occurs.

I begin Part 2 of this blog series with a discussion of how to trigger the vulnerability. For clarity, I modified the PoC slightly and came up with the following:

          <img alt="" src="https://images.squarespace-cdn.com/content/v1/5894c269e4fcb5e65a1ed623/31cb14ed-8d0d-4aa6-9dff-2d9bc1c93582/Screen+Shot+2021-12-09+at+10.55.14+AM.png?format=1000w" />

I covered lines 3 and 5 in our first blog. Lines 4 and 6 simply use “console.log” to print data. Let’s see what happens in the first and second lines:

Line 1: Constructs a Uint32Array (a typed array that can hold 32-bit unsigned integers). The array contains just one element, having the value 2^31 (2,147,483,648 in decimal or 0x80000000 in hex). The array is assigned to variable arr.

Line 2: A function called “foo” will take the first element of arr (which is 2^31), XOR it with a constant integer 0, add a constant integer 1, and return the result.

There are some interesting points in these two lines:

        1 - 0x80000000 has its most significant bit set. This is known as the sign bit when handling signed integers.
        2 - XORing any value with zero will return the original value unchanged. If this XOR does not have any effect, then why was it necessary to include it? We will answer this soon.
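As a minimal sketch of what these two lines compute under ordinary JavaScript semantics (no JIT involved), the snippet below evaluates each step separately. The variable names other than arr are mine and are not part of the PoC. JavaScript bitwise operators convert their operands to signed 32-bit integers, so the XOR with 0 leaves the bits unchanged but reinterprets the value as signed:

```javascript
// Plain JavaScript; runs in Node.js or d8 without any special flags.
const arr = new Uint32Array([2 ** 31]);

// The element as stored: an unsigned 32-bit value with the top bit set.
const loaded = arr[0];

// Bitwise XOR converts its operands to signed 32-bit integers first,
// so the same bit pattern is now read as a negative number.
const xored = arr[0] ^ 0;

// The expression from the PoC's function body.
const result = (arr[0] ^ 0) + 1;

console.log(loaded); // 2147483648
console.log(xored);  // -2147483648
console.log(result); // -2147483647
```

The last value is what a correct engine should return for this expression, whether the function is interpreted or JITted.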

Save this PoC as “poc.js” and run it with the following command:

$ ./d8 --allow_natives_syntax '/home/lab/Desktop/poc.js'

It should print the following output:

          <img alt="" src="https://images.squarespace-cdn.com/content/v1/5894c269e4fcb5e65a1ed623/860fc043-806d-4444-8a1d-2dd9346e8b95/Picture1.png?format=1000w" />

Interesting! The results of the interpreted and JITted versions are different, which should not happen. JIT compilation is supposed to speed up the function but should never change its results.

Now that we are here, let’s have a look at the patch, as it may give us some hints as to why this is happening:

          <img alt="" src="https://images.squarespace-cdn.com/content/v1/5894c269e4fcb5e65a1ed623/9c0678b8-0186-4d4f-9044-f9266de2fa16/Picture2.png?format=1000w" />

The only change is inside the function InstructionSelector::VisitChangeInt32ToInt64, found within the file src/compiler/backend/x64/instruction-selector-x64.cc. There is also a nice comment, which lets us make an educated guess. As mentioned in the first blog, a JITted function is compiled to assembly to achieve maximum speed. Before the patch, on the x64 platform, if a signed int32 was loaded into a 64-bit register, the kX64Movsxlq opcode would be selected. Conversely, when an unsigned int32 was loaded into a 64-bit register, the kX64Movl opcode would be used. This choice between two opcodes is intended to ensure that the upper 32 bits of the destination register are set properly by the load: when loading an unsigned 32-bit value, the upper 32 bits in the destination should be set to all zeros, whereas when loading a signed 32-bit value, the upper 32 bits in the destination should all be set to match the sign bit of the source value. After the patch, the kX64Movsxlq opcode is used in all cases. As the function name denotes, it expects a signed int32 input, so the kX64Movsxlq opcode is always the correct choice.

Apparently, though, the PoC somehow managed to provide an unsigned input to this function! How is this possible? This is what we must investigate next.

Deep Blue Sea of Nodes

To find the root cause of this vulnerability, we can pass the “--trace-turbo-graph” argument to d8 to see the generated TurboFan graphs:

./d8 --allow_natives_syntax --trace_turbo_graph '/home/lab/Desktop/poc.js'

As this vulnerability has something to do with the type of the input, it seems like a good idea to first check how the typer assigned types to the nodes. For this purpose, we need to find “Graph after V8.TFTyper” in the output and check its data:

          <img alt="" src="https://images.squarespace-cdn.com/content/v1/5894c269e4fcb5e65a1ed623/f21c0199-b547-4061-a7bb-3b0483c0169a/Picture3.png?format=1000w" />

This is what we see:

LoadTypedElement: This shows loading the element from our typed array. The type is Unsigned32.
SpeculativeNumberBitwiseXor: For the XOR operation. The type is Signed32.
NumberConstant[1]: For the constant number 1.
SpeculativeNumberAdd: For adding 1 to the result of the XOR.

All types make sense. Let’s move on to a later phase called “simplified lowering”:

          <img alt="" src="https://images.squarespace-cdn.com/content/v1/5894c269e4fcb5e65a1ed623/dedf15be-c171-453a-81f1-c2e48bf4eb21/Picture4.png?format=1000w" />

After the simplified lowering phase this becomes:

LoadTypedElement: Type is still Unsigned32.
Word32Xor: Type is still Signed32.
ChangeInt32ToInt64 (#31:Word32Xor): This node is new. It takes the result of the XOR and converts it to Int64. Remember that the patch fixed this vulnerability by changing the InstructionSelector::VisitChangeInt32ToInt64 function. That means this node will be important in our analysis. For now, it seems OK as this node takes a Word32Xor node that is signed.
Int64Constant[1]: For the constant number 1.
Int64Add: For adding 1 to the result of the XOR.

The “--trace-turbo-graph” output shows how the engine optimizes the graph by performing numerous transformations. During the early optimization phase, the execution flow reaches a function called MachineOperatorReducer::ReduceWordNXor within v8/src/compiler/machine-operator-reducer.cc to deal with the XOR operation in our PoC:

          <img alt="" src="https://images.squarespace-cdn.com/content/v1/5894c269e4fcb5e65a1ed623/8146cb21-9980-4df9-9874-f16b2a8e4ba4/Picture5-2.png?format=1000w" />

Let’s have a quick look at the XOR in our PoC again. We XOR arr[0] with 0, and we know that XOR with 0 has no effect on the bits and returns arr[0]. Now check the highlighted section in the picture above. Here the engine checks if the right operand is provably equal to 0 and, if so, it replaces the XOR operation with the left node (arr[0]). In this way, the engine removes the no-op XOR to achieve better speed. How cool! Unfortunately, there is a small problem: the replaced XOR operation had an output type of Signed32, but arr[0] has a type of Unsigned32. The EarlyOptimization phase output shows this clearly:
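The effect of this reduction on the consumer of the XOR can be sketched with a toy graph. This is only an illustration under my own assumptions: the object shapes and the helper names reduceXor and replaceInput are hypothetical and do not correspond to V8's actual data structures or API. Only the node names and types are taken from the graphs discussed in this post.

```javascript
// Toy model of the "x ^ 0 => x" reduction. Node shapes are hypothetical.
const load = { op: 'LoadTypedElement', type: 'Unsigned32', inputs: [] };
const zero = { op: 'Int32Constant', value: 0, type: 'Signed32', inputs: [] };
const xor = { op: 'Word32Xor', type: 'Signed32', inputs: [load, zero] };
const change = { op: 'ChangeInt32ToInt64', inputs: [xor] };

// If the right operand is the constant 0, the XOR is a no-op on the bits,
// so the reducer hands back the left operand as the replacement.
function reduceXor(node) {
  const [left, right] = node.inputs;
  if (right.op === 'Int32Constant' && right.value === 0) return left;
  return node;
}

// Rewire a user so it consumes the replacement instead of the old node.
function replaceInput(user, oldNode, newNode) {
  user.inputs = user.inputs.map((n) => (n === oldNode ? newNode : n));
}

const typeBefore = change.inputs[0].type;
replaceInput(change, xor, reduceXor(xor));
const typeAfter = change.inputs[0].type;

console.log(typeBefore + ' -> ' + typeAfter); // Signed32 -> Unsigned32
```

The bits flowing into ChangeInt32ToInt64 are unchanged, but the signedness information that the XOR node carried is gone.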

          <img alt="" src="https://images.squarespace-cdn.com/content/v1/5894c269e4fcb5e65a1ed623/8dc63ef0-9377-46d3-a008-0d887cf5b461/Picture6.png?format=1000w" />

The nodes now are:

          <img alt="" src="https://images.squarespace-cdn.com/content/v1/5894c269e4fcb5e65a1ed623/900f5117-8fb7-4b7b-b02a-cd761e8ee606/Screen+Shot+2021-12-08+at+7.38.08+PM.png?format=1000w" />

When we compare this output with the output of the simplified lowering phase, we can see two major changes:

         1 - The Word32Xor node is not available anymore. It has been replaced.
         2 - The ChangeInt32ToInt64 (#31:Word32Xor) node has been changed to ChangeInt32ToInt64 (#45:LoadTypedElement). This is where the vulnerability occurs. ChangeInt32ToInt64 needs a Signed32 node. This was ok before, because Word32Xor was signed, but now it gets a LoadTypedElement node, which is unsigned.

As a side note: Now that we know the root cause of this vulnerability, we can develop some variants. For example, we can replace the XOR with a SAR using the “>>” operator (check the “MachineOperatorReducer::ReduceWord64Sar” function) or a SHL using the “<<” operator (check the “MachineOperatorReducer::ReduceWord64Shl” function).
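To see why these operators are candidates, here is a short sketch of their language-level behavior, assuming a shift amount of 0 (my choice for illustration; the text above does not specify one). Like XOR with 0, a shift by 0 leaves the bits alone but yields a signed 32-bit result. The variable names are mine. Whether a given variant actually reaches the vulnerable code path depends on the V8 build and is not verified here.

```javascript
// Plain JavaScript; shows only the language-level results, not JIT behavior.
const values = new Uint32Array([2 ** 31]);

const viaSar = values[0] >> 0;  // signed right shift by 0
const viaShl = values[0] << 0;  // left shift by 0
const viaShr = values[0] >>> 0; // unsigned right shift, shown for contrast

console.log(viaSar); // -2147483648
console.log(viaShl); // -2147483648
console.log(viaShr); // 2147483648
```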

Later, execution reaches the vulnerable function InstructionSelector::VisitChangeInt32ToInt64:

          <img alt="" src="https://images.squarespace-cdn.com/content/v1/5894c269e4fcb5e65a1ed623/87c7396e-3b78-46bd-af01-ef6f79455395/Picture7.png?format=1000w" />

The function checks whether its input is a signed load. Because the reduction replaced the signed Word32Xor input with the unsigned LoadTypedElement, the check fails and kX64Movl is chosen.
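The decision described above can be summarized in a few lines. This is a simplified model based only on the description in this post, not a transcription of the V8 source: the function selectOpcode and its parameters are hypothetical, and only the opcode names come from V8.

```javascript
// Hypothetical model of the opcode choice for a 32-bit load feeding
// ChangeInt32ToInt64. Not V8 code; only the opcode names are real.
function selectOpcode(inputIsSignedLoad, patched) {
  // After the patch: always sign-extend, regardless of the load's signedness.
  if (patched) return 'kX64Movsxlq';
  // Before the patch: trust the signedness of the load.
  return inputIsSignedLoad ? 'kX64Movsxlq' : 'kX64Movl';
}

// The PoC's situation: an unsigned load ends up as the input.
const beforePatch = selectOpcode(false, false);
const afterPatch = selectOpcode(false, true);

console.log(beforePatch); // kX64Movl
console.log(afterPatch);  // kX64Movsxlq
```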

How can this cause a problem? The kX64Movsxlq opcode translates to an Intel movsxd instruction, while the kX64Movl opcode translates to an Intel mov instruction. For a 32-bit source value with the most significant bit not set, there is no difference between the two. However, if the source has a 1 as the most significant bit, they deliver two very different results. Recall that the value stored in the array is 0x80000000, which has the most significant bit set. Let’s illustrate the difference between movsxd and mov by doing a small experiment in x64dbg. We will perform a “movsxd” of the 32-bit value 0x80000000 to rbx and a “mov” of the same 32-bit value 0x80000000 to rcx. Here are the registers before the move instructions:

          <img alt="" src="https://images.squarespace-cdn.com/content/v1/5894c269e4fcb5e65a1ed623/57ab2b13-3b7c-4ad7-8f72-0a28ddcf0d5f/Picture8.png?format=1000w" />

And here are the results after the moves:

          <img alt="" src="https://images.squarespace-cdn.com/content/v1/5894c269e4fcb5e65a1ed623/11af59af-b957-41ed-8c2f-a20da1ca2aee/Picture9.png?format=1000w" />

As you can see, the value of rbx is very different from that of rcx. Unlike the mov instruction, the movsxd instruction sign-extended the value. If the engine chooses the wrong instruction, it may load an incorrect value into a register, causing various problems.
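For readers without a debugger at hand, the same experiment can be approximated in JavaScript by modeling a 64-bit register with BigInt. This is only a model of the two extension behaviors. The helper names movsxd and mov are mine and simply mirror the instruction names.

```javascript
// Model of 32-bit to 64-bit extension using BigInt as a 64-bit register.

// Sign extension: read the low 32 bits as signed, then view as 64 raw bits.
function movsxd(value32) {
  return BigInt.asUintN(64, BigInt.asIntN(32, BigInt(value32)));
}

// Zero extension: keep the low 32 bits, upper 32 bits become zero.
function mov(value32) {
  return BigInt.asUintN(32, BigInt(value32));
}

const hex64 = (v) => '0x' + v.toString(16).padStart(16, '0');

const rbx = movsxd(0x80000000);
const rcx = mov(0x80000000);
console.log('rbx = ' + hex64(rbx)); // rbx = 0xffffffff80000000
console.log('rcx = ' + hex64(rcx)); // rcx = 0x0000000080000000

// Add 1 as the PoC does, then interpret the register as a signed 64-bit value.
const signExtendedPlusOne = BigInt.asIntN(64, rbx + 1n);
const zeroExtendedPlusOne = BigInt.asIntN(64, rcx + 1n);
console.log(signExtendedPlusOne); // -2147483647n
console.log(zeroExtendedPlusOne); // 2147483649n
```

Under this model, sign extension followed by the add gives the value that JavaScript semantics require for the PoC expression, while zero extension gives a different, positive number.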

Before finishing this blog, I would like to clarify one more point. Why is the “add 1” needed? In fact, if you remove it, the vulnerability is no longer triggered, and the PoC does not reach the vulnerable function! Why is that?

To answer this question, we can remove the “add 1” from the PoC and examine the effect on the graph.

First, the graph if the “add 1” is removed:

          <img alt="" src="https://images.squarespace-cdn.com/content/v1/5894c269e4fcb5e65a1ed623/69d12de5-c342-47b7-b324-1be59cf7787d/Screen+Shot+2021-12-08+at+7.43.19+PM.png?format=1000w" />

When the “add 1” is removed, there is no need for a “ChangeInt32ToInt64” node in the graph anymore. Instead, a “ChangeInt32ToTagged” node is used to directly convert the result of the XOR to a tagged value and return.

Compare with the graph of the PoC including the “add 1”:

          <img alt="" src="https://images.squarespace-cdn.com/content/v1/5894c269e4fcb5e65a1ed623/62b9060a-8eb9-4eed-a4c2-f00ef03c025b/Screen+Shot+2021-12-08+at+7.45.18+PM.png?format=1000w" />

By including an “add 1” operation, the result of XOR (which is Signed32) needs to be first converted to int64 using a ChangeInt32ToInt64 node in preparation for the addition. Note that 1 is an Int64Constant. After the add, the result is changed to a tagged value and returned.

Therefore, we conclude that the “add 1” is needed to trigger insertion of a “ChangeInt32ToInt64” node.

Conclusion

In this blog post we identified the root cause of the vulnerability used at Pwn2Own and saw how the contestants chained a series of clever values and operations to trigger an incorrect behavior in the JIT engine. In the final blog in this series, we will explore how this issue was exploited. That blog will be published one week from today.

Until then, you can find me on Twitter at @hosselot and follow the team for the latest in exploit techniques and security patches.
