While using the decompiler, you may have noticed an item named “Split expression” in the context menu. What does it do, and where can it be useful? Let’s look at two examples where it can be applied.
Structure field initialization
Modern compilers perform many optimizations to speed up code execution. One of them is merging two or more adjacent memory stores or loads into a single wide one. This often happens when writing to nearby structure fields.
For example, when you decompile a macOS program which uses blocks and use our Objective-C analysis plugin to analyze the supporting code in a function, you may observe pseudocode similar to the following:
block.isa = _NSConcreteStackBlock;
*(_QWORD *)&block.flags = 3254779904LL;
block.invoke = sub_10000A159;
block.descriptor = &stru_10001E0E8;
block.lvar1 = self;
The block variable uses a structure created by the plugin which looks like this:
struct Block_layout_10000A088
{
  void *isa;
  int32_t flags;
  int32_t reserved;
  void (__cdecl *invoke)(Block_layout_10000A088 *block);
  Block_descriptor_1 *descriptor;
  _QWORD lvar1;
};
As you can see, the compiler decided to initialize the two 32-bit fields flags and reserved in one go using a single 64-bit store. Although technically correct, the pseudocode looks somewhat ugly and is not easy to understand at a glance. To tell the decompiler that this write should be treated as two separate ones, right-click the assignment and choose “Split expression”.
Once the pseudocode is refreshed, two separate assignments are displayed:
block.isa = _NSConcreteStackBlock;
block.flags = 0xC2000000;
block.reserved = 0;
block.invoke = sub_10000A159;
block.descriptor = &stru_10001E0E8;
block.lvar1 = self;
The new 32-bit constant can then, for example, be converted to hex or to a set of flags using a custom enum.
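To illustrate what such a set of flags could look like, here is a minimal C sketch that decodes the 32-bit value into symbolic names. The bit names and positions are an assumption based on Apple's Block runtime headers and should be checked against the headers for your target; describe_block_flags and the table are hypothetical helpers, not part of the decompiler or the plugin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed bit assignments (verify against the Block runtime headers). */
#define BLOCK_HAS_COPY_DISPOSE    (UINT32_C(1) << 25)
#define BLOCK_HAS_SIGNATURE       (UINT32_C(1) << 30)
#define BLOCK_HAS_EXTENDED_LAYOUT (UINT32_C(1) << 31)

struct flag_name {
    uint32_t bit;
    const char *name;
};

/* Hypothetical lookup table, highest bit first. */
static const struct flag_name k_block_flags[] = {
    { BLOCK_HAS_EXTENDED_LAYOUT, "BLOCK_HAS_EXTENDED_LAYOUT" },
    { BLOCK_HAS_SIGNATURE,       "BLOCK_HAS_SIGNATURE" },
    { BLOCK_HAS_COPY_DISPOSE,    "BLOCK_HAS_COPY_DISPOSE" },
};

/* Writes the names of all known bits set in `flags` into `out`,
   joined by " | ". Returns the bits that have no name in the table. */
uint32_t describe_block_flags(uint32_t flags, char *out, size_t out_size)
{
    uint32_t unknown = flags;
    size_t used = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';

    for (size_t i = 0; i < sizeof k_block_flags / sizeof k_block_flags[0]; i++) {
        if ((flags & k_block_flags[i].bit) == 0)
            continue;
        unknown &= ~k_block_flags[i].bit;

        int n = snprintf(out + used, out_size - used, "%s%s",
                         used ? " | " : "", k_block_flags[i].name);
        if (n < 0 || (size_t)n >= out_size - used)
            break; /* buffer too small: stop appending names */
        used += (size_t)n;
    }
    return unknown;
}
```

Under these assumptions, passing 0xC2000000 with a sufficiently large buffer yields three names and no unknown bits, which is the kind of readable result a custom enum gives you in the pseudocode.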
This block example is rather benign because the reserved field is set to 0, so the constant was already effectively 32-bit; other situations can be more involved, when several distinct values are merged into one big constant.
If necessary, expressions can be split further (e.g. when one value is used to initialize 3 or more fields). You can also revert the split by choosing “Unsplit expression” in the context menu.
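The following C sketch shows a more involved case, where three distinct values end up in one 64-bit constant, and how two successive splits recover them. The header structure, its values and all function names are hypothetical, and the byte layout assumes a little-endian target such as x86 or ARM64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structure: three adjacent fields covering 8 bytes. */
struct header {
    uint32_t magic;    /* bytes 0..3 */
    uint16_t version;  /* bytes 4..5 */
    uint16_t flags;    /* bytes 6..7 */
};

/* flags = 1, version = 2, magic = 0x31474642, packed for little-endian. */
#define MERGED_INIT UINT64_C(0x0001000231474642)

/* What an optimizing compiler may emit: one wide store for all fields. */
void init_merged(struct header *h)
{
    const uint64_t merged = MERGED_INIT;
    assert(h != NULL && sizeof *h == sizeof merged);
    memcpy(h, &merged, sizeof merged);
}

/* What the original source probably looked like. */
void init_separate(struct header *h)
{
    assert(h != NULL);
    h->magic = 0x31474642;
    h->version = 2;
    h->flags = 1;
}

/* First split: one 64-bit value into two 32-bit halves. */
uint32_t low32(uint64_t v)  { return (uint32_t)v; }
uint32_t high32(uint64_t v) { return (uint32_t)(v >> 32); }

/* Second split: one 32-bit value into two 16-bit halves. */
uint16_t low16(uint32_t v)  { return (uint16_t)v; }
uint16_t high16(uint32_t v) { return (uint16_t)(v >> 16); }
```

On a little-endian machine both initializers leave identical bytes in memory. Here low32(MERGED_INIT) gives the magic field, and splitting high32(MERGED_INIT) once more gives version and flags. Applied to the block example, low32(3254779904) is 0xC2000000 and high32(3254779904) is 0, matching the two assignments shown after the split.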
64-bit variables in 32-bit programs
When handling 64-bit values on processors with 32-bit registers, the compiler has to work with data in 32-bit pieces. This can lead to very verbose code if translated as-is, so our decompiler detects common patterns such as 64-bit math, comparisons or data manipulations and automatically creates 64-bit variables consisting of two 32-bit registers or memory locations. While our heuristics work well in most cases, there may be false positives, where two actually separate 32-bit variables get merged into a 64-bit one. In such a situation, you can use “Split expression” on the 64-bit operations involving the variable to split the pair and recover proper, separate variables.
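As a rough illustration of the kind of pattern involved, the C sketch below performs a 64-bit addition in two 32-bit pieces, similar to what an add/add-with-carry instruction pair does on a 32-bit processor. The pair32 type and all function names are hypothetical and only model the idea; they do not describe the decompiler's actual heuristics.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value as 32-bit code handles it: two separate 32-bit pieces. */
struct pair32 {
    uint32_t lo;
    uint32_t hi;
};

struct pair32 to_pair(uint64_t v)
{
    struct pair32 p = { (uint32_t)v, (uint32_t)(v >> 32) };
    return p;
}

uint64_t from_pair(struct pair32 p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* 64-bit addition done in 32-bit pieces. */
struct pair32 add64_in_halves(struct pair32 a, struct pair32 b)
{
    struct pair32 r;
    r.lo = a.lo + b.lo;             /* low halves, wraps modulo 2^32 */
    uint32_t carry = r.lo < a.lo;   /* a wrapped sum means a carry out */
    r.hi = a.hi + b.hi + carry;     /* high halves plus the carry */
    return r;
}

/* Sanity check: the piecewise result equals plain 64-bit addition. */
void check_add(uint64_t a, uint64_t b)
{
    assert(from_pair(add64_in_halves(to_pair(a), to_pair(b))) == a + b);
}
```

The carry is what ties the two halves together: when it is present, treating lo and hi as one 64-bit variable is clearly right. A false positive is more likely when two 32-bit values merely sit next to each other and are copied or compared together without any such link, which is where splitting the expression helps.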
See also: Hex-Rays interactive operation: Split/unsplit expression
Article Link: Igor’s tip of the week #69: Split expression – Hex Rays