Igor’s tip of the week #43: Annotating the decompiler output

Last week we started improving decompilation of a simple function. While you can go quite far with renaming and retyping, some things need more explanation than a simple renaming can provide.

Comments

When you can’t come up with a good name for a variable or a function, you can add a comment with an explanation or a theory about what’s going on. The following comment types are available in the pseudocode:

  1. Regular end-of-line comments. Use / to add or edit them (easy to remember because in C++ // is used for comments).
  2. Block comments. Similarly to anterior comments in the disassembly view, the Ins shortcut is used (I on Mac). The comment is added before the current statement (not necessarily the current line).
  3. Function comments. A function comment is added when you use / on the first line of the function.

Due to limitations of the implementation, the first two types can move around or even end up as orphan comments when the pseudocode changes. The function comment is attached to the function itself and is also visible in the disassembly view.

 

Using comments, we can annotate the function from the previous post to clarify what is going on. In the screenshot below, regular comments are highlighted in blue while block comments are outlined in orange.

In the end, the function seems to be copying bytes from a2 to a1, stopping at the first zero byte. If you know libc, you’ll quickly realize that it’s actually a trivial implementation of strcpy. We can now rename the function and arguments to the canonical names and add a function comment explaining the purpose of the function.

Alas, the existing comments are not updated automatically, so references to a1 and a2 would have to be fixed manually.
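
For readers without the screenshots at hand, here is a rough sketch in plain C of the logic described above, annotated with the three comment kinds. It is not actual decompiler output: the name my_strcpy (chosen to avoid clashing with libc), the local variable names, and the exact loop shape are assumptions made for illustration.

```c
// Function comment: copies the zero-terminated string at src into dest,
// terminator included, and returns dest. A trivial strcpy.
char *my_strcpy(char *dest, const char *src)
{
    char *out = dest;       // end-of-line comment: running write pointer
    char c;

    // Block comment: copy one byte at a time and stop after the
    // first zero byte has been written.
    do
    {
        c = *src++;         // fetch the next source byte
        *out++ = c;         // store it, including the terminator
    }
    while ( c != '\0' );

    return dest;
}
```

Once the arguments carry the canonical names dest and src, most of the per-line comments become redundant, which is one more reason to rename first and comment only what the names cannot express.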

Empty lines

To improve the readability of pseudocode even further, you can add empty lines either manually or automatically. For manual lines, press Enter after or before a statement. For example, here’s the same function with extra empty lines added:

To remove the manual empty lines, edit the anterior comment (Ins or I on Mac) and remove the empty lines from the comment.

To add automatic empty lines, set GENERATE_EMPTY_LINES = YES in hexrays.cfg. This will cause the decompiler to add empty lines between compound statements as well as before labels, which improves the readability of long or complex functions. For example, here’s a decompilation of the same function with both settings. You can see that the second one is easier to read thanks to the extra spacing.

 

Article Link: Igor’s tip of the week #43: Annotating the decompiler output – Hex Rays