Igor’s tip of the week #94: Variable-sized structures

Variable-sized structures are a construct used to handle binary structures of variable size while keeping the advantage of compile-time type checking.

In source code

Usually such structures use a layout similar to the following:

struct varsize_t
{
 // some fixed fields at the start
 int id;
 size_t datalen;
 //[more fields]
 unsigned char data[];// variable part
};

In other words, a fixed-layout part at the start and an array of unspecified size at the end.

Some compilers do not accept the [] syntax, so [0] or even [1] may be used instead. At runtime, the space for the structure is allocated using the full size, and the array can be accessed as if it had the expected size. For example:

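#include <stdlib.h> // malloc
#include <string.h> // memcpy

struct varsize_t* allocvar(int id, void *data, size_t datalen)
{
 // fixed part + payload + one extra byte for the terminating zero
 size_t fullsize = sizeof(struct varsize_t)+datalen+1;
 struct varsize_t *var = (struct varsize_t*) malloc(fullsize);
 if (var == NULL)
  return NULL;
 var->id = id;
 var->datalen = datalen;
 memcpy(var->data, data, datalen);
 var->data[datalen]=0;
 return var;
}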
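For the [1] variant mentioned above, the size calculation needs a little care, because sizeof then already counts one array element (and possibly tail padding). The following is only a sketch of one common way to handle it, using offsetof. The names varsize1_t, varsize1_fullsize and allocvar1 are made up for this illustration. Note also that indexing past the declared bound of a [1] array is a long-standing idiom rather than something the C standard guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Same layout as varsize_t, but using the older [1] idiom for the tail.
   (Hypothetical name, for illustration only.) */
struct varsize1_t
{
  int id;
  size_t datalen;
  unsigned char data[1]; /* variable part, declared with one element */
};

/* Bytes to allocate for datalen payload bytes plus a terminating zero.
   offsetof() is used instead of sizeof() so the declared element and any
   tail padding are not counted twice. */
static size_t varsize1_fullsize(size_t datalen)
{
  size_t size = offsetof(struct varsize1_t, data) + datalen + 1;
  /* never allocate less than the declared struct itself */
  if (size < sizeof(struct varsize1_t))
    size = sizeof(struct varsize1_t);
  return size;
}

static struct varsize1_t *allocvar1(int id, const void *data, size_t datalen)
{
  struct varsize1_t *var;
  assert(data != NULL || datalen == 0);
  var = malloc(varsize1_fullsize(datalen));
  if (var == NULL)
    return NULL;
  var->id = id;
  var->datalen = datalen;
  if (datalen != 0)
    memcpy(var->data, data, datalen);
  var->data[datalen] = 0; /* terminating zero after the payload */
  return var;
}
```

The caller releases the block with a single free(), just as with the [] version.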

Can such structs be handled by IDA? Yes, but there are some peculiarities you may need to be aware of.

In the decompiler

In the decompiler everything is pretty simple: just add the struct using C syntax to Local Types and use it for types of local variables and function arguments. The decompiler automatically detects accesses to the variable part and represents them accordingly.

In disassembly

However, the disassembly view is trickier. You can import the struct from Local Types into the IDB Structures, or create one manually by explicitly adding an array of 0 elements at the end:

00000000 varsize_t struc ; (sizeof=0x8, align=0x4, copyof_1, variable size)
00000000 id dd ?
00000004 datalen dd ?
00000008 data db 0 dup(?)
00000008 varsize_t ends

But when you have instances of such structs in the data area, this definition only covers the fixed part. To extend the struct, use * (Create/resize array action) and specify the full size of the struct.

Example

Recent Microsoft compilers add so-called “COFF group” info to PE executables. It is currently not fully parsed by IDA but is labeled in the disassembly listing with the comment IMAGE_DEBUG_TYPE_POGO:

.rdata:004199E4 ; Debug information (IMAGE_DEBUG_TYPE_POGO)
.rdata:004199E4 dword_4199E4 dd 0 ; DATA XREF: .rdata:004196BC↑o
.rdata:004199E8 dd 1000h, 25Fh, 7865742Eh, 74h, 1260h, 0BCh, 7865742Eh, 69642474h, 0
.rdata:00419A0C dd 1320h, 11BE2h, 7865742Eh, 6E6D2474h, 0
.rdata:00419A20 dd 12F10h, 12Ch, 7865742Eh, 782474h, 13040h, 164h, 7865742Eh, 64792474h
.rdata:00419A20 dd 0
.rdata:00419A44 dd 14000h, 11Ch, 6164692Eh, 35246174h, 0
.rdata:00419A58 dd 1411Ch, 4, 6330302Eh, 6766h, 14120h, 4, 5452432Eh, 41435824h, 0
.rdata:00419A7C dd 14124h, 4, 5452432Eh, 41435824h, 41h, 14128h, 1Ch, 5452432Eh, 55435824h

On expanding the array or looking at the hex view, it becomes apparent that it stores info about the original section names of the executable, before they are merged by the linker. So it can be useful to format this info. It seems to consist of a list of the following structures:

struct section_info
{
  int start; // RVA
  int size;
  char name[]; // zero-terminated
};

The string is padded with zeroes if necessary to align each struct on a 4-byte boundary.
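To make the layout concrete, here is a small sketch in C of how such a list could be walked, assuming exactly the format described above: two 32-bit fields, a zero-terminated name, and padding to a 4-byte boundary. The helper names (section_info_size, section_info_stride, count_section_infos) are invented for this example and are not part of IDA or any Microsoft header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Size of the fixed part of one record: start (4 bytes) + size (4 bytes). */
enum { SECTION_INFO_FIXED = 8 };

/* Exact size of the record at rec: fixed part + name + terminating zero.
   Returns 0 if no terminated name fits within the avail bytes. */
static size_t section_info_size(const unsigned char *rec, size_t avail)
{
  const unsigned char *name, *end;
  assert(rec != NULL);
  if (avail <= SECTION_INFO_FIXED)
    return 0;
  name = rec + SECTION_INFO_FIXED;
  end = memchr(name, 0, avail - SECTION_INFO_FIXED);
  if (end == NULL)
    return 0;
  return (size_t)(end - rec) + 1;
}

/* Distance to the next record: exact size rounded up to a multiple of 4. */
static size_t section_info_stride(size_t exact_size)
{
  return (exact_size + 3) & ~(size_t)3;
}

/* Count the records in buf, stopping at the first malformed one. */
static size_t count_section_infos(const unsigned char *buf, size_t len)
{
  size_t off = 0, count = 0;
  while (off < len)
  {
    size_t size = section_info_size(buf + off, len - off);
    if (size == 0)
      break;
    count++;
    off += section_info_stride(size);
  }
  return count;
}
```

Applied to the first record in the listing (1000h, 25Fh, “.text”), section_info_size gives 14 and section_info_stride rounds that up to 16, so the next record starts 16 bytes later.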

After creating a local type and importing the struct into the IDB, we can undefine the array created by IDA and start creating struct instances in the area using Edit > Struct var… (Alt-Q). However, only the fixed part is covered by default.

To extend the struct, press * and enter the full size. For example, the first one should be 14 (8 for the fixed part and 6 for “.text” plus the terminating zero), although you can also use the suggested 16.

Now the struct has the correct size and covers the string, but it is printed as hex bytes and not text. Why, and how can it be fixed?

When IDA converts a C type to an assembly-level (IDB) struct, it relies only on the sizes of C types, because on the assembly level there is no difference between a byte and a character. Thus a char array is the same as a byte array. However, you can still apply additional representation flags to influence formatting of the structure. For example, you can go to the imported definition in the Structures list and mark the name field as a string literal, either from the context menu or by pressing A.

The field is now commented accordingly and the data instances show the string as text.

In fact, once you mark the field as string, newly declared instances will be automatically sized by IDA using the zero terminator.

See also:

Variable Length Structures Tutorial
IDA Help: Convert to array
IDA Help: Assembler level and C level types
IDA Help: Structures window

Article Link: Igor’s tip of the week #94: Variable-sized structures – Hex Rays