Update: strings.py Version 0.0.5 Pascal Strings

This new version of strings.py, my tool to extract strings from arbitrary files, adds option -P to support Pascal strings.

A Pascal string is a string that is internally stored with a length-prefix: an integer that counts the number of characters inside the string.
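To make the layout concrete, here is a minimal Python sketch that builds such a length-prefixed string. It is not code from strings.py: the function name make_pascal_string is made up for illustration, and the choice of a 32-bit little-endian prefix as the default is an assumption.

```python
import struct


def make_pascal_string(text: bytes, length_format: str = "<I") -> bytes:
    """Return text prefixed with its length, packed with a struct format string.

    Hypothetical helper for illustration only; not part of strings.py.
    """
    return struct.pack(length_format, len(text)) + text


# 4-byte little-endian length prefix, followed by the characters
print(make_pascal_string(b"Hello"))        # b'\x05\x00\x00\x00Hello'
# 1-byte length prefix
print(make_pascal_string(b"Hello", "<B"))  # b'\x05Hello'
```

Note that, unlike a C-string, nothing marks the end of the string: the prefix alone tells a reader how many bytes to consume.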

The Unix strings command and my strings.py tool can both extract Pascal strings without any problem, because they just search for sequences of characters, without looking for a terminating NULL character (C-string) or a length-prefix (P-string or Pascal string).

But with option -P, you can direct my tool strings.py to only extract Pascal strings, by checking if character sequences are prefixed with an integer that is equal to the number of characters inside the string. Strings that do not match that requirement are ignored.
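The check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual strings.py implementation: the function name extract_pascal_strings, the min_length default of 4, and the restriction to printable ASCII are all assumptions.

```python
import re
import struct


def extract_pascal_strings(data: bytes, length_format: str = "<I", min_length: int = 4):
    """Yield printable-ASCII runs whose preceding bytes encode the run's length.

    Hypothetical sketch, not the strings.py code. Known simplification: if a
    length byte is itself a printable character, it merges into the run and
    the string is missed.
    """
    prefix_size = struct.calcsize(length_format)
    # Runs of at least min_length printable ASCII characters
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    for match in pattern.finditer(data):
        start = match.start()
        if start < prefix_size:
            continue  # not enough room in front of the run for a length prefix
        (length,) = struct.unpack(length_format, data[start - prefix_size:start])
        if length == len(match.group()):
            yield match.group()


sample = b"\x00\x00" + struct.pack("<I", 5) + b"Hello" + b"\x00" + b"World!!" + b"\x00"
print(list(extract_pascal_strings(sample)))  # [b'Hello']
```

In this sample, "World!!" is a perfectly good string for a plain strings search, but it is dropped here because the four bytes in front of it do not decode to 7.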

Since an integer can be represented internally with different byte formats, you have to provide a value to option -P that indicates how the integer is stored internally. I use the same format as Python’s struct module to represent that format. For example, “<I” is a little-endian, unsigned 32-bit integer. That is how a string is represented in Delphi, as can be seen in this example of a Delphi malware sample:

The strings you see here are all found inside the sample, and are prefixed by their length. If you did not use option -P, these strings would also be extracted, but they would not stand out amid the other strings that are not prefixed by their length.

Delphi also supports the ShortString type, which uses a single byte to encode the length. These can be found with option -P “<B”, an unsigned 8-bit integer (for a single byte, the byte order makes no difference).
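As a rough illustration of how the two formats mentioned in this post differ, the following Python sketch reads the same text from a 1-byte-prefixed and a 4-byte-prefixed blob. The helper name read_pascal_string and the sample bytes are made up for this example; only the struct format strings come from the text above.

```python
import struct


def read_pascal_string(blob: bytes, length_format: str) -> bytes:
    """Decode the length prefix of blob and return the characters that follow.

    Hypothetical helper for illustration only; not part of strings.py.
    """
    prefix_size = struct.calcsize(length_format)
    (length,) = struct.unpack_from(length_format, blob)
    return blob[prefix_size:prefix_size + length]


short_string = b"\x05Hello"             # 1-byte length prefix ("<B")
long_string = b"\x05\x00\x00\x00Hello"  # 4-byte little-endian prefix ("<I")

print(read_pascal_string(short_string, "<B"))  # b'Hello'
print(read_pascal_string(long_string, "<I"))   # b'Hello'
```

A 1-byte prefix limits the string to 255 characters, which is why the format passed to -P has to match the string type you are looking for.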

strings_V0_0_5.zip (https)
MD5: A4BF314BE0A72972ECA7B14B558610E6
SHA256: 30E9E9BB618006445483AA78F804766D8FFA518974B81F9B68FF534BEA30B072

Article Link: https://blog.didierstevens.com/2020/10/22/update-strings-py-version-0-0-5-pascal-strings/