cut-bytes.py is a tool I use to select (cut) a sequence of bytes out of a file, using a cut-expression. This expression specifies the start of the sequence and the end of the sequence.
In this example, I use a cut-expression to find the first occurrence of MZ (i.e. [‘MZ’]) and select 8 bytes (8l) starting at the position of that occurrence (-a is ASCII dump):
I realized that with a few changes, I could add a binary grep feature to cut-bytes. Option -g activates this binary grep:
In stead of one occurrence (the first), with option -g, all occurrences are selected.
JSON output is now also available with option –jsonoutput:
This JSON output contains all the selected byte sequences (BASE64 encoded and with metadata), and it can be piped into tools that accept this format, like file-magic.py:
file-magic will then identify each byte sequence. As you can guess, I’m looking for PE files embedded in file update.bin. But the byte sequences are too short (8 bytes) for file-magic.py to properly identify file types. By increasing the length to 512 bytes, file-magic.py has enough data to locate 2 PE files (a 32-bit DLL and a 64-bit DLL) inside update.bin:
Option -G is identical to -g, except that the selected byte sequences will not overlap.
And I also added a “run length encoded” ASCII dump (-A). If 2 or more consecutive output lines are identical, the duplicates are suppressed:
cut-bytes_V0_0_8.zip (https)
MD5: 1A69542E7E9D7348101B7E91884674B7
SHA256: 15BC253323FF162F26BEF784172A502383970E63514DF6B88A09952A19DAE826
Article Link: https://blog.didierstevens.com/2018/11/12/update-cut-bytes-py-version-0-0-8/