Malware source code investigation: BlackLotus - part 1

BlackLotus is a UEFI bootkit that targets Windows and is capable of evading security software, persisting once it has infected a system, bypassing Secure Boot on fully patched installations of Windows 11, and executing payloads with the highest level of privileges available in the operating system.

The source code for the BlackLotus UEFI bootkit was published on GitHub on July 12, 2023.

BlackLotus has been for sale on hacking forums since at least October 2022. The malware sells for $5,000, with each update costing a further $200.

In this short study we investigate the source code of BlackLotus in detail and highlight its main features.

Architecture

BlackLotus is written in assembly and C and is only 80 KB in size. The malicious code can be configured to avoid infecting systems in countries of the CIS region (at the time of writing, these countries are Armenia, Azerbaijan, Belarus, Kazakhstan, Kyrgyzstan, Moldova, Russia, Tajikistan and Uzbekistan).

The source code is structured as follows:

[Screenshots: BlackLotus source code tree]

The software consists of two major components: the Agent, which is installed on the target device, and the Web Interface, which is used by administrators to administer bots. A bot in this context refers to a device with the Agent installed.

Cryptography

First of all, we looked at the libraries and cryptographic functions:

[Screenshots: libraries and cryptographic functions]

We start with the CRC32-based WinAPI hashing that is common in malware development. As you can see, there is nothing out of the ordinary here: it is a CRC32 implementation with the constant 0xEDB88320L, the reflected form of the standard CRC32 polynomial. The use of such hashes in malware development is widely documented.

The RC4 implementation is also standard, with nothing complicated about it:

[Screenshot: RC4 implementation]

What about XOR? This code appears to implement a custom type of encryption on a given data buffer. The function CryptXor is applied to the buffer using the specified Key and a chaining scheme modelled on Cipher Block Chaining (CBC). CBC is a block cipher mode in which the encryption of each block depends on the previous ciphertext block:

[Screenshot: CryptXor function]

In summary, this function performs a custom type of encryption on the input buffer. It uses XOR operations with a given key and CBC chaining, with the possibility to skip over pairs of zero DWORDs.

There is also a matching function that decrypts via XOR:

[Screenshot: XOR decryption function]

The next interesting items are files such as ntdll_hash.h and kernel32_hash.h:

[Screenshot: *_hash.h header files]

Each of them contains hashes of WinAPI function names and DLL names:

[Screenshot: hash definitions]

AV evasion tactic

The malware author then simply uses the GetModuleHandleByHash(DWORD Hash) function:

[Screenshot: GetModuleHandleByHash function]

The given C function, GetModuleHandleByHash, is a means of dynamically resolving and obtaining a module handle given a hash of the module name. This is typically seen in malware code, as it helps to avoid static strings (like "kernel32.dll") that could be easily spotted by antivirus heuristic algorithms. This technique increases the difficulty of static analysis.

The function works as follows:

  1. It begins by reading the Thread Environment Block (TEB) via inline assembly code. This is a structure that Windows maintains per thread to store thread-specific information, and it holds a pointer to the Process Environment Block (PEB). The offsets used indicate that the code follows the PEB's loader data to the first entry in the InLoadOrderModuleList, which is a doubly linked list of loaded modules in the order they were loaded. This is a common way to get a list of loaded modules without calling any APIs like EnumProcessModules.

  2. Once it has the first module, it enters a loop where it processes each module in turn. For each module, it converts the module name to lower case and computes its CRC32 hash (using the Crc32Hash function).

  3. If the computed hash matches the input hash, it returns the base address of the module (which is effectively the same as the module handle, for the purpose of calling GetProcAddress).

  4. If the hash does not match, it moves to the next module in the InLoadOrderModuleList and repeats the process.

  5. If it has checked all the modules and not found a match, it returns NULL.

Note that LDR_MODULE and its linked list structures are part of the Windows Native API (also known as the “NT API”), which is an internal API used by Windows itself. It’s not officially documented by Microsoft, so using it can be risky: it can change between different versions or updates of Windows. However, it also provides a way to do things that can’t be done with the standard Windows API, so it’s often used in low-level code like device drivers or, in this case, bootkit malware.

There are also files such as advapi32_functions.h, ntdll_functions.h and user32_functions.h:

[Screenshot: *_functions.h header files]

These header files define function pointer types for Windows API functions such as VirtualAlloc, OpenProcess and Process32FirstW, as well as NT API structures and functions:

[Screenshot: function pointer definitions]

These are being defined as function pointers rather than directly calling the functions because this can make it easier to dynamically load these functions at runtime. This can be useful in a few scenarios, such as when writing code that needs to run on multiple versions of Windows and not all functions may be available on all versions, and in our case when trying to evade detection by anti-malware tools (since these tools often flag direct calls to certain API functions as suspicious).

The GetProcAddressByHash function in the given code is designed to look up a function in a DLL using the hash of the function’s name, rather than the name itself. This is typically used in malware to make static analysis harder, as it avoids leaving clear text strings (like "CreateProcess") in the binary that can be easily identified:

[Screenshot: GetProcAddressByHash function]

This code also assumes that it’s running on the same architecture as the DLL it’s examining, i.e., if the code is compiled for a 64-bit target, it assumes the DLL is also 64-bit, and vice versa for 32-bit.

It’s worth noting that manipulating the PE file format and using hashed function names like this is a common technique used in malware and rootkits to make analysis and detection more difficult.

Another interesting file is nzt.h:

[Screenshot: nzt.h]

As you can see, API(Function) is a macro that expands to NzT.Api.p##Function. It is likely used to call function pointers stored in an API_FUNCTIONS structure, which is part of the NzT_T struct.

NzT_T is a structure that bundles together various components of the bot’s functionality, including an API_FUNCTIONS structure for API function pointers, an API_MODULES structure for loaded module information, a CRC type (for checksum calculations), and an INFECTION_TYPE field indicating the infection status of the bot.

Windows Registry

Next, the registry.c file implements functions for interacting with the Windows Registry:

[Screenshot: registry.c]

GetRegistryStartPath(INT Hive) - This function is used to get the start path of the registry hive, based on the hive type passed to it (e.g., HKEY_LOCAL_MACHINE). The path is formatted into the form expected by the Windows kernel functions, which is a bit different from what you might usually see (e.g., "\Registry\Machine" instead of HKEY_LOCAL_MACHINE). The function returns this path as a wide character string (LPWSTR):

[Screenshot: GetRegistryStartPath function]

RegistryOpenKeyEx(CONST LPWSTR KeyPath, HANDLE RegistryHandle, ACCESS_MASK AccessMask) - This function is used to open a specific key in the registry, given its path, a handle to a pre-existing key (or NULL for the root of the registry), and an access mask specifying what type of access the function caller requires to the key (e.g., KEY_READ, KEY_WRITE). It uses the NtOpenKey API function from the Windows Native API to actually open the key:

[Screenshot: RegistryOpenKeyEx function]

RegistryReadValueEx(CONST LPWSTR KeyPath, CONST LPWSTR Name, LPWSTR* Value) - This function reads a value from a given key in the registry. It does this by opening the key with RegistryOpenKeyEx, then querying the value with NtQueryValueKey. The function reads the value’s data into a buffer, which it then returns to the caller. If anything goes wrong (e.g., the key couldn’t be opened, the value couldn’t be queried, there wasn’t enough memory to store the value’s data), the function returns FALSE:

[Screenshot: RegistryReadValueEx function]

RegistryReadValue(INT Hive, CONST LPWSTR Path, CONST LPWSTR Name, LPWSTR* Value) - This function combines the functionality of the other functions. It reads a value from a specific key in a specific hive of the registry. It constructs the full path to the key by concatenating the start path of the hive (obtained with GetRegistryStartPath) and the rest of the key path passed to the function. It then reads the value from this key with RegistryReadValueEx:

[Screenshot: RegistryReadValue function]

Two more functions are present, but they are commented out and not used anywhere:

[Screenshot: commented-out registry functions]

Filesystem

Functions for working with files on Windows are kept separately in file.c:

[Screenshot: file.c]

It implements functions such as FileGetInfo, FileGetSize, FileOpen and FileWrite.

FileGetInfo(HANDLE FileHandle, PFILE_STANDARD_INFORMATION Info) - This function retrieves standard information about a file. The NtQueryInformationFile function is used to retrieve the information. It takes a handle to an open file and a pointer to a FILE_STANDARD_INFORMATION structure to fill with information. The MemoryZero function is used to clear these structures before use.

The FILE_STANDARD_INFORMATION structure includes several file attributes such as the allocation size of the file, the end of the file, the number of links to the file, and flags that indicate whether the file is a directory or is pending deletion. If the operation is successful, the function returns TRUE. If the operation fails, it returns FALSE.

FileGetSize(HANDLE FileHandle, PDWORD FileSize) - This function retrieves the size of a file. It does so by calling FileGetInfo to get the standard information of the file, and then sets the value pointed to by FileSize to the AllocationSize.LowPart of the FILE_STANDARD_INFORMATION structure:

[Screenshot: FileGetSize function]

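Note that AllocationSize is a LARGE_INTEGER (a 64-bit value) and this function returns only its lower 32 bits, so the result is wrong for files of 4 GB or more. AllocationSize is also the space allocated for the file on disk, which can be larger than its logical length; the EndOfFile field of the same structure holds the actual file size.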

Injections

The source code also contains functions that implement the injection logic:

[Screenshot: injection source file]

For example:

LPVOID InjectData(
	HANDLE Process,
	LPVOID Data,
	DWORD Size
)

[Screenshot: InjectData function]

Here’s a breakdown of what the function does:

NzT.Api.pVirtualAllocEx(Process, NULL, Size, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE) - It starts by allocating memory within the virtual memory space of a target process. The size of the allocated memory is specified by the Size parameter. The memory is both committed (MEM_COMMIT) and reserved (MEM_RESERVE) for future use. The allocated memory has read, write, and execute permissions (PAGE_EXECUTE_READWRITE). The address of the allocated memory is saved in the Address variable. If this operation fails, the function returns NULL.

NzT.Api.pWriteProcessMemory(Process, Address, Data, Size, NULL) - If memory allocation is successful, the function proceeds to write data into the allocated memory within the target process. It does this using the WriteProcessMemory function. This function copies data from a buffer (Data) in the current process to the allocated memory (Address) in the target process. If the operation fails, it frees the allocated memory using VirtualFreeEx and returns NULL.

If both operations are successful, the function returns the address of the allocated memory in the target process. This can then be used for various purposes, such as executing the injected code.

This type of functionality is often seen in malware that injects malicious code into legitimate processes to hide its activities or gain higher privileges.

A second injection routine is implemented in the same file:

DWORD InjectCode(
	HANDLE Process,
	LPVOID Function
)

[Screenshot: InjectCode function]

This function appears to inject code into a target process by creating a section of memory, copying the code into this section, performing relocations, and finally mapping this section into the target process.

Once all the tasks are performed, the function cleans up by closing any open handles and unmapping any mapped views. Finally, it returns the address of the injected function in the target process.

As with many other kinds of code injection techniques, this one is also commonly seen in malware.

Pseudo-Random Generator

The file guid.c contains several more functions:

[Screenshot: guid.c]

These functions are designed to generate a pseudo-random GUID (Globally Unique Identifier). The GUID is built from the values produced by a simple linear congruential generator (LCG), which is a type of pseudorandom number generator.

Here’s what each function does:

GuidRandom(PDWORD Seed) - This is a linear congruential generator (LCG) function that takes a seed as a parameter and generates a pseudorandom number. It’s important to note that this LCG function always produces the same sequence of numbers if the initial seed is the same:

[Screenshot: GuidRandom function]

GuidGenerate(GUID * Guid, PDWORD Seed) - This function takes a pointer to a GUID structure and a pointer to a DWORD seed as parameters. It generates a GUID by calling GuidRandom(Seed) to generate pseudorandom numbers and assign them to the four parts of the GUID structure (Data1, Data2, Data3, Data4):

[Screenshot: GuidGenerate function]

GuidGenerateEx(PDWORD Seed) - This function generates a GUID string. It calls GuidGenerate(&Guid, Seed) to generate a GUID and then converts this GUID to a string format with GuidToString(&Guid). This string is then copied to a newly allocated memory block, and a pointer to this block is returned:

[Screenshot: GuidGenerateEx function]

In the context of malware, generated GUIDs might be used for a variety of purposes, including marking infected systems, communicating with command-and-control (C2) servers, or creating mutexes to avoid multiple instances of the malware. In our case, these functions are used to generate the Bot ID.

Utils

There is also a utilities file, utils.c, which contains many auxiliary functions:

[Screenshot: utils.c]

For example, GetProcessIdByHandle(HANDLE Process):

[Screenshot: GetProcessIdByHandle function]

This function retrieves the unique process ID of a process given a handle to the process.

Another example is GetProcessIdByHash(DWORD Hash):

[Screenshot: GetProcessIdByHash function]

It returns the Process ID (PID) of a process given a hash of its name. The function scans all running processes on the system and returns the PID of the process whose executable name matches the provided hash.

The function works as follows:

  1. It creates a snapshot of all processes currently running on the system by calling the CreateToolhelp32Snapshot function. If the snapshot creation fails, it returns -1 to indicate the failure.

  2. It retrieves the first process in the snapshot using the Process32FirstW function. If this call fails, it closes the snapshot handle and returns -1.

  3. It enters a loop in which it calculates the CRC32 hash of the current process's executable name (szExeFile) and checks whether this hash is equal to the input hash. If it is, the function breaks out of the loop with the Process ID (th32ProcessID) of the current process.

  4. If the hash doesn't match, it proceeds to the next process in the snapshot using the Process32NextW function and repeats the previous step.

  5. After the loop, it closes the snapshot handle and returns the PID of the process with the matching hash. If no matching process was found, it returns -1.

The CreateMutexOfProcess(DWORD ProcessID) function attempts to create a mutex (a synchronization object) with a unique name based on the process ID and the serial number of the disk volume (which is obtained by the GetSerialNumber() function):

[Screenshot: CreateMutexOfProcess function]

A mutex can be used to prevent multiple instances of a malware or application from running at the same time. In this case, the mutex name is generated by concatenating the disk volume’s serial number and the process ID, which should provide a unique mutex for each running instance of the process.

The destroyOS() function also contains interesting logic, but it is commented out as well:

[Screenshot: destroyOS function]

That’s all for today. In the next part we will investigate other modules.

We hope this post spreads awareness of these interesting malware techniques among blue teamers, and adds a weapon to the red teamers’ arsenal.

By Cyber Threat Hunters from MSSPLab:

References

https://github.com/ldpreload/BlackLotus
https://malpedia.caad.fkie.fraunhofer.de/details/win.asyncrat
https://twitter.com/threatintel/status/1679906101838356480
https://twitter.com/TheCyberSecHub/status/1680044350820999168

Thanks for your time, happy hacking and goodbye!
All drawings and screenshots are MSSPLab’s