How to analyze Linux malware – A case study of Symbiote


Symbiote is a Linux threat that hooks libc and libpcap functions to hide the malicious activity. The malware hides processes and files that are used during the activity by implementing two functions called hidden_proc and hidden_file. It can also hide network connections based on a list of ports and by hijacking any injected packet filtering bytecode. The malware’s purpose is to steal credentials from the SSH and SCP processes by hooking the libc read function. The extracted credentials are encrypted using RC4, stored in a file on the system, and then exfiltrated to the C2 server via DNS requests.


Technical analysis

SHA256: 121157e0fcb728eb8a23b55457e89d45d76aa3b7d01d3d49105890a00662c924

This is a 64-bit ELF shared object that appears to be an early development build for Symbiote malware. Newer versions of this malware have even more functionalities which are described in BlackBerry’s analysis. The file is not stripped.

The malware hooks the following functions: fopen, fopen64, pam_authenticate, pam_set_item, read, readdir, readdir64, and recvmsg. We will give details about the hooks implementation.

The dlsym function is utilized to obtain the address of fopen, and then the process calls the original function (0xFFFFFFFFFFFFFFFF = RTLD_DEFAULT):

Figure 1

When an application tries to open the “/proc/net/tcp” file, which contains all TCP connections, the execution flow of the hooked function is different:

Figure 2

The ELF file creates a temporary file by calling the tmpfile method, reads the first line from the above file, and writes it to the newly created file:

Figure 3

The file is read line-by-line using the getline function. In the case of returning -1 because of a failure (including end-of-file condition), the process closes the file and frees the memory area allocated to the line:

Figure 4

There is a function called gen_proc_net_port implemented by the malware. The purpose of this function is to retrieve a list of ports that should be hidden. Whether a line read above contains any of the ports, the line is not written to the temporary file, and the process moves to the next line:

Figure 5

The function implementation is displayed in the figure below:

Figure 6

The hooked function returns the file descriptor corresponding to the temporary file. The fopen64 function is hooked in a similar way.

The dlsym function is utilized to obtain the address of pam_authenticate, and then the process calls the original function:

Figure 7

The malware calls the pam_get_item method in order to obtain the following information: 0x1 = PAM_SERVICE – the service name, 0x4 = PAM_RHOST – the requesting hostname, 0x2 = PAM_USER – the username. There is also a function call to getaddrlist, which will be explained in the upcoming paragraphs:

Figure 8

Based on the information extracted above, the process constructs the following string “pam|<getaddrlist result>|<PAM_SERVICE>|<PAM_RHOST>|<PAM_USER>|<cs:pampassword>”. There is a call to a function named saveline with the “/usr/include/linux/usb/usb.h” parameter (see figure 9). This particular function will be dissected in the upcoming paragraphs.

Figure 9

The ELF binary implements an erase function called erasefree. It overwrites an area with zeros, and then the pointer which points to this area will be freed:

Figure 10

Figure 11

The dlsym function is utilized to obtain the address of pam_set_item, and then the process calls the original function:

Figure 12

The process expects that the item_type value is equal to 0x6 (PAM_AUTHTOK), which is the authentication token (usually it’s a password):

Figure 13

The dlsym function is utilized to obtain the address of readdir, as highlighted below:

Figure 14

The malware implements a function called check_proc, which will be explained in a bit. Depending on the boolean value returned by this function, the process calls the original readdir method and then hidden_file or hidden_proc:

Figure 15

The readdir64 function is hooked in a similar way.

The dlsym function is utilized to obtain the address of recvmsg, and then the process calls the original function:

Figure 16

The malicious process expects a specific message structure i.e. message[8] = 0xc, message[16] != 0, as displayed in the figure below:

Figure 17

The ELF binary converts unsigned short integers from host byte order to network byte order using htons. The message is copied to another memory area using the memcpy method:

Figure 18

In the check_proc function, the malware gets the directory stream file descriptor and computes the following path “/proc/self/fd/<File descriptor>”:

Figure 19

The path constructed above points to a symbolic link that is read using the readlink method. The function returns 1 whether the symbolic link contains “/proc” and 0 otherwise:

Figure 20

The malware implements a function called check_ssh_scp. It obtains the location of an executable by calling the readlink function with the “/proc/self/exe” parameter:

Figure 21

The purpose of this function is to detect the presence of the SCP/SSH executable and returns 0 if that’s the case:

Figure 22

Symbiote implements the CRC32b algorithm in a function called crc32b. The algorithm can be identified using the 0xEDB88320 constant:

Figure 23

There is a function called create_file that can be used to create files. It calls the open method (0x441 = O_WRONLY | O_CREAT | O_APPEND):

Figure 24

The process changes the permissions of a file to 0x1B6 = S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH, which means that all users can read and write but cannot execute the file:

Figure 25

In the function called dns, the ELF binary retrieves the current process ID that is converted  from host byte order to network byte order using htons:

Figure 26

The malware calls a function named ChangetoDnsNameFormat that will be explained below:

Figure 27

The malicious process creates a socket that will be used to communicate with the C2 server (0x2 = AF_INET, 0x2 = SOCK_DGRAM, 0x11 = IPPROTO_UDP). The C2 server address is converted from dotted decimal notation to an integer using the inet_addr method:

Figure 28

The sendto function is utilized to send data to the C2 server:

Figure 29

The function called ChangetoDnsNameFormat prepares the structure of the request for DNS data exfiltration:

Figure 30

In the function named getaddrlist, the ELF binary extracts a linked list of structures containing the network interfaces of the local machine using the getifaddrs method:

Figure 31

Based on the structures extracted above, the process extracts the IP addresses by calling the getnameinfo function:

Figure 32

The interfaces IP addresses are concatenated together using the strcat method:

Figure 33

In the getserver function, the malicious binary tries to open a file called “/tmp/resolv.conf”:

Figure 34

The malware is looking for a nameserver in the above file. If there is no nameserver, then the process will use the Google DNS server ( to send the DNS request as a UDP broadcast:

Figure 35

The process compares two strings (file names) in the hidden_file function and returns 0 if they match:

Figure 36

The hidden_proc function expects a process ID as an argument. It calls the strspn and strlen functions in order to ensure that the process ID consists of digits only:

Figure 37

The ELF binary retrieves information about a process from the “/proc/<pid>/status” file, as shown in figure 38.

Figure 38

The purpose of this function is to compare two process names and to return 0 if they match (see figure 39). Symbiote’s objective is to hide some processes that are related to the malware such as: certbotx64, certbotx86, javautils, javaserverx64, javaclientex64, javanodex86 (BlackBerry’s article).

Figure 39

The dlsym function is utilized to obtain the address of the read method. If an SSH or SCP process is calling the libc read function, then hook_read is set to keylogger, which is explained below:

Figure 40

In the keylogger function, the process calls the original read function with a file descriptor corresponding to SSH or SCP. It also performs a call to the isatty method in order to ensure that the file descriptor is referring to a terminal:

Figure 41

The executable calls a function named log_cmd_line and then getaddrlist. The first function will be detailed in the following paragraphs:

Figure 42

The malware constructs a string with the following structure “<getaddrlist result>|<log_cmd_line result>|pw_5673”. It calls the saveline function with the “/usr/include/linux/usb/usb.h” parameter:

Figure 43

The log_cmd_line method is called only for the SSH or SCP process. The command line of one of these processes is read from “/proc/self/cmdline”:

Figure 44

The realloc method is utilized to deallocate the old object and to return a pointer to a new object:

Figure 45

The credentials extracted from the SSH or SCP process are encrypted using the RC4 algorithm (key = “suporte42atendimento53log”). The encrypted content will be used to construct DNS requests in the sendlinedns function:

Figure 46

The malicious process creates a file called “/usr/include/linux/usb/usb.h” by calling create_file:

Figure 47

The encrypted credentials are written to the file created above:

Figure 48

In the sendlinedns function, the malware obtains information about the current kernel using the uname method. Based on the resulting buffer, the process computes a machine ID which consists of 4 bytes generated using crc32b (stored in id_6274):

Figure 49

The encrypted credentials are hex-encoded and splitted to be exfiltrated via DNS requests to a domain owned by the threat actor. The A DNS request has the following format:

  • <Packet number – starts from 0x2B67 = 11111>.<Machine ID – Crc32b hash>.<Hex-encoded data>.px32.nss.atendimento-estilo[.]com

Finally, the executable calls the dns function that will exfiltrate data:

Figure 50

The implementation of the RC4 algorithm can be identified below:

Figure 51

Figure 52


C2 domain: px32.nss.atendimento-estilo[.]com

SHA256: 121157e0fcb728eb8a23b55457e89d45d76aa3b7d01d3d49105890a00662c924

Files created: /usr/include/linux/usb/usb.h

Article Link: How to analyze Linux malware – A case study of Symbiote – CYBER GEEKS