VMware ESXi Logging & Detection Opportunities

MalBot · January 14, 2025, 6:30pm

Flowchart representing defending an ESXi system

ESXi environments, with their lack of AV/EDR support, present a unique challenge to Detection Engineers. Not only are these environments often processing and storing highly critical data and workloads, but with the shift to cloud environments, they’re also being considered legacy and are often lacking effective maintenance and security controls, opening the door to threat actors.

It’s clear that defenders aren’t the only ones aware of the challenges with defending ESXi environments. Numerous ransomware variants such as Qilin, Akira, BlackCat, Royal, and many more either have ESXi-specific variants or contain built-in commands that interact with ESXi hosts.

Let’s explore the nuances of detection engineering in ESXi environments. I’ll describe the most useful log sources, show off some common adversary techniques, and finally share some useful detections.

Sneak Peek: I’ve also created a Python-based CLI tool that automates many of the detection engineering tasks in this blog. It’s available here, but I highly recommend reading on to learn more!

Log Sources

ESXi systems by default have a fairly large number of log sources. These include logs that cover Shell Execution, Authentication, ESXi Agent & Host Events, and more. The most important log sources are listed below.

/var/log/shell.log <- Executed Shell Commands & Events
/var/log/auth.log <- Authentication Events
/var/log/hostd.log <- ESXi Host & Agent Events

A full listing of sources can be found here.

Shell Events

Let’s dive into how shell events are logged on ESXi hosts. For the most part, whenever a command is executed in a shell an event is logged to /var/log/shell.log . The log format is pretty typical:

ISO_8601_TIMESTAMP shell[SHELL_PROCESS_ID]: [USERNAME]: SHELL_COMMAND_EXECUTED

Here are two distinct shell sessions that were logged via shell.log:

2024-12-12T01:44:41.902Z shell[71435]: Interactive shell session started
2024-12-12T01:44:43.728Z shell[71435]: [root]: ls -la
2024-12-12T01:44:47.785Z shell[71435]: [root]: cd /var/log
2024-12-12T01:44:48.655Z shell[71435]: [root]: ls
2024-12-12T01:44:56.673Z shell[71435]: [root]: cat auth.log
2024-12-12T01:45:05.064Z shell[71435]: [root]: grep -nr ‘root’ auth.log
2024-12-12T01:45:12.921Z shell[71435]: [root]: cat /var/log/shell.log
2024-12-12T01:47:31.601Z shell[71435]: [root]: ps aux
2024-12-12T01:47:33.621Z shell[71435]: [root]: ps
2024-12-12T01:47:47.328Z shell[71435]: [root]: ps | grep 71435
2024-12-12T01:49:34.969Z shell[71435]: [root]: exit
2024-12-12T01:49:38.390Z shell[71470]: Interactive shell session started
2024-12-12T01:49:48.661Z shell[71470]: [root]: cat /etc/passwd
2024-12-12T01:49:57.075Z shell[71470]: [root]: ls -la
2024-12-12T01:49:59.454Z shell[71470]: [root]: cd /tmp
2024-12-12T01:49:59.943Z shell[71470]: [root]: ls
2024-12-12T01:50:04.107Z shell[71470]: [root]: ls -la
2024-12-12T01:50:06.983Z shell[71470]: [root]: cat vmware-root_68112-1913154220/
2024-12-12T01:50:09.595Z shell[71470]: [root]: cd vmware-root_68112-1913154220/
2024-12-12T01:50:09.965Z shell[71470]: [root]: ls
2024-12-12T01:50:11.300Z shell[71470]: [root]: ls -la
2024-12-12T01:50:13.008Z shell[71470]: [root]: cd ~
2024-12-12T01:50:18.614Z shell[71470]: [root]: cat /var/log/shell.log

Tip: You can identify, track, and group distinct sessions via the SHELL_PROCESS_ID value.
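As a rough sketch of that tip, the following Python groups shell.log lines by SHELL_PROCESS_ID. The regular expression and the group_sessions helper are my own illustration based on the format shown above, not something shipped with ESXi, and PIDs may be reused after a reboot, so you may want to key on the timestamp as well.

```python
import re
from collections import defaultdict

# Follows the shell.log format described above:
# ISO_8601_TIMESTAMP shell[SHELL_PROCESS_ID]: [USERNAME]: SHELL_COMMAND_EXECUTED
# The [USERNAME] part is optional because session start lines do not carry it.
SHELL_LINE = re.compile(
    r"^(?P<ts>\S+) shell\[(?P<pid>\d+)\]: (?:\[(?P<user>[^\]]+)\]: )?(?P<msg>.*)$"
)

def group_sessions(lines):
    """Group shell.log lines into sessions keyed by shell process ID."""
    sessions = defaultdict(list)
    for line in lines:
        match = SHELL_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the expected format
        sessions[match["pid"]].append((match["ts"], match["user"], match["msg"]))
    return dict(sessions)

sample = [
    "2024-12-12T01:44:41.902Z shell[71435]: Interactive shell session started",
    "2024-12-12T01:44:43.728Z shell[71435]: [root]: ls -la",
    "2024-12-12T01:49:38.390Z shell[71470]: Interactive shell session started",
    "2024-12-12T01:49:48.661Z shell[71470]: [root]: cat /etc/passwd",
]
for pid, events in group_sessions(sample).items():
    print(pid, [msg for _, _, msg in events])
```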

There is one very important consideration to keep in mind when developing detections over this log source. If you execute a command while creating an SSH session, it will not be logged to shell.log . Let’s take a look at this in action.

Typically, when running commands we’d ssh into the ESXi system with something like this:

ssh user@ESXI_HOSTNAME

And then, once we’re authenticated, we’d begin executing our commands and those would indeed be logged to shell.log. Normal, right?

Observe what happens when you execute a command using something like ssh root@ESXI_HOST "ls -la"

SSH vs Shell Execution Logging

As you can see, the first ls -la was properly logged in shell.log, while the second one wasn’t. If it isn’t in shell.log, where is it? Turns out the entire process where I executed ssh root@ESXI_HOST "ls -la" was logged in /var/log/auth.log as:

2024-12-12T04:45:57.669Z sshd[71874]: Connection from 192.168.220.1 port 52415
2024-12-12T04:45:59.543Z sshd[71874]: Accepted keyboard-interactive/pam for root from 192.168.220.1 port 52415 ssh2
2024-12-12T04:45:59.557Z sshd[71874]: pam_unix(sshd:session): session opened for user root by (uid=0)
2024-12-12T04:45:59.566Z sshd[71874]: User ‘root’ running command ‘ls -la’
2024-12-12T04:45:59.581Z sshd[71874]: Received disconnect from 192.168.220.1 port 52415:11: disconnected by user
2024-12-12T04:45:59.581Z sshd[71874]: Disconnected from user root 192.168.220.1 port 52415
2024-12-12T04:45:59.588Z sshd[71874]: pam_unix(sshd:session): session closed for user root

You can see the command that was executed on line #4 as User ‘root’ running command ‘ls -la’ .

Keep this in mind when deploying detections that try to identify suspicious commands executed on ESXi hosts: make sure they cover both shell.log events and auth.log events.
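One way to cover both sources is to normalize them into a single stream of executed commands before applying detection logic. The sketch below is a minimal illustration; the extract_command helper and both regular expressions are my own, and I’m assuming the raw auth.log quotes the user and command with straight single quotes (the pattern also tolerates the curly quotes shown in the rendered sample).

```python
import re

# Interactive commands recorded in shell.log.
SHELL_CMD = re.compile(
    r"^(?P<ts>\S+) shell\[\d+\]: \[(?P<user>[^\]]+)\]: (?P<cmd>.+)$"
)
# Commands passed directly to ssh, recorded by sshd in auth.log.
# Assumption: quotes are straight in the raw log; curly quotes are accepted too.
SSHD_CMD = re.compile(
    r"^(?P<ts>\S+) sshd\[\d+\]: User ['‘](?P<user>[^'’]+)['’] "
    r"running command ['‘](?P<cmd>.+)['’]$"
)

def extract_command(line):
    """Return (timestamp, user, command, source), or None if no command is present."""
    for pattern, source in ((SHELL_CMD, "shell.log"), (SSHD_CMD, "auth.log")):
        match = pattern.match(line.strip())
        if match:
            return match["ts"], match["user"], match["cmd"], source
    return None

print(extract_command("2024-12-12T01:44:43.728Z shell[71435]: [root]: ls -la"))
print(extract_command(
    "2024-12-12T04:45:59.566Z sshd[71874]: User 'root' running command 'ls -la'"
))
```

Any command-based detection can then run over the third element of the tuple regardless of which log produced it.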

Authentication

Authentication events are pretty self-explanatory and thankfully, on ESXi hosts, they follow the same format you’d see on most Linux distributions. /var/log/auth.log contains events such as remote logins and other events that require authentication.

I won’t go into much detail here, as the potential detection use-cases for this log source align closely regardless of the underlying system, but you’d want to keep a close eye on failed authentication attempts to identify brute force attempts, especially if your ESXi system is exposed to the internet (if it is, stop reading this and change that!).

Remember to keep in mind what I mentioned earlier about auth.log also containing command execution events!

ESXi API

Finally, on to something more ESXi-specific. Did you know that ESXi systems expose an API even if they’re not connected to vCenter? I didn’t before researching this post!

It’s a rather annoying SOAP API to work with, but it allows you to perform many administrative tasks such as deleting VM snapshots, enabling SSH access, disabling autostart of VMs, and much more!

Unfortunately, it appears as though this is an undocumented API (or maybe I’m just bad at Googling) so it requires a lot of trial and error to get working. The following Python snippet showcases using this API to delete all snapshots for a given VM.

import requests

# replace with ESXi IP
base_url = "https://ESXI_IP_ADDRESS"

headers = {'Content-Type': 'text/xml', 'Cookie': 'vmware_client=VMWare'}
username = "USERNAME_HERE"
password = "PASSWORD_HERE"

# get SOAP session cookie for subsequent requests
body = f"""<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><Header><operationID>esxui-a04a</operationID></Header><Body><Login xmlns="urn:vim25"><_this type="SessionManager">ha-sessionmgr</_this><userName>{username}</userName><password>{password}</password><locale>en-US</locale></Login></Body></Envelope>"""
# verify=False skips TLS certificate validation
response = requests.post(base_url + "/sdk/", data=body, headers=headers, verify=False)

auth_cookie = response.cookies.get("vmware_soap_session")
# add session cookie to headers
headers.update({'Cookie': f'vmware_client=VMWare; vmware_soap_session={auth_cookie}'})
# create body that deletes all snapshots for VM ID 1
delete_snapshots_body = """<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><Header><operationID>esxui-e243</operationID></Header><Body><RemoveAllSnapshots_Task xmlns="urn:vim25"><_this type="VirtualMachine">1</_this></RemoveAllSnapshots_Task></Body></Envelope>"""
# delete all snapshots
response = requests.post(base_url + "/sdk/", data=delete_snapshots_body, headers=headers, verify=False)

After seeing this, your first thought as a Detection Engineer should be “Is this logged?” and thankfully it is! ESXi logs these API interactions in /var/log/hostd.log . Let’s quickly go over the events logged in hostd.log after executing that Python snippet.

2024-12-14T05:18:58.091Z info hostd[69050] [Originator@6876 sub=Default opID=esxui-a04a-5bc3] Accepted password for user root from 192.168.220.1
2024-12-14T05:18:58.091Z warning hostd[69050] [Originator@6876 sub=Vimsvc opID=esxui-a04a-5bc3] Refresh function is not configured.User data can’t be added to scheduler.User name: root
2024-12-14T05:18:58.091Z info hostd[69050] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=esxui-a04a-5bc3] Event 128 : User [email protected] logged in as python-requests/2.31.0
2024-12-14T05:18:58.115Z info hostd[68119] [Originator@6876 sub=Vimsvc.TaskManager opID=esxui-e243-5bc4 user=root] Task Created : haTask-1-vim.VirtualMachine.removeAllSnapshots-122
2024-12-14T05:18:58.115Z verbose hostd[68119] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test/test.vmx opID=esxui-e243-5bc4 user=root] Removeallsnapshots received. Consolidate: true
2024-12-14T05:18:58.115Z info hostd[68119] [Originator@6876 sub=Vimsvc.TaskManager opID=esxui-e243-5bc4 user=root] Task Completed : haTask-1-vim.VirtualMachine.removeAllSnapshots-122 Status success

We can see that hostd.log recorded when the user logged in (lines #1 & #3), what user-agent was used when logging in (python-requests/2.31.0), the API method that was invoked (lines #4-6), and finally whether the request was successful (line #6).
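To make these events easier to query, here’s a minimal sketch that splits a hostd.log line into fields and pulls the API method out of task events. The parse_hostd helper, the field names, and the regular expressions are my own, inferred only from the samples in this post, so treat the format as an assumption and verify it against your ESXi version.

```python
import re

HOSTD_LINE = re.compile(
    r"^(?P<ts>\S+) (?P<level>\w+) hostd\[(?P<pid>\d+)\] "
    r"\[Originator@\d+ (?P<meta>[^\]]*)\] (?P<msg>.*)$"
)
# key=value pairs inside the bracketed header, e.g. sub=..., opID=..., user=...
META_PAIR = re.compile(r"(\w+)=(.*?)(?=\s\w+=|$)")
# Task names as they appear in the samples,
# e.g. haTask-1-vim.VirtualMachine.removeAllSnapshots-122
TASK = re.compile(
    r"^Task (?P<state>Created|Completed) : "
    r"haTask-(?P<target>.+?)-(?P<method>vim\.[\w.]+)-\d+"
)

def parse_hostd(line):
    """Parse one hostd.log line into a dict, or return None if it does not match."""
    match = HOSTD_LINE.match(line.strip())
    if match is None:
        return None
    event = {"timestamp": match["ts"], "level": match["level"], "message": match["msg"]}
    event.update(dict(META_PAIR.findall(match["meta"])))
    task = TASK.match(match["msg"])
    if task:
        event["task_state"] = task["state"]
        event["api_method"] = task["method"]
    return event

sample = (
    "2024-12-14T05:18:58.115Z info hostd[68119] "
    "[Originator@6876 sub=Vimsvc.TaskManager opID=esxui-e243-5bc4 user=root] "
    "Task Created : haTask-1-vim.VirtualMachine.removeAllSnapshots-122"
)
print(parse_hostd(sample))
```

Note that the opID prefix is supplied by the client (the script above sends esxui-a04a in its operationID header), so I wouldn’t rely on it alone to identify the calling tool.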

If we modified the script to enable SSH access we’d see something similar.

2024-12-14T05:38:47.392Z info hostd[68122] [Originator@6876 sub=Default opID=esxui-a04a-5c4b] Accepted password for user root from 192.168.220.1
2024-12-14T05:38:47.392Z warning hostd[68122] [Originator@6876 sub=Vimsvc opID=esxui-a04a-5c4b] Refresh function is not configured.User data can’t be added to scheduler.User name: root
2024-12-14T05:38:47.392Z info hostd[68122] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=esxui-a04a-5c4b] Event 132 : User [email protected] logged in as python-requests/2.31.0
2024-12-14T05:38:47.410Z info hostd[67227] [Originator@6876 sub=Vimsvc.TaskManager opID=esxui-4560-5c4c user=root] Task Created : haTask-ha-host-vim.host.ServiceSystem.start-152
2024-12-14T05:38:47.441Z info hostd[67724] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 133 : SSH access has been enabled.
2024-12-14T05:38:47.497Z info hostd[68131] [Originator@6876 sub=Hostsvc.ServiceSystem opID=esxui-4560-5c4c user=root] TSM-SSH running status is true
2024-12-14T05:38:47.497Z info hostd[68131] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=esxui-4560-5c4c user=root] Event 134 : SSH for the host localhost.localdomain has been enabled
2024-12-14T05:38:47.497Z info hostd[68131] [Originator@6876 sub=Vimsvc.TaskManager opID=esxui-4560-5c4c user=root] Task Completed : haTask-ha-host-vim.host.ServiceSystem.start-152 Status success

Keep this log source in mind when ingesting logs to your SIEM platform and when developing detections. Don’t rely just on monitoring process executions to detect abnormal behavior. We’ll be diving more into this topic later on.

You can find Sigma detections focused on some of the higher impact API operations in the Detection Use Cases section of this post.

Built In ESXi Utilities

ESXi systems have a few built-in utilities that, while providing value to system administrators, have also seen use by threat actors. I won’t go into every utility here, but you can find a full, detailed listing at the wonderful LOLESXI project.

ESXCLI

The first one I’ll be discussing is esxcli. VMware gives the following description for esxcli:

esxcli is a command-line interface tool used to manage VMware ESXi hosts. Using esxcli, administrators can perform various tasks relating to ESXi host management, including network configuration, storage management, and VM operations.

Threat actors have been identified abusing esxcli to disable the ESXi firewall, manipulate logging settings, install malicious VIBs, and more. If your system administrators do not utilize esxcli, I recommend deploying a blanket detection that alerts on any usage. If they do, I’d recommend deploying detections focused on the highest impact functions of esxcli. You can find Sigma detections focused on this utility in the Detection Use Cases section of this post.

VIM-CMD

VMware gives the following description for vim-cmd:

A command-line utility in VMware ESXi that provides an interface to interact with the VMware Infrastructure (VI) API, allowing users to manage and automate tasks on ESXi host and its virtual machines (VMs)

You can think of vim-cmd as a wrapper for the SOAP API we played around with earlier.

Just like esxcli, threat actors have also been abusing vim-cmd for their own objectives. Similarly, if your system administrators do not use vim-cmd, I suggest deploying a blanket detection to alert on its use. If they do, I again recommend deploying detections to cover the highest impact actions it’s capable of. You can find Sigma detections focused on this utility in the Detection Use Cases section of this post.

Focus on Shell Events or API Logs?

So far I’ve gone over two methods of accomplishing objectives in ESXi environments, either through built-in administrator utilities or via the SOAP ESXi API and their respective logs.

You may be asking yourself: should I focus on analyzing shell events, which indicate usage of things like vim-cmd and esxcli, or API logs, which indicate usage of the SOAP ESXi API? If you HAVE to choose just one log source, either due to cost or manpower concerns, focus your efforts on API logs in hostd.log. You’ll be able to catch everything ESXi-specific, as the built-in administrator utilities are essentially wrappers for the API and will generate API logs when used. For example, disabling the ESXi firewall with esxcli network firewall set --enabled false creates an event in shell.log:

2024-12-14T04:59:16.556Z shell[68791]: [root]: esxcli network firewall set --enabled false

but it also creates an event in hostd.log :

2024-12-14T04:59:16.706Z info hostd[67225] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 121 : Firewall has been disabled.

Creating our detections over hostd.log allows us to be more “method agnostic”, which makes the detections more resilient. We don’t care if an adversary is using the SOAP API itself, esxcli, or some unknown method to disable the firewall; we’ll still detect it.

hostd.log vs shell.log

As you can see by this diagram, if we just focused on shell.log we’d miss the firewall being disabled via the SOAP API, whereas hostd.log provides coverage for both techniques.
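A method-agnostic check over hostd.log can be as simple as matching on the event message itself. The sketch below is only an illustration: the detect helper and the HIGH_IMPACT mapping are my own, and the two messages are copied from the hostd.log samples in this post, so extend the mapping for your own environment.

```python
import re

# Matches hostd.log event-manager lines of the form "... Event <id> : <message>".
EVENT = re.compile(
    r"^(?P<ts>\S+) \w+ hostd\[\d+\] \[[^\]]*\] Event (?P<id>\d+) : (?P<msg>.+)$"
)

# Messages taken from the hostd.log samples in this post.
HIGH_IMPACT = {
    "Firewall has been disabled.": "ESXi firewall disabled",
    "SSH access has been enabled.": "SSH access enabled",
}

def detect(line):
    """Return an alert dict for high impact hostd events, otherwise None."""
    match = EVENT.match(line.strip())
    if match and match["msg"] in HIGH_IMPACT:
        return {
            "timestamp": match["ts"],
            "event_id": int(match["id"]),
            "alert": HIGH_IMPACT[match["msg"]],
        }
    return None

line = (
    "2024-12-14T04:59:16.706Z info hostd[67225] "
    "[Originator@6876 sub=Vimsvc.ha-eventmgr] Event 121 : Firewall has been disabled."
)
print(detect(line))
```

Because the match is on the resulting event rather than the command, the same check fires whether the change came from esxcli, vim-cmd, or a raw SOAP request.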

If you’ve got the time, I still suggest looking at both log sources for two reasons. Number one is that, as the ESXi shell is a BusyBox-based, POSIX-like environment, there are some OS-level commands like pkill, kill and more that have the ability to impact ESXi services without leaving a detailed trace in API logs. For example, if we used pkill to terminate all VMs just before deploying ransomware, we’d just see a bunch of the following events in hostd.log which, on their own, aren’t that significant.

2024-12-14T05:06:37.587Z info hostd[68117] [Originator@6876 sub=Hbrsvc opID=esxui-d55b-5b82] Replicator: Poweroff for VM: (id=1)

shell.log, on the other hand, would give us the context that pkill was used, which definitely is suspicious.

2024-12-14T05:06:37.563Z shell[68791]: [root]: pkill vmx-*

The second is that evidence of suspicious actions in shell.log indicates an adversary is already on the ESXi system itself and further along the kill-chain, which may change our incident response strategy.

Example Techniques

Okay, so far I’ve gone over the most useful data sources available natively in ESXi environments. Let’s create some noise and showcase some common adversary techniques and investigate what evidence they leave behind. If you don’t care about emulation and just want to get straight to the detections, skip to the Detection Use Cases section of this post.

Enable SSH

Once an adversary gains privileged access to an ESXi host, they’ll usually want to start creating more permanent access methods. One such method is enabling SSH access directly to the ESXi host. This can be accomplished either via vim-cmd hostsvc/enable_ssh or by using the SOAP API while calling the StartService method on the TSM-SSH service.

Let’s enable SSH access both ways and examine what evidence gets left behind. If we execute the following command:

vim-cmd hostsvc/enable_ssh

The shell.log artifacts are fairly simple: we get a record of the command that was executed.

2024-12-14T19:24:56.908Z shell[689111]: [root]: vim-cmd hostsvc/enable_ssh

Looking at hostd.log we get the following record:

2024-12-14T19:32:29.810Z info hostd[68929] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 181 : SSH access has been enabled.
2024-12-14T19:32:29.866Z info hostd[68306] [Originator@6876 sub=Hostsvc.ServiceSystem opID=esxui-7e63-f8ee user=root] TSM-SSH running status is true
2024-12-14T19:32:29.866Z info hostd[68306] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=esxui-7e63-f8ee user=root] Event 182 : SSH for the host localhost.localdomain has been enabled

Now let’s try the same but with the SOAP API.

import requests

# replace with ESXi IP
base_url = "https://ESXI_IP_ADDRESS"
headers = {'Content-Type': 'text/xml', 'Cookie': 'vmware_client=VMWare'}
username = "USERNAME"
password = "PASSWORD"
body = f"""<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><Header><operationID>esxui-a04a</operationID></Header><Body><Login xmlns="urn:vim25"><_this type="SessionManager">ha-sessionmgr</_this><userName>{username}</userName><password>{password}</password><locale>en-US</locale></Login></Body></Envelope>"""
# get SOAP session cookie for subsequent requests
response = requests.post(base_url + "/sdk/", data=body, headers=headers, verify=False)
auth_cookie = response.cookies.get("vmware_soap_session")
# add session cookie to headers
headers.update({'Cookie': f'vmware_client=VMWare; vmware_soap_session={auth_cookie}'})
# create body that enables SSH access
enable_ssh_access = """<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><Header><operationID>esxui-4560</operationID></Header><Body><StartService xmlns="urn:vim25"><_this type="HostServiceSystem">serviceSystem</_this><id>TSM-SSH</id></StartService></Body></Envelope>"""
# enable SSH access
response = requests.post(base_url + "/sdk/", data=enable_ssh_access, headers=headers, verify=False)
print(response.text)
print(response.status_code)

Understandably, this method of enabling SSH leaves no artifacts in shell.log but leaves the same evidence as vim-cmd hostsvc/enable_ssh in hostd.log , reinforcing the point I made earlier about focusing on that log source.

2024-12-14T19:39:24.724Z info hostd[67548] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 192 : SSH access has been enabled.
2024-12-14T19:39:24.796Z info hostd[68310] [Originator@6876 sub=Hostsvc.ServiceSystem opID=esxui-4560-f989 user=root] TSM-SSH running status is true
2024-12-14T19:39:24.797Z info hostd[68310] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=esxui-4560-f989 user=root] Event 193 : SSH for the host localhost.localdomain has been enabled

If you’d like to try this technique yourself and test your detections, you can emulate vim-cmd being used to enable SSH with the following Atomic Red Team test: https://www.atomicredteam.io/atomic-red-team/atomics/T1021.004#atomic-test-2---esxi---enable-ssh-via-vim-cmd. For testing the SOAP API method just use the Python snippet shared above, modifying it for your environment.

Disable Firewall

The ESXi firewall is one of the last lines of defense for your host. If an adversary has somehow gained network access to the host, the firewall can help prevent access attempts. If an adversary somehow still gets access to the system, either via a vulnerability or misconfiguration, they may disable the firewall so it doesn’t interfere with their operations.

We can use either esxcli or the SOAP API to disable the firewall. Let’s go over esxcli first.

There are actually two methods of disabling the firewall using esxcli. You can either simply disable the firewall completely with:

esxcli network firewall set --enabled false

Or you can set the default firewall action to PASS instead of DROP, effectively allowing all incoming and outgoing traffic:

esxcli network firewall set --default-action true

Both of these leave evidence in shell.log & hostd.log that’s similar to what we’ve seen before.

Disabling the firewall completely:

2024-12-14T20:21:34.888Z shell[69732]: [root]: esxcli network firewall set --enabled false

2024-12-14T20:21:35.017Z info hostd[68310] [Originator@6876 sub=Default opID=esxcli-67-f9d2] Accepted password for user root from 127.0.0.1
2024-12-14T20:21:35.017Z info hostd[68310] [Originator@6876 sub=Vimsvc opID=esxcli-67-f9d2] [Auth]: User root
2024-12-14T20:21:35.017Z warning hostd[68310] [Originator@6876 sub=Vimsvc opID=esxcli-67-f9d2] Refresh function is not configured.User data can’t be added to scheduler.User name: root
2024-12-14T20:21:35.017Z info hostd[68310] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=esxcli-67-f9d2] Event 197 : User [email protected] logged in as pyvmomi Python/3.8.13 (VMkernel; 7.0.3; x86_64)
2024-12-14T20:21:35.046Z info hostd[68929] [Originator@6876 sub=Solo.VmwareCLI opID=esxcli-67-f9d8 user=root] Dispatch set
2024-12-14T20:21:35.049Z info hostd[68305] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 198 : Firewall has been disabled.

Setting default action to PASS instead of DROP:

2024-12-14T20:24:22.029Z shell[69732]: [root]: esxcli network firewall set --default-action true

2024-12-14T20:24:22.161Z info hostd[68304] [Originator@6876 sub=Default opID=esxcli-34-f9df] Accepted password for user root from 127.0.0.1
2024-12-14T20:24:22.161Z info hostd[68304] [Originator@6876 sub=Vimsvc opID=esxcli-34-f9df] [Auth]: User root
2024-12-14T20:24:22.161Z warning hostd[68304] [Originator@6876 sub=Vimsvc opID=esxcli-34-f9df] Refresh function is not configured.User data can’t be added to scheduler.User name: root
2024-12-14T20:24:22.161Z info hostd[68304] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=esxcli-34-f9df] Event 200 : User [email protected] logged in as pyvmomi Python/3.8.13 (VMkernel; 7.0.3; x86_64)
2024-12-14T20:24:22.189Z info hostd[68302] [Originator@6876 sub=Solo.VmwareCLI opID=esxcli-34-f9e5 user=root] Dispatch set
2024-12-14T20:24:22.191Z info hostd[68302] [Originator@6876 sub=Solo.VmwareCLI opID=esxcli-34-f9e5 user=root] Dispatch set done
2024-12-14T20:24:22.195Z info hostd[68306] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=esxcli-34-f9e6 user=root] Event 202 : User [email protected] logged out (login time: Saturday, 14 December, 2024 08:24:22 PM, number of API invocations: 7, user agent: pyvmomi Python/3.8.13 (VMkernel; 7.0.3; x86_64))
2024-12-14T20:24:22.191Z info hostd[68011] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 201 : Firewall has been disabled.

As you can see, both methods leave “Firewall has been disabled” messages in hostd.log.
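If you also want shell.log coverage for these two commands, it helps to parse the arguments rather than match the exact string, since spacing and option order can vary. This is a hedged sketch: the firewall_weakened helper is my own, it only knows the two long options shown above, and the --flag=value spelling is handled as an assumption about how an operator might type the command.

```python
import shlex

def firewall_weakened(command):
    """Return a reason if an esxcli command disables or opens the firewall, else None."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return None  # unbalanced quotes; leave for manual review
    if tokens[:4] != ["esxcli", "network", "firewall", "set"]:
        return None
    # Flatten "--flag value" and "--flag=value" into alternating flag/value items.
    args = []
    for token in tokens[4:]:
        args.extend(token.split("=", 1))
    options = dict(zip(args[::2], args[1::2]))
    if options.get("--enabled", "").lower() == "false":
        return "firewall disabled"
    if options.get("--default-action", "").lower() == "true":
        return "default action set to PASS"
    return None

print(firewall_weakened("esxcli network firewall set --enabled false"))
print(firewall_weakened("esxcli network firewall set --default-action true"))
```

A command invoked through a full path or a wrapper script would slip past this check, which is one more reason to pair it with the hostd.log events.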

On to the SOAP API. Here I wasn’t able to disable the firewall completely or set the default action to pass, but you are able to disable individual firewall rule sets. The following example disables the WOL firewall rule set.

import requests

# replace with ESXi IP
base_url = "https://ESXI_HOST"
headers = {'Content-Type': 'text/xml', 'Cookie': 'vmware_client=VMWare'}
username = "USERNAME"
password = "PASSWORD"
body = f"""<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><Header><operationID>esxui-a04a</operationID></Header><Body><Login xmlns="urn:vim25"><_this type="SessionManager">ha-sessionmgr</_this><userName>{username}</userName><password>{password}</password><locale>en-US</locale></Login></Body></Envelope>"""
# get SOAP session cookie for subsequent requests
response = requests.post(base_url + "/sdk/", data=body, headers=headers, verify=False)
auth_cookie = response.cookies.get("vmware_soap_session")
# add session cookie to headers
headers.update({'Cookie': f'vmware_client=VMWare; vmware_soap_session={auth_cookie}'})
# create body that disables the WOL firewall rule
disable_firewall_rule = """<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><Header><operationID>esxui-341a</operationID></Header><Body><DisableRuleset xmlns="urn:vim25"><_this type="HostFirewallSystem">firewallSystem</_this><id>WOL</id></DisableRuleset></Body></Envelope>"""
# disable the firewall rule
response = requests.post(base_url + "/sdk/", data=disable_firewall_rule, headers=headers, verify=False)
print(response.text)
print(response.status_code)

We can see evidence of this in hostd.log :

2024-12-14T20:34:08.731Z info hostd[67517] [Originator@6876 sub=Default opID=esxui-a04a-face] Accepted password for user root from 192.168.220.1
2024-12-14T20:34:08.731Z warning hostd[67517] [Originator@6876 sub=Vimsvc opID=esxui-a04a-face] Refresh function is not configured.User data can’t be added to scheduler.User name: root
2024-12-14T20:34:08.731Z info hostd[67517] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=esxui-a04a-face] Event 214 : User [email protected] logged in as python-requests/2.31.0
2024-12-14T20:34:08.751Z info hostd[67516] [Originator@6876 sub=Vimsvc.TaskManager opID=esxui-341a-facf user=root] Task Created : haTask-ha-host-vim.host.FirewallSystem.disableRuleset-453
2024-12-14T20:34:08.753Z info hostd[67516] [Originator@6876 sub=Vimsvc.TaskManager opID=esxui-341a-facf user=root] Task Completed : haTask-ha-host-vim.host.FirewallSystem.disableRuleset-453 Status success
2024-12-14T20:34:08.754Z info hostd[67517] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 215 : Firewall configuration has changed. Operation ‘disable’ for rule set WOL succeeded.
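The last event in that sample carries both the operation and the rule set name, which makes it a convenient detection target. Here’s a small sketch that extracts them; the parse_ruleset_change helper and its regular expression are my own, built only from the single sample above, and I’m assuming the raw log uses straight quotes (curly quotes are accepted as well).

```python
import re

# Built from the one sample event shown above; other operations or results
# may be worded differently, so treat the pattern as an assumption.
RULESET_CHANGE = re.compile(
    r"Firewall configuration has changed\. "
    r"Operation ['‘’](?P<operation>\w+)['‘’] for rule set "
    r"(?P<ruleset>.+) (?P<result>\w+)\.$"
)

def parse_ruleset_change(line):
    """Return the operation, rule set and result from a hostd firewall event."""
    match = RULESET_CHANGE.search(line.strip())
    return match.groupdict() if match else None

line = (
    "2024-12-14T20:34:08.754Z info hostd[67517] "
    "[Originator@6876 sub=Vimsvc.ha-eventmgr] Event 215 : "
    "Firewall configuration has changed. Operation 'disable' for rule set WOL succeeded."
)
print(parse_ruleset_change(line))
```

Alerting when operation is disable, perhaps with an allowlist of rule sets your administrators routinely toggle, should keep the noise manageable.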

You can use the following Atomic Red Team tests and the above Python snippet to validate your detections after they’ve been deployed:

https://www.atomicredteam.io/atomic-red-team/atomics/T1562.004#atomic-test-23---esxi---disable-firewall-via-esxcli
https://www.atomicredteam.io/atomic-red-team/atomics/T1562.004#atomic-test-25---esxi---set-firewall-to-pass-traffic

Deleting Snapshots

In my opinion, deleting all snapshots for a VM is one of the highest impact operations that can happen on an ESXi host, just behind actual ransomware encryption. This action makes recovering from any ransomware attack extremely difficult if you aren’t storing secondary backups offsite. Unfortunately, both vim-cmd and the SOAP API provide simple ways of deleting all of the snapshots for a given VM.

We can simply pass the ID of a VM to vim-cmd vmsvc/snapshot.removeall $VM_ID_HERE to remove all snapshots. If we want to be particularly nefarious, we could enumerate all VMs on a system using vim-cmd vmsvc/getallvms and then pass those IDs to another vim-cmd vmsvc/snapshot.removeall command to remove all snapshots for every VM on the system. This can be achieved in a simple shell for loop:

for i in `vim-cmd vmsvc/getallvms | awk 'NR>1 {print $1}'`; do vim-cmd vmsvc/snapshot.removeall $i & done

We can see this in shell.log as:

2024-12-14T21:04:58.700Z shell[69732]: [root]: for i in `vim-cmd vmsvc/getallvms | awk 'NR>1 {print $1}'`; do vim-cmd vmsvc/snapshot.removeall $i & done

But in hostd.log we get a lot more detail. We can see exactly which VMs had their snapshots removed (test_vm.vmx & test_vm2.vmx), each individual snapshot removal operation, and the user-agent that made these requests (VMware-client/6.5.0). I’d recommend focusing on the Removeallsnapshots received events for detections as they include both the removal operation and which VM was impacted.

2024-12-14T21:06:50.374Z info hostd[68310] [Originator@6876 sub=Default opID=vim-cmd-9e-fbc3] Accepted password for user root from 127.0.0.1
2024-12-14T21:06:50.374Z warning hostd[68310] [Originator@6876 sub=Vimsvc opID=vim-cmd-9e-fbc3] Refresh function is not configured.User data can’t be added to scheduler.User name: root
2024-12-14T21:06:50.374Z info hostd[68310] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=vim-cmd-9e-fbc3] Event 225 : User [email protected] logged in as VMware-client/6.5.0
2024-12-14T21:06:50.377Z info hostd[68310] [Originator@6876 sub=Vimsvc.TaskManager opID=vim-cmd-9e-fbc6 user=root] Task Created : haTask-2-vim.VirtualMachine.removeAllSnapshots-504
2024-12-14T21:06:50.377Z verbose hostd[68310] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm2/test_vm2.vmx opID=vim-cmd-9e-fbc6 user=root] Removeallsnapshots received. Consolidate: true
2024-12-14T21:06:50.377Z info hostd[68310] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm2/test_vm2.vmx opID=vim-cmd-9e-fbc6 user=root] State Transition (VM_STATE_OFF -> VM_STATE_REMOVEALL_SNAPSHOT)
2024-12-14T21:06:50.387Z info hostd[68310] [Originator@6876 sub=Libs opID=vim-cmd-9e-fbc6 user=root] SNAPSHOT: SnapshotDeleteWork ‘/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm2/test_vm2.vmx’ : 1
2024-12-14T21:06:50.393Z info hostd[68011] [Originator@6876 sub=Default opID=vim-cmd-d1-fbcd] Accepted password for user root from 127.0.0.1
2024-12-14T21:06:50.393Z warning hostd[68011] [Originator@6876 sub=Vimsvc opID=vim-cmd-d1-fbcd] Refresh function is not configured.User data can’t be added to scheduler.User name: root
2024-12-14T21:06:50.393Z info hostd[68011] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=vim-cmd-d1-fbcd] Event 226 : User [email protected] logged in as VMware-client/6.5.0
2024-12-14T21:06:50.396Z info hostd[67517] [Originator@6876 sub=Vimsvc.TaskManager opID=vim-cmd-d1-fbd0 user=root] Task Created : haTask-1-vim.VirtualMachine.removeAllSnapshots-505
2024-12-14T21:06:50.396Z verbose hostd[68306] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm/test_vm.vmx opID=vim-cmd-d1-fbd0 user=root] Removeallsnapshots received. Consolidate: true
2024-12-14T21:06:50.396Z info hostd[68306] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm/test_vm.vmx opID=vim-cmd-d1-fbd0 user=root] State Transition (VM_STATE_OFF -> VM_STATE_REMOVEALL_SNAPSHOT)
2024-12-14T21:06:50.398Z info hostd[68306] [Originator@6876 sub=Libs opID=vim-cmd-d1-fbd0 user=root] SNAPSHOT: SnapshotDeleteWork ‘/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm/test_vm.vmx’ : 1
2024-12-14T21:06:50.401Z verbose hostd[68310] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm2/test_vm2.vmx opID=vim-cmd-9e-fbc6 user=root] Consolidate disks after snapshot removal.
2024-12-14T21:06:50.402Z info hostd[68310] [Originator@6876 sub=Libs opID=vim-cmd-9e-fbc6 user=root] SNAPSHOT: SnapshotConfigInfoOpenVmsd: Creating new snapshot dictionary, ‘/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm2/test_vm2.vmsd.usd’.
2024-12-14T21:06:50.411Z info hostd[68310] [Originator@6876 sub=Libs opID=vim-cmd-9e-fbc6 user=root] SNAPSHOT: SnapshotCombineDisks: Consolidating from ‘/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm2/test_vm2-000001.vmdk’ to ‘/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm2/test_vm2.vmdk’.
2024-12-14T21:06:50.426Z verbose hostd[68306] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm/test_vm.vmx opID=vim-cmd-d1-fbd0 user=root] Consolidate disks after snapshot removal.
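Building on the Removeallsnapshots received events, the sketch below pulls out the affected .vmx path and flags the case where several distinct VMs lose their snapshots in a short window, which is the pattern the for loop above produces. The mass_snapshot_removal helper, the threshold, and the window are hypothetical values of my own that you’d tune for your environment.

```python
import re
from datetime import datetime, timedelta

REMOVE_ALL = re.compile(
    r"^(?P<ts>\S+) \w+ hostd\[\d+\] \[Originator@\d+ "
    r"sub=Vmsvc\.vm:(?P<vmx>.+?\.vmx) opID=[^\]]*\] Removeallsnapshots received"
)

def mass_snapshot_removal(lines, threshold=2, window=timedelta(minutes=5)):
    """Return the VMs affected if at least `threshold` distinct VMs had all
    snapshots removed within `window`; otherwise return an empty list."""
    hits = []
    for line in lines:
        match = REMOVE_ALL.match(line.strip())
        if match:
            ts = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S.%fZ")
            hits.append((ts, match["vmx"]))
    hits.sort()
    for index, (start, _) in enumerate(hits):
        vms = {vmx for ts, vmx in hits[index:] if ts - start <= window}
        if len(vms) >= threshold:
            return sorted(vms)
    return []

datastore = "/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f"
sample = [
    f"2024-12-14T21:06:50.377Z verbose hostd[68310] [Originator@6876 sub=Vmsvc.vm:{datastore}/test_vm2/test_vm2.vmx opID=vim-cmd-9e-fbc6 user=root] Removeallsnapshots received. Consolidate: true",
    f"2024-12-14T21:06:50.396Z verbose hostd[68306] [Originator@6876 sub=Vmsvc.vm:{datastore}/test_vm/test_vm.vmx opID=vim-cmd-d1-fbd0 user=root] Removeallsnapshots received. Consolidate: true",
]
print(mass_snapshot_removal(sample))
```

A single removal might be routine maintenance; removals across many VMs within minutes is much harder to explain away.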

We can do the same with the SOAP API by getting a listing of all VMs and then calling the RemoveAllSnapshots method.

import requests
import xmltodict
# replace with ESXi IP
base_url = "https://ESXI_HOST_IP"
headers = {'Content-Type': 'text/xml', 'Cookie': 'vmware_client=VMWare', 'SOAPAction': 'urn:vim25/7.0.3.0'}
# replace with ESXi credentials
username = "USERNAME"
password = "PASSWORD"
body = f"""<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><Header><operationID>esxui-a04a</operationID></Header><Body><Login xmlns="urn:vim25"><_this type="SessionManager">ha-sessionmgr</_this><userName>{username}</userName><password>{password}</password><locale>en-US</locale></Login></Body></Envelope>"""
# get SOAP session cookie for subsequent requests
# (verify=False skips TLS certificate validation, e.g. for self-signed lab certificates)
response = requests.post(base_url + "/sdk/", data=body, headers=headers, verify=False)
auth_cookie = response.cookies.get("vmware_soap_session")
# add session cookie to headers
headers.update({'Cookie': f'vmware_client=VMWare; vmware_soap_session={auth_cookie}'})
# get list of all VMs
enumerate_vms = """<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><Header><operationID>esxui-90za</operationID></Header><Body><RetrievePropertiesEx xmlns="urn:vim25"><_this type="PropertyCollector">ha-property-collector</_this><specSet><propSet><type>Folder</type><all>false</all><pathSet>childEntity</pathSet></propSet><objectSet><obj type="Folder">ha-folder-vm</obj><skip>false</skip></objectSet></specSet><options/></RetrievePropertiesEx></Body></Envelope>"""
response = requests.post(base_url + "/sdk/", data=enumerate_vms, headers=headers, verify=False)
vm_ids = []
vm_id_response = xmltodict.parse(response.text)
# have I said how much I hate XML?
# note: with a single VM, xmltodict returns a dict here instead of a list
for vm_id in vm_id_response["soapenv:Envelope"]["soapenv:Body"]["RetrievePropertiesExResponse"]["returnval"]["objects"]["propSet"]["val"]["ManagedObjectReference"]:
    vm_ids.append(vm_id["#text"])
for vms in vm_ids:
    # delete all snapshots for each VM in the ESXi host.
    delete_snapshots_body = f"""<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><Header><operationID>esxui-e243</operationID></Header><Body><RemoveAllSnapshots_Task xmlns="urn:vim25"><_this type="VirtualMachine">{vms}</_this></RemoveAllSnapshots_Task></Body></Envelope>"""
    response = requests.post(base_url + "/sdk/", data=delete_snapshots_body, headers=headers, verify=False)
    print(response.status_code)

This leaves considerably more information in hostd.log due to that first request to enumerate all VMs, but we can still find evidence of the removeAllSnapshots method being invoked.

2024-12-14T21:54:15.234Z info hostd[67514] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=esxui-a04a-fd67] Event 259 : User [email protected] logged in as python-requests/2.31.0
2024-12-14T21:54:15.268Z info hostd[67514] [Originator@6876 sub=Vimsvc.TaskManager opID=esxui-e243-fd69 user=root] Task Created : haTask-1-vim.VirtualMachine.removeAllSnapshots-580
2024-12-14T21:54:15.269Z verbose hostd[67937] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm/test_vm.vmx opID=esxui-e243-fd69 user=root] Removeallsnapshots received. Consolidate: true
2024-12-14T21:54:15.269Z info hostd[67937] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm/test_vm.vmx opID=esxui-e243-fd69 user=root] State Transition (VM_STATE_OFF -> VM_STATE_REMOVEALL_SNAPSHOT)
2024-12-14T21:54:15.279Z info hostd[67937] [Originator@6876 sub=Libs opID=esxui-e243-fd69 user=root] SNAPSHOT: SnapshotDeleteWork '/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm/test_vm.vmx' : 2
2024-12-14T21:54:15.286Z info hostd[68304] [Originator@6876 sub=Vimsvc.TaskManager opID=esxui-e243-fd71 user=root] Task Created : haTask-2-vim.VirtualMachine.removeAllSnapshots-581
2024-12-14T21:54:15.287Z verbose hostd[68304] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm2/test_vm2.vmx opID=esxui-e243-fd71 user=root] Removeallsnapshots received. Consolidate: true

And of course, this method leaves no trace in shell.log.

You can use the following Atomic Red Team test and the Python snippet above to validate your detections after they’ve been deployed: https://www.atomicredteam.io/atomic-red-team/atomics/T1485#atomic-test-5---esxi---delete-vm-snapshots
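
To show how the hostd.log entries above could feed a detection, here is a minimal Python sketch that scans log lines for the removeAllSnapshots task-creation event and extracts the timestamp, opID, and user. The regular expression and the function name find_snapshot_removals are assumptions based only on the sample lines in this post; other ESXi versions may format these entries differently.

```python
import re

# Assumed pattern, derived from the sample hostd.log lines in this post.
TASK_RE = re.compile(
    r"^(?P<ts>\S+) \w+ hostd\[\d+\] "
    r"\[Originator@\d+ sub=Vimsvc\.TaskManager "
    r"opID=(?P<opid>\S+) user=(?P<user>[^\]]+)\] "
    r"Task Created : \S*vim\.VirtualMachine\.removeAllSnapshots-\d+"
)


def find_snapshot_removals(lines):
    """Return one dict (ts, opid, user) per removeAllSnapshots task found."""
    hits = []
    for line in lines:
        match = TASK_RE.match(line)
        if match:
            hits.append(match.groupdict())
    return hits


# Two lines copied from the excerpt above: only the first is a task creation.
sample = [
    "2024-12-14T21:54:15.268Z info hostd[67514] [Originator@6876 sub=Vimsvc.TaskManager opID=esxui-e243-fd69 user=root] Task Created : haTask-1-vim.VirtualMachine.removeAllSnapshots-580",
    "2024-12-14T21:54:15.269Z verbose hostd[67937] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm/test_vm.vmx opID=esxui-e243-fd69 user=root] Removeallsnapshots received. Consolidate: true",
]

for hit in find_snapshot_removals(sample):
    print(hit["ts"], hit["user"], hit["opid"])
```

Matching on the Task Created event rather than on the opID is a deliberate choice in this sketch: in the SOAP example the opID prefix comes from the operationID header set by the script, so it is controlled by whoever sends the request.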

Killing Virtual Machines

Before ransomware can encrypt VMs, the VMs must first be terminated. There are numerous ways to do this: you can use vim-cmd, esxcli, pkill, or the PowerOffVM_Task SOAP API method.

The vim-cmd, esxcli, and pkill methods have all been documented and have existing Atomic Red Team tests, so we’ll skip their implementation.

If we take a look at the SOAP API method, it’s similar to removing all snapshots for a given VM. We first enumerate all VMs on the ESXi host and then make individual requests to power them off.

import requests
import xmltodict
# replace with ESXi IP
base_url = "https://ESXI_IP"
headers = {'Content-Type': 'text/xml', 'Cookie': 'vmware_client=VMWare', 'SOAPAction': 'urn:vim25/7.0.3.0'}
# replace with ESXi credentials
username = "USERNAME"
password = "PASSWORD"
body = f"""<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><Header><operationID>esxui-a04a</operationID></Header><Body><Login xmlns="urn:vim25"><_this type="SessionManager">ha-sessionmgr</_this><userName>{username}</userName><password>{password}</password><locale>en-US</locale></Login></Body></Envelope>"""
# get SOAP session cookie for subsequent requests
# (verify=False skips TLS certificate validation, e.g. for self-signed lab certificates)
response = requests.post(base_url + "/sdk/", data=body, headers=headers, verify=False)
auth_cookie = response.cookies.get("vmware_soap_session")
# add session cookie to headers
headers.update({'Cookie': f'vmware_client=VMWare; vmware_soap_session={auth_cookie}'})
# get list of all VMs
enumerate_vms = """<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><Header><operationID>esxui-90za</operationID></Header><Body><RetrievePropertiesEx xmlns="urn:vim25"><_this type="PropertyCollector">ha-property-collector</_this><specSet><propSet><type>Folder</type><all>false</all><pathSet>childEntity</pathSet></propSet><objectSet><obj type="Folder">ha-folder-vm</obj><skip>false</skip></objectSet></specSet><options/></RetrievePropertiesEx></Body></Envelope>"""
response = requests.post(base_url + "/sdk/", data=enumerate_vms, headers=headers, verify=False)
vm_ids = []
vm_id_response = xmltodict.parse(response.text)
# have I said how much I hate XML?
# note: with a single VM, xmltodict returns a dict here instead of a list
for vm_id in vm_id_response["soapenv:Envelope"]["soapenv:Body"]["RetrievePropertiesExResponse"]["returnval"]["objects"]["propSet"]["val"]["ManagedObjectReference"]:
    vm_ids.append(vm_id["#text"])
for vms in vm_ids:
    # power off all VMs
    power_off_vms_body = f"""<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><Header><operationID>esxui-d3</operationID></Header><Body><PowerOffVM_Task xmlns="urn:vim25"><_this type="VirtualMachine">{vms}</_this></PowerOffVM_Task></Body></Envelope>"""
    response = requests.post(base_url + "/sdk/", data=power_off_vms_body, headers=headers, verify=False)
    print(response.status_code)

This of course leaves evidence in hostd.log. We can see which VMs were terminated and the user who made the API request.

2024-12-14T22:30:42.333Z info hostd[69711] [Originator@6876 sub=Default opID=esxui-a04a-fe8f] Accepted password for user root from 192.168.220.1
2024-12-14T22:30:42.333Z warning hostd[69711] [Originator@6876 sub=Vimsvc opID=esxui-a04a-fe8f] Refresh function is not configured.User data can't be added to scheduler.User name: root
2024-12-14T22:30:42.333Z info hostd[69711] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=esxui-a04a-fe8f] Event 305 : User root@192.168.220.1 logged in as python-requests/2.31.0
2024-12-14T22:30:42.368Z info hostd[68305] [Originator@6876 sub=Vimsvc.TaskManager opID=esxui-d3-fe91 user=root] Task Created : haTask-1-vim.VirtualMachine.powerOff-639
2024-12-14T22:30:42.369Z verbose hostd[68305] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/673fd53b-ef4df1e9-b63e-000c2994365f/test_vm/test_vm.vmx opID=esxui-d3-fe91 user=root] Power off request received
2024-12-14T22:30:42.369Z info hostd[68305] [Originator@6876 sub=Vimsvc.ha-eventmgr opID=esxui-d3-fe91 user=root] Event 306 : test_vm on  localhost.localdomain in ha-datacenter is stopping

You can use the Atomic Red Team tests mentioned above and the Python snippet above to validate your detections after they’ve been deployed.
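
A script like the one above powers off every VM within moments, so one possible use of these events is to count powerOff tasks per user inside a short time window. The sketch below is a rough illustration that assumes the hostd.log format shown in this section. The 60-second window, the threshold of 3, and the helper names parse_ts and power_off_bursts are hypothetical values chosen for the example, not tuned recommendations.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed pattern, derived from the powerOff "Task Created" line shown above.
POWER_OFF_RE = re.compile(
    r"^(?P<ts>\S+) .*opID=\S+ user=(?P<user>[^\]]+)\] "
    r"Task Created : \S*vim\.VirtualMachine\.powerOff-\d+"
)


def parse_ts(value):
    """Parse timestamps such as 2024-12-14T22:30:42.368Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")


def power_off_bursts(lines, window=timedelta(seconds=60), threshold=3):
    """Return the users who created `threshold` or more powerOff tasks within `window`."""
    times_by_user = defaultdict(list)
    for line in lines:
        match = POWER_OFF_RE.match(line)
        if match:
            times_by_user[match["user"]].append(parse_ts(match["ts"]))

    flagged = set()
    for user, times in times_by_user.items():
        times.sort()
        # Compare each event with the one `threshold - 1` positions later;
        # if both fall inside the window, so does everything in between.
        for i in range(len(times) - threshold + 1):
            if times[i + threshold - 1] - times[i] <= window:
                flagged.add(user)
                break
    return flagged
```

Passing the lines of a collected hostd.log to power_off_bursts returns the set of user names that crossed the threshold. Legitimate maintenance can also power off several VMs at once, so the window and threshold would need tuning per environment.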

ESXi Testing Toolkit

I was all ready to publish this post until one night, lying in bed thinking about it (as most sane people do), I kept coming back to how annoying it was to deal with ESXi’s SOAP API and to keep track of every test and its corresponding utility. No one should have to go through this again.

So I decided to fix that! I created ESXi Testing Toolkit. It’s a simple, easy-to-use Python-based CLI tool that packages 21 adversarial tests from LOLESXi into easy-to-use commands.

In addition to containing 21 tests, it also contains:

A verbose mode that automatically pulls down hostd.log or shell.log from the ESXi host for analysis, making the detection engineering process easier.
19 (!) Sigma rules that cover every technique it emulates.
The ability to execute tests either via SSH or API, depending on the technique.

The toolkit being used to enumerate all VM IDs and then delete all snapshots for a specific VM

Changing the ESXi DCUI Welcome Message

Specifying either ESXCLI or VIM-CMD for a given test

It contains a list command that gives a detailed list of available tests, their dependencies, risk level, module, and more!

Verbose mode that returns logs directly from the ESXi system!

Refer to the tool’s GitHub page for installation instructions, configuration, and a more detailed overview.

https://github.com/AlbinoGazelle/esxi-testing-toolkit

The toolkit is completely open source under the MIT license, so feel free to use it in enterprise, community, or any other unforeseen context!

Detection Use Cases

Through the course of writing this post and developing ESXi Testing Toolkit, I created quite a few detections. I’ve publicly shared these detection rules, in Sigma format, in the ESXi Testing Toolkit GitHub repository under the /detections folder.

These detections are in the process of being merged into the main Sigma rule repository so more people can benefit!
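
The Sigma rules themselves live in the repository. Purely as an illustration of the kind of logic a hostd.log detection can use (not a copy of any rule there), the sketch below flags API logins whose user agent looks like a scripted client, such as the python-requests/2.31.0 value in the login events shown earlier. The regular expression, the prefix list, and the function name scripted_logins are assumptions for this example. An attacker can change the user agent, so a hit is a lead rather than proof.

```python
import re

# Assumed pattern, based on the "logged in as" events shown in this post.
LOGIN_RE = re.compile(
    r"^(?P<ts>\S+) .*Event \d+ : User (?P<user>\S+) logged in as (?P<agent>.+)$"
)

# Hypothetical, non-exhaustive list of user-agent prefixes for scripted clients.
SCRIPTED_AGENT_PREFIXES = ("python-requests/", "curl/", "Go-http-client/")


def scripted_logins(lines):
    """Return (timestamp, user, user agent) for logins made by scripted clients."""
    hits = []
    for line in lines:
        match = LOGIN_RE.match(line.rstrip("\n"))
        if match and match["agent"].startswith(SCRIPTED_AGENT_PREFIXES):
            hits.append((match["ts"], match["user"], match["agent"]))
    return hits
```

Calling scripted_logins on the lines of a collected hostd.log returns one tuple per suspicious login, which can then be correlated by timestamp with the task events that follow it.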

Final Thoughts

If you made it this far, I want to say thanks for reading the whole post!

I spent quite a few hours of my own time not only writing it but also researching and playing around with ESXi systems. If you enjoyed this post, you can “clap” it, and if you have any thoughts, feel free to share them with me on LinkedIn or BlueSky.

For those who skipped ahead (I won’t blame you, it’s a long post!), here’s a summary of the important points in this post:

When developing detections for ESXi environments, I suggest putting more time and effort into analyzing API event logs in hostd.log instead of shell.log.
Be aware that command execution logs may end up in auth.log instead of shell.log when commands are executed using ssh root@ESXI_HOST "COMMAND_HERE".
The ESXi-specific utilities on ESXi hosts are mainly just wrappers for an undocumented API, so focusing on hostd.log will catch most of them. You will still miss non-ESXi-specific utilities, which are logged in shell.log and not hostd.log.
Use ESXi Testing Toolkit to automate all the manual processes described in this post! It contains every test I’ve talked about and more!

Open Source Projects

This blog post wouldn’t have been possible without the following open source projects. Any new findings I discovered while in the process of researching for this post have been submitted to these projects as a way of giving back.

References

VMware ESXi Logging & Detection Opportunities was originally published in Detect FYI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Article Link: VMware ESXi Logging & Detection Opportunities | by Nathan Burns | Jan, 2025 | Detect FYI