Overview
Elastic Security Labs continues to monitor active threats such as GULOADER, also known as CloudEyE – an evasive shellcode downloader that has been highly active for years while under constant development. One of these recent changes is the addition of exceptions to its Vectored Exception Handler (VEH) in a fresh campaign, adding more complexity to its already long list of anti-analysis tricks.
While GULOADER’s core functionality hasn’t changed drastically over the past few years, these constant updates in their obfuscation techniques make analyzing GULOADER a time-consuming and resource-intensive process. In this post, we will touch on the following topics when triaging GULOADER:
- Reviewing the initial shellcode and unpacking process
- Finding the entrypoint of the decrypted shellcode
- Discuss update to GULOADER’s VEH that obfuscates control flow
- Provide a methodology to patch out VEH
Initial Shellcode
In our sample, GULOADER comes pre-packaged inside an NSIS (Nullsoft Scriptable Install System) installer. When the installer is extracted, the main components are:
- NSIS Script - This script file outlines all the various configuration and installation aspects.
- System.dll - Located under the
$PLUGINSDir
. This file is dropped in a temporary folder to allocate/execute the GULOADER shellcode.
- Shellcode - The encrypted shellcode is buried into a nested folder.
One quick methodology to pinpoint the file hosting the shellcode can be done by monitoring ReadFile
events from SysInternal’s Process Monitor after executing GULOADER. In this case, we can see that the shellcode is read in from a file (Fibroms.Hag
).
GULOADER executes shellcode through callbacks using different Windows API functions. The main reasoning behind this is to avoid detections centered around traditional Windows APIs used for process injection, such as CreateRemoteThread
or WriteProcessMemory
. We have observed EnumResourceTypesA
and CallWindowProcW
used by GULOADER.
By reviewing the MSDN documentation for EnumResourceTypesA
, we can see the second parameter expects a pointer to the callback function. From the screenshot above, we can see that the newly allocated shellcode is placed into this argument.
Finding Main Shellcode Entrypoint
In recent samples, GULOADER has increased the complexity at the start of the initial shellcode by including many different junk instructions and jumps. Reverse engineering of the downloader can require dealing with a long process of unwinding code obfuscation designed to break disassembly and control flow in some tooling, making it frustrating to find the actual start of the core GULOADER shellcode.
One methodology for finding the initial call can be leveraging graph view inside x64dbg and using a bottom-to-top approach to look for the call eax
instruction.
Another technique to trace the initial control flow involves leveraging the reversing engineering framework Miasm. Below is a quick example where we can pass in the shellcode and disassemble the instructions to follow the flow:
from miasm.core.locationdb import LocationDB
from miasm.analysis.binary import Container
from miasm.analysis.machine import Machine
with open("proctoring_06BF0000.bin", "rb") as f:
code = f.read()
loc_db = LocationDB()
c = Container.from_string(code, loc_db)
machine = Machine('x86_32')
mdis = machine.dis_engine(c.bin_stream, loc_db=loc_db)
mdis.follow_call = True
mdis.dontdis_retcall = True
asm_cfg = mdis.dis_multiblock(offset=0x1400)
Miasm cuts through the 142 jmp
instructions and navigates through the junk instructions where we have configured it to stop on the call instruction to EAX (address: 0x3bde
).
JMP loc_3afd
-> c_to:loc_3afd
loc_3afd
MOV EBX, EAX
FADDP ST(3), ST
PANDN XMM7, XMM2
JMP loc_3b3e
-> c_to:loc_3b3e
loc_3b3e
SHL CL, 0x0
PSRAW MM1, MM0
PSRLD XMM1, 0xF1
JMP loc_3b97
-> c_to:loc_3b97
loc_3b97
CMP DL, 0x3A
PADDW XMM3, XMM5
PXOR MM3, MM3
JMP loc_3bde
-> c_to:loc_3bde
loc_3bde
CALL EAX
Tail end of Miasm
GULOADER’s VEH Update
One of GULOADER’s hallmark techniques is centered around its Vectored Exception Handling (VEH) capability. This feature gives Windows applications the ability to intercept and handle exceptions before they are routed through the standard exception process. Malware families and software protection applications use this technique to make it challenging for analysts and tooling to follow the malicious code.
GULOADER starts this process by adding the VEH using RtlAddVectoredExceptionHandler
. Throughout the execution of the GULOADER shellcode, there is code purposely placed to trigger these different exceptions. When these exceptions are triggered, the VEH will check for hardware breakpoints. If not found, GULOADER will modify the EIP directly through the CONTEXT structure using a one-byte XOR key (changes per sample) with a one-byte offset from where the exception occurred. We will review a specific example of this technique in the subsequent section. Below is the decompilation of our sample’s VEH:
Although this technique is not new, GULOADER continues to add new exceptions over time; we have recently observed these two exceptions added in the last few months:
EXCEPTION_PRIV_INSTRUCTION
EXCEPTION_ILLEGAL_INSTRUCTION
As new exceptions get added to GULOADER, it can end up breaking tooling used to expedite the analysis process for researchers.
EXCEPTION_PRIV_INSTRUCTION
Let’s walk through the two recently added exceptions to follow the VEH workflow. The first exception (EXCEPTION_PRIV_INSTRUCTION
), occurs when an attempt is made to execute a privileged instruction in a processor’s instruction set at a privilege level where it’s not allowed. Certain instructions, like the example below with WRSMR expect privileges from the kernel level, so when the program is run from user mode, it will trigger the exception due to incorrect permissions.
EXCEPTION_ILLEGAL_INSTRUCTION
This exception is invoked when a program attempts to execute an invalid or undefined CPU instruction. In our sample, when we run into Intel virtualization instructions such as vmclear
or vmxon
, this will trigger an exception.
Once an exception occurs, the GULOADER VEH code will first determine which exception code was responsible for the exception. In our sample, if the exception matches any of the five below, the code will take the same path regardless.
EXCEPTION_ACCESS_VIOLATION
EXCEPTION_ILLEGAL_INSTRUCTION
EXCEPTION_PRIV_INSTRUCTION
EXCEPTION_SINGLE_STEP
EXCEPTION_BREAKPOINT
GULOADER will then check for any hardware breakpoints by walking the CONTEXT record found inside the EXCEPTION_POINTERS structure. If hardware breakpoints are found in the different debug registers, GULOADER will return a 0
into the CONTEXT record, which will end up causing the shellcode to crash.
If there are no hardware breakpoints, GULOADER will retrieve a single byte which is 7 bytes away from the address that caused the exception. When using the last example with vmclear
, it would retrieve byte (0x8A
).
Then, using that byte, it will perform an XOR operation with a different hard-coded byte. In our case (0xB8
), this is unique per sample. Now, with a derived offset 0x32
(0xB8 ^ 0x8A
), GULOADER will modify the EIP address directly from the CONTEXT record by adding 0x32
to the previous address (0x7697630
) that caused the exception resulting in the next code to execute from address (0x7697662
).
With different junk instructions in between, and repeatedly hitting exceptions (we counted 229 unique exceptions in our sample), it’s not hard to see why this can break different tooling and increase analyst time.
Control Flow Cleaning
To make following the control flow easier, an analyst can bypass the VEH by tracing the execution, logging the exceptions, and patching the shellcode using the previously discussed EIP modification algorithm. For this procedure, we leveraged TinyTracer, a tool written by @hasherezade that leverages Pin, a dynamic binary instrumentation framework. This will allow us to catch the different addresses that triggered the exception, so using the example above with vmclear
, we can see the address was 0x7697630
, generated an exception calling KiUserExceptionDispatcher
, a function responsible for handling user-mode exceptions.
Once all the exceptions are collected and filtered, these can be passed into an IDAPython script where we walk through each address, calculate the offset using the 7th byte over and XOR key (0xB8
), then patch out all the instructions generating exceptions with short jumps.
The following image is an example of patching instructions that trigger exceptions at addresses 0x07697630
and 0x0769766C
.
Below is a graphic representing the control flow graph before the patching is applied globally. Our basic block with the vmclear
instruction is highlighted in orange. By implementing the VEH, GULOADER flattens the control flow graph, making it harder to trace the program logic.
After patching the VEH with jmp
instructions, this transforms the basic blocks by connecting them together, reducing the complexity behind the flow of the shellcode.
Using this technique can accelerate the cleaning process, yet it’s important to note that it isn’t a bulletproof method. In this instance, there still ends up being a good amount of code/functionality that will still need to be analyzed, but this definitely goes a long way in simplifying the code by removing the VEH. The full POC script is located here.
Conclusion
GULOADER has many different features that can break disassembly, hinder control flow, and make analysis difficult for researchers. Despite this and the process being imperfect, we can counter these traits through different static or dynamic processes to help reduce the analysis time. For example, we observed that with new exceptions in the VEH, we can still trace through them and patch the shellcode. This process will set the analyst on the right path, closer to accessing the core functionality with GULOADER.
By sharing some of our workflow, we hope to provide multiple takeaways if you encounter GULOADER in the wild. Based on GULOADER’s changes, it's highly likely that future behaviors will require new and different strategies. For detecting GULOADER, the following section includes YARA rules, and the IDAPython script from this post can be found here. For new updates on the latest threat research, check out our malware analysis section by the Elastic Security Labs team.
YARA
Elastic Security has created different YARA rules to identify this activity. Below is an example of one YARA rule to identify GULOADER.
rule Windows_Trojan_Guloader {
meta:
author = "Elastic Security"
creation_date = "2023-10-30"
last_modified = "2023-11-02"
reference_sample = "6ae7089aa6beaa09b1c3aa3ecf28a884d8ca84f780aab39902223721493b1f99"
severity = 100
arch = "x86"
threat_name = "Windows.Trojan.Guloader"
license = "Elastic License v2"
os = "windows"
strings:
$djb2_str_compare = { 83 C0 08 83 3C 04 00 0F 84 [4] 39 14 04 75 }
$check_exception = { 8B 45 ?? 8B 00 38 EC 8B 58 ?? 84 FD 81 38 05 00 00 C0 }
$parse_mem = { 18 00 10 00 00 83 C0 18 50 83 E8 04 81 00 00 10 00 00 50 }
$hw_bp = { 39 48 0C 0F 85 [4] 39 48 10 0F 85 [4] 39 48 14 0F 85 [7] 39 48 18 }
$scan_protection = { 39 ?? 14 8B [5] 0F 84 }
condition:
2 of them
}
Observations
All observables are also available for download in both ECS and STIX format.
The following observables were discussed in this research.
Observable | Type | Name | Reference |
---|---|---|---|
6ae7089aa6beaa09b1c3aa3ecf28a884d8ca84f780aab39902223721493b1f99 | SHA-256 | Windows.Trojan.Guloader | GULOADER downloader |
101.99.75[.]183/MfoGYZkxZIl205.bin | url | NA | GULOADER C2 URL |
101.99.75[.]183 | ipv4-addr | NA | GULOADER C2 IP |