Data Talks: Deeper Down the Rabbit Hole: Second-Stage Attack and a Fileless Finale
In our last blog, “Following a Trail of Confusion: PowerShell in Malicious Office Documents”, we systematically unraveled multiple layers of obfuscation initiated by a weaponized first-stage Microsoft Word document to reveal a surreptitious download script and a malicious second-stage binary file dropped onto the victim PC. For those who wish to follow the analysis through to its conclusion, the sample MD5 is 6c8e800f14f927de051a3788083635e5 and a VirusTotal report is here.
Picking Up Where Word Drops Off
Suppose the weaponized Word document was successful, bypassing all existing layered defenses, and now the next stage begins. This is the native code program that is now running in memory, and with it come additional capabilities to compromise the host computer. As with our previous analysis, we have to figure out what type of code obfuscation we’re dealing with it. With native code programs—portable executable (PE) files in the case of Microsoft Windows—the first layer is usually packing. Packing is a well-known technique that essentially takes the malicious program and wraps it inside another program. You can think of it like a zip or another archive, where if we analyze the zip file, we won’t get any information about the content it contains.
Bromium Secure Platform shows the original malicious document, the request to retrieve this sample, and the process it invoked.
Signs of Packing
Before jumping right into IDA Pro and tackling the disassembly, it’s often worthwhile to perform initial static analysis of the PE file to get some ideas on packing and other potential code obfuscation techniques. PE parsing utilities can be valuable for getting an initial look at the characteristics of the file. Strings are a good first indicator, and the presence or lack of strings can provide critical insight into the program. Strings are an important part of any program as they are routinely needed for such functionality as making HTTP requests, writing files to disk, looking for processes, and creating files in the file system. Malware authors will often attempt to obfuscate these strings, and an added benefit of packing is that the strings are compressed and encrypted inside, obscuring their discovery. This sample presents some strings, but most of these come from the functions that it is importing. Outside of that, there are no further indicators such as command and control (C2) URLs or IPs, indications of file or process activity, or evidence of intended behavior such as a ransom note.
Sample of strings output using strings utility
Sections of the PE file are also worth investigating. Sections provide structure to the PE file for such items as the executable code and hard-coded data. In addition, they may provide evidence of malware that is packed. There are usually two strong indicators: the name of the section and the entropy of the section. Section names are arbitrary, but some packers use consistent naming and allow for easier detection. Entropy is a measure of the randomness in a sequence of bytes, which make up the content of the sections. This is usually measured on a scale from zero to eight, with eight being the highest measure of entropy. Programs that contain sections with high entropy are more suspect for packing and other obfuscation techniques, since this garbled code tends to be more random and less deliberate.
Dumpbin output of PE file sections
While there are other indicators to consider, it appears this program is packed and will require deeper investigation.
PACKING ANALYSIS AND CODE OBFUSCATION
Now we can turn to IDA Pro to start analyzing the code of this program. Upon loading the file, IDA provides further indications that the sample is packed.
IDA Pro dialog indicating potential packing
This program begins with a lot of instructions, most of them unnecessary. One way to try to filter through this code is to see how the registers, variables, and functions are being used. In this first code block, there are several function calls where the return value (in EAX) is being used in a compare/conditional jump combination. The conditional jump goes to loc_407257.
If we navigate to that location, we end up in an infinite loop. This is helpful, as we can now start to visually filter out this noise and attempt to find the true purpose of this code. Since we suspect that we are looking at purely packing code, we don’t want to spend a lot of time analyzing how this code works but find the point at which it’s done. This will allow us to focus on whatever is unpacked. With unpacking code, I’ve often found that you can concentrate on the end of the functions and look for abnormal returns or control transfers. This function ends with a function call, which is far from a normal epilogue.
Tracing into function sub_407027, we can investigate the code at the end. It appears there are two possible paths for it to go, both with unconventional methods of going there.
This function uses a technique of pushing a DWORD value onto the stack and then jumping to ESP. What is pushed onto the stack is actually an address: 0x4071B1. This technique has actually prevented IDA from identifying the correct location and continuing with disassembly. If we go to that location manually, however, we can tell IDA to disassemble this code.
Unanalyzed JMP target
Disassembled location 0x4071B1
Once the data at this location is disassembled, we reveal a call instruction with a call target of dword_40A34C. The value of this DWORD is not hard-coded, which means it is populated during runtime. Instead of continuing with static-analysis, we can now turn to WinDbg for dynamic analysis to see where this call goes.
Switching to Dynamic Analysis
Setting a breakpoint on that call instruction reveals that the call target is to location 0x4071c4.
Since IDA was unable to find this location during static analysis, it initially shows up as data instead of instructions.
Invoking IDA’s analysis reveals the disassembled instructions:
It’s easy to get lost in the assembly here and important to keep the big picture in mind. This code is all likely unpacking code, so let’s analyze it a little further down to see how it ends. There is a strange indirect call to ESI at 0x407244.
If we continue execution to this point, we can see where it intends to lead. In this case, it’s to an address not in the original image – 0x57000 for this run. This address will change, as it’s a region of read-write-execute memory that is allocated during runtime.
This tells us that the previous code was responsible for not only allocating this memory, but also for staging shellcode for execution. Using a tool like Process Hacker, we can extract this shellcode from memory and disassemble it.
Tracing the Shellcode
Fortunately, we know the entry point is at the beginning of the binary content from our dynamic analysis. Once this shellcode is disassembled, there will be a considerable amount of code to analyze. Let’s stick with the same approach we used to get here in the first place and analyze instructions toward the end of the shellcode. This shellcode ends with a PUSH/RET technique. The location the author wants to return to is pushed on the stack just before the return instruction.
This goes further into the shellcode. However, if we trace to the end of this code, there is a jmp esi. ESI contains an address of 0x406FC0. This is a good sign, as it is taking execution back to an address in the original address space of the program. But is it the same code? By comparing the original data at the location to what is now in memory, a different result means that unpacking could be complete.
The Plot Thickens
Unfortunately, the malware is not yet ready to reveal what it is up to. Prior to performing a deep technical analysis, automated dynamic analysis was used to understand as much of this program’s behavior as possible. This malware makes a request to hxxps://real-estate-advisors[.win] and starts another process. This is likely the point at which the malware receives code for its true intended purpose. However, if we let the program run from this point, the request isn’t made and no additional processes are created. Not only do we now know that it’s not done unpacking/deobfuscating, it is also exhibiting anti-analysis techniques not observed in our manual sandbox environment.
Looking at the cross-reference graph from sub_406FC0, there is a considerable amount of code. How do we overcome this mess? One method is to start by setting breakpoints on expected. For example, CreateProcessA or InternetOpenURLA. Letting this code run ends in a call to TerminateProcess, and in this case none of these breakpoints were hit. This could indicate a few things, including anti-analysis techniques. Instead of trying to analyze this function from the top-down, focusing on the call instructions towards the end of the function may speed up analysis. Especially if this involves more unpacking, then the earlier function calls will likely be for memory allocation and more unpacking, and the later function calls for executing the unpacked code. This function ends with three function calls and after inspecting them, the call to sub_5200 appears to be the most promising.
Again, we’re faced with a significant amount of code and a limited amount of time for analysis, so let’s focus on the end of the function. Toward the end of this function is another indirect function call. These are usually interesting as they may indicate a dynamically-generated address.
Indirect function call at offset 0x5D65
As it turns out, this is the call to ExitProcess, so somewhere before this call is not only any anti-analysis, but also the next stage of functionality.
After spending some time analyzing this function, another promising location presents itself:
Call instruction to offset 0x51B0 at location 0x5C00
This function is limited in functionality, but it ultimately proves to be the location responsible for the next stage of this malware.
The call $+5 is a common shell code technique to get the address of the stack, as the call instruction will push the address of the next instruction (add [esp+10h+var_10], 5) onto the stack and then add 5 to it. The push instruction will push the address 0x51D5 onto the stack, once 5 is added to it the address that this function will return to is 0x51D6. This takes execution to the first instruction after the return. Since IDA was not able to follow this logic, we need to disassemble the code at this location.
There’s a call to DWORD_0, which actually represents the beginning of this section of code (.TEXT section). We can resume our dynamic analysis to continue to trace this code.
Setting the appropriate breakpoints, I stopped at the RETF to ensure that my analysis of where this code was going to return to was correct.
And the value on top of the stack is:
However, if you trace into this RETF the program doesn’t go to the address we expect:
What happened? Turns out, RETF takes two values off of the stack: one value for the segment and a second value for the return address. Notice the PUSH 33h before the RETF, this will force the CPU into 64-bit mode instead of 32-bit! Since I was using a 32-bit instance of WinDbg, I was getting unexpected results. Switching to a 64-bit instance of WinDbg allows us to trace into this RETF.
It’s a call to 0x401000. We have to go back to our original shellcode. IDA wasn’t able to find a reference to this location, so the code was never disassembled. Keep in mind that I extracted this as shellcode from the .text section, so an offset of 0 is equivalent to a virtual address of 0x401000. We also know something else that is very important–this is 64-bit code. Opening this code with the 64-bit version of IDA gives us an accurate disassembly listing.
Disassembled 64-bit shellcode, function graph, and call graph
One of the first things to determine is if we can find any API calls. This code doesn’t have an extensive call graph, but one function, Sub_F90, stands out simply due to the number of times it is called.
Sub_F90 may be responsible for resolving APIs. Setting a breakpoint on this function allows us to investigate the return value in the EAX register. Sure enough, they’re function pointers! Some of the more relevant ones are: NtAllocateVirtualMemory, NtWirteVirtualMemory, and RtlCreateUserThread. Following these API calls, it eventually becomes clear that the code is attempting to load a DLL via the CreateUserThread method. During execution, the DLL is copied directly into memory and never touches disk! It’s unpacked purely in memory and then loaded into the current process by the createthread call. As this is a “fileless” stage of the attack, extracting this DLL from memory provides the opportunity to continue our analysis.
Closer to the End
This DLL has only one export, which is DllEntryPoint (or DLLMain). This is called by the thread created in the previous stage, and it reveals yet another round of complicated code.
Similar to the last stage, I was able to identify the function responsible for resolving APIs. In this code, sub_180011820 returns a function pointer in the RAX register.
Tracing this allows visibility into the different APIs being called, and that is where the majority of the anti-analysis is employed. For example, there is a call to CreateToolhelpSnapshot32, which is then used to look for evidence of sandbox/analysis processes. Each process name is converted to a multibyte string, changed to upper-case, and then used to create a CRC32 checksum. The checksum value is compared to a list of pre-computed values to avoid using any strings in the sample, a deliberate obfuscation technique used to avoid clear-text strings which are easily discovered.
Array of DWORD pre-computed checksum values
R8 contains a pointer to pre-computed checksum and compared with dynamically computed checksum from process name in EBX
This code also looks for manufacturer information through a call to GetSystemFirmwareTable. Bypassing these checks allows the program to finally deliver its intended payload—to make a request for another stage to hxxps://real-estate-advisors[.win]/vwrdhrbisero/sqyeqten3/niejln3i/tag1h/luyb/45014rvw/4w5unn5vx4di.jpg!
This resource is retrieved from a command and control node and then is used to create yet another process. However, this server has now gone offline, but not before its ultimate payload was categorized as a malicious banking trojan by the anti-malware community.
Despite all of the obfuscation and anti-analysis we have examined together—and the fact that we utilized multiple tools to reveal the complete picture—every stage of this malware would have been safely contained within the Bromium Secure Platform in an isolated micro-virtual machine. Detection failed to stop the initial stages of the attack, which gave the attacker complete freedom to place secondary payloads onto the victim’s PC. This one was a banking trojan, but next time it could be something entirely different or completely new. Attackers never stop innovating—and they are always a step or two ahead of detection-focused defenders—so consider application isolation and control using virtualization-based security to protect your endpoints against whatever they come up with next.