Data Talks: Following a Trail of Confusion: PowerShell in Malicious Office Documents
While the threat landscape continues to evolve, Microsoft Office documents continue to see steady use by malicious actors. These documents, often equipped with nothing more than the built-in capability offered by macros, are continuously leveraged to gain a foothold in the enterprise. And why shouldn’t they be? Macros provide a broad range of powerful features, from executing PowerShell scripts to staging and executing shellcode in memory, all of which is accomplished with native functionality found in the Office suite. This means that for many malicious documents, there is no zero-day being leveraged and therefore no patch that can be pushed to stop the attack. Even in the age of web-based Office applications, it is common for users to email Office documents to each other. From a detection standpoint, this makes it difficult to distinguish a malicious document from a benign one. Add in some enticing social engineering and varying levels of obfuscation, and malware authors can increase their odds of success while effectively delaying detection.
One evolution that we have observed in these attacks is the use of built-in operating system functionality, such as PowerShell. This approach not only lets the attacker use “living off the land” techniques, but its memory-resident nature also lets them avoid writing unnecessary artifacts to disk. Combined with sophisticated obfuscation techniques, this tactic increases the odds that the attackers will evade traditional and even advanced detection-based approaches. In Part 1 of this blog, the fourth installment of the Bromium Threat Labs: Data Talks series, we’ll take a look at a malicious Office document that employs such an approach (VirusTotal summary here). This analysis will begin by discussing static analysis techniques for initial assessment, then focus on dynamic analysis techniques to speed up the process of code deobfuscation. By the end of this post, you will have a better understanding of how the Office document was used in this attack.
Find the Macros
Our analysis begins by inspecting the document for macros. While macros are not the only method of executing code on a victim’s system, they are used to begin the attack in this document. OLEDUMP, a utility developed by Didier Stevens, is an incredibly useful tool for inspecting and extracting macros from an Office document. To begin using this tool, simply provide the path of our Office document to OLEDUMP, which will print information about the structure of the document. What we’re looking for is the presence of a macro stream, which is indicated next to the stream index by a lowercase or uppercase ‘m’.
This document has two macro streams: stream 8 and stream 14. By default, the macro source in these streams is stored compressed within the document; to decompress it, provide the “-v” argument. The “-s” argument allows you to select the stream you wish to inspect, and combining these two arguments prints the decompressed content of the stream to standard output. While this allows us to investigate the macros, it is often useful to redirect this output to a file and use a text editor for further analysis.
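The invocation described above can also be scripted. The following is a minimal Python sketch, assuming that oledump.py sits in the working directory and is run with the current Python interpreter. The helper names and the file names are hypothetical; only the “-s” and “-v” options mentioned above are used.

```python
import subprocess
import sys

def build_oledump_command(doc_path, stream=None, decompress=False,
                          oledump_path="oledump.py"):
    """Build an argument list for oledump.py (its location is an assumption)."""
    cmd = [sys.executable, oledump_path]
    if stream is not None:
        cmd += ["-s", str(stream)]   # select a single stream by its index
    if decompress:
        cmd.append("-v")             # decompress the VBA macro source
    cmd.append(doc_path)
    return cmd

def dump_macro_to_file(doc_path, stream, out_path):
    """Redirect the decompressed macro source of one stream to a text file."""
    cmd = build_oledump_command(doc_path, stream=stream, decompress=True)
    with open(out_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)
```

Calling dump_macro_to_file("sample.doc", 8, "stream8.vba") would then write the macro from stream 8 to a file that can be opened in a text editor; both file names here are placeholders.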
Analyzing Macro Code
Macros commonly use “auto” functions to begin execution. In this document, the Autoopen function is defined and will be called once the document is opened and the content is enabled.
Code obfuscation is consistent throughout this sample, and the Autoopen function is no exception. In this function, there are two variable declarations placed around the statement wNjqSj. This statement is actually a function call, while the other statements are simply junk code intended to complicate analysis. One method to detect this type of obfuscation is to check whether those variables are ever used. In this example, they are assigned a random value but never used again. Identifying this method of obfuscation early in the analysis is helpful, as it is used throughout the document.
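The check for unused variables can be roughly automated. The Python sketch below is a heuristic, not a VBA parser: it lists names that are assigned a value but never read anywhere else in the macro text. The sample macro is a hypothetical reconstruction; only Autoopen and wNjqSj come from the document, and the junk variable names and values are invented for illustration.

```python
import re

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
ASSIGN = re.compile(r"^\s*(?:Set\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=(.*)$", re.IGNORECASE)
SKIP = re.compile(r"^\s*(?:Dim|Private|Public|Sub|Function|End)\b", re.IGNORECASE)
FUNC = re.compile(r"^\s*(?:(?:Private|Public)\s+)?Function\s+([A-Za-z_][A-Za-z0-9_]*)",
                  re.IGNORECASE)

def strip_noise(line):
    """Blank out string literals, then drop any trailing VBA comment."""
    line = re.sub(r'"[^"]*"', '""', line)
    return line.split("'", 1)[0]

def find_unused_assignments(vba_source):
    """Return names that are assigned but never read (VBA is case-insensitive)."""
    assigned = {}      # lowercase name -> original spelling, in order of appearance
    reads = set()      # lowercase names that appear anywhere except an assignment target
    functions = set()  # function names: assigning to them sets the return value
    for raw in vba_source.splitlines():
        line = strip_noise(raw)
        func = FUNC.match(line)
        if func:
            functions.add(func.group(1).lower())
        if SKIP.match(line):
            continue   # declarations and block delimiters are not uses
        match = ASSIGN.match(line)
        if match:
            assigned.setdefault(match.group(1).lower(), match.group(1))
            line = match.group(2)   # only the right-hand side counts as a read
        reads.update(token.lower() for token in IDENT.findall(line))
    return [name for key, name in assigned.items()
            if key not in reads and key not in functions]

# Hypothetical macro shaped like the one described above.
SAMPLE = """Sub Autoopen()
    kQzr = 4821
    wNjqSj
    pLmv = 1377
End Sub"""

print(find_unused_assignments(SAMPLE))  # ['kQzr', 'pLmv']
```

Anything the function reports is a candidate for removal; the remaining statements, here the call to wNjqSj, are the ones worth tracing.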
Tracing into the function wNjqSj, the same method of adding unnecessary statements is used again. Eliminating these allows us to focus on the only function call, which is the third statement in this function.
What stands out in this function call is the keyword Shell. The Shell function is used in macros to run a program or command on the operating system. Its arguments here are a series of function calls and variables concatenated together, which is a common technique for obfuscating the script that the authors intend to execute. Since these two functions are the only ones defined in stream 8, we have to investigate stream 14 to find the functions being called.
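To illustrate the concatenation technique in isolation, the Python sketch below folds adjacent VBA string literals joined by & or + into a single literal. It is an assumption-laden simplification: it leaves function calls and variables untouched, so it would only partially clean up a Shell argument like the one in this sample, and the expression used in the example is hypothetical.

```python
import re

# A VBA string literal (with "" as an escaped quote), or any run of other text.
TOKEN = re.compile(r'"((?:[^"]|"")*)"|([^"]+)')
OPERATOR = re.compile(r"^\s*[&+]\s*$")

def fold_string_concatenation(expression):
    """Merge literal-operator-literal runs such as "po" & "wer" into "power"."""
    folded = []
    for match in TOKEN.finditer(expression):
        if match.group(2) is not None:
            kind, value = "other", match.group(2)
        else:
            kind, value = "str", match.group(1)
        joins_previous = (
            kind == "str"
            and len(folded) >= 2
            and folded[-1][0] == "other"
            and OPERATOR.match(folded[-1][1])
            and folded[-2][0] == "str"
        )
        if joins_previous:
            folded.pop()              # drop the concatenation operator
            folded[-1][1] += value    # extend the previous literal
        else:
            folded.append([kind, value])
    return "".join('"' + v + '"' if k == "str" else v for k, v in folded)

# Hypothetical expression, not taken from the sample.
print(fold_string_concatenation('Shell "cm" & "d /c " + "echo hi", 0'))
# Shell "cmd /c echo hi", 0
```

When the pieces come from function calls, as they do here, each function would have to be resolved first, which is part of what makes a purely static approach slow.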
The functions defined in this stream also use the same technique of adding unnecessary statements into the functions.
While we could continue to identify and then eliminate the unnecessary statements, we would still need to deal with the string obfuscation for the embedded script. To help expedite this deobfuscation, we can turn to dynamic analysis.
Switching to Dynamic Analysis with Bromium Secure Platform
Instead of manually dealing with this obfuscation, we can use dynamic analysis to capture the process created by these macros. The captured command line represents the deobfuscated script hidden within the macro code, which saves a considerable amount of analysis time. To do this, we used the Bromium Secure Platform to hardware-isolate the malware and observe its behavior directly. Bromium Secure Platform uses a micro-VM architecture to isolate the execution of our sample on a real endpoint system rather than in a sandbox environment. This allows us to easily capture the process activity of the Office document, along with any other valuable threat intelligence and forensic artifacts.
The result of this first stage is the following command, which provides another stage of obfuscation:
Although it is not immediately apparent, this command is used to execute a PowerShell script, as seen in the subsequent process activity:
Following the PowerShell
With PowerShell, the “-e” flag (an abbreviation of -EncodedCommand) tells the interpreter that the command is base64 encoded. Decoding it with any base64 utility reveals the next stage; note that the decoded bytes are UTF-16LE text.
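As a minimal sketch of this decoding step in Python: PowerShell expects the argument to -EncodedCommand to be base64 over the UTF-16LE bytes of the command, so decoding means reversing both layers. The function names and the sample command are hypothetical and stand in for the much longer string in this document.

```python
import base64

def decode_encoded_command(encoded):
    """Decode the argument passed to PowerShell's -e / -EncodedCommand flag."""
    return base64.b64decode(encoded).decode("utf-16-le")

def encode_command(command):
    """Inverse helper, used here only to build a harmless test input."""
    return base64.b64encode(command.encode("utf-16-le")).decode("ascii")

# Harmless stand-in for the encoded stage found in the sample.
blob = encode_command("Write-Output 'hello'")
print(decode_encoded_command(blob))  # Write-Output 'hello'
```

A generic base64 utility gives the same result, although the output will show a null byte after each ASCII character until it is interpreted as UTF-16LE.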
There is still one more stage to deobfuscate before we can determine what this script is up to. Analyzing the function calls and objects shows that the script creates a new object of the DeflateStream class. The argument provided to this object is another base64 encoded string, which is made apparent by the call to FromBase64String. Since the decoded string is compressed data, we also need to decompress it. This can be accomplished in a number of ways; in this case we simply modified the script to print out the object using the PowerShell ISE built into Windows.
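One alternative to modifying the script is to decompress the data offline, so that nothing from the sample is executed. The Python sketch below is not the method used in this analysis. It relies on the fact that DeflateStream works with raw DEFLATE data, which zlib can inflate when given a negative window size. The function names are hypothetical, and the text encoding of the embedded script is an assumption (UTF-8 here, which also covers ASCII).

```python
import base64
import zlib

def inflate_base64(blob, encoding="utf-8"):
    """Base64-decode a string, then inflate it as raw DEFLATE data."""
    raw = base64.b64decode(blob)
    # A negative wbits value tells zlib to expect no zlib or gzip header,
    # which matches the raw stream that DeflateStream reads.
    return zlib.decompress(raw, -zlib.MAX_WBITS).decode(encoding)

def deflate_base64(text, encoding="utf-8"):
    """Inverse helper, used here only to build a harmless test input."""
    compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
    raw = compressor.compress(text.encode(encoding)) + compressor.flush()
    return base64.b64encode(raw).decode("ascii")

# Harmless stand-in for the compressed stage found in the sample.
blob = deflate_base64("Write-Output 'final stage'")
print(inflate_base64(blob))  # Write-Output 'final stage'
```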
We have finally reached the end of the obfuscation trail. This script will download one or more files from a malicious remote server, write the file(s) to the user’s TEMP folder, and then execute each binary. The macro-enabled Office document allowed the attackers to gain a foothold on an endpoint and to drop and execute any file(s) of their choosing. Often, the next payload is a loader that regularly retrieves instructions from a command-and-control node, allowing the attackers to customize their attack based on the environment they landed in. At this point, you are dealing with a potentially severe incident and are already several steps behind the attackers. Only the virtualization-based security provided by Bromium Secure Platform effectively deals with these constantly evolving attack techniques: Bromium isolation prevents any malicious access to the host PC without relying on detection, which the compound obfuscation has already defeated in this case.
In Part 2 of this blog, we’ll take a deep dive into the next stage of this attack: the downloaded binary file. Read the full Data Talks series here.