Over the past few years exploit kits have been widely adopted by criminals looking to infect users with malware. They are used in a process known as a drive-by download, which invisibly directs a user’s browser to a malicious website that hosts an exploit kit.
The exploit kit then proceeds to exploit security holes, known as vulnerabilities, in order to infect the user with malware. The entire process can occur completely invisibly, requiring no user action.
In this research article we will take a closer look at one of the more notorious exploit kits used to facilitate drive-by downloads – a kit known as Angler exploit kit (Angler hereafter).
The weekly volume of Angler related detections since mid-2014 shows a burst of activity in late 2014, followed by a slight lull, and then increased activity from March 2015 onwards:
Since the demise of the Blackhole exploit kit in October 2013, when its alleged operators were arrested, other exploit kits have certainly flourished and shared the marketplace, but Angler has begun to dominate. To show Angler’s prevalence against other exploit kits, we analyzed a snapshot of activity for three different periods (September 2014, January 2015 and May 2015):
2 Traffic Control
2.1 HTTP POST redirection
The data sent in the form submission includes three values (encoded), presumably to assist the criminals managing this redirection mechanism:
- IP address
- User-Agent string
A few months later we saw a similar technique being used, but this time with a Flash component. The compromised web pages were modified to include HTML that loaded a malicious Flash file from yet another compromised site.
ActionScript within the Flash file would then retrieve the various parameters and issue the HTTP POST request.
2.2 Domain generation algorithm
The downside of this type of redirect is that once the algorithm is known, the security community can predict the target hostname on any particular date, and can blocklist it, effectively shutting down the attack.
2.3 HTTP redirects
The final example included in this section illustrates the use of web redirects using the HTTP response code 302, commonly used on legitimate sites to divert visitors elsewhere. This involves straightforward content injection, but with an additional link in the redirect chain. In late April 2015, we noticed that users browsing the eHow web site were being redirected to Angler. Analysis of the traffic revealed the redirection steps illustrated in Figure 8.
Within days of observing this, we received other reports of identical redirection (cdn3.optimizelys.com again) being used from other sites – this time the final target being the Nuclear exploit kit.
3 Exploit Kit
3.1 Landing page
A variety of obfuscation techniques have been used in the Angler landing pages since the first appearance of the kit. Aside from making analysis more awkward, these techniques make it easier for the criminals to dynamically build unique content on each request, in an attempt to evade detection by security products. For the past year or so, Angler has been encoding its main script functionality as data strings stored in the parent HTML. This content is then retrieved and decoded when the landing page is loaded by the browser. This is a common and well established anti-emulation technique used by numerous malicious threats.
Decoding the outermost obfuscation layer reveals the next trick used by Angler – anti-sandbox checks. It uses the XMLDOM functionality in Internet Explorer to determine information about files present on the local system. It does this in order to detect the presence of various security tools and virtualization products:
The added script content will be specific to the particular vulnerabilities targeted in any version of the landing page. In all cases however, the added scripts will contain code to:
- Dynamically construct shellcode.
- Retrieve data from an array in the Angler landing page (see below).
The size and content of this array will vary according to the vulnerabilities targeted in that version of the kit. Examples of the data stored in the array include:
- The name of the server hosting the exploit kit.
- Folder used to locate Silverlight content
- Folder used to locate Flash content
- Payload URLs.
- Flash payload string (encoded shellcode or URL data).
This data is retrieved and decoded as necessary. Here, you can see the relevant payload string being incorporated into the dynamically generated shellcode:
The code to load the malicious Flash component is straightforward. Assuming the anti-sandbox checks are passed, three characteristically named (get*) functions are used to retrieve and decode strings from the landing page. These are then used to build the HTML object element, which is added to the document:
3.2 Flash content
Angler’s Flash content has varied considerably over the past year. The samples are routinely obfuscated using various techniques, including:
- ActionScript string obfuscation techniques.
- Base64 encoding.
- RC4 encryption.
- Commercial Flash obfuscation/protection tools (e.g. DoSWF and secureSWF).
An additional complication is Angler using embedded Flash objects. The initial Flash loaded from the landing page is fairly innocuous, serving merely as a loader to deliver the exploit via an inner Flash. The inner Flash may be carried as a binaryData object or encoded as a string within the ActionScript. In either case, the data will be encrypted using RC4. An example from January 2015 looks like this:
Data is passed from the landing page to Flash via the use of a FlashVars parameter in the landing page HTML – a common technique in recent exploit kits.
This is easily accessible from ActionScript via the parameters property of the loaderInfo object. This mechanism has been well used in many exploit kits in order to:
- Pass in payload URL data.
- Pass in shellcode.
This flexibility enables dynamic customization of shellcode without having to recompile the Flash itself.
Figure 15 shows snippets of the ActionScript used in recent Angler Flash objects to retrieve the encoded data from the landing page (“exec” variable). Additionally, you can see the use of control flow obfuscation in some of the functions involved, complicating analysis:
In early 2015, there was a succession of zero-day vulnerabilities (including CVE-2015-0310, CVE-2015-0311, CVE-2015-0313, CVE-2015-0315, CVE-2015-0336, CVE-2015-0359) in Adobe Flash Player, which were quickly targeted by Angler. The targeting of these vulnerabilities by Angler (and other exploit kits) has been well described elsewhere.
This flurry of Flash activity supports the trend away from Java exploitation over the past 18 months, a change that was kickstarted when Oracle blocked the use of unsigned browser applets by default in Java 7 update 51.
3.3 Shellcode analysis
As noted above, the shellcode is dynamically generated within the script when the landing page is loaded by the browser. The exact contents of the shellcode will vary according to the vulnerability being targeted, but in all cases the encoding and structure is similar. The analysis below describes the shellcode that is used in exploiting CVE-2014-6332.
Sure enough, analysis of the additional Unicode data provided from the VBScript (buildshell1 in Figure 16) confirms it is executable code, and actually contains a small decryption loop to decrypt the bytes contained in remainder of the shellcode, which includes the payload URL and decryption key. This decryption loop reverses the encData() function referenced in Figure 12, where the payload URL and payload key were encrypted.
After this loop completes, the main shellcode body is decoded and can be analyzed.
After resolving the base address for kernel32, the shellcode parses the export address table to find the functions it requires (identified by hash). It then uses LoadLibraryA API to load winhttp.dll, and parses those exports to find the functions it needs:
|Functions (imported by hash reference)
|CreateThread, WaitForSingleObject, LoadLibraryA, VirtualAlloc, CreateProcessInternalW, GetTempPathW, GetTempFileNameW, WriteFile, CreateFile, CloseHandle
|WinHttpOpen, WinHttpConnect, WinHttpOpenRequest, WinHttpSendRequest, WinHttpReceiveResponse, WinHttpQueryDataAvailable, WinHttpReadData, WinHttpCrackUrl, WinHttpQueryHeaders, WinHttpGetIeProxyConfigForCurrentUser, WinHttpGetProxyForUrl, winHttpSetOption, WinHttpCloseHandle
If the exploit works, then the payload is downloaded and decrypted by the shellcode, using the aforementioned key. Angler uses different keys according to the exploitation path (Internet Explorer, Flash, Silverlight – at least two keys for each are currently known).
After decrypting the payload, the shellcode checks the header to identify if the payload is yet more shellcode (which starts with do-nothing NOP instructions at the start) or a Windows program (which starts with the identifying text string “MZ” at the start):
If the decrypted payload is a program, it will be saved and run. If it is a second stage of shellcode, the final executable payload is embedded within the body of the shellcode (sometimes in both 32-bit and 64-bit versions).
When the second-stage shellcode runs, the payload is inserted directly into memory in the process of the exploited application, without first being written to disk. This “no-file” characteristic is responsible for some of the notoriety that Angler has recently gained. This mechanism has been used to infect users with malware from the Bedep family, which itself provides the ability for an attacker to download yet more malwares.
4 Network Perspective
In this section, we will switch focus from content analysis to a look at Angler activity from a network perspective.
4.1 Fresh registrations
As with most web attacks, Angler makes liberal use of fresh domain registrations.
As is typical for drive-by downloads, we usually see a flurry of registrations that resolve to the same IP number for a short period.
Occasionally, Angler has used free dynamic DNS services, a tactic widely used by exploit kits for years.
4.2 Domain shadowing
Angler has made aggressive use of hacked DNS records as well, a technique used by attackers for several years that is becoming popular once again. The criminals update the DNS records of legitimate domains, adding multiple sub-domains that direct to the malicious exploit kit – a technique sometimes called domain shadowing.
An example is shown in Figure 20, based on some activity seen in May 2015. In these attacks, a 2-level technique is used, where the DNS records have been updated to include wildcard entries (*.foo.example.com, *.bar.example.com). So in this attack, the records were updates with entries including:
This enables the attackers to resolve their shadowed domains to the malicious IP, a computer hosted in Russia at the time of writing. As you can see, the specific sub-sub-domains used in the Angler redirects appear to be six-character strings, presumably chosen to let the attackers track and attribute incoming traffic (most likely for statistics and payment).
In other cases, we have seen Angler using single-level DNS hacks:
In some cases the string used in the sub-domain appears to have some relevance to the domain name being shadowed. This suggests human involvement, rather than random text generated by a program.
Domain shadowing relies on the criminals being able to modify legitimate DNS records, which is most likely due to stolen credentials. Site owners do not necessarily understand the critical nature of their DNS records, so it is concerning that many of the providers/registrars do not guard the DNS configuration more closely.
Most site owners will rarely need to update the records, so any updates should ideally be protected by more than user credentials. Suggestions to improve this situation include:
- Implement two-factor authentication.
- Send email notifications after DNS changes.
Thanks to collaboration with researchers at Nominet, we have been able to look more closely into the domain shadowing activity used in Angler attacks. The intention was to reveal the timeline of events upon a DNS record being hacked. We focused on the DNS query activity for a number of domains associated with a single compromised user account where attackers were using two-level domain shadowing (as in Figure 20).
Figure 22 illustrates a snapshot of DNS query activity for 26 February 2015. Data for three domains (all within the same compromised user account) is included. Green circles show the overall query volume; purple squares highlight above-average query volumes; and the white circles denote periods of sharply-increased query volume, when the attackers started actively using these domains for Angler traffic.
This data reveals two stages in the attack. First, the criminals tested the hacked DNS records with a modest number of queries at 10am. These queries were for the single-level sub-domain, (foo.domain.co.uk), and were all from the same source.
Once the criminals are happy their newly-stolen domain names are resolving reliably, we see bursts of malicious Angler traffic using the two-level sub-domain. In figure 22 you can visualize the attackers cycling through the different domains, with each one being active only for a short period. This correlates nicely with our detection telemetry, where we see similar bursts of activity per hostname.
This information gives us a better impression of the infrastructure and management used to control user web traffic for the purpose of redirecting it to malicious sites.
4.3 URL structure
The URL structure used for the various components of a drive-by attack can often be useful in identifying malicious activity in amongst user web traffic. Historically, certain exploit kits have used predictable URL structure for different components, making it easier for security providers to detect and block content. Some examples from the Nuclear and Blackhole exploit kits include:
- Timestamps as filenames.
- Hashes used in file and folder names.
- Characteristic text in the query part of URLs.
Similar weakness were present in early versions of Angler as well, but the kit has evolved significantly since then, taking steps to remove any Achilles heel that might have been easy to spot in the URLs used for its various components.
The final section of this article aims to provide an overview of the actual malware being installed through Angler. To investigate this, we analyzed payloads for a 4-week period during April 2015.
As you can see, all the payloads collected during this period were delivered by exploits against Internet Explorer (59%) or Flash (41%):
This matches our data for recent Angler landing pages, where we have not seen exploits against Silverlight or Java. However, these results are determined as much by the victims’ computer configurations as by Angler itself, because Angler avoids trying exploits against components that are not installed.
The malware families installed in these drive-by downloads were as follows:
Clearly there is a ransomware focus here – the names tagged with asterisks are ransomware families, and account for more than 50% of the malware attacks. The most common ransomware was Teslacrypt.
In this research article we have taken a top-to-bottom look at the Angler exploit kit, highlighting some of the methods used to ramp up traffic to Angler-infected web pages.
Understanding this behavior end-to-end is vital when providing protection. Angler attempts to evade detection at every level. To evade reputation filtering it switches hostnames and IP numbers rapidly, as well as using domain shadowing to piggyback on legitimate domains. To evade content detection, the components involved in Angler are dynamically generated for each potential victim, using a variety of encoding and encryption techniques. Finally, Angler uses obfuscation and anti-sandbox tricks to frustrate the collection and analysis of samples.
As illustrated above, Angler has risen above its competitors in recent months. This could be down to many factors: higher traffic to Angler-infected pages; exploits with a better hit-rate in delivering malware; slicker marketing amongst the criminal fraternity; more attractive pricing – in other words, good returns for the criminals who are buying “pay-per-install” malware services from the team behind Angler.
One thing is clear: Angler has a serious impact on anyone browsing the web today.
Thanks also to Richard Cohen and Andrew O’Donnell of SophosLabs for their work on the shellcode components.
Thanks to Ben Taylor, Sion Lloyd and Roy Arends of Nominet for their insights into DNS domain shadowing.