Site icon Sophos News

“Double agent”: a MacOS bundleware installer that acts like a spy

Security software frequently blocks “bundleware” installers – software distribution tools that bundle their advertised applications with (usually undesired) additional software – as potentially undesirable applications. But one widely-used software distribution tool for MacOS applications goes to great lengths to avoid being blocked as “bundleware” – using a number of anti-forensics techniques that are more common in malware code. The installer is from a legitimate company, which claims to reach over 1 billion users per month with its various downloaders and adware platforms.

This software attempts to evade blocking by endpoint protection systems with anti-debugging code, strings and API encryption, runtime decompression and virtual machine detection. Its developers went as far as to embed a full backdoor-like component into the installer that allows remote injection of new code into the installer – granting it capabilities that extend far beyond what one might expect from a piece of installation software.

This particular bundleware has potential access to a large number of Macs around the world. The power given to the installer practically enables full control over the target system. Even if this was done so that the company behind it would have ‘advanced analytics’ or the ability to push any third-party software it wants, given the amount of power it aggregates, what happens if this power is abused?

We’re not naming the company or the installer here because of its wide distribution; this research is entirely focused on technical aspects of the reverse-engineered software. The developers have been contacted by researchers, but have shown no interest in changing what they see as a feature of their installer, as long as it allows them to monetize the installations of their software.

The allure of bundleware

MacOS application developers use third-party installers for a number of reasons. One of the reasons often given is the desire to improve the overall “user experience” of software installation. Many developers turn to application installation platforms because they promise enhanced installation analytics, an optimized footprint, and a guaranteed smoothness of installation.

But these installers offer another attraction that is more important to many developers: the promise of monetization. Developers or websites distributing freeware (such as free games) or software based on open source components may not be able to charge for the applications, but they can capture some revenue by teaming with a third-party installation tool.

Some of these application installation platforms bundle the title application with third-party software such as adware or browser toolbars, leading to a setup that the end-user might find unwanted. Once the target application is installed, users often encounter undesirable consequences, such as installed browser hijackers that modify the search engine for their web browser. In some cases, these inject messages for “scare-ware” removers that prompt the user to pay for the removal of non-exist threats.

Apple actively combats these types of unethical applications, which the company labels as ” suspicious software .” In 2019, starting with macOS 10.14.5 (Mojave), Apple began requiring application developers to submit their code for ” notarization “, and blocking applications without notarisation or with revoked developer signatures by macOS’s Gatekeeper . Apple can revoke notarisation and the developer accounts associated with applications that are found to be abusive. But bundleware installers may not get flagged as “suspicious” in the notarization process.

Detecting bundleware is almost never a problem for endpoint protection systems – it’s a much more simplistic breed, so to speak, than malware. But the bundleware package we examine here, already blocked by a number of endpoint protection tools as a potentially unwanted application, adopts many anti-forensics and counter-detection techniques that have in the past been found mostly in malware.

The Application

The bundleware described in this post is a Cocoa application – an application built with the AppKit framework, and is distributed as an application bundle . From sample to sample, the name of the main executable varies. For example, it can be named as radiosurgicalor Herculid.

The digital signature used to sign the app varies, as this is generated by the developer packaging the application with the installer. Some observed installers are signed by “Owen Bell”. Some other installers were found to be signed by “RuiQing Software Technology Beijing Inc.” or “AVSoftware EOOD”.

The main executable

Compiled as Mach-O (the native executable file format for macOS), the main executable relies on Objective-C runtime libobjc.dylib. When the kernel first loads the main executable, it makes sure that the executable is a valid Mach-O file, and then examines its mach_headerstructure.

Next, it loads the dynamic linker to load all the shared libraries that the main executable links against. The dynamic linker then initializes the Objective-C runtime, and then calls the program’s main () function.

However, the main executable’s entry point starts from garbage bytes – that is, it has no valid code to execute:

__text: 0000000100001150 04      start     db     4  
__text: 0000000100001151 4A                db     4Ah ; J 
__text: 0000000100001152 3E                db     3Eh ; >

With no valid code at the entry point, how does it run without crashing? The answer lays in Objective-C’s concept of non-lazy (‘eager’) and lazy (‘on-demand’) implementation of classes.

Non-lazy classes are realized when the program starts up. These classes will always implement + load method. Contrary to that, lazy classes (classes without + load method) do not have to be realized immediately, but only when they receive a message for the first time (hence the term ‘lazy’).

Let’s check out Objective-C Runtime’s own source, found in objc-runtime-new.mm file. The snippet below realizes the non-lazy classes, retrieved with _getObjc2NonlazyClassList () call:

// Realize non-lazy classes (for + load methods and static instances) 
for (EACH_HEADER) {
     classref_t * classlist = _getObjc2NonlazyClassList (hi, & count);
    for (i = 0; i <count; i ++) { 
        realizeClass (remapClass (classlist [i])); 
    } 
}

Looking at the Objective-C Runtime’s source in objc-file.mm, one can see that _getObjc2NonlazyClassList () function collects non-lazy classes from the __objc_nlclslistdata section:

GETSECT (_getObjc2NonlazyClassList,    // function name 
         classref_t ,                  // content type 
         "__objc_nlclslist" );         // section name - 'nl' stands for non-lazy

In the case of our bundleware executable, the __objc_nlclslistdata section of the main binary is very small. It enlists only two non-lazy classes: ListedUpaithricand __ARCLite__:

__objc_nlclslist: 0001000692C8     __objc_nlclslist segment para public 'DATA' use64  
__objc_nlclslist: 0001000692C8 dq offset _OBJC_CLASS _ $ _ ListedUpaithric  
__objc_nlclslist: 0001000692D0 dq offset _OBJC_CLASS _ $ ___ ARCLite__  
__objc_nlclslist: 0001000692D0     __objc_nlclslist ends

The ListedUpaithricname, like many of the other class names in the executable, is random. For example, in another sample this class is called HoundingHusky.

These two non-lazy classes will be called in the order they’re listed, realized by their + load  method.  However, at first glance the + load method of the __ARCLite__class contains no valid code, because it is located within the __textsection of the executable – which is encrypted.

Finding a secret agent

The + load method of the ListedUpaithricclass is physically located in a section of the executable that has a random name, such as __amorphaor __mottled. Once run, the + load method will take a 32-byte XOR key hard-coded in the body and then use that key to decrypt the __textsection (around 15KB in size) of the executable, including the + load method of the __ARCLite__class.

The decryption routine relies on a special anchor stored in the __textsection. The virtual address of this anchor is used to describe the virtual address and virtual size of the encrypted __textsection.

For example, in this particular sample, the anchor can be located at the virtual address 0x100001CF0, as shown below:

__text: 000100001CF0 23        anchor       db 23h ; #      ; DATA XREF: decrypt_code + 101  
__text: 000100001CF1 2B                     db 2Bh ; + 
...

The decryptor uses the address of the anchor to describe the parameters of the __text section. In the snippet below, the decryptor makes the __text section writable and executable, by assigning a new protection to it. To do that, it takes the address of the anchor 0x100001CF0 and subtracts 0xBA0 from it to locate the start of the __text section – 0x100001150:

vm_protect (mach_task_self (),          // decoded stings: 'vm_protect', 'mach_task_self_'  
            ( char *) & anchor - 0xBA0,    // 0x100001150 -> start of the __text section  
            0x37F2,                    // size of the entire __text section: 14,322 bytes 
            0 , 
            VM_PROT_ALL)               // assign read, write, and execute access rights

Once the entire __textsection is decrypted, the anchor shown above gets decrypted into the following text:

__text: 000100001CF0 MaximMaximovicIsayev db ' Maxim Maximovich Isayev ', 0

The resulting text is the same across all the samples we’ve looked at, and is quite interesting by itself. Maxim Maximovich Isayev (Максим Максимович Исаев) is the Russian name of Max Otto von Stierlitz, the lead character in a popular Russian book series written in the 1960s, and of the most popular Soviet television series ever, Seventeen Moments of Spring .

A Soviet James Bond , Stierlitz takes a key role in SS Reich Main Security Office in Berlin during World War II. Working as a deep undercover agent within SS, he diverts the German nuclear “Vengeance Weapon” research program into a fruitless dead-end.

Leaving a hidden marker like that could indicate an intentionally planted false flag. Regardless of the intention, this marker stays constant across the entire family of this bundleware.

After the ListedUpaithric class is called and decrypts the text, being next in line, the + load method of the __ARCLite__ non-lazy class is called to perform further initialization. The decrypted __text section is quite small – it’s a valid code section containing a valid entry point in it, and it consists of another layer of decryptor and decompressor:

__text: 0000000100001150             public start  
__text: 0000000100001150      start proc near  
__text: 0000000100001150 push    0 
__text: 0000000100001152 mov rbp, rsp 
__text: 0000000100001155 and rsp, 0FFFFFFFFFFFFFFF0h  
__text: 0000000100001159 mov rdi, [rbp + 8 ]

With both non-lazy classes realized, the entry point above receives control. From there, the main () function of the executable is called, followed with _exit () .

The main () function

Once the entry point within __textsection is called, the main () function that follows it will read an internal chunk of data with the size of ~ 300KB.

This encrypted data is stored in a separate section of the executable.

The data will be read, its CRC32-based hash validated, then decrypted and further decompressed into a buffer, allocated with vm_allocate () function.

The decompression is achieved by dynamically loading libz.1.dylib library, and calling uncompress () API from it.

The decompressed data has a size of ~ 800Kb, having a format of Mach-O executable ( MH_BUNDLE type). This data is loaded from memory as a plugin with the help of NSCreateObjectFileImageFromMemory () and NSLinkModule () APIs.

This method is equivalent to dynamic DLL loading on Windows. It is described on macOS man page as a way to programmatically load plugins after a program starts executing.

The Engine

The loaded module represents itself an engine driven by the JavaScript files.

Some of the scripts reside in the app’s Resources directory in an encrypted form, forming an SDK. Other JavaScript files are fetched from a remote server as tasks (internally called ‘offers’ as they are designed to offer / advertise other products).

The downloaded tasks rely on high-level function calls from the SDK. This allows composing tasks with a very flexible logic.

There are several classes exposed by the engine to the SDK, such as:

The engine itself is executed on macOS platform natively.

For example, a JavaScript task may attempt to elevate the privilege level with the following call:

function relaunchWithRoot () { 
    installer.relaunchingWithRoot = true; 
    var successR = installer.elevatePrivilegedTask ();

The elevatePrivilegedTask () call in JavaScript has a corresponding method in the engine’s class, such as tr54jds23. The engine exposes this method to JavaScript code so that it can be called directly from the engine.

When tr54jds23 -> elevatePrivilegedTask () is executed, the engine calls another method: ICTaskManager -> elevatePrivilegedTask () .

That will, in turn, create a task ICTaskManager-> root_Task . The task will then create an authorization with the system.privilege.adminflag. Next, the task is executed with the AuthorizationExecuteWithPrivileges () call.

In practice, this may invoke a dialog asking for admin password so that the task can be executed as root.

String / API Encryption

The engine module stores the names of all critical functions and most critical strings encoded. In one of the analyzed samples, there are 1,228 encoded strings, decoded with 1,055 different functions. That is, some strings are decoded with the same function.

All the string decoding functions use different keys, but they implement one of the following 3 algorithms:

One of the string decryption routines can be demonstrated with the anti-debugging trick explained in the next section.

Anti-debugging Trick

An attempt to attach to or run the bundleware app under a debugger produces the following error:

mac: / user $ sudo lldb /Users/user/Installer/Installer.app
(lldb) target create "/Users/user/Installer/Installer.app"
Current executable set to '/Users/user/Installer/Installer.app' (x86_64).
(lldb) r
Process 1280 launched:
'/Users/user/Installer/Installer.app/Contents/MacOS/radiosurgical' (x86_64)
Process 1280 exited with status = 45 (0x0000002d)

The anti-debugging defense is provided with ptrace () request named PT_DENY_ATTACH( 0x1F), called from the function below:

ptrace = 0x515D5A5D;                       // encrypted 'ptrace' string: 5D 5A 5D 51 
ptrace_plus_4 = 0x5752;                    // 52 57 
ptrace_plus_6 = 0x33;                      // 33 
ptrace [0] = add_2D_xor (0x5D, 0);           // decrypt 1st char (5D ^ (2D + 0)) 
i = 1;                                     // start loop from the 2nd char
do
{                                           // decrypt the rest 
    ptrace [ i ] = add_2D_xor ( ptrace [ i ], i );  // ptrace [i] ^ = 2D + i 
    i ++;
}
while ( i ! = 6);                            // 6 characters from the 2nd char, including / 0 
fn_ptrace = dlsym (RTLD_NEXT, & ptrace );     // get proc address from the linked dylibs 
return fn_ptrace ( PT_DENY_ATTACH , 0, 0, 0); // call ptrace () by pointer, deny tracing

If the process is being debugged, as defined in man ptrace, it will exit with the exit status of ENOTSUP( 45), ‘error, not supported’ . Otherwise, it sets a flag that denies future traces – an attempt to debug it with this flag set will result in a segmentation violation exception.

By stepping over a call into the function that denies tracing (or NOP-ing the 5 bytes of the call), the anti-debugging trick above can be easily circumvented:

->   0x103dd1ff5 <+25>: callq 0x103e30cd3                ; call the function with ptrace ()
    0x103dd1ffa <+30>: callq 0x103de03aa; ICCrashLogger :: sharedLogger ()
    0x103dd1fff <+35>: movq% rax,% rdi
(lldb) re w pc `$ pc + 5`       ; step over the call by adding 5 bytes to $ pc 
(lldb) x / 2i $ pc              ; now $ pc (pseudo-name for RIP) points to the next instruction
-> 0x103dd1ffa: e8 ab e3 00 00 callq 0x103de03aa; ICCrashLogger :: sharedLogger ()
    0x103dd1fff: 48 89 c7 movq% rax,% rdi

Obfuscation Variations

Some samples of this bundleware family contain variations in the string encryption algorithm.

For example, a hard-coded integer number 6within a decryption function can be encoded with a separate function, as shown below:

get_6 proc near
        push rbp
        mov rbp, rsp
        mov al, 3          ; al = 3 
        shl al, 2          ; al = 12 
        movsx ecx, al        ; ecx = 12 
        mov eax, 65        ; eax = 65 
        xor edx, edx       ; edx = 0 
        idiv ecx            ; 65/12, eax = 5 
        mul cl             ; eax = 60 
        mov cl, 65         ; cl = 65 
        sub cl, al         ; cl = 65 - 60 = 5 
        inc cl             ; cl = 6 
        movsx eax, cl        ; result = 6
        pop rbp
        retn
get_6 endp

The same function is collapsed by the Hex-Rays Decompiler into:

signed __int64 get_6 ()
{
    return 6;
}

VM Evasion

Because of their advertising-based business model, the developers of this bundleware installer go to great lengths to prevent fraud by those looking to pump up the monetization of their apps using large numbers of virtual machine installs. The engine is able to detect the presence of a virtual environment through the method checkPossibleFraud () . This method is exposed to JavaScript, where it can be called as:

var isVm = system.checkPossibleFraud ()> 0? 1: 0;

To achieve that, the engine compiles a so called ‘fraud’ report that consists of the following details:

Crash Logger

The crash logger in the samples we examined sends a GET request to a remote script, disguised as a PNG file:

http [: //] ec2-54-191-37-103.us-west-2.compute.amazonaws [.] com / black.png

The stats it submits to the remote script are encoded as URL parameters:

Configuration

The installer uses two configuration files. The first one is dynamically extracted from an unused cavity of the installer’s own DMG file. This configuration is written into the DMG file (a process internally called ‘injection’) after the DMG file is built, and is encrypted with AES-128 algorithm.

To locate the encrypted config within the DMG file, the installer module parses the contents of the file. For each pair of bytes, it subtracts one byte from another, until if locates the following signature that consists of 7 64-bit integers, such as:

__const: 0000000103E51C40    signature    dq 0Fh , 9 , 3Eh , 23h , 7 , 86h , 0Ch

Once located, the config is then extracted and decrypted. As shown in the example below, the extracted configuration specifies the URL of an application to download and install:

PRODUCT_TITLE = Duolingo% 202017
 PRODUCT_DESCRIPTION = To% 20install% 20Duolingo% 202017% 20for% 20Mac% 20click% 20Continue.
PRODUCT_VERSION = Mac
 PRODUCT_PUBLIC_DATE = 2017
 PRODUCT_FILE_NAME = Duolingo% 20% 20Setup
 PRODUCT_FILE_SIZE = 260.8% 20MB
 CHNL = download7-Duolingo
 DOWNLOAD_URL = http% 3A% 2F% 2Fcdn.downloadfree2.com% 2Fmacsoftware% 2FBlueStacks Installer.dmg
 PRODUCT_LOGO_URL = http% 3A% 2F% 2Fwww.download7.co% 2Fgamegraphics% 2F90.png
 ROOT_IF_INSTALLED = com.bluestacks.BlueStacks
 APP_NAME = BlueStacks
 TOS_URL = http% 3A% 2F% 2Fwww.download7.co% 2Feula.html
TYP = http% 3A% 2F% 2Fpiroga.space% 2Fpages% 2FDM% 2FDMTYP.html% 3Foffers% 3D
 EXIT_PAGE_URL = http% 3A% 2F% 2Fpiroga.space% 2Fpages% 2FDM% 2FDMInter.html
 PRIVACY_URL = http% 3A% 2F% 2Fwww.download7.co% 2Fprivacy.html
 ISPBROWSER = ch
 % 40REPORT_ADD_PARAMS = IRONBRO_ID% 253D9309% 2526INST_GUID% 253D4136ef6c-c79c-49b3 -...
 INST_GUID = 4136ef6c3e -794

The 2nd configuration file is provided as a JavaScript file, and is decrypted with the other SDK files from the app’s Resources directory.

This configuration defines multiple operational parameters, such as report and ad servers:

var appInfo = {
    report: 'http: // rp. [REMOVED] .com' ,
    ad_url: ’http://os.[REMOVED].com/MacDarwenDLM/?v=5.0’,
    requires_root: false,
    root_if_installed: [’’],
    skip_vm_check: false,
    ...

Report Server

The report server from the configuration is used to receive posted reports.

The example below demonstrates what data is posted to the report server:

AC = DarwenDLM
PrID = MacDarwenDLM
PrSub = MacDarwenDLM
RS = Q
IRVER = 106.1712
CHNL = download7-Duolingo
PROD_TITLE = Duolingo 2017
schemeName = MacDarwenDLM
OSName = OSX
OSVer = 10.12
OSLang = en
_makeDate = 201711091722
SDT = 20181004195204931
UID = 9C0C266E-266D-4D98-B83C-BCB2A3018EB7
BRW = Safari
OSPlat = 2
MAC_L = [REMOVED]000000000000%3A127.0.0.1%3A24%3A0
hddSize = 107374182400
_makerver = total20171107115116
Isuseradmin = 1
isVmDef = 1
inst_flv = no_injection_106.1712
dwa.SrcNo = 1
QuitPage = welcomePage
RepCnt = 1
ofrClPrm = 266E-266D-4D98-B83C-BCB2A3018EB7

As seen in the example, the data it posts contains basic system information, such as macOS version number (OSVer), language (OSLang), MAC and IP addresses for all network interfaces (MAC_L), default browser name (BRW), HDD size, whether a VM was detected or not (isVmDef), whether user is admin (isuseradmin), and some other parameters.

The collected data is assembled into a text, then encrypted with AES-128, and posted to the server:

Tasks

Remote tasks are received encrypted from the ad server, as shown below:

POST os.[REMOVED].com/MacDarwenDLM/?v=5.0
USER-AGENT: ICMAC
Response:
    Header: X-ICSCT-SERVER-NAME: ads-slave-1111-production-us-west-2-i-07e9c6437616f3e49
    Data: 85,368 bytes binary [6c ec 6c 99...]

When the received task is decrypted, its data is split into named sections. Each section is surrounded with the following comments:

var namestartstr = ’<!--SECTION NAME="’;
var nameendstr = ’"-->’;
var sectionendstr = ’<!--/SECTION-->’;

The parser extracts JavaScript code from those sections. That code will then rely on APIs exposed by the SDK, to drive the engine that exposes its own API interface to the SDK.

The nature of the received tasks may depend on the presence of VM (a condition, internally called ’fraud’).

An analysis of the tasks received from the ad server reveals no malicious activity.

Engine Capabilities

The bundleware’s engine consists of the several components, capable of doing the following:

CI/CD pipeline

The authors of the bundleware have apparently adopted Jenkins, an open source tool for building Continuous Integration and Continuous Delivery (CI/CD) pipelines.

This fact is evident from the full path names of the source files that are still visible in the compiled engine module, such as:

/Users/[REMOVED]-ai/jenkins/workspace/AI_Mac_Release_Production/continuous-integration/mac_team_repo/Sources/[REMOVED]Utils.mm

With Continuous Integration in place, every commit made to the source code in the Git repository is pulled and built automatically. If it results in a bug, the developers can quickly identify the responsible commit.

Continuous Integration pipeline also allows automated testing and deployment. Apart from reducing the length of time it takes to release new PUA builds, it also allows to increase their quality.

Conclusion

Being a legitimate distribution platform, the techniques employed by this popular bundleware product conceal a very powerful engine.

When viewed from a certain angle, this engine resembles a backdoor as it unlocks full access to the system.

The sheer power of the engine is made covert with the wisely engineered trickery. Some of its methods, such as loading code from memory, are known from the “The Mac Hacker’s Handbook”, and rightly belong to the world of malware.

Given that the engine is driven by symmetrically encrypted remote tasks, any researcher who pays attention to detail couldn’t help but wonder what would happen if the control of its engine was intercepted.

Careful analysis of these techniques also demonstrates a disturbing trend we’re witnessing – the continued ’spill’ of the traditional Windows malicious techniques, such as run-time packing, strings/API obfuscation, memory injection into the Mac world.

Even though the installer itself is legitimate, an analysis of state-of-art code where these techniques are honed to perfection is vitally important for the researchers to understand what opportunities exist on the macOS platform, to be better prepared for the challenges that lay ahead of us.

This report is based on research previously presented at Virus Bulletin’s VB2019 conference .

Exit mobile version