Naked Security

PyPI Python repository hit by typosquatting sneak attack

Imposters posing as popular packages were found to contain malicious code

Somebody with time on their hands has tested out a devious new form of typosquatting targeting developers installing Python packages from the PyPI (Python Package Index) repository.

According to an advisory posted by the Slovak National Security Office (NBU), ten packages for Python 2.x were removed from the site after their setup.py files were found to contain malicious code. The bad code was hiding in plain sight in the repository, using names that were either nearly identical to legitimate ones or could easily be mistaken for them.

For example, the genuine HTTP library urllib3 was being shadowed by an imposter – the difference between them is a single character:

Real: urllib3-1.21.1.tar.gz
Fake: urllib-1.21.1.tar.gz

The other fake packages were (correct names in parentheses):

  • acqusition (acquisition)
  • apidev-coop (apidev-coop_cms)
  • bzip (bz2file)
  • crypt (crypto)
  • django-server (django-server-guardian-api)
  • pwd (pwdhash)
  • setup-tools (setuptools)
  • telnet (telnetsrvlib)
  • urllib (urllib3, a second attack on the library mentioned above).
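To make the single-character problem concrete, here is a minimal sketch of a "did you mean?" check that compares a requested name against a list of trusted names before anything is installed. It is an illustration only: the function name near_misses, the 0.8 cutoff and the tiny allowlist (just the legitimate names from the list above) are assumptions for this example, not part of pip or the NBU advisory.

```python
import difflib

# Assumption: a hand-maintained allowlist. Here it holds only the
# legitimate names mentioned in the list above.
KNOWN_GOOD = [
    "acquisition", "apidev-coop_cms", "bz2file", "crypto",
    "django-server-guardian-api", "pwdhash", "setuptools",
    "telnetsrvlib", "urllib3",
]


def near_misses(requested, known=KNOWN_GOOD, cutoff=0.8):
    """Return trusted names that `requested` resembles without matching exactly."""
    name = requested.lower()
    if name in known:
        return []  # exact match: nothing suspicious about the name itself
    # difflib scores similarity from 0.0 to 1.0; keep only the close calls.
    return difflib.get_close_matches(name, known, n=3, cutoff=cutoff)


for candidate in ("urllib", "acqusition", "setup-tools", "urllib3", "bzip"):
    print(candidate, "->", near_misses(candidate))
```

A similarity check like this catches the misspellings (urllib, acqusition, setup-tools) but not the merely plausible names: bzip looks nothing like bz2file as far as string distance is concerned, so it passes unflagged.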

Says the NBU:

These packages contain the exact same code as their upstream package thus their functionality is the same, but the installation script, setup.py, is modified to include a malicious (but relatively benign) code.

There’s a lot to discuss here, but clearly the attack relies on two subterfuges, the first being devs-in-a-hurry mis-typing the package name when using Python command-line installers such as pip.

That’s easy to do and there’s no way of knowing that something untoward has happened because, as the advisory says, everything looks normal when the packages install on Python 2.x using admin privileges. The pip installer, it has been pointed out, lacks any way of verifying a package using a cryptographic signature.
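Short of signatures, the usual fallback is to pin a digest. pip does have a hash-checking mode (the --require-hashes option), which compares each download against a SHA-256 value recorded in the requirements file. The sketch below shows the same idea by hand for a single downloaded archive; the helper names sha256_of and matches_pin are hypothetical, and the pinned value is assumed to come from a source you already trust.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path, pinned_hex):
    """True if the file's SHA-256 equals the digest pinned earlier."""
    return hmac.compare_digest(sha256_of(path), pinned_hex.lower())
```

Note the limit: a digest proves that the file is the one you pinned, not that it came from the right publisher. If the pin was taken from the imposter package in the first place, the check will pass happily.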

But some of the names seem designed to impersonate popular packages in a more general way – file names that look plausible rather than identical – which suggests that the people who put the bogus files on PyPI are also trying to catch out developers who download the source code direct.

As to the issue of motivation, could this be a proof-of-concept attack?

The NBU describes the fake packages as the “code execution of benign malware”, which sounds about right given that they collect data on the users installing them, hostnames, and which package was installed.

Except that anything installed without consent with the intention of collecting identifiable information is, arguably, harmful even if the precise motive is not clear.

Separately, researchers Benjamin Bach and Hanno Böck used the same “Pytosquatting” MO to upload 20 modified Python libraries (now removed) designed to track the IP addresses of those accessing them. The results showed that since June the packages were accessed 45,000 times on 17,000 domains.

This research, the pair said, was designed to probe insecurities in the way repositories are being used. Typosquatting is more often associated with rogue websites but this research, and the attack spotted by the NBU, is a warning that the technique can be deployed in any context.

Unravelling the PyPI attack will still be a slog:

There is evidence that fake packages have been downloaded and incorporated into software multiple times between June 2017 and September 2017.

If the tail has a sting it’s that not only is the code sitting on an unknown number of servers, it is now part of real software.

Admins who see outbound connections to 121.42.217.44 on port 8080 may be harbouring a rogue. If you have one, re-installing the correct packages should fix the issue.
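As well as watching the network, you can check what is actually installed. The following is a rough sketch, assuming Python 3.8 or later (for importlib.metadata) and using only the fake names listed earlier in this article; the function names are made up for the example. Treat a hit as a lead to investigate rather than proof, and consult the NBU advisory for the authoritative list.

```python
import re
from importlib import metadata

# Fake package names as listed in this article (not necessarily complete).
FAKE_NAMES = {
    "acqusition", "apidev-coop", "bzip", "crypt", "django-server",
    "pwd", "setup-tools", "telnet", "urllib",
}


def normalise(name):
    """Fold case and treat runs of '-', '_' and '.' alike (PEP 503 style)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def flag_fakes(installed_names, fakes=FAKE_NAMES):
    """Return the installed names that match a known fake, sorted."""
    bad = {normalise(n) for n in fakes}
    return sorted(n for n in installed_names if normalise(n) in bad)


def installed_distribution_names():
    """Names of distributions visible to the running interpreter."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with damaged metadata
            names.append(name)
    return names
```

Calling flag_fakes(installed_distribution_names()) inside each environment you care about lists any matches. Standard library modules such as pwd, crypt and urllib are not pip-installed distributions, so they do not show up here; only a separately installed package with one of those names would.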

Sophos blocks access to the IP address and detects the malware as Troj/Pytoy-A.

Developers – be careful including other people’s code in your projects.


2 Comments

I’m hoping this will encourage the Python package managers to implement package validation immediately.
I’m not sure if anything can be done about typographically similar package names. I typically copy and paste package names, but if I only have a general idea of what the package name should be, I’ll download everything similar, and hope I find the right one in the mix. At which point, the malware monster is inside the gate.
Is there a simple PIP command for reloading all libraries installed in a date range? I would hesitate to use it, since there might still be some compromised libraries in the public repository.

Snip>>>
The pip installer, it has been pointed out, lacks any way of verifying a package using a cryptographic signature.
<<<Snip
In this day and age I find this hard to believe. What about Python 3.xx?
