Early disclaimer: this isn’t quite the mother of all data breaches, nor even perhaps a younger cousin, so you can stand down from Blue Alert right away.
As far as we can tell, only names, email addresses and employers were leaked in the wrongly shared document.
But what names they were!
The leaked list apparently made up a handy email Who’s Who list of global cybersecurity experts from intelligence agencies, law enforcement groups, and serving military staff.
Threat intelligence company Recorded Future and German news site Der Spiegel have listed a wide range of victims, including the NSA, FBI and the US Cyber Command in America, the German BSI (Federal Office for Information Security), the UK’s National Cyber Security Centre…
…and we could go on.
Other countries with affected government ministries apparently include, in no particular order: Taiwan, Lithuania, Israel, the Netherlands, Poland, Saudi Arabia, Qatar, France, the United Arab Emirates, Japan, Estonia, Turkey, Czechia, Egypt, Colombia, Ukraine, and Slovakia.
Der Spiegel suggests that numerous big German companies were affected, too, including BMW, Allianz, Mercedes-Benz, and Deutsche Telekom.
A total of about 5600 names, emails and organisational affiliations were leaked in all.
How did the leak happen?
It helps to remember that Virus Total is all about sample sharing, where anyone in the world (whether they’re paying Virus Total customers or not) can upload suspicious files in order to achieve two prompt outcomes:
- Scan the files for malware using dozens of participating products. (Sophos is one.) Note that this is not a way to compare detection rates or to “test” products, because only one small component in each product is used, namely its pre-execution, file-based, anti-malware scanner. But it’s a very quick and convenient way of disambiguating the many different detection names for common malware families that different products inevitably end up with.
- Share uploaded files swiftly and securely with participating vendors. Any company whose product is in the detection mix can download new samples, whether they already detected them or not, for further analysis and research. Sample sharing schemes in the early days of anti-malware research typically relied on PGP encryption scripts and closed mailing lists, but Virus Total’s account-based secure download system is much simpler, speedier and more scalable than that.
In fact, in those early days of malware detection and prevention, most samples were so-called executable files, or programs, which rarely if ever contained personally identifiable information.
Even though helpfully sharing a malware-infected sample of a proprietary program might ultimately attract a complaint from the vendor on copyright grounds, that sort of objection was easily resolved simply by deleting the file later on, given that the file wasn’t supposed to be kept secret, merely to be licensed properly.
(In real life, few vendors minded, given that the files were never shared widely, rarely formed a complete application installation, and anyway were being shared specifically for malware analysis purposes, not for piracy.)
Non-executable files containing malware were rarely shared, and could easily and automatically be identified if you tried to share one by mistake because they lacked the tell-tale starting bytes of a typical program file.
In case you’re wondering, DOS and Windows .EXE files have, from the earliest days of MS-DOS onwards, started with the text characters MZ, which come out as 77 90 in decimal and as 0x4D 0x5A in hexadecimal. This makes EXEs easy to recognise, and all non-EXEs similarly quick to spot. And in case you’re wondering why MZ was chosen, the answer is that those are the initials of Microsoft programmer Mark Zbikowski, who came up with the file format in the first place. For what it’s worth, and as an additional fun fact, memory blocks allocated by DOS all started with the byte M, except for the very last one in the list, which was flagged with Z.
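To show how little work that check involves, here is a minimal Python sketch that tests whether some data starts with the MZ signature described above. The function names are our own, chosen purely for illustration, and a real file-type detector would look at far more than two bytes.

```python
# The two signature bytes: "MZ" is 0x4D 0x5A, or 77 90 in decimal.
MZ_MAGIC = b"MZ"


def starts_with_mz(data: bytes) -> bool:
    """Return True if the data begins with the DOS/Windows EXE signature."""
    return data[:2] == MZ_MAGIC


def file_looks_like_exe(path: str) -> bool:
    """Read only the first two bytes of a file and check them."""
    with open(path, "rb") as f:
        return starts_with_mz(f.read(2))
```

Note that this only tells you how a file starts, not what it really is: a data file could begin with those two bytes by coincidence or by design.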
Data files with added code
In 1995, the first Microsoft Word virus appeared, dubbed Concept because that’s exactly what it was, albeit an unhelpful one.
From then on, a significant proportion of active malware samples have been files that consist primarily of private data, but with unauthorised malware code added later in the form of scripts or programming macros.
Technically, there are ways to purge such files of most of their personal information first, such as overwriting every numeric cell in a spreadsheet with the value 42, or replacing every printable non-space character in a document with X or x, but even that sort of pre-processing is prone to trouble.
Firstly, numerous malware families sneakily store at least some of their own needed data as added information in the personal part of such files, so that trying to bowdlerise, redact or rewrite the sensitive, “unsharable” parts of the file causes the malware to stop working, or to behave differently.
This rather ruins the purpose of collecting a real-life sample in the first place.
Secondly, reliably redacting all personal information inside complex, multipart files is effectively an unsolvable problem in its own right.
Even apparently sanitised files may nevertheless leak personal data if you aren’t careful, especially if you’re trying to redact files stored in proprietary formats for which you have little or no official documentation.
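For the curious, the sort of masking mentioned earlier (42 for numbers, X or x for text) might look something like the Python sketch below. The function names, and the rule that capital letters become X while everything else printable becomes x, are our own assumptions for illustration. The sketch works on plain strings and simple lists of cells, not on real document or spreadsheet formats, which is exactly where the difficulties described above begin.

```python
def mask_text(text: str) -> str:
    """Mask printable non-space characters, keeping the layout intact.

    Assumed rule: uppercase letters become X, everything else printable
    becomes x. Whitespace and non-printable characters are left alone.
    """
    out = []
    for ch in text:
        if ch.isspace() or not ch.isprintable():
            out.append(ch)
        elif ch.isupper():
            out.append("X")
        else:
            out.append("x")
    return "".join(out)


def mask_cells(rows):
    """Overwrite every numeric cell with 42; leave other cells untouched."""
    def is_number(cell):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(cell, (int, float)) and not isinstance(cell, bool)

    return [[42 if is_number(cell) else cell for cell in row] for row in rows]
```

For example, `mask_text("Dear Bob, 42!")` produces `Xxxx Xxxx xxx`, which preserves word lengths and capitalisation but nothing else. Anything stored outside the visible text, such as metadata, revision history or embedded objects, is untouched by this approach.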
In short, any upload system that accepts files of arbitrary type, including programs, scripts, configuration data, documents, spreadsheets, images, videos, audio and many more…
…introduces the risk that every now and then, without meaning to, someone with the best will in the world will inadvertently share a file that should never have been released, not even on the basis of working for the greater good of all.
Right file, wrong place
And that’s exactly what happened here.
A file containing a structured list of some 5600 names, email addresses and cybersecurity affiliations of Virus Total customers was uploaded to Virus Total’s scanning-and-sharing service by mistake…
…by an employee inside Virus Total.
This really does appear to have been an innocent mistake that inadvertently shared the file with exactly the wrong people.
And before you say to yourself, “What were they thinking?”…
…ask yourself how many different file upload services your own company uses for various purposes, and whether you would back yourself never to put the right file in the wrong place yourself.
After all, many companies use numerous different outsourced services for different parts of their business workflow these days, so you might have completely different web upload portals for your vacation requests, expense claims, timesheets, travel requests, pension contributions, training courses, source code checkins, sales reports and more.
If you’ve ever sent the right email to the wrong person (and you have!), you should assume that uploading the right file to the wrong place is the sort of mistake that you, too, could make, leaving you asking yourself, “What was I thinking?”
What to do?
Here are three tips, all of which are digital lifestyle changes rather than settings or checkboxes you can simply turn on.
It’s unpopular advice, but logging out from online accounts whenever you aren’t actually using them is a great way to start.
That won’t necessarily stop you uploading to sites that are open to anonymous users, like Virus Total (downloads require a logged-in account, but uploads don’t).
But it greatly reduces your risk of unintentionally interacting with other sites, even if all you do is like a social media post by mistake.
If you’re in the IT team, consider putting controls on which users can send what sorts of file to whom.
You could consider using firewall upload rules to limit which file types can be sent to what sites, or activating various data loss prevention policies in your endpoint security software to warn users when they look like sending something somewhere they shouldn’t.
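As a rough illustration of the idea behind such rules, here is a hedged Python sketch of a default-deny check that maps each destination to the file types allowed to go there. The hostnames, the policy table and the function name are all hypothetical, and real firewalls and data loss prevention tools have their own rule syntax and typically inspect file content rather than trusting the extension.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical policy: destination host -> file extensions allowed there.
# Anything not listed is refused (default deny).
UPLOAD_POLICY = {
    "expenses.example.com": {".pdf", ".png", ".jpg"},
    "scanner.example.net": {".exe", ".dll", ".js"},
}


def upload_allowed(filename: str, url: str) -> bool:
    """Return True only if this file type is permitted for this destination."""
    host = (urlsplit(url).hostname or "").lower()
    ext = PurePosixPath(filename).suffix.lower()
    return ext in UPLOAD_POLICY.get(host, set())
```

With this table, `upload_allowed("customers.csv", "https://scanner.example.net/upload")` returns `False`, so a customer list headed for a malware-scanning site would be stopped, or at least would trigger a warning, before it left the building.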
And if you’re not in IT, don’t take it personally if you one day find your upload freedoms restricted by order of the security team.
After all, you’ll always get a second chance to send a file that wouldn’t go out the first time, but you never get the chance to unsend a file that wasn’t supposed to go out at all.
We’re willing to bet that the Google employee who uploaded the wrong file in this incident would much rather be sitting down right now to negotiate with the IT department about having overly strict upload restrictions relaxed…
…than sitting down to explain to the security team why they uploaded the right file to the wrong place.
As Pink Floyd might have sung, in their early days, “Careful with that file, Eugene!”
David Heath
Thanks for the Pink Floyd reference!! Those of us of a certain age…..
Laurence Marks
Duck wrote:
Technically, there are ways to purge such files of most of their personal information first, such as overwriting every numeric cell in a spreadsheet with the value 42, or replacing every printable non-space character in a document with [X] or [x], but even that sort of pre-processing is prone to trouble.
See (expired) US patent US6631482B1, by the undersigned, readily through Google or the USPTO.