Naked Security

Forget BadBIOS, here comes BadBarcode…

BadBarcode is a security problem with barcode scanners that has made some dramatic headlines - but what is it, and what can be done about it?

What a difference a cool name makes!

A security research paper entitled “Putting control characters into one-dimensional barcodes to trip up sloppily coded apps” probably wouldn’t grab your attention.

But BadBarcode would, so that’s what a Chinese security researcher who goes by the name Hyperchem Ma called the paper he presented at the recent PacSec 2015 conference.

The paper has received a lot of publicity, including some dramatic headlines like “Poisoned barcodes can be used to take over systems” and “Customized barcodes can hack computers”.

So we thought we’d take a look at what BadBarcode really is, so you can decide how dangerous the problem is likely to be.

A one-dimensional study

Ma looked only at so-called one-dimensional, or 1D, barcodes, which are the ones you typically find on products on a supermarket shelf.

The barcode runs in a single line, printed from left to right, though thanks to the arrangement of the stripes, it can also be read upside down.

There are two main sorts of 1D barcode, known as Code 39 and Code 128.

The names are a curious mix of history and peculiarity.

Code 39 can represent 43 different characters these days, but it was originally limited to 40 symbols, with one reserved as a start/stop marker, and so the number 39 (40 minus 1) stuck in the name.

Code 128, curiously, can represent 108 symbols, only 103 of which are actual data characters, but it has 3 control symbols that choose which data bytes are represented by each of the 103 encodings.

You can mix control symbols and data symbols inside a barcode, with the control symbols acting a bit like the Caps Lock key on a keyboard to toggle between different parts of the ASCII character set.

In short: Code 128 can represent all 128 characters in the 7-bit ASCII set, including characters like Ctrl-C, Ctrl-M (Carriage Return) and Ctrl-[ (Escape).
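To see how the “Caps Lock” behaviour lets control characters slip in, here is a minimal Python sketch of the value-to-character step for Code 128 code sets A and B. It assumes the bars have already been turned into numeric symbol values, and that the checksum and stop symbols have been stripped. The function name decode_values is made up for this illustration, and code set C, Shift and the FNC symbols are deliberately left out.

```python
# Simplified sketch: Code 128 symbol values -> text (code sets A and B only).
START_A, START_B = 103, 104
CODE_B_IN_SET_A = 100  # "switch to set B" symbol while in set A
CODE_A_IN_SET_B = 101  # "switch to set A" symbol while in set B


def decode_values(values):
    """Decode a list of symbol values that begins with Start A or Start B.

    Assumes the checksum and stop symbols have already been removed.
    """
    if values[0] == START_A:
        code_set = "A"
    elif values[0] == START_B:
        code_set = "B"
    else:
        raise ValueError("sketch only handles Start A and Start B")

    out = []
    for v in values[1:]:
        if 0 <= v < 96:
            if code_set == "A" and v >= 64:
                # Set A: values 64-95 are the ASCII control characters 0-31.
                out.append(chr(v - 64))
            else:
                # Sets A and B: values 0-63 are ASCII 32-95;
                # set B: values 64-95 are ASCII 96-127.
                out.append(chr(v + 32))
        elif code_set == "A" and v == CODE_B_IN_SET_A:
            code_set = "B"
        elif code_set == "B" and v == CODE_A_IN_SET_B:
            code_set = "A"
        else:
            raise ValueError("symbol value %d not handled in this sketch" % v)
    return "".join(out)


# Start A, then Ctrl-R, "CMD", Carriage Return.
print(repr(decode_values([103, 82, 35, 45, 36, 77])))
```

In other words, a barcode that starts in set A can carry Ctrl-R or Carriage Return just as easily as it can carry a letter or a digit.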

The barcode “keyboard”

The reason this matters is that most barcode readers are implemented as plug-and-play keyboards, just like old-school credit card magstripe readers.

That way, you can read barcodes into your app simply by reading from the keyboard, as you would if the operator typed in the characters printed underneath the barcode.

Now, imagine that your app expects Code 39 barcodes: you might well assume that the input from the pseudo-keyboard barcode reader will only ever include A-Z, 0-9, space and one of -$%./+.

So, even if your app is written using a programming library that processes, say, Ctrl-O as a shortcut to open a file dialog, or Ctrl-R to run a new program, and so on, you might assume that you don’t have to worry about those special characters turning up in a maliciously-generated barcode.

Code 39 doesn’t support those characters, so they can’t show up.

So you might be inclined to trust the input from the barcode implicitly, for example when a user wants to scan an item at one of the price check stations that many supermarkets provide.

But if a crooked customer shows up with a Code 128 barcode that reads something like…

[Ctrl-R]CMD.EXE[Enter]DEL /Y /S C:\*.*[Enter]

…then many barcode readers will nevertheless recognise it as a valid barcode, choose the right decoding algorithm, and return the characters anyway.

As a result, your app might wander into trouble.

Validate your input

To work around that, you’re probably thinking that validating your input is a good idea, and you’d be right.

In other words, you accept a line of input from the barcode scanner but check through it first for anything out of place.

If you’re expecting digits only, for example, then when letters, punctuation or control characters appear, you can trigger an error and refuse the input, instead of going ahead with something definitely unexpected and potentially dangerous.
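As a rough illustration, an allow-list check of this sort might look like the Python sketch below. The names validate_scan and validate_digits and the length limits are assumptions chosen for the example, not anything from the BadBarcode paper; the Code 39 alphabet is the 43-character set listed earlier.

```python
import string

# The 43 characters Code 39 can represent: A-Z, 0-9, space and -$%./+
CODE39_ALPHABET = set(string.ascii_uppercase + string.digits + " -$%./+")


def validate_scan(data, allowed=CODE39_ALPHABET, max_len=32):
    """Return data unchanged if every character is allowed, else raise ValueError.

    max_len is an arbitrary cap for this sketch; pick one that fits your labels.
    """
    if not data or len(data) > max_len:
        raise ValueError("scan has unexpected length %d" % len(data))
    bad = sorted({c for c in data if c not in allowed})
    if bad:
        raise ValueError("scan contains unexpected characters: %r" % bad)
    return data


def validate_digits(data, max_len=14):
    """Stricter variant for apps that expect digits only."""
    return validate_scan(data, allowed=set(string.digits), max_len=max_len)


for sample in ("5012345678900", "\x12CMD.EXE\r"):
    try:
        print("accepted:", validate_digits(sample))
    except ValueError as err:
        print("rejected:", err)
```

Note that the check rejects anything outside the allow-list rather than trying to strip out known-bad characters, and that it caps the length as well as the character set.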

However, that might not be enough on its own, because the operating system itself – or at least what’s called the window manager – might detect and act on some special characters immediately, before your input validation algorithm is even called.

Window managers are needed when several apps share the keyboard and screen, to make sure that the right apps send and receive the right content, and to deal with special keystrokes that should be consumed directly by the window manager itself, such as Alt-TAB on Windows.

So if you want to protect your barcode-reading app from unusual, unexpected or even malicious “keystroke” data inside a barcode, you also need to familiarise yourself with the low-level programming functions that allow you to get the first look at every keystroke, even before the window manager gets its chance.

On Windows, for example, the function SetWindowsHookEx() is your friend.

With this function, you can instruct Windows to call a special procedure inside your app, known as a LowLevelKeyboardProc (installed using the hook type WH_KEYBOARD_LL), giving you first look at the keystroke that’s coming next, and allowing you to process it (or ignore it) before anyone else gets a chance.
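Here is a hedged sketch of what such a hook could look like from Python via ctypes. It is Windows-only and untested against any particular scanner. The policy function should_block, the allow-list ALLOWED_VKS and the wrapper run_hook are hypothetical names invented for this example; the Windows calls themselves (SetWindowsHookExW, CallNextHookEx, GetAsyncKeyState, GetMessageW, UnhookWindowsHookEx) are the standard user32 functions. The assumed policy is a kiosk that should only ever see digits and Enter.

```python
import ctypes
import sys

VK_RETURN = 0x0D
# Hypothetical policy: only the digit keys 0-9 and Enter are let through.
ALLOWED_VKS = set(range(0x30, 0x3A)) | {VK_RETURN}


def should_block(vk_code, modifier_down):
    """Swallow anything off the allow-list, or anything sent with Ctrl/Alt/Win held."""
    return modifier_down or vk_code not in ALLOWED_VKS


def run_hook():
    """Install a WH_KEYBOARD_LL hook and pump messages until WM_QUIT (Windows only)."""
    if sys.platform != "win32":
        raise OSError("low-level keyboard hooks need Windows")
    from ctypes import wintypes

    WH_KEYBOARD_LL, HC_ACTION = 13, 0
    LRESULT = ctypes.c_ssize_t

    class KBDLLHOOKSTRUCT(ctypes.Structure):
        _fields_ = [("vkCode", wintypes.DWORD), ("scanCode", wintypes.DWORD),
                    ("flags", wintypes.DWORD), ("time", wintypes.DWORD),
                    ("dwExtraInfo", ctypes.c_size_t)]

    HOOKPROC = ctypes.WINFUNCTYPE(LRESULT, ctypes.c_int,
                                  wintypes.WPARAM, wintypes.LPARAM)
    user32 = ctypes.WinDLL("user32", use_last_error=True)
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    user32.SetWindowsHookExW.argtypes = [ctypes.c_int, HOOKPROC,
                                         wintypes.HINSTANCE, wintypes.DWORD]
    user32.SetWindowsHookExW.restype = wintypes.HHOOK
    user32.CallNextHookEx.argtypes = [wintypes.HHOOK, ctypes.c_int,
                                      wintypes.WPARAM, wintypes.LPARAM]
    user32.CallNextHookEx.restype = LRESULT
    user32.UnhookWindowsHookEx.argtypes = [wintypes.HHOOK]
    user32.GetAsyncKeyState.argtypes = [ctypes.c_int]
    user32.GetAsyncKeyState.restype = ctypes.c_short
    kernel32.GetModuleHandleW.argtypes = [wintypes.LPCWSTR]
    kernel32.GetModuleHandleW.restype = wintypes.HMODULE

    def modifier_down():
        # VK_CONTROL, VK_MENU (Alt), VK_LWIN, VK_RWIN
        return any(user32.GetAsyncKeyState(vk) & 0x8000
                   for vk in (0x11, 0x12, 0x5B, 0x5C))

    @HOOKPROC
    def hook_proc(n_code, w_param, l_param):
        if n_code == HC_ACTION:
            info = KBDLLHOOKSTRUCT.from_address(l_param)
            if should_block(info.vkCode, modifier_down()):
                return 1  # non-zero return: the keystroke goes no further
        return user32.CallNextHookEx(None, n_code, w_param, l_param)

    hook = user32.SetWindowsHookExW(WH_KEYBOARD_LL, hook_proc,
                                    kernel32.GetModuleHandleW(None), 0)
    if not hook:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        msg = wintypes.MSG()
        # The hook is only called while this thread is pumping messages.
        while user32.GetMessageW(ctypes.byref(msg), None, 0, 0) > 0:
            pass
    finally:
        user32.UnhookWindowsHookEx(hook)
```

Be aware that a hook like this sees every keyboard attached to the computer, not just the scanner, so a policy this strict only makes sense on a dedicated kiosk; it complements input validation inside the app rather than replacing it.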

That way, you can improve the safety and security of programs that need to accept input from untrusted outsiders, yet are forced by the available hardware to consume that input as if the potential attacker were typing away at a keyboard.

By the way, there’s a whole slew of 2D barcodes as well, such as Data Matrix, PDF417 and – perhaps the best known sort – QR codes.

The 2D barcodes typically let you store much more data in the same space, so are increasingly widely used – and increasingly widely supported by barcode readers.

For all you know, your Code 39-based app, programmed to assume digits only, might some day be confronted by hundreds of bytes of data from a QR code, simply because you can’t control what an untrusted outsider might hold up to the reader.

What to do?

Briefly put:

  • Always validate input before using it.
  • Always understand how untrusted input might affect the underlying operating system before you see it.
  • Assume that specialised input devices (e.g. barcode scanners) can be made to behave like general-purpose ones (e.g. keyboards).
  • Expect the unexpected.

10 Comments

Hi Paul, that is interesting and confirms the need to verify all user input. But I find the idea of an app having responsibility for protecting/preventing the OS from interpreting barcode input rather strange. Surely it is the OS that has this duty.

Also, does the SetWindowsHookEx() function you mentioned effectively allow any app to read password input? Is there no protection against that?

As far as I know, not just any app can set a global keyboard hook – you need sufficient privilege for that. So malware that wants to use this function for password stealing either has to inject itself into the victim app, or have enough privilege to monitor all keystrokes for all apps. (If you are logged in as an admin, of course, then any malware that runs will probably be admin as well, but in that case you have even bigger worries :-)

What do you think about QR code readers on an Android phone? Is there not a severe risk of a possible hack in this way?

Don’t know…depends how the QR reader is implemented. I think it’s provided as a library by Android itself, whereas the problem outlined here is caused by not having built-in support for reading barcodes, but instead retrofitting that support by making the barcode reader into a keyboard. In other words, I think Android’s QR reader is implemented *as a QR reader*, so if it has security problems, they would be of a different sort.

Probably not, as the app parses it and does expect wonky characters.
Just make sure to use a trusted and popular QR code reader.

Isn’t it possible to filter out malicious input on the hardware level, by setting the barcode reader accordingly?

It would be possible to filter out input in the reader itself, the problem is, they don’t, at the moment. And different customers might want different filtering. It’d take a major re-work, there’d need to be a way of the user, or his admin, telling the scanner what characters to accept, and what not to. It could be done, but it’d require reprogramming every scanner, assuming their controller chips are reprogrammable.

They’d need sending back to the factory at least, and that’s the ones that are still supported, that the manufacturer still remembers how to service. Most would need throwing out.

The root cause is scanners acting as keyboards, a keyboard has authority to control a computer fully, because it’s assumed the owner is sat typing on it. A scanner is for data entry, barcodes from third parties that aren’t human-readable.

Scanners masquerading as keyboards was a nice hack, back decades ago, it allowed interoperability with manual data-entry apps, and all computers support keyboards, with no drivers needed. Keyboards are also easy to emulate. So no standards needed drawing up, they just piggybacked the keyboard.

It’s about time they dropped this, and treated scanners separately. New software will be needed, standards will need to be set. But it’s not technically difficult. There’s a LOT of scanners out there in the wilderness, that manufacturers have long since lost track of. But a scare story like this might provide an incentive for a quick change. Businesses were willing to throw money at the Y2K bug. Scanner manufacturers could use this story in their advertising!

Bad barcodes are in the wild; if you frequent StackOverflow, StackExchange, etc., you will occasionally run into profiles with embedded QR codes. At least one of these responds with valid ShellShock; potentially funny if you’re a server admin witnessing a bot get compromised – but still a lesson for trusting every code you see.

Reading the other comments, in general the problem is identifying malicious input. Keyboards are a special case because the assumption is that you meant to type that; how does it know you aren’t typing into notepad (where there’s negligible risk code will run), writing an internal security report or article? The hardware context rarely knows much about the software context, and relies on software to track temporal state changes (like a string of characters longer than a hardware buffer) so this has historically been a tricky problem.

