Single Sign-On authentication – the bug that lets you log on as someone else

Paul Ducklin

7 years ago

Logon security company Duo recently found a rather worrying flaw in its own authentication gateway.
A bit of digging revealed that the same flaw existed in many other so-called single sign-on (SSO) applications, thanks to a problem in handling the underlying “authentication language” that has become a standard for products in this space.
Duo disclosed the problem responsibly late last year, and after giving vendors – including itself – time to fix the bug, has now gone public with an excellent and educational explanation of what went wrong.

In the vocabulary of SSO, network authentication uses dedicated authentication servers, known as IdPs (Identity Providers), to validate requests from client software (users) for access to servers on the network, known as SPs (service providers).
This means that you don’t need to program an authentication module, or maintain a separate password database, or run yet another two-factor authentication service, for every server.
In the jargon, you use an SSO IdP server to handle usernames and passwords for all the other SPs on the network.
Of course, if you want various clients and SPs from different vendors to work cleanly together with an IdP from yet another vendor, you need a uniform data language and vocabulary for them to communicate.
One such language is SAML, short for Security Assertion Markup Language.
SAML is a dialect of XML, which is a sort-of tidied-up form of HTML, the language used to create web pages.
Now, if you have written software or scripts that generate web pages in HTML format, you’ll know that it’s gloriously simple to do – you just stick the right tags at each end of each sentence in bold, each web link, each paragraph, each item in a bulleted list, and so on.
Easy!
But if you have ever had to write software to go the other way – to read in HTML or XML and make sense of it – then you will know where this article is going.
Hard!

Generating HTML and then reliably reading it back in are as far apart in difficulty as being able to utter enough badly-pronounced words in a foreign language to find your way to the train station, and being able to chat fluently with a native speaker.

What the bug looks like

Duo did us all a favour by producing a stripped-out representation of the parts that matter in a SAML authentication response; we’ve followed their synthetic example here.
A SAML response typically contains an XML-formatted assertion that identifies the authenticated user, something like this:

<Assertion ID="ABC1245">
    <Subject><NameID>user@example.com</NameID></Subject>
</Assertion>

There should also be a digital signature for the assertion (here identified by the string ABC1245), without which an imposter could simply copy a SAML response, and casually alter the NameID to refer to a different account:

<Signature>
   <SignedInfo><Reference URI="#ABC1245"/></SignedInfo>
   <SignatureValue>digital sig of assertion ABC1245</SignatureValue>
</Signature>

The problem that Duo found was how various programming libraries – including python-saml (used by Duo), ruby-saml and saml2-js – dealt with XML comments inside SAML data structures, and how these comments affected the digital signature process.
Above, the correct data string for the field NameID is obviously user@example.com, being the full text immediately between the start tag <NameID> and the end tag </NameID>.
But if you were to write this instead…

<Assertion ID="ABC1245">
    <Subject><NameID>user@example.com<!-- comment -->.test</NameID></Subject>
</Assertion>

…what’s the correct value for NameID, given that the text <!-- comment --> is supposed to be ignored?
Duo found that buggy SAML libraries would read the NameID string in various ways, sometimes as user@example.com (treating the comment as a terminator for the data field), and sometimes as user@example.com.test (simply treating the comment as if it were not there at all).
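The two readings are easy to reproduce. The sketch below is only an illustration using Python's standard xml.dom.minidom parser, not the code of any of the affected libraries; the helper names first_text_only and all_text are made up for this example.

```python
from xml.dom import minidom

# The synthetic assertion from above, with a comment inside NameID.
SAML = """<Assertion ID="ABC1245">
    <Subject><NameID>user@example.com<!-- comment -->.test</NameID></Subject>
</Assertion>"""


def first_text_only(name_id):
    """Hypothetical helper: read only the first child node, so the comment acts as a terminator."""
    first = name_id.firstChild
    if first is not None and first.nodeType == first.TEXT_NODE:
        return first.data
    return ""


def all_text(name_id):
    """Hypothetical helper: join every text node, as if the comment were not there."""
    return "".join(
        node.data for node in name_id.childNodes if node.nodeType == node.TEXT_NODE
    )


# The parser keeps the comment as a node of its own, which splits the
# text of NameID into two separate text nodes.
name_id = minidom.parseString(SAML).getElementsByTagName("NameID")[0]

print(first_text_only(name_id))  # user@example.com
print(all_text(name_id))         # user@example.com.test
```

Same document, same parser, two different usernames: which one you get depends entirely on how the calling code walks the child nodes.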
Either interpretation has technical validity, and it doesn’t really matter which approach you choose as long as you are consistent.
Duo found that wasn’t the case: buggy SAML libraries would ignore the comment when validating the signature, in effect checking user@example.com.test, but would stop reading at the comment when extracting the username, ending up with user@example.com.
In other words, a crook who could obtain a signed SAML response for an account of their own, such as user@example.com.test, could inject a comment into the NameID field to change the username that the server sees, without invalidating the digital signature.
As a result, the altered response would pass muster, thus potentially tricking servers on the network into trusting an unauthorised user.
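To see why a comment can slip past a signature check, remember that XML signatures are computed over a canonicalised form of the data, and the usual canonicalisation modes throw comments away. The sketch below makes two simplifying assumptions: it uses the C14N 2.0 canonicaliser in Python's standard library (Python 3.8 or later) rather than the canonicalisation method a real SAML deployment would specify, and it uses a bare SHA-256 hash as a stand-in for the real signature. The helper name digest is made up for this example.

```python
import hashlib
from xml.etree.ElementTree import canonicalize

# The same assertion with and without an injected comment.
PLAIN = (
    '<Assertion ID="ABC1245">'
    "<Subject><NameID>user@example.com.test</NameID></Subject>"
    "</Assertion>"
)
COMMENTED = (
    '<Assertion ID="ABC1245">'
    "<Subject><NameID>user@example.com<!-- comment -->.test</NameID></Subject>"
    "</Assertion>"
)


def digest(xml_text):
    """Hypothetical helper: hash the canonical form, a stand-in for a real signature."""
    # with_comments=False is the default; it is spelled out here because
    # it is the behaviour that matters for this bug.
    canonical = canonicalize(xml_text, with_comments=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The comment vanishes during canonicalisation, so both versions hash the same.
print(digest(PLAIN) == digest(COMMENTED))  # True
```

Any check based on the canonical form cannot tell the two documents apart, so code that later reads the username from the original, comment-bearing XML has to be careful to interpret it the same way.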

What to do?

If you use an SSO system in your business: check with your vendor if it is SAML-based. If so, ask if it is affected and whether there is a patch available.
If you are a vendor with any product that speaks SAML: check with your programmers which SAML libraries you use, and whether they need patching.

Finally, at the risk of sounding impractically pompous, re-evaluate everywhere that you’ve used an XML-based approach to data when you didn’t need to.
As a wise man once said, “There is no limit to how much worse you can make a computer security problem by using XML in the process of solving it.”