Have you ever asked Apple’s personal voice assistant, Siri, if it’s always listening to you?
If so, you presumably got one of its cutesy responses. To wit:
I only listen when you’re talking to me.
Well, not just when you’re talking to Siri, actually. These voice assistant devices get triggered accidentally all the time, according to a whistleblower who’s working as a contractor with Apple.
The contractor told The Guardian that the rate of accidental Siri activations is quite high, particularly on the Apple Watch and the company’s HomePod smart speaker. Of all the accidental triggers, those two devices account for the most sensitive recordings being sent to Apple, where human contractors listen to, and analyze, all manner of audio, including private utterances of names and addresses.
The Guardian quoted the Apple contractor:
The regularity of accidental triggers on [Apple Watch] is incredibly high. The watch can record some snippets that will be 30 seconds – not that long, but you can gather a good idea of what’s going on.
The whistleblower says there have been “countless” instances of Apple’s Siri voice assistant mistakenly hearing a “wake word” and recording “private discussions between doctors and patients, business deals, seemingly criminal dealings, sexual encounters” and more. Those recordings often reveal identifying information, they said:
These recordings are accompanied by user data showing location, contact details, and app data.
If you aren’t muttering, “So, what’s new?” by this point, you haven’t been paying attention to the news about how much these devices are overhearing and how little the vendors are worrying about the fact that it’s a privacy violation.
Over the past few months, news has emerged about human contractors working for the three major voice assistant vendors – Apple, Google and Amazon – listening to us as they transcribe audio files. As a series of whistleblowers have reported, Google Assistant, Amazon Alexa and Siri have all been capturing audio that owners never meant to record, after their devices got triggered by acoustic happenstance: word sound-alikes, say, or people chattering as they pass by in the street outside.
Accidental recordings: “technical problem” or “privacy invasion”?
It’s all done to improve the vendors’ speech recognition capabilities, and identifying mistaken recordings is part of that. However, the whistleblower said, Apple instructs staff to report accidental activations “only as a technical problem”, with no specific procedures to deal with sensitive recordings. The contractor:
We’re encouraged to hit targets, and get through work as fast as possible. The only function for reporting what you’re listening to seems to be for technical problems. There’s nothing about reporting the content.
All the big voice assistant vendors are listening to us
First, it was whistleblowers at Amazon who said that human contractors are listening to us. Next it was Google, and now Apple has made it a trifecta.
Earlier this month, Belgian broadcaster VRT News published a report that included input from three Google insiders about how the company’s contractors can hear some startling recordings from its Google Assistant voice assistant, including those made from bedrooms or doctors’ offices.
With the help of one of the whistleblowers, VRT listened to some of the clips. Its reporters managed to hear enough to discern the addresses of several Dutch and Belgian people using Google Home, in spite of the fact that some of them never said the listening trigger phrases. One couple looked surprised and uncomfortable when the news outlet played them recordings of their grandchildren.
The whistleblower who leaked the Google Assistant recordings was working as a subcontractor to Google, transcribing the audio files for subsequent use in improving its speech recognition. He or she reached out to VRT after reading about how Amazon workers are listening to what you tell Alexa, as Bloomberg reported in April.
They’re listening, but they aren’t necessarily deleting: in June of this year, Amazon confirmed – in a letter responding to a lawmaker’s request for information – that it keeps transcripts and recordings picked up by its Alexa devices forever, unless a user explicitly requests that they be deleted.
“The amount of data we’re free to look through seems quite broad”
The contractor told the Guardian that he or she went public because they were worried about how our personal information can be misused – particularly given that Apple doesn’t seem to be doing much to ensure that its contractors are going to handle this data with kid gloves:
There’s not much vetting of who works there, and the amount of data that we’re free to look through seems quite broad. It wouldn’t be difficult to identify the person that you’re listening to, especially with accidental trigger – addresses, names and so on.
Apple is subcontracting out. There’s a high turnover. It’s not like people are being encouraged to have consideration for people’s privacy, or even consider it. If there were someone with nefarious intentions, it wouldn’t be hard to identify [people on the recordings].
The contractor wants Apple to be upfront with users about humans listening in. They also want Apple to ditch those jokey, and apparently inaccurate, responses Siri gives out when somebody asks if it’s always listening.
This is the response that Apple sent to the Guardian regarding the news:
A small portion of Siri requests are analyzed to improve Siri and dictation. User requests are not associated with the user’s Apple ID. Siri responses are analyzed in secure facilities and all reviewers are under the obligation to adhere to Apple’s strict confidentiality requirements.
Apple also said that a very small, random subset – less than 1% of daily Siri activations – is used for “grading” – in other words, quality control – and the recordings used are typically only a few seconds long.
For its part, Google also says that yes, humans are listening, but not much. Earlier in the month, after its own whistleblower brouhaha, Google said that humans listen to only 0.2% of all audio clips. And those clips have been stripped of personally identifiable information (PII) as well, Google said.
You don’t need an Apple ID to figure out who’s talking
The vendors’ rationalizations are a bit weak. Google has said that the clips its human employees are listening to have been stripped of PII, while Apple says that its voice recordings aren’t associated with users’ Apple IDs.
Those aren’t impressive privacy shields, for a few reasons. First off, Big Data techniques mean that data points that are individually innocuous can be enormously powerful and revealing when aggregated. That’s what Big Data is all about.
Research done by MIT graduate students a few years back – looking at how easy it might be to re-identify people from three months of anonymized credit card transaction logs – showed that all it took was 10 known transactions to identify somebody with better than 80% accuracy. Those data points are easy enough to rack up if you grab coffee from the same shop every morning, park in the same lot every day and pick up your newspaper from the same newsstand.
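The re-identification idea is simple enough to sketch in a few lines of Python. The data and names below are invented for illustration, not drawn from the MIT study: each known data point filters the “anonymized” log, and the intersection of those filters shrinks rapidly toward a single person.

```python
# Illustrative sketch (hypothetical data): why a handful of known
# data points can re-identify someone in an anonymized log.
# Each log row is (user_token, place, day); the token replaces the
# person's name, but their pattern of visits remains.

def reidentify(log, known_points):
    """Return the set of user tokens whose records contain every
    known (place, day) point -- i.e. the candidate identities."""
    candidates = None
    for place, day in known_points:
        matches = {token for token, p, d in log if p == place and d == day}
        candidates = matches if candidates is None else candidates & matches
    return candidates or set()

# A toy anonymized log: three users with overlapping habits.
log = [
    ("u1", "coffee_shop", 1), ("u1", "parking_lot", 1), ("u1", "newsstand", 2),
    ("u2", "coffee_shop", 1), ("u2", "gym", 1),
    ("u3", "parking_lot", 1), ("u3", "newsstand", 2),
]

# An observer knows only that the target bought coffee on day 1
# and a newspaper on day 2 -- and that's already enough here.
print(reidentify(log, [("coffee_shop", 1), ("newsstand", 2)]))  # {'u1'}
```

With only two known points, the toy log already narrows three users down to one; real transaction histories are far richer, which is why a handful of points suffices at scale.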
But why get all fancy with Big Data brawn? People flat-out utter names and addresses in these accidental recordings, after all. It’s the acoustic equivalent of a silver platter for your identity.
Getting hot and sweaty with your honey while wearing your Apple Watch, or near a HomePod? Doing a drug deal, while wearing your watch? Discussing that weird skin condition with your doctor?
You might want to rethink such acoustic acrobatics when you’re around a listening device. That’s what they do: they listen. And that means that there’s some chance that humans are also listening.
It’s a tiny sliver of a chance that humans will sample your recordings, the vendors claim. It’s up to each of us to determine for ourselves just how much we like those vague odds of having our private conversations remain private.