License plates. Image courtesy of Shutterstock.
Naked Security

Entire Oakland Police license plate reader data set handed to journalist

The entire LPR dataset of the Oakland Police Department (OPD) included more than 4.6 million reads of over 1.1 million unique plates captured in just over 3 years.

Howard Matis, a physicist who works at the Lawrence Berkeley National Laboratory in California, didn’t know that his local police department had license plate readers (LPRs).

But even if they did have LPRs (they do: they have 33 automated units), he wasn’t particularly worried about police capturing his movements.

Until, that is, he gave permission for Ars Technica’s Cyrus Farivar to get data about his own car and its movements around town.

The data is, after all, accessible via public records law.

Ars obtained the entire LPR dataset of the Oakland Police Department (OPD), including more than 4.6 million reads of over 1.1 million unique plates captured in just over 3 years.

Then, to make sense out of data originally provided in 18 Excel spreadsheets, each containing hundreds of thousands of lines, Ars hired a data visualization specialist who created a simple tool that allowed the publication to search any given plate and plot its locations on a map.
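
The kind of lookup such a tool performs can be sketched in a few lines of Python. This is only an illustration, not the tool Ars commissioned: the field names (`plate`, `timestamp`, `lat`, `lon`) and the sample rows are invented here, and the real spreadsheets may be laid out quite differently.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows standing in for one exported spreadsheet.
# The column names are assumptions, not the real OPD schema.
SAMPLE_CSV = """plate,timestamp,lat,lon
ABC123,2013-05-01 08:02:11,37.8044,-122.2712
XYZ789,2013-05-01 08:03:40,37.8120,-122.2600
ABC123,2013-06-14 18:45:02,37.8049,-122.2708
"""

def index_reads(csv_file):
    """Group every read by plate so a single plate can be looked up quickly."""
    reads_by_plate = defaultdict(list)
    for row in csv.DictReader(csv_file):
        point = (row["timestamp"], float(row["lat"]), float(row["lon"]))
        reads_by_plate[row["plate"]].append(point)
    return reads_by_plate

def sightings(reads_by_plate, plate):
    """Return the plate's reads in time order, ready to plot on a map."""
    return sorted(reads_by_plate.get(plate, []))

reads = index_reads(io.StringIO(SAMPLE_CSV))
for timestamp, lat, lon in sightings(reads, "ABC123"):
    print(timestamp, lat, lon)
```

Once the reads are grouped by plate, a handful of timestamped coordinates per car is all it takes to start guessing at home and work addresses.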

That’s when things got a bit more worrisome for Mr. Matis.

After Ars ran his plate, the journalists were able to show the physicist a map of the five instances where a camera had captured his car, guessing (correctly) that they were near where he lived or worked: places where, Matis confirmed, he and his wife go “all the time”.

He hadn’t been worried about the police having his movement data, but the thought of it being stored, for an indefinite period of time, even though he wasn’t being investigated, and then having it handed over to anybody who simply asked, well, that’s where the creep factor came in, he told Ars:

If anyone can get this information, that’s getting into Big Brother. If I was trying to look at what my spouse is doing, [I could]. To me, that is something that is kind of scary. Why do they allow people to release this without a law enforcement reason? Searching it or accessing the information should require a warrant.

This is the letter that he immediately sent to his city council member:

Do you know why Oakland is spying on me and my wife? We haven't done anything too radical or illegal.

I gave my license plate to a journalist and he found my wife's and my car in their database. One of the locations is right near our house.

The astounding thing about this information is that anyone, and I mean anyone, can get this information. Some of the information is more than two years old.

I can see lawyers using this information for lawsuits. I can check where my wife is located. Car companies can see my habits. Insurance companies can check up on their clients. We have entered the world of 1984 with the difference that anyone can get the information.

Matis’s concern is justified.

Many people, when asked how they feel about surveillance, shrug it off, claiming that they don’t have anything to hide.

They’re wrong. We all have something to hide – not because we’re guilty of crimes, but because we deserve data privacy.

Such privacy is crucial for a number of reasons. For one thing, it shields us from persecution, whether it concerns our race, religion, gender, political orientation, or any other of a vast number of personal attributes.

Can our geolocation reveal such things about us?

Absolutely. Catherine Crump, a law professor at the University of California, Berkeley, made this point when talking to Ars:

Where someone goes can reveal a great deal about how he chooses to live his life. Do they park regularly outside the Lighthouse Mosque during times of worship? They’re probably Muslim. Can a car be found outside Beer Revolution a great number of times? May be a craft beer enthusiast - although possibly with a drinking problem.

As Naked Security often stresses in our reporting about Big Data, we have to stop thinking about data sets in terms of individual records and start thinking about them in terms of huge networks of possible relationships that exist between those records.

As Paul Ducklin recently pointed out, license plate readers are a good example of how seemingly innocuous pieces of discrete data – i.e., where your license plate was and when – turn into something entirely different when amassed in huge data sets and cross-correlated, given that your plate number stays constant while your location changes.

There are properties and capabilities that emerge from large collections of data that don’t exist in the same data at smaller scales (it’s why we had to invent a term – Big Data – to describe it).

While one data point about a license plate could be – and has been – used to do things such as track fugitives or solve a gang-related homicide, there’s no saying what the government can do with massive amounts of correlated data spanning years of collection, the vast majority of which has been gathered from innocent people who aren’t breaking any laws.

As a group of MIT graduate students outlined in this paper, even supposedly vague/imprecise/anonymised data can tell you who’s who once your data set gets big enough.

In fact, anonymity fell off the data like tissue paper in a rainstorm when the data sets got big enough, as Paul writes:

When the authors knew the details of any four transactions you'd made during the three-month data period, as, for example, would any shop that you had visited four times, they had a chance lower than 15% of guessing which anonymous tag in the file was yours.

But with 10 known transactions, something you might easily rack up with multiple retailers due to daily habits at a coffee shop, a parking lot, or a newsagent, their chance of pinpointing you rose above 80%.
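
The effect Paul describes can be illustrated with a toy example. The sketch below is not the MIT authors' method or data: the tags, shops and days are invented, and `candidates` is a hypothetical helper that simply counts how many anonymous tags remain consistent with the transactions an observer already knows.

```python
# Toy "anonymised" data set: each tag maps to a set of (shop, day) transactions.
# All values are invented for illustration.
TRACES = {
    "tag_a": {("coffee", 1), ("parking", 1), ("news", 2), ("coffee", 3)},
    "tag_b": {("coffee", 1), ("parking", 2), ("news", 2), ("bakery", 3)},
    "tag_c": {("coffee", 1), ("parking", 1), ("bakery", 2), ("coffee", 4)},
}

def candidates(traces, known):
    """Return the tags whose trace contains every known transaction."""
    known = set(known)
    return sorted(tag for tag, trace in traces.items() if known <= trace)

# One known transaction leaves everyone in the file as a candidate...
print(candidates(TRACES, [("coffee", 1)]))
# ...a second narrows the field...
print(candidates(TRACES, [("coffee", 1), ("parking", 1)]))
# ...and a third singles out one tag.
print(candidates(TRACES, [("coffee", 1), ("parking", 1), ("news", 2)]))
```

Each extra known transaction can only shrink the candidate list, which is why a modest number of known points can be enough to pick one person out of a large file.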

Oakland, it’s time we had a talk.

For a city in laid-back California, you’re pretty jittery. It looks a bit like your data collection habit is getting out of control.

As it is, you’ve been one of the biggest surveillance hotspots for years, in a country where cities are increasingly gobbling up data on residents and ignoring privacy.

You’re gathering it. You’re retaining it. You’re passing it out to journalists.

To echo Mr. Matis: why are you spying on him and his wife?

Why are you spying on all your other citizens, come to think of it?

It can’t be for solving crimes, since, as Ars reported, your “hit rate” of reading license plates of people who are actually under suspicion is at 0.16 percent.

It’s time to rethink your ways. You, and much of law enforcement.

Image of license plates courtesy of Shutterstock.

I disagree with the author telling me I have something to hide. I will be the judge of that, not him.

I would also be concerned about the reliability of this information. I doubt that the system is sophisticated enough to tell a real license plate from a laminated printout. If someone wanted to frame someone else, they could take a picture of their license plate, print it out, laminate it and put it on a vehicle left somewhere embarrassing or risky. Therefore, this information could not be relied upon.

While the author’s statement could have been worded a little better, taken in context it is accurate. I find it appalling that we would allow data like this to be freely handed out. Imagine, if you will, someone using that data to track you for some nefarious reason or simply to find something to use against you. This data can be manipulated and used to possibly get you fired from your job. I can visualize many ways to use data like that. Data is a tool and has its uses, but like any tool those uses can be for good or bad. No, the author should have said that you have something to hide: your data.

There is nothing preventing people from doing that now by following you, seeing everywhere you go, taking video and then turning it over. If you want privacy, then do not use technology and live off the land. As we advance, ways to track people will become ever easier to acquire, such as personal drones flying overhead and under the altitude limit set by the FAA.

What you say is correct, but with drones you have recourse should one invade your private space. The debate is not what data is collected, it is how it is used and who has access. Forgoing technology is not a realistic option because it is forced on us by the very fact that we have satellites circling above us and have had for decades. Even everyday technology is so abundant that we cannot help but trip over it. You might hide in a cave but eventually you have to come out.

Chance,

So how far would you extend your private space? Would it be 10 feet over your head or 25 miles over your head? Would it be a 1.5 foot radius from your waistline, or would it be a 40 mile radius? Where does this private space get defined internationally, so as to avoid such actions as being followed by drones, and by people for that matter? Would this definition of private space ever be agreed? I think the answer is that it never would be, as humanity would then be in complete chaos.

4.6 million reads of 1.1 million unique licence/license plates comes to a mean of just over 4 reads per plate, in just over three years. So a random plate is seen about once a year.

About as often as Santa’s sleigh.
