
Google knows where your photos were taken

Street View images + machine learning = Google pinpointing almost any pic.

It’s easy enough to stalk kitty cats or track fugitives to, say, the jungles of Guatemala if you have photo EXIF data.

After all, EXIF data reveals, among other things, GPS latitude and longitude coordinates of where a photo was taken.

But really. EXIF data? Bah!

Enter Google. It don’t need no stinkin’ EXIF data.

Tobias Weyand, a computer vision specialist at Google, along with two other researchers, has trained a deep-learning machine to work out the location of almost any photo, just going by its pixels.

To be fair, the learning machine did get trained, initially, on EXIF data.

Make that a huge amount of EXIF data: after all, imagine how many images Google can wrap its tentacles around.

The team amassed 126 million of them.

The result is a new machine that significantly outperforms humans at determining the location of images – even images captured indoors, without geolocation giveaways such as palm fronds, street signs, billboards in the local language, or Niagara Falls misting away in the background.

Sites such as GeoGuessr and View From Your Window suggest that humans are pretty good at integrating clues to guess a photo’s geolocation: we draw on landmarks, weather patterns, vegetation, road markings, and architectural details to figure out at least an approximate location, and sometimes even an exact one.

When computers have tried to figure it out, they’ve typically used image retrieval methods: matching a query photo against a database of images whose locations are already known.
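To make that concrete, here’s a minimal sketch of a retrieval-style geolocator in Python, assuming the query and database photos have already been reduced to feature vectors; the function names and the plain L2 matching are illustrative, not taken from any particular system.

```python
# Minimal retrieval-style geolocation sketch: embed the query photo,
# find its nearest geotagged neighbor, return that neighbor's coordinates.
# Feature extraction is left abstract; names here are hypothetical.
import numpy as np

def retrieve_location(query_feat, db_feats, db_coords):
    """query_feat: (d,) feature vector; db_feats: (n, d); db_coords: (n, 2) lat/lon."""
    dists = np.linalg.norm(db_feats - query_feat, axis=1)  # L2 distance to each database image
    return db_coords[dists.argmin()]                       # lat/lon of the closest match
```

The catch is that retrieval needs the whole geotagged database on hand at query time.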

In contrast, Weyand and his colleagues approached it as a classification problem.

As they explain in their paper, titled PlaNet – Photo Geolocation with Convolutional Neural Networks, they first divided the earth’s surface into a grid of over 26,000 cells of varying sizes, with each cell sized according to how many images had been taken in that area.
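The paper builds its cells with Google’s S2 geometry; purely as an illustration of the adaptive idea, here’s a sketch that recursively splits any lat/lon box holding too many photos and drops boxes holding too few. All the thresholds are invented for the example.

```python
# Adaptive partitioning sketch: split dense cells, drop sparse ones.
# Plain lat/lon boxes stand in for the paper's S2 cells; thresholds
# are illustrative, not the paper's.

def partition(photos, box, max_photos=10_000, min_photos=50, depth=0, max_depth=12):
    """photos: list of (lat, lon); box: (lat_min, lat_max, lon_min, lon_max)."""
    inside = [(la, lo) for la, lo in photos
              if box[0] <= la < box[1] and box[2] <= lo < box[3]]
    if len(inside) < min_photos:                 # sparse area (ocean, poles): ignore
        return []
    if len(inside) <= max_photos or depth == max_depth:
        return [box]                             # dense and small enough: keep as one class
    la_mid = (box[0] + box[1]) / 2               # too many photos: split into quadrants
    lo_mid = (box[2] + box[3]) / 2
    cells = []
    for sub in [(box[0], la_mid, box[2], lo_mid), (box[0], la_mid, lo_mid, box[3]),
                (la_mid, box[1], box[2], lo_mid), (la_mid, box[1], lo_mid, box[3])]:
        cells += partition(inside, sub, max_photos, min_photos, depth + 1, max_depth)
    return cells

# cells = partition(geotagged_photos, (-90.0, 90.0, -180.0, 180.0))
# Each surviving cell becomes one output class for the network.
```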

Next, they trained a deep network using millions of geotagged images.

Of course, that meant that the system had a lot more images to go on when dealing with photos of cities, where scads of photos are taken. PlaNet had far fewer images to rely on for remote regions where people don’t take many photos, such as oceans or polar regions, so the team simply ignored those areas.

They created that huge database of 126 million photos with EXIF geolocations mined from all over the web.

It’s a noisy data set. The Google team excluded non-photos, such as diagrams or clip art, as well as porn.

That left all manner of photos: those taken indoors, portraits, pet photos, food snaps, and other images that don’t have geolocation cues.

The Google team then trained the powerful neural network on 91 million of these images, teaching it to work out the grid location using only the image itself: feed in a photo, and out comes a particular grid cell or a set of likely candidates.

They used the rest of the images – 34 million of them – to validate the results.
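In code terms, that setup is ordinary image classification with one class per grid cell. The paper used a large Inception-style network trained on Google’s infrastructure; in the sketch below, a stock torchvision ResNet and placeholder tensors stand in, so treat it as the shape of the idea rather than Google’s pipeline.

```python
# Geolocation as classification: one output logit per grid cell.
# A stock ResNet stands in for the paper's Inception-style network.
import torch
import torch.nn as nn
from torchvision import models

NUM_CELLS = 26_000  # one class per grid cell ("over 26,000" per the paper)

model = models.resnet50(weights=None)
model.fc = nn.Linear(model.fc.in_features, NUM_CELLS)  # swap in a cell-classification head

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

def train_step(images, cell_ids):
    """images: (B, 3, H, W) float tensor; cell_ids: (B,) grid-cell indices."""
    optimizer.zero_grad()
    logits = model(images)              # (B, NUM_CELLS)
    loss = criterion(logits, cell_ids)  # standard softmax classification loss
    loss.backward()
    optimizer.step()
    return loss.item()
```

At inference time, a softmax over those logits yields a probability for every cell – a distribution over the whole globe rather than a single pin, which is exactly the “set of likely candidates” output mentioned above.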

Of course, it’s easy to be correct when there’s a famous landmark in the photo – the Statue of Liberty, the Sydney Opera House or Big Ben, for example.

But PlaNet also learned to recognize locally typical landscapes or objects – think red phone booths – architectural styles, and even plants and animals.

To gauge how well it was doing, the team pitted PlaNet against 10 well-traveled humans in a game of GeoGuessr.

GeoGuessr presents players with a random Street View panorama and asks them to place a marker on a map at the location where the panorama was captured.

It normally allows players to pan and zoom, but not to navigate to adjacent panoramas. To keep the comparison fair, the Google team didn’t allow the humans to pan or zoom.

You’d imagine that well-traveled humans would have an advantage by knowing, for example, that Google Street View isn’t available in countries including China, thereby allowing them to narrow down their guesses.

But PlaNet, trained solely on image pixels and geolocations, still beat the humans by a decent margin: it localized 17 panoramas at the country level, for example, while the humans localized only 11.

From the paper:

We think PlaNet has an advantage over humans because it has seen many more places than any human can ever visit and has learned subtle cues of different scenes that are even hard for a well-traveled human to distinguish.

Or, to phrase it with more “in your FACE, humans!”, PlaNet is “superhuman.”

In total, PlaNet won 28 of the 50 rounds with a median localization error of 1131.7 km, while the median human localization error was 2320.75 km. [This] small-scale experiment shows that PlaNet reaches superhuman performance at the task of geolocating Street View scenes.

When it comes to geolocating photos that don’t have location cues, such as those taken indoors, the team figured out how to teach PlaNet to scrutinize photos that are part of albums.

Even if PlaNet can’t determine that a picture of, say, a toaster is in China, if it’s in an album with photos of the Great Wall, it can assume that the toaster’s in the same place.
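The paper does this with a recurrent network (an LSTM) run over the album’s photos; a much simpler way to see the idea is to pool each photo’s per-cell probabilities, so one photo with a strong cue drags its ambiguous album-mates toward the same cell. The numbers below are invented for illustration.

```python
# Naive album pooling: treat each photo as independent evidence and sum
# log-probabilities per cell. A simplification of the paper's LSTM-based
# album model, not a reimplementation of it.
import numpy as np

def album_location(photo_probs):
    """photo_probs: (n_photos, n_cells) per-photo cell probabilities.
    Returns the index of the most likely shared grid cell."""
    log_probs = np.log(photo_probs + 1e-12)     # avoid log(0)
    return int(log_probs.sum(axis=0).argmax())  # cell best supported by the whole album

# toaster = np.full(5, 0.2)                                 # indoor shot: no cues
# great_wall = np.array([0.9, 0.025, 0.025, 0.025, 0.025])  # strong cue for cell 0
# album_location(np.stack([toaster, great_wall]))           # -> 0: the toaster lands in China
```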

Do you like the idea of Google using Street View photos and its mighty search muscle to pinpoint your photos’ geolocation?

And how do you feel about Google unleashing that might onto mobile phones?

It could well happen. For a deeply powerful neural network, PlaNet is one svelte bit of code:

Our model uses only 377 MB, which even fits into the memory of a smartphone.

Google has already seen its share of privacy wreckage over Street View.

It’s long been a challenge for the company to operate Street View in countries with stronger privacy laws than the US, such as in the European Union.

Google uses technology to blur faces and license plates in Street View images, but European data protection authorities have also required that it notify the public before Street View cars start driving on European streets, and that it limit how long it keeps unblurred images of faces and license plates.

So. Imagine this: Google Street View on steroids, beefed up with machine learning and running on the fuel of all the images Google has access to, in back pockets throughout the land.

Images, mind you, that don’t need EXIF data for geolocation, but can instead be placed by crunching their pixels alone.

Please do give us your thoughts on that scenario in the comments below.

Image of Google Street View car courtesy of 1000 Words / Shutterstock.com

9 Comments

I have occasionally viewed a photo posted online and wondered where it was taken. If there is almost any shop or building with a name on it, usually a simple Google search will yield the location. If the business is, say, a worldwide fast food outlet, it may not help much, but a business with only a few outlets usually makes it easy to find the location. Kudos to those people who post pictures with generic backgrounds and no EXIF data.

It’s a shame that Google does not abide by the EU requirement for prior notification before sending its cameras down streets in the UK or elsewhere in Europe! Where I live the newest images are less than a year old, yet none of us have ever been notified by Google!

My digital cameras do not have any form of location system included, so any EXIF data will have time but not location. If your camera does have location software, then disable it for security.

Is Google trying to be god? See all, know all, the army of Google robots (internet and walking) and drones, scan every document, every photo, can’t opt out of “suggested” on FB so they can try and guide you to material they want you exposed to while restricting others. Tracking your every move (if you have Google Maps installed, even when not using it).
Google is getting way too creepy.

I volunteer to teach online safety to school kids in the UK. One of the things I try to convey is the importance of turning off geo-tagging to ensure long/lat info is not included in photo EXIF data from phones etc. This info could easily be used by a malicious actor attempting to understand places a child frequents. Have Google even given this consideration?

I think it is to the point where we are working for free and overexposing our identities to these companies making, or trying to make, profits off us. I just read that billboards will be tracking us too. We need to be able to opt out of everything, or we need European laws to protect us.

My cameras and phone do not have geolocation turned on.
A couple of years ago I happened to be walking on a street when a Google Street View car was driving towards me. No way to avoid being photographed there. Where do I go on Google to see if I am in the Street View images? And does Google blur out the faces of people on the street?

Just go to Google Maps, at the location where you were, and look at the current Street View images at the very point where you were standing. (They are updated irregularly, so even if you were in a photo a few years ago you might not be now.)

Google does claim to blur faces (and number plates, and perhaps some other details), and the process that does the blurring does mostly seem to work.

Can you opt out? Probably not…though that depends on the law in the country concerned, I suspect.

They get updated, but the old ones seem to hang around. You can just click on the date in the top corner of the map and select an older version.

They do blur faces, but if you look distinctive that doesn’t help so much. My old university campus features one very recognisable member of staff with face blurred, but anyone who was in his department will spot him easily.

Thanks, I went to the Google Street View map for the place I saw the Google car drive by. There are 8 dates for that location, 2009 – 2015. But nearly all have the same images, and none have me in them. The different dates seem not to be updates of the images. Maybe Google doesn’t update images if, when it drives a street, it sees there are no changes to buildings etc.

