Naked Security

LinkedIn can’t block public profile data scraping, court rules

The long-awaited decision found that automated scraping of publicly accessible data likely doesn't violate the CFAA.

An appeals court has told LinkedIn to back off – no more interfering with a third-party data-analytics startup’s use of the publicly available data of LinkedIn’s users.

The court’s decision, which affirmed that of a lower court, has been eagerly awaited, both for what some legal scholars consider to be the case’s important constitutional and economic issues and for what critics believe could be a chilling effect on digital competition.

Constitutional scholar and Harvard law professor Laurence Tribe, for one, has weighed in on this issue to offer advice to the data-scraping startup in question, hiQ Labs.

At issue, Tribe has said, is that social media is the modern equivalent of the public square. He’s called LinkedIn’s attempts to stop hiQ from using its users’ publicly available data “a serious challenge to free expression in the modern world.”

Freedom of speech is not just about flag-burning. It’s about how you use information in the digital economy. Data is the new form of capital in creating products and services.

The decision was applauded for providing clarity around the scope of the nation’s major hacking law, the Computer Fraud and Abuse Act (CFAA). The Electronic Frontier Foundation (EFF), for one, said that it should come as a relief to researchers, journalists, and companies…

who have had reason to fear cease and desist letters threatening liability simply for accessing publicly available information in a way that publishers object to.

The case

Back in 2016, hiQ, a San Francisco startup, was marketing two products, both of which depend on whatever data LinkedIn’s 500 million members have made public: Keeper, which identifies employees who might be ripe for being recruited away, and Skills Mapper, which summarizes an employee’s skills.

hiQ wasn’t hacking anything. It was just grabbing the kind of stuff you or I could get on LinkedIn without having to log in. All you need is a browser and a search engine to find the data that hiQ sucks up, digests, analyzes and sells to companies that want a heads-up when their pivotal employees might have one foot out the door, or that are trying to figure out how their workforce needs to be bolstered or trained.

In 2016, LinkedIn decided to offer a similar service, at which point it sent hiQ and others in the sector cease and desist letters and started blocking the bots that were reading its pages.

LinkedIn’s case has two main arguments:

  1. hiQ is scraping data that belongs to LinkedIn and threatens its members’ privacy; and
  2. It does this with bot-scraping programs that have negative effects.

LinkedIn alleged that hiQ was violating the CFAA, as well as the Digital Millennium Copyright Act (DMCA). It also alleged that hiQ was conducting unfair business practices under California state law. In the letter to hiQ, LinkedIn noted that it had used technology to block the startup from accessing its data.

On Monday, a three-judge panel nixed LinkedIn’s claims about the alleged CFAA violation and told LinkedIn to stop blocking the scraping. The judges wrote that data scraping of publicly available information does not constitute a violation of the CFAA.

CFAA doesn’t apply to public data

The court found that the CFAA simply doesn’t apply to information that’s available to the general public, as is LinkedIn users’ data. The court pointed out that LinkedIn’s privacy policy clearly states that…

…’any information you put on your profile and any content you post on LinkedIn may be seen by others’ and instructs users not to ‘post or add personal data to your profile that you would not want to be public.’

From the get-go, the CFAA was enacted not to protect such publicly available data, but rather to prevent “intentional intrusion onto someone else’s computer – specifically, computer hacking,” the decision reads.

The three judges referenced a 1984 House Report on the CFAA that explicitly compared the conduct prohibited by section 1030 of the existing computer fraud law (the CFAA was enacted in 1986 as an amendment to that law) to forced entry. From that 1984 report:

It is noteworthy that section 1030 deals with an ‘unauthorized access’ concept of computer fraud rather than the mere use of a computer. Thus, the conduct prohibited is analogous to that of ‘breaking and entering’.

The court pointed out that when the CFAA was first enacted, it only applied to certain categories of computers that had military, financial, or other sensitive data:

None of the computers to which the CFAA initially applied were accessible to the general public. Affirmative authorization of some kind was presumptively required.

In 1996, the law was extended to cover more computers. At that time, a Senate report said that the goal was to “increase protection for the privacy and confidentiality of computer information.”

Thus, the 9th Circuit, which is based in California, reasoned that “the prohibition on unauthorized access is properly understood to apply only to private information – information delineated as private through use of a permission requirement of some sort.”

But hiQ is only scraping information from public LinkedIn profiles. It’s the same data any member of the public is authorized to access.

LinkedIn argued that it could selectively revoke that authorization using a cease-and-desist letter, but the 9th Circuit wasn’t persuaded. The court said that ignoring a cease-and-desist letter isn’t the same as hacking into a private computer system.

Besides finding that hiQ likely hasn’t violated the CFAA, Monday’s ruling also upheld a lower court order that banned LinkedIn from interfering with hiQ’s scraping activities during the course of the litigation. The court recognized that if hiQ can’t scrape LinkedIn data, it has nothing to sell to its clients and will very likely go belly up before it has a chance to finish the case.

Next steps?

The EFF, which had filed an amicus brief along with the search engine DuckDuckGo and the Internet Archive, said that Monday’s decision is an “important step” in limiting use of the CFAA “to intimidate researchers with the legalese of cease and desist letters.”

But the likes of LinkedIn could still use other legal theories to stifle competition, EFF said, since the Ninth Circuit “sadly left the door open to other claims, such as trespass to chattels or even copyright infringement.”

This isn’t the end of the story, the EFF predicts. The CFAA is still full of muddy language, and the issues raised in this litigation could still wind their way on up to the Supreme Court:

Even with this ruling, the CFAA is subject to multiple conflicting interpretations across the federal circuits, making it likely that the Supreme Court will eventually be forced to resolve the meaning of key terms like ‘without authorization.’
