
AI wordsmith too dangerous to be released… has been released

The text-generating AI has only been released in neutered forms until now, for fear it would be used to mass-produce fake news and spam.

A text-generating artificial intelligence (AI) algorithm that its creators initially deemed too dangerous to release – given its ability to churn out fake news, spam and misinformation after feasting on a mere headline – has been unleashed.

So far, so good, says the research lab, OpenAI. In a blog post last week, the lab said its researchers have seen “no strong evidence of misuse” of the machine-learning language model, which is called GPT-2… at least, not yet. From the post:

While we’ve seen some discussion around GPT-2’s potential to augment high-volume/low-yield operations like spam and phishing, we haven’t seen evidence of writing code, documentation, or instances of misuse […] We acknowledge that we cannot be aware of all threats, and that motivated actors can replicate language models without model release.

Exactly how convincing is the output? Well, you can try it out yourself at TalkToTransformer.com, a web version of the model that lets you enter your own prompts.
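If you’d rather experiment offline, the released GPT-2 weights are also publicly downloadable. As a minimal sketch (not from the article, and assuming you have the Hugging Face transformers library and PyTorch installed), the small public checkpoint can be prompted locally like this:

# Minimal sketch (not from the article): prompt the small public GPT-2
# checkpoint locally. Assumes the Hugging Face "transformers" package and
# PyTorch are installed.
from transformers import pipeline

# "gpt2" is the small public checkpoint; larger ones such as "gpt2-medium" also exist.
generator = pipeline("text-generation", model="gpt2")

prompt = "An IT project manager has pleaded guilty to accessing the email account"
outputs = generator(prompt, max_length=80, do_sample=True, num_return_sequences=1)

print(outputs[0]["generated_text"])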

On the “Whew – this thing’s pretty stupid! Our jobs as writers are SAFE!” side, we have what The Verge got out of GPT-2. The publication fed it six prompts that produced outputs ranging from jumbled to ludicrous, including one that said that Gimli, the dwarf from Lord of the Rings, was a tall man who was also a dwarf, and one that concluded that Gregor Samsa had a terrible struggle with a vermin that was more like an ape.

On the more worrying side is what GPT-2 did when I fed it part of a Naked Security article about an IT services pro who hacked his former client’s email.

The prompt I fed it is the first paragraph below, followed by what it came up with to complete the text:

An IT project manager has pleaded guilty to accessing the email account of a former client’s CEO, said reports this week.

The executive allegedly received an email containing inappropriate content from a business partner in 2011, with the executive’s wife receiving a text message from the executive, according to Reuters.

A source told Reuters that the IT manager, identified in the reports as ‘Y’, had access to the executive’s email account for around six years.

Y pleaded guilty to a misdemeanor charge of unauthorized access to a protected computer, according to Reuters.

The criminal act came to light after the former business partner filed a lawsuit against Y, claiming that he had sexually harassed her, according to the reports.

The former business partner, identified in the reports as ‘J’, told investigators that she first noticed inappropriate messages between the two of them in April 2011, while looking for a job.

“A source told Reuters”?! “Sexual harassment”?! An algorithm doesn’t have fleshy tubes from which to pull facts, but it surely was pulling them from somewhere outside of reality. Still, if somebody wasn’t familiar with the facts of the human-generated, bona fide news story, the text has no obvious give-aways that it’s full of misinformation, and you can see where the model could generate entire fake news stories after feeding on a mere headline.

A second test, with my prompt again at the start, puffed out a convincing story of woe and a call for donations to the Blackfoot Disaster Relief Fund:

17,000 people have been caught in a flash flood in ʻOjibwe’s Blackfoot Country on Thursday night.

The Ojibwe Nation’s chief and council are asking for donations to the Blackfoot Disaster Relief Fund to assist the people who have been displaced by the flooding.

More than 100 Ojibwe tribal members and more than 100 from other First Nations in Northern Manitoba were affected by the flooding…

OpenAI’s partners at Cornell University surveyed people to determine how convincing GPT-2’s text is. It earned a “credibility score” as high as 6.91 out of 10.

Other third-party research found that extremist groups could use GPT-2 to create “synthetic propaganda” by fine-tuning the model on four extremist ideologies. That hasn’t yet come to pass, OpenAI has found. Its own researchers have created automatic systems that spot GPT-2 output with roughly 95% accuracy, but the lab says that’s not good enough for standalone detection: any system used to automatically spot fake text would need to be paired with “metadata-based approaches, human judgment, and public education.”
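To give a rough feel for how automated spotting can work – and this is only an illustrative sketch, not OpenAI’s own detector – one crude signal is how “surprising” a passage looks to GPT-2 itself: text the model wrote tends to have lower perplexity than human prose. Again assuming the Hugging Face transformers library and PyTorch:

# Illustrative sketch only - not OpenAI's detector. Scores how "surprising"
# a passage is to GPT-2 (its perplexity); model-written text often scores
# lower than human-written text.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text):
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing the input ids as labels makes the model return the average
        # cross-entropy loss; exp(loss) is the perplexity.
        loss = model(enc.input_ids, labels=enc.input_ids).loss
    return torch.exp(loss).item()

# Lower perplexity means the text looks more "expected" to GPT-2.
print(perplexity("A source told Reuters that the IT manager had access to the account for around six years."))

A heuristic like this is easy to fool, which is why OpenAI pairs its classifiers with the metadata-based and human-judgment approaches mentioned above.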

OpenAI first announced its “amazing breakthrough in language understanding” in February 2019, but it said that it would limit its full release, given its worry that “it may fall into the wrong hands.” We’ve seen a few examples of the “wrong hands” that AI has fallen into, in the form of deepfake revenge porn and scammers who deepfaked a CEO’s voice in order to talk an underling into a $243K transfer.

The decision to withhold the full model until last week stirred up controversy in the AI community, where OpenAI was criticized for stoking hysteria about AI and subverting the typical open nature of the research, in which code, data and models are widely shared and discussed.

The decision also made OpenAI the object of jibes from AI researchers.

What do you think? Was releasing this tool a good idea or a bad one?

8 Comments

The same thing I found out on an online dating site: you need trusted third-party verification for almost anything online. Watermarking might work if some controls could be worked out. Of course, there are digital signatures, but most email systems don’t use them by default; I think that could be fixed, but now it would require a “sea change”. As for OpenAI releasing the code: I wouldn’t have done it, but I realize somebody would build the required machinery anyway. Perhaps it’s better to be ahead of the curve, to show how “easy” it is and alert people (of course, “people” have to be alert) (:(:
Sigh…
Ray


If you do not do it, sooner or later someone else will release a similar tool. Too many trends and technologies are converging to expect that something like this will not be built and released – it could be sooner rather than later. I would rather also build a fake-news detection tool to go with it. In the right hands this could be a great help as well.


I’m surprised fake news is the first use of this tech. Usually the first use of anything in the tech world is porn. Hmmm, another business venture: get some slave bots to take 1-900 phone calls. The good bots live; the bad ones… we treat like Epstein.


I tried it out with the “17,000 people prompt”. Scary.

17,000 people have been caught in a flash flood in the south of the country, an official said, as rescuers struggled to reach some areas.

About 100,000 people have already fled their homes in the last week after the region’s second major flood in less than two years hit several provinces.

“We are getting people to safety, but there are still hundreds of thousands who need to be evacuated,” local government official Ching Liao said.

He said there were still several rivers with high levels and riverside areas still flooded.

The government had set up shelters in more than 10 counties to shelter some residents from floodwaters, he said.

He said many people had left their homes for government emergency camps where they will be fed and provided shelter.

Rescue workers are working to rescue thousands of people stranded in high-risk areas, including the western provinces of Henan and Sichuan.

Hundreds of boats and helicopters have been dispatched to search


What is the point of this software? Machine learning? Machine learning to lie – why? What possible science does this provide? This is irritating.

