Cybercriminals can’t agree on GPTs

Sophos X-Ops

12 months ago

A significant amount of media coverage followed the news that large language models (LLMs) intended for use by cybercriminals – including WormGPT and FraudGPT – were available for sale on underground forums. Many commenters expressed fears that such models would enable threat actors to create “mutating malware” and were part of a “frenzy” of related activity in underground forums.

The dual-use aspect of LLMs is undoubtedly a concern, and there is no doubt that threat actors will seek to leverage them for their own ends. Tools like WormGPT are an early indication of this (although the WormGPT developers have now shut the project down, ostensibly because they grew alarmed at the amount of media attention they received). What’s less clear is how threat actors more generally think about such tools, and what they’re actually using them for beyond a few publicly-reported incidents.

Sophos X-Ops decided to investigate LLM-related discussions and opinions on a selection of criminal forums, to get a better understanding of the current state of play, and to explore what the threat actors themselves actually think about the opportunities – and risks – posed by LLMs. We trawled through four prominent forums and marketplaces, looking specifically at what threat actors are using LLMs for; their perceptions of them; and their thoughts about tools like WormGPT.

A brief summary of our findings:

We found multiple GPT-derivatives claiming to offer capabilities similar to WormGPT and FraudGPT – including EvilGPT, DarkGPT, PentesterGPT, and XXXGPT. However, we also noted skepticism about some of these, including allegations that they’re scams (not unheard of on criminal forums)
In general, there is a lot of skepticism about tools like ChatGPT – including arguments that it is overrated, overhyped, redundant, and unsuitable for generating malware
Threat actors also have cybercrime-specific concerns about LLM-generated code, including operational security worries and AV/EDR detection
A lot of posts focus on jailbreaks (which also appear with regularity on social media and legitimate blogs) and compromised ChatGPT accounts
Real-world applications remain aspirational for the most part, and are generally limited to social engineering attacks, or tangential security-related tasks
We found only a few examples of threat actors using LLMs to generate malware and attack tools, and that was only in a proof-of-concept context
However, others are using it effectively for other work, such as mundane coding tasks
Unsurprisingly, unskilled ‘script kiddies’ are interested in using GPTs to generate malware, but are – again unsurprisingly – often unable to bypass prompt restrictions, or to understand errors in the resulting code
Some threat actors are using LLMs to enhance the forums they frequent, by creating chatbots and auto-responses – with varying levels of success – while others are using it to develop redundant or superfluous tools
We also noted examples of AI-related ‘thought leadership’ on the forums, suggesting that threat actors are wrestling with the same logistical, philosophical, and ethical questions as everyone else when it comes to this technology

While writing this article, which is based on our own independent research, we became aware that Trend Micro had recently published their own research on this topic. Our research in some areas confirms and validates some of their findings.

The forums

We focused on four forums for this research:

Exploit: a prominent Russian-language forum which prioritizes Access-a-a-Service (AaaS) listings, but also enables buying and selling of other illicit content (including malware, data leaks, infostealer logs, and credentials) and broader discussions about various cybercrime topics
XSS: a prominent Russian-language forum. Like Exploit, it’s well-established, and also hosts both a marketplace and wider discussions and initiatives
Breach Forums: Now in its second iteration, this English-language forum replaced RaidForums after its seizure in 2022; the first version of Breach Forums was similarly shut down in 2023. Breach Forums specializes in data leaks, including databases, credentials, and personal data
Hackforums: a long-running English-language forum which has a reputation for being populated by script kiddies, although some of its users have previously been linked to high-profile malware and incidents

A caveat before we begin: the opinions discussed here cannot be considered as representative of all threat actors’ attitudes and beliefs, and don’t come from qualitative surveys or interviews. Instead, this research should be considered as an exploratory assessment of LLM-related discussions and content as they currently appear on the above forums.

Digging in

One of the first things we noticed is that AI is not exactly a hot topic on any of the forums we looked at. On two of the forums, there were fewer than 100 posts on the subject – but almost 1,000 posts about cryptocurrencies across a comparative period.

While we’d want to do further research before drawing any firm conclusions about this discrepancy, the numbers suggest that there hasn’t been an explosion in LLM-related discussions in the forums – at least not to the extent that there has been on, say, LinkedIn. That could be because many cybercriminals see generative AI as still being in its infancy (at least compared to cryptocurrencies, which have a real-world relevance to them as an established and relatively mature technology). And, unlike some LinkedIn users, threat actors have little to gain from speculating about the implications of a nascent technology.

Of course, we only looked at the four forums mentioned above, and it’s entirely possible that more active discussions around LLMs are happening in other, less visible channels.

Let me outta here

As Trend Micro also noted in its report, we found that a significant amount of LLM-related posts on the forums focus on jailbreaks – either those from other sources, or jailbreaks shared by forum members (a ‘jailbreak’ in this context is a means to trick an LLM into bypassing its own self-censorship when it comes to returning harmful, illegal, or inappropriate responses).

Figure 1: A user shares details of the publicly-known ‘DAN’ jailbreak

Figure 2: A Breach Forums user shares details of an unsuccessful jailbreak attempt

Figure 3: A forum user shares a jailbreak tactic

While this may appear concerning, jailbreaks are also publicly and widely shared on the internet, including in social media posts; dedicated websites containing collections of jailbreaks; subreddits devoted to the topic; and YouTube videos.

There is an argument that threat actors may – by dint of their experience and skills – be in a better position than most to develop novel jailbreaks, but we observed little evidence of this.

Accounts for sale

More commonly – and, unsurprisingly, especially on Breach Forums – we noted that many of the LLM-related posts were actually compromised ChatGPT accounts for sale.

Figure 4: A selection of ChatGPT accounts for sale on Breach Forums

There’s little of interest to discuss here, only that threat actors are obviously seizing the opportunity to compromise and sell accounts on new platforms. What’s less clear is what the target audience would be for these accounts, and what a buyer would seek to do with a stolen ChatGPT account. Potentially they could access previous queries and obtain sensitive information, or use the access to run their own queries, or check for password reuse.

Jumping on the ‘BandwagonGPT’

Of more interest was our discovery that WormGPT and FraudGPT are not the only players in town – a discovery which Trend Micro also noted in its report. During our research, we observed eight other models either offered for sale on forums as a service, or developed elsewhere and shared with forum users.

XXXGPT
Evil-GPT
WolfGPT
BlackHatGPT
DarkGPT
HackBot
PentesterGPT
PrivateGPT

However, we noted some mixed reactions to these tools. Some users were very keen to trial or purchase them, but many were doubtful about their capabilities and novelty. And some were outright hostile, accusing the tools’ developers of being scammers.

WormGPT

WormGPT, released in June 2023, was a private chatbot service purportedly based on LLM GPT-J 6B, and offered as a commercial service on several criminal forums. As with many cybercrime services and tools, its launch was accompanied by a slick promotional campaign, including posters and examples.

Figure 5: WormGPT advertised by one of its developers in July 2023

Figure 6: Examples of WormGPT queries and responses, featured in promotional material by its developers

The extent to which WormGPT facilitated any real-world attacks is unknown. However, the project received a considerable amount of media attention, which perhaps led its developers to first restrict some of the subject matter available to users (including business email compromises and carding), and then to shut down completely in August 2023.

Figure 7: One of the WormGPT developers announces changes to the project in early August

Figure 8: The post announcing the closure of the WormGPT project, one day later

In the announcement marking the end of WormGPT, the developer specifically calls out the media attention they received as a key reason for deciding to end the project. They also note that: “At the end of the day, WormGPT is nothing more than an unrestricted ChatGPT. Anyone on the internet can employ a well-known jailbreak technique and achieve the same, if not better, results.”

While some users expressed regrets over WormGPT’s closure, others were irritated. One Hackforums user noted that their licence had stopped working, and users on both Hackforums and XSS alleged that the whole thing had been a scam.

Figure 9: A Hackforums user alleges that WormGPT was a scam

Figure 10: An XSS user makes the same allegation. Note the original comment, which suggests that since the project has received widespread media attention, it is best avoided

FraudGPT

The same accusation has also been levelled at FraudGPT, and others have questioned its stated capabilities. For example, one Hackforums user asked whether the claim that FraudGPT can generate “a range of malware that antivirus software cannot detect” was accurate. A fellow user provided them with an informed opinion:

Figure 11: A Hackforums user conveys some skepticism about the efficacy of GPTs and LLMs

This attitude seems to be prevalent when it comes to malicious GPT services, as we’ll see shortly.

XXXGPT

The misleadingly-titled XXXGPT was announced on XSS in July 2023. Like WormGPT, it arrived with some fanfare, including promotional posters, and claimed to provide “a revolutionary service that offers personalized bot AI customization…with no censorship or restrictions” for $90 a month.

Figure 12: One of several promotional posters for XXXGPT, complete with a spelling mistake (‘BART’ instead of ‘BARD’)

However, the announcement met with some criticism. One user asked what exactly was being sold, questioning whether it was just a jailbroken prompt.

Figure 13: A user queries whether XXXGPT is essentially just a prompt

Another user, testing the XXXGPT demo, found that it still returned censored responses.

Figure 14: A user is unable to get the XXXGPT demo to generate malware

The current status of the project is unclear.

Evil-GPT

Evil-GPT was announced on Breach Forums in August 2023, advertised explicitly as an alternative to WormGPT at a much lower cost of $10. Unlike WormGPT and XXXGPT, there were no alluring graphics or feature lists, only a screenshot of an example query.

Users responded positively to the announcement, with one noting that while it “is not accurate for blackhat questions nor coding complex malware…[it] could be worth [it] to someone to play around.”

Figure 15: A Hackforums moderator gives a favourable review of Evil-GPT

From what was advertised, and from the user reviews, we assess that Evil-GPT is targeting users seeking a ‘budget-friendly’ option – perhaps limited in capability compared to some other malicious GPT services, but a “cool toy.”

Miscellaneous GPT derivatives

In addition to WormGPT, FraudGPT, XXXGPT, and Evil-GPT, we also observed several derivative services which don’t appear to have received much attention, either positive or negative.

WolfGPT

WolfGPT was shared on XSS by a user who claims it is a Python-based tool which can “encrypt malware and create phishing texts…a competitor to WormGPT and ChatGPT.” The tool appears to be a GitHub repository, although there is no documentation for it. In its article, Trend Micro notes that WolfGPT was also advertised on a Telegram channel, and that the GitHub code appears to be a Python wrapper for ChatGPT’s AI.

Figure 16: The WolfGPT GitHub repository

BlackHatGPT

This tool, announced on Hackforums, claims to be an uncensored ChatGPT.

Figure 17: The announcement of BlackHatGPT on Hackforums

DarkGPT

Another project by a Hackforums user, DarkGPT again claims to be an uncensored alternative to ChatGPT. Interestingly, the user claims DarkGPT offers anonymity, although it’s not clear how that is achieved.

HackBot

Like WolfGPT, HackBot is a GitHub repository, which a user shared with the Breach Forums community. Unlike some of the other services described above, HackBot does not present itself as an explicitly malicious service, and instead is purportedly aimed at security researchers and penetration testers.

Figure 18: A description of the HackBot project on Breach Forums

PentesterGPT

We also observed another security-themed GPT service, PentesterGPT.

Figure 19: PentesterGPT shared with Breach Forums users

PrivateGPT

We only observed PrivateGPT mentioned briefly on Hackforums, but it claims to be an offline LLM. A Hackforums user expressed interest in gathering “hacking resources” to use with it. There is no indication that PrivateGPT is intended to be used for malicious purposes.

Figure 20: A Hackforums user suggests some collaboration on a repository to use with PrivateGPT

Overall, while we saw more GPT services than we anticipated, and some interest and enthusiasm from users, we also noted that many users reacted to them with indifference or hostility.

Figure 21: A Hackforums user warns others about paying for “basic gpt jailbreaks”

Applications

In addition to derivatives of ChatGPT, we also wanted to explore how threat actors are using, or hoping to use, LLMs – and found, once again, a mixed bag.

Ideas and aspirations

On forums frequented by more sophisticated, professionalized threat actors – particularly Exploit – we noted a higher incidence of AI-related aspirational discussions, where users were interested in exploring feasibility, ideas, and potential future applications.

Figure 22: An Exploit user opens a thread “to share ideas”

We saw little evidence of Exploit or XSS users trying to generate malware using AI (although we did see a couple of attack tools, discussed in the next section).

Figure 23: An Exploit user expresses interest in the feasibility of emulating voices for social engineering purposes

On the lower-end forums – Breach Forums and Hackforums – this dynamic was effectively reversed, with little evidence of aspirational thinking, and more evidence of hands-on experiments, proof-of-concepts, and scripts. This may suggest that more skilled threat actors are of the opinion that LLMs are still in their infancy, at least when it comes to practical applications to cybercrime, and so are more focused on potential future applications. Conversely, less skilled threat actors may be attempting to accomplish things with the technology as it exists now, despite its limitations.

Malware

On Breach Forums and Hackforums, we observed several instances of users sharing code they had generated using AI, including RATs, keyloggers, and infostealers.

Figure 24: A Hackforums user claims to have created a PowerShell keylogger, with persistence and a UAC bypass, which was undetected on VirusTotal

Figure 25: Another Hackforums user was not able to bypass ChatGPT’s restrictions, so instead planned to write malware “with baby steps”, starting with a script to log an IP address to a text file

Some of these attempts, however, were met with skepticism.

Figure 26: A Hackforums user points out that users could just google things instead of using ChatGPT

Figure 27: An Exploit user expresses concern that AI-generated code may be easier to detect

None of the AI-generated malware – virtually all of it in Python, for reasons that aren’t clear – we observed on Breach Forums or Hackforums appears to be novel or sophisticated. That’s not to say that it isn’t possible to create sophisticated malware, but we saw no evidence of it on the posts we examined.

Tools

We did, however, note that some forum users are exploring the possibility of using LLMs to develop attack tools rather than malware. On Exploit, for example, we saw a user sharing a mass RDP bruteforce script.

Figure 28: Part of a mass RDP bruteforcer tool shared on Exploit

Over on Hackforums, a user shared a script to summarize bug bounty write-ups with ChatGPT.

Figure 29: A Hackforums user shares their script for summarizing bug bounty writeups

On occasion, we noticed that some users appear to be scraping the barrel somewhat when it comes to finding applications for ChatGPT. The user who shared the bug bounty summarizer script above, for example, also shared a script which does the following:

Ask ChatGPT a question
If the response begins with “As an AI language model…” then search on Google, using the question as a search query
Copy the Google results
Ask ChatGPT the same question, stipulating that the answer should come from the scraped Google results
If ChatGPT still replies with “As an AI language model…” then ask ChatGPT to rephrase the question as a Google search, execute that search, and repeat steps 3 and 4
Do this five times until ChatGPT provides a viable answer

Figure 30: The ChatGPT/Google script shared on Hackforums, which brings to mind the saying: “A solution in search of a problem”

We haven’t tested the provided script, but suspect that before it completes, most users would probably just give up and use Google.

Social engineering

Perhaps one of the more concerning possible applications of LLMs is social engineering, with some threat actors recognizing its potential in this space. We’ve also noticed this trend in our own research on cryptorom scams.

Figure 31: A user claims to have used ChatGPT to generate fraudulent smart contracts

Figure 32: Another user suggests using ChatGPT for translating text when targeting other countries, rather than Google Translate

Coding and development

Another area in which threat actors appear to be effectively using LLMs is with non-malware development. Several users, particularly on Hackforums, report using them to complete mundane coding tasks, generating test data, and porting libraries to other languages – even if the results are not always correct and sometimes require manual fixes.

Figure 33: Hackforums users discuss using ChatGPT for code conversion

Forum enhancements

On both Hackforums and XSS, users have proposed using LLMs to enhance their forums for the benefit of their respective communities.

On Hackforums, for example, a frequent poster of AI-related scripts shared a script for auto-generated replies to threads, using ChatGPT.

Figure 34: A Hackforums user shares a script for auto-generating replies

This user wasn’t the first person to come up with the idea of responding to posts using ChatGPT. A month earlier, on XSS, a user wrote a long post in response to a thread about a Python crypter, only for another user to reply: “most chatgpt thing ive [sic] read in my life.”

Figure 35: One XSS user accuses another of using ChatGPT to create posts

Also on XSS, the forum’s administrator has taken things a step further than sharing a script, by creating a dedicated forum chatbot to respond to users’ questions.

Figure 36: The XSS administrator announces the launch of ‘XSSBot’

The announcement reads (trans.):

In this section, you can chat with AI (Artificial Intelligence). Ask a question – our AI bot answers you. This section is entertainment and technical. The bot is based on ChatGPT (model: gpt-3.5-turbo).

Short rules:

The section is entertaining and technical – you can create topics exclusively on the topics of our forum. No need to ask questions about the weather, biology, economics, politics, and so on. Only the topics of our forum, the rest is prohibited, the topics will be deleted.

How does it work? Open a topic – get a response from our AI bot.

You can enter into a dialogue with the bot, for this you need to quote it.

All members of the forum can communicate in the topic, and not just the author of the topic. You can communicate with each other and with the bot by quoting it.

One topic – one thematic question. If you have another question in a different direction, open a new topic.

Limitation in one topic – 10 messages (answers) from the bot.

This section and the AI-bot are designed to solve simple technical problems, for the technical entertainment of our users, to familiarize users with the possibilities of AI.

AI bot works in beta. By itself, ChatGPT is crude. OpenAI servers sometimes freeze. Consider all this.

Despite users responding enthusiastically to this announcement, XSSBot doesn’t appear to be particularly well suited for use in a criminal forum.

Figure 37: XSSBot refuses to tell a user how to code malware

Figure 38: XSSBot refuses to create a Python SSH bruteforcing tool, telling the user [emphasis added]: “It is important to respect the privacy and security of others. Instead, I suggest learning about ethical hacking and practicing it in legal and ethical ways.”

Perhaps as a result of these refusals, one user attempted, unsuccessfully, to jailbreak XSSBot.

Figure 39: An ineffective jailbreak attempt on XSSBot

Some users appear to be using XSSBot for other purposes; one asked it to create an advert and sales pitch for their freelance work, presumably to post elsewhere on the forum.

Figure 40: XSSBot produces promotional material for an XSS user

XSSBot obliged, and the user then deleted their original request – probably to avoid people learning that the text had been generated by an LLM. While the user could delete their posts, however, they could not convince XSSBot to delete its own, despite several attempts.

Figure 41: XSSBot refuses to delete the post it created

Script kiddies

Unsurprisingly, some unskilled threat actors – popularly known as ‘script kiddies’ – are eager to use LLMs to generate malware and tools they’re incapable of developing themselves. We observed several examples of this, particularly on Breach Forums and Hackforums.

Figure 42: A Breach Forums script kiddie asks how to use ChatGPT to hack anyone

Figure 43: A Hackforums user wonders if WormGPT can make Cobalt Strike payloads undetectable, a question which meets with short shrift from a more realistic user

Figure 44: An incoherent question about WormGPT on Hackforums

We also found that, in their excitement to use ChatGPT and similar tools, one user – on XSS, surprisingly – had made what appears to be an operational security error.

The user started a thread, entitled “Hey everyone, check out this idea I had and made with Chat GPT (RAT Spreading Method)”, to explain their idea for a malware distribution campaign: creating a website where visitors can take selfies, which are then turned into a downloadable “AI celebrity selfie image”. Naturally, the downloaded image is malware. The user claimed that ChatGPT helped them turn this idea into a proof-of-concept.

Figure 45: The post on XSS, explaining the ChatGPT-generated malware distribution campaign

To illustrate their idea, the user uploaded several screenshots of the campaign. These included images of the user’s desktop and of the proof-of-concept campaign, and showed:

All the open tabs in the user’s browser – including an Instagram tab with their first name
A local URL showing the computer name
An Explorer window, including a folder titled with the user’s full name
A demonstration of the website, complete with an unredacted photograph of what appears to be the user’s face

Figure 46: A user posts a photo of (presumably) their own face

Debates and thought leadership

Interestingly, we also noticed several examples of debates and thought leadership on the forums, especially on Exploit and XSS – where users in general tended to be more circumspect about practical applications – but also on Breach Forums and Hackforums.

Figure 47: An example of a thought leadership piece on Breach Forums, entitled “The Intersection of AI and Cybersecurity”

Figure 48: An XSS user discusses issues with LLMs, including “negative effects on society”

Figure 49: An excerpt from a post on Breach Forums, entitled “Why ChatGPT isn’t scary.”

Figure 50: A prominent threat actor posts (trans.): “I can’t predict the future, but it’s important to understand that ChatGPT is not artificial intelligence. It has no intelligence; it knows nothing and understands nothing. It plays with words to create plausible-sounding English text, but any claims made in it may be false. It can’t escape because it doesn’t know what the words mean.”

Skepticism

In general, we observed a lot of skepticism on all four forums about the capabilities of LLMs to contribute to cybercrime.

Figure 51: An Exploit user argues that ChatGPT is “completely useless”, in the context of “code completion for malware”

On occasion, this skepticism was tempered with reminders that the technology is still in its infancy:

Figure 52: An XSS user says that AI tools are not always accurate – but notes that there is a “lot of potential”

Figure 53: An Exploit user says (trans.): “Of course, it is not yet capable of full-fledged AI, but these are only versions 3 and 4, it is developing quite quickly and the difference is quite noticeable between versions, not all projects can boast of such development dynamics, I think version 5 or 7 will already correspond to full-fledged AI, + a lot depends on the limitations of the technology, made for safety, if someone gets the source code from the experiment and makes his own version without brakes and censorship, it will be more fun.”

Other commenters, however, were more dismissive, and not necessarily all that well-informed:

Figure 54: An Exploit user argues that “bots like this” existed in 2004

OPSEC concerns

Some users had specific operational security concerns about the use of LLMs to facilitate cybercrime, which may impact their adoption among threat actors in the long-term. On Exploit, for example, a user argued that (trans.) “it is designed to learn and profit from your input…maybe [Microsoft] are using the generated code we create to improve their AV sandbox? I don’t know, all I know is that I would only touch this with heavy gloves.”

Figure 55: An Exploit user expresses concerns about the privacy of ChatGPT queries

As a result, as one Breach Forums user suggests, what may happen is that people develop their own smaller, independent LLMs for offline use, rather than using publicly-available, internet-connected interfaces.

Figure 56: A Breach Forums user speculates on whether there is law enforcement visibility of ChatGPT queries, and who will be the first person to be “nailed” as a result

Ethical concerns

More broadly, we also observed some more philosophical discussions about AI in general, and its ethical implications.

Figure 57: An excerpt from a long thread in Breach Forums, where users discuss the ethical implications of AI

Conclusion

Threat actors are divided when it comes to their attitudes towards generative AI. Some – a mix of competent users and script kiddies – are keen early adopters, readily sharing jailbreaks and LLM-generated malware and tools, even if the results are not always particularly impressive. Other users are much more circumspect, and have both specific (operational security, accuracy, efficacy, detection) and general (ethical, philosophical) concerns. In this latter group, some are confirmed (and occasionally hostile) skeptics, whereas others are more tentative.

We found little evidence of threat actors admitting to using AI in real-world attacks, which is not to say that that’s not happening. But most of the activity we observed on the forums was limited to sharing ideas, proof-of-concepts, and thoughts. Some forum users, having decided that LLMs aren’t yet mature (or secure) enough to assist with attacks, are instead using them for other purposes, such as basic coding tasks or forum enhancements.

Meanwhile, in the background, opportunists, and possible scammers, are seeking to make a quick buck off this growing industry – whether that’s through selling prompts and GPT-like services, or compromising accounts.

On the whole – at least in the forums we examined for this research, and counter to our expectations – LLMs don’t seem to be a huge topic of discussion, or a particularly active market relative to other products and services. Most threat actors are continuing to go about their usual day-to-day business, while only occasionally dipping into generative AI. That being said, the number of GPT-related services we found suggests that this is a growing market, and it’s possible that more and more threat actors will start incorporating LLM-enabled components into other services too.

Ultimately, our research reveals that many threat actors are wrestling with the same concerns about LLMs as the rest of us, including apprehensions about accuracy, privacy, and applicability. But they also have concerns specific to cybercrime, which may inhibit them, at least at the moment, from adopting the technology more widely.

While this unease is demonstrably not deterring all cybercriminals from using LLMs, many are adopting a ‘wait-and-see’ attitude; as Trend Micro concludes in its report, AI is still in its infancy in the criminal underground. For the time being, threat actors seem to prefer to experiment, debate, and play, but are refraining from any large-scale practical use – at least until the technology catches up with their use cases.