Cybercriminals are increasingly exploiting ChatGPT and other OpenAI models to carry out a range of malicious activities, including malware development, misinformation campaigns, and spear-phishing.
A new report revealed that since the beginning of 2024, OpenAI has disrupted over 20 deceptive operations worldwide, spotlighting a troubling trend of AI misuse that includes creating and debugging malware, producing content for fake social media personas, and generating persuasive phishing messages.
OpenAI says its mission is to ensure that its tools benefit humanity universally, and it is focused on detecting, preventing, and disrupting attempts to misuse its models for harmful purposes. In this election year, the company said it is particularly vital to establish strong, multi-layered defenses against state-linked cyber actors and covert influence operations that might use its models to promote deceptive campaigns on social media and other platforms.
Since the beginning of 2024, OpenAI has thwarted more than 20 operations and deceptive networks globally that have tried to exploit its models, including activities disrupted since its May 2024 threat report. These actions ranged from debugging malware and writing website articles to generating content for fake social media personas.
Activities ranged in complexity from simple content generation requests to sophisticated, multi-stage efforts aimed at analyzing and responding to social media posts. OpenAI said one case even involved a hoax related to AI use.
The report includes sample case studies illustrating the range of activities that have been intercepted. To better understand how threat actors seek to exploit AI, OpenAI analyzed the activities it disrupted, identifying initial trends that the company believes can inform discussions about AI in the broader threat landscape.
Here are the key insights that emerged from OpenAI’s analysis:
- AI provides defenders, such as AI companies, with powerful capabilities to identify and analyze suspicious behavior. Since its May threat report, the company has continued to build new AI-powered tools that allow it to detect and dissect potentially harmful activity. While the investigative process still requires intensive human judgment and expertise throughout the cycle, these tools have allowed OpenAI to compress some analytical steps from days to minutes.
- Threat actors most often used the company’s models to perform tasks in a specific, intermediate phase of activity – after they had acquired basic tools such as internet access, email addresses, and social media accounts, but before they deployed “finished” products such as social media posts or malware across the internet via a range of distribution channels. Investigating threat actor behavior in this intermediate position allows AI companies to complement the insights of both “upstream” providers – such as email and internet service providers – and “downstream” distribution platforms, such as social media. Doing so requires AI companies to have appropriate detection and investigation capabilities in place.
- Threat actors continue to evolve and experiment with OpenAI’s models, but OpenAI has seen no evidence that this has led to meaningful breakthroughs in their ability to create substantially new malware or build viral audiences. This is consistent with the company’s assessment of GPT-4o’s capabilities, which it does not consider to materially advance real-world vulnerability exploitation as laid out in its Preparedness Framework. Notably, of the case studies in this report, the deceptive activity that achieved the greatest social media reach and media interest was a hoax about the use of AI, not the use of AI itself.
- This limited impact also applies to the handful of networks OpenAI has seen posting content about global elections this year. The company disrupted activity that generated social media content about elections in the United States, Rwanda, and (to a lesser extent) India and the European Union; in none of these cases did it observe the networks attracting viral engagement or building sustained audiences.
- Finally, AI companies themselves can be targets of hostile activity: OpenAI disrupted a suspected China-based threat actor known as “SweetSpecter” that unsuccessfully spear-phished the personal and corporate email addresses of OpenAI employees.
Moving forward, OpenAI says it will continue collaborating across its intelligence, investigations, security research, and policy teams to anticipate how malicious actors might use advanced models for harmful purposes and plan enforcement actions accordingly.
“We remain committed to sharing our findings with our internal safety and security teams, informing key stakeholders, and partnering with industry peers and the research community to stay ahead of emerging risks and reinforce our collective safety and security,” the company said.
The opinions expressed in this post belong to the individual contributors and do not necessarily reflect the views of Information Security Buzz.