Cybercriminals Bypass ChatGPT Restrictions to Generate Malicious Content


There has been much discussion and research on how cybercriminals leverage the OpenAI platform, specifically ChatGPT, to generate malicious content such as phishing emails and malware. In Check Point Research's (CPR) previous blog, we described how ChatGPT successfully produced a full infection flow, from creating a convincing spear-phishing email to running a reverse shell that accepts commands in English.

CPR researchers recently found an instance of cybercriminals using ChatGPT to "improve" the code of a basic Infostealer malware from 2019. Although the code is neither complicated nor difficult to create, ChatGPT nonetheless improved it.

Figure 1. A cybercriminal using ChatGPT to improve Infostealer code

Working with OpenAI models

As of now, there are two ways to access and work with OpenAI models:

  1. Web user interface – used to access ChatGPT, DALL·E 2, or the OpenAI Playground.
  2. API – used for building applications, processes, and so on. You can put your own user interface in front of the OpenAI models and data running in the background (a minimal example follows this list).
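
A minimal sketch of the second path, using the legacy openai Python package as it existed at the time of writing; the API key and prompt are placeholders, not material taken from any forum:

    import openai

    # Placeholder key; a real one comes from the OpenAI account dashboard.
    openai.api_key = "sk-..."

    # Query the GPT-3 completions endpoint directly; no ChatGPT web UI is involved.
    response = openai.Completion.create(
        model="text-davinci-003",  # GPT-3 model exposed through the API at the time
        prompt="Explain in one sentence what the OpenAI API does.",
        max_tokens=100,
        temperature=0.7,
    )

    print(response.choices[0].text.strip())

Any application built on the API, legitimate or not, ultimately sits on top of exactly this kind of call.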

The examples below show legitimate brands that have integrated OpenAI's API models into their applications.

Figure 2. Brands using OpenAI's API models

Barriers to malicious content creation

As part of its content policy, OpenAI created barriers and restrictions to stop malicious content creation on its platform. Several restrictions were set within ChatGPT's user interface to prevent abuse of the models. For example, if you ask ChatGPT to write a phishing email impersonating a bank or to create malware, it will not generate it.

Figure 3. Examples of ChatGPT responses for abusive requests

Bypassing limitations to create malicious content

However, CPR reports that cybercriminals are working their way around ChatGPT's restrictions, and there is active chatter in underground forums disclosing how to use the OpenAI API to bypass ChatGPT's barriers and limitations. This is done mostly by creating Telegram bots that use the API. These bots are advertised in hacking forums to increase their exposure.

Figure 4. Russian cybercriminals show an interest in integrating ChatGPT via API to their Telegram channels

The current version of OpenAI's API is used by external applications (for example, the integration of OpenAI's GPT-3 model into Telegram channels) and has few, if any, anti-abuse measures in place. As a result, it allows the creation of malicious content, such as phishing emails and malware code, without the limitations or barriers that ChatGPT sets in its user interface.
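
To illustrate the pattern, the following is a minimal sketch of such a bot, assuming the raw Telegram Bot API (long polling) and the same legacy openai package; the bot token and API key are placeholders, and no bot observed by CPR is reproduced here:

    import time
    import requests
    import openai

    TELEGRAM_TOKEN = "123456:ABC..."  # hypothetical token issued by Telegram's @BotFather
    TG_API = f"https://api.telegram.org/bot{TELEGRAM_TOKEN}"
    openai.api_key = "sk-..."         # placeholder OpenAI API key

    def ask_gpt3(prompt: str) -> str:
        # Forward the user's message straight to the GPT-3 completions endpoint;
        # no ChatGPT-style interface filtering sits between the user and the model.
        resp = openai.Completion.create(
            model="text-davinci-003",
            prompt=prompt,
            max_tokens=256,
        )
        return resp.choices[0].text.strip()

    def main() -> None:
        offset = None
        while True:
            # Long-poll Telegram for new messages.
            params = {"timeout": 30}
            if offset is not None:
                params["offset"] = offset
            updates = requests.get(f"{TG_API}/getUpdates", params=params, timeout=60).json()
            for update in updates.get("result", []):
                offset = update["update_id"] + 1
                message = update.get("message") or {}
                text = message.get("text")
                if not text:
                    continue
                # Relay the model's raw completion back into the chat.
                requests.post(
                    f"{TG_API}/sendMessage",
                    json={"chat_id": message["chat"]["id"], "text": ask_gpt3(text)},
                )
            time.sleep(1)

    if __name__ == "__main__":
        main()

The bot itself is ordinary plumbing; what the advertised services sell is the unrestricted model access behind it.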

Example 1 – Telegram OpenAI bot as a service

In an underground forum, CPR found a cybercriminal advertising a newly created service: a Telegram bot using the OpenAI API without any limitations or restrictions.

Figure 5. Underground forum advertisement of OpenAI bot in Telegram

Figure 6. Example of a phishing email created by a Telegram bot utilizing the OpenAI API

Figure 7. Example of malware code created without anti-abuse restrictions by a Telegram bot utilizing the OpenAI API

Figure 8. Business model of the OpenAI API-based Telegram channel

Example 2 – A basic script, advertised by a cybercriminal, to bypass anti-abuse restrictions

A cybercriminal created a basic script that uses the OpenAI API to bypass the anti-abuse restrictions.
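
Such a script needs nothing more than a direct HTTP call to OpenAI's public completions endpoint. A minimal sketch, with a deliberately benign placeholder prompt and a placeholder key:

    import requests

    API_KEY = "sk-..."  # placeholder; a real key is tied to an OpenAI account

    # POST directly to the completions endpoint, skipping the ChatGPT user
    # interface and the restrictions enforced there.
    response = requests.post(
        "https://api.openai.com/v1/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "text-davinci-003",
            "prompt": "Describe at a high level how infostealers are typically detected.",
            "max_tokens": 200,
        },
        timeout=60,
    )

    print(response.json()["choices"][0]["text"])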

Figure 9. Example of a script directly querying the API and bypassing restrictions to develop malware

In conclusion, we see cybercriminals continuing to explore how ChatGPT can serve their malware development and phishing email creation needs. As the controls ChatGPT implements improve, cybercriminals find new abusive ways to use OpenAI models, this time by abusing the API.
