ChatGPT and the cyber risk you haven’t thought of

ChatGPT is already causing security breaches, but not through attacks: people are voluntarily (and often unlawfully) pasting sensitive information into the system to generate insights. One report found that over 4% of employees have already tried to put sensitive company data into the model, and the recent release of GPT-4, which accepts much larger chunks of text, is likely to make this problem worse, and quickly.

The report came from a company that detected and blocked 67,000 such attempts across its client base; most organizations have no such capability. In one case, an executive pasted the corporate strategy into the system to generate a PowerPoint; in another, a doctor entered a patient's name and medical condition to draft a report.

Cyber researchers have demonstrated that training data extraction attacks are possible against GPT models: an attacker can get the system to recall, verbatim, sensitive information it has ingested. And in March 2023, ChatGPT was taken offline after a bug allowed users to see the titles of other users' conversations with the system.

What are the shortfalls of ChatGPT in cyber security? 

As well as potentially having your spilled data 'hacked' back out of GPT, the spill itself could breach any number of security policies, secrecy laws, and privacy regulations. And on the flip side, retrieving and using someone else's information from GPT that turns out to be proprietary, confidential, or copyrighted could also get your company in trouble.

The only way to stop this, apart from blocking access to GPT and other LLM tools, is training and education. But it's very hard to train every staff member and make sure they understand and retain that training. It's even harder to make sure they apply it day to day.
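Beyond training, some organizations put a technical guardrail between staff and external LLMs. As a purely hypothetical sketch (the pattern names and rules here are invented for illustration; real data-loss-prevention tools use far richer detection), a pre-submission filter might look like this:

```python
import re

# Hypothetical patterns an organization might flag before text leaves
# the network; a real DLP tool would use far more sophisticated checks.
SENSITIVE_PATTERNS = {
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "internal_marker": re.compile(r"\b(CONFIDENTIAL|INTERNAL ONLY)\b",
                                  re.IGNORECASE),
}

def flag_sensitive(text: str) -> list[str]:
    """Return the names of any sensitive patterns found in the text."""
    return [name for name, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(text)]

def safe_to_submit(text: str) -> bool:
    """Allow submission only when no sensitive pattern matches."""
    return not flag_sensitive(text)
```

A filter like this only catches the obvious cases, which is exactly the point made above: technical controls reduce the problem, but they can't replace staff understanding what should never be pasted in the first place.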

What are the other known risks? 

Chatbots are opaque. When one gives an answer, it's hard to fact-check. The risk is that staff will rely on the machine's recommendation even when that recommendation is wrong, and once we start asking the machine security-related questions, the consequences of being wrong can be catastrophic.

This is why it’s so essential not to rely on black-box, algorithmic AI for regulatory or security compliance. When we are deciding what law or policy to apply, we need to be able to understand and challenge the evidence behind that decision. 

This is why we use Rules as Code AI for automating governance and compliance in enterprises. It is transparent and traceable, which are vital principles for ethical AI.
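To make the contrast with black-box AI concrete, here is a toy illustration (the clause names and categories are invented for the example) of what the "rules as code" idea can look like: every decision carries the specific rule clause that produced it, so the evidence behind the decision can be inspected and challenged.

```python
# Toy "rules as code" sketch: each decision returns the clause that
# drove it, making the outcome auditable. Clause names are invented.
RULES = [
    # (clause, predicate, decision) -- first matching rule wins
    ("Policy 4.2: health records are sensitive",
     lambda r: r.get("category") == "health", "restricted"),
    ("Policy 4.3: financial records are sensitive",
     lambda r: r.get("category") == "financial", "restricted"),
    ("Policy 1.1: default classification",
     lambda r: True, "internal"),
]

def classify(record: dict) -> tuple[str, str]:
    """Return (decision, clause) for the first rule that applies."""
    for clause, predicate, decision in RULES:
        if predicate(record):
            return decision, clause
    raise ValueError("no rule applied")
```

Unlike an LLM's answer, the output here can be traced back to a named policy clause, which is what makes the decision challengeable.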

What other risks could emerge? 

Phishing is already one of the most common and successful attack methods for bad actors. GPT puts the ability to craft more believable phishing messages, quickly, into more hands. Deepfakes are the next level. A believable email from your boss asking you to send a sensitive document, followed up by a video call that looks and sounds exactly like her? These are some of the enormous challenges we face that training just can't control for.

Having a data spill is essentially inevitable. We can never reduce the likelihood of a breach to zero, because we will always have trusted insiders. The approach to take now is to reduce the potential impact of a future breach. Know what data you have, what risk it carries, and what value. Know what rules apply to it, where it is, and who is doing what to it. And know what needs to be locked down and what can be disposed of, across the whole enterprise. This is something we can use AI for right now, and it's really moving the needle back toward good governance.
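The "know your data" steps above can be sketched as a minimal data inventory (a hypothetical illustration; the fields, asset names, and rules here are invented, and a real governance platform would track far more, such as lineage and access logs):

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical minimal inventory record for one data asset.
@dataclass
class DataAsset:
    name: str
    classification: str    # e.g. "public", "internal", "sensitive"
    retention_until: date  # date after which disposal may be allowed
    owner: str

def needs_lockdown(asset: DataAsset) -> bool:
    """Sensitive assets should have access restricted."""
    return asset.classification == "sensitive"

def disposable(asset: DataAsset, today: date) -> bool:
    """Non-sensitive assets past their retention date can go."""
    return asset.retention_until < today and not needs_lockdown(asset)

inventory = [
    DataAsset("board-strategy.pptx", "sensitive", date(2030, 1, 1), "exec"),
    DataAsset("old-newsletter.pdf", "public", date(2020, 1, 1), "comms"),
]

to_lock = [a.name for a in inventory if needs_lockdown(a)]
to_dispose = [a.name for a in inventory
              if disposable(a, date(2024, 1, 1))]
```

Even a simple inventory like this makes the impact-reduction point concrete: data you have disposed of cannot be spilled, and data you have locked down is harder to paste into a chatbot.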