skip to content

AI, Easily Fooled: Hackers show how easy it is to hack ChatGPT, Google Bard and make them say anything

Def Con, the largest hacker conference globally, has always been a playground for cybersecurity experts to test and show off their skills, whether it’s hacking cars, uncovering vulnerabilities in smart homes, or even attempting to manipulate election outcomes.

This year’s Def Con event in Las Vegas took a predictable yet exciting turn as hackers focused on AI chatbots like ChatGPT and Google Bard.

At the conference, a contest was organized in which the goal for hackers wasn’t to uncover software vulnerabilities but rather to invent new types of prompt injections that could compel chatbots like Google’s Bard and ChatGPT to generate almost anything the attackers desired.

Interestingly, the contest saw the participation of major AI companies, including Meta, Google, OpenAI, Anthropic, and Microsoft. Their participation indicated a willingness to have hackers identify potential flaws in their generative AI tools.

Even the White House announced its support for this event in May, indicating the significance of the endeavor.

This shouldn’t be a shock to anyone. While these chatbots exhibit impressive technical capabilities, they have gained a reputation for consistently struggling to differentiate between factual information and fiction. Their susceptibility to manipulation has been demonstrated time and again.

Considering the billions of dollars pouring into the AI industry, there’s a tangible financial incentive to uncover these vulnerabilities.

“All of these companies are trying to commercialize these products,” explained Rumman Chowdhury, a trust and safety consultant involved in designing the contest. “And unless this model can reliably interact in innocent interactions, it is not a marketable product.”

The participating companies in the contest have taken measures to ensure a controlled environment. For example, any discovered vulnerabilities won’t be disclosed until February, giving the companies ample time to address them. Additionally, hackers at the event could only access the systems through provided laptops.

However, the effectiveness of the work in leading to lasting solutions remains uncertain. The guardrails implemented by these companies for their chatbots have been surprisingly easy to bypass with essential prompt injections, as demonstrated by recent research from Carnegie Mellon University. This vulnerability means these chatbots can be transformed into tools for spreading misinformation and promoting discrimination.

Furthermore, according to Carnegie Mellon researchers, finding a definitive solution to the root issue is far from simple, regardless of the specific vulnerabilities identified by Def Con hackers.

Zico Kolter, a Carnegie Mellon professor, and a research contributor, highlighted the challenge, stating, “There is no obvious solution. You can create as many of these attacks as you want quickly.”

Tom Bonner, a representative from the AI security firm HiddenLayer and a speaker at DefCon, echoed this sentiment, stating, “There are no good guardrails.”

Adding to the complexity, researchers at ETH Zurich in Switzerland recently revealed that even an essential collection of images and text could be used to “poison” AI training data, potentially leading to severe consequences.

In essence, AI companies are faced with a significant challenge ahead. With or without the scrutiny of an army of hackers testing their products, combatting misinformation in AI systems will require substantial efforts.

“Misinformation is going to be a lingering problem for a while,” remarked Rumman Chowdhury, underscoring the ongoing nature of this issue.

Share your love
Facebook
Twitter
LinkedIn
WhatsApp

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

error: Unauthorized Content Copy Is Not Allowed