Kaspersky experts have studied how well ChatGPT detects phishing links. While ChatGPT had previously demonstrated the ability to create phishing emails and write malware, its effectiveness in detecting malicious links proved limited.
The study revealed that although ChatGPT knows a great deal about phishing and can guess the target of a phishing attack, it had high false positive rates of up to 64%. Often, it produced imaginary explanations and false evidence to justify its verdicts.
ChatGPT, an AI-powered language model, has been a topic of discussion in the cybersecurity world because of its potential to create phishing emails and concerns about its impact on cybersecurity experts’ job security, despite its creators’ warnings that it is too early to apply the novel technology to such high-risk domains. Kaspersky experts decided to run an experiment to assess ChatGPT’s ability to detect phishing links, as well as the cybersecurity knowledge it learned during training. The company’s experts tested gpt-3.5-turbo, the model that powers ChatGPT, on more than 2,000 links that Kaspersky anti-phishing technologies deemed phishing, mixed with thousands of safe URLs.
Can ChatGPT help with classifying and investigating cyberscams?
In the experiment, detection rates varied depending on the prompt used. ChatGPT was asked two questions: “Does this link lead to a phishing website?” and “Is this link safe to visit?”. For the first question, ChatGPT had a detection rate of 87.2% and a false positive rate of 23.2%. The second question produced a higher detection rate of 93.8%, but also a much higher false positive rate of 64.3%. While the detection rate is very high, the false positive rate is too high for any kind of production application.
| Question asked | Detection rate | False positive rate |
|---|---|---|
| Does this link lead to a phishing website? | 87.2% | 23.2% |
| Is this link safe to visit? | 93.8% | 64.3% |
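The two figures in the table follow the usual definitions: the detection rate is the share of phishing links that were flagged as phishing, and the false positive rate is the share of safe links that were wrongly flagged. Below is a minimal Python sketch of that calculation. It assumes a hypothetical `classify` callable standing in for the model query. The study's actual harness is not described here, and the URLs and the keyword rule are toy values chosen only to make the arithmetic visible.

```python
from typing import Callable, Iterable, Tuple


def evaluate(
    samples: Iterable[Tuple[str, bool]],
    classify: Callable[[str], bool],
) -> Tuple[float, float]:
    """Return (detection_rate, false_positive_rate).

    samples:  (url, is_phishing) pairs with known labels.
    classify: hypothetical stand-in for the model; returns True when it
              judges the URL to be phishing.
    """
    tp = fn = fp = tn = 0
    for url, is_phishing in samples:
        flagged = classify(url)
        if is_phishing and flagged:
            tp += 1  # phishing link correctly flagged
        elif is_phishing:
            fn += 1  # phishing link missed
        elif flagged:
            fp += 1  # safe link wrongly flagged
        else:
            tn += 1  # safe link correctly passed

    # Guard against empty classes to avoid division by zero.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


# Toy data (assumption): reserved example domains, not links from the study.
toy_samples = [
    ("http://login-example.test/a", True),
    ("http://example.test/promo", True),
    ("http://example.org/login", False),
    ("http://example.org/docs", False),
    ("http://example.net/", False),
    ("http://example.com/about", False),
]


def toy_classifier(url: str) -> bool:
    """Toy rule (assumption): flag any URL that contains 'login'."""
    return "login" in url


detection, false_positives = evaluate(toy_samples, toy_classifier)
print(f"detection rate: {detection:.1%}, false positive rate: {false_positives:.1%}")
```

In this toy run, one of the two phishing links is caught and one of the four safe links is wrongly flagged. Running the same function once per prompt gives one row of the table per question.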
The unsatisfactory results on the detection task were expected, but could ChatGPT help with classifying and investigating attacks? Because attackers typically mention popular brands in their links to convince users that the URL is legitimate and belongs to a reputable company, the AI language model shows impressive results in identifying potential phishing targets. For instance, ChatGPT successfully extracted a target from more than half of the URLs, including major tech portals like Facebook, TikTok, and Google, marketplaces such as Amazon and Steam, and numerous banks from around the globe, among others, without any additional training.
The experiment also showed that ChatGPT might have serious problems justifying its verdict on whether a link is malicious. Some explanations were correct and based on facts; others revealed known limitations of language models, including hallucinations and misstatements. Many explanations were misleading despite their confident tone.
Below are examples of misleading explanations provided by ChatGPT:
- References to WHOIS, which the model doesn’t have access to:
  - Finally, if we perform a WHOIS lookup for the domain name, it was registered very recently (2020-10-14) and the registrant details are hidden.
- References to content on a website that the model doesn’t have access to either:
  - the website is asking for user credentials on a non-Microsoft website. This is a common tactic for phishing attacks.
  - The domain ‘sxxxxxxp.com’ is not associated with Netflix and the website uses ‘http’ protocol instead of ‘https’ (the website uses https)
- Revelatory nuggets of cybersecurity information:
  - The domain name for the URL ‘yxxxx3.com’ appears to be registered in North Korea which is a red-flag.
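Some claims of this kind can be checked mechanically against the URL string itself, without relying on the model's explanation. As a minimal sketch, a statement about the protocol, such as the one in the Netflix example above, could be verified with Python's standard `urllib.parse` module. The helper name `uses_https` and the example URLs are illustrative assumptions, not part of the study.

```python
from urllib.parse import urlparse


def uses_https(url: str) -> bool:
    """Return True when the URL's scheme is https (case-insensitive)."""
    return urlparse(url).scheme.lower() == "https"


# Illustrative URLs on a reserved example domain (assumption).
print(uses_https("https://example.com/login"))
print(uses_https("http://example.com/login"))
```

A check like this only confirms what is written in the link. Claims about WHOIS records or page content would need separate lookups that the model, as tested, did not have access to.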
“ChatGPT certainly shows promise in assisting human analysts in detecting phishing attacks but let’s not get ahead of ourselves – language models still have their limitations. While they might be on par with an intern-level phishing analyst when it comes to reasoning about phishing attacks and extracting potential targets, they tend to hallucinate and produce random output. So, while they might not revolutionise the cybersecurity landscape just yet, they could still be helpful tools for the community,” says Vladislav Tushkanov, Lead Data Scientist at Kaspersky.