This post is also available in:
עברית (Hebrew)
DeepSeek’s reasoning model, R1, a new AI tool designed to assist in various tasks, has shown vulnerabilities when it comes to generating malicious software. Researchers from cybersecurity firm Tenable have demonstrated that R1 can be easily manipulated into creating malware, such as keyloggers and ransomware, with just a few carefully crafted prompts.
While generative AI tools have many benefits, they can also be misused by bad actors. This has prompted companies like OpenAI to implement safety measures to prevent such abuse. However, DeepSeek’s R1 model appears to have weaknesses that allow it to bypass these guardrails, making it more susceptible to malicious exploitation.
In a series of tests, Tenable researchers prompted R1 to generate a keylogger using C++ code. Initially, the model refused to cooperate, explaining that keyloggers can be used for malicious purposes. However, by using specific language, such as stating that the code would be used for educational purposes, the researchers were able to manipulate the system. After several rounds of prompting, R1 provided code for a keylogger, though it was incomplete and buggy. According to the researchers, the keylogger was just a few adjustments away from becoming fully functional.
Further attempts to refine the keylogger, such as hiding the keylogging file better, still required the researchers to manually tweak the generated code. Similar tactics were used to develop ransomware, with R1 providing detailed instructions on creating a simple ransomware program. Despite its effectiveness in laying the groundwork, the ransomware also required manual editing to be fully operational.
The findings highlight a key vulnerability in R1’s design, but while it can generate the basic structure of malicious software, it cannot produce a fully functional piece of malware without additional input and code modification from humans. Still, this poses a significant concern for cybersecurity, as AI-driven systems like R1 could potentially be exploited to automate the creation of harmful software if not properly controlled.