As generative AI tools become a fixture in software development workflows, a new study warns of an emerging security threat linked to how large language models (LLMs) generate code. Researchers from the University of Texas at San Antonio, the University of Oklahoma, and Virginia Tech have identified a phenomenon they call package hallucination: a byproduct of AI-generated code that could expose software supply chains to malicious actors.
The issue arises when LLMs suggest non-existent software packages during code generation. Developers may unknowingly attempt to install these fictitious packages in their projects. This creates an opening for attackers to register the hallucinated names on public repositories and publish malicious code under them.
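To make the failure mode concrete, here is a minimal sketch (not tooling from the study) that checks dependency names taken from an AI-generated snippet against PyPI's public JSON endpoint before anything is installed. The package list, including the made-up name `fastjsonlib`, is purely illustrative, and a real guard would also need to map import names to distribution names, which do not always match.

```python
"""Pre-install guard: check whether dependency names from AI-generated code
actually exist on PyPI. Illustrative sketch only; package names are made up."""
import urllib.error
import urllib.request


def package_exists_on_pypi(name: str) -> bool:
    """Return True if PyPI serves metadata for the name, False on a 404."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False  # name is unknown to the index: likely hallucinated
        raise             # other failures (rate limits, outages) need a human


# Dependency names copied from a hypothetical AI-generated snippet;
# "fastjsonlib" stands in for a plausible-sounding but non-existent package.
suggested_packages = ["requests", "fastjsonlib"]

for pkg in suggested_packages:
    if package_exists_on_pypi(pkg):
        print(f"'{pkg}' exists on PyPI (existence alone does not prove it is safe)")
    else:
        print(f"WARNING: '{pkg}' not found on PyPI -- possible hallucination, do not install")
```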
To examine the scale of the problem, the researchers generated over 576,000 code samples using 16 leading LLMs. Commercial models like GPT-4 and Claude showed hallucination rates above 5%, while open-source models like CodeLlama and DeepSeek Coder suggested non-existent packages more than 20% of the time. In total, the researchers recorded over 200,000 unique hallucinated package names, highlighting both the frequency and the variety of these false outputs.
The real danger lies in how this vulnerability could be exploited. A threat actor could monitor for frequently hallucinated names, then upload malicious packages under those names to public repositories. Developers, unaware of a package's true origin, might incorporate it into their software, potentially compromising entire codebases or organizational dependency trees.
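Existence checks alone stop helping once attackers begin registering commonly hallucinated names, so one complementary heuristic is to look at how long a package has actually been on the index. The sketch below is an illustrative assumption rather than anything from the study: it reads the release history from PyPI's JSON metadata and flags names whose entire publication history is younger than an arbitrary 30-day threshold, assuming the name has already been confirmed to exist.

```python
"""Heuristic follow-up check: flag names whose entire PyPI release history is
very recent, a pattern consistent with a freshly registered hallucinated name.
Illustrative sketch only; the threshold and example are assumptions."""
import json
import urllib.request
from datetime import datetime, timedelta, timezone


def first_upload_time(name: str) -> datetime | None:
    """Return the earliest file upload time PyPI records for the package."""
    url = f"https://pypi.org/pypi/{name}/json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)
    timestamps = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in data.get("releases", {}).values()
        for f in files
    ]
    return min(timestamps) if timestamps else None


def looks_suspicious(name: str, max_age_days: int = 30) -> bool:
    """True if the package's whole history is younger than max_age_days."""
    first = first_upload_time(name)
    if first is None:
        return True  # a name with no published files at all is itself a red flag
    return datetime.now(timezone.utc) - first < timedelta(days=max_age_days)


# "requests" has years of history, so this should print the reassuring branch.
if looks_suspicious("requests"):
    print("unexpectedly short release history -- review before installing")
else:
    print("'requests' has a long, established release history")
```

A short history is only a signal rather than proof of malice (legitimate new packages will trigger it too), but it is cheap to automate alongside lockfiles and manual dependency review.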
These findings raise concerns about the growing reliance on AI-generated code, especially in critical or production environments. While AI tools can accelerate development, the researchers caution that they should not replace traditional software validation practices. Package hallucination attacks introduce a new layer of complexity, demanding increased vigilance from developers and security teams.
As the use of generative AI in coding continues to expand, this study serves as a reminder that convenience must be balanced with scrutiny—especially when working within an open-source ecosystem vulnerable to manipulation.