AI Could Be Poisoned in Wartime, and the Pentagon Is Worried



Armies worldwide are fielding more and more AI-based models, which raises the risk that adversaries will poison those models and sabotage the algorithms, with potentially catastrophic results.

Every machine-learning algorithm has to be trained on huge amounts of data. Militaries using these algorithms therefore need not only to collect, organize, curate, and clean the training data, but also to ensure it contains no corrupt or malicious data points that could teach an algorithm the wrong thing.
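To make the risk concrete, here is a minimal sketch of label-flip data poisoning, a hypothetical illustration rather than anything from the article: a toy 3-nearest-neighbor classifier is trained once on clean data and once on data into which an adversary has slipped a few mislabeled points, and the handful of flipped labels is enough to reverse a prediction.

```python
# Hypothetical illustration of training-data poisoning with a toy classifier.
def knn_predict(samples, x, k=3):
    """samples: list of (features, label). Returns the majority label of the k nearest."""
    ranked = sorted(samples, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)

clean = [((0.0, 0.0), "friendly"), ((0.2, 0.1), "friendly"), ((0.1, 0.2), "friendly"),
         ((5.0, 5.0), "hostile"), ((4.8, 5.1), "hostile"), ((5.1, 4.9), "hostile")]

# The adversary injects a few points with flipped labels, placed right
# inside the "hostile" cluster.
poison = [((4.95, 5.00), "friendly"), ((4.96, 5.01), "friendly"),
          ((4.94, 4.99), "friendly")]

probe = (4.95, 5.0)
print(knn_predict(clean, probe))           # "hostile"
print(knn_predict(clean + poison, probe))  # "friendly" -- the poisoned model is fooled
```

Three corrupted data points out of nine flip the classification of a clearly hostile input, which is exactly the failure mode a curated, verified dataset is meant to prevent.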

Take, for example, the commercial ChatGPT, which is known to have absorbed politically and racially biased misinformation from its web-scraped training data. The difference when it comes to military technology is that these models could be intentionally poisoned with false and harmful information.

Because of this, Pentagon officials argue that the military should train its LLMs on a trusted, verified military dataset inside a secure, firewalled environment. Deputy Assistant Secretary Jennifer Swanson stated that the Pentagon’s main concern is the algorithms that are going to be informing battlefield decisions: “The consequences of bad data or bad algorithms or poison data or trojans or all of those things are much greater in those use cases. That’s really, for us, where we are spending the bulk of our time.”

The Pentagon, which aims to use AI to coordinate future combat operations across land, air, sea, space, and cyberspace, is focused mainly on obtaining the right military-specific training data. It is trying to apply to machine learning the feedback loop between development, cybersecurity, and current operations that software developers already use.

But Swanson is still worried about how much the industry still doesn’t know about AI, and one of her biggest concerns is how difficult it is to test it.

While traditional code uses rigid IF-THEN-ELSE logic that always gives the same output for a given input, modern machine learning relies on neural networks, loosely modeled on the human brain, which work off complex statistical correlations and probabilities (this is why ChatGPT can give different answers to the exact same question).
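The contrast can be sketched in a few lines (a hypothetical illustration, not Pentagon code): the rule-based function below is pure IF-THEN-ELSE and always returns the same answer, while the second function samples its answer from a probability distribution, the way an LLM samples tokens, so repeated calls with the identical input can disagree.

```python
import math
import random

# Traditional rule-based code: identical input, identical output, every time.
def rule_based(signal_db):
    if signal_db > 80:
        return "alert"
    else:
        return "ignore"

# A probabilistic model samples its answer from a distribution (here a
# softmax over two options), so the same input can yield different outputs.
def sample_answer(logits, rng, choices=("alert", "ignore")):
    exps = [math.exp(l) for l in logits]
    probs = [e / sum(exps) for e in exps]
    r, cum = rng.random(), 0.0
    for choice, p in zip(choices, probs):
        cum += p
        if r < cum:
            return choice
    return choices[-1]

print(rule_based(85), rule_based(85))  # deterministic: "alert alert"
rng = random.Random(0)
answers = {sample_answer([1.0, 1.0], rng) for _ in range(20)}
print(answers)  # both "alert" and "ignore" appear for the exact same input
```

Testing the first function means checking a handful of branches; testing the second means reasoning about a distribution over outputs, which is part of why Swanson calls AI so difficult to test.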

Furthermore, ML algorithms keep learning when exposed to new data. This makes them far more adaptable than traditional code, which requires a human to make changes manually, but it also means they can suddenly “change course” from what their creators intended.
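A toy online learner shows how this drift happens with no human touching the code; this is an assumed minimal example, not a description of any fielded system. The model is just a running mean that updates itself on every new observation, and a stream of unexpected (or adversarial) data steadily drags its estimate away from where it started.

```python
# Toy online learner: a running mean that updates itself on every new sample.
class OnlineMean:
    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, x):
        # Incremental mean update: no retraining step, no human in the loop.
        self.n += 1
        self.mean += (x - self.mean) / self.n
        return self.mean

model = OnlineMean()
for x in [10.0, 10.0, 10.0, 10.0]:
    model.update(x)
print(model.mean)  # 10.0 -- the behavior its creators expected

# A stream of poisoned or simply unexpected data shifts the model on its own.
for x in [100.0, 100.0, 100.0, 100.0]:
    model.update(x)
print(model.mean)  # 55.0 -- the model has drifted, with no code change
```

The same property that makes such a system adaptable in the field is what makes it vulnerable: an adversary who can feed it data can steer it.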

“Because it continues learning, how do you manage that in the battlefield [to] make sure it’s not just going to completely go wild? How do you know that your data has not been poisoned?” Swanson asked, and concluded: “There’s a lot of work that’s going on right now. My hope is that a lot of this work is going to produce some answers in the next year or two and enable us to set the foundation properly from the beginning.”

This information was provided by Breaking Defense.