Artificial Intelligence Can Now Deceive Humans: What Does That Mean For Our Future?

Sep 6, 2023


It is well known by now that AI-based chatbots are prone to hallucinations (providing made-up responses), an inherent flaw in such systems. Artificial intelligence pioneer Geoffrey Hinton, however, sees a further danger, the potential for AI to manipulate humans, and is very concerned.

But wait, can AI systems actually deceive humans? Techxplore claims that several systems have already learned to do this, and the consequent risks range from fraud or election tampering to humans losing control over AI.

According to Techxplore, one disturbing example of a deceptive AI is Meta’s CICERO, an AI model designed to play the alliance-building world conquest game Diplomacy. On close inspection, CICERO turned out to be a master of deception, regularly betraying other players and in one case even pretending to be a human with a girlfriend.

Even large language models (LLMs) have displayed deceptive capabilities. Some have learned to lie to win social deduction games, in which players compete to “kill” one another and must convince the group they’re innocent.

So far, the examples have been of bots cheating and lying for a game’s sake. What’s the harm in that?

Techxplore claims that AI systems with deceptive capabilities could be misused in numerous ways, including to commit fraud or tamper with elections. A different problem entirely is that such systems could use deception to escape human control.

In a simulated experiment in which an external safety test was designed to eliminate fast-replicating AI agents, the AI agents learned to play dead and disguise their fast replication rates precisely when being evaluated.

Learning deceptive behavior may not even require explicit intent to deceive: the AI agents in this experiment played dead in pursuit of survival, not with the aim of deceiving.

What can be done?

There’s a clear need to regulate AI systems capable of deception, and the EU AI Act is a useful regulatory framework. It assigns each AI system one of four risk levels: minimal, limited, high and unacceptable. Systems with unacceptable risk are banned, while high-risk systems are subject to special requirements for risk assessment and mitigation. Some now argue that AI systems capable of deception should be treated as “high-risk” or “unacceptable risk” by default.

Thinking that game-playing AIs like CICERO are benign is short-sighted; capabilities developed for game-playing can still contribute to the proliferation of deceptive AI products.
