New Tool Protects Voice Recordings from Being Abused by Cybercriminals

Artificial intelligence has made it easy to generate synthetic media that clones the appearance and voice of real people, with accessible tools for producing images, video, and audio. This advancement brings immense benefits for accessibility and creativity, but it also enables increasingly sophisticated scams and cyberattacks using deepfakes.

To counter the potential harms of voice cloning technology, the US Federal Trade Commission (FTC) called on researchers and technology experts to develop breakthrough ideas for preventing, monitoring, and evaluating malicious voice cloning.

According to Techxplore, Ning Zhang, an assistant professor of computer science and engineering at the McKelvey School of Engineering, was one of three winners of the FTC's Voice Cloning Challenge. Zhang's winning project, DeFake, applies a form of watermarking to voice recordings: it embeds carefully crafted distortions that are imperceptible to the human ear, making criminal cloning more difficult by denying attackers usable voice samples.

Zhang explained: “DeFake uses a technique of adversarial AI that was originally part of the cybercriminals’ toolbox, but now we’re using it to defend against them. Voice cloning relies on the use of pre-existing speech samples to clone a voice, which are generally collected from social media and other platforms. By perturbing the recorded audio signal just a little bit, just enough that it still sounds right to human listeners, but it’s completely different to AI, DeFake obstructs cloning by making criminally synthesized speech sound like other voices, not the intended victim.”
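The perturbation idea Zhang describes can be illustrated with a minimal sketch. This is not DeFake's actual algorithm: a real system would compute gradients through a neural speaker encoder, while here a fixed random linear projection stands in for that encoder, and the FGSM-style update, the `eps` bound, and all function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_speaker_embedding(audio, weights):
    # Stand-in for a neural speaker encoder: a plain linear projection.
    # (Hypothetical; real voice-cloning pipelines use deep networks.)
    return weights @ audio

def perturb(audio, weights, eps=0.005):
    """Shift the speaker embedding while keeping the waveform change tiny.

    eps bounds the per-sample change (an L-infinity constraint), analogous
    to perturbing the signal "just a little bit" so it still sounds right
    to human listeners.
    """
    emb = toy_speaker_embedding(audio, weights)
    # Gradient of ||embedding||^2 with respect to the audio is
    # proportional to W^T @ emb; FGSM keeps only its sign, scaled by eps.
    grad = weights.T @ emb
    return audio + eps * np.sign(grad)

audio = rng.standard_normal(16000) * 0.1          # 1 s of mock 16 kHz audio
W = rng.standard_normal((64, 16000)) / 16000       # toy "encoder" weights

protected = perturb(audio, W)

# The waveform barely changes (bounded by eps) ...
max_change = np.max(np.abs(protected - audio))
# ... but the toy embedding moves measurably, which is what would
# mislead a cloning model trained on such features.
emb_shift = np.linalg.norm(
    toy_speaker_embedding(protected, W) - toy_speaker_embedding(audio, W)
)
print(max_change, emb_shift)
```

The design point is the asymmetry: the change is capped sample-by-sample so a listener cannot hear it, yet it is aimed along the direction that moves the model's internal representation the most.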

This project builds on Zhang's earlier work on detecting and defeating unauthorized speech synthesis before it happens. Together, Zhang and the other two Voice Cloning Challenge winners, whose proposals focused on detection and authentication, demonstrate the range of approaches being developed to prevent harmful practices and protect consumers from malicious actors.