It is safe to assume that you’ve seen a funny deepfake video somewhere online, in which the faces and sometimes voices of celebrities are put into odd or amusing situations. The technology may sound like a fun game for memes and pranks, but try to imagine receiving a phone call from someone who sounds exactly like a family member, pleading for help.
The term deepfake, a blend of “deep learning” and “fake,” refers to a simulation powered by deep learning technology: AI that consumes enormous amounts of data to try to replicate something human, such as holding a conversation (as ChatGPT does) or creating an image or illustration (as Dall-E does).
According to Techxplore, cybersecurity experts say that deepfake technology has advanced to the point where it can be used in real time, enabling people to use someone’s face, voice and even movements in a call or virtual meeting. Furthermore, the technology is widely available and relatively easy to use, and it’s only getting better.
Real-time deepfakes have been used to scare grandparents into sending money to their simulated relatives, to influence voters’ opinions and to scam money out of lonely people looking for a connection.
“Thanks to AI tools that create ‘synthetic media’ or otherwise generate content, a growing percentage of what we’re looking at is not authentic, and it’s getting more difficult to tell the difference,” warned the Federal Trade Commission.
When asked about this, Andrew Gardner, vice president of research, innovation and AI at Gen, said that “we know we’re not prepared as a society” for this threat. More specifically, there are no readily available verification tools to turn to in the moment if you are being scammed. Such tools are emerging, but not fast enough, and the ones that exist are not always effective or accessible to the average person.
According to Yisroel Mirsky, an AI researcher and deepfake expert at Ben-Gurion University of the Negev, it’s possible to create a deepfake video from a single photo, and a “decent” clone of a voice from as little as three or four seconds of audio. The tools needed for that, however, are quite complex and expensive. According to Gardner, the widely available deepfake tools require about five minutes of audio and one to two hours of video.
Faced with this complicated problem, experts suggest a simple defense against deepfakes impersonating a family member: agree on a code or secret word that every member of the family knows and that is hard for outsiders to guess.
Ally Armeson, the program director of the Cybercrime Support Network, says that there are clues that can expose a video deepfake, such as the person blinking too much or too little, having eyebrows that don’t fit the face or hair in the wrong spot, and skin that doesn’t match their age.
She also urges people to ask their conversation partner to perform two simple tasks: turn their head around and put a hand in front of their face. She states that these actions can reveal a deepfake because the technology hasn’t been trained to render them realistically yet.
Mirsky and his team at Ben-Gurion University have developed a different approach, called D-CAPTCHA, which poses a test designed to confuse and reveal real-time deepfakes, for example by asking callers to hum, laugh, sing or just clear their throat.
On a more hopeful note, Gardner states that people’s current experiences with AI and apps like ChatGPT have made them quicker to question what is real and what is fake, and to look more critically at what they’re seeing.