New AI Tech Boosts Synthesized Speech and Voice Recognition

Sep 2, 2023

Developments in speech recognition technology from IBM and California universities offer hope for patients suffering from speech loss and vocal paralysis.

IBM reported the creation of a faster and more energy-efficient computer chip capable of turbo-charging speech-recognition model output.

With the explosive growth of large language models for AI projects, hardware limitations that lead to longer training periods and spiraling energy consumption have come to light.

IBM researchers seeking a solution say their prototype incorporates phase-change memory devices within the chip to speed up multiply–accumulate (MAC) operations, the fundamental calculations behind AI models. This design bypasses the standard time- and energy-consuming routine of transporting data between memory and processor.
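For readers unfamiliar with the term, a multiply–accumulate operation multiplies two numbers and adds the result to a running total. The short Python sketch below is purely illustrative: it shows how one layer of a neural network reduces to many such steps, and it says nothing about how IBM's chip carries them out in hardware. The function names and the sample numbers are assumptions made for this example.

```python
def mac_dot(weights, inputs):
    """Compute a dot product as a chain of multiply-accumulate steps."""
    acc = 0.0
    for w, x in zip(weights, inputs):
        acc += w * x  # one MAC: multiply, then add to the running sum
    return acc


def layer_output(weight_rows, inputs):
    """Apply one row of weights per output value (illustrative only)."""
    return [mac_dot(row, inputs) for row in weight_rows]


# Hypothetical weights and inputs, chosen only to make the arithmetic easy.
weights = [[1.0, 2.0, 3.0], [0.5, -1.0, 0.0]]
inputs = [2.0, 1.0, 0.5]
outputs = layer_output(weights, inputs)
print(outputs)
```

On a conventional processor, each weight in a loop like this has to be fetched from memory before it can be used, which is the data movement the prototype is designed to avoid.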

In processor-intensive speech recognition operations, IBM’s prototype achieved 12.4 trillion operations per second per watt, an efficiency level up to hundreds of times better than the most powerful CPUs and GPUs currently in use.

Meanwhile, researchers at UC San Francisco and UC Berkeley say they devised a brain-computer interface for people who have lost the ability to speak. The interface generates words from a user’s thoughts and efforts at vocalization. Edward Chang, chair of neurological surgery at UC San Francisco, said the team's goal is to restore a “full, embodied way of communicating, which is the most natural way for us to talk with others.”

According to Techxplore, Chang and his team implanted two tiny sensors on the surface of the brain of a woman suffering from ALS (a neurodegenerative disease that gradually robs a person of mobility and speech). The sensors were connected through a brain-computer interface to banks of computers housing language-decoding software.

The woman went through 25 training sessions in which she read sets of a few hundred sentences. Her brain activity was translated by the decoder, which detected phonemes and assembled them into words. Researchers then synthesized her speech based on a recording of her speaking at a wedding years earlier and designed an avatar that reflected her facial movements.

After four months of training, the model was able to track the subject’s attempted vocalizations and convert them into intelligible words. With a training vocabulary of 125,000 words, the accuracy rate was 76%.

The system was also able to translate the subject’s speech at a rate of 62 words per minute. Although that is an improvement over past experiments, it is still far from the 160-word-per-minute rate of natural speech.

Although it is not yet a device people can use in real life, this scientific proof of concept is a big step toward a world in which people with paralysis can speak once more.
