New systems improve voice recognition

Sep 9, 2013

Graduate students and researchers at the University of Texas at Dallas have developed novel systems that can identify a speaker's voice even under conditions that make it harder to recognize, such as whispering, emotional speech, or talking with a stuffy nose.

According to HLS News Wire, by improving the ability to recognize a voice under changing conditions, the research could be used in voice recognition applications such as signing into a bank account, entering locked rooms, logging onto a computer, or verifying online purchases.

A UT Dallas release reports that the researchers are working in the Center for Robust Speech Systems (CRSS) under the direction of Dr. John Hansen, associate dean for research in the Erik Jonsson School of Engineering and Computer Science.

Other researchers in the signal processing field are seeking out the group's solutions, which are built on algorithms and modeling techniques. The team has also won international competitions sponsored by the federal government and by the largest professional engineering organization in the world, the Institute of Electrical and Electronics Engineers (IEEE).

Last fall, CRSS lab work was recognized with top rankings in the National Institute of Standards and Technology Speaker Recognition Evaluation. Earlier this summer, the team earned the Best Paper Award at the 38th IEEE International Conference on Acoustics, Speech and Signal Processing.

Speaker verification works by accepting or rejecting the claim that a voice sample matches the speech of a particular person. The process can be complicated by background noise or by the type of microphone used. A speaker's voice can also change if the person hears background noise, is ill, or ages.
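The accept-or-reject decision described above can be illustrated with a minimal sketch. It assumes each voice has already been reduced to a short numeric feature vector, and it compares vectors by cosine similarity against a fixed threshold. The vectors, the threshold of 0.8, and the function names are hypothetical choices for illustration, not the method used by the CRSS team.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector carries no voice information
    return dot / (norm_a * norm_b)

def verify(enrolled, sample, threshold=0.8):
    """Accept (True) when the sample is similar enough to the enrolled voice."""
    return cosine_similarity(enrolled, sample) >= threshold

# Hypothetical feature vectors, made up for this example.
enrolled = [0.9, 0.1, 0.4]
same_speaker = [0.85, 0.15, 0.38]
other_speaker = [0.1, 0.9, 0.2]

print(verify(enrolled, same_speaker))   # close to the enrolled voice
print(verify(enrolled, other_speaker))  # far from the enrolled voice
```

Raising the threshold makes the system stricter, which rejects more impostors but also more genuine speakers whose voices have shifted because of noise, illness, or age.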

Speaker verification systems are often developed under ideal conditions: background noise is controlled, and the person is prepared for the recording or is reading a prepared text.

In a recent National Institute of Standards and Technology Speaker Recognition Evaluation challenge, officials sent about eighty million voice verification trials with added noise (natural background sounds or artificial, computer-generated sounds) to more than fifty universities, research labs and companies throughout the world. Teams had to determine whether speech recordings were from certain speakers or not.

UT Dallas's CRSS lab members had already been refining this process based on earlier research and participation in similar competitions. They had created algorithms that more efficiently converted acoustic waveforms into a form that a computer can process for pattern analysis. Their process also removed silences and background noise to make processing easier.
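As a rough illustration of the silence-removal step, the sketch below splits a signal into short frames and keeps only the frames whose average energy exceeds a threshold. This simple energy test is an assumed stand-in, not the CRSS algorithm, and the frame length, threshold, and sample values are hypothetical.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of audio samples."""
    return sum(s * s for s in frame) / len(frame)

def drop_silence(samples, frame_len=4, energy_threshold=0.01):
    """Return the samples from frames loud enough to count as speech."""
    kept = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        if frame and frame_energy(frame) >= energy_threshold:
            kept.extend(frame)
    return kept

# Hypothetical signal: a quiet frame, a loud frame, then another quiet frame.
signal = [0.0, 0.01, -0.01, 0.0,
          0.5, -0.4, 0.6, -0.5,
          0.0, 0.0, 0.01, 0.0]

voiced = drop_silence(signal)
print(voiced)  # only the loud middle frame survives
```

Real recordings would use much longer frames, and background noise that is louder than a fixed threshold would need a more careful detector than this one.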
