ODNI crowdsources speech-recognition software


The Intelligence Advanced Research Projects Activity (IARPA) wants to crowdsource ideas for software that can convert human speech recorded in noisy environments into text with fewer errors.

Speech-recognition software renders human speech into text that can then be used in a variety of ways. As any smartphone owner who uses Siri knows, however, sometimes the software misunderstands what’s being said or is overwhelmed by background noise.

Those are the types of problems that IARPA, the research arm of the Office of the Director of National Intelligence, wants to address in a new competition that seeks ideas from the public and from commercial providers on how to make speech-recognition software more reliable.

IARPA’s Automatic Speech in Reverberant Environments (ASpIRE) Challenge offers up to $50,000 in prizes for innovative ideas.

“We’re still ironing out the details surrounding the challenge kickoff date,” IARPA spokeswoman Schira Madan said in an email message to FCW. “Right now, the information on our website is meant to be a teaser ad.”

The ASpIRE Challenge announcement calls automatic speech-recognition software that works in a variety of acoustic environments and recording scenarios the “holy grail of the speech research community.” The project is a spin-off of the agency’s Babel program, which seeks to develop agile speech-recognition technology that can be applied to any language to help intelligence analysts rapidly search massive amounts of recorded speech.

IARPA plans to give ASpIRE participants access to 15 hours of multi-microphone speech recordings so they can tune their software entries. During the evaluation period, participants will receive 10 hours of new transcribed “far-field microphone data” from noisy, echoing rooms. IARPA will then evaluate the technology’s error rates in understanding the recordings.
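
The article does not say which error metric IARPA will use. A common measure in speech recognition is word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the system's output into the reference transcript, divided by the number of words in the reference. The sketch below is only an illustration under that assumption; the function name and the sample sentences are hypothetical and are not taken from the ASpIRE Challenge.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate of `hypothesis` against `reference`.

    Assumption: WER is the metric of interest; the article does not confirm this.
    Words are compared exactly after splitting on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# Hypothetical example: the recognizer drops one word out of six.
reference = "the cat sat on the mat"
hypothesis = "the cat sat on mat"
print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

Because insertions count as errors, WER can exceed 1.0 when the output contains many extra words, which is one reason reverberant, noisy recordings tend to score poorly.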