Security cameras are everywhere: in malls, stadiums, train stations, parking garages and airports. But with so much information flowing in, it can be challenging for the people monitoring activity in control rooms to catch every detail. And surprisingly, most mainstream video security technology lacks sound, color or both. Researchers at MITRE, a not-for-profit organization that operates research and development centers sponsored by the US federal government, have been exploring technology that will help spot patterns in crowds.
Users will be able to identify and prevent hazards related to public safety based upon real-time surveillance feeds and respond more rapidly and accurately to natural or man-made emergencies. The technology also could be used to conduct analysis for city planning such as deciding the locations of crosswalks and streetlights.
If all goes according to plan, the research will help detect, alert on and react to unusual activities in crowded, open public areas that are more difficult to monitor; warn intelligence agents or soldiers on the battlefield in real time and help them anticipate threatening events by associating people’s traits with a particular set of circumstances; and serve as an investigative tool to speed up the search for evidence of fraudulent activities and pinpoint it within an overwhelming amount of data.
Chongeun Lee is the principal investigator on the LinkBioMan technology project, part of MITRE’s internal research program. A team of researchers is contributing its expertise in video analytics, biometrics, machine learning, human language technology and computational auditory perception to create sensors that can spot irregularities in videos.
“The goal of the research is to create a decision framework that can process audio and video in real time and recommend a timely action—alert or no-alert—based upon the fusion of multimodal data,” Lee says. “The outcome is intended to be used for alerting operators … monitoring multiple feeds of the one that requires attention and action.”
Researchers decided to compare public protests and riots with concerts to determine the differences in alert context. Then they designed and implemented a decision framework, trained audio and video classifiers using existing open source tools, and developed an ontology for the selected test cases.
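To make the idea of fusing multimodal data into an alert or no-alert recommendation more concrete, here is a minimal Python sketch of one common approach, weighted late fusion of per-modality classifier scores. It is not MITRE's implementation: the `FeedScores` type, the function names, the weights and the threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FeedScores:
    """Hypothetical classifier outputs for one feed, each a probability in [0, 1]."""
    feed_id: str
    audio_unrest: float  # audio classifier's confidence that the scene is unrest
    video_unrest: float  # video classifier's confidence that the scene is unrest


def fuse(scores: FeedScores, audio_weight: float = 0.4, video_weight: float = 0.6) -> float:
    """Late fusion: weighted average of the two modality scores (weights are assumed)."""
    total = audio_weight + video_weight
    return (audio_weight * scores.audio_unrest + video_weight * scores.video_unrest) / total


def recommend(scores: FeedScores, threshold: float = 0.7) -> str:
    """Return "alert" when the fused score reaches the assumed threshold."""
    return "alert" if fuse(scores) >= threshold else "no-alert"


def feeds_needing_attention(feeds: list[FeedScores], threshold: float = 0.7) -> list[str]:
    """List the IDs of feeds an operator should look at, in input order."""
    return [f.feed_id for f in feeds if recommend(f, threshold) == "alert"]


if __name__ == "__main__":
    feeds = [
        FeedScores("concert-cam", audio_unrest=0.8, video_unrest=0.2),  # loud but calm
        FeedScores("plaza-cam", audio_unrest=0.9, video_unrest=0.85),   # both modalities agree
    ]
    for feed in feeds:
        print(feed.feed_id, recommend(feed))
```

With these assumed numbers, a concert feed with loud audio but calm video stays below the threshold, while a feed on which both classifiers agree is flagged for the operator. Requiring agreement across modalities is one simple way to separate the two alert contexts.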
According to afcea.org, the team has focused mainly on nonperson entities thus far. Next fiscal year it plans to add soft biometrics such as gender, clothes and hair color to crowd behaviors; implement user error feedback and correction; and augment the decision framework with temporal tracking of events, Lee says. With continued funding, the technology should be fully developed by the end of fiscal year 2019.
Transitioning LinkBioMan to commercial and government customers like the U.S. Defense Department and financial institutions won’t be a stretch. “We are intentionally building our system to be modular, flexible and tailorable for individual needs,” Lee emphasizes.