The intelligence community is collecting more data than ever. But does that mean the intelligence gleaned from these massive new stores of data is also getting better?
Intelligence officials are naturally a little tight-lipped about the types of capabilities at their disposal. But current and former officials agree the intelligence community’s foray into big data – using new tools to collect, process and sift through data on a massive scale – remains a work in progress.
For one thing, the intelligence community needs more data scientists and applications, according to John Custer, a retired Army major general and former director of intelligence for US Central Command, who is now the director of EMC’s federal division.
The kind of applications and tools necessary to sift through petabytes of data will also require a different breed of intelligence analyst, he said. “We’re talking about applications that have to be equal to your ‘Star Trek’ universal translator,” he said. “They have to speak multiple languages, and they have to query a host of different kinds of databases. This is graduate-level work.”
One of those applications is Hadoop, open-source software designed to store and process massive datasets across clusters of commodity servers, which is used by tech giants such as Facebook and Twitter.
“I guarantee you can talk to 99% of analysts and they have no idea what Hadoop can do for them; they don’t even know what Hadoop is,” Custer continued.
The National Security Agency has reportedly been an early adopter and robust user of Hadoop. But Custer said the intelligence community as a whole has only scratched the surface of its capability – and traditional intelligence analysts are still largely left in the dark.
Ellen McCarthy, chief operating officer of the National Geospatial-Intelligence Agency, acknowledged the intelligence community “may not be keeping pace with the private sector” when it comes to big data, but suggested the percentage of analysts familiar with Hadoop and other big-data analysis tools was much higher than 1 percent.