New, Efficient Data Collection Algorithm

Jul 29, 2014


When you can’t collect all the data you need, a new algorithm tells you which to target. The technique could help with two problems: data that is difficult to collect, and data that takes too long to process.

According to MIT News, much artificial-intelligence research addresses the problem of making predictions based on large data sets. An obvious example is the recommendation engines at retail sites like Amazon and Netflix.

But some types of data are harder to collect than online click histories: information about geological formations thousands of feet underground, for instance. And in other applications, such as trying to predict the path of a storm, there may just not be enough time to crunch all the available data.

Dan Levine, an MIT graduate student in aeronautics and astronautics, and his advisor, Jonathan How, the Richard Cockburn Maclaurin Professor of Aeronautics and Astronautics, have developed a new technique that could help with both problems. For a range of common applications in which data is either difficult to collect or too time-consuming to process, the technique can identify the subset of data items that will yield the most reliable predictions. So geologists trying to assess the extent of underground petroleum deposits, or meteorologists trying to forecast the weather, can make do with just a few, targeted measurements, saving time and money.

Levine and How consider the special case in which something about the relationships between data items is known in advance. Weather prediction provides an intuitive example: Measurements of temperature, pressure, and wind velocity at one location tend to be good indicators of measurements at adjacent locations, or of measurements at the same location a short time later, but the correlation grows weaker the farther out you move, either geographically or chronologically.


Such correlations can be represented by something called a probabilistic graphical model. In this context, a graph is a mathematical abstraction consisting of nodes — typically depicted as circles — and edges — typically depicted as line segments connecting nodes. A network diagram is one example of a graph; a family tree is another. In a probabilistic graphical model, the nodes represent variables, and the edges represent the strength of the correlations between them.
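To make the idea of nodes and weighted edges concrete, here is a minimal sketch in Python. The station names and correlation strengths are hypothetical values chosen for illustration; they do not come from the research.

```python
# Hypothetical toy model: three weather stations as nodes.
# Each edge carries an assumed correlation strength (illustrative only).
edges = {
    ("A", "B"): 0.9,
    ("B", "C"): 0.8,
    ("A", "C"): 0.5,
}

def neighbors(edges, node):
    """Return {neighbor: strength} for every edge that touches `node`."""
    result = {}
    for (u, v), strength in edges.items():
        if u == node:
            result[v] = strength
        elif v == node:
            result[u] = strength
    return result

print(neighbors(edges, "B"))
```

Storing edges as a dictionary keyed by node pairs keeps the sketch short; a real model would likely use a dedicated graph library.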

Levine and How developed an algorithm that can efficiently calculate just how much information any node in the graph gives you about any other — what in information theory is called “mutual information.” As Levine explains, one of the obstacles to performing that calculation efficiently is the presence of “loops” in the graph, or nodes that are connected by more than one path.
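As a rough illustration of what mutual information measures, the sketch below assumes two jointly Gaussian variables, for which mutual information has the standard closed form I = -0.5 * ln(1 - rho^2), where rho is their correlation. The Gaussian assumption and the function name are choices made for this example; the article does not say which formula the researchers' algorithm evaluates.

```python
import math

def gaussian_mutual_information(rho):
    """Mutual information, in nats, between two jointly Gaussian
    variables with correlation coefficient `rho` (assumed model)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return -0.5 * math.log(1.0 - rho * rho)

# Stronger correlation means one variable tells you more about the other.
for rho in (0.1, 0.5, 0.9):
    print(rho, gaussian_mutual_information(rho))
```

Uncorrelated variables give zero, and the value grows without bound as the correlation approaches 1.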

Calculating mutual information between nodes, Levine says, is kind of like injecting blue dye into one of them and then measuring the concentration of blue at the other. “It’s typically going to fall off as we go further out in the graph,” Levine says. “If there’s a unique path between them, then we can compute it pretty easily, because we know what path the blue dye will take. But if there are loops in the graph, then it’s harder for us to compute how blue other nodes are because there are many different paths.”
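The dye analogy can be sketched for the easy case of a unique path. The example assumes a Gaussian model on a tree, where the correlation between two nodes equals the product of the edge correlations along the path connecting them. The chain of nodes and its edge values are hypothetical, and this is not the researchers' algorithm, only an illustration of why a single path makes the calculation simple.

```python
import math

# Hypothetical chain-shaped tree A - B - C - D with assumed edge correlations.
tree = {
    "A": {"B": 0.9},
    "B": {"A": 0.9, "C": 0.8},
    "C": {"B": 0.8, "D": 0.7},
    "D": {"C": 0.7},
}

def path_correlation(tree, source, target):
    """Multiply edge correlations along the unique path in a tree."""
    stack = [(source, None, 1.0)]  # (node, parent, product so far)
    while stack:
        node, parent, product = stack.pop()
        if node == target:
            return product
        for nxt, rho in tree[node].items():
            if nxt != parent:  # in a tree, skipping the parent avoids revisits
                stack.append((nxt, node, product * rho))
    raise ValueError("target is not reachable from source")

def mutual_information(rho):
    """Gaussian mutual information in nats (assumed model)."""
    return -0.5 * math.log(1.0 - rho * rho)

# The "dye" fades as the target moves farther from the source node A.
for target in ("B", "C", "D"):
    rho = path_correlation(tree, "A", target)
    print(target, round(rho, 3), round(mutual_information(rho), 3))
```

With loops in the graph there is no single path to multiply along, which is the difficulty Levine describes.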

So the first step in the researchers’ technique is to calculate “spanning trees” for the graph. A tree is just a graph with no loops: In a family tree, for instance, a loop might mean that someone was both parent and sibling to the same person. A spanning tree is a tree that touches all of a graph’s nodes but dispenses with the edges that create loops.
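A spanning tree can be found in several ways. The sketch below uses a plain breadth-first search on a small hypothetical graph containing one loop; the article does not describe how the researchers construct their spanning trees, so this stands in only as a generic example.

```python
from collections import deque

# Hypothetical graph with one loop (A-B-C-A) plus a tail edge C-D.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}

def bfs_spanning_tree(graph, root):
    """Return the edges of a spanning tree of a connected graph,
    found by breadth-first search from `root`."""
    visited = {root}
    tree_edges = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in visited:  # keep only edges that reach a new node
                visited.add(nxt)
                tree_edges.append((node, nxt))
                queue.append(nxt)
    return tree_edges

tree_edges = bfs_spanning_tree(graph, "A")
print(tree_edges)
```

Starting from A, the search keeps the edges that first reach each node and drops the B-C edge that would close the loop, so every node is still touched but only one path remains between any pair.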
