Data storage is a major challenge for intelligence organizations. There are so many sources of intelligence data, ranging from satellite imagery to communications data, that the intelligence community is having difficulty figuring out where to store it all.

Currently, all important intelligence data is stored at data centers. Data centers are large facilities capable of storing large amounts of digital data. The problem with these facilities is that they are difficult to scale and expensive to maintain.

Now, however, the Intelligence Advanced Research Projects Activity (IARPA) believes it has found a solution to the data storage problem: synthetic DNA.

IARPA has recently launched the Molecular Information Storage (MIST) program to research and hopefully develop synthetic DNA capable of storing up to an exabyte of data, or one billion gigabytes. 

As part of the program, IARPA has awarded contracts to two teams working to develop a solution. The Molecular Encoding Consortium has been awarded up to $23 million and the Georgia Tech Research Institute has been awarded up to $25 million.

The vision is to develop technologies that shrink an exabyte of data into a system small enough to operate at desktop scale, all while reducing operating and maintenance costs.

“This would be a transformative capability for big data stakeholders in government and industry,” said David Markowitz, IARPA Program Manager.

If all goes to plan, the scientists working on the MIST program will develop new devices capable of writing data to and reading data from synthetic DNA. The goal is reportedly to make the technology commercially available within three to five years.

At its core, DNA is biological data. It stores information much as computers do: where computers encode data in binary code, DNA uses a code of nucleobases, sequences built from the four letters A, C, T, and G. To store data on synthetic DNA, digital files must therefore first be converted from binary code into sequences of those four nucleobases.
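To make the binary-to-nucleobase conversion concrete, here is a minimal Python sketch. The two-bits-per-base mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions chosen for illustration, not the encoding used by the MIST teams. Practical schemes typically also add error correction and avoid sequences that are hard to synthesize or read.

```python
# Illustrative 2-bits-per-base mapping (an assumption, not the MIST encoding).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_to_dna(data: bytes) -> str:
    """Convert bytes to a string of A/C/G/T, two bits per nucleobase."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_from_dna(sequence: str) -> bytes:
    """Convert a string of A/C/G/T back to the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"Hi"
strand = encode_to_dna(message)
print(strand)                              # CAGACGGC
print(decode_from_dna(strand) == message)  # True
```

Under this mapping each byte becomes four nucleobases, so the two-byte message "Hi" turns into an eight-base strand that decodes back to the original bytes.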

“With digital data growing at an exponential rate, there is increasing interest and excitement about using nature’s storage medium, DNA, to store digital data,” said Emily Leproust, CEO of Twist Bioscience, a company working alongside Georgia Tech on the MIST program. “We can truly revolutionize the DNA synthesis process, and reduce the cost of synthesis for DNA data storage by many orders of magnitude.”