Artificial intelligence systems may soon be able to identify who, or what, authored a specific text, a potentially game-changing capability for tracking disinformation campaigns and other malicious activity in online text forums.

That is the aim of a US Intelligence Advanced Research Projects Activity (IARPA) program dubbed HIATUS, short for Human Interpretable Attribution of Text Using Underlying Structure.

The program marks the US intelligence community’s (IC) latest research and development effort in human language technology.

The challenges IARPA aims to confront through HIATUS are complex. Large volumes of multilingual raw text are produced by anonymous authors, both human and machine, every day. Such text generally contains linguistic features that can be used to pinpoint who wrote it or, conversely, that can be altered to safeguard authors’ identities when attribution could put them in danger.

The proposed technique would reportedly work much like the way forensic experts currently determine someone’s identity from their handwriting. Just as people have small, individual idiosyncrasies in the way they form a word by hand, online authors have their own tells when composing sentences.

“Think about it as like your written fingerprint,” Dr. Timothy McKinnon, who heads the program, told defenseone.com. “The technology would be able to identify that fingerprint compared against a corpus of other documents, and match them up if they are from the same author,” he explained. “On the privacy side, what the technology would do is it would figure out ways that text could be modified so that it no longer looks like a person’s writing.”

“We’re looking to develop systems that can be robustly performant across diverse domains and genres of text—and also, there’s going to be foreign languages involved in the program as it progresses as well,” McKinnon said.
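
To make the “written fingerprint” idea concrete, here is a minimal Python sketch of one classic stylometric approach: building a character n-gram profile for each text and comparing profiles with cosine similarity. This is an illustration only, not a description of how HIATUS systems work; the function names, the choice of three-character n-grams, and the sample texts are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt
from typing import Mapping


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in lowercased, whitespace-normalized text."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    # Counter returns 0 for missing keys, so unseen n-grams contribute nothing.
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_likely_author(unknown_text: str, known_texts: Mapping[str, str]) -> str:
    """Return the candidate whose known writing is most similar to unknown_text."""
    unknown_profile = char_ngrams(unknown_text)
    scores = {
        author: cosine_similarity(unknown_profile, char_ngrams(sample))
        for author, sample in known_texts.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy corpus: one short writing sample per candidate author.
known = {
    "author_a": "I reckon the weather shall be dreadful tomorrow, shan't it?",
    "author_b": "lol the weather is gonna be so bad tmrw tbh",
}
guess = most_likely_author("tbh the traffic is gonna be so bad lol", known)
print(guess)
```

A toy like this only shows the matching step McKinnon describes, comparing one text against a corpus of known documents. It says nothing about the harder parts of the program, such as working across genres and languages, explaining the result in human-interpretable terms, or rewriting text so that it no longer matches its author.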