This post is also available in: עברית (Hebrew)
Members of Microsoft’s Threat Protection Intelligence Team have joined representatives of Intel Labs to create images out of malware samples that can be used to detect malicious code.
“As malware variants continue to grow, traditional signature-matching techniques cannot keep up. We looked to applying deep-learning techniques to avoid costly feature engineering and used machine-learning techniques to learn and build classification systems that can effectively identify malware program binaries,” according to Intel researchers.
How was this achieved? The researchers fed malware samples into a program that converts the data into grayscale images, using an approach called static malware-as-image network analysis (STAMINA). They then analyze the samples for structural patterns that can be used to distinguish between benign and malicious code, and then rank the malicious suspects into degree of threat.
The study relied on earlier work by Intel on deep transfer learning for static malware classification. Static analysis permits malware detection without having to execute code or monitor runtime behavior.
Drawing on Microsoft’s massive dataset of malware code collected through its Defender security system, the researchers say they achieved “high accuracy” in detecting malware and “low false positives.” With static analysis, most threats are detected before they are triggered.
The study consisted of three steps: image conversion, transfer learning, and evaluation. In a process that included pixel conversion and resizing, malware code drawn from 2.2 million infected files was converted into two-dimensional images. The next step used transfer learning to apply knowledge obtained about detected malware in one task to similarly structured unidentified code. The last step was evaluation.
The STAMINA program achieved an accuracy of more than 99 percent identifying and categorizing malware samples, with a false positives rate of 2.6 percent, according to techxplore.com.