A Hierarchical Convolutional Neural Network for Malware Classification

Jordi Planes, Carles Mateu, Daniel Gibert

Abstract

Malware detection and classification is a challenging problem and an active area of research. A particular challenge is how best to treat and preprocess malicious executables before feeding them to machine learning algorithms. Novel approaches in the literature treat an executable as a sequence of bytes or as a sequence of assembly language instructions. However, these approaches do not take the hierarchical structure of programs into consideration. An executable exhibits various levels of spatial correlation. Adjacent code instructions are usually spatially correlated, but that is not always the case: function calls and jump instructions transfer control of the program to a different point in the instruction stream. These discontinuities persist when the binary is treated as a sequence of byte values. In addition, functions can be arranged in any order, provided that addresses are correctly reorganized. To address these issues, we propose a Hierarchical Convolutional Network (HCN) for malware classification. It has two levels of convolutional blocks, applied at the mnemonic level and at the function level, which enables us to extract n-gram-like features from both levels when constructing the malware representation. We validate our HCN method on the dataset released for the Microsoft Malware Classification Challenge, where it outperforms almost every deep learning method in the literature.
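The two-level idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the weights are random and untrained, and the vocabulary, embedding size, filter counts, window widths, and the helper names `conv1d_maxpool`, `make_filters`, and `hcn_features` are all assumptions made for illustration. It only shows how n-gram-like filters over mnemonics can yield one vector per function, and how a second set of filters over those vectors can yield one vector per program.

```python
import random


def conv1d_maxpool(seq, filters, width):
    """Slide each filter over windows of `width` consecutive vectors and keep
    the largest ReLU response per filter (global max-pooling)."""
    dim = len(seq[0])
    # Zero-pad short sequences so that at least one window exists.
    if len(seq) < width:
        seq = seq + [[0.0] * dim] * (width - len(seq))
    pooled = []
    for weights in filters:  # each filter is a flat list of width * dim weights
        best = 0.0  # ReLU floor: negative activations never win
        for start in range(len(seq) - width + 1):
            window = [x for vec in seq[start:start + width] for x in vec]
            activation = sum(w * x for w, x in zip(weights, window))
            best = max(best, activation)
        pooled.append(best)
    return pooled


def make_filters(rng, count, size):
    """Random, untrained filters (a real model would learn these)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(size)] for _ in range(count)]


def hcn_features(program, embed, mnemonic_filters, function_filters, n_mnem, n_func):
    """Program = list of functions; function = non-empty list of mnemonics."""
    # Level 1: mnemonic n-gram filters produce one vector per function.
    function_vectors = [
        conv1d_maxpool([embed[m] for m in func], mnemonic_filters, n_mnem)
        for func in program
    ]
    # Level 2: function n-gram filters produce one vector per program.
    return conv1d_maxpool(function_vectors, function_filters, n_func)


# Hypothetical toy setup: 5 mnemonics, 4-dim embeddings, 6 and 8 filters.
rng = random.Random(0)
vocab = ["push", "mov", "call", "jmp", "ret"]
embed = {m: [rng.uniform(-1.0, 1.0) for _ in range(4)] for m in vocab}
mnemonic_filters = make_filters(rng, count=6, size=3 * 4)  # trigrams of mnemonics
function_filters = make_filters(rng, count=8, size=2 * 6)  # bigrams of functions

program = [
    ["push", "mov", "call", "ret"],
    ["mov", "jmp"],
    ["push", "push", "call", "mov", "ret"],
]
features = hcn_features(program, embed, mnemonic_filters, function_filters, 3, 2)
print(len(features))  # one value per function-level filter
```

In a trained classifier the program-level vector would typically feed a final classification layer; that part is omitted here because the abstract does not describe it.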

Benchmarks

| Benchmark | Methodology | Accuracy (10-fold) | LogLoss | Macro F1 (10-fold) |
|---|---|---|---|---|
| malware-classification-on-microsoft-malware | CNN+BiLSTM | 0.9820 | 0.0744 | 0.9605 |
| malware-classification-on-microsoft-malware | Hierarchical Attention Network | 0.9742 | 0.0933 | 0.9468 |
| malware-classification-on-microsoft-malware | MalConv | 0.9641 | 0.3071 | 0.8902 |
| malware-classification-on-microsoft-malware | DeepConv | 0.9756 | 0.1602 | 0.9071 |
| malware-classification-on-microsoft-malware | Hierarchical Convolutional Network | 0.9913 | 0.0419 | 0.9830 |
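For readers unfamiliar with the three metrics reported above, the following sketch computes them with their standard multi-class definitions in plain Python. The function names and the toy labels are assumptions for illustration only. The page does not say how scores are aggregated across the 10 folds, so the sketch scores a single set of predictions.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class.
    Probabilities are clipped to [eps, 1 - eps] to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, probs):
        total -= math.log(min(max(p[t], eps), 1.0 - eps))
    return total / len(y_true)


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


# Hypothetical toy predictions over three classes.
y_true = [0, 1, 2, 2]
y_pred = [0, 1, 2, 1]
probs = [
    [0.5, 0.25, 0.25],
    [0.25, 0.5, 0.25],
    [0.25, 0.25, 0.5],
    [0.25, 0.25, 0.5],
]

print(accuracy(y_true, y_pred))     # 0.75
print(log_loss(y_true, probs))      # ln(2), about 0.6931
print(macro_f1(y_true, y_pred, 3))  # 7/9, about 0.7778
```

Lower is better for LogLoss, while higher is better for Accuracy and Macro F1. Macro F1 weights every class equally, so it penalizes poor performance on rare classes more than accuracy does.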
