
Using Convolutional Neural Networks for Classification of Malware represented as Images

Daniel Gibert, Carles Mateu, Jordi Planes, Ramon Vicens

Abstract

Millions of malicious files are detected every year. One of the main reasons for this volume is that malware authors add mutations to evade detection: malicious files belonging to the same family, with the same malicious behavior, are constantly modified or obfuscated using several techniques so that they look like different files. To analyze and classify such large numbers of files effectively, we need to be able to categorize them into groups and identify their respective families on the basis of their behavior. In this paper, malicious software is visualized as gray-scale images, since this representation captures minor changes while retaining the global structure, which helps to detect variations. Motivated by the visual similarity between malware samples of the same family, we propose a file-agnostic deep learning approach for malware categorization that efficiently groups malicious software into families based on a set of discriminant patterns extracted from their visualization as images. The suitability of our approach is evaluated against two benchmarks: the MalImg dataset and the Microsoft Malware Classification Challenge dataset. Experimental comparison demonstrates its superior performance with respect to state-of-the-art techniques.
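The visualization step mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each byte of the file becomes one pixel intensity (0 to 255) and that rows have a fixed width. The function name `bytes_to_grayscale`, the default width of 256, and the zero padding of the last row are all assumptions made for the example; the abstract does not specify them.

```python
def bytes_to_grayscale(data: bytes, width: int = 256) -> list[list[int]]:
    """Lay out raw bytes as rows of `width` pixels, one byte per pixel (0-255).

    The width and the zero padding of the final row are illustrative
    assumptions, not values taken from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Pad with zero bytes so the length is a multiple of the row width.
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    # Slice the padded byte string into consecutive rows.
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


if __name__ == "__main__":
    # Ten bytes at width 4 give three rows, the last one padded with zeros.
    image = bytes_to_grayscale(bytes(range(10)), width=4)
    for row in image:
        print(row)
```

In practice the rows would usually be held in an array type and resized to a fixed shape before being fed to a convolutional network; those steps are left out here because the abstract gives no details about them.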

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| malware-classification-on-malimg-dataset | Gray-scale IMG CNN | Accuracy (10-fold): 0.9848; Macro F1 (10-fold): 0.9580 |
| malware-classification-on-microsoft-malware | LBP features + XGBoost | Accuracy (5-fold): 0.951 |
| malware-classification-on-microsoft-malware | Haralick features + XGBoost | Accuracy (5-fold): 0.9550 |
| malware-classification-on-microsoft-malware | Gray-scale IMG CNN | Accuracy (10-fold): 0.9750; Accuracy (5-fold): 0.973; LogLoss: 0.184483; Macro F1 (10-fold): 0.9400 |
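The two classification metrics reported above, accuracy and macro F1, can be computed as in the sketch below. It uses the standard definitions and is not taken from the paper's evaluation code; the function names and the toy labels are hypothetical. Macro F1 averages the per-class F1 scores with equal weight per class, so it can come out lower than accuracy when some families are classified less reliably than others.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (standard definition)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy labels for two hypothetical families "a" and "b".
    y_true = ["a", "a", "a", "b"]
    y_pred = ["a", "a", "b", "b"]
    print(accuracy(y_true, y_pred))
    print(macro_f1(y_true, y_pred))
```

For a k-fold figure such as those in the table, these functions would be applied per fold; how the folds are combined into a single number is not stated on this page.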
