English Abstract |
In the face of advanced persistent threats (APTs), malware classification is one of the promising solutions in the field of digital forensics. In the previous literature, researchers performed dynamic analysis, or static analysis after reverse engineering. On the other hand, malware developers use anti-VM and obfuscation techniques to evade malware classifiers. Honeypots are increasingly deployed throughout different networks, and the malware source code they collect is left unclassified. Source code analysis provides a better classification for forensics. In this paper, a novel classification approach is proposed, based on logic similarity and directory structure similarity. A hierarchical clustering algorithm finds the best-fitting class for each test sample and creates a new class if none fits well. New types of malware can thus be identified and then analyzed further. Such classification avoids reanalyzing known malware and frees resources for new malware. The experimental results demonstrate that the proposed system can classify malware effectively with a small misclassification ratio. |