英文摘要 |
Malicious code threatens the safety of computer systems. Researching malicious code design techniques and mastering code behavior patterns are the basic work of network security prevention. With the game of network offense and defense, malicious code shows the characteristics of invisibility, polymorphism, and multi-dismutation. How to correctly and effectively understand malicious code and extract the key malicious features is the main goal of malicious code detection technology. As an important method of program understanding, program slicing is used to analyze the program code by using the idea of “decomposition”, and then extract the code fragments that the analyst is interested in. In recent years, data mining and machine learning techniques have been applied to the field of malicious code detection. The reason why it has become the focus of research is that it can use data mining to dig out meaningful patterns from a large amount of existing code data. Machine learning can It helps to summarize the identification knowledge of known malicious code, so as to conduct similarity search and help find unknown malicious code. The machine learning heuristic malicious code detection method firstly needs to automatically or manually extract the structure, function and behavior characteristics of the malicious code, so we can first slice the malicious code and then perform the detection. Through the improvement of the classic program slicing algorithm, this paper effectively improves the slicing problem between binary code processes. At the same time, it implements a malicious code detection system. The machine code byte sequence variablelength N-gram is used as the feature extraction method to further prove that The efficiency and accuracy of malicious code detection technology based on data mining and machine learning. |