Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Virus Pattern Recognition Using Self-Organization Map.

Presentation transcript:

Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Virus Pattern Recognition Using Self-Organization Map Advisor : Dr. Hsu Presenter : Chih-Ling Wang Authors : InSeon Yoo 2003 IEEE.

2 Outline
Motivation
Objective
Introduction
Windows executable file format & virus location
Visualizing virus-infected Windows executable files using SOM
Example cases of virus visualization
Virus recognition
Conclusions
Personal Opinion

3 Motivation
An examination of virus trends suggests that viruses currently spread mainly as file worms sent via e-mail. Nonetheless, the original method of virus infection cannot be avoided.

4 Objective
In this paper I argue that virus code strongly affects the overall projection of an executable file. Without virus signatures, the SOM projection shows what a virus-infected executable file looks like.

5 Introduction
Classic virus-detection techniques look for a virus-specific sequence of instructions, called a virus signature, inside the program: if the signature is found, the program is very probably infected. But how can we recognize viruses without knowing their signatures?

6 Introduction (cont.)
I have used the SOM so that neurons flag the presence of peculiar patterns in Windows executable files, and so that the position of the active neurons reflects the position of potentially malicious content in the file. I argue that every virus has its own distinguishing character, which it cannot hide in the SOM projection.

7 Introduction (cont.)
I have found a distinctive virus sign, pattern, or special feature in the SOM projection. In this paper I call this a virus mask.

8 Windows executable file format & virus location
Figure 1 shows the actual data of a virus-infected file. Most files have a DOS stub of the same size (128 bytes); the other parts vary in size. Apart from the virus code, only the PE header part is filled with a quite similar pattern.
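
The file layout on this slide can be probed directly. The following is a minimal sketch, not the author's code. It assumes the standard PE layout, in which the file starts with the "MZ" magic and the 4-byte little-endian field at offset 0x3C holds the offset of the "PE\0\0" signature. The names locate_pe_header and make_synthetic_image are hypothetical, and the synthetic image places the PE header at offset 128 only to match the size quoted on the slide.

```python
import struct

def locate_pe_header(data: bytes) -> int:
    """Return the offset of the PE signature, or raise ValueError."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # 4-byte little-endian offset of the PE header, stored at 0x3C
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    return pe_offset

def make_synthetic_image(pe_offset: int = 128) -> bytes:
    """Build a fake image: zero-filled DOS part up to pe_offset, then the PE signature."""
    buf = bytearray(pe_offset + 4)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, pe_offset)
    buf[pe_offset:pe_offset + 4] = b"PE\x00\x00"
    return bytes(buf)
```

Calling locate_pe_header on the bytes of a real executable would show where the DOS part ends and the PE header begins, which is the boundary the slide refers to.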

9 Windows executable file format & virus location (cont.)
Table I lists information about our test data files. As Figure 1 shows, the character features of the virus part differ from those of the rest of the program code; that is, program code and inserted virus code have different features.

10 Visualizing virus-infected Windows executable files using SOM
The SOM creates a topological mapping by adjusting not only the winner's weights but also the weights of the adjacent output units in the winner's neighborhood.
So not only is the winner adjusted; the whole neighborhood of output units is moved closer to the input pattern.
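
The update described on this slide can be sketched as a single training step. This is an illustrative sketch under stated assumptions, not the implementation used in the paper: the Gaussian neighborhood function, the rectangular grid, and the names best_matching_unit and train_step are all assumptions. The map is a nested list weights[row][col] of weight vectors.

```python
import math

def best_matching_unit(weights, x):
    """Return (row, col) of the unit whose weight vector is closest to x."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_dist:
                best, best_dist = (r, c), d
    return best

def train_step(weights, x, learning_rate, radius):
    """Move the winner and its grid neighbours toward x, in place."""
    br, bc = best_matching_unit(weights, x)
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            grid_dist_sq = (r - br) ** 2 + (c - bc) ** 2
            # Gaussian neighbourhood: 1.0 at the winner, smaller farther away
            h = math.exp(-grid_dist_sq / (2.0 * radius ** 2))
            for i in range(len(w)):
                w[i] += learning_rate * h * (x[i] - w[i])
    return br, bc
```

The winner receives the full update, and units farther away on the grid receive a smaller share of it, so the whole neighborhood drifts toward the input pattern.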

11 Visualizing virus-infected Windows executable files using SOM (cont.)
As training progresses, the size of the neighborhood around the winning unit decreases. I show that the virus parts of virus-infected files can be weighted differently, and therefore visualized differently, by the SOM.
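
The slide does not say how the neighborhood shrinks. One common choice, assumed here purely for illustration, is an exponential decay from a start radius to an end radius; the function name neighbourhood_radius and its parameters are hypothetical.

```python
def neighbourhood_radius(step, total_steps, start_radius, end_radius):
    """Exponentially interpolate from start_radius (step 0) to end_radius (last step)."""
    if total_steps <= 1:
        return end_radius
    fraction = step / (total_steps - 1)
    return start_radius * (end_radius / start_radius) ** fraction
```

With a schedule like this, early steps reorganize large regions of the map and late steps only fine-tune units near the winner.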

12 Visualizing virus-infected Windows executable files using SOM (cont.)
A. Initialization
I arranged each test data set as a table. Each row of the table is one data sample, and the columns are the variables of the data set. Every sample has the same set of variables.

13 Visualizing virus-infected Windows executable files using SOM (cont.)
B. Creation
I use the SOM normalization function to normalize the data to the range 0 to 1. Creating and initializing a SOM requires choosing the initialization method, map size, training algorithm, and training type. The resulting SOM figure may depend on the initialization, which means the virus mask may appear in a different place.
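
The slide does not name the normalization function. A minimal sketch of one way to bring data into the range 0 to 1 is per-column min-max scaling, which is assumed here; the function name normalize_columns is hypothetical, and the table is row-major (one sample per row), as on the Initialization slide.

```python
def normalize_columns(table):
    """Scale every column of a row-major table to the range [0, 1]."""
    columns = list(zip(*table))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in table:
        new_row = []
        for value, lo, hi in zip(row, lows, highs):
            # A constant column carries no information; map it to 0.0
            new_row.append(0.0 if hi == lo else (value - lo) / (hi - lo))
        normalized.append(new_row)
    return normalized
```

Scaling each variable separately keeps a variable with a large numeric range from dominating the distance computation during training.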

14 Visualizing virus-infected Windows executable files using SOM (cont.)
C. Visualization
The unified distance matrix, or u-matrix, is a method of displaying SOMs. To generate a u-matrix, a distance matrix between the reference vectors of adjacent neurons of the two-dimensional map is first formed. Then a representation for the matrix is selected. The colors in the figure have been selected so that the lighter the color between two neurons, the smaller the relative distance between them.
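
The distance computation behind a u-matrix can be sketched as follows. This is a simplified variant, assumed for illustration: it stores one value per neuron (the mean distance to its 4-connected grid neighbors) instead of separate cells between neurons, and the function name u_matrix is hypothetical.

```python
import math

def u_matrix(weights):
    """Return a grid holding, for each unit, the mean Euclidean distance
    between its weight vector and those of its 4-connected neighbours."""
    rows, cols = len(weights), len(weights[0])
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(weights[r][c], weights[nr][nc]))
            result[r][c] = sum(dists) / len(dists) if dists else 0.0
    return result
```

Rendering small values in light colors and large values in dark colors would give the convention described on the slide, where lighter areas mark neurons that lie close together.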

15 Example cases of virus visualization
A. Win95.CIH Virus
Figure 3 shows the test Windows executable files before infection by the Win95.CIH virus. Figure 4 shows the trained SOMs of the test Windows executable files infected with Win95.CIH 1.2, 1.3, and 1.4. Each Win95.CIH/Chernobyl virus has an obvious location in the upper centre.

16 Example cases of virus visualization (cont.)
We trained the SOM with labels categorized by the given names DS, PE, PR, and VS. Figure 5 shows the resulting projection with labels.

17 Example cases of virus visualization (cont.)
As a result, although each Windows executable file is different, the SOM projections of CIH virus-infected files look similar and share the same kind of virus projection map. I call this similar spot in each virus-infected file a virus mask.

18 Example cases of virus visualization (cont.)
B. Win95.Boza Virus
Figure 6 shows the Boza virus; most of the lighter-colored area in the upper centre contains virus code. I made another projection with labels, shown in Figure 7.

19 Example cases of virus visualization (cont.)
As Figure 5 and Figure 7 show, two parts have smaller distances than the others: PE (NewEXE header part) and VS (virus code part). This tells us that the code in PE and VS forms close neighborhoods.

20 Example cases of virus visualization (cont.)
C. Win32.apparition Virus
Figure 8 shows the projection of the Win32.apparition virus and a second projection with labels for the distribution. Because the code of this virus has an unusual structure, its distribution is quite similar to that of the program code and data parts.

21 Example cases of virus visualization (cont.)
D. Win32.HLLP.Semisoft Virus
Figure 9 shows the projection of the Win32.HLLP.Semisoft virus and a second projection with labels for the distribution. The SOM of the Win32.HLLP.Semisoft virus-infected file also has a virus mask, and most of the smaller-likelihood part is virus code.

22 Virus recognition
Across all the virus masks seen in the trained SOMs, the virus mask is a kind of virus spot with smaller-likelihood data. Figure 10 illustrates the SOM u-matrix result: (a) neighboring neurons have similar weight vectors in the u-matrix; (b) these weight vectors represent virus spots, and I convert each virus spot into the single character 'S'. Each column can then be treated as one string: we search for certain patterns in each string and then compare each string with the following strings.
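
The string encoding on this slide can be illustrated with a small sketch. The slide gives no detail on how a spot is decided or which patterns are searched, so everything below is an assumption made for illustration: a cell counts as a virus spot when its u-matrix value is below a threshold, non-spot cells are written as '.', and a match is a run of at least min_run 'S' characters at the same rows in two adjacent columns. The names column_strings and find_mask_columns are hypothetical.

```python
def column_strings(umatrix, threshold):
    """Encode each column as a string: 'S' for a low-distance cell, '.' otherwise."""
    rows, cols = len(umatrix), len(umatrix[0])
    return [
        "".join("S" if umatrix[r][c] < threshold else "." for r in range(rows))
        for c in range(cols)
    ]

def find_mask_columns(strings, min_run=2):
    """Return indices c where columns c and c + 1 both contain a run of
    at least min_run 'S' characters at the same row position."""
    pattern = "S" * min_run
    hits = []
    for c in range(len(strings) - 1):
        left, right = strings[c], strings[c + 1]
        for start in range(len(left) - min_run + 1):
            window = slice(start, start + min_run)
            if left[window] == pattern and right[window] == pattern:
                hits.append(c)
                break
    return hits
```

Comparing each column string with the next one is a cheap way to check that the spots form a contiguous two-dimensional region and are not isolated cells.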

23 Virus recognition (cont.)
I implemented a virus detector and tested it on all the virus-infected executable files mentioned above.

24 Virus recognition (cont.)
This SOM-based virus detector can detect unknown viruses as well. All known viruses have virus masks. By detecting these virus masks, the degree of belief that unknown viruses can be detected increases.

25 Conclusions
Because virus-infected files contain virus masks, we can detect virus-infected files through these masks. This research can be applied in anti-virus detection programs that lack virus signature knowledge, especially for new, unknown viruses.

26 Personal Opinion
Many concepts in this paper are not clear enough, so some of the details the author wants to convey cannot be understood.