
Virus Pattern Recognition Using Self-Organization Map

Virus Pattern Recognition Using Self-Organization Map. Advisor : Dr. Hsu Presenter : Chih-Ling Wang Authors : InSeon Yoo. 2003 IEEE. Outline. Motivation Objective Introduction Windows executable file format&virus location


Presentation Transcript


  1. Virus Pattern Recognition Using Self-Organization Map • Advisor: Dr. Hsu • Presenter: Chih-Ling Wang • Authors: InSeon Yoo • 2003 IEEE

  2. Outline • Motivation • Objective • Introduction • Windows executable file format & virus location • Visualizing virus-infected Windows executable files using SOM • Example cases of virus visualization • Virus recognition • Conclusions • Personal Opinion

  3. Motivation • Examining trends in email viruses suggests that email viruses currently spread mainly as file worms sent via email. • Nonetheless, the original method of virus infection still cannot be avoided.

  4. Objective • In this paper I show that virus code has a strong effect on the overall projection of an executable file. • Without virus signatures, the SOM projection shows us what a virus-infected executable file looks like.

  5. Introduction • Classic virus-detection techniques look for the presence of a virus-specific sequence of instructions, called a virus signature, inside the program: if the signature is found, it is highly probable that the program is infected. • However, without knowing virus signatures, how can we recognize viruses?

  6. Introduction (cont.) • I have used the SOM in such a way that neurons flag the presence of peculiar patterns in Windows executable files, and the position of the active neurons reflects the position of potentially malicious content in the file. • I argue that every virus has its own distinguishing character. • Viruses cannot hide these features in the SOM projection.

  7. Introduction (cont.) • I have found a distinctive virus sign/pattern, or special feature, in the SOM projection. • In this paper I call this a virus mask.

  8. Windows executable file format & virus location • The actual data of a virus-infected file is shown in Figure 1. • Most files have a DOS stub of the same size (128 bytes); the other parts vary in size. • Apart from the virus code, only the PE header part is filled with quite similar patterns.

  9. Windows executable file format & virus location (cont.) • Table I lists information about our test data files. • As Figure 1 shows, the character of the virus part differs from that of the other program code; that is, program code and inserted virus code have different features.
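
To make the file layout on the two slides above concrete, here is a minimal Python sketch that finds where the PE header starts in a Windows executable. It relies only on well-known facts of the format: the file begins with the magic bytes `MZ`, and the 4-byte little-endian field at offset 0x3C points to the `PE\0\0` signature. The helper `build_minimal_image` is hypothetical and only fabricates a header-only byte string for demonstration; its default offset of 128 bytes is an assumption chosen to match the stub size quoted on the slide, not something taken from the paper's test files.

```python
import struct

def locate_pe_header(data: bytes) -> int:
    """Return the file offset of the PE signature, or raise ValueError."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew: 4-byte little-endian offset of the PE header, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    return pe_offset

def build_minimal_image(pe_offset: int = 0x80) -> bytes:
    """Hypothetical helper: synthetic header-only bytes, not a runnable program."""
    image = bytearray(pe_offset + 4)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, pe_offset)
    image[pe_offset:pe_offset + 4] = b"PE\x00\x00"
    return bytes(image)

if __name__ == "__main__":
    print(hex(locate_pe_header(build_minimal_image())))
```

Everything before the returned offset is the DOS header and stub; everything after it is the PE header, followed by the program sections and, in an infected file, the inserted virus code.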

  10. Visualizing virus-infected Windows executable files using SOM • The SOM creates a topological mapping by adjusting not only the winner’s weights but also the weights of the adjacent output units in the neighborhood of the winner. • So not only does the winner get adjusted, but the whole neighborhood of output units is moved closer to the input pattern.

  11. Visualizing virus-infected Windows executable files using SOM (cont.) • As training progresses, the size of the neighborhood around the winning unit decreases. • I show that the virus parts of virus-infected files are weighted differently and can therefore be visualized differently by the SOM.
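
The update rule described on the two slides above (move the winner and its neighbors toward the input, and shrink the neighborhood over time) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: the Gaussian neighborhood, the linear decay of learning rate and radius, and all parameter names and defaults (`epochs`, `lr0`, `seed`) are choices made here for the example.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return (row, col) of the unit whose weight vector is closest to x."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best

def train_som(samples, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a rows x cols SOM on samples whose values lie in [0, 1]."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # learning rate decays
        radius = max(radius0 * (1.0 - frac), 0.5)    # neighborhood shrinks
        for x in samples:
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighborhood: 1 at the winner, smaller farther away.
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * radius ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
    return weights

if __name__ == "__main__":
    data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
    som = train_som(data, rows=3, cols=3)
    print(best_matching_unit(som, [0.0, 0.0]), best_matching_unit(som, [1.0, 1.0]))
```

Because the winner receives the full update and distant units only a fraction of it, inputs with different character end up mapped to different regions of the grid, which is the property the slides rely on to separate virus code from the rest of the file.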

  12. Visualizing virus-infected Windows executable files using SOM (cont.) A. Initialization • I arranged each test data set as a table. • Each row of the table is one data sample. • The columns of the table are the variables of the data set. • Every sample has the same set of variables.

  13. Visualizing virus-infected Windows executable files using SOM (cont.) B. Creation • I use the SOM normalization function to normalize the data between 0 and 1. • To create and initialize a SOM, we need to choose the initialization method, the map size, the training algorithm, and the training type. • The resulting SOM figure may depend on the initialization, which means the virus mask may appear in a different place.
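
The table layout and the normalization to the range 0 to 1 described on the two slides above can be sketched as follows. The slides do not say how file bytes are mapped to table rows, so `bytes_to_table` and its `width` parameter are hypothetical choices made for this example; `normalize_columns` is a plain per-column min-max scaling, assumed here to stand in for the SOM normalization function the author mentions.

```python
def bytes_to_table(data: bytes, width: int = 16):
    """Split raw bytes into rows of `width` values, zero-padding the last row.

    Assumption for illustration: one row is one sample, one column one variable.
    """
    rows = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        rows.append([float(b) for b in chunk] + [0.0] * (width - len(chunk)))
    return rows

def normalize_columns(table):
    """Min-max scale every column to [0, 1]; constant columns become 0.0."""
    columns = list(zip(*table))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in table
    ]

if __name__ == "__main__":
    table = bytes_to_table(bytes([0, 10, 5, 20, 10, 30]), width=2)
    print(normalize_columns(table))
```

Scaling each variable separately keeps columns with large byte values from dominating the distance computation during training.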

  14. Visualizing virus-infected Windows executable files using SOM (cont.) C. Visualization • The unified distance matrix, or u-matrix, is a method of displaying SOMs. • First, to generate a u-matrix, a matrix of distances between the reference vectors of adjacent neurons of the two-dimensional map is formed. • Then a representation for the matrix is selected. • The colors in the figure have been selected so that the lighter the color between two neurons, the smaller the relative distance between them.
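
The distance-matrix step described above can be sketched in Python. This is a simplified variant, assumed here for brevity: instead of storing a separate value between every pair of adjacent neurons, it stores for each neuron the average Euclidean distance to its up, down, left, and right neighbors. The function name `u_matrix` and the nested-list weight layout are choices made for this example.

```python
import math

def u_matrix(weights):
    """Average distance from each unit's weight vector to its 4-neighbors.

    `weights[r][c]` is the reference vector of the unit at grid position (r, c).
    Small values mean the unit is close to its neighbors (a light cell in the
    slide's color scheme); large values mark borders between regions.
    """
    rows, cols = len(weights), len(weights[0])
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(weights[r][c], weights[nr][nc]))
            result[r][c] = sum(dists) / len(dists) if dists else 0.0
    return result

if __name__ == "__main__":
    print(u_matrix([[[0.0], [0.0], [1.0]]]))
```

Choosing the representation then amounts to mapping these values to colors, with low values drawn lighter.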

  15. Example cases of virus visualization A. Win95.CIH Virus • Figure 3 shows the test Windows executable files before infection by the Win95.CIH virus. • Figure 4 shows the trained SOMs of the test Windows executable files infected with Win95.CIH 1.2, 1.3, and 1.4. • Each Win95.CIH/Chernobyl virus occupies an obvious location in the upper centre of the map.

  16. Example cases of virus visualization (cont.) • We trained the SOM with labels that categorize the data by given names, such as DS, PE, PR, and VS. • The resulting projection with labels is shown in Figure 5.

  17. Example cases of virus visualization (cont.) • As a result, although each Windows executable file is different, the SOM projections of CIH virus-infected files look similar and share the same kind of virus projection map. • I call this similar spot in each virus-infected file a virus mask.

  18. Example cases of virus visualization (cont.) B. Win95.Boza Virus • Figure 6 shows the Boza virus; most of the lighter-colored area in the upper centre contains virus code. • I made another projection with labels, shown in Figure 7.

  19. Example cases of virus visualization (cont.) • As Figures 5 and 7 show, two parts have smaller distances than the others: PE (NewEXE header part) and VS (virus code part). • This tells us that the code in PE and in VS forms close neighborhoods.

  20. Example cases of virus visualization (cont.) C. Win32.apparition Virus • Figure 8 shows the projection of the Win32.apparition virus and a second projection with labels showing the distribution. • Because the code part of this virus has an unusual structure, its distribution is quite similar to that of the program code and data parts.

  21. Example cases of virus visualization (cont.) D. Win32.HLLP.Semisoft Virus • Figure 9 shows the projection of the Win32.HLLP.Semisoft virus and a second projection with labels showing the distribution. • The SOM of the Win32.HLLP.Semisoft virus-infected file also has a virus mask, and most of the smaller-likelihood part is virus code.

  22. Virus recognition • Looking at all the virus masks obtained by training SOMs, a virus mask is a kind of virus spot made up of smaller-likelihood data. • I illustrate the resulting SOM u-matrix in Figure 10. • (a) Neighboring neurons have similar weight vectors in the u-matrix. • (b) These weight vectors represent virus spots, and I convert each virus spot into the character ’S’. • We can turn each column into a string, search for certain patterns in each string, and then compare each string with the following strings.
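
The string-based search outlined above can be sketched in Python. The slides do not give the threshold that defines a virus spot, the pattern being searched for, or the rule for comparing neighboring strings, so everything specific here is an assumption for illustration: cells below `threshold` become 'S', the pattern is a run of at least `min_run` consecutive 'S' characters, and two neighboring columns match when their runs overlap in row range. All function names are hypothetical.

```python
import re

def columns_to_strings(umatrix, threshold):
    """Encode each u-matrix column as a string: 'S' for a spot cell, '.' otherwise."""
    rows, cols = len(umatrix), len(umatrix[0])
    return [
        "".join("S" if umatrix[r][c] < threshold else "." for r in range(rows))
        for c in range(cols)
    ]

def find_spot_columns(strings, min_run=2):
    """Return (column, start_row, length) for the first 'S' run in each column."""
    pattern = re.compile("S{%d,}" % min_run)
    hits = []
    for index, text in enumerate(strings):
        match = pattern.search(text)
        if match:
            hits.append((index, match.start(), match.end() - match.start()))
    return hits

def spots_in_adjacent_columns(hits):
    """Pair up runs in neighboring columns whose row ranges overlap."""
    pairs = []
    for (c1, s1, n1), (c2, s2, n2) in zip(hits, hits[1:]):
        if c2 == c1 + 1 and s1 < s2 + n2 and s2 < s1 + n1:
            pairs.append((c1, c2))
    return pairs

if __name__ == "__main__":
    umat = [[0.1, 0.1, 0.9],
            [0.1, 0.2, 0.9],
            [0.9, 0.1, 0.1]]
    strings = columns_to_strings(umat, threshold=0.5)
    hits = find_spot_columns(strings)
    print(strings, hits, spots_in_adjacent_columns(hits))
```

A contiguous block of matching columns found this way is one possible stand-in for the virus mask; a real detector would need thresholds tuned on the test files.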

  23. Virus recognition (cont.) • I implemented a virus detector and tested it on all the virus-infected executable files mentioned above.

  24. Virus recognition (cont.) • This SOM-based virus detector is able to detect unknown viruses as well. • All known viruses have virus masks. • Detecting these virus masks increases the degree of belief that unknown viruses can also be detected.

  25. Conclusions • Because virus-infected files contain virus masks, we can detect infected files through these masks. • This research can be applied to anti-virus detection programs that have no virus-signature knowledge, especially for new, unknown viruses.

  26. Personal Opinion • Many concepts in this paper are not explained clearly enough, so we cannot understand some of the details the author wants to convey to readers.
