Intelligent Detection of Malicious Script Code CS194, 2007-08 Benson Luk Eyal Reuveni Kamron Farrokh Advisor: Adnan Darwiche Sponsored by Symantec.

1 Intelligent Detection of Malicious Script Code
CS194, 2007-08
Benson Luk, Eyal Reuveni, Kamron Farrokh
Advisor: Adnan Darwiche
Sponsored by Symantec

2 Outline for Project
Phase I: Setup
  Set up machine for testing environment
  Ensure that “whitelist” is clean
Phase II: Crawling
  Modify crawler to output only necessary data. This means:
    Grab only necessary information from webcrawling results
    Listen in on Internet Explorer’s Javascript interpreter and output relevant behavior
Phase III: Database
  Research and develop an effective structure for storing data and link it to webcrawler
Phase IV: Analysis
  Research trends for normalcy and investigate possible heuristics

3 Approach to Project
First Quarter: Infrastructure
Second Quarter: Data Gathering
Third Quarter: Data Analysis
(Note: some overlap between quarters)

4 Infrastructure
Internet Explorer 7, Windows XP SP2 Professional
  Main testing environment
Norton Antivirus
  Protects against malicious files and scripts
  Can access logs to determine which sites launched attacks
  Integrated into automated site visiting

5 Infrastructure
CanaryCallback.dll
  Plugin for Internet Explorer
  Able to access most data received by the low-level Javascript interpreter:
    The function being called (DISPID)
    The class that the function belongs to (GUID)
    The list of types and values of parameters passed into the function. Examples:
      VT_I4: 4-byte integer
      VT_BSTR: Byte string
      VT_DISPATCH: Object
  A large part of the first and second quarters was spent programming, debugging, and maintaining the functions that would handle the data:
    Functions to grab data type
    Functions to parse data values (some stored in bitstreams)
    Functions to output data to file
      If a type did not have an obvious output format (e.g. VT_DISPATCH), we had to create one that would accurately represent as many components of the data as possible
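
The data captured for each call (GUID, DISPID, and a list of typed parameters) can be pictured as a small record. The sketch below is only an illustration: the `CallbackRecord` class, the tab-separated layout, and the `<object>` placeholder are hypothetical and are not the format that CanaryCallback.dll actually writes.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class CallbackRecord:
    """One intercepted call (hypothetical layout, not the DLL's real format)."""
    guid: str      # class that the function belongs to
    dispid: int    # function being called
    params: List[Tuple[str, Any]] = field(default_factory=list)  # (VT type, value)

    def to_line(self) -> str:
        """Serialize the record to one tab-separated text line."""
        fields = [self.guid, str(self.dispid), str(len(self.params))]
        for vt_type, value in self.params:
            # Objects have no obvious text form, so only a placeholder is written.
            text = "<object>" if vt_type == "VT_DISPATCH" else str(value)
            fields += [vt_type, text]
        return "\t".join(fields)


record = CallbackRecord(
    guid="3050f55d-98b5-11cf-bb82-00aa00bdce0b",
    dispid=10001,
    params=[("VT_I4", 4), ("VT_BSTR", "hello"), ("VT_DISPATCH", object())],
)
print(record.to_line())
```

Keeping the parameter count ahead of the (type, value) pairs makes each line parseable without knowing the function's signature in advance.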

6 Infrastructure
Python
  Scripting language
  Designed to handle parsing with ease
  Script for infrastructure was used to perform three tasks:
    Launch Internet Explorer (uses the cPAMIE engine), load website, then close Internet Explorer
    Access and parse Norton’s web attack logs for any attacks launched by website
    Sort script data from CanaryCallback DLL based on DLL data and attack logs (Was there an attack? Did any scripts run? Etc.)
Heritrix
  Open-source webcrawler with high customizability
  Can run specific crawls that target a set of domains, and output minimal information
  Uses HTTP requests; does not render crawled sites
  The purpose is to gather as many URLs with scripts as possible for a large sample base

7 Infrastructure: Crawler
Diagram components: WWW, Crawler, URL queue, Heritrix raw data, Python parser, Heritrix parsed data
Step 0: URL queue is “seeded” with domain list
Step 1: Grab URL from queue
Step 2: Grab source from URL
Step 3: Append URLs to log data and URL queue iff they satisfy our set of rules
Step 4: Get rid of excess data, leaving only URL information for each site, and output to new file
Repeat steps 1-4 until crawl limit is reached.
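
The crawl steps above amount to a queue-driven, breadth-first traversal. The sketch below shows that loop in Python under stated assumptions: `fetch_links` and `accept` are hypothetical callables standing in for the crawler's page fetching and for "our set of rules", and the log here records each URL as it is visited. It is not how the crawler itself is implemented.

```python
from collections import deque
from typing import Callable, Iterable, List, Set


def crawl(seeds: Iterable[str],
          fetch_links: Callable[[str], List[str]],
          accept: Callable[[str], bool],
          limit: int) -> List[str]:
    """Breadth-first crawl that visits at most `limit` URLs."""
    queue = deque(seeds)                 # Step 0: seed the URL queue
    seen: Set[str] = set(queue)
    log: List[str] = []
    while queue and len(log) < limit:    # repeat until the crawl limit is reached
        url = queue.popleft()            # Step 1: grab URL from queue
        log.append(url)
        for link in fetch_links(url):    # Step 2: grab source and extract its links
            if link not in seen and accept(link):   # Step 3: apply the rules
                seen.add(link)
                queue.append(link)
    return log                           # Step 4: only URL information is kept


# Toy link graph used instead of real HTTP requests.
graph = {"a": ["b", "c"], "b": ["c", "d"], "c": [], "d": []}
visited = crawl(["a"], lambda u: graph.get(u, []), lambda u: u != "d", limit=10)
print(visited)
```

Passing the fetch and rule functions in as arguments keeps the loop testable without network access.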

8 Infrastructure: Gatherer
Diagram components: Heritrix parsed data, Python controller, Internet Explorer 7, Norton Antivirus: CanaryCallback data, Norton Antivirus: Logs, Formatted output
Step 1: Python script grabs site from crawl data
Step 2: cPAMIE component loads IE and sends it to specified site
Step 3: IE7 Javascript interpreter outputs to file containing all DLL data
Step 4: IE7 informs PAMIE that it is finished; Python kills IE7
Step 5: Python analyzes callback data and logs to decide whether a site is clean, dirty, or has no scripts
Step 6: Python outputs sorted and formatted data to relevant files for future analysis
Repeat steps 1-6 until URL list is exhausted.
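
Step 5 is a three-way decision. A minimal sketch of one way to make it follows; the function name, its inputs, and the rule that a logged attack takes priority over everything else are our assumptions for illustration, not a description of the actual script.

```python
from typing import List


def classify_site(callback_lines: List[str], attack_logged: bool) -> str:
    """Label a visited site as 'dirty', 'clean', or 'no scripts'."""
    if attack_logged:          # the antivirus log reported an attack from this site
        return "dirty"
    if not callback_lines:     # the interpreter produced no callback data
        return "no scripts"
    return "clean"


print(classify_site(["guid\t1\t0"], attack_logged=False))
```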

9 Data gathering
Heritrix crawls
  First crawl: 5 seeds, depth 5
    5 million sites found
  Second crawl: 10 seeds, depth 3
    3 million sites found
  Third crawl: 200 seeds, depth 1
    18,500 sites found
  Fourth crawl: 200 seeds, depth 2
    3 million sites found
  The first two crawls produced data that was biased towards large, interlinked sites; the last two broad crawls were run to remedy this.
CanaryCallback gathering
  For the first and second crawls, a chosen set of 1,000 or so sites was run through the gatherer component.
  For the third crawl, all sites (18,500) were processed by the gatherer
  For the fourth crawl, several tasks were performed:
    20,000 sites were processed by the gatherer
    From May 7 to May 13, the same 1,000 sites were processed 28 times (about 4 times per day)

10 Data analysis setup
CanaryCallback data analysis
  Main choice for parsing data was the Python scripting language
    Too much data for MS Access or even MySQL
  Python scripts were developed to facilitate analysis in a manner similar to SQL
    Scripts to aggregate data sets and frequencies
    Scripts to calculate various metrics of data sets, such as:
      Smallest data point
      Largest data point
      Average data point
      Variance of data points
      Total data points
      Sum of data points
    Scripts to output to file in Excel spreadsheet (CSV) for deeper analysis
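
The metrics listed above are straightforward to compute with the Python standard library. The following is a minimal sketch rather than the project's actual scripts; the `summarize` name and the choice of population variance are assumptions.

```python
import statistics
from typing import Dict, Sequence


def summarize(values: Sequence[float]) -> Dict[str, float]:
    """Return the six metrics named on the slide for one data set."""
    return {
        "min": min(values),                        # smallest data point
        "max": max(values),                        # largest data point
        "mean": statistics.fmean(values),          # average data point
        "variance": statistics.pvariance(values),  # population variance (assumed)
        "count": len(values),                      # total data points
        "sum": sum(values),                        # sum of data points
    }


summary = summarize([2, 4, 4, 4, 5, 5, 7, 9])
print(summary)
```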

11 Individual data analysis
The third quarter and the last half of the second quarter were spent focusing on as wide a range of data as possible
To accomplish this, our group split up and each member pursued a different line of research
Individual presentations will follow:
  Eyal: Activity categorization
  Benson: Integer argument trend analysis
  Kamron: Byte string argument trend analysis

12 Activity Categorization

13 Activity Analysis
There is an obvious connection between a function and the site using it
  Is it possible to quantify this relationship, and establish whether certain functions are used in a specific kind of site?
Characterize a site based on how active it is; i.e., how many function calls are made while the site is loaded
Does there exist a pattern in the data that can distinguish abnormal usage of a function based on the characteristics of the site?

14 Site Function Usage Statistics
Minus outliers: none
Total number of sites: 14848
Average function calls per site: 5777
Standard deviation of function calls per site: 14181
Average function calls per function: 1984
Standard deviation of function calls per function: 25493
Distribution around the mean:
  Three standard deviations below: 0
  Two standard deviations below: 0
  One standard deviation below: 12086
  One standard deviation above: 1633
  Two standard deviations above: 510
  Three standard deviations above: 296
  Normal distribution outliers: 323
Box and whisker:
  Median: 1456
  First quartile: 438
  Third quartile: 4029
  Interquartile range: 3591
  Lower whisker starts at: 0
  Upper whisker ends at: 9365
  “Box and whisker” outliers: 2048
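
The box-and-whisker figures can be cross-checked from the quartiles alone. The sketch below assumes the conventional 1.5 × IQR rule for the fences (the slides do not say which rule was used); under that assumption the reported whisker ends, 0 and 9365, would be the most extreme observed values lying inside the fences.

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5):
    """Return (IQR, lower fence, upper fence) for a box-and-whisker plot."""
    iqr = q3 - q1
    lower = max(0.0, q1 - k * iqr)   # call counts cannot be negative
    upper = q3 + k * iqr
    return iqr, lower, upper


iqr, lower, upper = tukey_fences(438, 4029)
print(iqr, lower, upper)
```

With these inputs the interquartile range matches the reported 3591, and the upper fence of 9415.5 sits just above the reported whisker end of 9365.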

15 Correlation analysis
Related each function to the site calling it using the number of function calls on that site
  Each tuple consisted of the number of times a function was called at a particular site, and the total number of function calls that were made at that site
  The correlation between the two variables in the tuple was computed for each individual function
Many functions were not common, and so not enough data was available to draw a conclusion about them
For the functions that were called by enough sites (over 100), the correlation values were between -0.01 and 0.004, showing no correlation between the function and the script activity of the site calling it
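
For one function, the tuples described above form two paired series: calls to that function per site, and total calls per site. The sketch below computes a correlation coefficient for such a pair; the slides do not name the coefficient, so the use of Pearson's r here is an assumption, and the sample numbers are made up.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: calls to one function vs. total calls, for four sites.
function_calls = [1, 2, 3, 4]
total_calls = [200, 400, 600, 800]
print(pearson(function_calls, total_calls))
```

A value near zero, as reported for the common functions, means a site's overall activity says almost nothing about how often it calls that function.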

16 Function Usage Amount
An interesting trend arose when analyzing the correlation data
There are functions that are called hundreds or thousands of times
Despite this, individual sites seem to call a specific function only a couple of times.
Example:
  GUID 3050f3fd-98b5-11cf-bb82-00aa00bdec0b, DISPID 1
  Called by 346 sites; in only 11 of them (3.2%) is it called more than 3 times

18 Categorization Approach
Since no correlation was found, another approach was taken
  According to trends in the script activity data, divide the sites into distinct categories
  Examine the function behavior in each category, as opposed to individual sites
Three categories were chosen, split roughly at the median and at the end of the third quartile
  This gave one category 50% of the data, while the other two each had 25%
  An attempt to avoid bias toward the extremely script-heavy sites

19 Categorization Heuristic
A heuristic was developed to determine whether a function would be more likely to appear in a certain category
F = ((avgl - avgsite)*(L - avgfunc) + (avgm - avgsite)*(M - avgfunc) + (avgh - avgsite)*(H - avgfunc)) / 3
  avgl, avgm, and avgh are the average number of function calls per category (542, 2882, and 22745 respectively)
  avgsite is the overall average number of function calls per site (5777)
  avgfunc is the average number of function calls per function (1984)
  L, M, and H are the number of times the function was called in the low, medium, and high category
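
The heuristic translates directly into code. The sketch below plugs in the constants from the slide; the function name and the sample call counts are hypothetical.

```python
AVG_LOW, AVG_MID, AVG_HIGH = 542, 2882, 22745   # avgl, avgm, avgh
AVG_SITE = 5777                                  # avgsite
AVG_FUNC = 1984                                  # avgfunc


def category_heuristic(low: int, mid: int, high: int) -> float:
    """Compute F from a function's call counts in the three categories."""
    terms = (
        (AVG_LOW - AVG_SITE) * (low - AVG_FUNC),
        (AVG_MID - AVG_SITE) * (mid - AVG_FUNC),
        (AVG_HIGH - AVG_SITE) * (high - AVG_FUNC),
    )
    return sum(terms) / 3


# Made-up counts: one function used mostly by low-activity sites,
# one used mostly by high-activity sites.
print(category_heuristic(10000, 0, 0))
print(category_heuristic(0, 0, 10000))
```

Because only the high category's average exceeds the overall site average, only the H term has a positive weight: functions concentrated in highly active sites score high, and functions concentrated in the low and medium categories score low or negative.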

20 Statistical Variation Among Categories
The heuristic separated the functions into three distinct sections
  Along the higher values were mostly functions that had few arguments supplied
  In the middle, there were whole objects represented (a GUID, and all of its related function calls)
  At the lowest negative values were functions that were commonly called with arguments

21 Argument Distributions
A further analysis was done on whether there exists a difference in the behavior of a function in the separate categories
The distributions of BSTR (Byte String) lengths and I4 (4-byte Integer) values were considered
Several functions were examined, but this specific one (referred to as “Second”, as it had the second highest heuristic value) is exemplary of the trends noticed
The argument type frequency of “Second”:
  Category  0 arguments  I4  BSTR  DISPATCH  NULL  BOOL
  LOW       20713        0   2634  14        0     0
  MID       170861       0   9888  1         0     0
  HIGH      1215964      0   9447  19        0     0

25 Conclusions of Approach
There is no major statistical difference in the argument value distribution among the categories, but there are distinct characteristic differences
  Functions that appear more commonly in less-active sites tend to have arguments supplied to them
No general correlation exists between a function and how active the site calling it is
  There may exist correlation in some other characteristic, however

26 Integer analysis

27 Functions through Three Sets
Looked through 3 of the runs:
  5 seeds, depth 5: 1,324 sites
  10 seeds, depth 3: 1,184 sites
  200 seeds, depth 1: 15,790 sites
Picked the three most common functions with integer arguments from the first run to analyze
Goal: Look for consistency throughout function behavior in differing sets of sites

28 Functions through Three Sets
In all three data sets, the values of the argument had a very large range, from 0 to the millions or billions
Distributions did not stay consistent across sets; each set had different commonly occurring values

29 Functions through Three Sets
Similar pattern in all 3 sets
Low values were used
Numbers near 0 are most common; occurrences drop off as values get larger

30 Functions through Three Sets
Values range from 0 into the hundreds
Second data set did not have enough data
Similar common numbers in the other two sets: 3, 300, and 728

31 Patterns in DISPID Usage
Looked at which DISPIDs were used, without regard to the GUIDs of the calling classes
DISPIDs had a large range, from lows of less than -2 billion to highs of over 3 million
Out of 743,270 functions analyzed, the vast majority had DISPIDs within 4 distinct ranges
  205 of the functions did not fall within these groups, and instead were one of 6 other numbers
Within each of the four ranges, occurrences at specific numbers formed patterns

32 DISPID Usage – First Range
The most common range for DISPIDs: 3,000,000-3,001,286
490,201 functions, about 66%
1,067 out of 1,286 different numbers used
Numbers nearer to 3 million are most common; higher numbers were used less
  Number range           Average occurrences
  3,000,000-3,000,199    1,121
  3,000,200-3,000,399    737
  3,000,400-3,000,599    471
  3,000,600-3,000,799    121
  3,000,800+             1

33 DISPID Usage – Second Range
Second most common range for DISPIDs: 0-2,313
164,224 functions, about 22%
39 numbers in this range were used
0 and 1,103 were the most common
Numbers clumped around 5 groups: 0-9, 127-154, 1002-1168, 1500-1504, and 2001-2015, with 2313 being an exception

34 DISPID Usage – Third Range
Third range for DISPIDs: -2,147,417,109 to -2,147,411,105
50,541 functions, about 7%
55 numbers in this range were used
Most occurrences were around numbers ending in round thousands

35 DISPID Usage – Fourth Range
Fourth range for DISPIDs: 10,001-10,087
38,099 functions, about 5%
75 numbers in this range were used
Uniquely used by 3050f55d-98b5-11cf-bb82-00aa00bdce0b
DISPIDs 10,001-10,007 are most common
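
The four ranges reported on the preceding slides can be turned into a simple lookup that flags any DISPID falling outside them. This is a sketch only: the range names and the "other" label are ours, and the bounds are copied from the slides.

```python
# (name, lowest DISPID, highest DISPID), bounds taken from the slides.
DISPID_RANGES = [
    ("first", 3_000_000, 3_001_286),
    ("second", 0, 2_313),
    ("third", -2_147_417_109, -2_147_411_105),
    ("fourth", 10_001, 10_087),
]


def classify_dispid(dispid: int) -> str:
    """Return the name of the range containing `dispid`, or 'other'."""
    for name, low, high in DISPID_RANGES:
        if low <= dispid <= high:
            return name
    return "other"


for sample in (3_000_150, 1103, -2_147_412_000, 10_005, 50_000):
    print(sample, classify_dispid(sample))
```

In the data analyzed, only 205 of 743,270 functions would land in the "other" bucket, which is what makes an out-of-range DISPID worth a closer look.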

36 Patterns in DISPID Usage
Looked at which DISPIDs were used, without regard to the GUIDs of the calling classes
DISPIDs had a large range, from lows of less than -2 billion to highs of over 3 million
Out of 743,270 functions analyzed, the vast majority had DISPIDs within 4 distinct ranges
Within each of the four ranges, occurrences at specific numbers formed patterns

37 Function with Multiple Integers
Looked for patterns in the relations among the integer arguments of functions taking multiple arguments
Not very many functions in this category
  One took two arguments; the first was always 0
  One took two arguments that were always the same. Arguments were all from (1,1) to (31,31) and (1908,1908) to (1908)
    All came from 2 signup sites on a particular website
  Two took two differing arguments; we could not find a relation between the arguments
  Other functions did not have a large enough sample size

38 Functions with Multiple Integers
The function itself had consistent patterns in the values it took: 95% of arguments were (1,1) or (3,2)
No consistent relations between arguments

39 Function Pairs
Examined:
  GUID: 3050f55d-98b5-11cf-bb82-00aa00bdce0b
  DISPIDs: 10001-10062
Out of 38,099 occurrences, 3,595 were followed by:
  GUID: c59c6b12-f6c1-11cf-8835-00a0c911e8b2
  DISPID: 0
Second function had no independent occurrences
Similar arguments:
  First function took a variety of numbers and types of arguments
  Second function always took a DISPATCH argument, followed by the same arguments as the first function

40 Conclusions of Approach
Function arguments through sets:
  There seem to be consistent patterns in certain functions
    Range, values taken, values common, value distribution
DISPID usage
  4 ranges with very few exceptions
  Common subranges or distribution patterns within each range
Multiple arguments
  Uncommon type of function
  No noticeable relations in arguments
Function pairs
  Dependent functions have clear patterns
    Function position
    Argument types and values
  Only one example – do more exist?

41 Byte string analysis

42 Byte String Analysis
Buffer overflows are a common method of exploiting a targeted system
  One method: create a very long string to break boundary checking, then append shellcode at the end to inject it into the assembly code
We are interested in the length of BSTR objects fed into given functions
  For any given API, what is considered a normal string length?

43 Class-based analysis
Initial analyses were done on a class-by-class basis
  Samples were grouped together and analyzed according to GUID
Byte strings are typically very small
  More than 70% of the commonly called Javascript classes typically received byte strings of length less than 20 (39 out of 55 functions from this crawl)
  Less than 10% of these ever receive a string greater than 5000 characters in length (4 out of 55 functions from this crawl)

44 Class-based analysis
Analysis of individual classes shows the same trend toward smaller strings
However, analyzing based on classes groups the byte strings of all class functions together, which results in inaccuracy and lost information
  BSTR length   Exact length   At most this length
  0             287            287
  1             136            423
  2             20             443
  3             74             517
  4             907            1424
  5             1014           2438
  6             12638          15076
  7             3865           18941
  8             7362           26303
  9             1079           27382
  10            1396           28778
  11            764            29542
  12            1423           30965
  13            383            31348
  14            1258           32606
  15            221            32827
  16            272            33099
  17            868            33967
  18            185            34152
  19            200            34352
  20            250            34602
  …
  14450         1              42665
  14549         1              42666
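
The “At most this length” column is the running total of the “Exact length” column. A short sketch using the first five rows of the table shows the relationship; the variable names are ours.

```python
from itertools import accumulate

# Number of strings of exactly length 0, 1, 2, 3, and 4 (from the table).
exact = [287, 136, 20, 74, 907]

# Running total: number of strings of at most each length.
at_most = list(accumulate(exact))
print(at_most)  # → [287, 423, 443, 517, 1424]
```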

45 Parameter-based analysis
Second analysis split samples into individual arguments of unique functions of each class
Given a sample set with values in the interval (a, b) with average μ and standard deviation σ, we expect values to largely lie within the interval (μ – σ, μ + σ)
We also expect (μ – σ, μ + σ) to be smaller than (a, b)
The smaller (μ – σ, μ + σ) is in proportion to (a, b), the more well-defined our sample set becomes

46 Parameter-based analysis
Length of expected interval: 2σ
Length of entire interval: n = b – a + 1
2σ/n represents the ratio of the expected interval to the entire interval
Since 2σ < n, 0 ≤ 2σ/n < 1
When 2σ/n = 0, σ = 0 and all values in the data set are equal
As 2σ/n approaches 1, σ approaches n/2 and the values in the data set concentrate at a and b
As 2σ/n goes from 0 to 1, the shape of the graph begins to shift
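
The ratio can be computed for one function argument from the list of string lengths it received. The sketch below assumes the population standard deviation (the slides do not say whether the population or sample form was used), and the function name is ours.

```python
import statistics
from typing import Sequence


def spread_ratio(lengths: Sequence[int]) -> float:
    """Ratio 2σ/n of the expected interval to the entire interval."""
    a, b = min(lengths), max(lengths)
    n = b - a + 1                         # length of the entire interval
    sigma = statistics.pstdev(lengths)    # population standard deviation (assumed)
    return 2 * sigma / n


print(spread_ratio([7, 7, 7]))            # all values equal
print(spread_ratio([0, 0, 10, 10]))       # values split between a and b
print(spread_ratio([70, 75, 76, 77, 80]))
```

The two extreme cases behave as described: identical lengths give a ratio of 0, while lengths split evenly between a and b give (b – a)/(b – a + 1), which approaches 1 as the interval widens.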

53
  Ratio is no more than   Amount of functions   Percentage
  0.0                     2607                  0.468127132
  0.1                     2753                  0.494343688
  0.2                     3054                  0.548392889
  0.3                     3345                  0.600646436
  0.4                     3612                  0.648590411
  0.5                     4029                  0.723469205
  0.6                     4384                  0.78721494
  0.7                     4833                  0.867839828
  0.8                     5219                  0.937152092
  0.9                     5435                  0.975938229
  1.0                     5569                  1

54
When ratio is 0, amount of strings is typically low
Otherwise, ratio increases as amount of strings decreases
The function arguments with the smallest non-zero ratio are the most well-defined

56 Analysis with pruning
  Ratio is no more than   Amount of functions   Percentage
  0.0                     731                   0.248133062
  0.1                     876                   0.297352342
  0.2                     1177                  0.399524779
  0.3                     1469                  0.498642227
  0.4                     1729                  0.586897488
  0.5                     2063                  0.700271555
  0.6                     2355                  0.799389002
  0.7                     2630                  0.892735913
  0.8                     2855                  0.969110659
  0.9                     2929                  0.994229464
  1.0                     2946                  1
Only function arguments that see 9 or fewer strings are removed; however…
  Most zero-ratio functions are pruned (2607 to 731)
  Many functions with ratio > 0.5 are pruned (1540 to 883)
  Functions with ratio < 0.5 are affected minimally (1422 to 1332)

57 Analysis with pruning
  Ratio is no more than   Amount of functions   Percentage
  0.0                     232                   0.157075152
  0.1                     377                   0.255247123
  0.2                     671                   0.454299255
  0.3                     918                   0.621530129
  0.4                     1079                  0.730534868
  0.5                     1211                  0.819905213
  0.6                     1307                  0.884901828
  0.7                     1420                  0.96140826
  0.8                     1463                  0.990521327
  0.9                     1476                  0.999322952
  1.0                     1477                  1
Only function arguments that see 99 or fewer strings are removed; however…
  Almost all zero-ratio functions are pruned (731 to 232)
  Almost all functions with ratio > 0.5 are pruned (883 to 266)
  Only some functions with ratio < 0.5 are affected (1332 to 979)

58 Analysis with pruning
  String frequency requirement   > 1    > 10   > 100
  Ratio = 0.0                    2607   731    232
  0.0 < Ratio < 0.5              1422   1332   979
  0.5 < Ratio < 1.0              1540   883    266
As a function is seen in the wild more frequently, the byte string lengths it takes in begin to fall into specific intervals.
Functions with substantial evidence are well-defined in the lengths of byte strings they tend to receive!

59 Comparing w/malicious data
Symantec provided us with test samples used for Canary testing
  These samples trigger browser exploits but do not inject actual shellcode
  The worst thing they can do is crash the browser
Malicious samples fell into one of three categories:
  Bad BSTR
  Bad I4
  Bad DISPATCH (object)
Example: “MSIE Popup Window Address Bar Spoofing Weakness”
Callback data:
  DISPID   GUID                                   Params   Type 1   Value 1
  1026     3050f55f-98b5-11cf-bb82-00aa00bdce0b   1        BSTR     150
Compare with data from May crawl:
  491 strings seen over the 20,416 websites visited during that crawl
  Smallest: 70
  Largest: 80
  Average: 76.32
  Standard deviation: 2.33
  Expected interval: [73.99, 78.65]
  Entire interval: [70, 80]
Length 150 is 31.6 standard deviations away from the average length!
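
The 31.6 figure is the distance of the malicious string length from the crawl average, measured in standard deviations. A short sketch reproduces the arithmetic from the numbers on the slide; the function name is ours.

```python
def deviations_from_mean(value: float, mean: float, std: float) -> float:
    """Distance of `value` from `mean`, in units of standard deviation."""
    return (value - mean) / std


mean, std = 76.32, 2.33                    # from the May crawl
z = deviations_from_mean(150, mean, std)   # malicious sample: length 150
print(round(z, 1))                         # → 31.6

# Expected interval (μ - σ, μ + σ) from the same numbers.
print(round(mean - std, 2), round(mean + std, 2))  # → 73.99 78.65
```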

60 Trend volatility
How does web activity change over time?
28 crawls of 1000 sites over May 7 to May 13 were performed to investigate this
  Run   Size (KB)   Size (MB)   DLL calls   URLs w/scripts
  1     151356      147.81      2071719     469
  2     157178      153.49      2155154     461
  3     153634      150.03      2091515     466
  4     162288      158.48      2236706     465
  5     153640      150.04      2091038     465
  6     157247      153.56      2148058     463
  7     152411      148.84      2074383     463
  8     152776      149.2       2082031     463
  9     143086      139.73      2041672     465
  10    158285      154.58      2172353     466
  11    160603      156.84      2213557     465
  12    153172      149.58      2096330     463
  13    165713      161.83      2304933     465
  14    153186      149.6       2164695     461
  15    157522      153.83      2166498     464
  16    163440      159.61      2263828     466
  17    139088      135.83      2015998     462
  18    159899      156.15      2219125     463
  19    161024      157.25      2220763     465
  20    160108      156.36      2211366     464
  21    149148      145.65      2090184     461
  22    161600      157.81      2232686     466
  23    162083      158.28      2236013     465
  24    155612      151.96      2189636     465
  25    154043      150.43      2176079     463
  26    160625      156.86      2221586     463
  27    157313      153.63      2173366     464
  28    158556      154.84      2199971     461
Crawls differ from one another by up to several hundred thousand DLL calls
The number of sites with actual scripts also changes

61 Trend volatility
These runs were done ~5.5 hrs apart
Change is very slight
  Zero-ratio functions increase
  High-ratio functions decrease

62 Trend volatility
These runs were done ~1 day apart
Change is also very slight
  Zero-ratio functions decrease
  Mid-ratio functions (R = 0.5) increase

63 Trend volatility
These runs were done ~6 days apart
Change is a little more apparent
  Zero-ratio functions decrease
  Mid-ratio functions (R = 0.5) increase

64 Trend volatility
The state of Javascript activity on the Web is constantly changing
Changes are somewhat unpredictable (and entirely dependent on the decisions of webmasters)
In the long run these changes are not major; however, they still exist and need to be addressed

65 Conclusions of Approach
Substantial evidence in favor of existing trends for byte string arguments
This approach can be adapted to anything that can be quantified as a number
Changes in the state of the web will require any heuristic developed to have at least a basic learning capability
Plan to continue research over the summer

