Published by Martin Owens. Modified over 9 years ago.
1
A Large Scale Exploratory Analysis of Software Vulnerability Life Cycles
Muhammad Shahzad
Dept. of Computer Science and Engineering, Michigan State University
Joint work with Muhammad Zubair Shafiq and Alex X. Liu
2
ICSE 2012, Zürich
Software Vulnerabilities
A software vulnerability is a weakness in software that allows attackers to compromise the security of a system.
An exploit is a means of taking advantage of a software vulnerability to compromise the security of a system.
─ It takes the form of a piece of software or a sequence of commands.
A patch is a means of fixing the vulnerability so that the exploit becomes ineffective.
Vulnerability lifecycle
3
Why Study Software Vulnerability Lifecycle
Software vendors are adversely affected by vulnerability announcements.
─ Lost money: a vendor loses 0.63% in market value on the disclosure date [Telang and Wattal 2007]
─ Lost reputation
Goal: to know how the software industry is doing with respect to vulnerabilities
4
Data Set
Sources
─ National Vulnerability Database (NVD)
─ Open Source Vulnerability Database (OSVDB)
─ Vulnerability data by Frei et al. (FVDB)
46310 vulnerabilities
─ 9667 vulnerabilities with patch dates
─ 15456 vulnerabilities with exploit dates
Software vendors
─ Over 11 thousand vendors and 17 thousand products
5
Vulnerability Information
Risk Score: low, medium, or high
─ Assigned by Common Vulnerability Scoring System (CVSS)
Access Vector: Local, Adjacent Network, Network
─ Where hackers can launch attacks from
Access Complexity: low, medium, or high
─ Complexity of the attack that exploits a vulnerability
Integrity Impact: none, partial, or complete
─ Impact of the attack that exploits a vulnerability
Disclosure date: when a vulnerability is disclosed
Exploit date: when an exploit is available
Patch date: when the patch is available
Text description of the vulnerability
6
Vulnerability Disclosure Rate
7
Access Vector
8
Access Complexity
9
Integrity Impact
10
Evolution of Different Types of Vulnerabilities
11
Vulnerability Clustering
The data set does not include vulnerability types.
The total number of vulnerability types is unknown.
Solution: use clustering algorithms to determine the types of vulnerabilities and how many types there are.
─ Extracted relevant keywords from the text descriptions
─ Keywords used as features for clustering
─ Obtained 7 clusters
  ● EXE (Executables)
  ● DoS (Denial of Service)
  ● BO (Buffer Overflow)
  ● SQL injection
  ● XSS (Cross Site Scripting)
  ● PHP
  ● Misc
12
Vulnerability Evolution by Type
13
Evolution of Exploitation Behavior
14
t_ed = Exploit Date - Disclosure Date (in days)
t_ed < 0
─ 2.8% of vulnerabilities
t_ed = 0
─ 88.2% of vulnerabilities
t_ed > 0
─ 9% of vulnerabilities
─ Sub-ranges
  ● 0 < t_ed ≤ 7: exploit released within a week after disclosure
  ● 7 < t_ed ≤ 30: exploit released after a week but within a month
  ● t_ed > 30: exploit released more than a month after disclosure
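The ranges above can be sketched as a small classifier. This is a minimal illustration, not the authors' code: the function name `lag_bucket` and the bucket labels are hypothetical, and it assumes both dates are known to day precision.

```python
from datetime import date


def lag_bucket(event: date, disclosure: date) -> str:
    """Classify the lag in days between an event (exploit or patch) and disclosure.

    The bucket boundaries follow the ranges on the slide; the labels are
    illustrative names chosen for this sketch.
    """
    days = (event - disclosure).days
    if days < 0:
        return "before disclosure"
    if days == 0:
        return "on disclosure date"
    if days <= 7:
        return "within a week"
    if days <= 30:
        return "within a month"
    return "more than a month"


# Example: an exploit published 8 days after disclosure.
bucket = lag_bucket(date(2010, 1, 9), date(2010, 1, 1))
```

The same function applies unchanged to patch dates, since the patch lag is defined with the same sub-ranges.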
15
Evolution of Aggregate Exploitation Behavior
16
Evolution of Exploitation Behavior by Vendor
17
Evolution of Exploitation Behavior by Product
18
Evolution of Patching Behavior
19
t_pd = Patch Date - Disclosure Date (in days)
t_pd < 0
─ 10.1% of vulnerabilities
  ● Greater than the corresponding 2.8% for t_ed < 0
t_pd = 0
─ 62.2% of vulnerabilities
  ● Lower than the corresponding 88.2% for t_ed = 0
t_pd > 0
─ 27.7% of vulnerabilities
─ Sub-ranges
  ● 0 < t_pd ≤ 7: patch released within a week after disclosure
  ● 7 < t_pd ≤ 30: patch released after a week but within a month
  ● t_pd > 30: patch released more than a month after disclosure
20
Evolution of Aggregate Patching Behavior
21
Evolution of Patching Behavior by Vendor
22
Evolution of Patching Behavior by Product
23
Conclusions
The number of vulnerabilities disclosed each year has stopped increasing since 2006.
The percentage of remotely exploitable vulnerabilities has gradually increased to over 80%.
The access complexity of vulnerabilities has also been increasing.
Closed source vendors are faster at patching vulnerabilities.
Since 2008, vendors have become very agile in patching vulnerabilities.
Still, the average time for hackers to exploit a vulnerability is shorter than the time for vendors to patch it.
24
Questions?
25
BACKUP SLIDES
26
Evolution of Exploitation Behavior by Type
27
Evolution of Patching Behavior by Type
28
Data Sources
http://nvd.nist.gov/
www.osvdb.org/
29
Interesting Patterns Mined Using Association Rules
Attributes used for association rule mining
─ Vendor name, product name, vulnerability type, risk, t_ed, t_pd
For Microsoft, the majority of high risk vulnerabilities are exploited on the disclosure date
─ vnd=Microsoft type=XSS risk=H → t_ed=0
For Sun's Solaris, medium risk vulnerabilities are exploited within a week of disclosure
─ vnd=Sun Prod=Solaris risk=M → 0<t_ed≤7
For Mozilla, we saw interesting rules stating that hackers are very quick to exploit vulnerabilities that have not been patched, but very slow for patched vulnerabilities
─ vnd=Mozilla Prod=Firefox typ=BO t_pd=0 → t_ed>30
─ vnd=Mozilla Prod=Firefox typ=BO 7<t_pd≤30 → t_ed=0
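A rule of the form "antecedent → consequent" is usually scored by its support and confidence. The sketch below computes both over a list of records. It is an illustration only: the slides do not say which mining tool or thresholds were used, and the function name, field names, and toy records are hypothetical.

```python
def rule_stats(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    records    : list of dicts mapping attribute name to value
    antecedent : dict of attribute values that must all match (left side)
    consequent : dict of attribute values that must all match (right side)
    """
    n_antecedent = 0
    n_both = 0
    for record in records:
        if all(record.get(k) == v for k, v in antecedent.items()):
            n_antecedent += 1
            if all(record.get(k) == v for k, v in consequent.items()):
                n_both += 1
    support = n_both / len(records) if records else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Toy records, invented for illustration (not taken from the data set).
toy = [
    {"vnd": "A", "risk": "H", "ted": "=0"},
    {"vnd": "A", "risk": "H", "ted": "=0"},
    {"vnd": "A", "risk": "H", "ted": ">30"},
    {"vnd": "B", "risk": "M", "ted": "=0"},
]
support, confidence = rule_stats(toy, {"vnd": "A", "risk": "H"}, {"ted": "=0"})
```

Here the rule matches two of four records (support 0.5) and holds for two of the three records that satisfy the left side (confidence 2/3).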
30
Interesting Patterns Mined Using Association Rules
Microsoft is quicker at patching vulnerabilities in Windows than in its other products
─ vnd=Microsoft prod=Windows type=BO → t_pd=0
─ vnd=Microsoft prod=IE type=BO → t_pd>30
In the case of Mozilla, BO and EXE vulnerabilities are patched very quickly
─ vnd=Mozilla prod=SeaMonkey type=BO → t_pd=0
31
Implications
Observations from this study have important implications for
─ Software Design
─ Code Development Practices
─ Customer assessment of vendors and products
32
Software Design
Analysis of access requirements, functionality, and risk level
─ Can reveal inherent flaws in the software design process
─ For example, if a particular software series has abundant BO vulnerabilities
  ● This shows a lack of sanity checks in socket and read processes
DoS vulnerabilities
─ In Solaris: 38.85% of all exploited vulnerabilities
─ In OS X: only 11.7% of all exploited vulnerabilities
─ Solaris is more susceptible to DoS attacks
─ Solaris developers need to take additional steps to avoid DoS attacks
33
Code Development Practices
Analysis of vulnerability life cycles can reveal insights into code development and testing practices
─ For example, we observed that the percentage of vulnerabilities with t_pd > 0 is significantly greater for open source vendors than for closed source vendors
─ This shows that open source software has fewer resources dedicated to security than closed source software
34
Customer Assessment of Vendors and Products
This analysis can be used in product assessment, certification, and security recommendations to customers
For example,
─ Sun should be preferred if the vendor's patch response is of prime importance
─ Mac OS X should be used if a customer's infrastructure has less tolerance for DoS attacks
─ Solaris should be used if the customer wants to be robust against BO attacks
35
Proposed Methodology
Preprocess the data
─ Extract relevant keywords from the text description
─ Represent each vulnerability in terms of the keywords
Data Mining
─ Cluster the vulnerabilities
─ Identify the types of vulnerabilities in each cluster
Post processing
─ Assign each vulnerability a type
36
Preprocessing
Clustering requires attributes
Representative keywords in the text can act as attributes
─ Take all words in all text descriptions
─ Compare the words with everyday news articles
─ Remove the matching words
─ Manually go through the remaining words
─ Remove the words that are non-technical
─ This leaves us with 608 keywords
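The automated part of these steps (collect all words, then drop those that also occur in everyday text) can be sketched as follows. This is an assumption-laden illustration: the function name, the tokenization rule, and the tiny word list are hypothetical, and the manual review step from the slide is not modeled.

```python
import re


def candidate_keywords(descriptions, everyday_words):
    """Collect words from the descriptions and remove everyday vocabulary.

    descriptions   : iterable of vulnerability description strings
    everyday_words : iterable of words observed in non-technical text
                     (the slide uses news articles for this purpose)
    Returns the remaining words in sorted order, ready for manual review.
    """
    words = set()
    for text in descriptions:
        # Simplistic tokenization: lowercase alphabetic runs only.
        words.update(re.findall(r"[a-z]+", text.lower()))
    return sorted(words - {w.lower() for w in everyday_words})


# Toy input, invented for illustration.
remaining = candidate_keywords(
    ["Buffer overflow in the parser", "The server allows SQL injection"],
    ["in", "the", "allows", "server"],
)
```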
37
Preprocessing
Each vulnerability is a data point
─ 608 binary attributes

               Denial  Service  Buffer  …  Overflow
CVE-xxxx-yyyy    0        0       1     …     1
                 1        0       0     …     1
                 0        1       0     …     0
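Turning one description into such a row of binary attributes might look like the sketch below. The function name and the four-keyword list are hypothetical stand-ins for the 608 keywords, and the tokenization is a simplifying assumption.

```python
import re


def to_binary_vector(description, keywords):
    """Return a 0/1 list with one entry per keyword.

    An entry is 1 when the keyword occurs as a word in the description
    (case-insensitive) and 0 otherwise.
    """
    present = set(re.findall(r"[a-z]+", description.lower()))
    return [1 if keyword.lower() in present else 0 for keyword in keywords]


# Hypothetical four-keyword vocabulary standing in for the full keyword list.
keywords = ["denial", "service", "buffer", "overflow"]
row = to_binary_vector("Buffer overflow in an image parser", keywords)
```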
38
Clustering: Scheme
Selection of clustering scheme
─ Same vulnerability type
─ Different vendors
─ E.g., Buffer Overflow vulnerabilities
  ● Can be subdivided into: Apple BO, Microsoft BO
Hierarchical clustering is more suitable than partitional clustering
─ Ward
  ● Less susceptible to noise
  ● Does not break large clusters
  ● Ensures that the SSE (sum of squared errors) is small
39
Clustering: Distance Measure
Desired: Jaccard
─ Not implemented in Weka, problems in Matlab
Used: Hamming
─ Not implemented in Weka, available in Matlab
Euclidean not used
─ Asymmetric data
Cosine not used
─ Values in many cases become very small but nonzero
─ Matlab does not handle them, which results in an error
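To make the difference between the two binary measures concrete, here is a minimal sketch of both. It uses the standard textbook definitions and is not a reproduction of the Weka or Matlab implementations; note that some tools report Hamming distance as a fraction of positions rather than a count.

```python
def hamming(a, b):
    """Number of positions at which two equal-length binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def jaccard_distance(a, b):
    """1 - |intersection| / |union| over the positions set to 1.

    Positions where both vectors are 0 are ignored, which is why this
    measure suits sparse keyword vectors.
    """
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


a = [1, 1, 0, 0, 0, 0]
b = [1, 0, 1, 0, 0, 0]
d_hamming = hamming(a, b)            # two positions differ
d_jaccard = jaccard_distance(a, b)   # one shared keyword out of three used
```

Adding more all-zero columns to both vectors leaves both values unchanged here, but a Hamming distance normalized by vector length would shrink while the Jaccard distance would not.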
40
Clustering: Challenges
Hierarchical clustering uses a proximity matrix
─ 46261 by 46261
─ Requires about 15.9 GB of RAM in Matlab
Solution
─ Sampling
─ 10 files randomly generated
  ● 5% sampling rate
If the data set has valid clusters, each random file should generate the same centroids
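The memory figure can be reproduced with a quick calculation, assuming the full square matrix is stored with 8-byte floating point entries (the slide does not state this; it is one reading that matches the number). The sampling helper is likewise a hypothetical sketch: it assumes each sample is drawn independently and without replacement within a sample.

```python
import random


def dense_matrix_gib(n, bytes_per_entry=8):
    """Size in GiB (2**30 bytes) of a dense n-by-n matrix."""
    return n * n * bytes_per_entry / 2**30


def draw_samples(n_points, n_samples=10, rate=0.05, seed=0):
    """Draw n_samples independent random subsets of point indices.

    Each subset holds int(n_points * rate) distinct indices.
    """
    rng = random.Random(seed)
    size = int(n_points * rate)
    return [rng.sample(range(n_points), size) for _ in range(n_samples)]


full = dense_matrix_gib(46261)       # about 15.9
samples = draw_samples(46261)        # 10 subsets of 2313 points each
per_sample = dense_matrix_gib(len(samples[0]))
```

At a 5% sampling rate the proximity matrix shrinks by a factor of about 400, to roughly 0.04 GiB per sample under the same assumptions.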
41
42
Clustering: Centroids
608 attributes
─ Value of each attribute: 0 or 1
─ Data points lie at the vertices of the 608-dimensional unit hypercube
Take one cluster at a time and find its centroid
─ The value of each of the 608 attributes lies in [0,1]
─ A value close to 1 means the keyword occurred in a large number of data points of the cluster, and vice versa
─ Get the attributes whose values are greater than 0.8
  ● These appeared in the descriptions of over 80% of vulnerabilities in the cluster
─ E.g., in one cluster
  ● Denial, Service: these represent DoS attacks
We get the centroids
─ Dominant keywords represent the type of the cluster
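For binary vectors, the centroid of a cluster is simply the per-keyword fraction of members that contain the keyword, so the 0.8 cut-off can be sketched as below. The function names and the three-keyword toy cluster are hypothetical.

```python
def centroid(vectors):
    """Per-attribute mean of a non-empty list of equal-length 0/1 vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def dominant_keywords(vectors, keywords, threshold=0.8):
    """Keywords whose centroid value is strictly greater than the threshold."""
    return [k for k, value in zip(keywords, centroid(vectors)) if value > threshold]


# Toy cluster of ten points over three keywords, invented for illustration.
keywords = ["denial", "service", "buffer"]
cluster = [[1, 1, 0]] * 9 + [[1, 0, 1]]
center = centroid(cluster)
labels = dominant_keywords(cluster, keywords)
```

In this toy cluster "denial" appears in every point and "service" in nine of ten, so both exceed the threshold and the cluster would be read as DoS.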
43
Clustering: Number of clusters
There is no universal way of determining the exact number of clusters
Visualize the dendrogram
─ Decide on an appropriate number of clusters
44
Hierarchical Clustering
[Figure: dendrogram whose branches are labeled with cluster types, including SQL, XSS, DoS, PHP, Misc, Local, BO, A-BO, EXE, A-EXE, C-EXE, and US-EXE]
45
Clustering: Remaining Samples
This analysis was on 1 sample
We did the same analysis on the remaining 9 samples
Centroids obtained from all 10 samples are shown next
46
Clustering: Intensity Plot of Proximity Matrix
47
Final Clustering
We have all 7 centroids
─ Assign each of the 46261 points to the nearest centroid
─ Size of each cluster after assigning points

PHP    SQL    BO     XSS    EXE    DoS    Misc
8.32%  11.2%  10.2%  12.3%  7.25%  14.2%  36.6%
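The assignment step can be sketched as a nearest-centroid lookup followed by a count per cluster. The slide does not say which distance was used at this stage; the sketch assumes the sum of absolute differences, which reduces to Hamming distance when both vectors are binary. All names and the toy centroids are hypothetical.

```python
from collections import Counter


def nearest_centroid(point, centroids):
    """Return the name of the centroid closest to the point.

    centroids maps a cluster name to a vector with values in [0, 1].
    Distance is the sum of absolute differences (an assumption).
    """
    def distance(name):
        return sum(abs(p - c) for p, c in zip(point, centroids[name]))

    return min(centroids, key=distance)


def cluster_shares(points, centroids):
    """Percentage of points assigned to each centroid."""
    counts = Counter(nearest_centroid(p, centroids) for p in points)
    return {name: 100 * counts[name] / len(points) for name in centroids}


# Toy centroids and points, invented for illustration.
toy_centroids = {"DoS": [0.9, 0.9, 0.1], "BO": [0.1, 0.1, 0.9]}
toy_points = [[1, 1, 0], [1, 0, 0], [0, 0, 1], [1, 1, 0]]
shares = cluster_shares(toy_points, toy_centroids)
```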
48
Post Processing
Evolution of different types of vulnerabilities
Evolution for different types in vendors
Evolution of exploitation behavior of hackers
Evolution of patching behavior of vendors