PREDICTING ZERO-DAY SOFTWARE VULNERABILITIES THROUGH DATA MINING
Su Zhang, Department of Computing and Information Science, Kansas State University

OUTLINE
Motivation. Related work. Proposed approach. Possible techniques. Plan.

THE TREND OF VULNERABILITY NUMBERS

ZERO-DAY VULNERABILITY
What is a zero-day vulnerability? It is a vulnerability that is found by underground hackers before it is made public. The threat from zero-day vulnerabilities is increasing, and many attacks are attributed to them. For example, in 2010 Microsoft confirmed a vulnerability in Internet Explorer that affected some versions released in …

OUR GOAL
Risk awareness. The possibility of zero-day vulnerabilities must be considered for a comprehensive risk assessment of enterprise networks.

ENTERPRISE RISK ASSESSMENT FRAMEWORK

PROBLEM
Predict information about zero-day vulnerabilities from software configurations.

RELATED WORK
O. H. Alhazmi and Y. K. Malaiya; Andy Ozment; Kyle Ingols et al.; Miles A. McQueen et al.

PROPOSED APPROACH
Predict the likelihood of zero-day vulnerabilities for specific software applications.
NVD: available since …; a rich data source that includes the preconditions and consequences of vulnerabilities. It can be used both to build our model and to validate our work.

SYSTEM ARCHITECTURE
[Diagram: Target machine (IE, WinXP, Firefox, …) → Scanner (e.g. Nessus or OVAL) → CPE (Common Platform Enumeration) → Our prediction model → Output (MTTNV and CVSS metrics)]

PREDICTION MODEL
Predictive data: CPE (Common Platform Enumeration), which indicates the software configuration on a host.
Predicted data: MTTNV (Mean Time to Next Vulnerability) and CVSS metrics. MTTNV indicates the probability of zero-day vulnerabilities. The CVSS metrics indicate the properties of the predicted vulnerabilities.
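
The slides do not spell out how MTTNV is computed. One plausible reading, assumed here, is the average gap between consecutive vulnerability disclosure dates for a single CPE-identified product. The function name and the sample dates below are hypothetical and are not taken from NVD.

```python
from datetime import date


def mean_time_to_next_vulnerability(disclosure_dates):
    """Return the mean gap in days between consecutive disclosures.

    Assumption: MTTNV is the average inter-arrival time of published
    vulnerabilities for one product; the slides do not define it formally.
    """
    ordered = sorted(disclosure_dates)
    if len(ordered) < 2:
        raise ValueError("need at least two disclosure dates")
    # Pair each date with its successor and measure the gap in days.
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


# Hypothetical disclosure dates for a single product (not real NVD data).
sample_dates = [date(2009, 1, 1), date(2009, 1, 31), date(2009, 1, 11)]
print(mean_time_to_next_vulnerability(sample_dates))
```

A shorter MTTNV would then be read as a higher chance that an undisclosed vulnerability currently exists in that product.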

CPE (COMMON PLATFORM ENUMERATION)
What is CPE? CPE is a structured naming scheme for information technology systems, software, and packages.
Example (in primitive format): cpe:/a:acme:product:1.0:update2:pro:en-us
This names the Professional edition of "Acme Product 1.0 Update 2 English".
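
To make the structure of the example name concrete, the following minimal sketch splits a CPE 2.2-style URI into its seven colon-separated components (part, vendor, product, version, update, edition, language). The helper name parse_cpe_uri is hypothetical, and the sketch ignores percent-encoding and the matching rules of the CPE language.

```python
CPE_FIELDS = ("part", "vendor", "product", "version", "update", "edition", "language")


def parse_cpe_uri(uri):
    """Split a CPE 2.2-style URI into named components.

    Trailing components may be omitted in a CPE name; they are
    returned as empty strings here.
    """
    prefix = "cpe:/"
    if not uri.startswith(prefix):
        raise ValueError("not a CPE URI: " + uri)
    components = uri[len(prefix):].split(":")
    if len(components) > len(CPE_FIELDS):
        raise ValueError("too many components: " + uri)
    # Pad omitted trailing components so every field is present.
    components += [""] * (len(CPE_FIELDS) - len(components))
    return dict(zip(CPE_FIELDS, components))


parsed = parse_cpe_uri("cpe:/a:acme:product:1.0:update2:pro:en-us")
print(parsed["vendor"], parsed["product"], parsed["version"])
```

A dictionary of named fields like this could serve as the discrete input to a prediction model, for example by grouping vulnerability records by vendor and product.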

CPE LANGUAGE

CVSS (COMMON VULNERABILITY SCORING SYSTEM)
An open framework for communicating the characteristics and impacts of IT vulnerabilities.
Metric vector: access complexity (H, M, L), authentication (R, NR), confidentiality (N, P, C), ...
CVSS score: calculated from the above vector. It indicates the severity of a vulnerability.
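
As an illustration of how a score is calculated from a metric vector, the sketch below applies the CVSS version 2 base equation and metric weights. This is an assumption about the version: the authentication values on the slide (R, NR) appear to follow an older notation, whereas version 2 uses M, S, and N. The function name cvss2_base_score is hypothetical.

```python
# CVSS v2 base metric weights.
ACCESS_VECTOR = {"L": 0.395, "A": 0.646, "N": 1.0}
ACCESS_COMPLEXITY = {"H": 0.35, "M": 0.61, "L": 0.71}
AUTHENTICATION = {"M": 0.45, "S": 0.56, "N": 0.704}
IMPACT = {"N": 0.0, "P": 0.275, "C": 0.660}


def cvss2_base_score(vector):
    """Compute the CVSS v2 base score from a vector such as
    'AV:N/AC:L/Au:N/C:P/I:P/A:P'."""
    metrics = dict(item.split(":") for item in vector.split("/"))
    impact = 10.41 * (
        1
        - (1 - IMPACT[metrics["C"]])
        * (1 - IMPACT[metrics["I"]])
        * (1 - IMPACT[metrics["A"]])
    )
    exploitability = (
        20
        * ACCESS_VECTOR[metrics["AV"]]
        * ACCESS_COMPLEXITY[metrics["AC"]]
        * AUTHENTICATION[metrics["Au"]]
    )
    # The score is forced to zero when the vulnerability has no impact.
    f_impact = 0.0 if impact == 0 else 1.176
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * f_impact, 1)


print(cvss2_base_score("AV:N/AC:L/Au:N/C:P/I:P/A:P"))
```

Lower access complexity and weaker authentication requirements raise the exploitability term, which is why these metrics are natural candidates for deriving an exploit success probability.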

CVSS USED IN RISK ASSESSMENT
We use CVSS to derive a conditional probability: how likely a vulnerability is to be successfully exploited, given that all of its preconditions are fulfilled. By combining these conditional probabilities with an attack graph, one can calculate the cumulative probability, which gives an overall estimated likelihood of the given machine being compromised.
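
The combination step can be sketched under strong simplifying assumptions: each exploit succeeds independently, the attack graph is reduced to a set of alternative paths that share no exploits, and access complexity maps to a success probability through a made-up table. None of the numbers or names below come from the slides or from MulVAL; a real attack-graph analysis must also handle shared nodes and cycles.

```python
from math import prod

# Hypothetical mapping from CVSS access complexity to the probability that
# an exploit succeeds once its preconditions hold (illustrative values only).
SUCCESS_GIVEN_PRECONDITIONS = {"L": 0.9, "M": 0.6, "H": 0.2}


def path_probability(complexities):
    """Probability that every exploit along one attack path succeeds."""
    return prod(SUCCESS_GIVEN_PRECONDITIONS[c] for c in complexities)


def compromise_probability(paths):
    """Probability that at least one of several independent paths succeeds."""
    probability_all_fail = prod(1 - path_probability(p) for p in paths)
    return 1 - probability_all_fail


# Two alternative paths to the target: a two-step path and a one-step path.
paths_to_target = [["L", "M"], ["H"]]
print(round(compromise_probability(paths_to_target), 3))
```

A predicted zero-day vulnerability would enter such a calculation as an extra exploit step whose success probability comes from the prediction model rather than from a published CVSS vector.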

POSSIBLE TECHNIQUES
Linear regression (inputs are continuous variables). Statistical classification (inputs are discrete variables). Maximum likelihood and least squares (for determining the parameters of our model).
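
For a linear model with Gaussian noise, the maximum-likelihood parameters coincide with the least-squares solution, so the two techniques can be illustrated together. The sketch below fits a one-variable line in closed form; the function names and the toy data are hypothetical, and the actual model would have more inputs.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept


# Toy data lying exactly on y = 2x + 1 (illustrative, not vulnerability data).
model = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(model, predict(model, 4))
```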

VALIDATION METHODOLOGY
Earlier years of NVD: build our model. Later years of NVD: validate our model. Criterion: the estimate should be closer to the factual value than an estimate made without considering zero-day vulnerabilities.
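
A minimal sketch of this validation scheme follows. It assumes each record carries a publication year, and it uses mean absolute error as the measure of "closer to the factual value"; the slides do not name a specific error measure, and all field names and numbers here are hypothetical.

```python
def split_by_year(records, cutoff_year):
    """Earlier years build the model; later years validate it."""
    train = [r for r in records if r["year"] <= cutoff_year]
    test = [r for r in records if r["year"] > cutoff_year]
    return train, test


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predictions and observed values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def model_beats_baseline(model_predictions, baseline_predictions, actual):
    """True if the model is closer to the observed values than the baseline."""
    model_error = mean_absolute_error(model_predictions, actual)
    baseline_error = mean_absolute_error(baseline_predictions, actual)
    return model_error < baseline_error


# Hypothetical records: observed days until the next vulnerability, by year.
records = [
    {"year": 2005, "days": 35},
    {"year": 2006, "days": 45},
    {"year": 2008, "days": 40},
    {"year": 2009, "days": 50},
]
train, test = split_by_year(records, cutoff_year=2006)
actual = [r["days"] for r in test]
print(model_beats_baseline([42, 47], [60, 60], actual))
```

Splitting by time rather than at random keeps later vulnerabilities out of the training data, which matches how the model would be used in practice.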

PLAN
Next phase: study data-mining tools (e.g. Support Vector Machines), then build our prediction model and validate it on NVD.
Final phase: if the previous phase yields a good model, we will incorporate the generated results into MulVAL. Otherwise, we will investigate why the model falls short.

REFERENCES
[1] Andrew Buttner et al., “Common Platform Enumeration (CPE) – Specification.”
[2] NVD.
[3] O. H. Alhazmi et al., “Modeling the Vulnerability Discovery Process.”
[4] Omar H. Alhazmi et al., “Prediction Capabilities of Vulnerability Discovery Models.”
[5] Andy Ozment, “Improving Vulnerability Discovery Models.”
[6] R. Gopalakrishna and E. H. Spafford, “A Trend Analysis of Vulnerabilities.”
[7] Christopher M. Bishop, “Pattern Recognition and Machine Learning.”
[8] Xinming Ou et al., “MulVAL: A Logic-Based Network Security Analyzer.”
[9] Kyle Ingols et al., “Modeling Modern Network Attacks and Countermeasures Using Attack Graphs.”
[10] Miles A. McQueen et al., “Empirical Estimates and Observations of 0Day Vulnerabilities.”
[11] Alex J. Smola et al., “A Tutorial on Support Vector Regression.”

THANK YOU!
Questions & Answers