Exploring Complexity Metrics as Indicators of Software Vulnerability
  Yonghee Shin
  Jason Froehlich
  October 29, 2008
Definitions
  Error – a human mistake that causes a fault in software
  Fault – an encoded human error that causes a failure when executed
  Failure – a deviation of a system from its required behavior
  Vulnerability – a weakness that makes it possible for a potential security violation to occur
Study Objectives
  Does high complexity contribute to software vulnerability?
  What metrics can represent the complexity that leads to vulnerabilities?
  Do vulnerability fixes introduce more complexity?
Hypotheses
  More complex programs have more vulnerabilities.
  Complexity metrics can predict vulnerabilities.
  Modules with vulnerabilities have different complexity than those with faults.
  Vulnerability fixes introduce more complexity.
Study Methodology
  Required Data
    Fault reports (Bugzilla)
    Vulnerability reports (CVE, NVD)
    Source code change history (CVS)
  Model Building
    Statistical analysis – logistic regression
    Machine learning – decision tree, bagging, boosting, Naïve Bayes, Bayesian networks
  Evaluation
    Cross-validation, next release validation
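The model-building and evaluation steps above can be sketched on a small scale. The example below is only an illustration, not the study's implementation: it fits a one-feature logistic regression by plain gradient descent and scores it with k-fold cross-validation. The function names, the learning rate, the number of epochs, and the `DEPTHS`/`LABELS` data (a made-up nesting-depth value per module, with 1 marking a module that had a vulnerability) are all assumptions introduced for the sketch.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, lr=0.5, epochs=500):
    """Fit P(vulnerable | x) = sigmoid(w * z + b) by batch gradient descent,
    where z is x standardized with the training mean and standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n) or 1.0
    zs = [(x - mean) / std for x in xs]
    w = b = 0.0
    for _ in range(epochs):
        errors = [sigmoid(w * z + b) - y for z, y in zip(zs, ys)]
        w -= lr * sum(e * z for e, z in zip(errors, zs)) / n
        b -= lr * sum(errors) / n
    return {"w": w, "b": b, "mean": mean, "std": std}


def predict(model, x, threshold=0.5):
    """Return True when the predicted probability reaches the threshold."""
    z = (x - model["mean"]) / model["std"]
    return sigmoid(model["w"] * z + model["b"]) >= threshold


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(xs, ys, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, and pool the accuracy."""
    correct = 0
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        model = fit_logistic(train_x, train_y)
        correct += sum(predict(model, xs[i]) == bool(ys[i]) for i in held_out)
    return correct / len(xs)


# Invented data for illustration only; these are NOT the study's measurements.
DEPTHS = [1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10]
LABELS = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1]
ACCURACY = cross_validate(DEPTHS, LABELS, k=5)
```

Next release validation would follow the same pattern, except that the model is fitted on the modules of one release and tested on the modules of the following release instead of on shuffled folds.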
Case Study
  JavaScript Engine in Mozilla
  106 vulnerability bugs reported to Bugzilla
  Best metric: nesting complexity
    Low false positive (FP) rate (0.9%), but high false negative (FN) rate (88.0%)
    Better metrics need to be developed
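The FP and FN figures above are rates taken from a confusion matrix. As a hedged illustration of the arithmetic, the sketch below computes both rates from actual and predicted labels. The `Confusion` class, the function names, and the module counts are assumptions: the counts are invented so that the arithmetic is easy to follow and are not the study's data.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int = 0  # vulnerable modules flagged as vulnerable
    fp: int = 0  # clean modules flagged as vulnerable
    tn: int = 0  # clean modules flagged as clean
    fn: int = 0  # vulnerable modules flagged as clean


def tally(actual, predicted):
    """Count the four outcomes from parallel lists of booleans."""
    c = Confusion()
    for a, p in zip(actual, predicted):
        if a and p:
            c.tp += 1
        elif a and not p:
            c.fn += 1
        elif p:
            c.fp += 1
        else:
            c.tn += 1
    return c


def fp_rate(c):
    """Share of clean modules that were wrongly flagged: FP / (FP + TN)."""
    total = c.fp + c.tn
    return c.fp / total if total else 0.0


def fn_rate(c):
    """Share of vulnerable modules that were missed: FN / (FN + TP)."""
    total = c.fn + c.tp
    return c.fn / total if total else 0.0


# Hypothetical counts: 100 vulnerable and 1000 clean modules (not study data).
actual = [True] * 100 + [False] * 1000
predicted = [True] * 12 + [False] * 88 + [True] * 9 + [False] * 991
counts = tally(actual, predicted)
```

With these invented counts, 88 of 100 vulnerable modules are missed and 9 of 1000 clean modules are flagged, which shows how a predictor can look precise on clean code while still missing most vulnerabilities.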