An Empirical Study of OS Errors. Chou, Yang, Chelf, Hallem, and Engler, SOSP 2001. Characterizing a workload with respect to reliability.

Workloads (diagram). Experimental environment: prototype / real system, execution-driven simulation, trace-driven simulation, stochastic simulation. Workload sources: live workload, benchmark applications, micro-benchmark programs, synthetic benchmark programs, traces, synthetic traces, distributions & other statistics, made-up, data sets. Connecting steps: monitor, analysis, generator. For this paper: Linux, compiler analysis. © 2006, Carla Ellis

Method: checkers. Evolution: 21 snapshots of Linux over 7 years. Structure: 7 main subdirectories. Over 1000 unique errors detected.

Metrics. Inspected errors: manually reviewed and propagated back through versions. Projected errors: automatically found by low-false-positive checkers. Notes: number of times a check was applied. Relative error rate: errors / notes.
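To make the last metric concrete, here is a minimal sketch of the relative error rate as errors divided by notes. The function name and the per-directory counts are hypothetical placeholders, not numbers from the study.

```python
def relative_error_rate(errors: int, notes: int) -> float:
    """Errors per applied check: errors / notes."""
    if notes <= 0:
        raise ValueError("notes must be positive")
    return errors / notes


# Hypothetical (errors, notes) counts per directory, for illustration only.
counts = {"drivers": (30, 1000), "fs": (5, 500)}
rates = {name: relative_error_rate(e, n) for name, (e, n) in counts.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.3f} errors per note")
```

Normalizing by notes is what lets directories of very different size be compared: a large directory can have more errors in absolute terms and still a lower rate.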

Caveats. Compiler analysis: is the targeted set of errors representative of all bugs? All bugs are treated equally, with no extra weight for important bugs. Narrow focus on low-level bookkeeping operations; the claim is that bad code is unlikely to avoid exposing at least some of the errors the checkers look for.

Size of Subdirectories

Projected Bug Counts

Where are the errors?

Error-rate by function size

Log series distribution (plot legend: o = data points, x = fitted distribution), θ = 0.567
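For reference, the standard log series (logarithmic) distribution has probability mass P(k) = -θ^k / (k ln(1 - θ)) for k = 1, 2, ... A minimal sketch follows, assuming the 0.567 on the slide is that parameter (called theta here) and that k counts errors per file; both readings are assumptions, not statements from the slide.

```python
import math


def log_series_pmf(k: int, theta: float) -> float:
    """P(K = k) for the log series distribution, k >= 1, 0 < theta < 1."""
    if k < 1:
        raise ValueError("k must be >= 1")
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must be in (0, 1)")
    return -(theta ** k) / (k * math.log(1.0 - theta))


theta = 0.567  # value shown on the slide; its role as theta is assumed
for k in range(1, 6):
    print(k, round(log_series_pmf(k, theta), 4))
```

The mass falls off quickly with k, so most units have a single error while a few have many.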

Bug Lifetimes

Error birth for 2.4.1

Birth & Death. Uses just Block, Null, and Var (the low-false-positive checkers). Bottom graph: shifted to connect peaks. Lifetimes are computed mostly from odd-numbered releases.

Kaplan-Meier Estimates of Lifetime. The method deals with censoring (truncation): a censored bug is known only to survive at least as long as it was observed. Issues included granularity and interference from errors found in previous work.
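A minimal sketch of the Kaplan-Meier product-limit estimator shows how censored observations are handled. The function name and the sample lifetimes are hypothetical; this is the textbook estimator, not the authors' code.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each time t with at least one observed death.

    durations[i] is how long bug i was tracked; observed[i] is True if its
    fix was seen, False if it was censored (still alive at the last snapshot).
    """
    deaths = Counter(t for t, seen in zip(durations, observed) if seen)
    exits = Counter(durations)  # deaths and censorings leaving the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: fraction of the at-risk set surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical lifetimes in years; False marks a censored bug.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
for t, s in curve:
    print(f"S({t}) = {s:.2f}")
```

A censored bug stays in the at-risk count until its last observation and then drops out without counting as a death, which is how "survives at least as long as" enters the estimate.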

Do bugs cluster? One would expect the number of errors to be a stable fraction of the number of notes, but the data are spiky. A: 80% of errors are accounted for by 50% of the files with errors. B & C: random exp
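A statement like the one for graph A can be checked with a short concentration calculation: sort the files that contain errors by error count and ask what share of all errors the top fraction holds. The sketch below uses a hypothetical function name and made-up per-file counts.

```python
import math


def share_in_top_files(errors_per_file, file_fraction):
    """Share of all errors held by the top `file_fraction` of files with errors."""
    counts = sorted((c for c in errors_per_file if c > 0), reverse=True)
    if not counts:
        raise ValueError("no file contains an error")
    top_n = max(1, math.ceil(file_fraction * len(counts)))
    return sum(counts[:top_n]) / sum(counts)


# Made-up per-file error counts; files with zero errors are ignored.
sample = [5, 3, 1, 1, 0, 0]
print(share_in_top_files(sample, 0.5))
```

Only files with at least one error are ranked, matching the slide's wording "files with errors"; including error-free files would make the concentration look even stronger.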

Global cluster metric c. c_theor uses the log series distribution. c > 1 means more clustering than random.

Intuitively, why clusters? Widespread ignorance of the system rules. Poor programming in a focused place. Cut-and-paste errors. Less-executed code is less well tested.

Summary. Driver code is error-prone. Error distributions seem to fit a log series distribution. Average lifetime of bugs is 1.8 years. Clustering exists.

For next Tuesday: Chapter 10. Assignment on data presentation: the more examples the better, but I'd rather have one exceptionally bad example than a survey of garden-variety plots. Of potential interest: BugBench, a benchmark suite of known buggy programs.
