Presentation is loading. Please wait.

Presentation is loading. Please wait.

Jason Flinn Michael Chow, David Devecsery, Xianzheng Dou, Andrew Quinn

Similar presentations


Presentation on theme: "Jason Flinn Michael Chow, David Devecsery, Xianzheng Dou, Andrew Quinn"— Presentation transcript:

1 Learning from the Past: Fast, On-Demand Analysis of Prior Executions with Eidetic Systems
Jason Flinn Michael Chow, David Devecsery, Xianzheng Dou, Andrew Quinn University of Michigan

2 What is an Eidetic System?
Eidetic – Having “Perfect memory” or “Total Recall” Eidetic System – A system which can recall and trace through the lineage of any past computation An eidetic system is one that remembers all computation, and can query the history of it. That means it can look through temporary variables, named pipes, process address space, and track the history and evolution of the computer system. If you were to work at an eidetic workstation you would never have to worry about accidently deleting files, etc. David Devecsery

3 Motivation - Heartbleed
Consider “Heartbleed” People didn’t know if they were exploited. What leaked? Whose data leaked? How important was that data? What data was even available in the web server to leak? Was Heartbleed exploited? What data was leaked? David Devecsery

4 Motivation - Heartbleed
Message Leaked Data Was Heartbleed exploited? - Yes What data was leaked? David Devecsery

5 Motivation - Heartbleed
Leaked Database Rows Heartbleed Message Leaked Data Was Heartbleed exploited? - Yes What data was leaked? David Devecsery

6 Motivation – Wrong Reference
Bad Citation How did I get the wrong citation? David Devecsery

7 Motivation – Wrong Reference
How did I get the wrong citation? David Devecsery

8 Motivation – Wrong Reference
How did I get the wrong citation? David Devecsery

9 Motivation – Wrong Reference
How did I get the wrong citation? What else did this affect? David Devecsery

10 Motivation ConfAid uses taint tracking ConfAid uses taint tracking
How did I get the wrong citation? What else did this affect? David Devecsery

11 Motivation – Configuration Troubleshooting
Config file file = open(config file) token = read_token(file) if (token equals “ExecCGI”) execute_cgi = 1 if (execute_cgi == 1) ERROR() ExecCGI ConfAid uses a top down approach instead of bottom-up that the developers use, and it uses taint tracking techniques to find the causal relationships. So, there is how it works: when the application reads something from the config file, ConfAid assigns a specific taint to it and as the application runs, it propagates the taint via data flow and information flow. At the end when the error happens it uses these taints to links the error back to the configuration tokens that caused it. Mona Attariyan - University of Michigan

12 Arnold First practical eidetic computer system Reasonable overheads
Efficiently records & recalls all user-space computation Process register/memory state Inter-process communication Supports provenance queries What data was affected? What states and outputs were affected? Targeted towards desktop/workstation use Reasonable overheads Record 4 years of data on $150 commodity HD Under 8% performance overhead on most benchmarks David Devecsery

13 Deterministic Record & Replay
Most execution on a computer system is deterministic Can precisely reproduce an execution by logging non-determinism RECORD PLAY Michael Chow

14 Model-Based Compression
Formulate a model of a typical execution Only record deviations from that model ret_val = sys_read (fd, buffer, count); Idea: Partial determinism Encourage the program to conform to the model usually equal e.g. delta encoding, move-to-front transform for X server messages After sys_read We only need to record when.. , otherwise, we need to record nothing The better the model, the less we need in our log. The log size is proportional to the deviations David Devecsery

15 Semi-Determinism Example: Time
Frequent time queries are non-deterministic Use partially deterministic clock Real time clock & deterministic clock Bound deviation if (deterministic_clock – real_time_clock < threshold) { adjust deterministic_clock record deviation } return deterministic_clock David Devecsery

16 Space Optimizations David Devecsery

17 Space Optimizations David Devecsery

18 Space Optimizations David Devecsery

19 Space Optimizations 4 years of data on a $150 4TB commodity HD
David Devecsery

20 Performance Evaluation
David Devecsery

21 Querying Provenance Two types of queries:
Reverse: Where did this data come from? Forward: What did this data affect? How does Arnold support these queries? User specifies initial state Trace the lineage of the computation Intra-process tracking (DIFT) Inter-process tracking (via OS support) David Devecsery

22 Intra-Process Provenance
Use DIFT (taint tracking) for intra-process causality Run retroactively, on recorded execution Query is tuple of <sources, sinks, propagation function> Arnold supports several propagation functions: Copy Only Data Flow Data+Index Flow Control Flow Strong input/output relation Precision Weak input/output Relation Recall May miss relations Misses few relations David Devecsery

23 Inter-Process Lineage
Two notions of inter-process linkage Process graph Tracks lineage through inter-process communication Precise Captures group to group communication Human linkage Handles relations between user inputs and outputs Infers linkages based on data content and time Imprecise – may have false negatives and false positives Can capture linkages the process graph can miss David Devecsery

24 Evaluation – Wrong Reference
Copy Copy Data Data + Index Data Human Linkage Few false positives (font files, latex sty files, libc.so, libXt.so) No false negatives Record Time Replay Time Replay + Pin Time Query Time 96.1s 2.2s 70.0s 209.5s David Devecsery

25 Evaluation – Heartbleed
Data + Index Data + Index Data + Index No false positives or negatives Record Time Replay Time Replay + Pin Time Query Time 230.3s 0.4s 139.5s 235.1s David Devecsery

26 Evaluation – Heartbleed
Data + Index Data + Index Data + Index No false positives or negatives Record Time Replay Time Replay + Pin Time Query Time 230.3s 0.4s 139.5s 235.1s David Devecsery

27 Parallelizing queries
Idea: parallelize across a compute cluster Problem: DIFT is “embarrassingly sequential” (Ruwase ‘07) Solution: use deterministic replay to time-slice DIFT DIFT execution David Devecsery

28 Time-slicing DIFT DIFT propagation function relates sources and sinks
Problem: Each epoch only knows its local sources and sinks Treat all addresses/registers at epoch start as sources Teat all addresses/registers at epoch end as sinks Aggregation phase “stitches together” DIFT results from all epochs A = source1 B = source2 C = A + B D = C Output D A: {1} B: {2} C: {1,2} D = {C} Sink1 <- {C} {<Sink1,1> , <Sink1,2>} David Devecsery

29 Aggregation To scale, aggregation must run in parallel
Problem: fully-parallel version has obscenely high constant costs All-to-all address/register map does not fit in 256 GB of memory! Solution: Organize epochs in chain by program order Stream info about relations to sources forward along chain Stream info about relations to sinks backward along chain David Devecsery

30 Results: nginx David Devecsery

31 Results: Firefox David Devecsery

32 Results: Evince David Devecsery

33 Conclusion Eidetic systems can: Arnold: First practical Eidetic System
Recall any past state Explain the provenance of that state Arnold: First practical Eidetic System Low runtime overhead 4 years of computation on a commodity HD Parallelizing DIFT queries across a cluster Enables interactive response times (seconds, not hours) Supports powerful queries (what affected this value?) David Devecsery


Download ppt "Jason Flinn Michael Chow, David Devecsery, Xianzheng Dou, Andrew Quinn"

Similar presentations


Ads by Google