1
Data Representation Synthesis PLDI’2011, ESOP’12, PLDI’12 Peter Hawkins, Stanford University Alex Aiken, Stanford University Kathleen Fisher, Tufts Martin Rinard, MIT Mooly Sagiv, TAU http://theory.stanford.edu/~hawkinsp/
2
Research Interests: verifying properties of low-level data structure manipulations; synthesizing low-level data structure manipulations (composing ADT operations); concurrent programming via smart libraries.
3
Motivation: Sophisticated data structures exist. They raise the abstraction level, simplify programming (including concurrency), and are integrated into modern programming languages. But they are hard to use, especially when composing several operations.
4
thttpd Web Server: Conn / Map [diagram: a hash table of Map nodes; each node records the index of its table slot]
Representation Invariants:
1. ∀n ∈ Map. ∀v ∈ ℤ. table[v] = n ⟺ n->index = v
2. ∀n ∈ Map. n->rc = |{n' ∈ Conn : n'->file_data = n}|
5
thttpd: mmc.c

static void add_map(Map *m) {
    int i = hash(m);
    ...
    table[i] = m;    /* invariant 1 broken */
    ...
    m->index = i;    /* invariant 1 restored */
    ...
    m->rc++;
}

Representation Invariants:
1. ∀n ∈ Map. ∀v ∈ ℤ. table[v] = n ⟺ index[n] = v
2. ∀n ∈ Map. rc[n] = |{n' ∈ Conn : file_data[n'] = n}|
6
TOMCAT 5.* Motivating Example
Invariant: removeAttribute(name) returns the removed value, or null if the name does not exist.

attr = new HashMap();
...
Attribute removeAttribute(String name) {
    Attribute val = null;
    synchronized (attr) {
        found = attr.containsKey(name);
        if (found) {
            val = attr.get(name);
            attr.remove(name);
        }
    }
    return val;
}
7
TOMCAT 6.* Motivating Example
Invariant: removeAttribute(name) returns the removed value, or null if the name does not exist.

attr = new ConcurrentHashMap();
...
Attribute removeAttribute(String name) {
    Attribute val = null;
    /* synchronized (attr) { */
    found = attr.containsKey(name);
    if (found) {
        val = attr.get(name);
        attr.remove(name);
    }
    /* } */
    return val;
}

With the lock removed, operations from another thread (e.g., attr.put("A", o); attr.remove("A");) can interleave between containsKey, get, and remove, so the composed operation is no longer atomic and the invariant can be violated.
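A fix along these lines, sketched here for contrast (this exact code is not on the slide): ConcurrentHashMap.remove(key) is itself atomic and returns the previous value, or null if the key was absent, so the composed check-then-act can collapse into a single call.

import java.util.concurrent.ConcurrentHashMap;

class Attributes {
    // Attribute stands in for Tomcat's attribute value type (placeholder for this sketch).
    static class Attribute {}

    private final ConcurrentHashMap<String, Attribute> attr = new ConcurrentHashMap<>();

    // remove() is atomic and returns the removed value, or null if the name does not exist,
    // which is exactly the invariant removeAttribute must maintain.
    Attribute removeAttribute(String name) {
        return attr.remove(name);
    }
}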
8
Multiple ADTs
Invariant: every element added to eden is either in eden or in longterm.

public void put(K k, V v) {
    if (this.eden.size() >= size) {
        this.longterm.putAll(this.eden);
        this.eden.clear();
    }
    this.eden.put(k, v);
}
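Why the composed put is not atomic (an illustrative interleaving, assuming eden and longterm are individually thread-safe maps and there is no further synchronization):

// Thread 1: this.longterm.putAll(this.eden);  // copies eden's current contents
// Thread 2: this.eden.put(k2, v2);            // k2 now lives only in eden
// Thread 1: this.eden.clear();                // drops k2, which was never copied
// Result: k2 is in neither eden nor longterm, violating the invariant.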
9
Filesystem Example [diagram: a filesystems list links filesystem 1 and filesystem 2 through s_list; each filesystem's s_files list links its cached files through f_fs_list; all files are also linked through f_list into the file_in_use and file_unused lists]
10
Access Patterns
Find all mounted filesystems.
Find the cached files on each filesystem.
Iterate over all used or unused cached files in Least-Recently-Used order.
[same filesystem diagram as on the previous slide]
11
Desired Properties of Data Representations: correctness, efficiency for the access patterns, simple to reason about, easy to evolve.
12
Specifying Combinations of Shared Data Structures: separation logic, first-order logic + reachability, monadic 2nd-order logic on trees, graph grammars, …
13
Filesystem Example as a relation with columns {fs, file} and {inuse} (functional dependency {fs, file} → {inuse}):

fs | file | inuse
 1 |  14  |  F
 2 |   7  |  T
 2 |   5  |  F
 1 |   6  |  T
 1 |   2  |  F

[diagram: the same filesystem and file lists as before, now viewed as tuples of this relation]
14
Filesystem Example, Containers: View 1 [diagram: group the relation by filesystem, fs:1 → {file:14, file:6, file:2}, fs:2 → {file:7, file:5}]
15
Filesystem Example, Containers: View 2 [diagram: group the relation by the inuse flag, inuse:T → {(fs:2, file:7), (fs:1, file:6)}, inuse:F → {(fs:1, file:14), (fs:2, file:5), (fs:1, file:2)}]
16
Filesystem Example, Containers: Both Views [diagram: the two groupings side by side, sharing the same underlying tuples; the columns {fs, file} determine {inuse}]
17
Decomposing Relations
A high-level shape descriptor is a rooted directed acyclic graph:
– the root represents the whole relation
– the sub-graph rooted at each node represents a sub-relation
– edges are labeled with columns; each edge maps a relation into a sub-relation
– different types of edges correspond to different primitive data structures
– shared nodes correspond to shared sub-relations
[example decomposition: one path binds fs and then file, the other binds inuse and then (fs, file); the edges are implemented by a list, an array, and lists]
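To make the decomposition concrete, here is a rough Java sketch (illustrative only; neither RelC syntax nor generated code) of the filesystem decomposition: the left path groups by fs and then file, the right path groups by inuse, and both paths point at the same shared node objects.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FileNode {
    // Shared node: one object per (fs, file) tuple, reachable from both paths.
    final int fs, file;
    boolean inuse;
    FileNode(int fs, int file, boolean inuse) { this.fs = fs; this.file = file; this.inuse = inuse; }
}

class FsRelation {
    // Left path:  root --fs--> node --file--> shared FileNode
    final Map<Integer, Map<Integer, FileNode>> byFs = new HashMap<>();
    // Right path: root --inuse--> node --(fs, file)--> shared FileNode
    final Map<Boolean, List<FileNode>> byInuse = new HashMap<>();

    void insert(int fs, int file, boolean inuse) {
        FileNode n = new FileNode(fs, file, inuse);
        byFs.computeIfAbsent(fs, k -> new HashMap<>()).put(file, n);
        byInuse.computeIfAbsent(inuse, k -> new ArrayList<>()).add(n);
    }
}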
18
Filesystem Example [diagram: the left-hand path of the decomposition instance, splitting the relation by fs into per-filesystem sub-relations from file to inuse, fs:1 → {14:F, 6:T, 2:F}, fs:2 → {7:T, 5:F}]
19
Memory Decomposition (Left) [diagram: the memory layout of the left-hand path, each filesystem node pointing to its file nodes with their inuse flags]
20
Filesystem Example [diagram: the right-hand path of the decomposition instance, splitting the relation by inuse into sub-relations of (fs, file) pairs, inuse:T → {(2, 7), (1, 6)}, inuse:F → {(1, 14), (2, 5), (1, 2)}]
21
Memory Decomposition (Right) [diagram: the memory layout of the right-hand path, one list of (fs, file) nodes per inuse value]
22
Decomposition Instance [diagram: the full decomposition instance for the example relation, with both paths populated and sharing the same tuple nodes]
23
Shape Descriptor [diagram: the shape descriptor (the decomposition graph with column-labeled edges) shown together with the decomposition instance it describes]
24
Memory State [diagram: the decomposition edges implemented by concrete data structures; the fs edge by a list (s_list), the inuse edge by an array, and the file and (fs, file) edges by lists (f_fs_list and f_list)]
25
Memory State [diagram: the same memory state drawn with the original kernel-style structures, the filesystems list (s_list), the per-filesystem file lists (s_files / f_fs_list), and the file_in_use / file_unused lists (f_list)]
26
Adequacy: Not every decomposition is a good representation of a relation. A decomposition is adequate if it can represent every possible relation matching the relational specification. The compiler enforces sufficient conditions for adequacy.
27
Adequacy of Decompositions All columns are represented Nodes are consistent with functional dependencies – Columns bound to paths leading to a common node must functionally determine each other
28
Respect Functional Dependencies: {file} → {inuse} [decomposition diagram: an edge binding file leading to a node holding inuse]
29
Respect Functional Dependencies: {file, fs} → {inuse} [decomposition diagram: an edge binding (file, fs) leading to a node holding inuse]
30
Adequacy and Sharing: Columns bound on a path to an object x must functionally determine the columns bound on any other path to x. [diagram: the shared node is reached by a path binding {fs, file} and by a path binding {inuse, fs, file}; since {fs, file} → {inuse, fs, file} holds, this sharing is adequate]
31
Adequacy and Sharing: Columns bound on a path to an object x must functionally determine the columns bound on any other path to x. [diagram: here one path binds {fs, file} while the other binds only {inuse, fs}; {inuse, fs} does not determine {fs, file}, so this sharing violates the condition]
32
Decompositions and Aliasing A decomposition is an upper bound on the set of possible aliases Example: there are exactly two paths to any instance of node w
33
The RelC Compiler [PLDI'11]: the programmer writes the relational specification, the data structure designer supplies the decomposition, and RelC compiles them to C++.
34
The RelC Compiler
Relational specification: columns fs, file, inuse with functional dependency {fs, file} → {inuse}, and operations such as: foreach filesystems s.t. fs = 5 do …
Graph decomposition: one path binds fs (a list) and then file (a list); the other binds inuse (an array) and then (fs, file) (a list).
RelC compiles the specification and decomposition to C++.
35
Query Plans: foreach filesystems if inuse = T do … Plan 1 (scan): walk the per-filesystem file lists and filter on inuse. Cost proportional to the number of files.
36
Query Plans: foreach filesystems if inuse = T do … Plan 2 (access): follow the inuse edge directly and scan only the in-use list. Cost proportional to the number of files in use.
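The two plans, sketched in Java against the FsRelation sketch from above (illustrative, not RelC output): the first scans every file and filters on inuse, the second follows the inuse edge and touches only the files in use.

import java.util.Collections;
import java.util.Map;

class QueryPlans {
    // Plan 1 (scan): cost proportional to the total number of files.
    static int countInUseByScanning(FsRelation r) {
        int count = 0;
        for (Map<Integer, FileNode> files : r.byFs.values())
            for (FileNode n : files.values())
                if (n.inuse) count++;
        return count;
    }

    // Plan 2 (access): cost proportional to the number of files in use.
    static int countInUseDirectly(FsRelation r) {
        return r.byInuse.getOrDefault(true, Collections.emptyList()).size();
    }
}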
37
Removal and graph cuts: removing tuples (here, filesystem 1 and its cached files) requires unlinking the affected nodes from every container that reaches them, which corresponds to a cut of the decomposition graph. [diagram: the decomposition instance before removal, with fs:1 and files 14, 6, 2 highlighted]
38
Removal and graph cuts [diagram: the decomposition instance after removal, leaving only filesystem 2 with file 7 (in use) and file 5 (unused)]
39
Abstraction Theorem: If the programmer obeys the relational specification, the decomposition is adequate, and the individual containers are correct, then the generated low-level code maintains the relational abstraction. [commutative diagram: remove on the abstract relation corresponds to the generated low-level remove on the memory state]
40
Concurrent Filesystem Example: Lock-based concurrent code is difficult to write and difficult to reason about. It is hard to choose the correct granularity of locking, and hard to use concurrent containers correctly.
41
Granularity of Locking
Fine-grained locking: more complex, more scalable.
Coarse-grained locking: less complex, lower overhead, easier to reason about, easier to change.
42
Concurrent Containers: Concurrent containers are hard to write, but excellent ones exist (e.g., Java's ConcurrentHashMap and ConcurrentSkipListMap). Composing concurrent containers is tricky: it is not usually safe to update several containers in parallel, and lock association is tricky. In practice, programmers often use concurrent containers incorrectly (44% of the time) [Shacham et al., 2011].
43
Concurrent Data Representation Synthesis [PLDI'12]: from a relational specification and a concurrent decomposition, the RelScala compiler generates concurrent data structures and atomic transactions.
44
Concurrent Decompositions describe how to represent a relation using concurrent containers and locks. They specify: the choice of containers, the number and placement of locks, and which locks protect which data. [diagram: an example decomposition whose edges are implemented by a ConcurrentHashMap and a TreeMap]
45
Concurrent Interface: an interface consisting of atomic operations. [table of operations]
46
Goal: Serializability
Every schedule of operations is equivalent to some serial schedule. [timeline diagrams: interleaved operations of Thread 1 and Thread 2, reordered into an equivalent serial schedule]
48
Two-Phase Locking
Attach a lock to each piece of data.
Two-phase locking protocol:
– Well-locked: to perform a read or write, a thread must hold the corresponding lock.
– Two-phase: all lock acquisitions must precede all lock releases.
Theorem [Eswaran et al., 1976]: well-locked, two-phase transactions are serializable.
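A minimal Java sketch of a well-locked, two-phase transaction over hypothetical data (the locking protocol is the slide's; the bank-account example is only an illustration): all acquisitions happen before any release, and every access is covered by its lock.

import java.util.concurrent.locks.ReentrantLock;

class Account {
    final ReentrantLock lock = new ReentrantLock();
    int balance;
}

class TwoPhaseTransfer {
    static void transfer(Account from, Account to, int amount) {
        // Growing phase: acquire every lock the transaction will need.
        from.lock.lock();
        to.lock.lock();
        try {
            // All reads and writes happen while the corresponding locks are held (well-locked).
            from.balance -= amount;
            to.balance += amount;
        } finally {
            // Shrinking phase: release locks; nothing is acquired after this point.
            to.lock.unlock();
            from.lock.unlock();
        }
        // Note: without a lock ordering (see the later slide), two transfers in opposite
        // directions could deadlock.
    }
}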
49
Two-Phase Locking [Decomposition / Decomposition Instance diagrams]: attach a lock to every edge, and we're done! Problem 1: locks cannot be attached to container entries. Problem 2: too many locks.
50
Lock Placements: 1. Attach locks to nodes. [Decomposition / Decomposition Instance diagrams]
51
Coarse-Grained Locking [Decomposition / Decomposition Instance diagrams: a single lock protects the entire decomposition]
52
Finer-Grained Locking [Decomposition / Decomposition Instance diagrams: separate locks protect different parts of the decomposition]
53
Lock Striping [Decomposition / Decomposition Instance diagrams: a fixed pool of locks, each protecting a stripe of a container's entries]
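A hedged sketch of lock striping in Java (illustrative; not the synthesized code): a small fixed array of locks, each guarding its own segment of the data, so the lock count stays constant while unrelated keys rarely contend.

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

class StripedMap<K, V> {
    private static final int STRIPES = 16;
    private final ReentrantLock[] locks = new ReentrantLock[STRIPES];
    @SuppressWarnings("unchecked")
    private final Map<K, V>[] segments = new Map[STRIPES];

    StripedMap() {
        for (int i = 0; i < STRIPES; i++) {
            locks[i] = new ReentrantLock();
            segments[i] = new HashMap<>();
        }
    }

    // Each key hashes to one stripe; that stripe's lock guards only its segment.
    private int stripe(Object key) {
        return (key.hashCode() & 0x7fffffff) % STRIPES;
    }

    V put(K key, V value) {
        int s = stripe(key);
        locks[s].lock();
        try { return segments[s].put(key, value); } finally { locks[s].unlock(); }
    }

    V get(Object key) {
        int s = stripe(key);
        locks[s].lock();
        try { return segments[s].get(key); } finally { locks[s].unlock(); }
    }
}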
54
Lock Placements: Domination. Locks must dominate the edges they protect. [Decomposition / Decomposition Instance diagrams]
55
Lock Placements: Path-Closure. All edges on a path between an edge and its lock must share the same lock.
56
Decompositions and Aliasing: A decomposition is an upper bound on the set of possible aliases (example: there are exactly two paths to any instance of node w), so we can reason soundly about the relationship between locks and heap cells.
57
Lock Ordering: prevent deadlock via a topological order on locks.
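A small sketch of the idea in Java (the rank field and helper are hypothetical, not the papers' API): give every lock a fixed rank and always acquire locks in ascending rank, which realizes a topological order and rules out cycles of waiting threads.

import java.util.concurrent.locks.ReentrantLock;

class OrderedLock extends ReentrantLock {
    final long rank;              // fixed, globally consistent position in the lock order
    OrderedLock(long rank) { this.rank = rank; }
}

class LockOrdering {
    // Every thread acquires the lower-ranked lock first, so no deadlock cycle can form.
    static void acquireInOrder(OrderedLock a, OrderedLock b) {
        OrderedLock first = a.rank <= b.rank ? a : b;
        OrderedLock second = a.rank <= b.rank ? b : a;
        first.lock();
        second.lock();
    }
}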
58
58 Queries and Deadlock 2. lookup(tv) 1. acquire(t) 3. acquire(v) 4. scan(vw) Query plans must acquire the correct locks in the correct order Example: find files on a particular filesystem
59
Autotuner: given a fixed set of primitive types (list, circular list, doubly-linked list, array, map, …) and a workload, exhaustively enumerate all adequate decompositions up to a certain size; the compiler can then automatically pick the best representation for the workload.
60
Directed Graph Example (DFS)
Columns: src, dst, weight
Functional dependency: {src, dst} → {weight}
Primitive data types: map, list, …
[diagrams: several enumerated candidate decompositions built from maps and lists, e.g., src → dst → weight]
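One candidate decomposition, sketched in Java (an illustration of the shape the autotuner would enumerate, not its output): successors as a map from src to a map from dst to weight, plus a predecessor list per dst.

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DirectedGraph {
    // Path 1: src --map--> dst --map--> weight (respects {src, dst} -> {weight})
    private final Map<Integer, Map<Integer, Double>> successors = new HashMap<>();
    // Path 2: dst --map--> list of src (supports "find predecessors")
    private final Map<Integer, List<Integer>> predecessors = new HashMap<>();

    void insertEdge(int src, int dst, double weight) {
        successors.computeIfAbsent(src, k -> new HashMap<>()).put(dst, weight);
        predecessors.computeIfAbsent(dst, k -> new ArrayList<>()).add(src);
    }

    Set<Integer> findSuccessors(int src) {
        return successors.getOrDefault(src, Collections.emptyMap()).keySet();
    }

    List<Integer> findPredecessors(int dst) {
        return predecessors.getOrDefault(dst, Collections.emptyList());
    }
}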
61
Concurrent Synthesis: find the optimal combination of
– decomposition shape
– container data structures: ConcurrentHashMap, ConcurrentSkipListMap, CopyOnWriteArrayList, Array, HashMap, TreeMap, LinkedList
– lock implementations: ReentrantReadWriteLock, ReentrantLock
– lock striping factors
– lock placement
62
Concurrent Graph Benchmark: start with an empty graph; each thread performs 5 × 10^5 random operations; the distribution of operations is a-b-c-d (a% find successors, b% find predecessors, c% insert edge, d% remove edge); plot throughput with a varying number of threads. Based on Herlihy's benchmark of concurrent maps.
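A hedged sketch of such a worker thread (the operation count and the 35-35-20-10 mix come from the slides; the ConcurrentGraph interface and everything else here are illustrative assumptions):

import java.util.concurrent.ThreadLocalRandom;

interface ConcurrentGraph {
    void insertEdge(int src, int dst, double weight);
    void removeEdge(int src, int dst);
    void findSuccessors(int src);
    void findPredecessors(int dst);
}

class GraphBenchmark {
    // One worker: 5 * 10^5 random operations drawn from a 35-35-20-10 mix.
    static Runnable worker(ConcurrentGraph g, int nodes) {
        return () -> {
            ThreadLocalRandom rnd = ThreadLocalRandom.current();
            for (int i = 0; i < 500_000; i++) {
                int p = rnd.nextInt(100);
                int u = rnd.nextInt(nodes), v = rnd.nextInt(nodes);
                if (p < 35)      g.findSuccessors(u);
                else if (p < 70) g.findPredecessors(u);
                else if (p < 90) g.insertEdge(u, v, rnd.nextDouble());
                else             g.removeEdge(u, v);
            }
        };
    }
}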
63
Results: 35-35-20-10 (35% find successor, 35% find predecessor, 20% insert edge, 10% remove edge). [throughput plot; key: black = handwritten implementation, isomorphic to the blue variant; other entries mark decompositions using ConcurrentHashMap and HashMap]
64
(Some) Related Projects
Relational synthesis: [Cohen & Campbell 1993], [Batory & Thomas 1996], [Smaragdakis & Batory 1997], [Batory et al. 2000], [Manevich 2012], …
Two-phase locking and predicate locking [Eswaran et al., 1976]; tree and DAG locking protocols [Attiya et al., 2010]; domination locking [Golan-Gueta et al., 2011]
Lock inference for atomic sections: [McCloskey et al., 2006], [Hicks, 2006], [Emmi, 2007]
65
Summary: Decompositions describe how to represent a relation using concurrent containers. Lock placements capture different locking strategies. Synthesis explores the combined space of decompositions and lock placements to find the best possible concurrent data structure implementations.
66
The Big Idea: compile a relational specification into low-level concurrent code. Benefits: simpler code, correctness by construction, and finding better combinations of data structures and locking.
67
Transactional Objects with Foresight. Guy Gueta, G. Ramalingam, Mooly Sagiv, Eran Yahav
68
Motivation: Modern libraries provide atomic ADTs that hide synchronization complexity, but they are not composable: sequences of operations are not necessarily atomic.
69
Transactional Objects with Foresight: The client declares the intended use of an API via temporal specifications (foresight). The library utilizes the specification, synchronizing between operations that do not serialize. Foresight can be inferred automatically by a sequential static analysis.
70
Example: Positive Counter [code sketch: Operation A is annotated @mayUse(Inc) and calls Inc(); Operation B is annotated @mayUse(Dec) and calls Dec()]. Composite operations are not atomic even when Inc/Dec are atomic. Enforce atomicity using future information; future information can be utilized for effective synchronization.
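The @mayUse annotations are the paper's mechanism; the underlying problem can be shown with plain java.util.concurrent types. A minimal sketch (the composite operation below is hypothetical) of why composing atomic Inc/Dec is not itself atomic:

import java.util.concurrent.atomic.AtomicInteger;

class PositiveCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    void inc() { value.incrementAndGet(); }   // atomic in isolation

    // Composite "decrement only if positive": each step is atomic, but the
    // check-then-act pair is not. With value == 1, two threads can both pass
    // the check and drive the counter negative.
    void decIfPositive() {
        if (value.get() > 0) {
            value.decrementAndGet();
        }
    }
}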
71
Possible Executions (initial value = 0) [table of possible executions]
72
Main Results: Define the notion of ADTs that utilize foresight information and show that they are a strict generalization of existing locking mechanisms. Create a Java Map ADT that utilizes foresight. A static analysis infers conservative foresight information. Performance is comparable to manual synchronization, and the approach can be used for real composite operations.
73
The RelC Compiler
Relational specification: columns fs, file, inuse with functional dependency {fs, file} → {inuse}, and operations such as: foreach filesystems s.t. fs = 5 do …
Graph decomposition: fs (list), file (list), inuse (array), (fs, file) (list).
RelC compiles the specification and decomposition to C++.