Data Representation Synthesis — PLDI'2011 *, ESOP'12, PLDI'12 *, CACM'12. Peter Hawkins, Stanford University; Alex Aiken, Stanford University; Kathleen Fisher, DARPA; Martin Rinard, MIT; Mooly Sagiv, TAU

Data Representation Synthesis PLDI’2011 *, ESOP’12, PLDI’12 * CACM’12 Peter Hawkins, Stanford University Alex Aiken, Stanford University Kathleen Fisher, DARPA Martin Rinard, MIT Mooly Sagiv, TAU * Best Paper Award

Background
High-level formalisms for static program analysis:
– Circular attribute grammars
– Horn clauses
Interprocedural analysis:
– Context-free reachability, implemented in SLAM/SDV
Shape analysis:
– Low-level pointer data structures

thttpd: Web Server Conn Map
[diagram: hash table with slots [0]–[5]; each Map entry carries a next pointer and an index field recording its slot]
Representation Invariants:
1. ∀n: Map. ∀v: Z. table[v] = n ⇒ n->index = v
2. ∀n: Map. n->rc = |{n' : Conn. n'->file_data = n}|

thttpd mmc.c:

static void add_map(Map *m) {
    int i = hash(m);
    …
    table[i] = m;     /* invariant 1 broken */
    …
    m->index = i;     /* invariant 1 restored */
    …
    m->rc++;
}

Representation Invariants:
1. ∀n: Map. ∀v: Z. table[v] = n ⇒ index[n] = v
2. ∀n: Map. rc[n] = |{n' : Conn. file_data[n'] = n}|
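The broken/restored annotation can be made concrete. A minimal executable sketch of invariant 1 (names like MapEntry are illustrative stand-ins, not thttpd's real code), checked between the two writes:

```python
class MapEntry:
    """Stand-in for thttpd's Map struct: just the index back-pointer."""
    def __init__(self):
        self.index = None

table = [None] * 8  # toy hash table

def invariant_holds():
    # Invariant 1: for every slot v, table[v] = n implies n.index = v
    return all(n is None or n.index == v for v, n in enumerate(table))

m = MapEntry()
table[3] = m                  # first write: invariant temporarily broken
broken = invariant_holds()    # False: m.index is still None
m.index = 3                   # second write restores the invariant
restored = invariant_holds()  # True
```

Between the two writes no other thread may observe the table, which is exactly the kind of obligation the compiler discharges automatically.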

Composing Data Structures
[diagram: a filesystems list links filesystem=1 and filesystem=2 (fields s_list, s_files); each filesystem's s_files list threads through file objects (file=14, 6, 2, 7, 5) via their f_list / f_fs_list fields]

Problem: Multiple Indexes (+ Concurrency)
Access Patterns:
– Find all mounted filesystems
– Find cached files on each filesystem
– Iterate over all used or unused cached files in Least-Recently-Used order
[diagram: the same filesystem/file structure, with additional file_in_use and file_unused lists threading through the files]

Disadvantages of Linked Shared Data Structures
– Error prone
– Hard to change
– Performance may depend on the machine and workload
– Hard to reason about correctness
– Concurrency makes it harder
  – Lock granularity
  – Aliasing

Our thesis Very high level programs – No pointers and shared data structures – Easier programming – Simpler reasoning – Machine independent The compiler generates pointers and multiple concurrent shared data structures Performance comparable to manually written code

Our Approach Program with “database” – States are tables – Uniform relational operations Hide data structures from the program – Functional dependencies express program invariants The compiler generates low level shared pointer data structures with concurrent operations – Correct by construction The programmer can tune efficiency Autotuning for a given workload

Conceptual Programming Model
[diagram: several threads issue query / insert / remove operations against one shared database]

Relational Specification
Program states as relations
– Columns correspond to properties
– Functional dependencies define global invariants

Atomic operation / meaning:
– r = empty: r := {}
– insert r s t: if s ∉ r then r := r ∪ {⟨s, t⟩}
– query r s C: the C columns of all the tuples in r matching tuple s
– remove r s: remove from r all the tuples which match s
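A minimal executable sketch of these four operations, with relations represented as lists of dicts (the representation and names are illustrative, not RelC's):

```python
def empty():
    # r = empty: r := {}
    return []

def matches(t, s):
    return all(t[c] == v for c, v in s.items())

def insert(r, s, t):
    # insert r s t: add the tuple s ∪ t unless a tuple matching s exists
    if not any(matches(u, s) for u in r):
        r.append({**s, **t})

def query(r, s, cols):
    # query r s C: the C columns of every tuple matching partial tuple s
    return [{c: t[c] for c in cols} for t in r if matches(t, s)]

def remove(r, s):
    # remove r s: delete every tuple matching partial tuple s
    r[:] = [t for t in r if not matches(t, s)]

fs = empty()
insert(fs, {"fs": 1, "file": 14}, {"inuse": False})
insert(fs, {"fs": 2, "file": 7}, {"inuse": True})
insert(fs, {"fs": 2, "file": 7}, {"inuse": False})  # no-op: ⟨2, 7⟩ exists
result = query(fs, {"fs": 2}, ["file", "inuse"])
remove(fs, {"fs": 1})
```

Note how insert's guard is what makes the functional dependency {fs, file} → {inuse} an invariant: a second tuple with the same key columns is rejected.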

The High Level Idea
Concurrent Compositions of Data Structures, Atomic Transactions
RelScala → Compiler (given a decomposition) → Scala
e.g. the relational operation query {fs, file} compiles into low-level code such as:
List* query(FS* fs, File* file) { lock(fs); for (q = file_in_use; …) … }

Filesystem
Three columns: {fs, file, inuse}; fs : int, file : int, inuse : Bool
Functional dependencies:
– {fs, file} → {inuse}

fs | file | inuse
 1 |  14  |  F
 2 |   7  |  T
 2 |   5  |  F
 1 |   6  |  T
 1 |   2  |  F

(The slide's second copy of the table appends ⟨1, 2, T⟩, which violates {fs, file} → {inuse} given ⟨1, 2, F⟩.)

Filesystem (operations)

fs | file | inuse
 1 |  14  |  F
 2 |   7  |  T
 2 |   5  |  F
 1 |   6  |  T
 1 |   2  |  F

query ⟨inuse: T⟩ {fs, file} = [⟨2, 7⟩, ⟨1, 6⟩]

Filesystem (operations)
insert ⟨fs: 1, file: 15⟩ ⟨inuse: T⟩ adds the tuple ⟨1, 15, T⟩:

fs | file | inuse
 1 |  14  |  F
 2 |   7  |  T
 2 |   5  |  F
 1 |   6  |  T
 1 |   2  |  F
 1 |  15  |  T

Filesystem (operations)
remove ⟨fs: 1⟩ deletes every tuple with fs = 1, leaving:

fs | file | inuse
 2 |   7  |  T
 2 |   5  |  F

Directed Graph Data Structure
Three columns: {src, dst, weight}; src × dst × weight
Functional dependencies:
– {src, dst} → {weight}
Operations:
– query {dst, weight}
– query {src, weight}
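As a sketch (toy code, not the generated implementation), the graph relation and its two query patterns look like this:

```python
edges = []  # the relation, as a list of (src, dst, weight) tuples

def insert_edge(src, dst, weight):
    # enforce the functional dependency: at most one weight per (src, dst)
    edges[:] = [(s, d, w) for (s, d, w) in edges if (s, d) != (src, dst)]
    edges.append((src, dst, weight))

def successors(src):
    # query {dst, weight} for a given src
    return [(d, w) for (s, d, w) in edges if s == src]

def predecessors(dst):
    # query {src, weight} for a given dst
    return [(s, w) for (s, d, w) in edges if d == dst]

insert_edge("a", "b", 1)
insert_edge("a", "c", 2)
insert_edge("a", "b", 5)   # replaces the old weight: {src, dst} -> {weight}
succs = successors("a")
```

Both query directions scan the flat relation here; the point of decomposition is to choose index structures so each direction is fast.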

Plan Compiling into sequential code (PLDI’11) Adding concurrency (PLDI’12)

Mapping Relations into Low-Level Data Structures
– Many mappings exist
– How to combine several existing data structures
  – Support sharing
– Maintain the relational abstraction
– Reasonable performance
– Parametric mappings of relations into shared combinations of data structures
  – Guaranteed correctness

The RelC Compiler
Inputs: a relational specification (fs × file × inuse, {fs, file} → {inuse}), a graph decomposition, and relational code such as
foreach ⟨fs, file, inuse⟩ ∈ filesystems s.t. fs = 5 do …
Output: C++ implementing the operations over the chosen containers (lists, arrays, …)

Decomposing Relations
Represent subrelations using container data structures
A directed acyclic graph (DAG):
– Each node is a subrelation
– The root represents the whole relation
– Edges map columns into the remaining subrelations
– Shared node = shared representation

Decomposing Relations into Functions (Currying)
fs × file × inuse, {fs, file} → {inuse}
– group-by {fs}: FS → (FILE → INUSE)
– group-by {file}: FILE → INUSE
– group-by {inuse}: INUSE → (FS × FILE → INUSE)
– group-by {fs, file}: FS × FILE → INUSE
The whole relation FS × FILE × INUSE decomposes along the edges fs, file, ⟨fs, file⟩, inuse.
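The two group-bys can be sketched as nested Python dicts (an illustrative stand-in for the compiled containers, not RelC output):

```python
rows = [(1, 14, False), (2, 7, True), (2, 5, False),
        (1, 6, True), (1, 2, False)]

# group-by {fs}, then group-by {file}: FS -> (FILE -> INUSE)
by_fs = {}
for fs, f, inuse in rows:
    by_fs.setdefault(fs, {})[f] = inuse

# group-by {inuse}: INUSE -> set of (fs, file)
by_inuse = {}
for fs, f, inuse in rows:
    by_inuse.setdefault(inuse, set()).add((fs, f))

lookup = by_fs[2][7]      # follow the fs edge, then the file edge
in_use = by_inuse[True]   # all (fs, file) pairs currently in use
```

In the real decomposition the two paths share the leaf nodes, so each inuse value is stored once; the toy version duplicates it for brevity.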

Filesystem Example
{fs, file, inuse} decomposed along the left (fs → file → inuse) path:
– fs:1 → {file:14 → F, file:6 → T, file:2 → F}
– fs:2 → {file:7 → T, file:5 → F}

Memory Decomposition (Left)
[diagram: the fs → file → inuse path laid out in memory: fs:1 and fs:2 nodes point to per-file nodes holding the inuse values]

Filesystem Example
{fs, file} → {inuse} decomposed along the right (inuse → ⟨fs, file⟩) path:
– inuse:T → {⟨fs:2, file:7⟩, ⟨fs:1, file:6⟩}
– inuse:F → {⟨fs:1, file:14⟩, ⟨fs:2, file:5⟩, ⟨fs:1, file:2⟩}

Memory Decomposition (Right)
[diagram: the inuse → ⟨fs, file⟩ path laid out in memory: inuse:T and inuse:F nodes point to ⟨fs, file⟩ nodes]

Decomposition Instance
fs × file × inuse, {fs, file} → {inuse}
[diagram: both paths share the leaf nodes — each ⟨fs, file⟩ pair is reachable from its fs node and from its inuse node, so the inuse value is represented once]

Decomposition Instance (with containers)
[diagram: the same instance annotated with container choices: a list of filesystems (s_list), per-filesystem file lists (f_fs_list), an array indexed by inuse, and per-bucket lists (f_list)]

Decomposing Relations Formally (PLDI'11)
fs × file × inuse, {fs, file} → {inuse}
let w : {fs, file, inuse} → {inuse} = {inuse} in
let y : {fs} → {file, inuse} = {file} → list {w} in
let z : {inuse} → {fs, file, inuse} = {fs, file} → list {w} in
let x : {} → {fs, file, inuse} = {fs} → clist {y} ⋈ {inuse} → array {z}

Memory State
[diagram: the decomposition (fs list, inuse array, shared file nodes) realized as the thttpd-style memory state: a filesystems list, per-filesystem s_files lists, and file_in_use / file_unused lists threading through the same file objects]

Adequacy
A decomposition is adequate if it can represent every possible relation matching the relational specification
The compiler enforces sufficient conditions for adequacy
Not every decomposition is a good representation of a relation

Adequacy of Decompositions
– All columns are represented
– Nodes are consistent with the functional dependencies
  – Columns bound to paths leading to a common node must functionally determine each other

Respect Functional Dependencies
[diagram: a node binding {file, fs} with an inuse child — valid because {file, fs} → {inuse}]

Adequacy and Sharing
Columns bound on a path to an object x must functionally determine columns bound on any other path to x
✓ {fs, file} → {inuse, fs, file}

Adequacy and Sharing
Columns bound on a path to an object x must functionally determine columns bound on any other path to x
✗ {fs, file} → {inuse, fs}

The RelC Compiler (PLDI'11)
Sequential Compositions of Data Structures
Relational specification + decomposition → RelC → C++

Query Plans
foreach ⟨fs, file, inuse⟩ ∈ filesystems s.t. inuse = T do …
Plan: scan the per-filesystem file lists and test inuse on each file
Cost proportional to the total number of files

Query Plans
foreach ⟨fs, file, inuse⟩ ∈ filesystems s.t. inuse = T do …
Plan: access the inuse array at T and scan only that bucket's list
Cost proportional to the number of files in use
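The two plans can be sketched over toy dictionaries (illustrative, not RelC output). Both compute the same answer, but the scan plan touches every file of the filesystem while the access plan touches only files in use:

```python
by_fs = {1: {14: False, 6: True, 2: False},
         2: {7: True, 5: False}}
by_inuse = {True: {(1, 6), (2, 7)},
            False: {(1, 14), (1, 2), (2, 5)}}

def plan_scan(fs):
    # scan edge: iterate every file of fs, test inuse on each
    return {f for f, inuse in by_fs[fs].items() if inuse}

def plan_access(fs):
    # access edge: jump straight to the inuse=True bucket, test fs
    return {f for (fs2, f) in by_inuse[True] if fs2 == fs}

same = plan_scan(1) == plan_access(1) == {6}
```

Which plan wins depends on the data: if most files are in use, the scan plan can be just as good, which is why plan selection is part of compilation.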

Removal and Graph Cuts
remove ⟨fs: 2⟩
[diagram: removing filesystem 2 cuts its node out of the filesystems list and unlinks its files (file:7, file:5) from the shared inuse lists]

Abstraction Theorem
If the programmer obeys the relational specification, the decomposition is adequate, and the individual containers are correct, then the generated low-level code maintains the relational abstraction
[commuting diagram: relation —remove→ relation; low-level state —low-level code→ low-level state]

Autotuner
Given:
– a fixed set of primitive types (list, circular list, doubly-linked list, array, map, …)
– a workload
Exhaustively enumerate all the adequate decompositions up to a certain size
The compiler can automatically pick the best-performing representation for the workload
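The search can be sketched as a toy benchmark loop; the two candidates here (a hash map and a naive association list) are stand-ins for RelC's decomposition space, not its actual candidates:

```python
import time

class AssocList:
    # naive list-of-pairs map, as a second candidate representation
    def __init__(self):
        self.pairs = []
    def __setitem__(self, k, v):
        for i, (k2, _) in enumerate(self.pairs):
            if k2 == k:
                self.pairs[i] = (k, v)
                return
        self.pairs.append((k, v))
    def __contains__(self, k):
        return any(k2 == k for k2, _ in self.pairs)

def run_workload(make):
    # insert-heavy toy workload with membership queries
    data = make()
    start = time.perf_counter()
    for i in range(2000):
        data[i % 100] = i
        _ = (i % 100) in data
    return time.perf_counter() - start

candidates = {"hash map": dict, "assoc list": AssocList}
best = min(candidates, key=lambda n: run_workload(candidates[n]))
```

The autotuner's advantage is that the enumeration is exhaustive over adequate decompositions, so the chosen representation is correct by construction.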

Directed Graph Example (DFS)
Columns: src × dst × weight
Functional dependencies:
– {src, dst} → {weight}
Primitive data types: map, list, …
[diagram: several candidate decompositions, e.g. src → map (dst → weight), dst → map (src → weight), and list-based variants]

Synthesizing Concurrent Programs PLDI’12

Multiple ADTs
Invariant: every element added to eden is either in eden or in longterm

public void put(K k, V v) {
    if (this.eden.size() >= size) {
        this.longterm.putAll(this.eden);
        this.eden.clear();
    }
    this.eden.put(k, v);
}

OOPSLA’11 Shacham Search for all public domain collection operations methods with at least two operations Used simple static analysis to extract composed operations – Two or more API calls Extracted 112 composed operations from 55 applications – Apache Tomcat, Cassandra, MyFaces – Trinidad, … Check Linearizability of all public domain composed operations

Motivation: Shacham, OOPSLA'11
[pie chart: 47% linearizable, 38% non-linearizable, 15% open]

Relational Specification
Program states as relations
– Columns correspond to properties
– Functional dependencies define global invariants

Atomic operation / meaning:
– r = empty: r := {}
– insert r s t: if s ∉ r then r := r ∪ {⟨s, t⟩}
– query r s C: the C columns of all the tuples in r matching tuple s
– remove r s: remove from r all the tuples which match s

The High Level Idea
Concurrent Compositions of Data Structures, Atomic Transactions
RelScala → Compiler (given a concurrent decomposition, e.g. ConcurrentHashMap + HashMap) → Scala
e.g. query {fs, file} compiles into code such as:
List* query(FS* fs, File* file) { lock(…); for (q = file_in_use; …) … }

Two-Phase Locking
Attach a lock to each piece of data
Two-phase locking protocol:
– Well-locked: to perform a read or write, a thread must hold the corresponding lock
– Two-phase: all lock acquisitions must precede all lock releases
Theorem [Eswaran et al., 1976]: well-locked, two-phase transactions are serializable
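A minimal well-locked, two-phase transaction sketch (toy accounts and lock names, not the compiler's output):

```python
import threading

lock_a, lock_b = threading.Lock(), threading.Lock()
accounts = {"a": 100, "b": 0}

def transfer(amount):
    # growing phase: acquire every lock the transaction will need
    lock_a.acquire()
    lock_b.acquire()
    try:
        # both reads and writes happen while the locks are held
        accounts["a"] -= amount
        accounts["b"] += amount
    finally:
        # shrinking phase: all releases come after all acquisitions
        lock_b.release()
        lock_a.release()

transfer(40)
```

No lock is reacquired after the first release, which is exactly the two-phase condition the Eswaran theorem requires.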

Two Phase Locking
Two-phase locking ⇒ serializability: attach a lock to every edge of the decomposition instance — we're done!
Problem 1: can't attach locks to container entries
– Butler Lampson / David J. Wheeler: "Any problem in computer science can be solved with another level of indirection."
Problem 2: too many locks

Lock Placements
1. Attach locks to nodes
[diagram: decomposition and a decomposition instance with per-node locks]

Coarse-Grained Locking
[diagram: one lock guards the entire decomposition instance]

Finer-Grained Locking
[diagram: separate locks guard different regions of the decomposition instance]

Lock Placements: Domination
Locks must dominate the edges they protect
[diagram: every path from the root to a protected edge passes through that edge's lock]

Lock Placements: Path-Closure All edges on a path between an edge and its lock must share the same lock

Lock Ordering Prevent deadlock via a topological order on locks
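A sketch of the idea (names and the numeric order are illustrative): every thread sorts the locks it needs by one global order before acquiring, so no circular wait can form:

```python
import threading

locks = {name: threading.Lock() for name in ("t", "u", "v")}
order = {"t": 0, "u": 1, "v": 2}   # assumed global (topological) ordering

def acquire_all(names):
    # always acquire in ascending global order, whatever order was requested
    for name in sorted(names, key=order.get):
        locks[name].acquire()

def release_all(names):
    for name in sorted(names, key=order.get, reverse=True):
        locks[name].release()

acquire_all({"v", "t"})   # takes t before v regardless of argument order
release_all({"v", "t"})
ok = not any(l.locked() for l in locks.values())
```

For the decomposition DAG, a topological order of the nodes provides exactly such a global order on their locks.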

Queries and Deadlock
Query plans must acquire the correct locks in the correct order
Example: find files on a particular filesystem
1. acquire(t)
2. lookup(t → v)
3. acquire(v)
4. scan(v → w)

Deadlock and Aliasing
Two threads each run:
{ lock(a); lock(b); /* do something */ unlock(b); unlock(a) }
If a and b alias different locks in the two threads (one thread's a is the other's b), the threads acquire the same two locks L1 and L2 in opposite orders — deadlock ✗

Decompositions and Aliasing A decomposition is an abstraction of the set of potential aliases Example: there are exactly two paths to any instance of node w

Autotuner
Given:
– a fixed set of primitive types (array, ConcurrentHashMap, ConcurrentSkipListMap, …)
– a workload
Exhaustively enumerate all the adequate decompositions up to a certain size
The compiler can automatically pick the best representation for the workload
– Including lock granularity

Directed Graph Example (DFS)
Columns: src × dst × weight
Functional dependencies:
– {src, dst} → {weight}
Primitive data types: map, list, …
[diagram: several candidate decompositions, e.g. src → map (dst → weight), dst → map (src → weight), and list-based variants]

Concurrent Synthesis (Autotuner)
Find the optimal combination of:
– Decomposition
– Container data structures: ConcurrentHashMap, ConcurrentSkipListMap, CopyOnWriteArrayList, Array, HashMap, TreeMap, LinkedList
– Lock implementations: ReentrantReadWriteLock, ReentrantLock, lock striping factors
– Lock placement

Concurrent Graph Benchmark
– Start with an empty graph
– Each thread performs 5 × 10⁵ random operations
– Distribution of operations a-b-c-d (a% find successors, b% find predecessors, c% insert edge, d% remove edge)
– Plot throughput with varying number of threads
Based on Herlihy's benchmark of concurrent maps

Results: 35% find successor, 35% find predecessor, 20% insert edge, 10% remove edge
[plot: black = handwritten implementation, isomorphic to the blue synthesized ConcurrentHashMap-based variant; HashMap-based variants shown for comparison]

(Some) Related Projects
– SETL
– Relational synthesis: [Cohen & Campbell 1993], [Batory & Thomas 1996], [Smaragdakis & Batory 1997], [Batory et al. 2000], [Manevich 2012], …
– Two-phase locking and predicate locking [Eswaran et al., 1976]; tree and DAG locking protocols [Attiya et al., 2010]; domination locking [Golan-Gueta et al., 2011]
– Lock inference for atomic sections: [McCloskey et al., 2006], [Hicks, 2006], [Emmi, 2007]

Synchronization Methods
– Pessimistic
– Optimistic with rollback
– Non-commutativity with future operations
  – Static analysis for inferring the future
  – Efficient libraries for runtime synchronization
  – Guaranteed serializability and deadlock-freedom
  – No rollbacks
  – Reasonable performance

Summary
Programming with a uniform relational abstraction
– Increases the gap between data abstraction and low-level implementation
Comparable performance to manual code
Easier to evolve
Automatic data structure selection
Easier program reasoning

Concurrent Libraries with Foresight PLDI’13 Guy Gueta(TAU) G. Ramalingam (MSR) M. Sagiv (TAU) E. Yahav (Technion)

Transactional Libraries with Foresight
Enforce atomicity of arbitrary sequences of operations
The client declares its intended operations in advance — its foresight
The library utilizes the specification
– Synchronizes only operations that do not serialize with the declared foresight
Methodology for creating libraries with foresight
– Maps
Foresight can be automatically inferred by sequential static program analysis
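A loose sketch of the idea — not the paper's actual API. The single lock here is a stand-in for the foresight-based scheme, which would use the declared operation set to synchronize only sections that actually conflict:

```python
import threading

class ForesightMap:
    def __init__(self):
        self.data = {}
        self.lock = threading.Lock()  # stand-in for finer foresight-based sync

    def atomic(self, declared_ops, body):
        # declared_ops is the client's foresight; a real implementation uses
        # it to decide which concurrent sections must wait for each other
        with self.lock:
            return body(self.data)

m = ForesightMap()
# a composed operation like computeIfAbsent, declared as {get, put}
result = m.atomic({"get", "put"}, lambda d: d.setdefault("k", 42))
```

The key contrast with optimistic schemes: because conflicts are declared up front, no rollback is ever needed.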

Example: Positive Counter

ComputeIfAbsent (single Map)

GossipRouter (multiple Maps)

Summary
Methods for enforcing atomicity of sequences of operations
– Provably correct
– Simplifies reasoning: sequential reasoning, high-level data structures & invariants
Is that efficient enough?
– Pessimistic concurrency
– Optimistic concurrency