Bloom: Big Systems, Small Programs Neil Conway UC Berkeley

Distributed Computing

Programming Languages

Optimization
Single-node: loop unrolling, register allocation, data prefetching, function inlining
Distributed: global coordination and waiting; caching, indexing; replication, data locality; partitioning, load balancing

Warnings
Single-node: undeclared variables, type mismatches, sign conversion mistakes
Distributed: replica divergence, inconsistent state, deadlocks, race conditions

Debugging
Single-node: stack traces; log files, printf; gdb
Distributed: full-stack visualization, analytics; consistent global snapshots; provenance analysis

Developer productivity is a major unsolved problem in distributed computing.

We can do better! … provided we’re willing to make changes.

Design Principles

Centralized Computing Predictable latency No partial failure Single clock Global event order

Taking Order For Granted
Data: (ordered) array of bytes
Compute: (ordered) sequence of instructions
Both assume a global event order.

Distributed Computing Unpredictable latency Partial failures No global event order

Alternative #1: Enforce global event order at all nodes (“Strong Consistency”)

Alternative #1: Enforce global event order at all nodes (“Strong Consistency”) Problems: Availability (CAP) Latency

Alternative #2: Ensure correct behavior for any network order (“Weak Consistency”)

Alternative #2: Ensure correct behavior for any network order (“Weak Consistency”) Problem: With traditional languages, this is very difficult.

The “ACID 2.0” Pattern
Associativity: X ○ (Y ○ Z) = (X ○ Y) ○ Z ("batch tolerance")
Commutativity: X ○ Y = Y ○ X ("reordering tolerance")
Idempotence: X ○ X = X ("retry tolerance")
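
A minimal sketch in Ruby, using set union as the merge operator ○, shows why these three properties buy batch, reordering, and retry tolerance (the variable names are illustrative, not from the talk):

```ruby
require 'set'

# Set union is associative, commutative, and idempotent, so replicas
# converge no matter how updates are batched, reordered, or redelivered.
merge = ->(a, b) { a | b }

x = Set[1]
y = Set[2]
z = Set[3]

# Associativity: "batch tolerance"
raise unless merge.(x, merge.(y, z)) == merge.(merge.(x, y), z)
# Commutativity: "reordering tolerance"
raise unless merge.(x, y) == merge.(y, x)
# Idempotence: "retry tolerance"
raise unless merge.(x, x) == x
```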

“When I see patterns in my programs, I consider it a sign of trouble … [they are a sign] that I'm using abstractions that aren't powerful enough.” —Paul Graham

Bounded Join Semilattices
A triple ⟨S, ⊔, ⊥⟩ such that:
– S is a set
– ⊔ is a binary operator ("least upper bound")
  Induces a partial order on S: x ≤_S y if x ⊔ y = y
  Associative, Commutative, and Idempotent
– ∀x ∈ S: ⊥ ⊔ x = x

Bounded Join Semilattices Lattices are objects that grow over time. An interface with an ACID 2.0 merge() method – Associative – Commutative – Idempotent

Time Set (Merge = Union) Increasing Int (Merge = Max) Boolean (Merge = Or)
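
A hedged Ruby sketch of these three lattices (constant names are illustrative): each grows over time under its merge, and because merge is associative, commutative, and idempotent, the final state is independent of delivery order:

```ruby
require 'set'

# Three simple bounded join semilattices and their merge (least upper bound).
SET_MERGE = ->(a, b) { a | b }        # merge = union, bottom = empty set
MAX_MERGE = ->(a, b) { [a, b].max }   # merge = max,   bottom = 0 (naturals)
OR_MERGE  = ->(a, b) { a || b }       # merge = or,    bottom = false

# Whatever order updates arrive in, the merged state is the same.
updates  = [Set[1], Set[2, 3], Set[1, 4]]
forward  = updates.reduce(Set.new) { |s, u| SET_MERGE.(s, u) }
backward = updates.reverse.reduce(Set.new) { |s, u| SET_MERGE.(s, u) }
raise unless forward == backward
```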

CRDTs: Convergent Replicated Data Types – e.g., registers, counters, sets, graphs, trees Implementations: – Statebox – Knockbox – riak_dt
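
As a hedged illustration (this is not the Statebox or riak_dt API), a grow-only counter CRDT can be sketched in a few lines of Ruby: each replica increments its own slot, and merge takes the elementwise max, which is associative, commutative, and idempotent:

```ruby
# A grow-only counter (G-Counter) CRDT sketch: per-replica counts,
# merged by elementwise max; the value is the sum of all slots.
class GCounter
  attr_reader :slots

  def initialize(slots = {})
    @slots = Hash.new(0).merge(slots)
  end

  def increment(replica_id)
    GCounter.new(@slots.merge(replica_id => @slots[replica_id] + 1))
  end

  def merge(other)
    GCounter.new(@slots.merge(other.slots) { |_k, a, b| [a, b].max })
  end

  def value
    @slots.values.sum
  end
end

a = GCounter.new.increment(:a).increment(:a)  # replica a counted 2
b = GCounter.new.increment(:b)                # replica b counted 1
raise unless a.merge(b).value == 3
raise unless a.merge(b).slots == b.merge(a).slots  # commutative
raise unless a.merge(a).slots == a.slots           # idempotent
```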

Lattices represent disorderly data. How can we represent disorderly computation?

f : S → T is a monotone function iff:
∀a, b ∈ S : a ≤_S b ⇒ f(a) ≤_T f(b)

Time Set (Merge = Union) Increasing Int (Merge = Max) Boolean (Merge = Or)
Monotone function: set → increasing-int (size())
Monotone function: increasing-int → boolean (size() >= 3)
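
The pipeline on this slide can be sketched in plain Ruby (names are illustrative): a growing set flows through size() into a threshold test, and because each stage is monotone, the boolean can only go from false to true, never back:

```ruby
require 'set'

# Monotone chain: growing set --size()--> increasing int --(>= 3)--> boolean.
# Once the threshold fires, further growth can never un-fire it.
size_of   = ->(s) { s.size }    # monotone: a bigger set has a bigger size
threshold = ->(n) { n >= 3 }    # monotone: a bigger n never flips true->false

s = Set.new
results = [1, 2, 3, 4].map do |x|
  s = s | Set[x]                # the set only grows over time
  threshold.(size_of.(s))
end
raise unless results == [false, false, true, true]
```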

Consistency As Logical Monotonicity

Case Study

Questions 1. Will cart replicas eventually converge? – “Eventual Consistency” 2. What will client observe on checkout? – Goal: checkout reflects all session activity 3. To achieve #1 and #2, how much ordering is required?

Design #1: Mutable State

Add(item x, count c):
  if kvs[x] exists:
    old = kvs[x]
    kvs.delete(x)
  else:
    old = 0
  kvs[x] = old + c

Remove(item x, count c):
  if kvs[x] exists:
    old = kvs[x]
    kvs.delete(x)
    if old > c:
      kvs[x] = old - c

Non-monotonic!
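
A hypothetical Ruby simulation (not from the talk) shows why this design is order-sensitive: two replicas that apply the same Add/Remove operations in different orders end up in different states, which is why every operation may need coordination:

```ruby
# Mutable-state cart (Design #1), simulated: state depends on arrival order.
def add(kvs, x, c)
  old = kvs.delete(x) || 0
  kvs[x] = old + c
end

def remove(kvs, x, c)
  old = kvs.delete(x)
  kvs[x] = old - c if old && old > c
end

ops = [->(s) { add(s, :beer, 2) }, ->(s) { remove(s, :beer, 1) }]

r1 = {}
ops.each { |op| op.(r1) }          # replica 1: Add then Remove
r2 = {}
ops.reverse.each { |op| op.(r2) }  # replica 2: Remove then Add

raise unless r1 == { beer: 1 }
raise unless r2 == { beer: 2 }     # replicas diverge: non-monotonic
```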

Conclusion: Every operation might require coordination! Non-monotonic!

Design #2: “Disorderly”
Add(item x, count c): add x, c to add_log
Remove(item x, count c): add x, c to del_log
Checkout():
  Group add_log by item ID; sum counts.
  Group del_log by item ID; sum counts.
  For each item, subtract deletes from adds.
Non-monotonic!
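
The same kind of Ruby simulation (again illustrative, not from the talk) for the disorderly design: adds and removes accumulate in grow-only logs represented as sets, and the non-monotonic subtraction is deferred to checkout, so replicas agree regardless of delivery order:

```ruby
require 'set'

# Disorderly cart (Design #2): grow-only logs; subtraction deferred to checkout.
# Log entries carry unique IDs, so set union makes retried messages idempotent.
def checkout(add_log, del_log)
  adds = add_log.group_by { |_id, item, _c| item }
                .transform_values { |es| es.sum { |e| e[2] } }
  dels = del_log.group_by { |_id, item, _c| item }
                .transform_values { |es| es.sum { |e| e[2] } }
  adds.to_h { |item, n| [item, n - dels.fetch(item, 0)] }
end

entries = [[1, :beer, 2], [2, :beer, 1]]
r1 = checkout(Set.new(entries),         Set[[3, :beer, 1]])
r2 = checkout(Set.new(entries.reverse), Set[[3, :beer, 1]])
raise unless r1 == { beer: 2 }
raise unless r1 == r2   # order of log arrival does not matter
```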

Conclusion: Replication is safe; might need to coordinate on checkout Monotonic

Takeaways
Avoid: mutable state update
Prefer: immutable data, monotone growth
Major difference in coordination cost: coordinate once per operation vs. once per checkout
We’d like a type system for monotonicity

Language Design

Disorderly Programming Order-independent: default Order dependencies: explicit Order as part of the design process Tool support – Where is order needed? Why?

The Disorderly Spectrum
ASM → C → Java → Lisp, ML → Haskell → SQL, LINQ, Datalog → Bloom
(increasingly high-level and “declarative”, enabling powerful optimizers)

Bloom ≈ declarative agents Processes that communicate via asynchronous message passing Each process has a local database Logical rules describe computation and communication (“SQL++”)

Each agent has a database of values that changes over time. All values have a location and timestamp.

left-hand-side <= right-hand-side
If RHS is true ( SELECT … ), then LHS is true ( INTO lhs ).
When and where is the LHS true?

Temporal Operators
<=  same location, same timestamp (Computation)
<+  same location, next timestamp (Persistence)
<-  same location, next timestamp (Deletion)
<~  different location, non-deterministic timestamp (Communication)

Observe → Compute → Act

Our First Program: PubSub

class Hub
  include Bud

  # State declarations
  state do
    channel :subscribe, [:@addr, :topic, :client]
    channel :pub, [:@addr, :topic, :val]
    channel :event, [:@addr, :topic, :val]
    table :sub, [:addr, :topic]
  end

  # Rules
  bloom do
    sub <= subscribe {|s| [s.client, s.topic]}
    event <~ (pub * sub).pairs(:topic) {|p,s| [s.addr, p.topic, p.val] }
  end
end

Ruby DSL: state declarations, then rules.

class Hub
  include Bud

  state do
    # Persistent state: the set of subscriptions (schema: client, topic)
    table :sub, [:client, :topic]
    # Network input and output; the first column is the destination address
    channel :subscribe, [:@addr, :topic, :client]
    channel :pub, [:@addr, :topic, :val]
    channel :event, [:@addr, :topic, :val]
  end

  bloom do
    # Remember subscriptions
    sub <= subscribe {|s| [s.client, s.topic]}
    # Send events to subscribers: a join (as in SQL) on the :topic key
    event <~ (pub * sub).pairs(:topic) {|p,s| [s.client, p.topic, p.val] }
  end
end

Persistent State Ephemeral Events “Push-Based”

Ephemeral Events Persistent State “Pull-Based”

class Hub
  include Bud

  state do
    table :sub, [:client, :topic]
    channel :subscribe, [:@addr, :topic, :client]
    channel :pub, [:@addr, :topic, :val]
    channel :event, [:@addr, :topic, :val]
  end

  bloom do
    sub <= subscribe {|s| [s.client, s.topic]}
    event <~ (pub * sub).pairs(:topic) {|p,s| [s.client, p.topic, p.val] }
  end
end

class HubPull
  include Bud

  state do
    table :pub, [:topic, :val]
    channel :publish, [:@addr, :topic, :val]
    channel :sub, [:@addr, :topic, :client]
    channel :event, [:@addr, :topic, :val]
  end

  bloom do
    pub <= publish {|p| [p.topic, p.val]}
    event <~ (pub * sub).pairs(:topic) {|p,s| [s.client, p.topic, p.val] }
  end
end

Suppose we keep only the most recent message for each topic (“last writer wins”).
Rename:
Publish → Put
Subscribe → Get
Event → Reply
Pub → DB
Topic → Key

class KvsHub
  include Bud

  state do
    table :db, [:key, :val]
    channel :put, [:@addr, :key, :val]
    channel :get, [:@addr, :key, :client]
    channel :reply, [:@addr, :key, :val]
  end

  bloom do
    # Update = delete + insert
    db <+ put {|p| [p.key, p.val]}
    db <- (db * put).lefts(:key)
    reply <~ (db * get).pairs(:key) {|d,g| [g.client, d.key, d.val] }
  end
end

Takeaways
Bloom: concise, high-level programs
State update, asynchrony, and non-monotonicity are explicit in the syntax
Design Patterns: communication vs. storage, queries vs. data, push vs. pull
Actually not so different!

Conclusion Traditional languages are not a good fit for modern distributed computing Principle: Disorderly programs for disorderly networks Practice: Bloom – High-level, disorderly, declarative – Designed for distribution

Thank You!
gem install bud
Collaborators: Peter Alvaro, Emily Andrews, Peter Bailis, David Maier, Bill Marczak, Joe Hellerstein, Sriram Srinivasan