Cloud Programming: From Doom and Gloom to BOOM and Bloom
Neil Conway, UC Berkeley
Joint work with Peter Alvaro, Ras Bodik, Tyson Condie, Joseph M. Hellerstein, David Maier (PSU), William R. Marczak, and Russell Sears (Yahoo! Research)
Datalog 2.0 Workshop

Cloud Computing: The Next Great Computing Platform

The Problem Writing reliable, scalable distributed software remains extremely difficult.

Doom and Gloom!
“...when we start talking about parallelism and ease of use of truly parallel computers, we’re talking about a problem that’s as hard as any that computer science has faced.... I would be panicked if I were in industry.”
-- John Hennessy, Stanford

A Ray of Light We understand data-parallel computing – MapReduce, parallel DBs, etc. Can we take a hard problem and transform it into an easy one?

Everything is Data
Distributed computing is all about state
– System state
– Session state
– Protocol state
– User and security-related state
– ... and of course, the actual “data”
Computing = Creating, updating, and communicating that state

Datalog to the Rescue!
1. Data-centric programming
   – Explicit, uniform state representation: relations
2. High-level declarative queries
   – Datalog + asynchrony + state update

Outline
1. The BOOM Project
   – Cloud Computing stack built w/ distributed logic
   – BOOM Analytics: MapReduce and DFS in Overlog
2. Dedalus: Datalog in Time (and Space)
3. (Toward) The Bloom Language
   – Distributed Logic for Joe the Programmer

The BOOM Project
Berkeley Orders Of Magnitude
– OOM more scale, OOM less code
– Can we build Google in 10k LOC?
Build “Real Systems” in distributed logic
– Begin w/ an existing variant of Datalog (“Overlog”)
– Inform the design of a new language for distributed computing (Bloom)

BOOM Analytics
Typical “Big Data” stack: MapReduce (Hadoop) + distributed file system (HDFS)
We tried two approaches:
– HDFS: clean-slate rewrite
– Hadoop: replace job scheduling logic w/ Overlog
Goals:
1. Replicate existing functionality
2. Add Hard Stuff (and make it look easy!)

Overlog: Distributed Datalog
Originally designed for routing protocols and overlay networks (Loo et al., SIGMOD’06)
– Routing = recursive query over distributed DB
Datalog w/ aggregation, negation, functions
Distribution = horizontal partitioning of tables
– Data placement induces communication

Yet Another Transitive Closure

link(X, Y, C);
path(X, Y, C) :- link(X, Y, C);
path(X, Z, C1 + C2) :- link(X, Y, C1), path(Y, Z, C2);
mincost(X, Z, min<C>) :- path(X, Z, C);

Overlog Example

link(@X, Y, C);
path(@X, Y, C) :- link(@X, Y, C);
path(@X, Z, C1 + C2) :- link(@X, Y, C1), path(@Y, Z, C2);
mincost(@X, Z, min<C>) :- path(@X, Z, C);

Overlog Timestep Model

Hadoop Distributed File System
Based on the Google File System (SOSP’03)
– Large files, sequential workloads, append-only
– Used by Yahoo!, Facebook, etc.
Chunks 3x replicated at data nodes for fault tolerance

BOOM-FS
Hybrid system
– Complex logic: Overlog
– Performance-critical (but simple!): Java
Clean separation between policy and mechanism

BOOM-FS Example: State
Represent file system metadata with relations.

Name       Description                  Attributes
file       Files                        fileID, parentID, name, isDir
fqpath     Fully-qualified path names   fileID, path
fchunk     Chunks per file              chunkID, fileID
datanode   DataNode heartbeats          nodeAddr, lastHbTime
hb_chunk   Chunk heartbeats             nodeAddr, chunkID, length

BOOM-FS Example: Query
Represent file system metadata with relations.

// Base case: root directory has null parent
fqpath(FileId, Path) :-
    file(FileId, FParentId, FName, IsDir),
    IsDir = true, FParentId = null,
    Path = "/";

fqpath(FileId, Path) :-
    file(FileId, FParentId, FName, _),
    fqpath(FParentId, ParentPath),
    // Do not add extra slash if parent is root dir
    PathSep = (ParentPath = "/" ? "" : "/"),
    Path = ParentPath + PathSep + FName;

BOOM-FS Example: Query
Distributed protocols: join between event stream and local database

// "ls" for extant path => return listing for path
response(@Source, RequestId, true, DirListing) :-
    request(@Master, RequestId, Source, "Ls", Path),
    fqpath(@Master, FileId, Path),
    dir_listing(@Master, FileId, DirListing);

// "ls" for nonexistent path => error
response(@Source, RequestId, false, null) :-
    request(@Master, RequestId, Source, "Ls", Path),
    notin fqpath(@Master, _, Path);

Comparison with Hadoop
Competitive performance (within ~20% of Hadoop)

            Lines of Java    Lines of Overlog
HDFS        ~21,700          0
BOOM-FS     1,431            469

New Features:
1. Hot Standby for FS master nodes using Paxos
2. Partitioned FS master nodes for scalability – ~1 day!
3. Monitoring, tracing, and invariant checking

Lessons from BOOM Analytics
Overall, Overlog was a good fit for the task
– Concise programs for real features, easy evolution
Data-centric design: language-independent
– Replication, partitioning, monitoring all involve data management
– Node-local invariants are “cross-cutting” queries
Specification is enforcement
Policy vs. mechanism ≈ Datalog vs. Java
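
One concrete instance of such a cross-cutting invariant query (a sketch only: the chunk_replicas and under_replicated rules are illustrative, not from the talk, though hb_chunk and the 3x replication target come from the BOOM-FS material above):

// count live replicas per chunk from DataNode heartbeats
chunk_replicas(ChunkId, count<NodeAddr>) :- hb_chunk(NodeAddr, ChunkId, _);
// any chunk below the 3x replication target violates the invariant
under_replicated(ChunkId) :- chunk_replicas(ChunkId, Cnt), Cnt < 3;

Stating the invariant this way is also its enforcement hook: the same under_replicated relation can drive re-replication or raise an alert.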

Challenges from BOOM Analytics
Poor perf, cryptic syntax, little/no tool support – Easy to fix!
Many bugs related to updating state – Ambiguous semantics (in Overlog)
We avoided distributed queries
– “The global database is a lie!”
– Hand-coding protocols vs. stating distributed invariants

Outline
1. The BOOM Project
   – Cloud Computing stack built w/ distributed logic
   – BOOM Analytics: MapReduce and DFS in Overlog
2. Dedalus: Datalog in Time (and Space)
3. (Toward) The Bloom Language
   – Distributed Logic for Joe the Programmer

Dedalus
Dedalus: a theoretical foundation for Bloom
In Overlog, the Hard Stuff happens between time steps
– State update
– Asynchronous messaging
Can we talk about the Hard Stuff with logic?

State Update
Updates in Overlog: ugly, “outside” of logic
Difficult to express common patterns – Queues, sequencing
Order doesn’t matter... except when it does!

counter(@A, Val + 1) :- counter(@A, Val), event(@A, _);
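
The rule above only derives the incremented value; in Overlog the old value must be retracted by a separate rule, and the logic does not say how that retraction interleaves with newly arriving events. A sketch of the usual companion rule (assuming Overlog’s delete-rule syntax; not shown in the talk):

delete counter(@A, Val) :- counter(@A, Val), event(@A, _);

Whether an event arriving “at the same time” sees the old or the new counter value is exactly the ordering question the language leaves open.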

Asynchronous Messaging
Overlog @ notation describes space
Logical interpretation unclear:
p(@B) :- q(@A);
Upon reflection, time is more fundamental
– Model failure with arbitrary delay
Discard illusion of global DB

Dedalus: Datalog in Time
(1) Deductive rule: (Pure Datalog)
    p(A, B) :- q(A, B);
(2) Inductive rule: (Constraint across “next” timestep)
    p(A, B)@next :- q(A, B);
(3) Async rule: (Constraint across arbitrary timesteps)
    p(A, B)@async :- q(A, B);

Dedalus: Datalog in Time
(1) Deductive rule: (Pure Datalog)
    p(A, B, S) :- q(A, B, T), T = S;
(2) Inductive rule: (Constraint across “next” timestep)
    p(A, B, S) :- q(A, B, T), successor(T, S);
(3) Async rule: (Constraint across arbitrary timesteps)
    p(A, B, S) :- q(A, B, T), time(S), choose((A, B, T), (S));
All terms in body have same time

State Update in Dedalus

p(A, B)@next :- p(A, B), notin p_neg(A, B);

Example facts, each asserted at some timestep: p(1, 2), p(1, 3), p_neg(1, 2)
(Table: which of p(1, 2), p(1, 3), p_neg(1, 2) hold at each timestep)

Counters in Dedalus

counter(A, Val + 1)@next :- counter(A, Val), event(A, _);
counter(A, Val)@next :- counter(A, Val), notin event(A, _);
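
The same mechanics cover the queues and sequencing that were awkward in Overlog: carry pending items into the next timestep, minus the one being processed. A minimal sketch with assumed relation names (queue, qmin, dequeue), not from the talk, that drains one element per timestep in sequence-number order:

qmin(A, min<Seq>) :- queue(A, Seq, _);
dequeue(A, Seq, Payload) :- queue(A, Seq, Payload), qmin(A, Seq);
// persist every queued element except the one being dequeued
queue(A, Seq, Payload)@next :- queue(A, Seq, Payload), notin dequeue(A, Seq, _);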

Asynchrony in Dedalus
Unreliable Broadcast:

sbcast(#Target, Sender, Message)@async :-
    new_message(#Sender, Message),
    members(#Sender, Target);

More satisfactory logical interpretation
Can build Lamport clocks, reliable broadcast, etc.
What about “space”? Space is the unit of atomic deduction w/o partial failure

Asynchrony in Dedalus
Unreliable Broadcast:

sbcast(#Target, Sender, N, Message)@async :-
    new_message(#Sender, Message)@N,
    members(#Sender, Target);

Project sender’s local time (N)
More satisfactory logical interpretation
Can build Lamport clocks, reliable broadcast, etc.
What about “space”? Space is the unit of atomic deduction w/o partial failure
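
The reliable-broadcast claim can be made concrete with one more rule: every node re-forwards whatever it receives, so if any correct node gets the message, every member eventually delivers it. A sketch with assumed relation names (rbcast, deliver), not from the talk, reusing new_message and members from above:

rbcast(#Target, Sender, Message)@async :- new_message(#Sender, Message), members(#Sender, Target);
// re-forward on receipt; duplicate sends are harmless because derivation is set-based
rbcast(#Target, Sender, Message)@async :- rbcast(#Node, Sender, Message), members(#Node, Target);
deliver(#Node, Sender, Message) :- rbcast(#Node, Sender, Message);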

Dedalus Summary
Logical, model-theoretic semantics for two key features of distributed systems:
1. Mutable state
2. Asynchronous communication
All facts are transient – Persistence and state update are explicit
Initial correctness checks:
1. Temporal stratifiability (“modular stratification in time”)
2. Temporal safety (“eventual quiescence”)

Directions: Bloom
1. Bloom: Logic for Joe the Programmer
   – Expose sets, map/reduce, and callbacks?
   – Translation to Dedalus
2. Verification of Dedalus programs
3. Network-oriented optimization
4. Finding the right abstractions for Distributed Computing
   – Hand-coding protocols vs. stating distributed invariants
5. Parallelism and monotonicity?

Questions? Thank you!
Initial Publications:
BOOM Analytics: EuroSys’10, Alvaro et al.
Paxos in Overlog: NetDB’09, Alvaro et al.
Dedalus: UCB Technical Report, Alvaro et al.

Temporal Stratifiability
Reduction to Datalog not syntactically stratifiable:

p(X)@next :- p(X), notin p_neg(X);
p_neg(X)  :- event(X, _), p(X);
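
To see why, expand the sugar with explicit time attributes, following the pattern from the “Dedalus: Datalog in Time” slide (a sketch of the expansion):

p(X, S)     :- p(X, T), notin p_neg(X, T), successor(T, S);
p_neg(X, T) :- event(X, _, T), p(X, T);

// p depends negatively on p_neg, and p_neg depends on p: a cycle through
// negation, so no syntactic stratification exists. But each negated atom
// refers to an earlier timestep than the head it feeds, so stratifying on
// the time attribute (“modular stratification in time”) still yields a
// unique model.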