Adaptive Dataflow Joe Hellerstein UC Berkeley
Overview Trends Driving Adaptive Dataflow Lessons –networking flow control, event programming, app-level routing –query processing distributed & “shared nothing” parallel query systems Adaptive Dataflow –Rivers, Eddies Telegraph –Facts and Figures on the Web (FFW) –sensor-based information systems –software traces and adaptive distributed systems FFW Questions
Recent Trends 1990s: shift in the focus of academic CS –driving example applications changed –everybody on the information bandwagon 1990s: tightening of bottlenecks –data growth double Moore’s Law 90’s systems R&D: Parallel & Distributed Information Services –infocentric, multi-user, highly available –“shared-nothing” clusters, not “parallelism” a la 1989 –distributed data, not “distributed OS” a la 1989
Systems Research: Up or Down UP: Global Federations –internet services as procedure calls with fees and lawyers! –B2B e-commerce has this problem today Cohera examples DOWN: Sensor/Actuator Networks –UPC codes? Clickstream? Smart dust! –HUGE, noisy data volumes –new architectures, major challenges
Core Technology Not There Yet Key component: dataflow –The plumbing is coming XML/http, WML/WAP, etc. give LCD communication glue at the boundaries ho hum –Systems challenge: move the data efficiently robustly intelligently –Language challenge too programming and debugging tools interfaces and economic models
What’s So Hard Here? Volatile regime –Data flows unpredictably from sources –Code performs unpredictably along flows –Continuous volatility due to many decentralized systems Lots of choices –Choice of services –Choice of machines –Choice of info: sensor fusion, data reduction, etc. –Order of operation Maintenance –Federated world –Partial failure is the common case Adaptivity required!
A Networking Problem!? Networks do dataflow! Significant history of adaptive techniques –E.g. TCP congestion control –E.g. routing But traditionally much lower function –Ship bitstreams –Minimal, fixed code Lately, moving up the foodchain? –app-level routing –active networks –politics of growth assumption of complexity = assumption of liability
Networking Code as Dataflow? States & Events, Not Threads –Asynchronous events natural to networks –State machines in protocol specification and system code –Low-overhead, spreading to big systems –Totally different programming style remaining area of hacker machismo Eventflow optimization –Can’t eventflow be adaptively optimized like dataflow? –Why didn’t that happen years ago? –Hold this thought
Query Plans are Dataflow Too Programming model: iterators –old idea, widely used in DB query processing –object with three methods: Init(), GetNext(), Close() –input/output types –query plan: graph of iterators pipelining: iterators that return results before children Close()
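The iterator model above can be sketched in a few lines. This is a minimal illustration, not any particular DBMS's code: the class names `Scan` and `Select`, the helper `run`, and the use of `None` as the end-of-stream marker are all assumptions made for the example.

```python
class Scan:
    """Leaf iterator: produces rows from an in-memory list (assumed source)."""

    def __init__(self, rows):
        self.rows = rows
        self.pos = 0

    def init(self):
        self.pos = 0

    def get_next(self):
        if self.pos >= len(self.rows):
            return None  # end of stream (convention assumed for this sketch)
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        self.pos = len(self.rows)


class Select:
    """Pipelining iterator: returns a result as soon as its child yields a match."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def init(self):
        self.child.init()

    def get_next(self):
        while True:
            row = self.child.get_next()
            if row is None or self.predicate(row):
                return row

    def close(self):
        self.child.close()


def run(plan):
    """Drive a plan (a tree of iterators) from the root and collect its output."""
    plan.init()
    out = []
    row = plan.get_next()
    while row is not None:
        out.append(row)
        row = plan.get_next()
    plan.close()
    return out
```

For example, `run(Select(Scan([1, 2, 3, 4]), lambda r: r % 2 == 0))` pulls rows one at a time through the plan; `Select` never needs its child to finish before it returns a row, which is what makes it pipelining.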
Distributed/Parallel Databases? Query plans across machines Bloom filters, query optimization minimize bandwidth –Send lossily compressed signatures, not just data –Model network, disk, CPU costs in dataflow optimization –A “Distributed DB” contribution App-level optimization … code to data or data to code –DB research highest up the food chain Data partitioning, query opt. parallelizes dataflow –Pipeline & partition parallelism: natural! –Model resource consumption and response time –A “Parallel DB” contribution Challenge: move down the foodchain to serve all –Biggest problem: limited adaptivity
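The "lossily compressed signatures" idea can be sketched as a Bloom-filter semi-join: one site sends a compact bit signature of its join keys, and the other site ships only the rows that might match. The sketch below is illustrative only; the class `BloomFilter`, the function `bloom_semijoin`, and the sizing parameters are assumptions, and a real system would size the filter from the optimizer's cost model.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; a Python int serves as the bit array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits      # assumed size, not tuned
        self.num_hashes = num_hashes  # assumed number of hash functions
        self.bits = 0

    def _positions(self, key):
        # Derive several bit positions per key by salting a SHA-256 hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> p) & 1 for p in self._positions(key))


def bloom_semijoin(local_rows, remote_filter, key_of):
    """Keep only local rows whose join key might exist at the remote site."""
    return [r for r in local_rows if remote_filter.might_contain(key_of(r))]
```

The filter never drops a row that has a real match, so the final join result is unchanged; false positives only cost some extra bandwidth.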
Adaptive Systems: General Flavor Repeat: 1.Observe (model) environment 2.Use observation to choose behavior 3.Take action
Adaptive Dataflow in DBs: History Rich But Unacknowledged History –Codd's data independence predicated on adaptivity! adapt opaquely to changing schema and storage –Query optimization does it! statistics-driven optimization key differentiator between DBMSs and other systems
Adaptivity in Current DBs Limited & coarse grain Repeat: 1.Observe (model) environment –runstats (once per week!!): model changes in data 2.Use observation to choose behavior –query optimization: fixes a single static query plan 3.Take action –query execution: blindly follow plan
Adaptive Query Processing Work –Late Binding: Dynamic, Parametric [HP88,GW89,IN+92,GC94,AC+96,LP97] –Per Query: Mariposa [SA+96], ASE [CR94] –Competition: RDB [AZ96] –Inter-Op: [KD98], Tukwila [IF+99] –Query Scrambling: [AF+96,UFA98] Survey: Hellerstein, Franklin, et al., DE Bulletin 2000 [Figure: systems placed along a "Frequency of Adaptivity" axis: System R, Late Binding, Per Query, Competition & Sampling, Inter-Operator, Query Scrambling, Eddies, Ingres DECOMP, Future Work]
Some Solutions We’re Focusing On Rivers –Adaptive partitioning of work across machines Eddies –Adaptive ordering of pipelined operations Quality of Service –Online aggregation & data reduction: CONTROL –MUST have app-semantics –Often may want user interaction UI models of temporal interest Data Dissemination –Adaptively choosing what to send, what to cache
Dataflow Parallelism in DBs Volcano: “exchange” iterator [Graefe] –encapsulate exchange logic in an iterator –not in the dataflow system –Box-and-arrow programming can ignore parallelism
River We built the world’s fastest sorting machine –On the “NOW”: 100 Sun workstations + SAN –Only beat the record under ideal conditions No such thing in practice! River: adaptive dataflow on clusters –One main idea: Distributed Queues adaptive exchange operator –Simplifies management and programming
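The intuition behind balancing work through queues can be sketched with a single-process simulation in which consumers pull from one shared queue at their own pace, so faster consumers naturally take more of the work. This is not River's actual mechanism, which is distributed across cluster nodes; the function `run_distributed_queue` and the per-step `consumer_rates` are assumptions made for the example.

```python
from collections import deque


def run_distributed_queue(items, consumer_rates):
    """Simulate consumers draining one shared queue.

    consumer_rates[i] is the number of items consumer i can process per
    time step (an assumed stand-in for machine speed).
    Returns the list of items each consumer ended up processing.
    """
    queue = deque(items)
    processed = [[] for _ in consumer_rates]
    while queue:
        # One time step: each consumer pulls as much as its rate allows.
        for i, rate in enumerate(consumer_rates):
            for _ in range(rate):
                if not queue:
                    break
                processed[i].append(queue.popleft())
    return processed
```

With rates `[3, 1]`, the slower consumer ends up with about a quarter of the items without anyone having fixed the partitioning in advance, which is the contrast with a static exchange operator.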
Multi-Operator Query Plans Deal with pipelines of commutative operators Adapt at finer granularity than current DBMSs
Continuous Adaptivity: Eddies A pipelining tuple-routing iterator –just like join or sort or exchange Works great with other pipelining operators –like Ripple Joins, online reordering, etc. [Avnur & Hellerstein, SIGMOD 2000]
Continuous Adaptivity: Eddies How to order and reorder operators over time –based on performance, economic/admin feedback Vs. River: –River optimizes each operator “horizontally” –Eddies optimize a pipeline “vertically”
Continuous Adaptivity: Eddies Adjusts flow adaptively –Tuples routed through ops in different orders –Visit each op once before output Naïve routing policy: –All ops fetch from eddy as fast as possible A la River –Turns out, doesn’t quite work Only measures rate of work, not benefit
An Aside: n-Arm Bandits A little machine learning problem: –Each arm pays off differently –Explore? Or Exploit? Sometimes want to randomly choose an arm Usually want to go with the best If probabilities are stationary, dampen exploration over time
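The explore/exploit trade-off can be sketched with an epsilon-greedy policy whose exploration rate shrinks over time. This is one standard bandit heuristic chosen for illustration, not the policy used in eddies; the function name `epsilon_greedy`, the `1/sqrt(t)` decay schedule, and the seeded random generator are assumptions.

```python
import random


def epsilon_greedy(payoff_probs, pulls, seed=0):
    """Play a Bernoulli n-arm bandit; return how often each arm was pulled.

    payoff_probs[i] is the (stationary) chance that arm i pays off.
    """
    rng = random.Random(seed)
    n = len(payoff_probs)
    counts = [0] * n
    wins = [0] * n

    def estimate(arm):
        # Untried arms look infinitely good so that each is tried at least once.
        return wins[arm] / counts[arm] if counts[arm] else float("inf")

    for t in range(1, pulls + 1):
        epsilon = 1.0 / (t ** 0.5)  # assumed schedule: dampen exploration over time
        if rng.random() < epsilon:
            arm = rng.randrange(n)            # explore: pick an arm at random
        else:
            arm = max(range(n), key=estimate)  # exploit: go with the best so far
        counts[arm] += 1
        if rng.random() < payoff_probs[arm]:
            wins[arm] += 1
    return counts
```

If the payoff probabilities were not stationary, the decay would have to be replaced by some persistent exploration or forgetting, which is the situation an eddy faces.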
Eddies with Lottery Scheduling Operator gets 1 ticket when it takes a tuple –Favor operators that run fast (low cost) Operator loses a ticket when it returns a tuple –Favor operators with high rejection rate Low selectivity Lottery Scheduling: –When two ops vie for the same tuple, hold a lottery –Never let any operator go to zero tickets Support occasional random “exploration” Set up “inflation” (forgetting) to adapt over time –E.g. tix’ = oldtix + newtix
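The ticket rules above can be sketched for the simple case where every operator is a filter. This is a minimal sketch under stated assumptions, not the eddy implementation: the names `Op` and `run_eddy`, the per-tuple `pending` list standing in for "visit each op once", and the floor of one ticket are illustrative choices, and the "inflation" (forgetting) step is left out.

```python
import random


class Op:
    """A filter operator with a ticket count (hypothetical helper class)."""

    def __init__(self, name, predicate):
        self.name = name
        self.predicate = predicate
        self.tickets = 1  # never allowed to reach zero
        self.calls = 0


def run_eddy(tuples, ops, seed=0, min_tickets=1):
    """Route each tuple through all ops, choosing the next op by lottery."""
    rng = random.Random(seed)
    output = []
    for tup in tuples:
        pending = list(ops)  # ops this tuple has not visited yet
        alive = True
        while pending and alive:
            # Lottery: chance of winning is proportional to tickets held.
            weights = [op.tickets for op in pending]
            op = rng.choices(pending, weights=weights)[0]
            pending.remove(op)
            op.tickets += 1  # gains a ticket for taking a tuple
            op.calls += 1
            if op.predicate(tup):
                # Tuple returned to the eddy: the op gives a ticket back.
                op.tickets = max(min_tickets, op.tickets - 1)
            else:
                alive = False  # tuple rejected; the op keeps its ticket
        if alive:
            output.append(tup)
    return output
```

In this sketch an operator that rejects most tuples accumulates tickets and so tends to be tried first, while an operator that passes everything stays at the floor and is still picked occasionally, which gives the "exploration" the slide asks for. The output is the same whatever order the lottery picks; only the amount of work changes.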
Promising! Initial performance results Ongoing work on proofs of convergence –have analysis for constrained case
To Be Continued Tune & formalize policy Competitive eddies –Source & Join selection –Requires duplicate management Parallelism –Eddies + Rivers? Reliability –Long-running flows –Rivers + RAID-style computation [Figure: an eddy with sources R1, R2, R3 and S1, S2, S3 and join methods hash, block, index1, index2]
To Be Continued, cont. What about wide area? –data reduction –sensor fusion –asynchronous communication Continuous queries –events –disconnected operation Lower-level eventflow? –can eddies, rivers, etc. be brought to bear on programming?
Telegraph: An Adaptive Dataflow System Currently cluster + http –Rivers and Eddies –Web wrappers Sensor nets next Target applications –Facts and Figures on the Web (FFW) –Distributed Introspection Services –Sensor Stream Services With Mike Franklin, Sirish Chandrasekaran, Amol Deshpande, Nick Lanham, Sam Madden, Vijayshankar Raman, Mehul Shah
Facts & Figures on the Web “Deep” Web –“Hidden” Web, “Dark Matter” More interesting: Facts & figures, not text –“search” is not the main problem “search” was always easy ranking often not apropos to facts –combine, transform, summarize, analyze
FFW Election 2000 Campaign Finance Drill-down –Bush/Gore donations Personal and industrial –Industry data –Neighborhood data –Personal data –Historical voting data Live demo, online aggregation
Web Research Revisited Crawling, Caching, Relationship Graphs –Transitive Closure –The Berkeley Bindings & graph analysis? –Form identification & APIs –“Semantic” caching Socio-Techno-Legal Issues –Privacy: Statistical DBs + Federation WhitePages |x| WhoIs |x| DoubleClick |x| CDC Wonder –Stats in the wrong hands –Accuracy of derived results –Intellectual property Etc!
Summary Adaptive software systems must happen –federation & scaling require it –systems and stats must marry Dataflow programming natural –for many applications –best hope for large-scale apps Terrific nexus of research –DB, Networking, Learning/Stat –Lots of work to be done! Drop by the Telegraph FFW demo!
Backup slides The rest of the slides are backup to answer questions…
Prior Progress in DB Adaptivity Per-query adaptivity –E.g. Mariposa [Sto95] 1st distributed DBMS to consider scalability economic APIs for federation, give limited adaptivity too One-time intra-query –DEC Rdb competition [AZ96] –Sampling [lots] Intra-query, inter-operator –“Query Scrambling”: reoptimize in face of delays [UFA98] –Kabra/DeWitt ‘98: dam the flow, reoptimize downstream