Adaptive Query Processing for Wide-Area Distributed Data Michael Franklin UC Berkeley Joint work with Tolga Urhan, Laurent Amsaleg, and Anthony Tomasic.

Slides:



Advertisements
Similar presentations
MapReduce Online Created by: Rajesh Gadipuuri Modified by: Ying Lu.
Advertisements

Distributed DBMSPage © 1998 M. Tamer Özsu & Patrick Valduriez Outline Introduction Background Distributed DBMS Architecture Distributed Database.
Distributed DBMS© M. T. Özsu & P. Valduriez Ch.6/1 Outline Introduction Background Distributed Database Design Database Integration Semantic Data Control.
1 Storage-Aware Caching: Revisiting Caching for Heterogeneous Systems Brian Forney Andrea Arpaci-Dusseau Remzi Arpaci-Dusseau Wisconsin Network Disks University.
Serverless Network File Systems. Network File Systems Allow sharing among independent file systems in a transparent manner Mounting a remote directory.
Helper Threads via Virtual Multithreading on an experimental Itanium 2 processor platform. Perry H Wang et. Al.
Presented by: Thabet Kacem Spring Outline Contributions Introduction Proposed Approach Related Work Reconception of ADLs XTEAM Tool Chain Discussion.
Early Hash Join: A Configurable Algorithm for the Efficient and Early Production of Join Results Ramon Lawrence University of Iowa
IntroductionAQP FamiliesComparisonNew IdeasConclusions Adaptive Query Processing in the Looking Glass Shivnath Babu (Stanford Univ.) Pedro Bizarro (Univ.
Paper by: A. Balmin, T. Eliaz, J. Hornibrook, L. Lim, G. M. Lohman, D. Simmen, M. Wang, C. Zhang Slides and Presentation By: Justin Weaver.
Continuously Adaptive Processing of Data and Query Streams Michael Franklin UC Berkeley April 2002 Joint work w/Joe Hellerstein and the Berkeley DB Group.
VLDB Revisiting Pipelined Parallelism in Multi-Join Query Processing Bin Liu and Elke A. Rundensteiner Worcester Polytechnic Institute
Interactive query processing partial query results dynamic user control during query execution adaptive query execution interactive data cleaning and transformation.
Adaptive Query Processing for Wide-Area Distributed Data Michael Franklin University of Maryland Joint work with Tolga Urhan, Laurent Amsaleg, and Anthony.
Shangri-La: Achieving High Performance from Compiled Network Applications while Enabling Ease of Programming Michael K. Chen, Xiao Feng Li, Ruiqi Lian,
Freddies: DHT-Based Adaptive Query Processing via Federated Eddies Ryan Huebsch Shawn Jeffery CS Peer-to-Peer Systems 12/9/03.
XJoin: Getting Fast Answers From Slow and Bursty Networks T. Urhan M. J. Franklin IACS, CSD, University of Maryland Presented by: Abdelmounaam Rezgui CS-TR-3994.
Adaptive Query Processing for Wide-Area Distributed Data Michael Franklin UC Berkeley Joint work with Tolga Urhan, Laurent Amsaleg, and Anthony Tomasic.
1 Improving Hash Join Performance through Prefetching _________________________________________________By SHIMIN CHEN Intel Research Pittsburgh ANASTASSIA.
Query Optimization 3 Cost Estimation R&G, Chapters 12, 13, 14 Lecture 15.
Query Optimization. General Overview Relational model - SQL  Formal & commercial query languages Functional Dependencies Normalization Physical Design.
An Adaptive Multi-Objective Scheduling Selection Framework For Continuous Query Processing Timothy M. Sutherland Bradford Pielech Yali Zhu Luping Ding.
CS561 - XJoin1 XJoin: A Reactively-Scheduled Pipelined Join Operator IEEE Bulletin, 2000 by Tolga Urhan and Michael J. Franklin.
Dutch-Belgium DataBase Day University of Antwerp, MonetDB/x100 Peter Boncz, Marcin Zukowski, Niels Nes.
Data-Intensive Systems Michael Franklin UC Berkeley
1 04/18/2005 Flux Flux: An Adaptive Partitioning Operator for Continuous Query Systems M.A. Shah, J.M. Hellerstein, S. Chandrasekaran, M.J. Franklin UC.
Architectural Design Establishing the overall structure of a software system Objectives To introduce architectural design and to discuss its importance.
1 The Google File System Reporter: You-Wei Zhang.
Early Hash Join: A Configurable Algorithm for the Efficient and Early Production of Join Results Ramon Lawrence University of Iowa
Dynamic and Decentralized Approaches for Optimal Allocation of Multiple Resources in Virtualized Data Centers Wei Chen, Samuel Hargrove, Heh Miao, Liang.
Database Systems: Design, Implementation, and Management Eighth Edition Chapter 10 Database Performance Tuning and Query Optimization.
CSC271 Database Systems Lecture # 30.
NiagaraCQ : A Scalable Continuous Query System for Internet Databases (modified slides available on course webpage) Jianjun Chen et al Computer Sciences.
CONGRESSIONAL SAMPLES FOR APPROXIMATE ANSWERING OF GROUP-BY QUERIES Swarup Acharya Phillip Gibbons Viswanath Poosala ( Information Sciences Research Center,
1 XJoin: Faster Query Results Over Slow And Bursty Networks IEEE Bulletin, 2000 by T. Urhan and M Franklin Based on a talk prepared by Asima Silva & Leena.
Loading a Cache with Query Results Laura Haas, IBM Almaden Donald Kossmann, Univ. Passau Ioana Ursu, IBM Almaden.
An Integration Framework for Sensor Networks and Data Stream Management Systems.
Query Optimization. overview Histograms A histogram is a data structure maintained by a DBMS to approximate a data distribution Equiwidth vs equidepth.
Architectural Support for Fine-Grained Parallelism on Multi-core Architectures Sanjeev Kumar, Corporate Technology Group, Intel Corporation Christopher.
20 October 2006Workflow Optimization in Distributed Environments Dynamic Workflow Management Using Performance Data David W. Walker, Yan Huang, Omer F.
Distributed Database Systems Overview
© ETH Zürich Eric Lo ETH Zurich a joint work with Carsten Binnig (U of Heidelberg), Donald Kossmann (ETH Zurich), Tamer Ozsu (U of Waterloo) and Peter.
PMIT-6102 Advanced Database Systems By- Jesmin Akhter Assistant Professor, IIT, Jahangirnagar University.
MapReduce and GFS. Introduction r To understand Google’s file system let us look at the sort of processing that needs to be done r We will look at MapReduce.
PermJoin: An Efficient Algorithm for Producing Early Results in Multi-join Query Plans Justin J. Levandoski Mohamed E. Khalefa Mohamed F. Mokbel University.
Efficient RDF Storage and Retrieval in Jena2 Written by: Kevin Wilkinson, Craig Sayers, Harumi Kuno, Dave Reynolds Presented by: Umer Fareed 파리드.
MadCache: A PC-aware Cache Insertion Policy Andrew Nere, Mitch Hayenga, and Mikko Lipasti PHARM Research Group University of Wisconsin – Madison June 20,
Investigating Adaptive Compilation using the MIPSpro Compiler Keith D. Cooper Todd Waterman Department of Computer Science Rice University Houston, TX.
Lecture 15- Parallel Databases (continued) Advanced Databases Masood Niazi Torshiz Islamic Azad University- Mashhad Branch
An Example Data Stream Management System: TelegraphCQ INF5100, Autumn 2009 Jarle Søberg.
Supporting Top-k join Queries in Relational Databases Ihab F. Ilyas, Walid G. Aref, Ahmed K. Elmagarmid Presented by: Z. Joseph, CSE-UT Arlington.
A Roadmap towards Machine Intelligence
BarrierWatch: Characterizing Multithreaded Workloads across and within Program-Defined Epochs Socrates Demetriades and Sangyeun Cho Computer Frontiers.
A N I N - MEMORY F RAMEWORK FOR E XTENDED M AP R EDUCE 2011 Third IEEE International Conference on Coud Computing Technology and Science.
Efficient Evaluation of Queries in a Mediator for WebSources Louiqa Raschid University of Maryland Joint work with Zadorozhny, Vidal, Urhan, Bright.
University of Maryland Scaling Heterogeneous Information Access for Wide area Environments Michael Franklin and Louiqa Raschid.
For a good summary, visit:
Rate-Based Query Optimization for Streaming Information Sources Stratis D. Viglas Jeffrey F. Naughton.
Cost-based Query Scrambling for Initial Delays Tolga Urhan Michael J. Franklin Laurent Amsaleg.
BIG DATA/ Hadoop Interview Questions.
18 May 2006CCGrid2006 Dynamic Workflow Management Using Performance Data Lican Huang, David W. Walker, Yan Huang, and Omer F. Rana Cardiff School of Computer.
Packing Tasks with Dependencies
Ripple Joins for Online Aggregation
Human Complexity of Software
Database Query Execution
Streaming Sensor Data Fjord / Sensor Proxy Multiquery Eddy
Outline Introduction Background Distributed DBMS Architecture
Presented By: Darlene Banta
Adaptive Query Processing (Background)
L. Glimcher, R. Jin, G. Agrawal Presented by: Leo Glimcher
Presentation transcript:

Adaptive Query Processing for Wide-Area Distributed Data Michael Franklin UC Berkeley Joint work with Tolga Urhan, Laurent Amsaleg, and Anthony Tomasic

M. Franklin, 12/3/99 2 Motivation n Pervasive network connectivity enables global-scale federated DBMSs. n Improvements in heterogeneous DBMS and emerging standards enable Internet query processing. n Telegraph: Flow-based composition of data- intensive Internet services.

M. Franklin, 12/3/99 3 u Sources may be unreachable or slow to respond. u Data delivery may be: F F slower than expected F F bursty F F interrupted u Data statistics/cost estimates may be unavailable or unreliable. Traditional, static query processing approaches cannot cope with such problems at run-time. Wide-area + Wrapped sources  Unpredictability

M. Franklin, 12/3/99 4 Some Solutions n Adaptive Query Processing u Query Scrambling - “Reactive Query Execution” u XJoin – non-blocking, reactive query operator. u and beyond! n Risk-Aware Query Planning u Producing robust plans. n Exploiting Alternative Sources u Mirrors or “not exactly”. n Relaxing Query Semantics u Partial, Fuzzy, or Alternative answers

M. Franklin, 12/3/99 5 Query Scrambling - Introduction n Goal: Overcome limitations of static QP for unexpected delays. n A Reactive Approach: u Start with an optimized plan. u Modify the plan on-the-fly if problems are detected. u Hide delays by performing other useful work. n Assumptions: u Focus on Initial Delay u Query processing at client; Iterator model u No replication.

M. Franklin, 12/3/99 6 n An iterative algorithm. n Monitor input and scramble when problems are detected. Query Scrambling - Overview n Phase 1: n Phase 1: Reschedule “runable” operators. n Phase 2: n Phase 2: Operator synthesis: create new operators. Phase 1 Phase 2 Scrambling Normal Execution Source(s) responded Source(s) delayed Still delayed

M. Franklin, 12/3/99 7 Query Scrambling Example 1 4 A CDE B Reschedule A CDEB New Operators BCDEA Initial PlanReschedule A BCDE ABCDE

M. Franklin, 12/3/99 8 n A thread per operator. n Monitoring and scheduling. n A “smart” materialization operator. n Multi-threaded query operators? Building a Scrambling Engine Not Started Active Stalled Suspended Closed open done timeout data_arrival de-schedule resume

M. Franklin, 12/3/99 9 Directing Scrambling [SIGMOD 98] n Original formulation [PDIS 96] was based on heuristics. n Demonstrated the ability for QS to hide delays, but was susceptible to making bad choices. n Query optimizers are able to choose good plans, but how to use an optimizer to do scrambling? u Phase I F Issue: where to place the materialization operator? F Answer: Choose subtree with best overhead/useful work ratio. u Phase II is trickier.

M. Franklin, 12/3/99 10 n If no runable subtrees, create new ones. n Needed: an optimizer that: 1) is lightweight & incremental, and 2) understands delays. n Most QP systems optimize for total work. n But, delay is inherently a response-time issue. but only if it knows the duration of the delay! n Response-time optimization can “magically” move delayed operators to the “best” point in the plan, Phase II - Operator Synthesis

M. Franklin, 12/3/99 11 n Invokes the optimizer with a very large delay value. n Optimizer pushes the delayed relation as far back as is useful. n Large delay estimation Aggressive Include Delayed (ID) Algorithm

M. Franklin, 12/3/99 12 Estimated Delay (ED) Algorithm n Initially calls the RT optimizer with a small delay u Small value = 25 % of the RT of the original query n Successively increases the delay estimation. u 50% and then 100% of the original RT. n Increasing estimates Adaptive

M. Franklin, 12/3/99 13 Experimental Environment n Workload: Queries derived from TPC-D benchmark u TPC-D (5), TPC-D(8), TPC-D(9), (1 GB base data) n Optimizer (built from scratch): u Two Phase Randomized Optimizer F F a la [Ioannidis 90]. u Optimizes for Total Work or Response Time (GHK 92). u Search space = bushy plans n Studied algorithms on a simulated environment u Network, remote sites, query engine etc. u Subsequently validated with Predator-based implementation.

M. Franklin, 12/3/99 14 National Market Share Query (TPC-D 8) n Experiments with several memory sizes n Delayed relation (Part) is an important relation. n Used hash joins only. n Lineitem is the largest relation, Part is a “reducer” n Optimizer initially chooses to go left-to-right. PartLineItem Supplier Nation Customer Region Order 1/1502/7 1/5

M. Franklin, 12/3/99 15 National Market Share Query (large memory) > 4 MB Delay No Scramb

M. Franklin, 12/3/99 16 National Market Share Query (Sm. memory) Scrambling becomes more expensive Pair: Local Decisions, lack of global view IN : Poor performance for short delays. ED : Good for a wide range of delay values. No Scramb. Delay

M. Franklin, 12/3/99 17 Cost-Based Query Scrambling Summary: u Traditional static query processing does not scale to the wide-area environment. u A reactive approach is needed. u This requires a multi-threaded engine and a scrambling-enabled optimizer. Experimental Results: u Avoids many of the problems of heuristic algorithms. u Response time-based optimization is needed. u Fundamental tradeoffs arise in the absence of good delay predictions.

M. Franklin, 12/3/99 18 XJoin - Improving Responsiveness n QS can speed up the delivery of the entire answer. n But, its ability to hide delays is limited by the amount of useful work that can be done in the query. n XJoin is a new query operator that: u Produces results incrementally as they become available. u Allows progress to be made in highly erratic situations. u Has a small memory footprint. u Tolerates bursty and slow behavior.

M. Franklin, 12/3/99 19 u Traditional Hash Joins block when one input stalls. Hash Join Build Probe Source A Source B Hash Table A Hash Table B u Symmetric Hash Join (SHJ) blocks only if both stall. u Processes tuples as they arrive from sources. u Produces all tuples in the join and no duplicates. Symmetric Hash Join

M. Franklin, 12/3/99 20 Memory Utilization n As originally specified, SHJ requires both inputs to be memory resident. n For a complex query, this means all intermediate results must be in memory. n This is wasteful and can result in thrashing. n XJoin extends SHJ to allow it to work with limited memory (like “Hybrid Hash”). n Spilled tuples are processed by a reactively- scheduled background thread.

M. Franklin, 12/3/99 21 Partitioning n XJoin is a partitioned hash join method. n When allocated memory is exhausted, a partition is flushed to disk. n Join processing continues on memory-resident data. n Disk-resident tuples are handled in background.

M. Franklin, 12/3/99 22 The 3 Stages of XJoin n Stage 1 - Symmetric hash join (memory-to-memory) n Stage 2- Disk-to-memory u Separate thread - runs when stage 1 blocks. u Stage 1 and 2 trade off until all input has been received. n Stage 3 - Clean up stage u Stage 1 misses pairs that were not in memory concurrently. u Stage 2 misses pairs when both are on disk, and may not get to run to completion.

M. Franklin, 12/3/99 23 XJoin - Details n The asynchronous/multi-threaded nature of XJoin combined with its small footprint allows it to be fully pipelined, but… n Duplicate result tuples can be introduced during stages 2 and 3. These are avoided using timestamps. u Each tuple is given an Arrival Timestamp (ATS) and a Departure Timestamp (DTS). u Two tuples with overlaping ATS-DTS ranges have already been matched in stage 1. u Timestamp of when disk-resident partition was used allows detection of tuples matched during stage 2. n Second stage can be further optimized, at the expense of a bit of memory and some additional duplicate detection.

M. Franklin, 12/3/99 24 XJoin-Performance n We implemented XJoin in our multi-threaded version of the PREDATOR ORDBMS (from Cornell). n We modeled network delays using traces obtained from accessing sites across the Internet. u Replaying these traces provides repeatable results. n Focus on a “slow” (24.1 KB/sec) and “fast” (132.8 KB/Sec) trace - both exhibit bursty behavior. n Workload is simple join queries on Wisconsin Benchmark relations.

M. Franklin, 12/3/99 25 XJoin H Fast Build, Slow Probe XJ-2 XJoin H Slow Build, Slow Probe XJ-2 Results - 2-Way Joins (Time in seconds to n th tuple) XJoin H Fast Build, Fast Probe XJ-2 XJoin Slow Build, Fast Probe H XJ-2

M. Franklin, 12/3/99 26 Taming the Second Stage H XJoin XJoin-A Fast Build, Fast Probe n Impact of the second stage decreases during the execution of an XJoin. Scheduling can be adjusted to account for this.

M. Franklin, 12/3/99 27 Results – Multiway Joins SLOW FAST Delivery Times (in Seconds)

M. Franklin, 12/3/99 28 XJoin - Summary n A non-blocking, small footprint join operator. n It is multi-threaded, consisting of three stages. u These stages allow XJoin to make progress when input blocks, but they can introduce duplicates. n XJoin is optimized for streaming results to users as fast as they are created. n Like QS, XJoin hides delays with useful work, but at the operator level rather than at the plan level. n Experiments showed   order-of-magnitude improvements in time to get initial results.

M. Franklin, 12/3/99 29 Eddy – Continuous Optimization n Flow-based (“Rivers”) n Tuples are routed via a ticket-based scheme and back-pressure. n Hellerstein and Avnur 99 Eddy Join ST Join RS R S T

M. Franklin, 12/3/99 30 Adaptive Approaches n Increased uncertainty argues for increased adaptivity. u Wide-area nets and admin domains introduce uncertainty. u Pesky users introduce uncertainty. u Non-traditional data sources introduce uncertainty. n Implications for data-intensive Internet services. Dynamic, Parametric, Competitive, … static plans anarchy late binding reopt. continuous opt. current DBMS Query Scrambling, Kabra/DeWitt Eddy XJoin ???

M. Franklin, 12/3/99 31 The Telegraph Project n Adaptive data management for Internet-scale composition of services. u Dataflow-based scheduling. u Cross-domain negotiation. u “User-in-the-loop” u Adaptation and learning over varying granularities F individual long-running jobs F many similar short jobs F continuous data flows and filters. 

M. Franklin, 12/3/99 32Conclusions n Current static query processing technology cannot cope with the wide-area environment. n A key concern is unpredictability. u Query Scrambling is a reactive execution approach. u XJoin is a pipelined operator that streams answers. u Even more adaptive approaches are possible. n Complementary approaches: u Alternative sources, optimizing for robustness, relaxing semantics. n These ideas extend to the composition of Internet services.

FINE

M. Franklin, 12/3/99 34 Future Work n Investigating the properties of query plans that make them robust in the presence of network problems. u Will use these properties in the objective function for query optimization. n Next step is to use alternative, but not necessarily equivalent sources. n Further progress will involve relaxing the guarantees on semantics that the query system provides. u The WWW has shown us that users will accept this!

M. Franklin, 12/3/99 35 n Semantic Interoperability n Source Discovery n Performance n Responsiveness and Availability u Distributed database technology, caching, etc. u Unpredictability: how to build responsive systems? u This is the focus of this talk. QP on the Internet? — Issues u Wrapper/Mediator Architecture. u XML,XMI, CWMI,OLE-DB,... u Metadata Repositories and Directories.

M. Franklin, 12/3/99 36 Databases to the Rescue? n DB query languages used to be navigational. n Relational languages are more useful for many tasks. u Powerful, and (more or less) declarative. u Queries are written without regard to the physical structure/location/etc. of data. (Data Independence) u Easily extended to distributed systems. n DB query languages and optimization techniques have been developed over decades. n This technology is unavailable to the Internet user.

M. Franklin, 12/3/99 37 Distributed Query Processing (QP) SELECT eid,ename,title,salary FROM Emp, Proj, Assign WHERE Emp.eid = Assign.eid AND Proj.pid = Assign.pid AND Emp.loc <> Proj.loc n System handles query plan generation & optimization; ensures correct execution. n Originally conceived for corporate networks. ©1998 Ozsu and Valduriez

M. Franklin, 12/3/99 38Conclusions n Current Internet querying and data manipulation capabilities are too limited. u Unexpressive, too coarse grained, etc. u Do not support manipulating data from multiple sites. n Distributed querying technology addresses these concerns but is not applicable on the Internet. n A key concern is unpredictability. u Query Scrambling is a reactive execution approach. u XJoin is a pipelined operator that streams answers. u Lots more interesting work to be done in this area.