1
CS561 - XJoin
XJoin: A Reactively-Scheduled Pipelined Join Operator
IEEE Data Engineering Bulletin, 2000, by Tolga Urhan and Michael J. Franklin
2
Goal of XJoin
Efficiently evaluate equi-joins in online query processing over distributed data sources.
Optimization objectives:
- Small memory footprint
- Fast initial result delivery
- Hiding intermittent delays in data arrival
3
Outline
- Hash Join History
- Motivation of XJoin
- Challenges in Developing XJoin
- Three Stages of XJoin
- Preventing Duplicates
- Experimental Results
- Conclusion
4
Classic Hash Join
- Two phases: build and probe
- Only one table is hashed in memory
[Figure: (1) Build: R tuples are hashed into a table keyed key1 to key5. (2) Probe: S tuples 1 to 5 probe that table.]
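The two phases can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function name `classic_hash_join`, the `key` extractor, and the sample tuples are all assumptions made for the example.

```python
from collections import defaultdict


def classic_hash_join(r_tuples, s_tuples, key):
    """Two-phase hash join: build a table on R, then probe it with S."""
    # Phase 1 (build): hash every R tuple on the join key.
    table = defaultdict(list)
    for r in r_tuples:
        table[key(r)].append(r)
    # Phase 2 (probe): look up each S tuple and emit one pair per match.
    results = []
    for s in s_tuples:
        for r in table.get(key(s), []):
            results.append((r, s))
    return results


# Hypothetical sample data: the first field is the join key.
R = [(1, "r-a"), (2, "r-b"), (2, "r-c")]
S = [(2, "s-x"), (3, "s-y")]
pairs = classic_hash_join(R, S, key=lambda t: t[0])
```

No output can appear until the whole build phase has finished, which is the blocking behavior the pipelined variants on the following slides try to avoid.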
5
Hybrid Hash Join
- One table is hashed into partitions, both to disk and to memory
G. Graefe, "Query Evaluation Techniques for Large Databases", ACM, 1993.
[Figure: R tuples hashed into buckets i to j and buckets n to m, spread across disk and memory; S tuples 1, 2, 3, 4, ... probe them.]
6
Symmetric Hash Join (Pipelined)
- Both tables are hashed (both kept in main memory only)
A. Wilschut and P. M. G. Apers, "Dataflow Query Execution in a Parallel Main-Memory Environment", DPD 1991.
[Figure: Source R and Source S each feed a hash table (keys i to j for S tuples, keys n to m for R tuples). Every arriving R or S tuple is inserted into its own table (BUILD) and looked up in the other table (PROBE); matches go to OUTPUT.]
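A rough sketch of the build-then-probe step that each arriving tuple goes through. The class name `SymmetricHashJoin`, the `insert` method, and the sample keys are assumptions of this illustration, not part of the slides.

```python
from collections import defaultdict


class SymmetricHashJoin:
    """Keeps one in-memory hash table per source and joins tuples on arrival."""

    def __init__(self):
        self.tables = {"R": defaultdict(list), "S": defaultdict(list)}

    def insert(self, source, key, tup):
        other = "S" if source == "R" else "R"
        # BUILD: store the new tuple in its own source's table.
        self.tables[source][key].append(tup)
        # PROBE: look it up in the other source's table.
        matches = self.tables[other].get(key, [])
        # Always emit pairs in (R tuple, S tuple) order.
        if source == "R":
            return [(tup, m) for m in matches]
        return [(m, tup) for m in matches]


shj = SymmetricHashJoin()
out = []
out += shj.insert("R", 1, "r1")  # nothing to match yet
out += shj.insert("S", 1, "s1")  # matches r1
out += shj.insert("S", 2, "s2")  # nothing to match yet
out += shj.insert("R", 2, "r2")  # matches s2
```

Results appear as soon as the second half of a matching pair arrives, regardless of which source delivers it, but both tables grow without bound in memory.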
7
Problems of SHJ
- Rather memory intensive
- Won't work for large input streams
- Won't allow many joins to be processed in a pipeline (or even in parallel)
8
New Problems in Online Query Processing over Distributed Data Sources
Data access is unpredictable due to link congestion, load balancing, etc.
Three classes of delays:
- Initial Delay: the first tuple arrives from the remote source more slowly than usual
- Slow Delivery: data arrives at a constant rate, but more slowly than expected
- Bursty Arrival: data arrives in a fluctuating manner
9
Question: Why are delays undesirable?
- They prolong the time to first output
- Processing slows down if the operator waits for data to arrive before acting
- If data arrives too fast, none of it should be lost
- Time is wasted sitting idle while no data is coming
- Delays are unpredictable, so no single strategy will work
10
Motivation of XJoin
- Produce results incrementally when available: tuples are returned as soon as they are produced
- Exploit available main memory as long as possible: favor main-memory join when possible
- Allow progress to be made when one or more sources experience delays: background processing is performed on previously received tuples, so results are produced even when both inputs are stalled
11
XJoin Design
Tuples are stored in partitions (as in hash join), each with:
- A memory-resident (m-r) portion
- A disk-resident (d-r) portion
12
[Figure: XJoin partition layout. Tuples from SOURCE-A and SOURCE-B are hashed (for example hash(Tuple A) = 1, hash(Tuple B) = n) into memory-resident partitions 1 to n of their source. When needed, a partition k is flushed from MEMORY to the matching disk-resident partition (1 to n) of the same source on DISK.]
13
Challenges in Developing XJoin
- Manage the flow of tuples between memory and secondary storage (when and how to do it)
- Control background processing when inputs are delayed (the reactive scheduling idea)
- Provide both a quick initial result and good overall throughput
- Ensure the full answer is produced
- Ensure duplicate tuples are not produced
14
XJoin Stages
XJoin proceeds in 3 stages (separate threads):
- M:M (memory-to-memory)
- M:D (memory-to-disk)
- D:D (disk-to-disk)
15
1st Stage: Memory-to-Memory Join
[Figure: all partitions are in MEMORY. A Tuple A from SOURCE-A with hash(record A) = i is inserted into partition i of source A and probes partition i of source B. A Tuple B from SOURCE-B with hash(record B) = j is inserted into partition j of source B and probes partition j of source A. Matches go to Output.]
16
1st Stage: Memory-to-Memory Join
Join processing continues as long as:
- Memory permits, and
- One of the inputs is producing tuples
If memory is full, one partition is picked to be flushed to disk and appended to the end of its disk-resident portion.
If there is no new input, stage 1 blocks and stage 2 starts.
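The insert, probe, and flush behavior of the first stage might look roughly like the sketch below. The names (`PartitionedInput`, `stage1_insert`, `flush_largest`), the `memory_limit` counted in tuples, and the choice to flush the largest memory-resident portion are assumptions of this sketch; the slide only says that one partition is picked. Plain lists stand in for disk files, and timestamps are left out.

```python
class PartitionedInput:
    """Hash partitions of one source; each has a memory and a disk portion."""

    def __init__(self, num_partitions):
        self.memory = [[] for _ in range(num_partitions)]
        self.disk = [[] for _ in range(num_partitions)]  # stand-in for disk files

    def partition_of(self, key):
        return hash(key) % len(self.memory)

    def in_memory(self):
        return sum(len(p) for p in self.memory)


def flush_largest(inputs):
    """Append the largest memory-resident portion to its disk-resident portion."""
    src, p = max(
        ((s, i) for s in inputs for i in range(len(s.memory))),
        key=lambda sp: len(sp[0].memory[sp[1]]),
    )
    src.disk[p].extend(src.memory[p])
    src.memory[p] = []


def stage1_insert(own, other, key, tup, memory_limit):
    """Probe the other source's memory-resident partition, then insert."""
    p = own.partition_of(key)
    matches = [(tup, t) for k, t in other.memory[p] if k == key]
    own.memory[p].append((key, tup))
    if own.in_memory() + other.in_memory() > memory_limit:
        flush_largest((own, other))
    return matches


a, b = PartitionedInput(2), PartitionedInput(2)
out = []
out += stage1_insert(a, b, 1, "a1", memory_limit=3)
out += stage1_insert(b, a, 1, "b1", memory_limit=3)  # joins with a1
out += stage1_insert(a, b, 2, "a2", memory_limit=3)
out += stage1_insert(a, b, 3, "a3", memory_limit=3)  # over the limit: a's partition 1 is flushed
missed = stage1_insert(b, a, 3, "b3", memory_limit=3)  # a3 is now on disk, so no match here
```

The last call shows why the later stages exist: `b3` and `a3` share a key but never met in memory, so the first stage alone would miss that result.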
17
Why Stage 1?
- In-memory operations are much faster and cheaper than on-disk operations
- Running the in-memory join first therefore produces results as early as possible
18
Question: What does the 2nd Stage do? When does the 2nd Stage start?
Hint: What happens when the input data (tuples) is too large for memory?
Answer: The 2nd Stage joins memory to disk. It starts when both inputs are blocked.
19
Stage 2
[Figure: the disk-resident portion of partition i of source A (DP_iA) is read from DISK and joined with the memory-resident portion of partition i of source B (MP_iB) in MEMORY; matches go to Output.]
20
2nd Stage: Memory-to-Disk Join
Activated when the 1st Stage is blocked. Performs 3 steps:
1. Choose a partition from one source according to throughput and partition size.
2. Use tuples from the d-r portion to probe the m-r portion of the other source and output matches, until the d-r portion is completely processed.
3. Check whether either input has resumed producing tuples. If yes, resume the 1st Stage. If no, choose another d-r portion and continue the 2nd Stage.
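The three steps can be sketched as a loop. This is an illustration under stated assumptions: `stage2`, `input_resumed`, and `choose` are hypothetical names, partitions are plain dictionaries, the "largest d-r portion first" policy is only a stand-in for the throughput-and-size criterion, and duplicate suppression via timestamps is omitted.

```python
def stage2(disk_a, memory_b, input_resumed, choose):
    """Join disk-resident portions of A with memory-resident portions of B.

    disk_a / memory_b map a partition number to a list of (key, tuple) pairs.
    """
    results = []
    remaining = set(disk_a)
    while remaining:
        # Step 1: choose a disk-resident portion to process.
        p = choose(remaining)
        remaining.discard(p)
        # Step 2: probe the other source's memory-resident portion with
        # every tuple of the chosen portion, until it is fully processed.
        for key_a, tup_a in disk_a[p]:
            for key_b, tup_b in memory_b.get(p, []):
                if key_a == key_b:
                    results.append((tup_a, tup_b))
        # Step 3: only now check whether an input has resumed.
        if input_resumed():
            break
    return results


disk_a = {0: [(2, "a2")], 1: [(1, "a1"), (3, "a3")]}
memory_b = {0: [(2, "b2")], 1: [(3, "b3")]}


def largest_first(parts):
    # Assumed policy: the portion with the most tuples on disk goes first.
    return max(parts, key=lambda p: len(disk_a[p]))


stalled = stage2(disk_a, memory_b, input_resumed=lambda: False, choose=largest_first)
interrupted = stage2(disk_a, memory_b, input_resumed=lambda: True, choose=largest_first)
```

In the `interrupted` run the input resumes immediately, yet partition 1 is still processed to the end before control returns to the first stage.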
21
Controlling the 2nd Stage
The cost of the 2nd Stage is hidden when both inputs experience delays.
Tradeoffs:
- Benefits: produces results while the input sources are stalled; allows varying input rates
- Disadvantage: the second stage must complete a d-r portion before checking for new input (overhead)
To address the tradeoff, use an activation threshold: pick a partition likely to produce many tuples right now.
22
3rd Stage: Disk-to-Disk Join
Clean-up stage:
- Assumes that all data for both inputs has arrived
- Assumes that the 1st and 2nd stages have completed
Why is this step necessary?
- Completeness of the answer: it makes sure that all result tuples are produced
- Reason: some tuples in the disk-resident portions may not have had a chance to join each other
23
Preventing Duplicates
- When could duplicates be produced? In both the 2nd and 3rd stages, which may perform overlapping work.
- How is this addressed? XJoin prevents duplicates with timestamps.
- When is this addressed? During processing, when trying to join two tuples.
24
Time Stamping: Part 1
Two fields are added to each tuple:
- Arrival TimeStamp (ATS): the time when the tuple first arrived in memory
- Departure TimeStamp (DTS): the time when the tuple was flushed to disk
[ATS, DTS] is the interval during which the tuple was in memory.
When were two tuples joined in the 1st stage? When their [ATS, DTS] intervals overlap, for example when Tuple B's ATS falls within Tuple A's [ATS, DTS].
Tuples that meet this overlap condition are not considered for joining in the 2nd or 3rd stage.
25
Detecting Tuples Joined in 1st Stage
Overlapping (tuples joined in first stage):
  Tuple A:  ATS 102, DTS 234
  Tuple B1: ATS 178, DTS 198
  B1 arrived after A and before A was flushed to disk.
Non-overlapping (tuples not joined in first stage):
  Tuple A:  ATS 102, DTS 234
  Tuple B2: ATS 348, DTS 601
  B2 arrived after A and after A was flushed to disk.
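The overlap test can be written as a standard interval-intersection check, applied here to the timestamps from this slide (A: ATS 102, DTS 234; B1: ATS 178, DTS 198; B2: ATS 348, DTS 601). The `Stamped` class and the function name are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Stamped:
    ats: int  # arrival timestamp: when the tuple entered memory
    dts: int  # departure timestamp: when the tuple was flushed to disk


def joined_in_stage1(a, b):
    """True if both tuples were in memory at the same time."""
    return a.ats <= b.dts and b.ats <= a.dts


A = Stamped(ats=102, dts=234)
B1 = Stamped(ats=178, dts=198)
B2 = Stamped(ats=348, dts=601)

b1_overlaps = joined_in_stage1(A, B1)  # B1 arrived while A was still in memory
b2_overlaps = joined_in_stage1(A, B2)  # B2 arrived after A had been flushed
```

A later stage would skip the pair (A, B1) because the first stage already produced it, and would still consider (A, B2).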
26
Time Stamping: Part 2
For each partition, keep track of:
- ProbeTS: the time when a 2nd-stage probe was done
- DTS_last: the DTS of the last tuple of the disk-resident portion
Several such probes may occur, so keep an ordered history of these probe descriptors.
Meaning: all tuples flushed up to and including time DTS_last were joined in stage 2 with all tuples in main memory at time ProbeTS.
27
Detecting Tuples Joined in 2nd Stage
All A tuples in Partition 2 up to DTS_last 350 were joined with m-r tuples that arrived before Partition 2's ProbeTS.
[Figure: history list of (DTS_last, ProbeTS) entries for the corresponding partition (Partition 2), checked against Tuple A (ATS 100, DTS 200) and Tuple B (ATS 500, DTS 600); the matching entry is marked "overlap".]
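One way to read the history-list check is sketched below. The function name, the representation of the history as `(dts_last, probe_ts)` pairs, and the ProbeTS value 550 are assumptions made for illustration; only the DTS_last value 350 and the two tuples' timestamps (A: ATS 100, DTS 200; B: ATS 500, DTS 600) come from the slide. The sketch also assumes that "in main memory at time ProbeTS" means ATS <= ProbeTS <= DTS.

```python
def joined_in_stage2(disk_dts, mem_ats, mem_dts, history):
    """True if some recorded 2nd-stage probe already joined the two tuples.

    disk_dts: DTS of the tuple that was read from the disk-resident portion.
    mem_ats, mem_dts: [ATS, DTS] of the tuple that was probed in memory.
    history: ordered list of (dts_last, probe_ts) probe descriptors.
    """
    return any(
        disk_dts <= dts_last and mem_ats <= probe_ts <= mem_dts
        for dts_last, probe_ts in history
    )


# Hypothetical history for Partition 2: one probe at time 550 that covered
# every disk-resident tuple flushed up to time 350.
history = [(350, 550)]

already_joined = joined_in_stage2(disk_dts=200, mem_ats=500, mem_dts=600, history=history)
flushed_too_late = joined_in_stage2(disk_dts=400, mem_ats=500, mem_dts=600, history=history)
arrived_too_late = joined_in_stage2(disk_dts=200, mem_ats=560, mem_dts=600, history=history)
```

Pairs for which the check returns true are skipped when the partition is processed again in the 2nd or 3rd stage.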
28
Experiments
Algorithms compared:
- HHJ (Hybrid Hash Join)
- XJoin (with 2nd stage and with caching)
- XJoin (without 2nd stage)
- XJoin (with aggressive usage of 2nd stage)
29
Case 1: Slow Network (Both Sources Are Slow)
30
Case 1: Slow Network (Both Sources Are Slow, Bursty)
- XJoin improves the delivery time of initial answers, which improves interactive performance
- Reactive background processing is an effective way to exploit intermittent delays and keep producing output
- Shows that the 2nd stage is very useful if there is time for it
31
Case 2: Fast Network (Both Sources Are Fast)
32
Case 2: Fast Network (Both Sources Are Fast)
- All XJoin variants deliver initial results earlier
- XJoin can also deliver the overall result in the same time as HHJ
- HHJ delivers the second half of the result faster than XJoin
- The 2nd stage cannot be used too aggressively if new data is coming in continuously
33
Conclusion
- Can be conservative on space (small footprint)
- Can produce initial results as early as possible
- Can hide intermittent data delays
- Can be used in conjunction with online query processing to manage data streams (limited)
34
How to Further Optimize XJoin?
- Resume Stage 1 as soon as data arrives
- Remove no-longer-joining tuples in a timely manner
- Other ideas? …
35
References
- Urhan, Tolga and Franklin, Michael J. "XJoin: Getting Fast Answers From Slow and Bursty Networks."
- Urhan, Tolga and Franklin, Michael J. "XJoin: A Reactively-Scheduled Pipelined Join Operator."
- Hellerstein, Franklin, Chandrasekaran, Deshpande, Hildrum, Madden, Raman, and Shah. "Adaptive Query Processing: Technology in Evolution." IEEE Data Engineering Bulletin, 2000.
- Avnur, Ron and Hellerstein, Joseph M. "Eddies: Continuously Adaptive Query Processing."
- Babu, Shivnath and Widom, Jennifer. "Continuous Queries over Data Streams."
36
Stream: New Query Context
Challenges faced by XJoin:
- Potentially unbounded, growing join state
- Indefinite delay of some join results
Solutions:
- Exploit semantic constraints to remove no-longer-joining data in a timely manner
- Constraints: sliding windows, punctuations
37
Punctuation
A punctuation is a predicate on stream elements that evaluates to false for every element following the punctuation.
  ID       | Name   | Age
  9961234  | Edward | 17
  9961235  | Justin | 19
  9961238  | Janet  | 18
  *        | *      | (0, 18]
  9961256  | Anna   | 20
  …
The punctuation (*, *, (0, 18]) says: no more tuples for students whose age is less than or equal to 18!
38
An Example
Open Stream:
  item_id | seller_id | open_price | timestamp
  1080    | jsmith    | 130.00     | Nov-10-03 9:03:00
  1082    | melissa   | 20.00      | Nov-10-03 9:10:00
  …
Bid Stream:
  item_id | bidder_id | bid_price | timestamp
  1080    | pclover   | 175.00    | Nov-14-03 8:27:00
  1082    | smartguy  | 30.00     | Nov-14-03 8:30:00
  1080    | richman   | 177.00    | Nov-14-03 8:52:00
  …
Query: For each item that has at least one bid, return its bid-increase value.
  Select O.item_id, Sum (B.bid_price - O.open_price)
  From Open O, Bid B
  Where O.item_id = B.item_id
  Group by O.item_id
Plan: Open Stream and Bid Stream feed Join (item_id), which produces Out 1 (item_id); Group-by item_id (sum(…)) then produces Out 2 (item_id, sum).
Punctuation on the Bid Stream: No more bids for item 1080!
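To make the effect of the punctuation concrete, the sketch below evaluates the query over only the rows shown on this slide and then purges the join state once "no more bids for item 1080" arrives. The variable names and the `purge_on_punctuation` helper are assumptions of this illustration, and the totals cover just the visible rows, since the streams continue past the ellipses.

```python
from collections import defaultdict

# Rows visible on the slide: (item_id, user, price).
open_rows = [(1080, "jsmith", 130.00), (1082, "melissa", 20.00)]
bid_rows = [(1080, "pclover", 175.00), (1082, "smartguy", 30.00), (1080, "richman", 177.00)]

# Join state for the Open stream, keyed on the join attribute item_id.
open_state = {item_id: price for item_id, _seller, price in open_rows}

# Join on item_id, then group by item_id and sum (bid_price - open_price).
bid_increase = defaultdict(float)
for item_id, _bidder, bid_price in bid_rows:
    if item_id in open_state:
        bid_increase[item_id] += bid_price - open_state[item_id]


def purge_on_punctuation(state, finished_keys):
    """Drop state entries that can no longer join with the other stream."""
    return {k: v for k, v in state.items() if k not in finished_keys}


# Punctuation from the Bid stream: no more bids for item 1080.
open_state = purge_on_punctuation(open_state, finished_keys={1080})
```

After the purge, the Open tuple for item 1080 no longer occupies join state, and the group for item 1080 can be emitted as final because no further bid can change its sum.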
39
PJoin Execution Logic
[Figure: tuple arrival. A tuple t_a from Stream A with Hash(t_a) = 1 enters the join state, which has a memory-resident portion and a disk-resident portion, each organized as hash tables holding the State of Stream A (S_a) and the State of Stream B (S_b). Also shown: the punctuation sets PS_a and PS_b (for example "<10") and the purge candidate pool.]
40
PJoin Execution Logic
[Figure: punctuation arrival. A punctuation p_a from Stream A with Hash(p_a) = 1 enters the same structure: join state with memory-resident and disk-resident portions, hash tables for the State of Stream A (S_a) and the State of Stream B (S_b), the punctuation sets PS_a and PS_b (for example "<10"), and the purge candidate pool.]
41
PJoin vs. XJoin: Memory Overhead
- Tuple inter-arrival: 2 milliseconds
- Punctuation inter-arrival: 40 tuples/punctuation
42
PJoin vs. XJoin: Tuple Output Rate
- Tuple inter-arrival: 2 milliseconds
- Punctuation inter-arrival: 30 tuples/punctuation
43
Conclusion
- The memory requirement for PJoin's join state is almost insignificant compared to XJoin's
- The growing join state of XJoin increases probe cost, which lowers its tuple output rate
- Eager purge is the best strategy for minimizing join state
- Lazy purge with an appropriate purge threshold provides a significant advantage in tuple output rate