Slide 1: COMPUTING ON JETSTREAM: STREAMING ANALYTICS IN THE WIDE-AREA
Matvey Arye. Joint work with Ari Rabkin, Sid Sen, Mike Freedman, and Vivek Pai.
Slide 2: THE RISE OF GLOBAL DISTRIBUTED SYSTEMS
[Image: a CDN]
Slide 3: TRADITIONAL ANALYTICS
[Image: a CDN feeding a centralized database]
Slide 4: BANDWIDTH IS EXPENSIVE
[Chart: price trends 2005-2008, from Above the Clouds, Armbrust et al.]
Slide 5: BANDWIDTH TRENDS
[Chart: TeleGeography's Global Bandwidth Research Service]
Slide 6: BANDWIDTH TRENDS
[Chart: TeleGeography's Global Bandwidth Research Service, with a "20%" annotation]
Slide 7: BANDWIDTH COSTS
- Amazon EC2 bandwidth: $0.05 per GB
- Wireless broadband: $2 per GB
- Cell-phone broadband (AT&T/Verizon): $6 per GB (other providers are similar)
- Satellite bandwidth: $200-$460 per GB; may drop to ~$20
Slide 8: THIS APPROACH IS NOT SCALABLE
[Image: a CDN feeding a centralized database]
Slide 9: THE COMING FUTURE: DISPERSED DATA
[Image: dispersed databases]
Slide 10: WIDE-AREA COMPUTER SYSTEMS
- Web services: CDNs, ad services, IaaS, social media
- Infrastructure: energy grid
- Military: global networks, drones/UAVs, surveillance
Slide 11: NEED QUERIES ON A GLOBAL VIEW
- CDNs: popularity of websites globally; tracking security threats
- Military: threat "chatter" correlation; big-picture view of the battlefield
- Energy grid: wide-area view of energy production and expenditure
Slide 12: SOME QUERIES ARE EASY
Example: alert me when servers crash.
[Diagram: a "server crashed" alert]
Slide 13: OTHERS ARE HARD
Example: how popular are all of my domains? All of my URLs?
[Diagram: requests flowing through multiple CDN nodes]
Slide 14: BEFORE JETSTREAM
[Chart: bandwidth needed for backhaul over two days, against a fixed "95% level" of provisioning. Annotations: analyst's remorse (not enough data), wasted bandwidth, buyer's remorse (system overload or overprovisioning).]
Slide 15: WHAT HAPPENS DURING OVERLOAD?
[Charts over one day: the bandwidth needed for backhaul exceeds the bandwidth available; latency climbs because the queue size grows without bound.]
Slide 16: THE JETSTREAM VISION
[Chart: over two days, the bandwidth used by JetStream stays within what is available, even when the bandwidth needed for backhaul exceeds it.]
JetStream lets programs adapt to bandwidth shortages and backfill later. This requires new abstractions for programmers.
Slide 17: SYSTEM ARCHITECTURE
[Diagram: a coordinator with a planner and library accepts a query graph through the JetStream API and produces an optimized query; the control plane distributes it to daemons on worker nodes at several sites, whose operators and stream sources form the data plane.]
Slide 18: AN EXAMPLE QUERY
[Diagram: at Sites A and B, a file-read operator feeds a parse-log-file operator into local storage; a query runs every 10 s at each site and sends results to central storage at Site C.]
Slide 19: ADAPTIVE DEGRADATION
- Feedback control decides when to degrade.
- User-defined policies decide how to degrade the data.
[Diagram: local data flows through dataflow operators, which send summarized or approximated data across the network under feedback control.]
Slide 20: MONITORING AVAILABLE BANDWIDTH
Data sources insert time markers into the data stream every k seconds. The network monitor records the time t it took to process each marked interval; k/t estimates the available capacity.
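A minimal sketch of the marker-based estimate described above (the function and parameter names are illustrative, not JetStream's API):

```python
def estimate_capacity_ratio(marker_interval_s: float, drain_time_s: float) -> float:
    """Estimate capacity from time markers inserted every k seconds.

    If an interval representing k seconds of source time takes t seconds
    to move through the network, k/t estimates available capacity relative
    to the offered load: > 1 means spare capacity, < 1 means overload.
    """
    if drain_time_s <= 0:
        raise ValueError("drain time must be positive")
    return marker_interval_s / drain_time_s
```

For example, if markers arrive every 5 seconds of source time but each interval takes 20 seconds to send, the ratio is 0.25: the source is sending 4x more than the link can carry.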
Slide 21: WAYS TO DEGRADE DATA
- Drop low-rank values
- Coarsen a dimension
Slide 22: AN INTERFACE FOR DEGRADATION (I)
First attempt: a policy is specified by choosing an operator. Operators read the congestion sensor and respond.
[Diagram: incoming data passes through a coarsening operator before the network; the sensor reports "sending 4x too much".]
Slide 23: COARSENING REDUCES DATA VOLUMES
Before coarsening (time, URL, count):
01:01:01  foo.com/a   1
01:01:01  foo.com/b  10
01:01:01  foo.com/c   5
01:01:02  foo.com/a   2
01:01:02  foo.com/b  15
01:01:02  foo.com/c  20
After coarsening time to the minute:
01:01:*   foo.com/a   3
01:01:*   foo.com/b  25
01:01:*   foo.com/c  25
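The coarsening step on this slide can be sketched as follows (a simplified illustration using the slide's data; not JetStream's implementation):

```python
from collections import defaultdict

def coarsen_time(rows):
    """Coarsen the seconds field of each timestamp to '*' and sum counts.

    rows: iterable of (timestamp, url, count) tuples, e.g.
    ("01:01:01", "foo.com/a", 1). Rows that collapse onto the same
    coarsened key are aggregated, reducing data volume.
    """
    out = defaultdict(int)
    for ts, url, count in rows:
        coarse_ts = ts.rsplit(":", 1)[0] + ":*"  # "01:01:01" -> "01:01:*"
        out[(coarse_ts, url)] += count
    return dict(out)

rows = [
    ("01:01:01", "foo.com/a", 1), ("01:01:01", "foo.com/b", 10),
    ("01:01:01", "foo.com/c", 5), ("01:01:02", "foo.com/a", 2),
    ("01:01:02", "foo.com/b", 15), ("01:01:02", "foo.com/c", 20),
]
# Six input rows collapse to three coarsened rows, as on the slide.
```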
Slide 24: BUT NOT ALWAYS
Before coarsening (time, URL, count):
01:01:01  foo.com/a   1
01:01:01  foo.com/b  10
01:01:01  foo.com/c   5
01:01:02  bar.com/a   2
01:01:02  bar.com/b  15
01:01:02  bar.com/c  20
After coarsening time to the minute, no rows collapse:
01:01:*   foo.com/a   1
01:01:*   foo.com/b  10
01:01:*   foo.com/c   5
01:01:*   bar.com/a   2
01:01:*   bar.com/b  15
01:01:*   bar.com/c  20
Slide 25: DEPENDS ON LEVEL OF COARSENING
[Chart: data from CoralCDN logs]
Slide 26: GETTING THE MOST DATA QUALITY FOR THE LEAST BANDWIDTH
Issue: some degradation techniques give good quality but have unpredictable savings.
Solution: use multiple techniques:
- Start with the technique that gives the best quality.
- Supplement with other techniques when bandwidth is scarce.
This keeps latency bounded and minimizes analyst's remorse.
Slide 27: ALLOWING COMPOSITE POLICIES
- Chaos results if two operators respond to the same sensor simultaneously.
- Operator placement is constrained in ways that don't match the degradation policy.
[Diagram: both a coarsening operator and a sampling operator sit between the incoming data and the network; the sensor reports "sending 4x too much".]
Slide 28: INTRODUCING A CONTROLLER
Introduce a controller for each network connection that determines which degradations to apply. Degradation policies are attached to each controller, so the policy is no longer constrained by the operator topology.
[Diagram: sensing 4x too much being sent, the controller tells the coarsening and sampling operators to drop 75% of the data.]
Slide 29: DEGRADATION
Slide 30: MERGEABILITY IS NONTRIVIAL
Data can't be cleanly unified at arbitrary degradation levels, so degradation operators need fixed levels.
- Windows of 5 (01-05, 06-10, 11-15, 16-20, 21-25, 26-30) merge cleanly into windows of 10 (01-10, 11-20, 21-30) and into a window of 30 (01-30).
- Windows of 6 (01-06, 07-12, 13-18, 19-24, 25-30) do not align with windows of 10.
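The alignment condition behind the fixed-level requirement can be stated in a few lines (an illustrative check, not JetStream code):

```python
def mergeable(fine_width: int, coarse_width: int) -> bool:
    """Fixed-width buckets merge cleanly only when the coarser bucket
    width is an exact multiple of the finer one, so that no fine bucket
    straddles a coarse-bucket boundary."""
    return coarse_width % fine_width == 0

# Windows of 5 roll up into windows of 10 or 30; windows of 6 do not
# align with windows of 10 (01-06 and 07-12 both straddle 01-10 / 11-20).
```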
Slide 31: INTERFACING WITH THE CONTROLLER
[Diagram: each operator reports its current effect ("shrinking data by 50%") and its possible levels [0%, 50%, 75%, 95%, …]; sensing 4x too much being sent, the controller replies "go to level 75%".]
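A toy version of the controller's decision, using the level list and the 4x overload figure from the slides (the function and its policy are illustrative assumptions, not JetStream's actual algorithm):

```python
def pick_level(overload_ratio, levels=(0.0, 0.50, 0.75, 0.95)):
    """Pick the smallest configured degradation level that sheds enough data.

    overload_ratio: offered load divided by available capacity.
    Sending r times too much means a fraction 1 - 1/r must be dropped.
    """
    needed = 1.0 - 1.0 / overload_ratio
    for level in levels:
        if level >= needed:
            return level
    return levels[-1]  # best effort when even the top level is not enough

# Sending 4x too much -> must shed 75% -> "go to level 75%".
```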
Slide 32: A PLANNER FOR POLICY
Query planners map a query plus a data distribution to an execution plan. Why not do the same for degradation policy? But what is the "query"? For us, the policy affects data ingestion, and therefore all subsequent queries.
Planning: all potential queries + data distribution => policy.
Slide 33: EXPERIMENTAL SETUP
80 nodes on the VICCI testbed in the US (including Princeton) and Germany.
Policy: drop data if there is insufficient bandwidth.
Slide 34: WITHOUT ADAPTATION
[Chart: behavior under bandwidth shaping]
Slide 35: WITH ADAPTATION
[Chart: behavior under bandwidth shaping]
Slide 36: COMPOSITE POLICIES
Slide 37: OPERATING ON DISPERSED DATA
[Image: dispersed databases]
Slide 38: CUBE DIMENSIONS
[Diagram: a cube with a time dimension (01:01:00, 01:01:01, …) and a URL dimension (foo.com/r, foo.com/q, bar.com/n, bar.com/m).]
Slide 39: CUBE AGGREGATES
[Diagram: each cell, e.g. (bar.com/m, 01:01:01), holds aggregates such as request count and max latency.]
Slide 40: CUBE ROLLUP
[Diagram: rolling up the URL dimension maps foo.com/r and foo.com/q to foo.com/*, and bar.com/n and bar.com/m to bar.com/*.]
Slide 41: FULL HIERARCHY
[Diagram: cell values (count, max latency) of (5,90), (3,75), (8,199), and (21,40) roll up per domain to (8,90) and (29,199), and to (37,199) at URL: *, Time: 01:01:01.]
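A rollup over the URL dimension, using the slide's cell values, can be sketched like this (a hypothetical mini-cube, not JetStream's data structure; count aggregates by sum, latency by max):

```python
def rollup_url(cells):
    """Roll the URL dimension up to '*'.

    cells: dict mapping (time, url) -> (request_count, max_latency).
    Combines all cells sharing a timestamp: counts are summed,
    latencies are combined with max.
    """
    out = {}
    for (ts, _url), (cnt, lat) in cells.items():
        key = (ts, "*")
        if key in out:
            c, l = out[key]
            out[key] = (c + cnt, max(l, lat))
        else:
            out[key] = (cnt, lat)
    return out

cells = {
    ("01:01:01", "bar.com/m"): (5, 90),
    ("01:01:01", "bar.com/n"): (3, 75),
    ("01:01:01", "foo.com/q"): (8, 199),
    ("01:01:01", "foo.com/r"): (21, 40),
}
# Rolling up yields (37, 199) at (URL: *, Time: 01:01:01), as on the slide.
```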
Slide 42: RICH STRUCTURE
[Diagram: cells A-E marked on a cube with URL and time dimensions (times 01:01:00 through 01:01:59); cell values include (5,90), (3,75), (8,199), (21,40).]
Cell  URL        Time
A     bar.com/*  01:01:01
B     *
C     foo.com/*  01:01:01
D     foo.com/r  01:01:*
E     foo.com/*  01:01:*
Slide 43: TWO KINDS OF AGGREGATION
1. Rollups: across dimensions
2. Inserts: across sources
The data cube model constrains the system to use the same aggregate function for both. Constraint: no queries on tuple arrival order. This makes reasoning easier!
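One payoff of using a single aggregate function for both rollups and inserts is that merging becomes order-independent when the function is associative and commutative, as the (count, max-latency) pair from the earlier slides is. A small illustration (hypothetical helper, not JetStream code):

```python
def merge(a, b):
    """Combine two (request_count, max_latency) aggregates.

    The same function serves rollups (across dimensions) and inserts
    (across sources): counts add, latencies take the max. Since this is
    associative and commutative, tuple arrival order cannot matter.
    """
    return (a[0] + b[0], max(a[1], b[1]))

site_a = (5, 90)   # (count, max_latency) contributed by one source
site_b = (3, 75)
# Merging in either order gives the same cell value, (8, 90).
```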
Slide 44: AN EXAMPLE QUERY (REVISITED)
[Diagram: at Sites A and B, a file-read operator feeds a parse-log-file operator into local storage; a query runs every 10 s at each site and sends results to central storage at Site C.]
Slide 45: SUBSCRIBERS
Subscribers extract data from cubes to send downstream, and control the latency-vs-completeness tradeoff.
[Diagram: at Site A, a subscriber queries local storage every 10 s and forwards the results.]
Slide 46: SUBSCRIBER API
- Notified of every tuple inserted into the cube
- Can slice and roll up cubes
Possible policies:
- Wait for all upstream nodes to contribute
- Wait for a timer to go off
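The two policies above can be combined into a single emit decision, sketched below (an illustrative predicate, not the actual subscriber API):

```python
def should_emit(seen_sources, expected_sources, timer_fired):
    """Decide whether a subscriber should emit the current window.

    Emit once every expected upstream source has contributed (complete),
    or when the timer fires (partial, but latency stays bounded). This is
    the latency-vs-completeness tradeoff the subscriber controls.
    """
    complete = set(seen_sources) >= set(expected_sources)
    return complete or timer_fired
```

For example, with sources {A, B, C}, a window holding data from only A and B is held back until either C arrives or the timer fires.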
Slide 47: FUTURE WORK
- Reliability
- Individual queries: statistical methods, multi-round protocols
- Currently working on improving top-k
- Fairness that gives the best data quality
Thanks for listening!