Published by Lucille Fortier. Modified over 6 years ago.
1
dQUOB: decision-making over streaming data with SQL queries
Beth Plale
Computer Science Dept., Indiana University, Bloomington, IN
04 December 2002, I-Light
2
Data streams are flows of timestamped events that are manipulated by means of continuously executing queries.
Continuous query work: Fjords (Franklin), NiagaraCQ (DeWitt), STREAM (Widom), dQUOB (Plale)
3
Visualizing Doppler Radar Flows
[Diagram: Doppler radar sites at Peachtree City, GA; Grier, SC; and Hytop, AL stream raw Level 2 sweep data through Transform, Filter, Aggregate steps. Other labels: streamed sweep data, archived sweep data, data archive, discards.]
4
Quick Summary of dQUOB toolkit features
SQL queries coupled with user-defined functions (e.g., FFT, data reduction).
Assumes the data stream is a timestamped sequence of events: event == tuple, data stream == relation.
Supports time-based stream join: two events satisfy a join if they ‘happen at the same time’.
Applied to: large scale scientific instruments, scientific visualization, hazard detection.
Major contribution: thinking about a data stream as though it were a database.
Hard work: converting a query/action rule into an entity that executes independent of a database (over a data stream instead).
Hard work: utilizing query optimization heuristics to improve and deal with dynamics.
Related work (Saltz; Active Streams) also combines SQL queries, computation, and data streams, but Saltz's queries execute on a database, whereas dQUOB queries are portable.
5
dQUOB toolkit components
[Diagram labels: repository; dQUOB SQL compiler server; query as Tcl script; quoblet with dQUOB library placed in the data stream; scientific model, remote sensors; viz tool, downstream client.]
-- SQL queries are dropped into the data stream to filter, aggregate, and transform events.
-- Queries execute continuously over arriving events.
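The idea of a query that executes continuously over arriving events can be sketched with a Python generator. This is only an illustration, not the dQUOB library: the `Event` fields, the `continuous_query` helper, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Event:
    """One timestamped tuple in the stream (field names are hypothetical)."""
    timestamp: float      # seconds
    site: str
    reflectivity: float


def continuous_query(
    stream: Iterable[Event],
    predicate: Callable[[Event], bool],
    transform: Callable[[Event], Tuple],
) -> Iterator[Tuple]:
    """Filter and transform each event as it arrives, without storing the stream."""
    for event in stream:
        if predicate(event):          # plays the role of a WHERE clause
            yield transform(event)    # plays the role of a SELECT list


# Made-up sample events, used only to exercise the sketch.
events = [
    Event(0.0, "A", 12.0),
    Event(1.0, "B", 48.0),
    Event(2.0, "A", 55.0),
]

results = list(
    continuous_query(
        iter(events),
        predicate=lambda e: e.reflectivity > 40.0,
        transform=lambda e: (e.timestamp, e.site),
    )
)
print(results)
```

Because the generator is lazy, each event is handled once when it arrives and then discarded, which is the property that lets a query sit in the stream rather than in a database.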
6
Architecture of quoblet for executing queries at runtime
[Diagram: a quoblet sits on the event stream between provider and consumer. Inside the quoblet, components labeled dispatcher, QM, and QN connect the input streams to the output streams.]
7
Installing a query at a remote site
[Diagram: seven numbered steps connecting a Web portal, the query server/repository, the client application, and the quoblet on the data streams. Step labels: enter query; add query, authenticate user; connect established; ACK; callback; install query; activate query.]
8
Research issue: when R and S are asynchronous streams and stream S is slow and erratic, a count-based join window leads to unneeded memory consumption.
[Diagram: streams R (events a to l) and S (events 1 to 5) over time, with the join window expressed as a count.]
Issue 2: it is difficult for the user to pick the right join window size. The cost of error is great: too large a window consumes memory; too small a window increases false negatives.
9
Approach to asynchronous stream problem: express join window size as interval of time
[Diagram: streams R (events a to l) and S (events 1 to 5) plotted against time; the count-based join window is replaced by a time-based join window of 10 sec.]
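A join window expressed as an interval of time can be sketched as follows. This is a minimal illustration under stated assumptions, not the dQUOB implementation: the function name `time_window_join`, the tuple layout, and the sample arrivals are hypothetical, and events are assumed to arrive in timestamp order.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Iterator, Tuple

Event = Tuple[float, str]          # (timestamp in seconds, payload)
Arrival = Tuple[str, Event]        # (stream name "R" or "S", event)


def time_window_join(
    arrivals: Iterable[Arrival], window: float
) -> Iterator[Tuple[Event, Event]]:
    """Yield (r_event, s_event) pairs whose timestamps differ by at most `window`.

    Assumes `arrivals` is ordered by timestamp across both streams.
    """
    buffers: Dict[str, Deque[Event]] = {"R": deque(), "S": deque()}
    for name, event in arrivals:
        now = event[0]
        # Expire by age, not by count: a slow stream holds few events,
        # a fast stream holds only what arrived within the last `window` seconds.
        for buf in buffers.values():
            while buf and now - buf[0][0] > window:
                buf.popleft()
        other = "S" if name == "R" else "R"
        for candidate in buffers[other]:
            yield (event, candidate) if name == "R" else (candidate, event)
        buffers[name].append(event)


# Made-up arrivals: R is steady, S is slow and erratic.
arrivals = [
    ("R", (0.0, "a")),
    ("R", (4.0, "b")),
    ("S", (5.0, "1")),
    ("R", (12.0, "c")),
    ("S", (30.0, "2")),
    ("R", (31.0, "d")),
]

pairs = [(r[1], s[1]) for r, s in time_window_join(arrivals, window=10.0)]
print(pairs)
```

With a 10 second window, the buffered events are bounded by how much arrives in 10 seconds on each stream, so a long silence on S empties both buffers instead of pinning old R events in memory.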
10
Startup Latency Evaluation
Purpose: measure startup latency under various runtime environments.
Response latency – interval between when a complex event occurs and when a response is initiated at the remote site.
Runtime environment: worker processes started up using either PBS or Linux OS real time scheduling features.
11
Experimental Environment
Proto-AVIDD cluster: 5 IA32 nodes, housed in the IU University Information Technology Services (UITS) building
Remote sensor server at Morgan Monroe State Forest (20 miles north of IU): radio link to IU, 4.3 Mbits/sec (iperf 6.2)
Remote sensor server at IU Computer Science: Dual Poweredge 6400 server, 94.1 Mbits/sec (iperf 6.2)
12
Response to Weather Condition Detection
1. Condition detected at the event detection server
2. Activate high-res sampling at remote source (remote gathering server, Morgan Monroe State Forest)
3. Activate workers (worker 1, worker 2, worker 3) to process high-res data
[Diagram also shows the event cache and the storage server.]
13
Methods Evaluated
PBS/Qsub – submitting via PBS using the Qsub command line interface, no running jobs
PBS/Qsub/preempt – Qsub interface, existing jobs running under PBS on worker processors that must be preempted
Linux real time/RSH – Linux real time scheduling and RSH
Linux real time/SSH – Linux real time scheduling and SSH
14
PBS
PBSpro 5.2 preemption
Queue priority
SIGSTOP/SIGCONT: quickest, no swapping to disk
Checkpointing: safest
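The SIGSTOP/SIGCONT mechanism named on this slide can be illustrated with plain POSIX signals. This sketch shows only the suspend and resume signals themselves; it does not reproduce how PBSpro applies them, and the `suspend` and `resume` helpers and the sleeping stand-in job are hypothetical.

```python
import os
import signal
import subprocess
import sys


def suspend(pid: int) -> None:
    """Stop a process in place; this mechanism writes no state to disk."""
    os.kill(pid, signal.SIGSTOP)


def resume(pid: int) -> None:
    """Let a previously stopped process continue where it left off."""
    os.kill(pid, signal.SIGCONT)


# Stand-in for an existing low-priority job on a worker node.
job = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

suspend(job.pid)
# ... urgent work would run here while the job is stopped ...
resume(job.pid)

still_running = job.poll() is None   # the job survived the stop/continue cycle
job.terminate()                      # clean up the stand-in job
job.wait()
print(still_running, job.returncode)
```

Stopping a job this way is fast because nothing is saved or restored, which is consistent with the slide's contrast between this option and checkpointing.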
15
Real time scheduling in Linux
Using sched_setscheduler()
Not “hard” real time.
rt_priority: like “nice”; integer 0-99, higher number = higher priority
Preempts non-real time processes
Round robin scheduling within priority (SCHED_RR)
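A minimal sketch of requesting SCHED_RR from Python on Linux, through the standard library wrapper around sched_setscheduler(). The helper name `try_round_robin` and the priority value 10 are assumptions for illustration; the call normally needs root or CAP_SYS_NICE, so the sketch reports failure instead of raising. Note that the kernel accepts priorities 1 to 99 for SCHED_RR; 0 is reserved for the non-real time policies.

```python
import os


def try_round_robin(priority: int = 10) -> bool:
    """Try to move the calling process to SCHED_RR; return True on success."""
    lowest = os.sched_get_priority_min(os.SCHED_RR)
    highest = os.sched_get_priority_max(os.SCHED_RR)
    # Clamp the request into the range the kernel accepts for this policy.
    priority = max(lowest, min(highest, priority))
    try:
        # pid 0 means "the calling process".
        os.sched_setscheduler(0, os.SCHED_RR, os.sched_param(priority))
    except OSError:
        # Typically PermissionError (EPERM) when run without privileges.
        return False
    return True


became_rr = try_round_robin(priority=10)
print("running under SCHED_RR:", became_rr)
```

Once the call succeeds, the process preempts any process under the default time-sharing policy and shares the CPU in round robin fashion with other SCHED_RR processes at the same priority.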
16
Key Events in Detection/Response Processing
Timestep: Event Definition
t0: Weather condition first detected at server, determined from low resolution event data
t1: First high resolution weather event received at AVIDD
t2: Worker ‘up and ready’; instant prior to opening connection back to remote client
t3: Worker receives first high resolution weather event
17
Experiment Scenario
[Diagram labels: worker 1, worker 2, worker 3; event detection server and cache (IU CS dept); remote gathering server and cache (State Forest, IU CS dept); link speeds 94.1 Mbits/sec and 4.3 Mbits/sec.]
18
Startup latencies for PBS Scheduling case
Columns: Interval; PBS/Qsub LAN (sec); PBS/Qsub remote (sec)
Rows: {t0 – t1}, {t0 – t2}, {t2 – t3}
{t0 – t1}: interval between detection and activation of high-rate stream
{t0 – t2}: interval between detection and worker startup
{t2 – t3}: worker delay before receiving first event
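The three intervals in the legend above are simple differences of the key-event timestamps t0 to t3. A small sketch of that arithmetic follows; the `Timeline` class is hypothetical and the timestamps are placeholders chosen for easy arithmetic, not measurements from the experiments.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Timeline:
    """Timestamps (seconds) of the four key events for one run."""
    t0: float   # condition first detected at server
    t1: float   # first high resolution event received
    t2: float   # worker up and ready
    t3: float   # worker receives first high resolution event

    def intervals(self) -> Dict[str, float]:
        return {
            "t0-t1": self.t1 - self.t0,   # detection to high-rate stream activation
            "t0-t2": self.t2 - self.t0,   # detection to worker startup
            "t2-t3": self.t3 - self.t2,   # worker wait before first event
        }


# Placeholder values only; NOT results from the slides.
run = Timeline(t0=0.0, t1=0.5, t2=2.0, t3=2.25)
latencies = run.intervals()
print(latencies)
```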
19
Startup Latencies under Linux real time
Columns: Interval; SSH (sec); RSH (sec)
Rows: {t0 – t1}, {t0 – t2}, {t2 – t3}
{t0 – t1}: interval between detection and activation of high-rate stream
{t0 – t2}: interval between detection and worker startup
{t2 – t3}: worker delay before receiving first event
20
Summary of Startup Latency
Columns: Remote source response latency (sec) {t1 – t0}; Worker response latency (sec) {t3 – t0}
Rows: PBS LAN, PBS remote, RSH MAN, SSH MAN
21
Beth Plale http://www.cs.indiana.edu/~plale/projects/dQUOB