1
Telegraph: Ideas & Status
2
Overview
Folks
–Amol Deshpande, Mohan Lakhamraju, VijayShankar Raman
–Rob von Behren, Steve Gribble, Matt Welsh
–Kris Hildrum
–Hellerstein, Franklin, Brewer, Papadimitriou (ITR team)
Roots
–“Regres” think-tank: Carey, Hellerstein, Stonebraker, 1998-99
–CONTROL project (Online Aggregation, etc.): UC Berkeley ‘96-present
–Query Scrambling: Franklin & Urhan, UMD
–Inktomi experiences
–Jaguar (Welsh & Culler)
3
Telegraph Goals
Query all the data in the world
–ITR: internet sources and services
–Endeavour: sensors
–Also shared-nothing DBMS done better
Unify and redesign storage engines
–DBMS, HTTP server, cluster-based FS
–Reject multi-threading in favor of event-flow state machines
  –storage manager a “query plan” over events
–Cluster-centric recovery scheme
4
Today
Status on storage manager
–Event flow and state machines
–Simplified transactional API
–Experiences with Jaguar
–Status
Continuously adaptive dataflow
–Eddies & Rivers
–Applications to event flow: storage mgr is a dataflow plan too
Open Questions
5
State Machines
Web servers/proxies, cache consistency HW use FSMs
–Order 100x-1000x more concurrent clients than threads allow
–One thread per concurrent HW activity
–FSMs for multiplexing threads on connections
Thesis: apply query plan technology to state machines
–We understand data flow
–Optimization = composition of FSMs
–MS Research “Pipeline Server”
State machine gives better cache locality (old-fashioned DB batching of I/O, on chip!)
A theme in the TinyOS research too
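The "FSMs for multiplexing threads on connections" idea can be illustrated with a minimal sketch, assuming a toy three-step request lifecycle. The names `State`, `Connection`, and `run_event_loop` are hypothetical and are not taken from the Telegraph code; the point is only that one thread can drive many connections when each connection is an explicit state machine fed from an event queue.

```python
from collections import deque
from enum import Enum, auto


class State(Enum):
    # Hypothetical request lifecycle, chosen only for illustration.
    READ_REQUEST = auto()
    FETCH_DATA = auto()
    SEND_REPLY = auto()
    DONE = auto()


class Connection:
    """One client connection, represented as an explicit state machine."""

    def __init__(self, conn_id):
        self.conn_id = conn_id
        self.state = State.READ_REQUEST
        self.trace = []  # states this connection has handled, in order

    def step(self):
        """Handle one event and advance to the next state."""
        self.trace.append(self.state.name)
        if self.state is State.READ_REQUEST:
            self.state = State.FETCH_DATA
        elif self.state is State.FETCH_DATA:
            self.state = State.SEND_REPLY
        elif self.state is State.SEND_REPLY:
            self.state = State.DONE


def run_event_loop(num_connections):
    """Drive every connection from a single thread via a shared event queue."""
    connections = [Connection(i) for i in range(num_connections)]
    events = deque(connections)  # each entry means "this connection is ready"
    steps = 0
    while events:
        conn = events.popleft()
        conn.step()
        steps += 1
        if conn.state is not State.DONE:
            events.append(conn)  # re-queue for its next event
    return connections, steps


if __name__ == "__main__":
    conns, steps = run_event_loop(1000)
    print(len(conns), "connections handled in", steps, "steps on one thread")
```

Because the per-connection state is a small object rather than a thread stack, the number of concurrent clients is bounded by memory for these objects, not by the thread limit.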
6
Gribble I/O Core (v6!!)
[Figure: component diagram. Labels: buffercache (fsm), lockmngr (fsm), r/w links, SN_Hashtable (fsm) x2, thread pool, HT “work” queue, ‘stub’ x2, thread boundary x2]
7
Mohan API for Xact Recovery
[Figure: architecture diagram. Components: Application, Lock Manager, Segment Manager, Trans Manager, Recovery Manager, Buffer Manager. Labels: Lock, Unlock, Deadlock Detect, Recovery-action, Read-action, Update-action, Begin, Commit, Abort, Read, Update, Scan, Pin, Unpin, Flush, Commit/Abort-action]
8
Jaguar
Two Basic Features
–Rather than JNI, map certain bytecodes to inlined assembly code
  –Do this judiciously, maintaining type safety!
–Pre-Serialized Objects (PSOs)
  –Can lay down a Java object “container” over an arbitrary VM range outside Java’s heap.
With these, you have everything you need
–Inlining and PSOs allow direct user-level access to network buffers, disk device drivers, etc.
–PSOs allow buffer pool to be pre-allocated, and tuples in the pool to be “pointed at”
Matt Welsh
9
Storage Manager Status
Working!
–Transactions and recovery too
–Gribble’s hashtable indexes currently don’t talk to Mohan’s stuff
–Complete version and numbers for VLDB 2000, mid-February
Lessons
–Debugger support for state machine development needed
–Thinking about where to multiplex and queue in a state machine is NOT EASY (but we’re learning)
–Jaguar isn’t quite there yet
  –e.g. GC control
  –But we’re getting there
  –Need to keep Welsh and Culler aboard
10
Query Processing Challenges
The world is a messy place
–performance varies widely over time
  –River lessons on NOW (NowSort experience)
  –Internet
  –MEMS for sure!
–performance metadata usually unavailable or wrong
  –no “runstats” on the web
Users are unpredictable
–want to get early answers, “control” queries as they run
Plus Mariposa/Millennium-esque issues
–local autonomy, costs for access, etc.
11
ITR Example Scenario
“What do the French think about farm subsidies?”
–How would you do this on the web today?
  –Translate query into French via BabelFish
  –Find a French search engine, restrict domains to .fr
  –Fetch matches and translate back to English via BabelFish
  –Feed to a text summarizer like NetSumm
12
Behavior Along the Way
Speed changes
–Site that was fast suddenly slows down
Behavior changes
–Site that was returning few answers starts returning lots (“selectivity”)
Failures
–Site won’t respond. Choose an alternate server.
Ordering affects answers
–summarize then translate? Or vice versa?
13
Standard Query Engine Won’t Cut It
Can’t adapt while running
–need a “continuous” query optimizer
–need to handle midstream failover
  –Reload, alternate sites
Uses the wrong QP algorithms
–Can’t produce incremental results
  –need CONTROL-based dataflow algorithms
Can’t understand cost/quality tradeoffs
–maybe I’d settle for something cheesier if it went faster
  –e.g. use an English search engine in the US
14
QP Framework: Eddies
Need an adaptive query processor
–respond to changes mid-stream
Eddy
–a pipelining object router
  –works well with ops that have frequent moments of symmetry
–adjusts flow adaptively
  –objects flow in different orders
  –visit each op once before output
–simple policy for routing
  –never give out a new object if there’s a used one
Avnur & Hellerstein, SIGMOD 2000
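The routing rules on this slide can be sketched in a few lines. This is a simplified illustration, not the Avnur & Hellerstein implementation: the class `Eddy`, its `run` method, and the uniformly random choice of the next operator are all assumptions made for the example. It shows only the two stated rules: every object visits each op once before output, and a used object is always handed out before a new one.

```python
import random
from collections import deque


class Eddy:
    """Toy eddy over selection predicates (hypothetical API, for illustration)."""

    def __init__(self, operators, seed=0):
        self.operators = operators  # maps operator name -> predicate function
        self.rng = random.Random(seed)

    def run(self, source):
        new = deque(source)  # objects not yet seen by any operator
        used = deque()       # objects that have passed at least one operator
        output = []
        while new or used:
            # Routing policy from the slide: never give out a new object
            # if there is a used one waiting.
            if used:
                item, done = used.popleft()
            else:
                item, done = new.popleft(), frozenset()
            # Assumption: choose uniformly among the ops not yet visited.
            # A real eddy would adapt this choice to observed behavior.
            remaining = [name for name in self.operators if name not in done]
            name = self.rng.choice(remaining)
            if not self.operators[name](item):
                continue  # the object failed this filter and is dropped
            done = done | {name}
            if len(done) == len(self.operators):
                output.append(item)  # visited each op once: emit
            else:
                used.append((item, done))
        return output


if __name__ == "__main__":
    eddy = Eddy({"is_even": lambda x: x % 2 == 0, "gt_10": lambda x: x > 10})
    print(sorted(eddy.run(range(20))))
```

Whatever order the operators are visited in, the set of output objects is the same; only the amount of work spent on objects that are eventually dropped changes, which is what adaptive routing tries to minimize.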
15
Simple Eddies Learn Input Rates
Two single-table, unchanging filters
–one fast, one slow
–both have same probability of output (selectivity)
Most tuples visit the fast op first
–policy + finite queues result in back pressure
–slow op almost always finds a used tuple from fast op
–fast op rarely finds a used tuple
16
Simple Eddies & Output Rate
Again, two single-table static filters
–one low probability of output, one high
–equal costs
Back-pressure slightly worse than random
–low-probability should be favored
–but it is more likely to find used tuples
17
An Aside: n-Arm Bandits
A little machine learning problem:
–Each arm pays off differently
–Explore? Or Exploit?
  –Sometimes want to randomly choose an arm
  –Usually want to go with the best
  –If probabilities are static, dampen exploration over time
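The explore/exploit trade-off above can be sketched with an epsilon-greedy strategy whose exploration rate decays over time. This is one standard bandit heuristic, swapped in here for illustration; the slide does not say which strategy was intended. The function name `run_bandit`, the `1/sqrt(t)` decay schedule, and the payoff probabilities are all assumptions.

```python
import random


def run_bandit(payoff_probs, pulls, seed=0):
    """Epsilon-greedy play against arms with static payoff probabilities."""
    rng = random.Random(seed)
    n = len(payoff_probs)
    counts = [0] * n  # how many times each arm was pulled
    wins = [0] * n    # how many times each arm paid off

    def estimate(arm):
        # Untried arms get an optimistic estimate so each is tried early.
        return wins[arm] / counts[arm] if counts[arm] else 1.0

    for t in range(1, pulls + 1):
        epsilon = 1.0 / t ** 0.5  # assumed schedule: dampen exploration over time
        if rng.random() < epsilon:
            arm = rng.randrange(n)           # explore: pick an arm at random
        else:
            arm = max(range(n), key=estimate)  # exploit: go with the best so far
        counts[arm] += 1
        if rng.random() < payoff_probs[arm]:
            wins[arm] += 1
    return counts, wins


if __name__ == "__main__":
    counts, wins = run_bandit([0.2, 0.8], pulls=5000)
    print("pulls per arm:", counts)
    print("payoffs per arm:", wins)
```

If the payoff probabilities drift instead of staying static, the decay would have to be bounded below so the player keeps sampling arms that look bad now but may improve later.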
18
Learning Eddies
Tuple routing is basically a bandit problem
–which operator should I choose next?
–Complicated by back pressure
  –Bandit problems + queueing theory
Lottery Scheduling implementation
–Each operator starts with k tickets
–When multiple operators request a tuple, hold a “lottery”; holder of winning ticket gets it
–When an operator takes a tuple, it earns a ticket
–When an operator produces a tuple, it is charged a ticket
Works well in practice for some things
–Problems with delayed sources & joins
–Kris Hildrum studying formal proofs of convergence
Ticket policy needs work. Mechanism looks robust.
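The four ticket rules above translate almost directly into code. The sketch below is a hedged illustration under simplifying assumptions: the operators are plain filters with fixed selectivities, every operator requests every tuple, and there are no queues or back pressure. The names `LotteryRouter` and `simulate`, and the one-ticket floor on lottery weights, are hypothetical.

```python
import random


class LotteryRouter:
    """Ticket bookkeeping for lottery-based tuple routing (illustrative)."""

    def __init__(self, operator_names, k=10, seed=0):
        # Each operator starts with k tickets.
        self.tickets = {name: k for name in operator_names}
        self.rng = random.Random(seed)

    def hold_lottery(self, requesters):
        """Pick one requester with probability proportional to its tickets."""
        # Assumption: floor each weight at 1 so the weights are never all zero.
        weights = [max(self.tickets[name], 1) for name in requesters]
        return self.rng.choices(requesters, weights=weights, k=1)[0]

    def on_take(self, name):
        self.tickets[name] += 1  # taking a tuple earns a ticket

    def on_produce(self, name):
        self.tickets[name] -= 1  # producing a tuple costs a ticket


def simulate(selectivity, num_tuples, k=10, seed=0):
    """Route tuples to filters whose output probability is `selectivity[name]`."""
    names = list(selectivity)
    router = LotteryRouter(names, k=k, seed=seed)
    rng = random.Random(seed + 1)
    for _ in range(num_tuples):
        winner = router.hold_lottery(names)  # all operators request the tuple
        router.on_take(winner)
        if rng.random() < selectivity[winner]:
            router.on_produce(winner)
    return router.tickets


if __name__ == "__main__":
    print(simulate({"selective": 0.1, "permissive": 0.9}, num_tuples=2000))
```

An operator that consumes many tuples but outputs few keeps most of the tickets it earns, so it wins more lotteries and sees tuples earlier. That is the desired bias toward low-selectivity filters, which back pressure alone does not provide.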
19
Open Eddy Questions
Eddy addresses the operator ordering problem
Remaining problems:
–operator choice (hash join or index join?)
–source choice, overlap, failover: Ninja?
–delayed sources
–short jobs
–resource mgmt (memory allocation)
–distributed work and parallelism
Sensor (i.e. sequence) operations
–What changes when data-ordering matters?
–What are the ops for sensors? Streaming media?
–Objects not discretely differentiated??
20
Putting it together
Current eddy/river in C
–Prototypes in Java, but not state machines
–Probably do a rewrite in state machine format
Thesis: every piece of the system is a “query plan”
–Apply eddies to event routing in the storage manager?
–To network protocol?
21
Cross-pollination
Telegraph QP and Ninja “Paths”
DB, IStore, and OceanStore students looking at adaptive storage location
–OceanStore orthogonal to Telegraph storage manager? But let’s combine!
–DB and IStore efforts apply to clusters
MEMS and sensors
–As soon as eddy/river rewrite done, we need to look at sensor apps and ops
TinyOS
–Good state machine lessons at the boundary
–Data flow between the devices??
Negotiation
–Eddies and pricing fit into this! I.e. we have the infrastructure for dynamic pricing and re-routing on the way.