Telegraph: Ideas & Status

Overview
Folks
 – Amol Deshpande, Mohan Lakhamraju, VijayShankar Raman
 – Rob von Behren, Steve Gribble, Matt Welsh
 – Kris Hildrum
 – Hellerstein, Franklin, Brewer, Papadimitriou (ITR team)
Roots
 – “Regres” think-tank: Carey, Hellerstein, Stonebraker
 – CONTROL project (Online Aggregation, etc.): UC Berkeley ‘96-present
 – Query Scrambling: Franklin & Urhan, UMD
 – Inktomi experiences
 – Jaguar (Welsh & Culler)

Telegraph Goals
Query all the data in the world
 – ITR: internet sources and services
 – Endeavour: sensors
 – Also shared-nothing DBMS done better
Unify and redesign storage engines
 – DBMS, HTTP server, cluster-based FS
 – Reject multi-threading in favor of event-flow state machines
    storage manager a “query plan” over events
 – Cluster-centric recovery scheme

Today
Status on storage manager
 – Event flow and state machines
 – Simplified transactional API
 – Experiences with Jaguar
 – Status
Continuously adaptive dataflow
 – Eddies & Rivers
 – Applications to event flow: storage mgr is a dataflow plan too
Open Questions

State Machines
Web servers/proxies, cache consistency HW use FSMs
 – Order 100x-1000x more concurrent clients than threads allow
 – One thread per concurrent HW activity
 – FSMs for multiplexing threads on connections
Thesis: apply query plan technology to state machines
 – We understand data flow
 – Optimization = composition of FSMs
 – MS Research “Pipeline Server”
State machine gives better cache locality (old-fashioned DB batching of I/O on chip!)
A theme in the TinyOS research too
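To make the "FSMs for multiplexing threads on connections" point concrete, here is a minimal sketch of one thread driving many per-connection state machines from a single event queue. It is not Telegraph code: `ConnectionFSM`, the three states, and the "readable"/"writable" event names are assumptions made up for illustration, and real I/O is replaced by re-queued events.

```python
from collections import deque
from enum import Enum, auto


class State(Enum):
    READ_REQUEST = auto()
    SEND_REPLY = auto()
    DONE = auto()


class ConnectionFSM:
    """Hypothetical per-connection state machine; it owns no thread."""

    def __init__(self, conn_id):
        self.conn_id = conn_id
        self.state = State.READ_REQUEST
        self.reply = None

    def handle(self, event):
        """Advance one step on an event; return True once the connection is done."""
        if self.state is State.READ_REQUEST and event == "readable":
            self.reply = f"reply-{self.conn_id}"
            self.state = State.SEND_REPLY
        elif self.state is State.SEND_REPLY and event == "writable":
            self.state = State.DONE
        return self.state is State.DONE


def run_event_loop(num_connections):
    """A single thread drains one event queue on behalf of every connection."""
    machines = {i: ConnectionFSM(i) for i in range(num_connections)}
    events = deque((i, "readable") for i in range(num_connections))
    finished = []
    while events:
        conn_id, event = events.popleft()
        fsm = machines[conn_id]
        if fsm.handle(event):
            finished.append(fsm.reply)
        else:
            # Stand-in for an I/O completion arriving later.
            events.append((conn_id, "writable"))
    return finished
```

Calling `run_event_loop(1000)` serves a thousand simulated connections without creating a thousand threads; each connection costs one small object, not a thread stack.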

Gribble I/O Core (v6!!)
[Diagram; components labeled: buffercache (fsm), lockmngr (fsm), SN_Hashtable (fsm) x2, threadpool, HT “work” queue, ‘stub’ x2, thread boundary x2; arrows labeled r and w]

Mohan API for Xact Recovery
[Diagram; components labeled: Application, Lock Manager, Segment Manager, Trans Manager, Recovery Manager, Buffer Manager; calls labeled: Lock, Unlock, Deadlock Detect, Recovery-action, Read-action, Update-action, Commit/Abort-action, Begin, Commit, Abort, Read, Update, Scan, Pin, Unpin, Flush]

Jaguar
Two Basic Features
 – Rather than JNI, map certain bytecodes to inlined assembly code
    Do this judiciously, maintaining type safety!
 – Pre-Serialized Objects (PSOs)
    Can lay down a Java object “container” over an arbitrary VM range outside Java’s heap.
With these, you have everything you need
 – Inlining and PSOs allow direct user-level access to network buffers, disk device drivers, etc.
 – PSOs allow buffer pool to be pre-allocated, and tuples in the pool to be “pointed at”
Matt Welsh

Storage Manager Status
Working!
 – Transactions and recovery too
 – Gribble’s hashtable indexes currently don’t talk to Mohan’s stuff
 – Complete version and numbers for VLDB 2000, mid-February
Lessons
 – Debugger support for state machine development needed
 – Thinking about where to multiplex and queue in a state machine is NOT EASY (but we’re learning)
 – Jaguar isn’t quite there yet
    e.g. GC control
    But we’re getting there
    Need to keep Welsh and Culler aboard

Query Processing Challenges
The world is a messy place
 – performance varies widely over time
    River lessons on NOW (NowSort experience)
    Internet
    MEMS for sure!
 – performance metadata usually unavailable or wrong
    no “runstats” on the web
Users are unpredictable
 – want to get early answers, “control” queries as they run
Plus Mariposa/Millennium-esque issues
 – local autonomy, costs for access, etc.

ITR Example Scenario
“What do the French think about farm subsidies?”
 – How would you do this on the web today?
    Translate query into French via BabelFish
    Find a French search engine, restrict domains to .fr
    Fetch matches and translate back to English via BabelFish
    Feed to a text summarizer like NetSumm

Behavior Along the Way
Speed changes
 – Site that was fast suddenly slows down
Behavior changes
 – Site that was returning few answers starts returning lots (“selectivity”)
Failures
 – Site won’t respond. Choose an alternate server.
Ordering affects answers
 – summarize then translate? Or vice versa?

Standard Query Engine Won’t Cut It
Can’t adapt while running
 – need a “continuous” query optimizer
 – need to handle midstream failover
    Reload, alternate sites
Uses the wrong QP algorithms
 – Can’t produce incremental results
    need CONTROL-based dataflow algorithms
Can’t understand cost/quality tradeoffs
 – maybe I’d settle for something cheesier if it went faster, e.g. use an English search engine in the US

QP Framework: Eddies
Need an adaptive query processor
 – respond to changes mid-stream
Eddy
 – a pipelining object router
    works well with ops that have frequent moments of symmetry
 – adjusts flow adaptively
    objects flow in different orders
    visit each op once before output
 – simple policy for routing
    never give out a new object if there’s a used one
Avnur & Hellerstein SIGMOD 2000
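The routing rules on this slide (visit each op once before output, prefer a used object over a new one) can be sketched as a small single-threaded router. This is a simplified illustration, not the Avnur & Hellerstein implementation: `run_eddy`, the `choose` hook, and the use of plain filter predicates as operators are all assumptions, and the "used before new" rule is applied at the router instead of per operator.

```python
from collections import deque


def run_eddy(source, operators, choose=None):
    """Route each tuple through every operator exactly once, in any order.

    operators: dict mapping a name to a filter predicate (hypothetical ops).
    choose:    picks the next operator name from those not yet visited;
               the default (first pending) is a placeholder for a real policy.
    """
    choose = choose or (lambda tup, pending: pending[0])
    source = deque(source)
    in_flight = deque()  # "used" tuples: (tuple, names of ops already visited)
    output = []
    while source or in_flight:
        # Never hand out a new tuple if a used one is waiting.
        if in_flight:
            tup, done = in_flight.popleft()
        else:
            tup, done = source.popleft(), frozenset()
        pending = [name for name in operators if name not in done]
        if not pending:
            output.append(tup)  # visited every op once: emit
            continue
        name = choose(tup, pending)
        if operators[name](tup):
            in_flight.append((tup, done | {name}))
        # A tuple that fails a filter is simply dropped.
    return output
```

For example, with `ops = {"even": lambda x: x % 2 == 0, "small": lambda x: x < 10}`, `run_eddy(range(20), ops)` returns `[0, 2, 4, 6, 8]` whichever order `choose` visits the two filters in. The routing order affects only the work done, which is what leaves room for an adaptive policy.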

Simple Eddies Learn Input Rates
Two single-table, unchanging filters
 – one fast, one slow
 – both have same probability of output (selectivity)
most tuples visit the fast op first
 – policy + finite queues result in back pressure
 – slow op almost always finds a used tuple from fast op
 – fast op rarely finds a used tuple

Simple Eddies & Output Rate
Again, two single-table static filters
 – one low probability of output, one high
 – equal costs
Back-pressure slightly worse than random
 – low-probability should be favored
 – but it is more likely to find used tuples
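A short calculation shows why the low-probability filter should be favored when costs are equal: whatever the first filter drops never reaches the second. The sketch below uses the standard expected-cost formula for a chain of independent filters; `expected_cost` and the example numbers are illustrative assumptions, not figures from the talk.

```python
def expected_cost(order, cost, selectivity):
    """Expected work per input tuple for filters applied in the given order.

    Assumes independent filters: each op is paid for only by the fraction
    of tuples that survived all earlier ops.
    """
    total, surviving = 0.0, 1.0
    for op in order:
        total += surviving * cost[op]
        surviving *= selectivity[op]
    return total
```

With equal costs of 1 and selectivities of 0.1 and 0.9, running the low-probability filter first costs about 1.1 units per tuple, against about 1.9 the other way round.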

An Aside: n-Arm Bandits
A little machine learning problem:
 – Each arm pays off differently
 – Explore? Or Exploit?
    Sometimes want to randomly choose an arm
    Usually want to go with the best
    If probabilities are static, dampen exploration over time
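One common way to realize "usually go with the best, sometimes choose randomly, dampen exploration over time" is an epsilon-greedy strategy with a decaying epsilon. The slide does not name a specific algorithm, so the technique here is swapped in for illustration; `epsilon_greedy`, the 1/sqrt(t) decay schedule, and the representation of arms as callables are all assumptions.

```python
import random


def epsilon_greedy(arms, pulls, seed=0):
    """Play `pulls` rounds over `arms` (callables taking an RNG, returning a reward).

    Returns (counts, means): how often each arm was pulled and its
    running average payoff.
    """
    rng = random.Random(seed)
    counts = [0] * len(arms)
    means = [0.0] * len(arms)
    for t in range(1, pulls + 1):
        epsilon = 1.0 / t ** 0.5  # assumed schedule: exploration dampens over time
        if t <= len(arms):
            arm = t - 1  # try every arm once first
        elif rng.random() < epsilon:
            arm = rng.randrange(len(arms))  # explore: random arm
        else:
            arm = max(range(len(arms)), key=means.__getitem__)  # exploit: best so far
        reward = arms[arm](rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental average
    return counts, means
```

An arm that pays off with probability p can be written as `lambda rng: float(rng.random() < p)`. If the payoff probabilities drift over time, as they would for operators over changing sources, the decay would have to be bounded or reset so that exploration never stops entirely.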

Learning Eddies
Tuple routing is basically a bandit problem
 – which operator should I choose next?
 – Complicated by back pressure
    Bandit problems + queueing theory
Lottery Scheduling implementation
 – Each operator starts with k tickets
 – When multiple operators request a tuple, hold a “lottery”; holder of winning ticket gets it
 – When an operator takes a tuple, it earns a ticket
 – When an operator produces a tuple, it is charged a ticket
Works well in practice for some things
 – Problems with delayed sources & joins
 – Kris Hildrum studying formal proofs of convergence
Ticket policy needs work. Mechanism looks robust.
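The four ticket rules above translate almost directly into code. The following is a minimal sketch under stated assumptions, not the Telegraph implementation: the class and method names are hypothetical, k defaults to 10, ticket counts are floored at 1 when drawing so that weights stay positive (the slide does not say what happens when an operator runs out), and `simulate` models every operator as a filter that requests a tuple in every round.

```python
import random


class LotteryRouter:
    """Hypothetical ticket bookkeeping for lottery-based tuple routing."""

    def __init__(self, operator_names, k=10, seed=0):
        self.tickets = {name: k for name in operator_names}  # each op starts with k
        self.rng = random.Random(seed)

    def pick(self, requesters):
        """Hold a lottery among the operators currently requesting a tuple."""
        # Assumption: floor at 1 ticket so every requester keeps a positive weight.
        weights = [max(self.tickets[name], 1) for name in requesters]
        return self.rng.choices(requesters, weights=weights, k=1)[0]

    def took_tuple(self, name):
        self.tickets[name] += 1  # earns a ticket for consuming

    def produced_tuple(self, name):
        self.tickets[name] -= 1  # charged a ticket for producing


def simulate(selectivity, rounds=1000, seed=0):
    """Route `rounds` tuples among filters with the given output probabilities."""
    names = list(selectivity)
    router = LotteryRouter(names, seed=seed)
    wins = {name: 0 for name in names}
    for _ in range(rounds):
        name = router.pick(names)
        router.took_tuple(name)
        wins[name] += 1
        if router.rng.random() < selectivity[name]:
            router.produced_tuple(name)
    return router.tickets, wins
```

An operator that consumes tuples but rarely produces any accumulates tickets and so wins more lotteries, which steers tuples toward the selective operators first. Running `simulate({"selective": 0.1, "permissive": 0.9})` and comparing the two win counts shows the effect.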

Open Eddy Questions
Eddy addresses the operator ordering problem
Remaining problems:
 – operator choice (hash join or index join?)
 – source choice, overlap, failover: Ninja?
 – delayed sources
 – short jobs
 – resource mgmt (memory allocation)
 – distributed work and parallelism
Sensor (i.e. sequence) operations
 – What changes when data-ordering matters?
 – What are the ops for sensors? Streaming media?
 – Objects not discretely differentiated??

Putting it together
Current eddy/river in C
 – Prototypes in Java, but not state machines
 – Probably do a rewrite in state machine format
Thesis: every piece of the system is a “query plan”
 – Apply eddies to event routing in the storage manager?
 – To network protocol?

Cross-pollination
Telegraph QP and Ninja “Paths”
DB, IStore, and OceanStore students looking at adaptive storage location
 – OceanStore orthogonal to Telegraph storage manager? But let’s combine!
 – DB and IStore efforts apply to clusters
MEMS and sensors
 – As soon as eddy/river rewrite done, we need to look at sensor apps and ops
TinyOS
 – Good state machine lessons at the boundary
 – Data flow between the devices??
Negotiation
 – Eddies and pricing fit into this! I.e. we have the infrastructure for dynamic pricing and re-routing on the way.