

Telegraph: A Universal System for Information

Telegraph History & Plans
  Initial Vision
    – Carey, Hellerstein, Stonebraker
    – “Regres”, “B-1”
  Sweat, ideas and further vision
    – 4 of my grads committed
    – Brewer + 2 grads committed
    – Franklin will play
    – obvious tie-ins with other projects

Telegraph Architecture
  Layers (top to bottom):
    – Query/Browse/Mine
    – Global Agoric Federation
    – Continuously Reoptimizing Query Processor
    – Adaptive Data Placement
    – Storage Manager (FS, DB, Web)
  Related projects (& synergies!):
    – Ninja, GiST, IStore, River, Ninja, Aetherstore, Control, STIX, Mariposa, Millennium, Control, Control, DigLib

Storage Manager
  Historic chance to start over!
    – new hardware realities
        variable-length segments, not blocks
        big main memories
        extra CPUs at the devices (IStore)
    – revisit and clean up infrastructure for transactions
        clean API supporting both log-based & version-based schemes; version-based runs today!
        big SW Eng. challenge
    – unify DB/FS/Web server!
        Clients: Ninja’s persistent hash table, query processing, web server, Linux (NT?) filesystem.
    – (Mohan Lakhamraju, Rob von Behren, Steve Gribble)
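
As a rough illustration of the "clean API supporting both log-based & version-based schemes" bullet, here is a minimal sketch of one interface with a version-based implementation behind it. Every name here (`TxnStore`, `VersionedStore`, the method names) is hypothetical and is not taken from Telegraph. The sketch keeps all versions in memory and does no conflict detection, so it shows only the shape of such an API, not a usable transaction manager.

```python
from abc import ABC, abstractmethod


class TxnStore(ABC):
    """Hypothetical transactional key-value interface (not Telegraph's API).

    A log-based store and a version-based store could both implement it.
    """

    @abstractmethod
    def begin(self): ...

    @abstractmethod
    def read(self, txn, key): ...

    @abstractmethod
    def write(self, txn, key, value): ...

    @abstractmethod
    def commit(self, txn): ...

    @abstractmethod
    def abort(self, txn): ...


class VersionedStore(TxnStore):
    """Version-based scheme: a commit appends new versions, never overwrites."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), ascending
        self._clock = 0      # timestamp of the latest commit
        self._next_txn = 0
        self._active = {}    # txn id -> (snapshot_ts, pending writes)

    def begin(self):
        self._next_txn += 1
        self._active[self._next_txn] = (self._clock, {})
        return self._next_txn

    def read(self, txn, key):
        snapshot_ts, pending = self._active[txn]
        if key in pending:  # a transaction sees its own writes
            return pending[key]
        # Otherwise return the newest version committed at or before the snapshot.
        for ts, value in reversed(self._versions.get(key, [])):
            if ts <= snapshot_ts:
                return value
        return None

    def write(self, txn, key, value):
        self._active[txn][1][key] = value  # buffered until commit

    def commit(self, txn):
        _, pending = self._active.pop(txn)
        self._clock += 1
        for key, value in pending.items():
            self._versions.setdefault(key, []).append((self._clock, value))

    def abort(self, txn):
        self._active.pop(txn)  # buffered writes are simply dropped
```

A reader that began before a later commit keeps seeing the older version, which is the property a version-based scheme gives without any undo log.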

Query Engine
  Shared-nothing (cluster)
    – all data flow (no blocking ops)
        auto load-balance to micro/macro changes in environment
        adaptivity more important than raw performance!!
        CONTROL! || ripple join, online reordering (Shankar Raman)
    – continuously reoptimizing query plans
        tie-ins with STIX (Christos/Sinclair/Russell/Hellerstein)
        (Ron Avnur)
    – first steps in handling streaming sources
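
To make "no blocking ops" and "ripple join" concrete, here is a minimal single-threaded sketch of a square ripple join written as a generator. It is an illustration only, not Telegraph's operator: the function name and the `predicate` parameter are hypothetical, everything is held in memory, and the parallel ("||") variant named on the slide is not attempted. Each step pulls one new tuple from each input and joins it against what has been seen so far, so matches stream out long before either input is exhausted.

```python
_DONE = object()  # sentinel marking an exhausted input


def ripple_join(r, s, predicate):
    """Yield matching (x, y) pairs incrementally; each pair is tested exactly once."""
    seen_r, seen_s = [], []
    r_iter, s_iter = iter(r), iter(s)
    while True:
        new_r = next(r_iter, _DONE)
        new_s = next(s_iter, _DONE)
        if new_r is _DONE and new_s is _DONE:
            return
        if new_r is not _DONE:
            # New R tuple against the S tuples seen in earlier steps.
            for y in seen_s:
                if predicate(new_r, y):
                    yield (new_r, y)
            seen_r.append(new_r)
        if new_s is not _DONE:
            # New S tuple against all R tuples seen so far, including this step's.
            for x in seen_r:
                if predicate(x, new_s):
                    yield (x, new_s)
            seen_s.append(new_s)
```

Because the consumer can stop iterating at any point, a caller could compute a running aggregate over the pairs seen so far. For such an estimate to be statistically meaningful the inputs would need to arrive in random order, which this sketch simply assumes.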

Cluster Data Layout
    – issues: fragmentation, placement, replication on 10^6 disks. For DB/FS/Web.
    – goals: availability, efficiency, consistency, manageability.
    – Adaptivity: cooperative vs. competitive ($$) techniques?
    – (Mehul Shah)
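
The three issues named above (fragmentation, placement, replication) can be sketched in a few lines. The slide does not say which technique Telegraph uses; the example below swaps in rendezvous (highest-random-weight) hashing purely as one well-known cooperative scheme, and all function names, disk names, and parameters are assumptions made for illustration.

```python
import hashlib


def _hash64(text):
    """Stable 64-bit hash of a string (stable across runs, unlike built-in hash())."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")


def fragment_of(key, num_fragments):
    """Fragmentation: map a record key to one of num_fragments fragments."""
    return _hash64(key) % num_fragments


def place(fragment, disks, replicas=3):
    """Placement and replication: choose `replicas` distinct disks for a fragment.

    Each disk gets a pseudo-random score for the fragment and the top
    scorers win, so a fragment's choice only changes when one of its own
    disks leaves or a higher-scoring disk joins.
    """
    ranked = sorted(disks, key=lambda d: _hash64(f"{d}:{fragment}"), reverse=True)
    return ranked[:replicas]
```

For example, `place(fragment_of("user:42", 64), ["disk0", "disk1", "disk2", "disk3"])` returns three distinct disks, and removing a disk that is not among them leaves the answer unchanged. Sorting every disk per lookup is fine for a sketch but would not be practical at the 10^6-disk scale the slide has in mind.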

Global Federation
  Global distribution
    – federated DBMS layer a la Mariposa/Cohera
        address all the hard stuff they dropped!
    – Global data placement
        as in cluster, but must be competitive. (Mehul Shah)
    – Global query processing (Amol Deshpande)
        Agoric query optimization
        distributed query processing
    – Global metadata
        yellow pages both for services & datasets
        Millennium/Ninja tie-ins?
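
"Agoric" optimization treats query planning as a market: sites bid to run pieces of a query, and a broker picks among the bids. The sketch below is a deliberately simplified version of that idea, loosely in the spirit of Mariposa-style bidding. The `Bid` fields, the `choose_bids` function, and the single per-subquery delay bound are all assumptions made for illustration, not Telegraph's or Mariposa's actual protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    site: str     # site offering to run the subquery
    price: float  # what the site charges
    delay: float  # promised completion time, in seconds


def choose_bids(bids_by_subquery, max_delay):
    """For each subquery, pick the cheapest bid that meets the delay bound.

    Assumes subqueries run in parallel, so bounding each one's delay
    bounds the overall delay. Ties on price go to the faster site.
    """
    plan = {}
    for subquery, bids in bids_by_subquery.items():
        feasible = [b for b in bids if b.delay <= max_delay]
        if not feasible:
            raise ValueError(f"no site can run {subquery!r} within {max_delay}s")
        plan[subquery] = min(feasible, key=lambda b: (b.price, b.delay))
    return plan
```

A real broker would also have to trade price against delay and handle sites that misreport; the point here is only that placement decisions fall out of competitive bids rather than a global cost model.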

Applications
  Really finding stuff in all the world’s data?
    – UI meets AI meets Logic (browse/mine/query)
        CONTROL is key: seamless, non-blocking interaction
        multi-res output and feedback during browse/query
        hints, wizards, training (AI mining, user in the loop)
        build on existing “scalable spreadsheet”/xform tools (Shankar Raman)