Noah Treuhaft UC Berkeley ROC Group ROC Retreat, January 2002

Quantifying Availability/Performance Tradeoffs in Distributed Data Structures
Noah Treuhaft, UC Berkeley ROC Group
ROC Retreat, January 2002

Outline
  Motivation
  Distributed data structures
  A shared-disk DB toolkit
  Quantifying the tradeoffs
  Status

Motivation
  Many interactions between availability and performance in systems
    some are synergies (DB index “structure modifying operations” as “nested top actions”)
    others are tradeoffs (transaction throughput)
  ROC principle: availability is not subordinate to performance
    the application determines the appropriate balance...
    and that guides us through the tradeoffs

Motivation (2)
  Implication for systems research: lead by building tunable systems
    but must ensure that people understand how to tune them!
    unlabeled knobs are useless
  Key insight: quantify availability/performance tradeoffs with availability benchmarking
    hard work, so don’t make system users do their own benchmarking

What’s a distributed data structure (DDS)?
  Interface like a centralized data structure
    uniform access from all cluster nodes
  Updates
    consistency model
  Persistent
  Out-of-core
  Building block for Internet-style services
    provides persistent state management
    “high” throughput AND “high” availability
    service inherits tradeoffs from DDS

Gribble’s prototype DDS: distributed hash table
  [Figure: clients reach the service front-ends over the WAN; each service instance links the DDS library, which reaches the storage “bricks” over the SAN. From a presentation by Steve Gribble.]
  clients interact with any service “front-end” [all persistent state is in DDS and is consistent across cluster]
  service interacts with DDS via library [library is 2PC coordinator, handles partitioning, replication, etc., and exports hash table API]
  “brick” is durable single-node hash table plus RPC skeletons for network access
  example of a distributed HT partition with 3 replicas in group
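
The division of labor on this slide (library as 2PC coordinator, bricks as single-node hash tables, keys partitioned across replica groups) can be sketched roughly as follows. This is a minimal in-memory illustration, not Gribble's implementation: the class names `Brick` and `DDSHashTable`, the SHA-1 partitioning, and the choice to abort a put when any replica is unreachable are all assumptions made for the example.

```python
import hashlib


class Brick:
    """Hypothetical stand-in for a storage brick: a single-node hash table."""

    def __init__(self, name):
        self.name = name
        self.up = True      # toggled by the caller to simulate a failure
        self.data = {}      # committed key -> value
        self.staged = {}    # prepared but not yet committed key -> value

    def prepare(self, key, value):
        # Phase 1: vote yes only if this brick is reachable.
        if not self.up:
            return False
        self.staged[key] = value
        return True

    def commit(self, key):
        # Phase 2: make the staged value visible.
        self.data[key] = self.staged.pop(key)

    def abort(self, key):
        self.staged.pop(key, None)

    def get(self, key):
        if not self.up:
            raise ConnectionError(self.name)
        return self.data[key]


class DDSHashTable:
    """Library-side view: partitions keys and coordinates two-phase commit."""

    def __init__(self, replica_groups):
        self.groups = replica_groups  # list of replica groups (lists of Brick)

    def _group_for(self, key):
        # Assumed partitioning scheme: hash the key, pick a replica group.
        digest = hashlib.sha1(key.encode()).digest()
        return self.groups[int.from_bytes(digest[:4], "big") % len(self.groups)]

    def put(self, key, value):
        group = self._group_for(key)
        prepared = []
        for brick in group:
            if brick.prepare(key, value):
                prepared.append(brick)
            else:
                # Simplification: any "no" vote aborts the whole update.
                for b in prepared:
                    b.abort(key)
                return False
        for brick in group:
            brick.commit(key)
        return True

    def get(self, key):
        # Any live replica in the group can serve a read.
        for brick in self._group_for(key):
            try:
                return brick.get(key)
            except (ConnectionError, KeyError):
                continue
        raise KeyError(key)
```

Even in this toy version the tradeoff is visible: every extra replica in a group adds a prepare and a commit to each put, while giving reads one more brick to fall back on.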

Berkeley DB overview
  Great for persistent state management and more
  Access methods for unordered and ordered data
    hash table and B-tree
  Transactions
  Runs on a single machine

Berkeley DB architecture

Shared-disk DB architecture
  [Figure labels: DLM; two stacks of App, AM, Xact, BP, Log; Cluster node]

Two tradeoffs
  Concurrent intersystem page modification
    log merge required during recovery
    reduced page contention
    page transfers replaced by log-record transfers
  “Hot” page replication
    immediate page recovery
    reduced logging?
    memory overhead
    two-phase commit overhead
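
To make "log merge required during recovery" concrete: if several nodes modify the same page and each writes its own log, recovery has to interleave those logs into a single order before redoing updates. The sketch below is one simple way to do that, under assumptions the slide does not state: every record carries a cluster-wide ordered sequence number (`lsn`), each node's log is already sorted by it, and a page remembers the `lsn` of the last update it reflects. The `LogRecord` fields are hypothetical.

```python
import heapq
from collections import namedtuple

# Hypothetical record layout; `lsn` is assumed to be ordered across all nodes.
LogRecord = namedtuple("LogRecord", "lsn node page key value")


def merge_logs(*node_logs):
    """Interleave per-node logs (each sorted by lsn) into one lsn-ordered stream."""
    return heapq.merge(*node_logs, key=lambda rec: rec.lsn)


def redo(pages, page_lsns, merged):
    """Reapply every record that is newer than the page image it targets."""
    for rec in merged:
        if rec.lsn <= page_lsns.get(rec.page, 0):
            continue  # the page image already reflects this update
        pages.setdefault(rec.page, {})[rec.key] = rec.value
        page_lsns[rec.page] = rec.lsn
    return pages
```

The merge is the availability cost: recovery must read every node's log rather than one. The performance benefit listed on the slide comes from nodes not having to ship whole pages back and forth during normal operation.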

Availability benchmarking 101
  Availability benchmarks quantify system behavior under failures, maintenance, recovery
  They require
    a realistic workload for the system
    quality of service metrics and tools to measure them
    fault-injection to simulate failures
    human operators to perform repairs
  [Figure: QoS plotted over time, marking normal behavior (99% conf.), the failure, QoS degradation, and Repair Time. From a presentation by Dave Patterson.]
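
The graph described above can be reduced to two small computations: a band of "normal" QoS estimated from a fault-free run, and the intervals during which QoS drops below that band after a fault is injected. The sketch below assumes the QoS metric is a list of evenly spaced samples (for example, requests per second) and approximates the 99% band as mean ± 2.576 standard deviations. Both the normal approximation and the function names are assumptions for illustration, not the benchmark's actual tooling.

```python
from statistics import mean, stdev

Z_99 = 2.576  # two-sided 99% quantile of the standard normal distribution


def normal_band(baseline):
    """Estimate the band of normal QoS from fault-free samples (assumed ~normal)."""
    m, s = mean(baseline), stdev(baseline)
    return m - Z_99 * s, m + Z_99 * s


def degraded_intervals(samples, band):
    """Return (start, end) sample-index pairs where QoS is below the normal band."""
    low, _ = band
    intervals, start = [], None
    for i, qos in enumerate(samples):
        if qos < low and start is None:
            start = i                      # degradation begins
        elif qos >= low and start is not None:
            intervals.append((start, i))   # QoS is back inside the band
            start = None
    if start is not None:
        intervals.append((start, len(samples)))  # never recovered in this run
    return intervals
```

With evenly spaced samples, the length of each interval times the sampling period approximates the time the system spent degraded, which is the quantity a tunable system's knobs would be labeled with.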

Status
  Getting familiar with Berkeley DB
    implemented TPC-B
    looking through the source code
  Combing through shared-disk DB research literature
  Identifying availability/performance tradeoffs
    others will appear during implementation