Cluster-Based Scalable Network Services. Authors: Armando Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, Paul Gauthier.

Presentation transcript:

Cluster-Based Scalable Network Services
Authors: Armando Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, Paul Gauthier
Presenter: Kang Cao

Overview
- Introduction
- Cluster-Based Scalable Service Architecture
- Service Implementation
- Measurements
- Discussion
- Conclusion

Introduction
- Goal
- Advantages of clusters
- Challenges of cluster computing
- BASE semantics

Goal
- Scalability
  - Keep the same per-user cost as load increases
- Availability
  - Run 24 hours a day, 7 days a week
- Cost effectiveness

Advantages
- Scalability
  - Clusters are well suited to Internet service workloads
  - Incremental scalability
- High availability
- Commodity building blocks
  - Cheap commodity PCs
  - Get service quickly and cheaply

Challenges
- Administration
- Component vs. system replication
- Partial failures
- Shared state

BASE Semantics
In contrast to ACID (atomicity, consistency, isolation, durability):
- Stale data
- Soft state
- Approximate answers

Cluster-Based Scalable Service Architecture
- Layered architecture
- Separate network services from their implementation
- Stateless workers

Cluster-Based Scalable Service Architecture
- SNS
- TACC
- Service

Scalable Network Service (SNS)
- Incremental and absolute scalability
- Worker load balancing and overflow management
- Front-end availability, fault tolerance mechanisms
- System monitoring and logging

SNS
[Figure: SNS architecture diagram. Labels: SNS Manager, Front End, Worker, Worker Driver, MS, $, Internal Network, Internet]

Load Balancing
- Centralized load balancing
- Easy to implement

How to Handle Bursts
- The cluster has an overflow pool
- The manager can spawn workers on overflow machines on demand
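
A minimal sketch of the burst handling described above. The `Manager` class, the queue-length threshold, and the recruitment policy are illustrative assumptions, not the SNS implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Manager:
    """Hypothetical manager that recruits overflow machines during bursts."""
    dedicated: List[str]               # machines that always run workers
    overflow: List[str]                # spare machines used only under load
    max_queue_per_worker: int = 10     # assumed threshold, not from the slides
    active: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Start with one worker on every dedicated machine.
        self.active = list(self.dedicated)

    def handle_load(self, queued_requests: int) -> List[str]:
        """Spawn workers on overflow machines until queues are acceptable."""
        while (self.overflow
               and queued_requests / max(len(self.active), 1)
               > self.max_queue_per_worker):
            self.active.append(self.overflow.pop(0))
        return self.active
```

With two dedicated and two overflow machines, a backlog of 35 requests would recruit both overflow machines under these assumed numbers, while a backlog of 15 would recruit none.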

Scalability
- Components are replicated
- The amount of additional resources required is a linear function of the increase in offered load
- Partition functionality between front end and workers
- Keep workers as simple as possible
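
The linear-scaling claim can be illustrated with a small calculation. This is a sketch under the assumption that every worker sustains the same request rate; the function name and the rates used with it are hypothetical.

```python
import math


def workers_needed(offered_load_rps: float, per_worker_rps: float) -> int:
    """Workers required if each worker sustains per_worker_rps requests/second."""
    return math.ceil(offered_load_rps / per_worker_rps)
```

Under this assumption, doubling the offered load doubles the number of workers required, which is what linear scaling means in practice.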

Fault Tolerance and Availability
- Process-peer fault tolerance
- Using soft state
- Timeouts as an additional fault-tolerance mechanism
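
A minimal sketch of soft state combined with timeouts: entries are kept only while they are refreshed, so a crashed component disappears without explicit cleanup. The `SoftStateTable` class and its TTL parameter are assumptions for illustration.

```python
import time
from typing import Dict, List, Optional


class SoftStateTable:
    """Hypothetical soft-state registry: entries vanish unless refreshed."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        self._last_seen: Dict[str, float] = {}

    def refresh(self, component: str, now: Optional[float] = None) -> None:
        """Record a periodic announcement from a component."""
        self._last_seen[component] = time.monotonic() if now is None else now

    def live(self, now: Optional[float] = None) -> List[str]:
        """Return components heard from within the TTL and drop the rest."""
        now = time.monotonic() if now is None else now
        self._last_seen = {c: t for c, t in self._last_seen.items()
                           if now - t <= self.ttl}
        return sorted(self._last_seen)
```

Because the table can always be rebuilt from the next round of announcements, losing it in a crash costs nothing but a short delay.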

TACC: Transformation, Aggregation, Caching, Customization
- API for composition of stateless data transformation and content aggregation modules
- Uniform caching of original, post-aggregation, and post-transformation data
- Transparent access to the customization database
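
A minimal sketch of how stateless modules might be composed, with caching of both original and transformed data and a per-user profile for customization. All names here (`Pipeline`, `shrink`, `upper`, the `max_len` profile key) are hypothetical and are not the TACC API; aggregation is left out for brevity.

```python
from typing import Callable, Dict, List, Tuple

# A stateless worker: (content, user profile) -> new content
Transform = Callable[[str, Dict[str, str]], str]


def shrink(content: str, profile: Dict[str, str]) -> str:
    """Toy distiller: truncate content to the user's preferred length."""
    return content[: int(profile.get("max_len", "20"))]


def upper(content: str, profile: Dict[str, str]) -> str:
    """Toy transformation: upper-case the content."""
    return content.upper()


class Pipeline:
    """Chains stateless workers and caches original and transformed data."""

    def __init__(self, fetch: Callable[[str], str],
                 stages: List[Transform]) -> None:
        self.fetch = fetch
        self.stages = stages
        self.cache: Dict[Tuple, str] = {}

    def get(self, url: str, profile: Dict[str, str]) -> str:
        key = (url, tuple(sorted(profile.items())))
        if key in self.cache:                 # post-transformation hit
            return self.cache[key]
        if (url,) not in self.cache:          # fetch the original only once
            self.cache[(url,)] = self.fetch(url)
        content = self.cache[(url,)]
        for stage in self.stages:             # profile customizes each stage
            content = stage(content, profile)
        self.cache[key] = content
        return content
```

Two users with different profiles requesting the same URL share one fetch of the original but get separately cached transformed results.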

TACC: A Programming Model for Internet Services
- Transformation
- Aggregation
- Caching
- Customization

Service Implementation
- Workers that present a human interface to what the TACC modules do, including device-specific presentation
- User interface to control the service
- Most of the work of building a service can be done at the service and TACC layers

Example: TranSend Model
[Figure: TranSend deployment diagram. Labels: pool, switch, workstations, Internet]

TranSend
- Front ends
- Load balancing manager
- User profile database
- Cache nodes
- Datatype-specific distillers
- Graphical monitor

Load Balancing Manager
- Client-side JavaScript support balances load across multiple front ends
- Centralized manager for internal load balancing

Load Balancing
- Components register with the manager
- A front end asks the manager for a worker when it has a task
- The manager locates a worker for the front end
- The manager may create a new distiller
- Workers report their load to the manager

Load Balancing
- The manager periodically broadcasts load information
- Front ends cache this information
- Front ends use the cached information to dispatch requests to workers
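
A minimal sketch of dispatch from cached load reports. The `FrontEnd` class and the least-loaded selection policy are assumptions for illustration; the slides do not say which selection policy the system uses.

```python
from typing import Dict, Optional


class FrontEnd:
    """Hypothetical front end that dispatches using cached load reports."""

    def __init__(self) -> None:
        self.cached_load: Dict[str, int] = {}   # worker name -> queue length

    def on_manager_broadcast(self, load_report: Dict[str, int]) -> None:
        """Replace the cached view with the manager's latest periodic report."""
        self.cached_load = dict(load_report)

    def pick_worker(self) -> Optional[str]:
        """Choose the worker with the shortest reported queue (assumed policy)."""
        if not self.cached_load:
            return None                          # no known worker yet
        worker = min(self.cached_load, key=self.cached_load.get)
        self.cached_load[worker] += 1            # local estimate until next report
        return worker
```

Incrementing the cached queue length after each dispatch keeps a front end from sending every request to the same worker between two broadcasts, since the cached figures are stale by design.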

Fault Tolerance and Crash Recovery
- Using BASE semantics simplifies crash recovery
- The manager reports worker failures to the front ends
- The manager detects and restarts a crashed front end
- The front end detects and restarts a crashed manager
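
A minimal sketch of the mutual watching described above: each peer tracks the other's heartbeats and restarts it after a period of silence. The `PeerWatcher` class, the heartbeat mechanism, and the timeout values are assumptions, not details taken from the slides.

```python
from typing import Callable, Optional


class PeerWatcher:
    """Watches one peer's heartbeats and restarts it after a timeout."""

    def __init__(self, peer_name: str, timeout: float,
                 restart: Callable[[str], None]) -> None:
        self.peer_name = peer_name
        self.timeout = timeout
        self.restart = restart                   # hypothetical restart hook
        self.last_heartbeat: Optional[float] = None

    def heartbeat(self, now: float) -> None:
        """Record that the peer was alive at time `now`."""
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        """Restart the peer if it has been silent too long; True if restarted."""
        if (self.last_heartbeat is not None
                and now - self.last_heartbeat > self.timeout):
            self.restart(self.peer_name)
            self.last_heartbeat = now            # fresh window for the new process
            return True
        return False
```

The manager would hold one watcher per front end, and each front end would hold a watcher for the manager, so neither side depends on a separate supervisor.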

Performance: Load balancing

Performance: Load balancing

Conclusions
- Layered architecture for cluster-based scalable network services
- The architecture is reusable
- Cluster-based value-added network services will become an important Internet-service paradigm

Performance: Scalability

Question 1
Why are cluster-based network services well suited to Internet services?

Answer
- The requirements are highly parallel (many independent simultaneous users)
- The grain size typically corresponds to at most a few CPU seconds on a commodity PC

Question 2
Why does the cluster-based network service use BASE semantics?

Answer
BASE semantics allow us to handle partial failure in clusters with less complexity and cost.

Question 3
What should be done when the overflow machines are being recruited unusually often?

Answer
It is time to add new machines.

Question 4
Does a front-end crash lose any information? If so, what kind of information is lost?

Answer
In-flight user requests are lost; the user needs to handle the timeout and resend the request.
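
A minimal sketch of the client-side behavior this answer implies: resend a request when an attempt times out. The `send_with_retry` helper and its `send` callback are hypothetical, and the sketch assumes requests are safe to repeat.

```python
from typing import Callable, Optional


def send_with_retry(send: Callable[[str], str], request: str,
                    max_attempts: int = 3) -> str:
    """Resend a request when an attempt times out, up to max_attempts times."""
    last_error: Optional[TimeoutError] = None
    for _ in range(max_attempts):
        try:
            return send(request)
        except TimeoutError as err:      # e.g. the front end crashed mid-request
            last_error = err
    raise TimeoutError(f"gave up after {max_attempts} attempts") from last_error
```

Since the front end keeps no hard state about a request, a retry that reaches a restarted or different front end can be served as if it were new.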