 What is Cluster Computing  Benefits of Cluster Computing  Types of Cluster’s  Cluster Components  Cluster Applications  Performance Impacts and.

Slides:



Advertisements
Similar presentations
Distributed Data Processing
Advertisements

Welcome to Middleware Joseph Amrithraj
Distributed Processing, Client/Server and Clusters
2. Computer Clusters for Scalable Parallel Computing
Dinker Batra CLUSTERING Categories of Clusters. Dinker Batra Introduction A computer cluster is a group of linked computers, working together closely.
Chapter 3 Internet. Physical Components of the Internet Servers Networks Routers.
NETWORK LOAD BALANCING NLB.  Network Load Balancing (NLB) is a Clustering Technology.  Windows Based. (windows server).  To scale performance, Network.
Distributed Processing, Client/Server, and Clusters
Chapter 16 Client/Server Computing Patricia Roy Manatee Community College, Venice, FL ©2008, Prentice Hall Operating Systems: Internals and Design Principles,
Web Server Hardware and Software
Cloud Computing Ed Lazowska Bill & Melinda Gates Chair in Computer Science & Engineering University of Washington August 2010.
Reliability Week 11 - Lecture 2. What do we mean by reliability? Correctness – system/application does what it has to do correctly. Availability – Be.
OCT1 Principles From Chapter One of “Distributed Systems Concepts and Design”
Chapter 9: Moving to Design
16: Distributed Systems1 DISTRIBUTED SYSTEM STRUCTURES NETWORK OPERATING SYSTEMS The users are aware of the physical structure of the network. Each site.
DISTRIBUTED COMPUTING
Introduction to client/server architecture
Data Warehousing: Defined and Its Applications Pete Johnson April 2002.
 The Open Systems Interconnection model (OSI model) is a product of the Open Systems Interconnection effort at the International Organization for Standardization.
© 2007 Cisco Systems, Inc. All rights reserved.Cisco Public 1 Version 4.0 Communicating over the Network Network Fundamentals – Chapter 2.
Lecture slides prepared for “Business Data Communications”, 7/e, by William Stallings and Tom Case, Chapter 8 “TCP/IP”.
Network Topologies.
Data Communications and Networks
CLUSTER COMPUTING Prepared by: Kalpesh Sindha (ITSNS)
Server Load Balancing. Introduction Why is load balancing of servers needed? If there is only one web server responding to all the incoming HTTP requests.
Research on cloud computing application in the peer-to-peer based video-on-demand systems Speaker : 吳靖緯 MA0G rd International Workshop.
Chapter 4. After completion of this chapter, you should be able to: Explain “what is the Internet? And how we connect to the Internet using an ISP. Explain.
Input/OUTPUT [I/O Module structure].
INSTALLING MICROSOFT EXCHANGE SERVER 2003 CLUSTERS AND FRONT-END AND BACK ‑ END SERVERS Chapter 4.
 Introduction to Operating System Introduction to Operating System  Types Of An Operating System Types Of An Operating System  Single User Single User.
CLUSTER COMPUTING STIMI K.O. ROLL NO:53 MCA B-5. INTRODUCTION  A computer cluster is a group of tightly coupled computers that work together closely.
HOW WEB SERVER WORKS? By- PUSHPENDU MONDAL RAJAT CHAUHAN RAHUL YADAV RANJIT MEENA RAHUL TYAGI.
MediaGrid Processing Framework 2009 February 19 Jason Danielson.
Scalable Web Server on Heterogeneous Cluster CHEN Ge.
COP 4930 Computer Network Projects Summer C 2004 Prof. Roy B. Levow Lecture 3.
Quality of System requirements 1 Performance The performance of a Web service and therefore Solution 2 involves the speed that a request can be processed.
OS Services And Networking Support Juan Wang Qi Pan Department of Computer Science Southeastern University August 1999.
PARALLEL COMPUTING overview What is Parallel Computing? Traditionally, software has been written for serial computation: To be run on a single computer.
CLUSTER COMPUTING TECHNOLOGY BY-1.SACHIN YADAV 2.MADHAV SHINDE SECTION-3.
9 Systems Analysis and Design in a Changing World, Fourth Edition.
What is SAM-Grid? Job Handling Data Handling Monitoring and Information.
11 CLUSTERING AND AVAILABILITY Chapter 11. Chapter 11: CLUSTERING AND AVAILABILITY2 OVERVIEW  Describe the clustering capabilities of Microsoft Windows.
Chapter 24 Transport Control Protocol (TCP) Layer 4 protocol Responsible for reliable end-to-end transmission Provides illusion of reliable network to.
1 OSI and TCP/IP Models. 2 TCP/IP Encapsulation (Packet) (Frame)
Architecture View Models A model is a complete, simplified description of a system from a particular perspective or viewpoint. There is no single view.
Distributed Computing Systems CSCI 4780/6780. Scalability ConceptExample Centralized servicesA single server for all users Centralized dataA single on-line.
Data Communications and Networks Chapter 9 – Distributed Systems ICT-BVF8.1- Data Communications and Network Trainer: Dr. Abbes Sebihi.
Chapter 3 System Buses.  Hardwired systems are inflexible  General purpose hardware can do different tasks, given correct control signals  Instead.
Em Spatiotemporal Database Laboratory Pusan National University File Processing : Database Management System Architecture 2004, Spring Pusan National University.
Cluster computing. 1.What is cluster computing? 2.Need of cluster computing. 3.Architecture 4.Applications of cluster computing 5.Advantages of cluster.
TCP/IP1 Address Resolution Protocol Internet uses IP address to recognize a computer. But IP address needs to be translated to physical address (NIC).
The Anatomy of a Large-Scale Hypertextual Web Search Engine S. Brin and L. Page, Computer Networks and ISDN Systems, Vol. 30, No. 1-7, pages , April.
1 Chapter 1 Basic Structures Of Computers. Computer : Introduction A computer is an electronic machine,devised for performing calculations and controlling.
Chapter 16 Client/Server Computing Dave Bremer Otago Polytechnic, N.Z. ©2008, Prentice Hall Operating Systems: Internals and Design Principles, 6/E William.
Cloud Computing Ed Lazowska Bill & Melinda Gates Chair in Computer Science & Engineering University of Washington August 2012.
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING CLOUD COMPUTING
Clouds , Grids and Clusters
Advanced Topics in Concurrency and Reactive Programming: Case Study – Google Cluster Majeed Kassis.
CLUSTER COMPUTING Presented By, Navaneeth.C.Mouly 1AY05IS037
Introduction to Operating System (OS)
Cloud Computing Ed Lazowska August 2011 Bill & Melinda Gates Chair in
Introduction to client/server architecture
University of Technology
TASK 4 Guideline.
Chapter 16: Distributed System Structures
Web Server Administration
Multiple Processor Systems
CLUSTER COMPUTING.
MAINTAINING SERVER AVAILIBILITY
Database System Architectures
Presentation transcript:

 What is Cluster Computing  Benefits of Cluster Computing  Types of Cluster’s  Cluster Components  Cluster Applications  Performance Impacts and Care about’s  Cluster Computing Examples  Conclusion  References

 A computer cluster is a group of linked computers, working together closely so that in many respects they form a single computer. The components of a cluster are commonly, but not always, connected to each other through fast local area networks.  Cluster computing is best characterized as the integration of a number of off-the-shelf commodity computers and resources integrated through hardware, networks, and software to behave as a single computer. Fig: Basic layout of cluster computing

 The main benefits of clusters are scalability, availability, and performance.  Scaling the cluster's processing power is achieved by simply adding additional nodes to the cluster  Availability within the cluster is assured as nodes within the cluster provide backup to each other in the event of a failure.  In summary, clusters provide:  Scalable capacity for compute, data, and transaction intensive applications, including support of mixed workloads.  Ability to handle unexpected peaks in workload.  Central system management of a single systems image.  24 x 7 availability.

There are several different varieties of computer clusters, each offering different advantages to the user. These varieties are: 1.High Availability Clusters 2.Load-balancing Clusters 3.Cluster Aware and Cluster Unaware applications

 This type of cluster distributes incoming requests for resources or content among multiple nodes running the same programs or having the same content.  If a node fails, requests are redistributed between the remaining available nodes Fig:Load-balancing clusters

 Cluster Aware applications know about the existence of other nodes and are able to communicate with them.  Cluster-unaware applications do not know if they are running in a cluster or on a single node.

 Application: It includes all the various applications that are going on for a particular group.

 Middleware: o These are software packages which acts as interface between user and operating system for the cluster computing. o Middleware provides various services required by an application to function correctly.  Operating System: Clusters can be supported by various operating systems which includes Windows, Linux etc.  Interconnect: Interconnection between the various nodes of the cluster system can be done using 10GbE(Gigabyte Ethernet), Myrinet etc.  Nodes: Nodes of the cluster system implies about the different computers that are connected. All of these processors can be of Intel or AMD 32 or 64 bit. Contd…..

Cluster nodes have one master and other nodes as slaves. Master node is responsible for running the file system and also serves as the key system for clustering middleware to route processes, duties, and monitor the health and status of each slave node. Slave node provides the cluster a computing and data storage capability.

There are four primary categories of applications that use parallel clusters they are: 1.Compute Intensive Application:  Compute intensive is a term that applies to any computer application that demands a lot of computation cycles. 2. Data or I/O Intensive Applications:  Data intensive is a term that applies to any application that has high demands of attached storage facilities. 3.. Transaction Intensive Applications:  Transaction intensive is a term that applies to any application that has a high- level of interactive transactions between an application resource and the cluster resources

 Throughput:  Data throughput begins with a calculation of a theoretical maximum throughput and concludes with effective throughput.  The effective throughput available between nodes will always be less than the theoretical maximum  Throughput for cluster nodes is based on many factors, including the following: o Total number of nodes running o Switch architectures o Forwarding methodologies o Queuing methodologies o Buffering depth and allocations o Noise and errors on the cable plant  Slow Start:  In a busy network, the sudden appearance of a large amount of new traffic could exacerbate any existing congestion.

 Each time an acknowledgment is received, the amount of data the device can send is increased by the size of another full-sized segment.  Thus, the device “starts slow” in terms of how much data it can send, with the amount it sends increasing until either the full window size is reached or congestion is detected on the link.  Congestion Avoidance:  When potential congestion is detected on a TCP link, a device responds by throttling back the rate at which it sends segments.  The device then uses the Slow Start algorithm, described above, to gradually increase the transmission rate back up again to try to maximize throughput without congestion occurring again. Contd…..

 A typical Google search consists of the following operations: 1.An Internet user enters a query at the Google webpage 2.The web browser searches for the Internet Protocol(IP) address via the Domain Name Server (DNS). 3.Google uses a DNS-based load balancing system that maps the query to a cluster that is geographically nearest to the user so as to minimize network communication delay time. The IP address of the selected cluster is returned. 4.The web browser then sends the search request in Hypertext Transport Protocol(HTTP) format to the selected cluster at the specified IP address.

5. The selected cluster then processes the query locally. 6. A hardware-based load balancer in the cluster monitors the available set of Google Web Servers (GWSs) in the cluster and distributes the requests evenly within the cluster. 7. A GWS machine receives the request, coordinates the query execution and sends the search result back to the user’s browser. Fig: Google Web Server Contd….

 There are two phases in which search takes place:  The first phase of query execution involves index servers consulting an inverted index that match each query keyword to a matching list of documents.  In the second phase, document servers fetch each document from disk to extract the title and the keyword-in-context portion of the document.  In addition to the two phases, the GWS also activates the spell checker and the ad server.  The spell checker verifies that the spelling of the query keywords is correct, while the ad server generate advertisements that relate to the query and may therefore interest the user. Contd….

 The Scientific Computing and Imaging(SCI) Institute at University of Utah has explored cluster-based scientific visualization using a 32-node visualization cluster composed of commodity hardware components connected with a high- speed network  The OpenGL scientific visualization tool Simian has been modified to create a cluster-aware version of Simian that supports parallelization by making explicit use of remote cluster nodes through a message-passing interface(MPI).  Normally, Simian uses a swapping mechanism to manage datasets that are too large to load into the available texture memory, resulting in low performance and interactivity.  The “divide-and-conquer” technique first decomposes the dataset into sub- volumes before distributing the sub-volumes to multiple remote cluster nodes.

 Each node is then responsible for rendering its sub-volume using the locally available graphics hardware.  The individual results are finally combined using a binary-swap compositing algorithm to generate the final image. Fig: Visualization of Fire-spread Datasets Contd…..

 Solve parallel processing paradox  Offer incremental growth and matches with funding pattern  High-performance cluster computing is enabling a new class of computationally intensive applications that are solving problems that were previously cost prohibitive for many enterprises.  cluster architectures also can lead to more reliable computer systems through redundancy.

M.Baker, A.Apon, R.Buyya, H.Jin, “Cluster Computing and Applications”, Encyclopedia of Computer Science and Technology