High Availability Clusters in Linux Sulamita Garcia EDS Unix Specialist

Slides:



Advertisements
Similar presentations
Express5800/ft series servers Product Information Fault-Tolerant General Purpose Servers.
Advertisements

The Quantum Chromodynamics Grid James Perry, Andrew Jackson, Matthew Egbert, Stephen Booth, Lorna Smith EPCC, The University Of Edinburgh.
Chapter 5: Server Hardware and Availability. Hardware Reliability and LAN The more reliable a component, the more expensive it is. Server hardware is.
Business Continuity and DR, A Practical Implementation Mich Talebzadeh, Consultant, Deutsche Bank
Dr. Zahid Anwar. Simplified Architecture of Linux Cluster Simplified Architecture of a Single Computer Simplified architecture of an enterprise cluster.
CS526 Dr.Chow1 HIGH AVAILABILITY LINUX VIRTUAL SERVER By P. Jaya Sunderam and Ankur Deshmukh.
Lesson 20 – OTHER WINDOWS 2000 SERVER SERVICES. DHCP server DNS RAS and RRAS Internet Information Server Cluster services Windows terminal services OVERVIEW.
Lesson 1: Configuring Network Load Balancing
Session 3 Windows Platform Dina Alkhoudari. Learning Objectives Understanding Server Storage Technologies Direct Attached Storage DAS Network-Attached.
Microsoft Load Balancing and Clustering. Outline Introduction Load balancing Clustering.
Hands-On Microsoft Windows Server 2008 Chapter 8 Managing Windows Server 2008 Network Services.
Take An Internal Look at Hadoop Hairong Kuang Grid Team, Yahoo! Inc
10/02/2004ELFms meeting1 Linux Virtual Server Miroslav Siket FIO-FS.
Windows Server MIS 424 Professor Sandvig. Overview Role of servers Performance Requirements Server Hardware Software Windows Server IIS.
Server Load Balancing. Introduction Why is load balancing of servers needed? If there is only one web server responding to all the incoming HTTP requests.
Module 13: Network Load Balancing Fundamentals. Server Availability and Scalability Overview Windows Network Load Balancing Configuring Windows Network.
Guide to Linux Installation and Administration, 2e 1 Chapter 9 Preparing for Emergencies.
INSTALLING MICROSOFT EXCHANGE SERVER 2003 CLUSTERS AND FRONT-END AND BACK ‑ END SERVERS Chapter 4.
Chapter 8 Implementing Disaster Recovery and High Availability Hands-On Virtual Computing.
CH2 System models.
FailSafe SGI’s High Availability Solution Mayank Vasa MTS, Linux FailSafe Gatekeeper
 High-Availability Cluster with Linux-HA Matt Varnell Cameron Adkins Jeremy Landes.
 Load balancing is the process of distributing a workload evenly throughout a group or cluster of computers to maximize throughput.  This means that.
11 CLUSTERING AND AVAILABILITY Chapter 11. Chapter 11: CLUSTERING AND AVAILABILITY2 OVERVIEW  Describe the clustering capabilities of Microsoft Windows.
High Availability in DB2 Nishant Sinha
70-412: Configuring Advanced Windows Server 2012 services
WINDOWS SERVER 2003 Genetic Computer School Lesson 12 Fault Tolerance.
Install, configure and test ICT Networks
1 CEG 2400 Fall 2012 Network Servers. 2 Network Servers Critical Network servers – Contain redundant components Power supplies Fans Memory CPU Hard Drives.
High-Availability MySQL with DR:BD and Heartbeat: MTV Japan mobile services ©2008 MTV Networks Japan K.K.
DB Questions and Answers open session (comments during session) WLCG Collaboration Workshop, CERN Geneva, 24 of April 2008.
Presented by Deepak Varghese Reg No: Introduction Application S/W for server load balancing Many client requests make server congestion Distribute.
BIG DATA/ Hadoop Interview Questions.
Step-by-Step Guide to Asynchronous Data (File) Replication (File Based) over a WAN Supported by Open-E ® DSS™ Software Version: DSS ver up85 Presentation.
MySQL HA An overview Kris Buytaert. ● Senior Linux and Open Source ● „Infrastructure Architect“ ● I don't remember when I started.
Architecting Enterprise Workloads on AWS Mike Pfeiffer.
rain technology (redundant array of independent nodes)
70-293: MCSE Guide to Planning a Microsoft Windows Server 2003 Network, Enhanced Chapter 12: Planning and Implementing Server Availability and Scalability.
Consulting Services JobScheduler Architecture Decision Template
Scaling Network Load Balancing Clusters
REPLICATION & LOAD BALANCING
Bentley Systems, Incorporated
Chapter Objectives In this chapter, you will learn:
High Availability Linux (HA Linux)
AlwaysOn Mirroring, Clustering
Global Catalog and Flexible Single Master Operations (FSMO) Roles
Consulting Services JobScheduler Architecture Decision Template
Troubleshooting Network Communications
Network Load Balancing
HP ArcSight ESM 6.8c HA Fail Over Illustrated
Cluster Communications
Chapter 15: Networking Services Design Optimization
Some bits on how it works
NETWORKING TECHNOLOGIES
Lec3: Network Management
Storage Virtualization
Welcome To : Group 1 VC Presentation
Client-Server Interaction
TASK 4 Guideline.
Replication Middleware for Cloud Based Storage Service
Unit 27: Network Operating Systems
DHCP, DNS, Client Connection, Assignment 1 1.3
An Introduction to Computer Networking
Fault Tolerance Distributed Web-based Systems
Internet Protocols IP: Internet Protocol
IST346: Scalability.
Global Catalog and Flexible Single Master Operations (FSMO) Roles
APACHE WEB SERVER.
Computer Networks Presentation
Presentation transcript:

High Availability Clusters in Linux Sulamita Garcia EDS Unix Specialist

What are clusters ● A set of computers working as one ● High Performance Computing – Super computers ● Load Balance – Easy way to improve responsiveness: less delay ● High Availability – Always responding

Why High Availability ● We all depend on digital systems – From light to banks, any stop is a nightmare – Even communication can cost: clients, time lines, productivity ● It is a plus in services – Security also means availability and failure recover

Failures ● There is always a possibility of error ● Failure: physical - electrical or mechanical ● Error: a failure which affects the data, changing a value ● Fault: a failure causes a crash or freezes the system

High Availability ● Systems always online ● Classified by number of 9s – 99,99% – 99,999% <- majority – 99,9999%... – Suppliers always try improve this number ● Availability == 1 is hypothetical – there is always a chance of failure

Ways to HA ● Fault tolerance software – Avoiding failures to become errors and errors become defects – Complex and heavy ● Hardware – Can perform many tasks – Very expansive

High Availability ● Work with fault possibility ● Redundancy by hardware and control by software ● Usual hardware ● Machines recover themselves automatically

Identifying the environment ● A set of resources to take care of ● Tests to be run frequently ● Actions to run if these tests to fail ● Tools to check and manage the environment

Resources ● A web server ● A link ● Network card state ● A storage unit

Actions or tests ● Reload the service ● Reboot the machine ● fsck the filesystem ● Configuration of alternative routes ● Notification to admin by pager, mail

High Availability - Contingency ● Raid ● Redundant energy font ● Two Internet link ● Two network cards ● Data Replication ● Configuration replicating ● Replication of user information...

Replication data ● Depends on data ● Does it change to much? ● Does it have much access? ● Can you loose some data? ● How much load the machine can have?

DRBD ● Data replication block device – Block replication: don't understand data ● Replicates partitions, but not files nor directories ● Mirroring: two nodes at time ● Data immediately replicated: highly reliable

DRBD ● Metadata: 128Mb for mirroring – Separate partition: indexing – Or shrink your data to grarantee 128Mb to drbd ● Separate link connection ● Any filesystem: ext3, reiserfs, etc ● Load: must be a well designed project ● Great project, few people: if you can contribute

Databases ● Drbd replicates blocks: don't know about registers ● A wrong register can crash entire database ● Most databases already has a way to do it – Oracle, LDAP(directory service - slurpd) ● Master -> slaves – Share the load

Databases ● Mysql: – Master: my.cnf [mysqld] log-bin server-id=1 – Slaves: mysql>CHANGE MASTER TO MASTER_HOST='SERVER', MASTER_USER='REPLICATION', MASTER_PASSWORD='MYPASS', MASTER_LOG_FILE='SERVER-BIN.00001', MASTER_LOG=211; mysql> START SLAVE;

Another way ● Rsync – Replicates files and directories – Updates – Scheduling – Low load – Permission: users, machines – Can loose some data

Monitoring ● Heartbeat – Check other machine's availability – Define primary/secondary – When a primary does not answer, the secondary takeovers the services and resources – Take care of to small times: machines can fight over services, that's not good

Mon ● Small ● Many monitors and alerts – Monitor check a service – Alert takes an action: mail, pager, command

Entire enviroment ● Data replication: – drbd/rsync/database schema ● Heartbeat: – Primary does not answer, secondary takes the ip service and starts the services -> services available ● Mon: – Some services are not responding: mail the admin, stop heartbeat

Load balance ● A controller divides the requests among a set of machines ● Share the load ● Easily recover if a box failed

Linux Virtual Server ● Load balance – Priority, last used, least used, round robin, or combined ● Controller: can respond or not ● Take care of data: services as http

Some simple load balance ● DNS: same address www IN A www IN A ● Iptables: several or a range –to: iptables -t nat -A OUTPUT -p tcp -d xxx -j DNAT --to x1 --to x2

Remember ● What you need to replicate ● How exactly should be it ● How much the load does the box support ● The network ● If schedule, how repeatedly will those periods occur

Links and Questions? ● ● ●