Team 5: Virtual Online Blackjack
17-654: Analysis of Software Artifacts / 18-841: Dependability Analysis of Middleware
Philip Bianco, John Robert, Vorachat Tamarree, Lutz Wrage, Gene Wilson



Team Members: Phil Bianco, John Robert, Vorachat Tamarree, Lutz Wrage, Gene Wilson

Virtual Online Blackjack
 An interactive client-server application that allows multiple players to play blackjack in a virtual casino.
 The server performs all the functions of a dealer in a Las Vegas casino, including:
– Selling chips to the players
– Taking bets
– Dealing the initial hand to each player
– Presenting options to the player (Hit or Stay)
– Officiating the game
 This is an interesting application because:
– Multiple server-side elements (Casino Floor, Bank, Table)
– Clear fault-tolerance and performance requirements
– An all-Java solution
 The application uses the Sun Microsystems IDL ORB for the following reasons:
– The price was certainly well within our budget
– The team wanted to gain experience with CORBA

Baseline Architecture

Fault Tolerance Goals
 Fault-tolerance goals:
– The client automatically connects to a backup casino server.
– A new backup is automatically started.
– Minimize transient state and store all state in the database.
 Replicated components:
– Casino Server (IDL interfaces for Casino Floor, Bank and Table)
 All state is stored in a magnificently designed database.
– Installed MS SQL Server on a PC in the Cave
– Shared the server with at least one other team
 Sacred functions (assumed not to fail):
– Database
– Naming Service
– Players (clients)

Fault-Tolerant Elements
 Replication Manager
– Pings all Casino servers once per second to detect faults
– Automatically starts new servers to maintain the target number of Casino servers (2)
– User interface enables injecting faults (killing a server)
– Highly configurable through configuration files
 Proxy classes for all communication
– Isolate most fault-tolerance functions from the application
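The Replication Manager's health-check sweep can be sketched roughly as follows. This is a minimal illustration, not the team's code: the real manager pings CORBA object references over the ORB and launches new server processes, while here a hypothetical `Server` class with an `alive` flag stands in for both.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the Replication Manager's once-per-second sweep:
// detect dead Casino servers and restart replacements until the target
// replica count (2) is restored.
public class ReplicationManagerSketch {
    static final int TARGET_REPLICAS = 2;   // number of Casino servers to maintain

    static class Server {
        boolean alive = true;
        boolean ping() { return alive; }    // stands in for a remote ping over the ORB
    }

    final List<Server> servers = new ArrayList<>();

    // One sweep: drop servers that fail the ping, then start
    // replacements until the target replica count is restored.
    void sweep() {
        servers.removeIf(s -> !s.ping());
        while (servers.size() < TARGET_REPLICAS) {
            servers.add(new Server());      // stands in for launching a new Casino server
        }
    }
}
```

In the real system the fault-injection UI simply kills a server process; in the sketch, flipping a server's `alive` flag plays the same role, and the next sweep restores the replica count.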

FT-Baseline Architecture


Mechanisms for Fail-Over
 How do you accomplish fail-over?
 How do you detect a fault?
 Which exceptions do you handle (mention the names)?
 What do you do upon catching one of these exceptions?
 When do you obtain the names of the server references? What if you run out of live references?
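The client-side fail-over pattern the questions above point at can be sketched as below. This is an assumption-laden illustration: in the real system the caught exceptions would be CORBA system exceptions such as `COMM_FAILURE`, `TRANSIENT`, or `OBJECT_NOT_EXIST`, and the references would come from the CORBA Naming Service; here a hypothetical `ServerDownException` and `NameService` interface stand in for them.

```java
import java.util.List;

// Hypothetical sketch of a client-side fail-over proxy: try each known
// server reference in turn, fall through to the next replica on a
// communication failure, and re-resolve from the naming service when
// every cached reference is dead.
public class FailoverProxy {
    static class ServerDownException extends RuntimeException {}

    interface Casino { String placeBet(int amount); }        // stand-in IDL interface
    interface NameService { List<Casino> lookupAll(); }      // stand-in for the Naming Service

    private final NameService names;
    private List<Casino> refs;          // server references obtained at startup

    FailoverProxy(NameService names) {
        this.names = names;
        this.refs = names.lookupAll();  // obtain the references up front
    }

    String placeBet(int amount) {
        for (int attempt = 0; attempt < 2; attempt++) {
            for (Casino c : refs) {
                try {
                    return c.placeBet(amount);
                } catch (ServerDownException e) {
                    // server faulted mid-call: move on to the next replica
                }
            }
            refs = names.lookupAll();   // ran out of live references: re-resolve
        }
        throw new ServerDownException();
    }
}
```

Because the proxy class hides this retry loop, the application code above it never sees the fault: a call either returns from some live replica or fails only after the refreshed reference list is also exhausted.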

Local Method Call: Fault-Free, Standard Garbage Collection

Local Method Call with Fail-Over to Remote Server

Remote Method Call: Fault-Free, Standard GC

Remote Method Call with Fail-Over to Local Server

Remote Method Call with Server Running Incremental GC, Fault-Free

Remote Method Call with Client and Server Running Incremental GC

Timing of Fail-Over and Activation Time During Fail-Over

Active Replication Timing Data

Fail-Over Measurements
 Show your graphs, one by one, over the next few slides
– Place one graph per slide
– Select at most one graph (out of the entire set that you have)
– Pick the most interesting graph:
Showing at least fail-overs
Showing the “spike” of the Naming Service (or Replication Manager) communication

RT-FT-Baseline Architecture
 Should describe your strategy for reducing the fail-over time, in the interests of obtaining “real-time” bounded behavior under faults
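One tactic the team lists among its accomplishments, name caching, directly targets the Naming Service "spike" during fail-over and can be sketched as follows. This is a hedged illustration: the `Resolver` interface is a hypothetical stand-in for `NamingContext.resolve`, and the instrumentation counter is added only to make the saved lookups visible.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of name caching to bound fail-over time: resolve each
// server name through the (slow) Naming Service once, then serve later
// fail-overs from a local cache so the lookup "spike" is avoided.
public class NameCache {
    interface Resolver { Object resolve(String name); }  // stand-in for NamingContext.resolve

    private final Resolver namingService;
    private final Map<String, Object> cache = new HashMap<>();
    int remoteLookups = 0;                               // instrumentation for measurements

    NameCache(Resolver namingService) { this.namingService = namingService; }

    Object resolve(String name) {
        return cache.computeIfAbsent(name, n -> {
            remoteLookups++;                             // only counts trips to the Naming Service
            return namingService.resolve(n);
        });
    }

    // Evict a cached reference that turned out to be dead, forcing one
    // fresh lookup on the next resolve.
    void invalidate(String name) { cache.remove(name); }
}
```

The cost profile this produces matches the slide's goal: the expensive remote lookup is paid once per name (and once more after an invalidation), instead of on every fail-over.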

Bounded “Real-Time” Fail-Over Measurements
 Show your RT-FT-baseline graphs
– Select at most one graph (out of the entire set that you have)
– Pick the most interesting graph:
Showing at least fail-overs
Showing the “spike” of the Naming Service (or Replication Manager) communication being mitigated
Include on the slide the percentage by which you’ve reduced the “spike”
Tell us what the bounds for the fail-over now are

RT-FT-Performance Strategy
 Used an active replication strategy to address performance
– The proxy classes handle everything.
– During player startup the AR proxies get references to all running replicas.
– Each method call is sent to all replicas. The sequence numbers for one call are identical across replicas.
 What mechanisms did you need in addition to what your system has?
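The active-replication proxy described above can be sketched as below. It is a minimal illustration under stated assumptions: the `Replica` interface and the "return the first reply" policy are hypothetical simplifications; the essential property from the slide is that one logical call carries the same sequence number at every replica.

```java
import java.util.List;

// Hypothetical sketch of the AR proxy: stamp every method call with a
// sequence number and send it to all replicas, so each replica sees
// identical (seq, method) pairs and any reply can serve the player.
public class ActiveReplicationProxy {
    interface Replica { String invoke(long seq, String method); }

    private final List<Replica> replicas;   // obtained at player startup
    private long nextSeq = 0;

    ActiveReplicationProxy(List<Replica> replicas) { this.replicas = replicas; }

    String invoke(String method) {
        long seq = nextSeq++;               // one sequence number per logical call
        String result = null;
        for (Replica r : replicas) {
            String reply = r.invoke(seq, method);  // same seq at every replica
            if (result == null) result = reply;    // keep the first reply
        }
        return result;
    }
}
```

This shows the trade-off the measurements slide asks about: every call costs one invocation per replica, but a replica crash needs no fail-over step at all, since the surviving replicas already processed the same sequence of calls.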

Performance Measurements
 Show your performance graphs for active replication or load balancing
– Select at most one graph (out of the entire set that you have)
– Pick the most interesting graph:
For load balancing, show the system performance under several clients (try to scale up to more than 20 clients)
For active replication, show what the fail-over times vs. run-time performance trade-offs are, as compared to cold passive replication

Other Features
 List other features that you used
– Used CVS throughout the project
– Moved to Ant (from makefiles) after the baseline
– Some use of a scripting language for automated players (clients)
– Gene’s tool to find current usage of the cluster machines
 Explored garbage collection
– Incremental GC
– Turned off
 How to design for testability (fault injection)
 This is where you get to show off about how you’ve gone the extra mile in this project!
– Performance at different times of the day
– Extensive use of configuration files enabled greater flexibility for implementation and testing

Insights from Measurements
 What insights did you gain from the three sets of measurements, and from analyzing the data?
– Java garbage collection is the dominant factor for performance:
Time doubles for remote clients.
Changing garbage collection impacts the performance by ???
– Replication trade-offs
– In the tested configuration, the impact of network latency on performance is negligible compared to the database access time.
– How did you use each set of insights in the next phases of the project?

Open Issues
 List any issues that you still need to resolve, and that you might want to see discussed openly
– Profiler testing with Java?
– Impact of other JVMs?
 If you had the time, what are the 2–3 additional features that you would have liked to have implemented for your system?
– Improve the user interface
– Examine the impact of security requirements on FT, RT and performance
 Analyzing why the performance edges up over time

Conclusions
 Lessons learned:
 Technical lessons:
– Fault tolerance rapidly increases the complexity of the system.
– Active replication is non-trivial.
– In Java applications, garbage collection is the largest performance bottleneck.
– Name-server lookups contributed a very minor amount of delay to fail-over recovery compared to state recovery from the database.
 Configuration issues for remote development:
– We did most of our development independently and remotely.
– Linux, CVS, ssh and AFS all made this much easier.
– Simple scripts can make life much easier: clusterload, project.env.
– E-mail, instant messaging, and shared file-system space help communication.
– Separate databases for each developer.
– Design still needs face-to-face meetings to be effective.

Conclusions
 Accomplishments:
– Met the objective to create a distributed middleware application and demonstrated improvements at each milestone.
– Our fault-tolerant design worked very well: fast, scalable, robust.
– Replication Manager with fault injection and a powerful user interface.
– The player manager could start and control many clients on multiple hosts at once.
– Active replication.
– Name caching.
– Automatic player script with Expect.
– Found the sources of the worst performance bottlenecks.
– 1.3 babies

Conclusions
 What would you do differently, if you could start the project from scratch now?
– Focus on state management earlier in the development.
– Design for active replication much earlier.
– Using the clients as callback servers is a very bad idea; it makes active replication and/or load balancing hard or impossible.
– Have root access to the development machines.
– Better configuration management, keeping better track of milestone versions.
– Better test plans.
– A little more structure to our team:
Better meeting scheduling.
Agendas for meetings.
Minutes for meetings.
Maybe a team leader, possibly as a rotating assignment (e.g. 3 weeks/person).