1 Building Reliable Web Services: Methodology, Composition, Modeling and Experiment Pat. P. W. Chan Department of Computer Science and Engineering The.

Presentation transcript:

1 Building Reliable Web Services: Methodology, Composition, Modeling and Experiment Pat. P. W. Chan Department of Computer Science and Engineering The Chinese University of Hong Kong 23 October 2007

2 Outline
- Introduction to Web Services
- Problem Statement
- Methodologies for Web Service Reliability
- New Reliable Web Service Paradigm
- Web Service Composition Algorithm
- Experimental Results and Discussion
- Conclusion

3 Introduction
- Service-oriented computing is becoming a reality.
- The problems of service dependability, security and timeliness are becoming critical.
- We propose experimental settings and offer a roadmap to dependable Web services.

4 Problem Statement
- Fault-tolerant techniques
  - Replication
  - Diversity
- Replication is an efficient way to build reliable systems through time or space redundancy.
  - Increases the availability of distributed systems
  - Key components are re-executed or replicated
  - Protects against hardware malfunctions or transient system faults
- Another efficient technique is design diversity.
  - Employs independently designed software systems or services built by different programming teams
  - Defends against permanent software design faults
- We focus on the analysis of replication techniques applied to Web services.
- A generic Web service system with both spatial and temporal replication is proposed and investigated.

5 Proposed Paradigm
Components:
- Replication Manager: RR algorithm / voting, WatchDog
- UDDI Registry
- WSDL
- Three Web service replicas, each with Web Service, IIS, Application, Database
- Client, Port, Application, Database
Workflow:
- Register
- Look up
- Get WSDL
- Invoke Web service
- Keep checking the availability of all the Web services
- If a Web service fails, update the list of available Web services
- Update the WSDL
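
The interplay between the watchdog and round-robin dispatch can be sketched roughly as follows. This is only an illustration under assumptions: the class `ReplicationManager`, its methods, and the replica names `ws0` to `ws2` are hypothetical and do not come from the slides, and real replicas would be remote Web service endpoints rather than local functions.

```python
from typing import Callable, Dict, List


class ReplicationManager:
    """Hypothetical round-robin dispatcher over replicated Web services."""

    def __init__(self, replicas: Dict[str, Callable[[str], str]]) -> None:
        self.replicas = replicas
        # Availability list that the watchdog keeps up to date.
        self.available: List[str] = list(replicas)
        self._next = 0

    def watchdog_check(self, is_alive: Callable[[str], bool]) -> None:
        """Poll every replica and rebuild the list of available ones."""
        self.available = [name for name in self.replicas if is_alive(name)]

    def invoke(self, request: str) -> str:
        """Forward the request to the next available replica in turn."""
        if not self.available:
            raise RuntimeError("no replica available")
        name = self.available[self._next % len(self.available)]
        self._next += 1
        return self.replicas[name](request)


# Stand-in replicas (assumption): each just tags the request with its name.
replicas = {f"ws{i}": (lambda req, i=i: f"ws{i}:{req}") for i in range(3)}
rm = ReplicationManager(replicas)
before_failure = [rm.invoke("route") for _ in range(3)]

# Simulate the watchdog noticing that ws1 has stopped responding.
rm.watchdog_check(lambda name: name != "ws1")
after_failure = [rm.invoke("route") for _ in range(2)]
```

After the simulated failure, requests are only routed to the replicas that remain in the availability list.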

6 Round Robin

7 Parallel N-Version Programming
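
As a rough illustration of the voting step in parallel N-version programming, the sketch below picks the output that a strict majority of versions agree on. The function name `majority_vote` and the strict-majority rule are assumptions made for this example; the slides do not specify which voting rule is used.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the output agreed on by a strict majority of versions.

    Returns None when no output reaches a majority (assumed policy).
    """
    if not outputs:
        return None
    value, votes = Counter(outputs).most_common(1)[0]
    return value if votes > len(outputs) / 2 else None


# Three versions answer the same request; one of them is faulty.
agreed = majority_vote(["route A", "route A", "route B"])
# No two versions agree, so the voter cannot decide.
undecided = majority_vote(["route A", "route B", "route C"])
```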

8 Recovery Block
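
A recovery block tries alternates one after another until one passes an acceptance test. The sketch below is a minimal version under assumptions: the names `recovery_block`, `primary`, and `alternate` are hypothetical, the alternates are stateless (so no checkpoint has to be restored between attempts), and an exception is treated like a failed acceptance test.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def recovery_block(
    alternates: Sequence[Callable[[], T]],
    acceptance_test: Callable[[T], bool],
) -> T:
    """Run alternates in order and return the first accepted result."""
    for alternate in alternates:
        try:
            result = alternate()
        except Exception:
            continue  # a crash counts as a failed attempt
        if acceptance_test(result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def primary() -> int:
    return -1  # faulty version: a negative price is not acceptable


def alternate() -> int:
    return 42


price = recovery_block([primary, alternate], lambda p: p >= 0)
```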

9 Experiments
- A series of experiments is designed and performed to evaluate the reliability of the Web service:
  - Spatial replication
  - Reboot
  - Retry

10 Testing system
- Best Route Finding.
- Provides traveling suggestions for users.
- Inputs: starting point and destination.
- The system needs to provide the best route and the price for the users.

11 System Architecture

12 Experimental Setup
- Examine the computation to communication ratio
- Examine the request frequency to limit the load of the server to 75%
- Fix the following parameters
  - Computation to communication ratio (e.g., 10:1)
  - Request frequency

13 Experimental Setup
Parameter                                Value
Communication time : Computation time    143:14 (10:1)
Request frequency                        1 request per min
Load                                     78.5%
Timeout period of retry                  1 min
Timeout for Web service in RM            1 s (Web service specific)
Polling frequency                        10 requests per min
Number of replicas                       5
Max number of retries                    5
Round-robin rate                         1 s

14 Experiment Parameters
- Fault mode
  - Temporary (fault probability: 0.01)
  - Permanent (fault probability: 0.001)
- Experiment time
  - 5 days (7200 requests)
- Measure:
  - Number of failures
  - Average response time (ms)
- Failure definition:
  - 5 retries are allowed. If the Web service still returns no correct result after 5 retries, the request is counted as a failure.
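
The failure definition above could be expressed as a small retry wrapper. This sketch rests on assumptions: `call_with_retry` is a hypothetical helper, a `None` return value stands for "no correct result", and the 5 retries are read as coming on top of the initial attempt (6 calls in total), which the slide does not state explicitly.

```python
from typing import Callable, Optional

MAX_RETRIES = 5  # value taken from the slide


def call_with_retry(
    invoke: Callable[[], Optional[str]],
    max_retries: int = MAX_RETRIES,
) -> Optional[str]:
    """Return the first correct result, or None if the request failed.

    Assumption: one initial attempt plus up to max_retries retries.
    """
    for _attempt in range(1 + max_retries):
        result = invoke()
        if result is not None:
            return result
    return None  # counted as a failure in the experiment
```

A request that only succeeds on its sixth call would still count as a success under this reading; one that never succeeds is recorded as a failure.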

15 Experimental Result with Round-robin (failures / response time in ms)
Experiments (columns, in order):
1. Single server
2. Single server with retry
3. Single server with reboot (continuous no response for 3 requests)
4. Single server with retry and reboot
5. Spatial Replication RR
6. Hybrid approach RR + Retry
7. Hybrid approach RR + Reboot
8. All round approach RR spatial + Retry (5 times) + reboots
Normal case: 0 / 183, 0 / 193, 0 / 190, 0 / 187, 0 / 188, 0 / 195, 0 / 193, 0 / 190
Temp: 705 / 1900 / / 2310 / / 1870 / / 1880 / 231
Perm: 6144 / / / --5 / / / / 1870 / 191

16 Experimental Result with N-Version (failures / response time in ms)
Experiments (columns, in order):
1. Single server
2. Single server with retry
3. Single server with reboot (continuous no response for 3 requests)
4. Single server with retry and reboot
5. Spatial Replication Voting
6. Hybrid approach Voting + Retry
7. Hybrid approach Voting + Reboot
8. All round approach Voting spatial + Retry (5 times) + reboots
Normal case: 0 / 183, 0 / 193, 0 / 190, 0 / 187, 0 / 189, 0 / 190, 0 / 188
Temp: 705 / 1900 / / 2310 / 2380 / / 1890 / 187
Perm: 6144 / / / --5 / / / / 1890 / 188

17 Experimental Result with Recovery Block (failures / response time in ms)
Experiments (columns, in order):
1. Single server
2. Single server with retry
3. Single server with reboot (continuous no response for 3 requests)
4. Single server with retry and reboot
5. Spatial Replication Voting
6. Hybrid approach Voting + Retry
7. Hybrid approach rollback + Reboot
8. All round approach rollback spatial + Retry (5 times) + reboots
Normal case: 0 / 183, 0 / 193, 0 / 190, 0 / 187, 0 / 191, 0 / 189, 0 / 193, 0 / 188
Temp: 705 / 1900 / / 2310 / 2380 / 2050 / 2030 / 2040 / 201
Perm: 6144 / / / --5 / / / / 2110 / 201

18 Web Service Composition Algorithm
- N-version programming
  - Reliable
  - Efficient
- Composition
  - WSDL
  - WSCI
- Verification
  - BPEL
  - Petri-Net

19 … … WSDL

20
<action name="ReceiveStartpointDest" role="tns:busAgent" operation="tns:BRF/shortestpath">
<action name="Receivecheckpoint" role="tns:busAgent" operation="tns:BRF/addCheckpoint">
…
WSCI

21

22 Web service composition
1. Output
2. Operation in WSDL
3. Find the output information in CP1 (Web service component)
4. If Input of the operation == required input CP1
5. Else search in the WSCI of CP1 to find action == operation
6. Get the previous action involved
7. Search in WSDL to find operation == action
8. If Input of the operation == required input CP4
   Else, continue until the root of the WSCI is reached
CP7 CP10
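
One possible reading of the steps above is a backward search: start from the operation that produces the required output, then walk back through the WSCI action order until an operation accepts the required input. The sketch below follows that reading under heavy assumptions. The WSDL and WSCI documents are reduced to plain Python structures, the function `compose` is hypothetical, and the message names (`startDest`, `route`, `routeWithCheckpoint`) are invented for illustration; only the operation names `shortestpath` and `addCheckpoint` appear in the slides.

```python
from typing import Dict, List, Optional, Tuple

# Simplified stand-ins (assumptions) for parsed documents:
# WSDL maps an operation name to its (input, output) messages;
# WSCI is the ordered list of actions, named by their operation.
Wsdl = Dict[str, Tuple[str, str]]
Wsci = List[str]


def compose(wsdl: Wsdl, wsci: Wsci, required_output: str,
            required_input: str) -> Optional[List[str]]:
    """Return the chain of operations from required input to output."""
    # Steps 1-3: find the operation that produces the required output.
    op = next((name for name, (_, out) in wsdl.items()
               if out == required_output), None)
    if op is None:
        return None
    chain = [op]
    # Step 4: stop if this operation already takes the required input.
    if wsdl[op][0] == required_input:
        return chain
    if op not in wsci:
        return None
    idx = wsci.index(op)
    # Steps 5-8: walk back through previous actions up to the WSCI root.
    while idx > 0:
        idx -= 1
        op = wsci[idx]
        if op not in wsdl:
            return None
        chain.insert(0, op)
        if wsdl[op][0] == required_input:
            return chain
    return None  # root reached without finding the required input


wsdl = {
    "shortestpath": ("startDest", "route"),
    "addCheckpoint": ("route", "routeWithCheckpoint"),
}
wsci = ["shortestpath", "addCheckpoint"]
plan = compose(wsdl, wsci, "routeWithCheckpoint", "startDest")
```

Here `plan` lists the operations in invocation order; a `None` result means no chain could be found before reaching the root of the WSCI.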

23

24

25 Petri-Net – Basic Activities

26 Petri-Net – Structure Activities

27 Composed Petri-Net

28

29

30

31

32

33

34 Petri-Net (Four identical replicas)

35 Petri-Net (N-version Web service with voting)

36 Petri-Net (Recovery Block)

37 Reliability Model
[Figure: Markov chain reliability model, panels (a) and (b). States: S, S-1, S-2, S-j, S-j-1, S-n, P1, P2, F. Transition labels: λ*, λ1, λ2, λN, μ*c2, μ1c2, μ2c2, (1-c1)μ*, (1-c1)μ1, (1-c1)μ2, (1-c2)μ1, (1-c2)μ2.]

38 Reliability Model
ID    Description                                        Value
λN    Network failure rate                               0.02
λ*    Web service failure rate                           0.025
λ1    Resource problem rate                              0.142
λ2    Entry point failure rate                           0.150
μ*    Web service repair rate                            0.286
μ1    Resource problem repair rate                       0.979
μ2    Entry point failure repair rate                    0.979
C1    Probability that the RM responds on time           0.9
C2    Probability that the server reboots successfully   0.9
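
To give a feel for the magnitude of these rates, the sketch below computes the steady-state availability of a single Web service in a plain two-state (up/down) Markov model, using only λ* and μ*. This is a deliberate simplification and not the multi-state model of the slides: it ignores the network, resource, and entry-point failures as well as the coverage probabilities C1 and C2, and the function name is hypothetical.

```python
LAMBDA_STAR = 0.025  # Web service failure rate (from the table)
MU_STAR = 0.286      # Web service repair rate (from the table)


def steady_state_availability(failure_rate: float, repair_rate: float) -> float:
    """Availability of a two-state up/down Markov model: mu / (lambda + mu)."""
    return repair_rate / (failure_rate + repair_rate)


availability = steady_state_availability(LAMBDA_STAR, MU_STAR)
print(f"Single-replica availability (two-state sketch): {availability:.4f}")
```

Under these assumptions a single unreplicated service is available roughly 92% of the time, which gives a baseline against which replication can be compared.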

39 Outcome (SHARPE)

40 Conclusion
- Surveyed replication and design diversity techniques for reliable services.
- Proposed a hybrid approach to improving the availability of Web services.
- Proposed a Web service composition algorithm and verified it with Petri-Net.
- Carried out a series of experiments to evaluate the availability and reliability of the proposed Web service system.

41 Q&A