Reliability on Web Services Pat Chan 31 Oct 2006.

Reliability on Web Services Pat Chan 31 Oct 2006

Outline  Introduction to Web Services  Problem Statement  Methodologies for Web Service Reliability  New Reliable Web Service Paradigm  Optimal Parameters  Experimental Results and Discussion  Conclusion  Future Directions

Introduction  Service-oriented computing is becoming a reality.  The problems of service dependability, security and timeliness are becoming critical.  We propose experimental settings and offer a roadmap to dependable Web services.

Problem Statement  Fault-tolerant techniques Replication Diversity  Replication is one of the efficient ways for providing reliable systems by time or space redundancy. Increasing the availability of distributed systems Key components are re-executed or replicated Protect against hardware malfunctions or transient system faults.  Another efficient technique is design diversity. By independently designing software systems or services with different programming teams, Resort in defending against permanent software design faults.  We focus on the analysis of the replication techniques when applied to Web services.  A generic Web service system with spatial as well as temporal replication is proposed and investigated.

Road Map for Research  Redundancy in time  Redundancy in space Sequentially Parallel Majority voting using N modular redundancy Diversified version of different services

Replication Manager Web service selection algorithm WatchDog UDDI Registry WSDL Web Service IIS Application Database Web Service IIS Application Database Web Service IIS Application Database Client Port Application Database 1.Create web services 2.Select primary web service (PWS) 3.Register 4. Look up 5. Get WSDL 6.Invoke web service 8..Keep check the availability of all the web. 9.If Web service failed, update the list of availability of Web services 7.Update the WSDL Proposed Paradigm

Round Robin Web Service IIS Application Database Web Service IIS Application Database Web Service IIS Application Database Web Service IIS Application Database

RM sends message to the Web Service Update the Web service availability list Do not get reply Map the new address to the WSDL after RR System Fail Get reply All Service failed Work Flow of the Replication Manager

Experiments  A series of experiments are designed and performed for evaluating the reliability of the Web service, Spatial replication00001111 Reboot00110011 Retry01010101 Spatial replication Reboot Retry (1,1,1) (0,1,1) (1,0,0) (0,0,0) (1,0,1) (1,1,0) (0,1,0) (0,0,1)

Parameters  Computational load  Communication  Request frequency  Load of the server  Retry frequency  Polling frequency  Reboot time

Fault Mode  Machine – reboot the server with probability  Network – Network Fault Injection Tool (NFIT)  Application – inject fault in the application  Failure definition:  5 retries are allowed. If there is still no correct result from the web service after 5 retries, it is considered as a failure.

Experimental Setup  Examine the computational to communication ratio  Examine the request frequency to limit the load of the server to 75%  Fix the following parameters Computational to communication ratio (e.g 10:1) Request frequency

Experimental Setup -- Overload prime(2, 12000) Communication time: Computational time 143:14 (10:1) Request frequency5 req/s (200ms) Load78.5% Timeout period of retry10s Timeout for web service in RM1s (web service specific) Polling frequency5 requests per min Number of replicas3 Max number of retries5 Round Robin rate1 s

Experiment Parameters  Fault mode Timing Calculation Temporary (fault probability: 0.01) Permanent (fault probability: 0.001)  Experiment time 30mins (9000 requests)  Measure: Number of failures Average response time (ms)

Experimental Result (failures / response time in ms) ExperimentsSingle serverSingle server with retry Single server with reboot (continues no response for 3 requests) Single server with retry and reboot Spatial Replication RR Hybrid approach RR+Retry Hybrid approach RR+ Reboot All round approach RR spatial + Retry (5 times) + reboots Normal case0 / 1570 / 156 0 / 155 0 / 156 0 / 157 Timing Temp93 / 1580 / 26991 / 1600 / 25689 / 1560 / 25885 / 1550 / 243 Calculation Temp 87 / 1570 / 25488 / 1570 / 26178 / 1550 / 24487 / 1550 / 231 Timing Perm8174 / --8362 / --2505 / --3 / 29387177 / --7284 / --76 / 1570 / 172 Calculation Perm 8256 / --8581 / --2782 / --2 / 28727529 / --7425 / --69 / 1580 / 176

Varying the parameters  Number of tries  Timeout period for retry in single server  Timeout period for retry in our paradigm  Polling frequency  Number of replicas  Load of server

Number of tries Number of failures in Temp Number of failures in Perm 09576 122 200 300 400 500

Timeout period for retry in single server Timeout period for retry (s) Number of failures in Temp Number of failures in Perm 0957265 227156 507314 606890 70189 8082 9011 1002 1200 1400 1600 1800

Timeout period for retry in single server # of failure Timeout period

Timeout period for retry in our paradigm Timeout period for retry (s) Number of failures in Temp Number of failures in Perm 0281 202 500 1000 2000

Polling frequency (number of requests per min) Number of failures in Temp Number of failures in Perm 007124 10811 2030 5012 1001 15213254 2011241023

Polling frequency # of failure Polling frequency

Vary the reboot time with different polling frequency # of failure Polling frequency Reboot time

Number of Replicas Number of replicasNumber of failures in TempNumber of failures in Perm No replica918152 22356 300 400

Load of Web Server Load of the web server (%) Number of failures in Temp Number of failures in Perm 7000 7500 8023 851014 90512528 9532143125 10087928845 11089978994

Summary of Parameters  Number of tries = 2  Timeout period for retry in single server = 10s  Timeout period for retry in our paradigm = 5s  Polling frequency = 10 request per min  Number of replicas = 3  Load of server = < 75%

Cross-continental  Experiments cross-continental (Berlin CUHK)  Setting:  UDDI: gaia  Replication Manager: gaia  Client: 141.20.33.17  Replica: PC89084, PC89083, PC89082

Experimental Result (failures / response time in ms) ExperimentsSingle server with retry Single server with reboot (continues no response for 3 requests) Single server with retry and reboot Spatial replication RR Hybrid approach RR+Retry Hybrid approach RR+ Reboot All round approach RR spatial + Retry (5 times) + reboots Normal case0 / 6720 / 673 0 / 671 0 / 6720 / 6730 / 671 Timing Temp 92 / 6740 / 77691 / 6730 / 77189 / 6720 / 77885 / 6720 / 772 Calculation Temp 89 / 6730 / 76888 / 6720 / 76978 / 6710 / 77487 / 6730 / 769 Timing Perm 8210 / --8258 / --2410 / --5 / 30767456 / --7310 / --71 / 6740 / 712 Calculation Perm 8189 / --8412 / --2623 / --3 / 29857125 / --7514 / --68 / 6710 / 718

Cross-continental 2  Setting:  UDDI: gaia  Replication Manager: gaia  Client: 141.20.33.17  Replica: luna, Uranus, PC89084, PC89083

Experimental Result (failures / response time in ms) ExperimentsSingle server with retry Single server with reboot (continues no response for 3 requests) Single server with retry and reboot Spatial replication RR Hybrid approach RR+Retry Hybrid approach RR+ Reboot All round approach RR spatial + Retry (5 times) + reboots Normal case0 / 4100 / 4120 / 411 0 / 4100 / 4120 / 4110 / 410 Timing Temp 89 / 4120 / 51390 / 4130 / 51289 / 4120 / 51087 / 4130 / 511 Calculation Temp 91 / 4110 / 51487 / 4110 / 50980 / 4100 / 50888 / 4120 / 509 Timing Perm 8210 / --8412 / --2714 / --5 / 28567451 / --7247 / --75 / 4110 / 431 Calculation Perm 8145 / --8247 / --2563 / --4 / 27457320 / --7235 / --68 / 4100 / 433

Average  Average response time from local: 156ms  Average response tine from CUHK: 670ms  Average response time of the whole system: 411ms

Conclusion  Surveyed replication and design diversity techniques for reliable services.  Proposed a hybrid approach to improving the availability of Web services.  Carried out a series of experiments to evaluate the availability and reliability of the proposed Web service system.  Optimal parameter are obtained.

Future Directions  Improve the current fault-tolerant techniques Current approach can deal with hardware and software failures. How about software fault detectors?  N-version programming Different providers provide different solutions. There is a problem in failover or switch between the Web Services.

Future Directions  Modeling the Web Service behavior The behavior of the We Services can be studied.  Modeling the failure The failure model of Web Service is different from traditional software.

Reliability on Web Services Pat Chan 31 Oct 2006.

Similar presentations

Presentation on theme: "Reliability on Web Services Pat Chan 31 Oct 2006."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Reliability on Web Services Pat Chan 31 Oct 2006.

Similar presentations

Presentation on theme: "Reliability on Web Services Pat Chan 31 Oct 2006."— Presentation transcript:

Similar presentations

About project

Feedback