Dept. of Computer Science & Engineering, CUHK Fault Tolerance and Performance Analysis in Wireless CORBA Chen Xinyu 2002-12-09 Supervisor: Markers: Prof.

Presentation transcript:

Fault Tolerance and Performance Analysis in Wireless CORBA
Chen Xinyu, Dept. of Computer Science & Engineering, CUHK
Supervisor: Prof. Michael R. Lyu
Markers: Prof. Jerome Yen, Prof. John C.S. Lui

Outline
• Motivation
• Wireless CORBA
• Fault Tolerant Wireless CORBA
• Performance and Availability Analysis
• Conclusions and Future Work

Motivation
• Mobile computing
• Permanent failures: physical damage
• Transient failures: mobile host, wireless link, environmental conditions
• Fault Tolerant CORBA: entity replication

Wireless CORBA Architecture
[Figure: Visited Domain, Home Domain, and Terminal Domain. Labels: Access Bridges ab1 and ab2, Static Host, Terminal Bridge, mobile host mh1, GIOP Tunnel, GTP Messages.]

Wireless CORBA Architecture
[Figure: Visited Domain with Access Bridges ab1 and ab2, Home Domain with the Home Location Agent and a Static Host, and the Terminal Domain (mobile host mh1, Terminal Bridge, GIOP Tunnel) drawn at several positions.]

Basic Concepts
• Checkpoint: the program state saved during failure-free execution
• Repair: brings the failed device back to normal operation
• Rollback: reloads the program state saved at the most recent checkpoint
• Recovery: reprocesses the program from the most recent checkpoint, applying the logged messages, up to the point just before the failure
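The four terms above can be illustrated with a minimal Python sketch. The `CheckpointedProgram` class, its integer messages, and the in-memory "stable storage" are all assumptions made for illustration; they are not taken from the slides, and repair of the device itself is outside the sketch.

```python
import copy


class CheckpointedProgram:
    """Toy program (hypothetical): its state is a running total of messages."""

    def __init__(self):
        self.state = {"total": 0, "processed": 0}
        # Stand-ins for stable storage: they survive crash() in this sketch.
        self._checkpoint = copy.deepcopy(self.state)
        self._log = []  # messages logged since the last checkpoint

    def _apply(self, msg):
        self.state["total"] += msg
        self.state["processed"] += 1

    def process(self, msg):
        self._log.append(msg)  # log the message before processing it
        self._apply(msg)

    def checkpoint(self):
        """Checkpoint: save the state during failure-free execution."""
        self._checkpoint = copy.deepcopy(self.state)
        self._log.clear()

    def crash(self):
        self.state = None  # volatile state is lost

    def rollback(self):
        """Rollback: reload the state saved at the most recent checkpoint."""
        self.state = copy.deepcopy(self._checkpoint)

    def recover(self):
        """Recovery: roll back, then replay the logged messages."""
        self.rollback()
        for msg in self._log:
            self._apply(msg)


prog = CheckpointedProgram()
for m in (1, 2, 3):
    prog.process(m)
prog.checkpoint()
for m in (4, 5):
    prog.process(m)
before_failure = copy.deepcopy(prog.state)
prog.crash()
prog.recover()  # state is back to the point just before the failure
```

After `recover()`, the state matches the copy taken just before the crash, which is the property the Recovery definition describes.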

Device, Wireless & Mobile Issues
• Device issues: slow processor, small memory, small disk space, low power supply, physical damage
• Wireless issues: high bit error rate, little bandwidth, long transfer delay
• Mobile issue: handoff
• Stable storage options: applying the mobile host as stable storage; applying the Access Bridge as stable storage
• Costs over the wireless link: a large number of system messages, or a large amount of information carried in one message
• Handoff consequence: checkpoint and log collection
• Techniques: uncoordinated checkpointing, pessimistic message logging

Fault Tolerance Architecture
[Figure: Mobile side: Mobile Host with Client Object, Terminal Bridge, Recovery Mechanism, ORB, Platform. Fixed side: Access Bridge (Mobile Support Station) with Recovery Mechanism, Logging Mechanism, ORB, Platform; Static Server with Object Replica, Recovery Mechanism, Logging Mechanism, ORB, Platform. Connections: GIOP Tunnel, Multicast Messages.]

Mobile Host Handoff
[Figure: a mobile host hands off among Access Bridge 1, Access Bridge 2, and Access Bridge 3, with a location update sent to the Home Location Agent.]

Mobile Host Crash
[Figure: Home Location Agent, Access Bridge 1, Access Bridge 2, Access Bridge 3, and the crashed mobile host.]

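Mobile Host Recovery
[Figure: Home Location Agent, Access Bridge 1, Access Bridge 2, Access Bridge 3, and the recovering mobile host.]
• Collect the last checkpoint and the message logs that follow it, sorted by acknowledgement sequence number (Ack. SN)
• Reconnect
• Replay the messages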
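The collection step can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Checkpoint`, `LoggedMessage`, and `AccessBridge` types, the `last_ack_sn` field, and the integer state are hypothetical, and the slides do not say how a checkpoint is matched to the logs that follow it.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    last_ack_sn: int  # assumed: Ack. SN of the last message reflected in `state`
    state: int


@dataclass
class LoggedMessage:
    ack_sn: int  # acknowledgement sequence number
    payload: int


@dataclass
class AccessBridge:
    name: str
    checkpoints: list = field(default_factory=list)
    logs: list = field(default_factory=list)


def recover(bridges):
    """Collect the last checkpoint and later logs, sort by Ack. SN, replay."""
    # Checkpoints and logs may be spread over every bridge the host visited.
    # This sketch assumes at least one checkpoint exists.
    checkpoints = [c for b in bridges for c in b.checkpoints]
    last = max(checkpoints, key=lambda c: c.last_ack_sn)
    pending = sorted(
        (m for b in bridges for m in b.logs if m.ack_sn > last.last_ack_sn),
        key=lambda m: m.ack_sn,
    )
    state = last.state
    for msg in pending:  # replay in acknowledgement order
        state += msg.payload
    return state, [m.ack_sn for m in pending]


ab1 = AccessBridge("ab1", [Checkpoint(2, 3)],
                   [LoggedMessage(2, 2), LoggedMessage(3, 3)])
ab2 = AccessBridge("ab2", logs=[LoggedMessage(5, 5), LoggedMessage(4, 4)])
ab3 = AccessBridge("ab3", logs=[LoggedMessage(6, 6)])
state, replayed = recover([ab1, ab2, ab3])
```

Sorting by Ack. SN restores one global order even though each bridge only holds the part of the log written while the host was attached to it.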

Assumptions
• Failure occurrences, message arrivals, and handoff events follow homogeneous Poisson processes with their respective rate parameters
• Failures do not occur while the program is in the repair or rollback process
• A failure is detected as soon as it occurs
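As a rough illustration of the first assumption, the sketch below samples three homogeneous Poisson event streams by summing exponential inter-arrival times. The rate values, the time horizon, and the seed are arbitrary placeholders, not figures from the presentation.

```python
import random


def poisson_event_times(rate, horizon, rng):
    """Event times of a homogeneous Poisson process on (0, horizon]."""
    times = []
    t = rng.expovariate(rate)  # exponential inter-arrival time with mean 1/rate
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


rng = random.Random(2002)  # fixed seed so the run is repeatable
# Hypothetical rates (events per time unit), chosen only for illustration.
rates = {"failure": 0.01, "message": 1.0, "handoff": 0.1}
horizon = 1000.0

streams = {kind: poisson_event_times(rate, horizon, rng)
           for kind, rate in rates.items()}
# Merge the three streams into one timeline of (time, kind) events.
timeline = sorted((t, kind) for kind, ts in streams.items() for t in ts)
```

A timeline like this can drive a simulation of program execution, for example to cross-check an analytical expectation of the execution time.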

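Execution without Checkpointing
[Figure: timeline from 0 to t showing failures F_1 ... F_j, handoffs H_1 ... H_k, messages m_0(N), m_1(n_1), m_j(1) ... m_j(N), intervals Y_0, X_0, Z_0, and the execution time X(N). Legend: R (repair), H (handoff).]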

Conditional Execution Time & LST

LST and Expectation of Program Execution Time

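Execution with Equi-number Checkpointing
[Figure: timeline from 0 to t for the checkpointing interval between C_{i-1} and C_i, showing failures F_i(1) ... F_i(j), handoffs H_i(1) ... H_i(k), messages m_i0(a), m_i1(n_i1), m_ij(1) ... m_ij(a), intervals Y_i(0), X_i(0), Z_i(0), and the execution time X_i(N, a). Legend: R+C (repair + rollback), H (handoff), C (checkpointing).]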

Conditional Execution Time & LST

LST and Expectation of Program Execution Time

Average Availability
• Uptime interval: the program produces useful work towards its completion
• Downtime interval: repair and rollback, handoff, checkpoint creation, wasted computation
• Average availability: the fraction of time an MH spends in uptime intervals during an execution
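Read literally, this definition is a ratio of uptime to total elapsed time. The sketch below computes it from a labelled trace of intervals; the interval labels, the trace values, and the function name are assumptions for illustration, and the slides' own closed-form expression is not reproduced here.

```python
# Downtime categories named on the slide; the label strings are hypothetical.
DOWNTIME_KINDS = {"repair_rollback", "handoff", "checkpoint", "wasted"}


def average_availability(intervals):
    """Fraction of elapsed time spent doing useful work.

    `intervals` is a list of (kind, duration) pairs, where kind is
    "useful" or one of DOWNTIME_KINDS.
    """
    for kind, duration in intervals:
        if kind != "useful" and kind not in DOWNTIME_KINDS:
            raise ValueError(f"unknown interval kind: {kind}")
        if duration < 0:
            raise ValueError("durations must be non-negative")
    total = sum(duration for _, duration in intervals)
    if total == 0:
        raise ValueError("empty execution")
    uptime = sum(duration for kind, duration in intervals if kind == "useful")
    return uptime / total


# Made-up trace of one execution (time units are arbitrary).
trace = [
    ("useful", 40.0), ("checkpoint", 2.0), ("useful", 30.0),
    ("wasted", 10.0), ("repair_rollback", 8.0), ("handoff", 1.0),
    ("useful", 9.0),
]
availability = average_availability(trace)
```

Wasted computation counts as downtime here because work lost to a failure has to be redone and does not move the program towards completion.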

Optimal Checkpointing Interval

Beneficial Condition

Equi-number Checkpointing
• With respect to message number: the number of messages in each checkpointing interval is fixed
• With respect to checkpoint number: the number of checkpoints is fixed
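The two variants can be contrasted with a small sketch. It assumes a checkpoint is taken after every a-th message, including the last one, and that the total message count divides evenly; both are simplifications for illustration, and the function names are hypothetical.

```python
def checkpoints_by_message_number(total_messages, a):
    """1-based message indices after which a checkpoint is taken.

    The interval length `a` is fixed, so the number of checkpoints grows
    with the total number of messages.
    """
    if a <= 0:
        raise ValueError("a must be positive")
    return list(range(a, total_messages + 1, a))


def checkpoints_by_checkpoint_number(total_messages, num_checkpoints):
    """Same schedule, but with the number of checkpoints fixed.

    The interval length is derived from the total, so it grows with the
    total number of messages. Assumes an even split for simplicity.
    """
    a, remainder = divmod(total_messages, num_checkpoints)
    if a == 0 or remainder != 0:
        raise ValueError("total_messages must be a positive multiple "
                         "of num_checkpoints")
    return checkpoints_by_message_number(total_messages, a)


# With 12 messages both variants coincide: a = 3 gives 4 checkpoints.
short_fixed_a = checkpoints_by_message_number(12, 3)
short_fixed_n = checkpoints_by_checkpoint_number(12, 4)
# With 24 messages they diverge: 8 checkpoints versus an interval of 6.
long_fixed_a = checkpoints_by_message_number(24, 3)
long_fixed_n = checkpoints_by_checkpoint_number(24, 4)
```

Doubling the workload doubles the checkpoint count in the first variant and doubles the interval length in the second, which is the trade-off the two plots that follow compare.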

Equi-number Checkpointing with respect to Checkpoint Number

Equi-number Checkpointing with respect to Message Number

Comparison Between Checkpointing and Without Checkpointing

Average Availability vs. Message Arrival Rate and Handoff Rate

Conclusions
• Fault tolerant wireless CORBA
• Equi-number checkpointing strategy
• LST and expectation of program execution time
• Average availability
• Optimal checkpointing interval
• Beneficial condition

Future Work
• Analysis model: the message queuing effect during repair and recovery
• Failure detector: distributed consensus with link failures, process failures, and mobile disconnections; leads to a faster solution; reduces communication costs
• Fault tolerance in ad hoc networks: without infrastructure support; self-organizing and adaptive

Thank You