A Fault Tolerance Protocol for Uploads: Design and Evaluation

Leslie Cheung*, Cheng-Fu Chou#, Leana Golubchik*, Yan Yang*
*Internet Multimedia Lab, Computer Science Department & IMSC, University of Southern California
#Department of Computer Science and Information Engineering, National Taiwan University

Background and Motivation
Bistro: a scalable, secure, wide-area upload architecture
ISPA '04

Problem
  When some intermediaries fail or act maliciously, the original protocol performs poorly: clients must be asked to retransmit the lost data.
Goals
  Improve performance by reducing retransmissions
  Reduce the amount of redundant data

Outline of Fault Tolerance Protocol
Erasure code
  Let k be the number of data packets. An erasure code encoder adds (n-k) parity packets, producing a file of n packets.
  Erasure codes assume that received packets are correct. This assumption is invalid here because data can be corrupted.
Solution
  Use checksums to detect corrupted packets
  Drop corrupted packets and treat them as losses
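The idea of turning corruption into erasures can be sketched in a few lines. The slides do not say which erasure code or checksum function the protocol uses, so this sketch substitutes the simplest possible code, a single XOR parity packet (n = k + 1), and SHA-256 checksums. The function names, the equal packet lengths, and the availability of trusted checksums at the receiver are all assumptions made for illustration.

```python
import hashlib
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data_packets):
    """Append one XOR parity packet: k data packets become n = k + 1 packets."""
    parity = reduce(xor_bytes, data_packets)
    return list(data_packets) + [parity]

def checksum(packet: bytes) -> bytes:
    # Assumption: SHA-256 stands in for whatever checksum the protocol uses.
    return hashlib.sha256(packet).digest()

def receive(packets, checksums):
    """Drop packets whose checksum does not match; None marks an erasure."""
    return [p if p is not None and checksum(p) == c else None
            for p, c in zip(packets, checksums)]

def decode(received, k):
    """Return the k data packets, or None if more than one packet is missing."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        return None  # too many erasures for a single-parity code: retransmit
    if missing:
        survivors = [p for p in received if p is not None]
        received = list(received)
        # The XOR of all n packets is zero, so the XOR of the survivors
        # equals the missing packet.
        received[missing[0]] = reduce(xor_bytes, survivors)
    return received[:k]

# Usage: one corrupted packet is detected, dropped, and then recovered.
data = [b"AAAA", b"BBBB", b"CCCC"]
sent = encode(data)
sums = [checksum(p) for p in sent]
corrupted = list(sent)
corrupted[1] = b"BxBB"
recovered = decode(receive(corrupted, sums), k=len(data))
```

A code with more parity packets would tolerate more erasures; the single-parity choice here only keeps the sketch short.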

Outline of Fault Tolerance Protocol
Definition: checksum groups
  Generate one checksum for a group of packets
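A minimal sketch of a checksum group, assuming SHA-256 as the checksum and equal-sized contiguous groups (neither is stated in the slides). One checksum covers the whole group, so a mismatch cannot be pinned on a single packet and every packet in the group is dropped. The helper names are hypothetical.

```python
import hashlib

def group_checksum(packets):
    """One checksum over all packets in a group."""
    h = hashlib.sha256()
    for p in packets:
        # A length prefix keeps packet boundaries unambiguous.
        h.update(len(p).to_bytes(4, "big"))
        h.update(p)
    return h.digest()

def split_into_groups(packets, z):
    """Split packets into z contiguous checksum groups of equal size."""
    if len(packets) % z != 0:
        raise ValueError("this sketch assumes z divides the number of packets")
    size = len(packets) // z
    return [packets[i:i + size] for i in range(0, len(packets), size)]

def filter_groups(groups, checksums):
    """Keep a group only if its checksum matches; otherwise drop all of it."""
    return [g if group_checksum(g) == c else None
            for g, c in zip(groups, checksums)]

# Usage: 20 packets, Z = 2 checksum groups, one corrupted packet in group 0.
packets = [bytes([i]) * 4 for i in range(20)]
groups = split_into_groups(packets, 2)
sums = [group_checksum(g) for g in groups]
tampered = [list(g) for g in groups]
tampered[0][3] = b"xxxx"
kept = filter_groups(tampered, sums)  # group 0 is dropped, group 1 survives
```

The trade-off is visible here: fewer checksums are needed than with one checksum per packet, but a single bad packet costs the whole group.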

Analytical Models
Reliability Model
  Packets are lost independently with probability p
  Metric (c1): probability of retransmissions
Performance Model
  Performance at the first step: size of the timestamp request messages
  Metric (c2): number of checksums per data packet
Cost Function
  Cost = w1 * c1 + w2 * c2
  w1, w2: weights
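The cost function is simple enough to evaluate directly. The slides do not give the closed form of c1, so the sketch below assumes the simplest reading: a single FEC group with an ideal erasure code (any k of the n packets suffice) and independent losses, which makes c1 a binomial tail. It ignores the effect of checksum groups, and c2 is passed in as a plain number. Function names and the sample value of c2 are hypothetical.

```python
from math import comb

def retransmission_probability(n, k, p):
    """P(more than n - k of n packets are lost), losses independent with prob. p.

    Assumption: an ideal erasure code, so a group is recoverable exactly when
    at least k of its n packets arrive.
    """
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(n - k + 1, n + 1))

def cost(c1, c2, w1, w2):
    """Weighted cost from the slide: Cost = w1 * c1 + w2 * c2."""
    return w1 * c1 + w2 * c2

# Usage with parameter values that appear in the numerical results.
c1 = retransmission_probability(n=20, k=10, p=0.01)
c2 = 0.2  # hypothetical number of checksums per data packet
total = cost(c1, c2, w1=0.9, w2=0.1)
```

With w1 = 0.9 and w2 = 0.1 the reliability term is weighted nine times more heavily than the checksum overhead, so the cost only rewards fewer checksums while the retransmission probability stays small.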

Numerical Results
Vary different parameters in the cost function
Parameters of interest
  (n-k): number of parity packets in a FEC group
  k: number of data packets in a FEC group (few large FEC groups vs. many small FEC groups)
  Z: number of checksum groups in a FEC group
  p: probability of losing a packet
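The "few large vs. many small FEC groups" question can be explored with a short sketch of the reliability side alone. Everything beyond the loss probability p is an assumption here: a fixed total number of data packets split evenly into groups of k, a fixed redundancy ratio n = 2k, an ideal erasure code, and a retransmission whenever any single group cannot be decoded. Checksum overhead is not modelled, and the function names are hypothetical.

```python
from math import comb, expm1, log1p

def group_failure(n, k, p):
    """P(a FEC group loses more than n - k of its n packets)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(n - k + 1, n + 1))

def file_failure(total_data_packets, k, p):
    """P(at least one FEC group is unrecoverable), with n = 2k per group."""
    if total_data_packets % k != 0:
        raise ValueError("this sketch assumes k divides the total packet count")
    groups = total_data_packets // k
    q = group_failure(2 * k, k, p)
    if q >= 1.0:
        return 1.0
    # 1 - (1 - q)**groups, written so that very small q does not round to zero.
    return -expm1(groups * log1p(-q))

# Usage: compare group sizes for the same file and the same redundancy ratio.
for k in (5, 10, 20, 50):
    print(k, file_failure(total_data_packets=100, k=k, p=0.01))
```

This isolates the reliability term only; a full comparison would also have to account for the checksum cost c2 and for decoding effort, which the sketch leaves out.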

Numerical Results: varying (n-k), the number of parity packets per FEC group
Y = 5, k = 10, Z = 2,3,…, p = 0.01, w1 = 0.9, w2 = 0.1

Numerical Results: varying k, the number of data packets per FEC group
W = 100, n = 2k, Z = 2, p = 0.01, w1 = 0.9, w2 = 0.1

Numerical Results: varying Z, the number of checksum groups per FEC group
Y = 5, n = 20, k = 10, p = 0.01, w1 = 0.9, w2 = 0.1

Numerical Results: varying p, the probability of losing a packet
Y = 5, n = 20, k = 10, Z = 2, w1 = 0.9, w2 = 0.1

Conclusions and Future Work
Conclusions
  Fault tolerance is important in uploads
  Our protocol is a step in the right direction
Future Work
  How should the parameters be set?
  Striping reliability and performance
  Data collection problem (not all packets are needed with an erasure code)

Ordering of FEC Group and Checksum Group
Can we reverse the order of the FEC group and the checksum group?
No. We drop all packets in a checksum group if the checksum check fails. If the order were reversed, losing one packet would mean dropping every packet in a checksum group, which would consist of a number of FEC groups. No packets would be left to recover the lost part.
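The ordering argument can be checked with a toy layout. This sketch assumes that a checksum group is dropped in full whenever any of its packets is lost or corrupted, and that a FEC group is recoverable when at least k of its packets survive (an ideal erasure code). Packet ids, group layouts, and the function name are made up for illustration.

```python
def recoverable(fec_groups, checksum_groups, lost, k):
    """For each FEC group, report whether at least k of its packets survive.

    fec_groups, checksum_groups: lists of sets of packet ids.
    lost: set of packet ids that were lost or corrupted.
    Assumption: a checksum group is dropped in full if any packet in it is bad.
    """
    dropped = set(lost)
    for group in checksum_groups:
        if group & lost:
            dropped |= group
    return [len(fec - dropped) >= k for fec in fec_groups]

# Two FEC groups with n = 20 and k = 10, and a single bad packet (id 3).
fec_groups = [set(range(0, 20)), set(range(20, 40))]
lost = {3}

# Protocol order: Z = 2 checksum groups inside each FEC group.
inside = [set(range(s, s + 10)) for s in range(0, 40, 10)]
protocol_order = recoverable(fec_groups, inside, lost, k=10)

# Reversed order: one checksum group that spans both FEC groups.
spanning = [set(range(0, 40))]
reversed_order = recoverable(fec_groups, spanning, lost, k=10)
```

In the protocol order the bad packet costs 10 of 20 packets in one FEC group, which still leaves k packets to decode from. In the reversed order the same bad packet wipes out both FEC groups.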