1
Efficient Data Dissemination and Survivable Data Storage Lihao Xu http://www.cs.wayne.edu/~lihao/
2
Ubiquitous Information Access
3
Key Building Blocks: Storage, Retrieval, Dissemination, Consumption
4
Key Building Blocks
5
Error Correcting Codes
6
Message: 1 2 3 … k
7
Error Correcting Codes. Message: 1 2 3 … k. Codeword: 1 2 3 … n-1 n
8
Error Correcting Codes. Message: 1 2 3 … k. Codeword: 1 2 3 … n-1 n. From m codeword symbols, recover the message: 1 2 3 … k
9
MDS (Maximum Distance Separable) Codes: m = k
10
(n,k) MDS Codes: Reed-Solomon (RS) Code
11
(n,k) MDS Codes: (4,2) B-Code with four columns (a, d+c), (b, d+a), (c, a+b), (d, b+c)
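The MDS property of this (4,2) B-Code can be checked with a minimal sketch, assuming that "+" on the slide means bitwise XOR and that each of the four columns holds one data symbol and one parity symbol. The function names `encode` and `decode` are illustrative and are not taken from the presentation.

```python
from itertools import combinations

NAMES = "abcd"
# Parity symbol stored in column i is the XOR of these two data symbols.
PARITY_OF = {0: "dc", 1: "da", 2: "ab", 3: "bc"}

def encode(a, b, c, d):
    """Return the four (data, parity) columns of the (4,2) B-Code."""
    return [(a, d ^ c), (b, d ^ a), (c, a ^ b), (d, b ^ c)]

def decode(cols):
    """Recover (a, b, c, d) from surviving columns.

    cols maps a column index (0..3) to its (data, parity) pair.
    """
    known = {NAMES[i]: data for i, (data, _) in cols.items()}
    progress = True
    # Repeatedly solve any parity equation that has exactly one unknown.
    while len(known) < 4 and progress:
        progress = False
        for i, (_, parity) in cols.items():
            x, y = PARITY_OF[i]
            if x in known and y not in known:
                known[y] = parity ^ known[x]
                progress = True
            elif y in known and x not in known:
                known[x] = parity ^ known[y]
                progress = True
    if len(known) < 4:
        raise ValueError("not enough columns to decode")
    return tuple(known[n] for n in NAMES)

message = (0x11, 0x22, 0x33, 0x44)
columns = encode(*message)
# Try every pair of surviving columns: k = 2 of n = 4 should always suffice.
recovered = [decode({i: columns[i], j: columns[j]})
             for i, j in combinations(range(4), 2)]
```

All six pairs of columns decode to the original message, which is what m = k means for n = 4, k = 2.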
12
Data Dissemination: Broadcast Scheduling
13
Data Dissemination: a wireless server and wireless clients with individual requests (want 1, want 2, want 1, want 3)
14
Broadcast in a Cell: the wireless server broadcasts to the wireless clients (want 1, want 2, want 1, want 3)
15
Broadcast Model. Model clients as random processes. The desired item is random, with probability p_i for item i of length l_i. [Figure: wireless server and wireless clients (want 1, want 2, want 1, want 3)]
16
Scheduling Problem. 2 items, l_1 = l_2. Each item consists of k packets, k large. Challenge: choose a packet broadcast schedule to minimize the wait for clients. S = 1212
17
Prior Work. Complexity of optimal schedules: Bar-Noy, Bhatia, Naor, Schieber, Foltz. Complexity of computing optimal schedules: Kenyon, Schabanel. Error correction/detection: Bestavros
18
Metric: Delivery Time. Delivery time for item 1 in schedule S = 1212
19
Delivery Time: total amount of time spent waiting for item i when starting at a given instant in schedule S, where the instant is the time at which the client starts waiting for the item. S = 1212
20
Expected Delivery Time (EDT): the delivery time averaged over the demanded items and over start instants uniformly distributed over schedule S.
21
EDT Calculation. S = 1212, P_1 = P_2 = 1/2
22
EDT Calculation. S = 1212, P_1 = P_2 = 1/2. DT: 2
23
EDT Calculation. S = 1212, P_1 = P_2 = 1/2. DT: 2, 3/2
24
EDT Calculation. S = 1212, P_1 = P_2 = 1/2. DT: 2, 3/2. DT_1 = 7/4
25
EDT Calculation. S = 1212, P_1 = P_2 = 1/2. DT: 2, 3/2. DT_1 = 7/4. EDT = 7/4
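The 7/4 figure can be reproduced numerically with a small simulation. This is a sketch under stated assumptions, not the presenter's method: time is slotted into packets, a client arrives at a uniformly random slot, keeps every packet of its item that it hears, and is done once it holds all k packets. The function name `expected_delivery_time` and its parameters are illustrative.

```python
def expected_delivery_time(schedule, k, demand):
    """Average wait, in item lengths, for a client on one broadcast channel.

    schedule: string of item labels broadcast cyclically, e.g. "12"
    k:        packets per item (all items have equal length in this sketch)
    demand:   dict mapping item label -> request probability
    """
    period = len(schedule) * k            # slots in one schedule cycle
    total = 0.0
    for item, prob in demand.items():
        waited = 0
        for start in range(period):       # arrival slot, uniform over the cycle
            have = set()                  # packet indices of `item` received so far
            slot = start
            while len(have) < k:
                if schedule[(slot // k) % len(schedule)] == item:
                    have.add(slot % k)
                slot += 1
            waited += slot - start
        total += prob * waited / period
    return total / k                      # convert slots to item lengths

edt = expected_delivery_time("12", k=200, demand={"1": 0.5, "2": 0.5})
```

Under these assumptions, a client that arrives while its item is on air waits about 2 item lengths, because it has to catch the packets it missed in the next cycle. A client that arrives during the other item waits 3/2 on average. The mean of the two is 7/4, and `edt` approaches that value as k grows.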
26
Performance with Errors. Data items consist of k packets. What happens if a packet is lost? [Figure: original, transmitted, and received packet sequences 1 2 3 4 5 ... k]
27
Performance with Errors. What happens if a packet is lost? [Figure: original, transmitted, and received packet sequences 1 2 3 4 5 ... k]
28
Performance with Errors. What happens if a packet is lost? EDT = 3! [Figure: original, transmitted, and received packet sequences 1 2 3 4 5 ... k]
29
Solution – Coding. Use a k-of-n MDS code, n = 2k. Now only need to wait for 1 additional packet. [Figure: original packets 1 2 3 4 5 ... k; transmitted and received coded packet sequences]
30
Solution – Coding. EDT = 9/4. [Figure: original packets 1 2 3 4 5 ... k; transmitted and received coded packet sequences]
31
Solution – Coding. Use a k-of-n MDS code, m = 2(k+1). Now only need to wait for 1 additional packet. [Figure: original packets 1 2 3 4 5 ... k; transmitted and received sequences 1 2 3 4 5 ... k n]
32
Solution – Coding. EDT = 7/4 + ε. [Figure: original packets 1 2 3 4 5 ... k; transmitted and received sequences 1 2 3 4 5 ... k n]
33
General Solution. Given loss probability p, what is the optimal n? [Figure: original packets 1 2 3 4 5 ... k; transmitted and received sequences 1 2 3 4 5 ... k n]
34
General Solution
37
k = 100 and p = 0.1
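One ingredient of the choice of n is how likely a client is to receive at least k of the n coded packets in a single pass. The sketch below computes that probability, assuming each packet is lost independently with probability p. The independence assumption, the 0.99 target, and the function names are illustrative and do not come from the slides. It does not compute the EDT-optimal n, which also has to account for the longer broadcast cycle that a larger n causes.

```python
from math import comb

def prob_decodable(n, k, p):
    """P(at least k of n packets arrive), each lost independently with probability p."""
    q = 1.0 - p
    return sum(comb(n, r) * q**r * p**(n - r) for r in range(k, n + 1))

def smallest_n(k, p, target):
    """Smallest n whose single-pass decoding probability reaches `target`."""
    n = k
    while prob_decodable(n, k, p) < target:
        n += 1
    return n

k, p = 100, 0.1
n = smallest_n(k, p, target=0.99)   # 0.99 is a hypothetical threshold
```

With no redundancy (n = k), a single pass succeeds only if all 100 packets arrive, which happens with probability 0.9^100. This is why some n > k is needed before retransmission waits stop dominating.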
38
General Solution k = 100
39
Two-Channel Broadcasting: two wireless servers broadcast to the wireless clients (want 1, want 2, want 1, want 3)
40
Coordinating Schedule Data. Use a (2k, k) MDS code to eliminate data overlap: channel 1 sends packets 1 through k (raw data); channel 2 sends packets k+1 through 2k. Features: each channel is self-sufficient; no overlap between channels. S_1 = 12 12, S_2 = 12 12 (same schedule, different data)
41
Two Broadcast Channels. Scheduling for two channels: two items with equal length and demand; two synchronized channels of equal bandwidth; first channel's schedule fixed at 12. What is the optimal schedule for channel 2? S_1 = 12, S_2 = ?
42
Some Schedules: Repeat, Swap, Shift, Reshuffle, Unequal Portions, Arbitrary. [Figure: candidate channel-2 schedules shown against the channel-1 schedule 12 12]
43
Some Schedules: Repeat, Swap, Shift, Reshuffle, Unequal Portions, Arbitrary. EDT = 1. [Figure: candidate channel-2 schedules shown against the channel-1 schedule 12 12]
44
Some Schedules: Repeat, Swap, Shift, Reshuffle, Unequal Portions, Arbitrary. EDT = 1. EDT = 63/64. EDT < 63/64? [Figure: candidate channel-2 schedules shown against the channel-1 schedule 12 12]
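Candidate channel-2 schedules can be compared with a small simulation. This is a sketch under assumptions that the slides do not spell out: a client listens to both channels at once, channel c sends coded packets that are distinct from the other channel's (as with the (2k, k) MDS code), and any k distinct packets of an item decode it. Schedules are restricted to whole-item blocks, so the Shift, Unequal Portions and Arbitrary variants are outside what this function can express. The function name `two_channel_edt` is illustrative.

```python
def two_channel_edt(s1, s2, k, demand):
    """Average wait, in item lengths, with two synchronized broadcast channels.

    s1, s2: item-label strings broadcast cyclically on channels 1 and 2
    k:      packets needed per item; channel c sends coded packets (c, 0..k-1)
    demand: dict mapping item label -> request probability
    """
    assert len(s1) == len(s2), "sketch assumes schedules of equal period"
    period = len(s1) * k
    total = 0.0
    for item, prob in demand.items():
        waited = 0
        for start in range(period):          # arrival slot, uniform over the cycle
            have = set()                     # distinct coded packets received
            slot = start
            while len(have) < k:             # any k distinct packets decode the item
                block = (slot // k) % len(s1)
                for ch, sched in enumerate((s1, s2)):
                    if sched[block] == item:
                        have.add((ch, slot % k))
                slot += 1
            waited += slot - start
        total += prob * waited / period
    return total / k

demand = {"1": 0.5, "2": 0.5}
repeat = two_channel_edt("12", "12", k=100, demand=demand)   # same order on both channels
swap = two_channel_edt("12", "21", k=100, demand=demand)     # channel 2 sends the other item
```

Under these assumptions both the repeat and the swap schedule come out at about one item length, which agrees with the EDT = 1 shown on the slide. The model is too coarse to explore the finer-grained schedules where the slide reports 63/64.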
45
Schedule Performance. Symmetric problem: equal lengths; equal demands; equal bandwidth channels; symmetric "fixed" schedule for the 1st channel. Asymmetric solution: asymmetric schedules can beat any symmetric schedule for the 2nd channel. How is this possible?
46
More to Explore … More servers/channels; differing levels of synchronization; transmission errors; streaming data; bounds. [Figure: several wireless servers and wireless clients (want 1, want 2, want 1, want 3)]
47
Hydra: A Platform for SSS
48
Secure and Survivable Storage: Availability, Recoverability, Persistence, Confidentiality, Integrity, Scalability, Efficiency
49
Secure and Survivable Storage: Yahoo, Ebay, Amazon, Google, Banks, Your Labs, More …
50
Hydra
51
Hydra Design Goals: portable to various OS/FS; hardware independent; Unix FS semantics maintained; low overhead in performance and storage; transport independent; easy to install, configure, scale, maintain and automate
52
Hydra and System. Layer stack: App. / Hydra / FS / I/O
53
Hydra and System. Layer stacks: App. / Hydra / FS / I/O; App. / Hydra / FS / I/O
54
Hydra and System. Layer stacks: App. / Hydra / FS / I/O; App. / Hydra / FS / I/O; App. / FS/Hydra / I/O
55
Basics of Hydra: (4,2) B-Code with four columns (a, d+c), (b, d+a), (c, a+b), (d, b+c)
56
Performance Test: 2.4G P4, 512 MB, 80GB ATA/100 7200rpm, Redhat 9.0 (kernel 2.4.2.0)
Operations               Throughput (Mbps)
File Read                384
File Write               200
Memory Copy              17572
(4,2) B-Code Encoding    5522
(4,2) B-Code Decoding    22866
(4,2) RS Encoding        286
(4,2) RS Decoding        216
57
Hydra Components: Meta Data (hnode), Operations, Monitor
58
Hydra Meta Data: Code, Symbol Location, Data Layout, Security Flag, Access Rights, Extensions
59
Hydra Operations: Distribute (Write), Recover (Read), Detect, Repair, Restore, Others
60
Hydra Monitor: Connectivity, Security
61
Hydra Applications: Web Server; CDN/P2P/Data Server; Archiving; Data Security (system activity logger, forensic, file integrity checker, …); Others
62
Acknowledgement
63
lihao@cs.wayne.edu