DNA Research Group 1 Growth Codes: Maximizing Sensor Network Data Persistence Vishal Misra Joint work with Abhinav Kamra, Jon Feldman (Google) and Dan Rubenstein
MSR Cambridge, 7/13/06 2 A generic sensor network: sensor nodes generate sensed data; the data follows a multi-hop path to the sink(s).
An abstract channel: sensor nodes to sinks. When a node on the route to a sink fails, communication dies. Nodes fail, so the network acts as an erasure channel: need some reliability mechanism.
Data Persistence: We define the data persistence of a sensor network to be the fraction of data generated within the network that eventually reaches the sink. Focus of work: maximizing data persistence.
Specific Context: sensor networks in a disaster setting (monitoring earthquakes, fires, floods, etc.). The network might get destroyed before delivering data. A disaster event might cause spikes in sensed data, leading to congestion near sinks. Partial recovery of data is also useful.
Increasing persistence. Open-loop approach (coding): apply channel codes to recover from errors. Closed-loop approach (networking): employ feedback to retransmit lost data; exploit topology awareness to route along surviving paths.
Traditional approaches. Coding (erasure codes): Gallager codes [1962], rediscovered as LDPC; RS codes [1960, Reed and Solomon]; Tornado codes [1997, Luby et al.]; Luby Transform codes [1998, Luby] (we come back to them later); Raptor codes [2001, Shokrollahi]. Networking (reliable transport protocols for sensor networks): PSFQ [2002, Wan et al.]; RMST [2003, Stann et al.]; ESRT [2003, Akyildiz et al.].
Why our problem is different (coding perspective): Traditional approaches implement single-source channel coding; our data source is distributed. Traditional approaches aim at full recovery from errors (erasures); in sensor networks, partial recovery is useful and important.
Why our problem is different (networking perspective): In disaster scenarios, data must be delivered quickly. Feedback is often infeasible. Often there is no time to set up routing trees, so the approach should employ minimal configuration. It is difficult to predict which nodes will survive: sinks might get destroyed, the location of sinks is unknown, and surviving routes are unknown. Feedback-based approaches may not scale. Sensor nodes have limited resources to implement complex functionality.
Our Approach: two main ideas. (1) Randomized routing and replication: push data in random directions to ensure survival. (2) Distributed channel codes that optimize data delivery (Growth Codes), based on LDPC erasure codes.
Solution Features: data replication (for persistence); explicit routing not required (can be employed if present); no feedback from sink necessary; partial data recovery; completely distributed.
First Idea: Random Replication. Nodes exchange sensed data with random neighbors. The process iterates and sensed data is copied across the network: sensed data goes on a “random walk” through the network. The process is robust to localized failures. It can be thought of as a replication code. This code is naïve: can we do better?
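The replication idea can be sketched in a few lines of Python. This is only an illustrative simulation, not the protocol from the talk: the function name `replicate`, the adjacency-list input, and the simplifying assumption of unbounded per-node storage are all hypothetical choices made for this sketch.

```python
import random

def replicate(neighbors, rounds, seed=0):
    """Simulate random replication on a graph.

    neighbors: dict mapping node id -> list of neighbor ids.
    Assumption (for illustration only): node v senses one symbol, identified
    by v, and storage is unbounded, unlike real sensor nodes.
    Returns a dict mapping node id -> set of symbol ids it holds.
    """
    rng = random.Random(seed)
    store = {v: {v} for v in neighbors}  # each node starts with its own symbol
    for _ in range(rounds):
        for v in neighbors:
            if not neighbors[v]:
                continue  # isolated node: nothing to exchange
            u = rng.choice(neighbors[v])         # pick a random neighbor
            symbol = rng.choice(sorted(store[v]))  # pick a random stored symbol
            store[u].add(symbol)                 # the neighbor keeps a copy
    return store
```

After a few rounds each symbol typically exists at several nodes, so losing the node that sensed it no longer loses the data.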
Brief Segue: Digital Fountain. The source splits the message into smaller data symbols. Data symbols are encoded into codewords; there are potentially infinitely many unique codewords. Clients can decode the original data from sufficiently many unique codewords. Low-overhead, erasure-resistant channel codes.
Luby Transform (LT) Codes: rateless erasure codes. LT codes are universal in the sense that they are near-optimal for every erasure channel and are very efficient as the data length grows.
Erasure Codes: LT-Codes. The input F consists of n = 5 input blocks: F = (b_1, b_2, b_3, b_4, b_5).
LT-Codes: Encoding. Input F = (b_1, b_2, b_3, b_4, b_5); the encoded output E(F) starts with codeword c_1. 1. Pick degree d_1 from a pre-specified distribution (d_1 = 2). 2. Select d_1 input blocks uniformly at random (pick b_1 and b_4). 3. Compute their sum (XOR). 4. Output the sum and the block IDs.
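The four encoding steps can be sketched as follows. This is a minimal illustration, assuming blocks are small integers and the degree distribution is passed in as a plain dictionary; the name `lt_encode_one` and its signature are hypothetical, not taken from any LT code library.

```python
import random

def lt_encode_one(blocks, degree_dist, rng):
    """Produce one LT codeword from a list of integer input blocks.

    degree_dist: dict mapping degree -> probability (assumed to sum to 1).
    Returns (ids, value): the chosen block indices and their XOR.
    """
    degrees = list(degree_dist)
    weights = [degree_dist[d] for d in degrees]
    # Step 1: pick a degree d from the pre-specified distribution.
    d = rng.choices(degrees, weights=weights, k=1)[0]
    # Step 2: select d distinct input blocks uniformly at random.
    ids = sorted(rng.sample(range(len(blocks)), d))
    # Step 3: compute their sum (XOR).
    value = 0
    for i in ids:
        value ^= blocks[i]
    # Step 4: output the sum together with the block IDs.
    return ids, value
```

Calling it repeatedly with the same `rng` yields an unbounded stream of codewords, which is what makes the code rateless.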
LT-Codes: Encoding (continued). Repeating the procedure on F = (b_1, b_2, b_3, b_4, b_5) yields E(F) = (c_1, c_2, c_3, c_4, c_5, c_6, c_7).
LT-Codes: Decoding. [Animation: the receiver holds codewords c_1 to c_7 over input blocks b_1 to b_5 and decodes them step by step; b_5 and then b_2 appear as recovered blocks.] Key to efficiency: the right degree distribution.
Degree Distribution for LT-Codes. Soliton distribution: average degree H(N) ~ ln(N). In expectation, there is exactly one degree-1 symbol in each round of decoding. The distribution is very fragile in practice; this is fixed with the Robust Soliton distribution. A soliton wave is one where dispersion balances refraction perfectly. Soliton distribution: input symbols are added to the ripple at the same rate as they are processed.
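For reference, the ideal Soliton distribution assigns probability 1/N to degree 1 and 1/(d(d-1)) to each degree d from 2 to N. The sketch below builds it with exact fractions so that the "average degree H(N)" claim can be checked directly; the function name `ideal_soliton` is a hypothetical choice, and the Robust Soliton correction is not covered here.

```python
from fractions import Fraction

def ideal_soliton(n):
    """Return the ideal Soliton distribution as a dict degree -> probability."""
    rho = {1: Fraction(1, n)}
    for d in range(2, n + 1):
        rho[d] = Fraction(1, d * (d - 1))
    return rho

def mean_degree(rho):
    """Expected degree under the distribution rho."""
    return sum(d * p for d, p in rho.items())

def harmonic(n):
    """n-th harmonic number H(n) = 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, i) for i in range(1, n + 1))
```

The probabilities telescope to 1, and the mean degree equals H(N) exactly, which grows like ln(N).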
Thought: Sensor Digital Fountain? Sensor nodes send codewords to sinks; information survives losses.
LT codes for sensor networks? Sensed data could be the data units, but how do we achieve a given degree distribution? LT codes are designed for centralized sources; sensor networks have distributed data sources. As a thought experiment, assume that magically we can implement distributed LT codes.
Perfect Source Simulation: sampling ideal distributions (N = 1500). Initially, no coding does better than Robust Soliton! Robust Soliton improves as more codewords are received.
Toy problem: Suppose a sink could ask for a codeword of the right degree, still chosen randomly. Which degree would be the most useful? Answer: it is time-dependent!
Coupon Collector’s Problem
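As a reminder of the standard result (stated here in our own notation, which is an assumption and not taken from the slides): if each draw returns one of N coupons uniformly at random, the expected number of draws T_r needed to see r distinct coupons, and the special case r = N, are

```latex
\mathbb{E}[T_r] = \sum_{i=0}^{r-1} \frac{N}{N-i},
\qquad
\mathbb{E}[T_N] = N \sum_{j=1}^{N} \frac{1}{j} = N\,H(N) \approx N \ln N .
```

The terms with i close to N dominate the sum, so the last few coupons are by far the most expensive to collect. Degree-1 codewords received at the sink behave like coupons in this sense.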
Growth Codes: the degree of a codeword “grows” with time. At each point in time, a codeword of a specific degree has the most utility for a decoder (on average). This “most useful” degree grows monotonically with time. R: number of decoded symbols the sink has. [Figure: time axis with thresholds R_1, R_2, R_3, R_4 separating the periods in which degree d = 1, 2, 3, 4 is the most useful.]
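One way to make the "most useful degree" concrete is sketched below. It assumes, for illustration, that utility means the probability that a uniformly random degree-d codeword immediately yields a new symbol when the sink already holds r of the N symbols; that happens exactly when d-1 of its symbols are already recovered and one is not. The function names are hypothetical, and the talk's precise utility definition may differ.

```python
from fractions import Fraction
from math import comb

def decode_probability(n, r, d):
    """P(a random degree-d codeword decodes to one new symbol | r of n known)."""
    if d > n:
        return Fraction(0)
    # Choose d-1 symbols among the r recovered and 1 among the n-r missing.
    return Fraction(comb(r, d - 1) * (n - r), comb(n, d))

def best_degree(n, r, max_degree=None):
    """Smallest degree that maximizes decode_probability for the given r."""
    if max_degree is None:
        max_degree = n
    # max() returns the first maximal element, so ties go to the lower degree.
    return max(range(1, max_degree + 1),
               key=lambda d: decode_probability(n, r, d))
```

Under this assumption, degree 1 is best until the sink holds about half of the symbols, after which degree 2 takes over, then degree 3, and so on: the best degree never decreases as r grows.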
Growth Codes: Encoding. R_i is what the sink has received; what about encoding? To decode R_i symbols, the sink needs to receive some K_i codewords, sampled uniformly. Sensor nodes estimate K_i and transition accordingly. The optimal transition points are a function of N, the size of the network. The exact value of K_1 is computed; upper bounds for K_i, i > 1, are computed.
Distributed Implementation of Growth Codes. Time is divided into rounds. Each node exchanges degree-1 codewords with a random neighbor until round K_1. Between rounds K_{i-1} and K_i, nodes exchange degree-i codewords. The sink receives codewords as they get exchanged in the network. Growth Code degree distribution at time k, where \pi_i(k) is the probability of degree i: $\pi_i(k) = \max\left(0, \min\left((K_i - K_{i-1})/k,\; (k - K_{i-1})/k\right)\right)$.
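The round schedule and the resulting degree distribution can be sketched as below. The transition points are passed in as a list `K = [K_1, K_2, ...]`; the concrete values used in any call are placeholders, since the slides do not give them, and the sketch assumes K_0 = 0. Function names are hypothetical.

```python
from bisect import bisect_left
from fractions import Fraction

def degree_at_round(k, K):
    """Degree exchanged in round k: degree i while K_{i-1} < k <= K_i."""
    return bisect_left(K, k) + 1

def growth_degree_distribution(k, K):
    """Fraction of the first k codewords that have degree 1, 2, ..., len(K).

    Implements max(0, min((K_i - K_{i-1})/k, (k - K_{i-1})/k)) with K_0 = 0.
    The fractions sum to 1 only while k <= K[-1].
    """
    pi = []
    prev = 0  # K_0, assumed to be 0
    for K_i in K:
        term = min(Fraction(K_i - prev, k), Fraction(k - prev, k))
        pi.append(max(Fraction(0), term))
        prev = K_i
    return pi
```

For example, with the placeholder schedule `K = [10, 25, 45]`, the first 10 rounds use degree 1, rounds 11 to 25 use degree 2, and rounds 26 to 45 use degree 3.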
Sensor Network Model: N-node sensor network; limited storage at each sensor node; large storage at the sink; all sensed data assumed independent (source coding is not considered). [Figure: a sink and sensor nodes holding symbols such as x_1, x_2, x_3, x_4, x_6, x_9, x_10.]
High Level View of the Protocol. In the beginning: nodes 1 and 3 exchange codewords, and copies of x_1 and x_3 spread through the network. Later on: node 1 is destroyed, but symbol x_1 survives in the network. Nodes are now exchanging degree-2 codewords such as x_4 ⊕ x_3, x_8 ⊕ x_7, x_1 ⊕ x_4, x_2 ⊕ x_8, x_6 ⊕ x_3 and x_4 ⊕ x_5, alongside remaining degree-1 codewords such as x_3 and x_8.
Iterative Decoding. 5 original symbols x_1 … x_5; 4 codewords received. Each codeword is the XOR of its component original symbols. [Figure: received codewords, recovered symbols, and unused codewords.]
Online Decoding at the Sink. Before: recovered symbols x_1, x_3, x_6; undecoded codeword x_2 ⊕ x_5. A new codeword x_2 ⊕ x_6 arrives at the sink. After: x_2 = x_6 ⊕ (x_2 ⊕ x_6), then x_5 = x_2 ⊕ (x_2 ⊕ x_5); the recovered symbols are now x_1, x_2, x_3, x_5, x_6.
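The online decoding step can be sketched as a small peeling decoder. This is an illustrative sketch, assuming symbols are integers and each codeword arrives with the IDs of the symbols XORed into it; the class name `OnlineDecoder` and its methods are hypothetical.

```python
class OnlineDecoder:
    """Sink-side decoder that peels codewords as they arrive."""

    def __init__(self):
        self.recovered = {}  # symbol id -> recovered value
        self.pending = []    # undecoded codewords as (set of ids, value)

    def _reduce(self, ids, value):
        """XOR out every symbol of the codeword that is already recovered."""
        remaining = set()
        for i in ids:
            if i in self.recovered:
                value ^= self.recovered[i]
            else:
                remaining.add(i)
        return remaining, value

    def receive(self, ids, value):
        """Add a codeword, then keep decoding until no more progress is made."""
        self.pending.append((set(ids), value))
        progress = True
        while progress:
            progress = False
            still_pending = []
            for cw_ids, cw_value in self.pending:
                remaining, reduced = self._reduce(cw_ids, cw_value)
                if len(remaining) == 1:
                    (symbol,) = remaining
                    self.recovered[symbol] = reduced  # new symbol decoded
                    progress = True
                elif len(remaining) > 1:
                    still_pending.append((remaining, reduced))
                # len(remaining) == 0: codeword is redundant, drop it
            self.pending = still_pending
```

Replaying the slide's scenario, a sink that holds x_1, x_3, x_6 and the undecoded codeword x_2 ⊕ x_5 recovers both x_2 and x_5 as soon as x_2 ⊕ x_6 arrives.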
Revisiting earlier simulation (N = 1500)
Time to recover all data: phase transition in obtaining the last few data units (coupon collector’s problem).
Recovery Rate: without coding, a lot of data is lost during the disaster, even when using randomized replication.
Effect of Topology: 500 nodes placed at random in a 1x1 square; nodes are connected if within a distance of 0.3.
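A topology of this kind (a random geometric graph) can be generated as in the sketch below. It assumes uniform placement and Euclidean distance, which the slide suggests but does not state; the function name and the seed parameter are hypothetical, and this is not the simulator used for the talk.

```python
import math
import random

def random_geometric_graph(n=500, radius=0.3, seed=0):
    """Place n nodes uniformly in the unit square and link nearby pairs.

    Returns (positions, neighbors): a list of (x, y) tuples and a dict
    mapping each node id to the list of nodes within `radius` of it.
    """
    rng = random.Random(seed)
    positions = [(rng.random(), rng.random()) for _ in range(n)]
    neighbors = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= radius:
                neighbors[i].append(j)  # links are symmetric
                neighbors[j].append(i)
    return positions, neighbors
```

The returned adjacency lists have the same shape as the input expected by a neighbor-exchange simulation, so the graph can be plugged into one directly.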
Resilience to Random Failures: 500-node random-topology network. Nodes fail every second with a probability of (1 every 4 seconds in the beginning)
Experiments with Motes: Crossbow MICAz, 2.4GHz IEEE 250 Kbps High Data Rate Radio
Motes experiment
Motes experiment: continued
Conclusions: Developed distributed channel codes to maximize data persistence in (sensor) networks. First (to our knowledge) time-varying LDPC codes. Proved optimality of Growth Codes. Protocol requires minimal configuration (only a rough estimate of network size needed). Tested the system with simulations and an implementation on mica motes. More information: (tech report available, paper appearing in SIGCOMM 2006)