Data Fusion in Sensor Networks
Asheq Khan
Oct 28, 2004

Outline
Introduction
Key concepts
Three schemes
–Cluster based data fusion
–Synchronization among nodes
–Resistance against attacks
Conclusion

Introduction
A sensor network comprises sensor nodes and a base station (BS).
Each sensor node is battery powered and equipped with:
–Integrated sensors
–Data processing capabilities
–Short-range radio communications
Because of their limited power and short communication range, sensor nodes perform in-network data fusion.

Data Fusion Process
A data fusion node collects the results from multiple nodes.
It fuses the results with its own based on a decision criterion.
It sends the fused data to another node or to the base station.
Advantages:
–Reduces the traffic load.
–Conserves energy of the sensors.

Key Concepts in Data Fusion
Three questions need to be addressed:
First, when does a node report a sensed event?
Second, how does a node fuse multiple reports into a single one?
Third, which data fusion architecture should be used?

Reporting
Periodic reports: sensor nodes periodically send reports to the base station.
Base station inquiry response reports: the BS queries sensors in specific regions for currently sensed information.
Event-triggered reports: the occurrence of a certain event triggers reports from the sensors in that particular region.

Fusion Decision
Voting: the oldest and most widely used fusion decision method.
The fusion node arrives at a consensus through a voting scheme such as:
–Majority voting
–Complete agreement
–Weighted voting
The popularity of voting arises from its simplicity and accuracy.
Other fusion decision algorithms include the probability-based Bayesian model and stacked generalization.

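The three voting schemes named on this slide can be sketched as follows. This is a minimal illustration, not the method of any cited paper: the function names, the use of `None` for "no consensus", and the rule that weighted voting picks the value with the largest total weight are all assumptions, and each function expects a non-empty list of reports.

```python
from collections import Counter


def majority_vote(reports):
    """Return the value reported by more than half of the nodes, else None."""
    value, count = Counter(reports).most_common(1)[0]
    return value if count > len(reports) / 2 else None


def complete_agreement(reports):
    """Return the common value only if every node reported the same value."""
    return reports[0] if len(set(reports)) == 1 else None


def weighted_vote(reports, weights):
    """Return the value whose supporters carry the largest total weight.

    The weights are hypothetical per-node trust values.
    """
    totals = {}
    for value, weight in zip(reports, weights):
        totals[value] = totals.get(value, 0.0) + weight
    return max(totals, key=totals.get)
```

For example, `majority_vote([1, 1, 0])` yields 1, while `complete_agreement([1, 0, 1])` yields `None` because one node disagrees.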
Fusion Architecture
Centralized:
–Simplest architecture.
–A central processor fuses the reports collected by all other sensing nodes.
–Advantage: erroneous reports can be easily detected.
–Disadvantage: inflexible to sensor changes, and the workload is concentrated at a single point.

Fusion Architecture (2)
Decentralized:
–Data fusion occurs locally at each node on the basis of local observations and the information obtained from neighboring nodes.
–No central processor node.
–Advantages: scalable and tolerant to the addition or loss of sensing nodes and to dynamic changes in the network.

Fusion Architecture (3)
Hierarchical:
–Nodes are partitioned into hierarchical levels.
–The sensing nodes are at level 0 and the BS is at the highest level.
–Reports move from the lower levels to the higher ones.
–Advantage: workload is balanced among nodes.

Cluster Based Data Fusion

Problem
Because of their energy constraints, sensors need to perform efficient data fusion to extend the lifetime of the network.
The lifetime of a sensor network is the number of rounds of data fusion it can perform before the first sensor runs out of energy.
Maximizing this lifetime is known as the “Maximum Lifetime Data Aggregation” (MLDA) problem.

Goal
Given: the location and energy of each sensor and of the BS.
Find: an efficient way to collect and aggregate reports from the sensors to the BS.
[Dasgupta, WCNC’03] propose a cluster-based heuristic (CMLDA) to solve the problem.

System Model
n sensor nodes (1..n)
Base station (n+1)
Fixed data packet size: k bits
Initial energy of sensor i: $\varepsilon_i$
Receive energy: $RX_i = \varepsilon_{elec} \cdot k$
Transmission energy: $TX_{i,j} = \varepsilon_{elec} \cdot k + \varepsilon_{amp} \cdot d_{i,j}^2 \cdot k$, where $d_{i,j}$ is the distance between nodes i and j.

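The energy model above can be evaluated directly. The sketch below only restates the two formulas; the function names are hypothetical, and the constants in the usage note (50 nJ/bit for the electronics, 100 pJ/bit/m^2 for the amplifier, 1000-bit packets) are illustrative assumptions, not values given on the slide.

```python
def rx_energy(k_bits, e_elec):
    """Energy (J) to receive one k-bit packet: e_elec * k."""
    return e_elec * k_bits


def tx_energy(k_bits, distance, e_elec, e_amp):
    """Energy (J) to send one k-bit packet over `distance` meters:
    e_elec * k + e_amp * d^2 * k."""
    return e_elec * k_bits + e_amp * distance ** 2 * k_bits
```

With the assumed constants, receiving a 1000-bit packet costs about 50 µJ and sending it over 10 m costs about 60 µJ, so the distance-dependent term quickly dominates as nodes move apart.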
Algorithm
Two phases.
Phase 1:
–Sensors are grouped into clusters called “super-sensors”.
–Each super-sensor consists of a minimum number of sensors.
–The energy of a super-sensor is the sum of the energies of all the sensors within it.
–The distance between two super-sensors is the maximum distance between any two sensors that reside in different super-sensors.
–Apply the MLDA algorithm.

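The two super-sensor quantities defined in phase 1 can be computed as in the sketch below. It assumes sensors are given as coordinate tuples and energies as plain numbers; the function names are hypothetical, and how sensors are assigned to clusters is not covered here.

```python
import math
from itertools import product


def super_sensor_energy(energies):
    """Energy of a super-sensor: sum of its member sensors' energies."""
    return sum(energies)


def super_sensor_distance(cluster_a, cluster_b):
    """Distance between two super-sensors: the maximum distance between
    a sensor in cluster_a and a sensor in cluster_b."""
    return max(math.dist(p, q) for p, q in product(cluster_a, cluster_b))
```

Taking the maximum pairwise distance is a conservative choice: it never underestimates the transmission energy needed between any two member sensors.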
MLDA Algorithm
An integer linear program (ILP) is used to find a near-optimal admissible flow network.
Objective: maximize the lifetime of the network (T) under the energy constraints.
Generate one or more schedules from the admissible flow network.

Example
[Figure: two example aggregation schedules, Schedule 1 and Schedule 2]

Algorithm (2)
Phase 2:
1. Initialize the set of aggregation schedules to Ø.
2. Set the lifetime T = 0.
3. Choose a schedule from phase 1.
4. Initialize the aggregation tree A with the BS.
5. Visit each super-sensor and add its nodes to the tree such that the residual energy at each edge is maximized.
6. Add A to the set of aggregation schedules.
7. Increment T by 1.
8. Repeat steps 3-7 until a node runs out of energy.

Comments
Provides a set of data fusion schedules that maximize the lifetime of the network.
Clustering of nodes reduces the time needed to solve the ILP.

Synchronization Among Nodes

Problem
During data fusion, internal nodes at each level wait for a certain period of time before they fuse the received reports.
If nodes at every level wait for the same period of time, an internal node may time out before receiving reports from all of its children.
With insufficient reports, the credibility of a sensed event is questionable.

Example
[Figure: fusion tree with nodes B, C, D, E and F on levels 0 to 3 below the base station. Labels: Senses, T = 0.5 sec, TIMEOUT, Report D.]

Solution
An efficient data fusion protocol with the following characteristics:
–Synchronizes the nodes at different levels.
–Nodes at higher levels wait longer before fusing data.
–A fixed time period is assigned from the sensing of an event to the time it is received by the base station.
–Provides a balance between latency and accuracy.

Multi-level Fusion Synchronization (MFS) Protocol
[Yuan, GLOBECOM’03] propose the MFS protocol.
The parameters:
–MAX: the time the BS waits before fusing the received data
–Δ: the difference in waiting period between consecutive levels
–K: the distance (in hops) from the sink

Algorithm
Upon detection of an event, a leaf node reports to its parent node.
This triggers the timer of the parent node.
The parent node then sends a START message to trigger the timers of its neighboring nodes.
The timer at a node expires after (MAX – K*Δ) seconds.

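The timer rule MAX – K*Δ can be sketched in a few lines. The function name is hypothetical, and the check that rejects non-positive timeouts is an added assumption (the slide does not say what happens when K*Δ reaches MAX).

```python
def mfs_timeout(max_wait, delta, hops_from_sink):
    """Waiting time (seconds) of a node that is `hops_from_sink` hops
    from the BS: MAX - K * delta."""
    timeout = max_wait - hops_from_sink * delta
    if timeout <= 0:
        # Assumption: a non-positive timer means MAX and delta are
        # badly chosen for this tree depth.
        raise ValueError("MAX is too small for this depth and delta")
    return timeout


# Illustrative values: MAX = 1 s, delta = 0.2 s.
timeouts = [mfs_timeout(1.0, 0.2, k) for k in range(3)]
```

With these values the BS (K = 0) waits 1.0 s, a node one hop away waits about 0.8 s, and a node two hops away waits about 0.6 s, so nodes farther from the sink fuse and forward first.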
An Example
[Figure: fusion tree with nodes B, C, D, E and F on levels 0 to 3 below the base station, with MAX = 1 sec and Δ = 0.2 sec. Timer labels: T = (1 – (2*0.2)) = 0.6 sec, T = (1 – (1*0.2)) = 0.8 sec, and T = 1.0 sec. Other labels: Senses, START, Report C+D.]

Latency
Best case:
–Assuming START messages do not collide and there is no propagation delay in triggering the timers, the latency is MAX.
Worst case:
–Assuming none of the internal nodes receives the START message:
$L = \sum_{j=0}^{D-1} (\mathit{MAX} - j\Delta) = D \cdot \mathit{MAX} - \frac{(D-1) D \Delta}{2}$, where D is the depth of the propagation tree.

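The worst-case expression can be checked numerically by comparing the term-by-term sum with its closed form. The function names are hypothetical; the sample values in the usage note (MAX = 1.2 s, Δ = 0.2 s, D = 3) are illustrative.

```python
def worst_case_latency_sum(max_wait, delta, depth):
    """Sum of the timers of all `depth` levels, each started only after
    the level below has timed out: sum over j of (MAX - j * delta)."""
    return sum(max_wait - j * delta for j in range(depth))


def worst_case_latency_closed_form(max_wait, delta, depth):
    """Closed form of the same sum: D * MAX - (D - 1) * D * delta / 2."""
    return depth * max_wait - (depth - 1) * depth * delta / 2
```

For MAX = 1.2 s, Δ = 0.2 s and D = 3, both functions give about 3.0 s (1.2 + 1.0 + 0.8), compared with a best case of 1.2 s.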
Setting the parameters
If the BS knows the depth of the fusion tree, it can compute the values of MAX and Δ.
Otherwise, in a learning phase, the BS queries the sensors with different values of MAX and Δ.
It then adjusts the values based on the credibility of the reports and the application requirements.

Result: No. of reports vs. Δ
Performance is similar with both BFS (balanced tree) and ODMRP (unbalanced tree).
Very small or very large values of Δ perform worst.
MAX = 1.2 s

Result (2): Latency vs. Δ
A small Δ incurs a long waiting period, whereas a large Δ incurs a short one.
In BFS, the latency for each Δ is less than 2*MAX.

Pros and Cons
Pros:
–Synchronizes nodes at different levels.
–MAX and Δ can be tuned.
Cons:
–Reports arriving after the timeout are discarded.
–Collisions of START messages cause a latency greater than MAX.

Resistance Against Attacks

Problem
So far, it has been assumed that the nodes conducting the data fusion are secure.
However, a malicious data fusion node can send bogus reports to the BS.
The BS cannot detect the bogus information, since the sensor nodes do not send their reports directly to the BS.

Witness Based Data Assurance
[Du, GLOBECOM’03] present a witness-based scheme to ensure that the BS accepts only valid data fusion results.
To prove the validity of a report, the fusion node is required to provide proofs from several witnesses.
A witness is a node that also performs data fusion but does not send its report to the BS.

Algorithm
1. Let there be m witnesses and 1 data fusion node.
2. Each witness $w_i$ shares a unique key $k_i$ with the BS.
3. After receiving reports from the sensor nodes, each witness performs data fusion and obtains the result $r_i$.
4. It then sends a MAC (Message Authentication Code) to the data fusion node: $MAC_i = MAC(r_i, w_i, k_i)$.
5. The data fusion node computes its own result and sends it, together with the MACs from its witnesses, to the BS.
6. The BS applies a voting scheme to determine the validity of the report.
7. If the report is corrupted, the BS discards it and polls one of the witness nodes for the correct report.

Voting Schemes
The base station can employ two voting schemes to determine the validity of the fused report.
–m+1 out of m+1: the result is valid if it is supported by all the witnesses.
–n out of m+1 (1 ≤ n ≤ m+1): the result is valid if it is supported by at least n witnesses.

m+1 out of m+1 voting scheme
1. After receiving all the MACs from the witness nodes, the data fusion node F computes: $MAC_F = MAC(S_F, F, K_F, MAC_1 \oplus \dots \oplus MAC_m)$
2. F then sends $(S_F, F, w_1, \dots, w_m, MAC_F)$ to the BS.
3. The BS then computes $MAC_i = MAC(S_F, w_i, k_i)$ for each $w_i$.
4. Finally, it computes: $MAC'_F = MAC(S_F, F, K_F, MAC_1 \oplus \dots \oplus MAC_m)$
5. If $MAC_F = MAC'_F$, the BS accepts the report.

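The m+1 out of m+1 check can be sketched with Python's standard `hmac` module. This is an illustration under stated assumptions, not the construction from the cited paper: HMAC-SHA256 stands in for the unspecified MAC function, fields are length-prefixed byte strings, the key is passed as the HMAC key rather than as a message field, and all function names are hypothetical. At least one witness is assumed.

```python
import hashlib
import hmac


def mac(key, *fields):
    """HMAC-SHA256 over length-prefixed fields (encoding is an assumption)."""
    msg = b"".join(len(f).to_bytes(4, "big") + f for f in fields)
    return hmac.new(key, msg, hashlib.sha256).digest()


def xor_all(macs):
    """Bytewise XOR of equal-length MACs: MAC_1 xor ... xor MAC_m."""
    out = bytes(len(macs[0]))
    for m in macs:
        out = bytes(a ^ b for a, b in zip(out, m))
    return out


def fusion_node_mac(result, fusion_id, fusion_key, witness_macs):
    """Step 1: F folds the XOR of all witness MACs into its own MAC."""
    return mac(fusion_key, result, fusion_id, xor_all(witness_macs))


def bs_accepts(result, fusion_id, fusion_key, witness_keys, mac_f):
    """Steps 3-5: the BS recomputes every MAC_i from the reported result
    S_F and compares its own MAC'_F with the received MAC_F.
    `witness_keys` maps each witness id to the key it shares with the BS."""
    expected = [mac(k, result, w) for w, k in witness_keys.items()]
    mac_f_prime = mac(fusion_key, result, fusion_id, xor_all(expected))
    return hmac.compare_digest(mac_f, mac_f_prime)
```

If any single witness computed a different result than S_F, its MAC differs from what the BS recomputes, the XOR changes, and the report is rejected.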
n out of m+1 voting scheme
The disadvantage of the previous approach is that a corrupt witness node can always send an invalid MAC and thereby mount a denial-of-service attack.
To prevent that, F should not merge the $MAC_i$ values but instead forward them all: $R = (S_F, F, MAC_F, w_1, MAC_1, \dots, w_m, MAC_m)$
If at least n out of the m+1 MACs match, the result $S_F$ is accepted.
Otherwise the result is dropped.

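The n out of m+1 rule amounts to counting how many of the forwarded MACs the BS can reproduce from the reported result. The sketch below makes several assumptions that the slide does not specify: HMAC-SHA256 as the MAC function, length-prefixed byte-string fields, and a fusion-node MAC computed over the result and the node id only. All function names are hypothetical.

```python
import hashlib
import hmac


def mac(key, *fields):
    """HMAC-SHA256 over length-prefixed fields (encoding is an assumption)."""
    msg = b"".join(len(f).to_bytes(4, "big") + f for f in fields)
    return hmac.new(key, msg, hashlib.sha256).digest()


def count_matching(result, fusion_id, fusion_key, mac_f,
                   witness_macs, witness_keys):
    """Count how many of the m+1 forwarded MACs match the reported result.

    `witness_macs` maps witness id -> MAC forwarded by F;
    `witness_keys` maps witness id -> key shared with the BS.
    """
    votes = int(hmac.compare_digest(mac_f, mac(fusion_key, result, fusion_id)))
    for w, m in witness_macs.items():
        key = witness_keys.get(w)
        if key is not None and hmac.compare_digest(m, mac(key, result, w)):
            votes += 1
    return votes


def bs_accepts_n_of_m_plus_1(n, *args):
    """Accept the result when at least n of the m+1 MACs match."""
    return count_matching(*args) >= n
```

Because each MAC is checked separately, one corrupt witness only costs a single vote instead of invalidating the whole report.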
Pros & Cons
Pros:
–Provides a scheme that ensures that only valid reports are accepted by the BS.
Cons:
–Redundancy: multiple copies of similar reports are fused by the witnesses.
–Not energy efficient.

Conclusion
This talk gave an overview of the data fusion process in sensor networks.
Different data fusion architectures and voting schemes were presented.
Three important aspects of efficient data fusion were presented: energy efficiency, synchronization among sensors, and resistance against attacks.
An ideal data fusion scheme would incorporate all three characteristics.

References
K. Dasgupta, K. Kalpakis and P. Namjoshi, “An Efficient Clustering-based Heuristic for Data Gathering and Aggregation in Sensor Networks,” IEEE WCNC, 2003.
K. Kalpakis, K. Dasgupta and P. Namjoshi, “Maximum Lifetime Data Gathering and Aggregation in Wireless Sensor Networks,” IEEE ICN.
W. Yuan, S. V. Krishnamurthy and S. K. Tripathi, “Synchronization of Multiple Levels of Data Fusion in Wireless Sensor Networks,” in Proceedings of GLOBECOM, 2003.
W. Du, J. Deng, Y. S. Han and P. K. Varshney, “A Witness-Based Approach for Data Fusion Assurance in Wireless Sensor Networks,” in Proceedings of GLOBECOM, 2003.