A learning approach for reducing data packets in sensor networks Yinghui Na.

1 A learning approach for reducing data packets in sensor networks Yinghui Na

2 Problems
Sensors have very limited computation capability.
Traditionally, sensors report ALL collected data to the base station (BS).
In some situations, most of this data is not of interest
–E.g., rare event (intrusion) detection
Is there a way to report only the data of interest to the BS, and thus save limited resources and maximize the lifetime of the sensors?

3 Classification
Interaction between data mining algorithms and network protocols
Classification: an inductive task of finding patterns
–Assign objects to one of several predefined categories
A 'supervised' approach: classify unknown examples (test data) based on well-known examples (training data).

4 Approach
We denote the class label of the i-th example x_i by y_i, where y_i ∈ Y = {0, 1}; 0 is negative and 1 is positive.
Collected data points can be labeled at the base station as positive (interesting) or negative (not interesting).
Process
–Initialization: at the beginning, the BS has no data points; the sensors send all data points until they receive the first model from the base station
–Classification model creation: the BS forms the classification model once it has received a minimum number of positive examples
–Selective reporting: sensors report all positive data points and part of the negative data points, as determined by the model
–Model update: the BS retains all received data and updates the model
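The four process steps can be sketched as a small event loop. This is a minimal illustration, not the presenter's implementation: the names `Sensor`, `BaseStation`, `run`, the `label` oracle and the `fit` trainer are all hypothetical, and the choice to refit the model on every received point once enough positives exist is an assumption (the slides do not say how often the model is updated).

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional, Tuple

Point = Tuple[float, ...]
Model = Callable[[Point], bool]  # True means "report this point"


@dataclass
class Sensor:
    model: Optional[Model] = None  # no model until the BS sends one

    def should_report(self, x: Point) -> bool:
        # Initialization phase: without a model, report everything.
        if self.model is None:
            return True
        return self.model(x)


@dataclass
class BaseStation:
    min_positives: int                                # positives required before a model is built
    label: Callable[[Point], int]                     # hypothetical labeling oracle: 1 = interesting
    fit: Callable[[List[Point], List[int]], Model]    # hypothetical trainer
    data: List[Point] = field(default_factory=list)
    labels: List[int] = field(default_factory=list)

    def receive(self, x: Point) -> Optional[Model]:
        # The BS retains and labels every received point.
        self.data.append(x)
        self.labels.append(self.label(x))
        # Assumption: (re)build the model whenever enough positives are stored.
        if sum(self.labels) >= self.min_positives:
            return self.fit(self.data, self.labels)
        return None


def run(stream: Iterable[Point], sensor: Sensor, bs: BaseStation) -> int:
    """Feed a stream of readings through the protocol; return packets sent."""
    sent = 0
    for x in stream:
        if sensor.should_report(x):
            sent += 1
            model = bs.receive(x)
            if model is not None:
                sensor.model = model  # model pushed from BS to sensor
    return sent
```

Any classifier can be plugged in through `fit`; the sketch only fixes the message flow between sensor and base station.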

5

6 Cost comparison
If all collected data is reported, the cost is C_b = N*c, where N is the number of all data points and c, the cost of sending a data point, is assumed constant.
In the proposed approach, the total cost is C = N_s*c + N_fp*C_fp + N_fn*C_fn + N_m*C_m, where N_s is the number of selected data points sent from the sensors to the BS; N_fp and N_fn are the numbers of false positives and false negatives, respectively; C_fp and C_fn are their corresponding costs per data point; N_m is the number of models sent by the base station to the sensors; and C_m is the cost of sending one model.
The approach is profitable only if its cost is lower than that of the traditional approach, i.e., C < C_b.
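The two cost expressions and the profitability condition translate directly into code. The function and parameter names below are hypothetical, and the numbers in the usage lines are made-up illustrative values, not results from the presentation.

```python
def baseline_cost(n_points: int, c: float) -> float:
    """Cost of reporting every data point: C_b = N * c."""
    return n_points * c


def selective_cost(n_sent: int, c: float,
                   n_fp: int, c_fp: float,
                   n_fn: int, c_fn: float,
                   n_models: int, c_model: float) -> float:
    """Total cost C = N_s*c + N_fp*C_fp + N_fn*C_fn + N_m*C_m."""
    return n_sent * c + n_fp * c_fp + n_fn * c_fn + n_models * c_model


def is_profitable(selective: float, baseline: float) -> bool:
    """The selective scheme pays off only if it is strictly cheaper."""
    return selective < baseline


# Illustrative (invented) values: 1000 readings, 100 of them sent.
c_b = baseline_cost(1000, 1.0)
c_sel = selective_cost(n_sent=100, c=1.0, n_fp=60, c_fp=1.0,
                       n_fn=5, c_fn=16.0, n_models=3, c_model=5.0)
print(c_b, c_sel, is_profitable(c_sel, c_b))
```

Note how a large false negative cost C_fn can dominate the total even when few points are missed, which is why profitability depends on the penalty settings.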

7 Cost matrix
The penalties of classifying the data points at the BS can be represented by a 2×2 matrix with elements c(i,j): c(0,0) denotes the penalty for not sending a negative data point, c(0,1) the penalty for a false negative (not sending a positive data point), c(1,0) the penalty for sending a negative data point, and c(1,1) the penalty for sending a true positive data point.
We assume that c(0,0) = 0 and c(1,1) = c. The penalties c(0,1) = C_fn and c(1,0) = c + C_fp are varied.
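One common way to use such a matrix is the minimum-expected-penalty rule: send a point only if the expected penalty of sending is no larger than that of withholding it. The slides do not state that this exact rule was used, so the sketch below is an assumption; `should_send` and `send_threshold` are hypothetical names, and `cost[i][j]` follows the slide's indexing (i = action, 1 meaning send; j = true class, 1 meaning positive).

```python
from typing import Sequence


def should_send(p_pos: float, cost: Sequence[Sequence[float]]) -> bool:
    """Decide whether to send, given P(positive | x) and the 2x2 penalty matrix."""
    p_neg = 1.0 - p_pos
    # Expected penalty if the point is sent (true positive or false positive).
    send = p_pos * cost[1][1] + p_neg * cost[1][0]
    # Expected penalty if the point is withheld (false negative or true negative).
    skip = p_pos * cost[0][1] + p_neg * cost[0][0]
    return send <= skip


def send_threshold(cost: Sequence[Sequence[float]]) -> float:
    """Posterior above which sending has the lower expected penalty."""
    gain_neg = cost[1][0] - cost[0][0]   # extra penalty of sending a negative
    gain_pos = cost[0][1] - cost[1][1]   # extra penalty of withholding a positive
    return gain_neg / (gain_neg + gain_pos)


# Illustrative matrix: c(0,0)=0, c(0,1)=16, c(1,0)=2, c(1,1)=1.
cost = [[0.0, 16.0],
        [2.0, 1.0]]
print(should_send(0.02, cost), should_send(0.2, cost), send_threshold(cost))
```

Raising c(0,1) lowers the threshold, so more borderline points are sent: fewer false negatives at the price of more false positives.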

8 Classification modeling
Naïve Bayes classifier (NBC)
–A direct application of Bayes' theorem
–P(C|X) = P(X|C)P(C)/P(X), where X is a feature vector (x1, x2, ..., xn)
We used the NBC as the classification modeling technique at the base station.
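A naïve Bayes classifier factorizes P(X|C) into a product of per-feature likelihoods. The sketch below assumes Gaussian per-feature likelihoods, which the slides do not specify; the class name `GaussianNaiveBayes` and its methods are hypothetical and use only the Python standard library.

```python
import math
from collections import defaultdict


class GaussianNaiveBayes:
    """Naive Bayes with an (assumed) Gaussian likelihood per feature."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for xi, yi in zip(X, y):
            groups[yi].append(xi)
        n = len(X)
        self.priors = {}   # P(C)
        self.stats = {}    # per class: list of (mean, variance) per feature
        for c, rows in groups.items():
            self.priors[c] = len(rows) / n
            cols = list(zip(*rows))
            means = [sum(col) / len(col) for col in cols]
            # Small constant keeps the variance strictly positive.
            variances = [sum((v - m) ** 2 for v in col) / len(col) + 1e-9
                         for col, m in zip(cols, means)]
            self.stats[c] = list(zip(means, variances))
        return self

    def _log_joint(self, x, c):
        # log P(C) + sum_k log P(x_k | C), using feature independence.
        total = math.log(self.priors[c])
        for v, (m, var) in zip(x, self.stats[c]):
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return total

    def posterior(self, x):
        # Normalizing by the sum plays the role of dividing by P(X).
        logs = {c: self._log_joint(x, c) for c in self.priors}
        top = max(logs.values())
        unnorm = {c: math.exp(l - top) for c, l in logs.items()}
        z = sum(unnorm.values())
        return {c: u / z for c, u in unnorm.items()}

    def predict(self, x):
        post = self.posterior(x)
        return max(post, key=post.get)
```

The posterior for class 1 returned by `posterior` is the quantity a sensor-side reporting rule would compare against a threshold.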

9 Performance evaluation
Simulation
TOSSIM
–The network simulation parameters were: packet sizes of 32 bytes (sensor data) and 140 bytes (BS learning model); 100 nodes
PRTools
–We used the PRTools data generators [] and obtained examples using the gendatd and gendatb routines. The dataset contained 1,000,000 examples. Furthermore, we assumed that the probability of a positive example was 0.02 and the probability of a negative example was 0.98.

10 Results
We assume costs c(1,0) ∈ {1,4,16,64} and c(0,1) ∈ {1,4,16,64,256,1024,4096}.
Traditionally, transmitting 1,000,000 data points has a total cost of 1,000,000.
In the table, increasing the false negative penalty beyond 1024 resulted in a non-profitable system.

c(1,0)  c(0,1)  N_fp   N_fn  Cost
1       1       1236   8872  33091
1       4       1931   6435  53590
1       16      2952   4326  101234
1       64      7008   2320  190616
1       256     17413  1127  352651
1       1024    47901  354   488384
1       4096    46653  580   2498764
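The profitability claim can be checked mechanically against the baseline cost of 1,000,000. The sketch below uses the Cost values as read from the results table, keyed by the false negative penalty c(0,1); the names `BASELINE`, `cost_by_fn_penalty` and `savings` are hypothetical.

```python
BASELINE = 1_000_000  # cost of sending all 1,000,000 points at c = 1

# Total cost per false negative penalty c(0,1), as read from the table.
cost_by_fn_penalty = {
    1: 33091,
    4: 53590,
    16: 101234,
    64: 190616,
    256: 352651,
    1024: 488384,
    4096: 2498764,
}


def savings(cost: float, baseline: float = BASELINE) -> float:
    """Fraction of the baseline cost saved (negative means a loss)."""
    return 1.0 - cost / baseline


profitable = [p for p, cost in cost_by_fn_penalty.items() if cost < BASELINE]

for penalty, cost in cost_by_fn_penalty.items():
    print(f"c(0,1)={penalty:>4}: cost={cost:>8}, savings={savings(cost):+.1%}")
```

Only the 4096 setting exceeds the baseline, which matches the statement that penalties beyond 1024 make the system non-profitable.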

11 References
[1] F. Zhao and L. Guibas. Wireless Sensor Networks: An Information Processing Approach. Morgan Kaufmann, 2004.
[2] H. Kargupta. Distributed Data Mining for Sensor Networks. PKDD 2004, Tutorial.
[3] W. Heinzelman, A. Chandrakasan, and H. Balakrishnan. Energy-efficient communication protocol for wireless microsensor networks. In Proceedings of the Hawaii Conference on System Sciences, January 2000.
[4] W. Heinzelman, A. Chandrakasan, and H. Balakrishnan. An application-specific protocol architecture for wireless microsensor networks. IEEE Transactions on Wireless Communications, 1(4):660-670, 2002.
[5] P. Radivojac, U. Korad, K. M. Sivalingam, and Z. Obradovic. Learning from class-imbalanced data in wireless sensor networks. In 58th IEEE Semiannual Conf. Vehicular Technology Conference (VTC), volume 5, pages 3030-3034, Orlando, FL, October 2003.
[6] S. S. Ghiasi, A. Srivastava, X. Yang, and M. Sarrafzadeh. Optimal energy aware clustering in sensor networks. Sensors, 2:258-269, 2002.
[7] O. Younis and S. Fahmy. HEED: A hybrid, energy-efficient, distributed clustering approach for ad-hoc sensor networks. IEEE Transactions on Mobile Computing, 3(4), 2004.
[8] W. Chen, J. C. Hou, and L. Sha. Dynamic clustering for acoustic target tracking in wireless sensor networks. IEEE Transactions on Mobile Computing, 3(3):258-271, 2004.
[9] D. Zeinalipour-Yazti, Z. Vagena, D. Gunopulos, V. Kalogeraki, V. Tsotras, M. Vlachos, N. Koudas, and D. Srivastava. The threshold join algorithm for top-k queries in distributed sensor networks. In DMSN '05: Proceedings of the 2nd International Workshop on Data Management for Sensor Networks, pages 61-66, New York, NY, USA, 2005. ACM Press.
[10] T. Palpanas, D. Papadopoulos, V. Kalogeraki, and D. Gunopulos. Distributed deviation detection in sensor networks. SIGMOD Record, 32(4):77-82, December 2003.
[11] K. Loo, I. Tong, B. Kao, and D. Cheung. Online algorithms for mining inter-stream associations from large sensor networks. In Proceedings of the 9th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2005.
[12] S. Vucetic, D. Pokrajac, H. Xie, and Z. Obradovic. Detection of underrepresented biological sequences using class-conditional distribution models. In Proceedings of the Third SIAM International Conference on Data Mining, May 2003.
[13] University of California, Berkeley. TOSSIM: Simulating TinyOS Networks. http://www.cs.berkeley.edu/~pal/research/tossim.html
[14] PRTools, a Matlab Toolbox for Pattern Recognition. http://www.ph.tn.tudelft.nl/prtools, 2002.

