Wireless Ad Hoc Sensor Networks: An Overview Yu Hen Hu University of Wisconsin – Madison Dept. of Electrical and Computer Engineering Madison, WI Presented at CVGIP’03, 8/12/2003
© 2003 by Yu Hen Hu 2 UWCSP Node signal processing –Energy Detection –Target classification Region signal processing –Region detection and classification – voting –Energy based localization –Least square tracking –Hand-off policy
© 2003 by Yu Hen Hu 3 Road and Sensor placement
© 2003 by Yu Hen Hu 4 Closeup view
© 2003 by Yu Hen Hu 5 Region Partition
© 2003 by Yu Hen Hu 6 Region 1 Detection Fusion
© 2003 by Yu Hen Hu 7 Region 2 detection fusion
© 2003 by Yu Hen Hu 8 Acoustic Energy Series and Detection
© 2003 by Yu Hen Hu 9 Acoustic Energy Snapshots
© 2003 by Yu Hen Hu 10 Target Localization using Acoustic Signatures Task: determine the target location (e.g., a tank) based on the acoustic signal emitted from the target. Assumptions: –Known sensor locations –Point, omni-directional acoustic signal source –Single target –Multiple sensors detect the acoustic signal simultaneously Constraints: –Limited wireless communication bandwidth –Low power operation
© 2003 by Yu Hen Hu 11 Basic Approaches Distance based localization –If the distance between the target and each sensor can be determined, then the target location can be found via triangulation (trilateration). Distance estimation: –Absolute distance is harder to come by. –Relative distance can be obtained more easily in two ways: relative time delay and relative signal intensity. –Time delay estimation: the acoustic signal travels at a constant speed in air. –Signal intensity: acoustic signal intensity decays as the square of the distance between target and sensor.
© 2003 by Yu Hen Hu 12 Energy Based Localization (EBL) Energy decay model –The energy of the acoustic signal is inversely proportional to the square of the source-to-sensor distance [Kinsler, Fundamentals of Acoustics, 1982]: y_i(t) = g_i s(t) / ||r(t) − r_i||^α + ε_i(t), where y_i(t): energy measured at the i-th sensor node at time t; g_i: sensor gain; α (= 2): energy decay constant; s(t): source energy at time t; r(t): source location at time t; r_i: i-th sensor location; ε_i: model error, parameter error, and background noise energy, with E{ε_i} > 0.
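As a concreteness check, the model above amounts to a one-line computation per sensor. Below is a minimal Python sketch, assuming α = 2; the function name, sensor placement, and gain values are illustrative, not part of the original system.

```python
# Minimal sketch of the energy decay model above, assuming alpha = 2.
# The function name, sensor placement, and gain are illustrative.
import numpy as np

def sensor_energy(source_pos, source_energy, sensor_pos, gain,
                  alpha=2.0, noise=0.0):
    """y_i(t) = g_i * s(t) / ||r(t) - r_i||^alpha + eps_i(t)."""
    dist = np.linalg.norm(np.asarray(source_pos) - np.asarray(sensor_pos))
    return gain * source_energy / dist**alpha + noise

# Example: unit-energy source at (10, 5) observed by a sensor at the origin.
print(sensor_energy((10.0, 5.0), 1.0, (0.0, 0.0), gain=1.0))
```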
© 2003 by Yu Hen Hu 13 Direct ML Parameter Estimation Consider the energy decay equation with K targets: y_i(t) = g_i Σ_{k=1}^{K} S_k(t) / ||r_k(t) − r_i||^α + ε_i(t). Given {y_i(t), r_i, g_i; i = 1, 2, …, n}, one wants to estimate r_k(t) and S_k(t). Denote z_i = (y_i − μ_i)/σ_i, S = [S_1 … S_K]^T, and H = GD, where G = diag(g_1/σ_1, …, g_n/σ_n) and D is the n × K matrix with entries 1/||r_k − r_i||^α. Then z = HS + w, where w is whitened noise. Assuming ε_i(t) ~ N(μ_i, σ_i²), the negative log-likelihood function to be minimized is L(θ) = ||z − HS||², with the parameter vector θ = {r_k(t), S_k(t)}: a nonlinear optimization problem. If the target locations are known, the D matrix, and hence the H matrix, will be known. S can then be solved in closed form as S = H^+ z. Substituting into the quadratic cost function leads to L(r) = ||(I − HH^+)z||²: still nonlinear! Note D has only 2K < n degrees of freedom.
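The elimination of S described above can be illustrated for the single-target case. This is a hedged sketch, not the authors' implementation: the synthetic data, the variable names, and the use of a generic optimizer are all assumptions.

```python
# Hedged sketch of the direct ML idea for one target: given a candidate
# location r, the source energy S has the closed form H^+ z, so only the
# residual must be minimized over r. All data below is illustrative.
import numpy as np
from scipy.optimize import minimize

def residual(r, y, sensors, gains, sigma, alpha=2.0):
    d = np.linalg.norm(sensors - r, axis=1)
    h = gains / (sigma * d**alpha)     # single column of H = G D
    z = y / sigma                      # normalized energy readings
    s_hat = (h @ z) / (h @ h)          # closed-form LS source energy
    return np.sum((z - s_hat * h)**2)  # cost is still nonlinear in r

rng = np.random.default_rng(0)
sensors = rng.uniform(0.0, 10.0, (5, 2))        # 5 known sensor locations
target = np.array([3.0, 4.0])
gains, sigma = np.ones(5), np.ones(5)
y = gains / np.linalg.norm(sensors - target, axis=1)**2  # noiseless readings

res = minimize(residual, x0=np.array([5.0, 5.0]),
               args=(y, sensors, gains, sigma))
print(res.x)  # estimated target location, close to (3, 4)
```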
© 2003 by Yu Hen Hu 14 Least Square Formulations Rewrite the hyper-sphere equation ||r − r_i||² = d_i² for sensors i and j, and subtract to cancel the quadratic term: 2(r_j − r_i)^T r = d_i² − d_j² − ||r_i||² + ||r_j||². A set of linear equations that can be solved using a least square solution! Constrained least square formulation: find r and R_s = ||r||² to minimize the residual subject to the quadratic constraint relating them. Other constrained LS formulations have also been proposed.
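A minimal sketch of the resulting linear system, assuming the per-sensor distances d_i have already been estimated; the sensor layout and function name are illustrative.

```python
# Minimal sketch of the pairwise hyper-sphere difference above, assuming
# the per-sensor distances d_i are already estimated; data is illustrative.
import numpy as np

def ls_localize(sensors, d):
    """Solve 2(r_j - r_i)^T r = d_i^2 - d_j^2 - ||r_i||^2 + ||r_j||^2."""
    A, b = [], []
    for i in range(len(sensors)):
        for j in range(i + 1, len(sensors)):
            A.append(2.0 * (sensors[j] - sensors[i]))
            b.append(d[i]**2 - d[j]**2
                     - sensors[i] @ sensors[i] + sensors[j] @ sensors[j])
    r, *_ = np.linalg.lstsq(np.asarray(A), np.asarray(b), rcond=None)
    return r

sensors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
d = np.linalg.norm(sensors - np.array([3.0, 4.0]), axis=1)
print(ls_localize(sensors, d))  # recovers (3, 4) in the noiseless case
```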
© 2003 by Yu Hen Hu 15 Conclusion Sensor networks are a new application area for computer vision, graphics and image processing. They require multi-modality, multimedia processing under the constraint of minimizing communication and energy consumption.
University of Wisconsin-Madison Marco Duarte, Ashwin D’Costa, Dan Li, Ahtasham Ashraf, Kerry Wong, Akbar Sayeed, Yu Hen Hu
© 2003 by Yu Hen Hu 17 Research Team V. Phipatanasuphorn, T. Cloqueur, K.-C. Wang, T.-L. Chin, P. Ramanathan, K. Saluja, Networking API Fault-Tolerant Networking M. Duarte, A. D’Costa, D. Li, A. Ashraf, Yu Hen Hu, A. Sayeed Collaborative Signal Processing (CSP) Sensor Network Data Collection
© 2003 by Yu Hen Hu 18 Collaborative Signal Processing Goal: Disguised sensor nodes (MEMS) in wireless networking. Prototypes: Linux programmable embedded systems. Node and Region Based Detection – Energy. Classification – Time Signatures, Frequency Spectrum, Wavelets. Localization – Multiple Node Energy Measurements, Tracking, Beamforming, Time-delay Measurements.
© 2003 by Yu Hen Hu 19 Data Sets/Classification Acoustic, Seismic and Passive Infrared (PIR) channels. Vehicles: AAV, DW and HMMWV. PIR detection serves as the indication of a detected event. Different event sizes for different channels. The data set is divided into 3 partitions: Q1, Q2, Q3. Test with one of them at a time, train with the others. Classifiers tested: Maximum Likelihood, k-Nearest Neighbor, Support Vector Machine, Learning Vector Quantization.
© 2003 by Yu Hen Hu 20 Features Vector features: –Acoustic data Down-sample to 1 kHz Compute 512-point FFT (0.5 second of time series), non-overlapping windows. Keep the first 100 PSD bins and average adjacent pairs to make a 50×1 vector. –Seismic data Down-sample to 512 Hz Compute 256-point FFT Keep 50 of the 128 PSD bins
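The acoustic branch of this pipeline can be sketched as follows, assuming the input has already been down-sampled to 1 kHz; the periodogram scaling is illustrative rather than the exact normalization used in the study.

```python
# Sketch of the acoustic feature extraction described on this slide,
# assuming the signal is already down-sampled to 1 kHz.
import numpy as np

def acoustic_features(x, nfft=512):
    """Non-overlapping 512-point windows -> first 100 PSD bins,
    adjacent pairs averaged into one 50x1 vector per window."""
    feats = []
    for start in range(0, len(x) - nfft + 1, nfft):
        psd = np.abs(np.fft.rfft(x[start:start + nfft]))**2 / nfft
        feats.append(psd[:100].reshape(50, 2).mean(axis=1))  # pair average
    return np.array(feats)  # shape: (num_windows, 50)

# Two seconds of noise at 1 kHz yields three full windows.
print(acoustic_features(np.random.randn(2000)).shape)  # (3, 50)
```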
© 2003 by Yu Hen Hu 21 Features (Cont.) Event based features –Each event will have a varying number of vectors. Voting is used to obtain event based results. –Average of the vector features of an event –Median of the vector features of an event Multi-modality feature –Concatenate acoustic event features and seismic event features.
© 2003 by Yu Hen Hu 22 Reporting the Result Classification results are shown as confusion matrices: Row heading: class label of the feature vector. Column heading: class label output from the classifier. Example: the (3,2) entry of partition Q1 and feature 2A = 5. It means that 5 feature vectors from the HMMWV class were mis-classified into the DW class by the classifier when using the mean of the acoustic feature vectors.
© 2003 by Yu Hen Hu 23 Classification Results (Vehicle) 1, 3 Vector – each feature vector ~ 0.1 s of time series, 50 power spectral density bins; each bin = 20 Hz. 1, 3 Event – voting over the vector classification results for each event. 2A, 4A – 1 feature vector per event: mean of the vectors. 2B, 4B – 1 feature vector per event: median of the vectors. 5A, 5B, 5C, 5D – concatenations of feature vectors. For each feature–classifier pair, a confusion matrix M is computed. Classification rate = sum(M(k,k))/sum(sum(M)) (see the sketch below). Observation: SVM outperforms the other classifiers.
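A small sketch of the reported metric, computing the classification rate from a confusion matrix; the counts below are hypothetical, with rows and columns ordered AAV, DW, HMMWV.

```python
# Classification rate = trace of the confusion matrix / total count.
# The counts below are hypothetical, rows/columns ordered AAV, DW, HMMWV.
import numpy as np

def classification_rate(M):
    """M[i, j]: number of samples of true class i labeled class j."""
    M = np.asarray(M)
    return np.trace(M) / M.sum()

M = np.array([[40, 3, 2],
              [4, 35, 6],
              [1, 5, 38]])
print(f"classification rate = {classification_rate(M):.3f}")  # ~0.843
```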
© 2003 by Yu Hen Hu 24 Classification Results (Locomotive) SVM outperforms the other classifiers in general, except for features 2A, 2B and 4B. Combined results (5C) show improvement for the SVM classifier.
© 2003 by Yu Hen Hu 25 Classification – A New Approach Evaluate each modality’s performance on the current data and annotate the result for each vector. Perform separate-modality classification. Calculate the probability that each modality gives a reliable result (kNN classifier). If the modalities disagree, select the one with the highest probability. Better results are expected on vehicle classification.
© 2003 by Yu Hen Hu 26 Observations Acoustic is better than seismic. Event based is better than vector based. It is easier to classify between tracked and wheeled vehicles (locomotive) than between individual vehicle types. Multi-modality fusion method used: –Direct concatenation of feature vectors obtained from different signal modalities –Yields better results than classifiers with individual signal modalities in locomotive classification mode. –Does not improve performance over classifiers using the acoustic signal only in vehicle type classification mode. These are just preliminary results. Further studies are underway and will be reported in the near future.
Optimal Decision Fusion With Applications to Target Detection in Wireless Ad Hoc Sensor Networks Marco Duarte and Yu-Hen Hu Department of Electrical and Computer Engineering MMSP, Siena, Italy, 9/29-10/2/2004
© 2003 by Yu Hen Hu 28 Outline Optimal decision fusion: theory and practice Applications of decision fusion to ad hoc sensor network signal processing
© 2003 by Yu Hen Hu 29 Decision Fusion A hierarchical, layered decision making process. A particular information fusion method. Major applications –Multi-modality, multi-media decision making. –E.g. video contextual description, visually assisted speech recognition, etc. Compared to the value fusion method –Decision fusion represents partial (local) estimates in discrete categories. –Transmitting discrete local estimates may require less communication cost. Although many decision fusion methods exist, most studies focus only on a specific fusion method.
© 2003 by Yu Hen Hu 30 In This Work We provide a unified framework for decision fusion. This framework enables us to … Better understand the nature of the decision fusion problem in terms of statistical pattern classification Deduce the theoretically optimal decision fusion method Explain the practical limitations of the theoretically optimal decision fusion (ODF) method Develop a complementary optimal decision fusion (CODF) method to handle practical decision fusion problems Apply the CODF method to wireless sensor network multi-modal signal processing applications.
© 2003 by Yu Hen Hu 31 Information Fusion: A Statistical Pattern Classification Framework Decision making = hypothesis testing = pattern classification! Goal: Given a feature vector x, classify it into one of the class labels, say, C_n. An optimal classifier is a Bayes classifier that –minimizes the probability of mis-classification by choosing the class label that –maximizes the a posteriori probability P(C_n|x); it is sometimes called a MAP classifier. A MAP classifier is often difficult to express analytically. Most practical classifiers can be seen as approximations of the MAP classifier. Sometimes the classification is performed indirectly …
© 2003 by Yu Hen Hu 32 Fusion Framework (cont’d) Preliminary pattern classifications may be performed on the whole or a portion of the feature vector x: –A committee of classifiers examining the same x; or –Portions of x examined by separate classifiers. Fusion is the process by which these committee members reach a consensus! In decision fusion, each committee classifier gives a tentative classification result. In value fusion, the committee members do not reach a decision; rather, each outputs an estimate of the posterior probability, represented by a real number.
© 2003 by Yu Hen Hu 33 Fusion Framework (cont’d) Both decision fusion and value fusion methods forward their partial decisions or local posterior probability estimates to a fusion center where a final decision is reached. The fusion center often does not have direct access to the original data x. –This is important when x is physically distributed over a wide area and the transmission cost is very high. If the fusion center also has access to x, then a mixture-of-experts architecture can be used.
© 2003 by Yu Hen Hu 34 Decision Fusion Architecture (diagram) K member classifiers each examine a portion of the joint data vector x, delivered over a high-data-rate channel. Each produces a local decision d_k ∈ {1, …, N}, which is sent over a low-data-rate channel to the fusion center. The fusion center collects the decision vector d = [d_1 d_2 … d_K]^T and outputs the final decision δ(d).
© 2003 by Yu Hen Hu 35 Observations Without access to the original feature vector x, the fusion center must make the final decision based solely on the partial decision vector d. For decision fusion, each d_k can be represented by an integer from 1 to N, each corresponding to a class label. Since there are only N^K distinct decision vectors, a decision fusion center must assign a class label to each of these decision vectors.
© 2003 by Yu Hen Hu 36 Optimal Decision Fusion (ODF) With K member classifiers and N class labels, there are at most N^K possible ensembles of d available for decision fusion. Given a particular d, it should be assigned to the class label C_i such that the a posteriori probability P(C_i|d) is maximized. This leads to a maximum a posteriori (MAP) classifier and hence an optimal classifier. We call this the Optimal Decision Fusion (ODF) method (see the sketch below).
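A hedged sketch of the ODF rule as a table look-up: for every decision vector d observed in training, store the label that maximizes the empirical posterior P(C|d). The training data below is hypothetical.

```python
# ODF as a table: MAP label per decision vector from empirical counts.
# The training data below is hypothetical.
from collections import Counter, defaultdict

def build_odf_table(decision_vectors, labels):
    counts = defaultdict(Counter)
    for d, c in zip(decision_vectors, labels):
        counts[tuple(d)][c] += 1                  # empirical counts per d
    return {d: ctr.most_common(1)[0][0]           # MAP label per d
            for d, ctr in counts.items()}

train_d = [(1, 1, 0), (1, 1, 0), (1, 0, 0), (1, 0, 0), (0, 0, 0), (1, 0, 0)]
train_y = [1, 1, 0, 0, 0, 1]
table = build_odf_table(train_d, train_y)
print(table[(1, 0, 0)])  # -> 0, the majority label among samples with this d
```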
© 2003 by Yu Hen Hu 37 Table-Look-Up Implementation EXOR example: a table mapping (d_1, d_2, d_3) to δ(d) can be constructed from training samples (columns: x_1, x_2, label, d_1, d_2, d_3). But … Not all N^K combinations show up in training. Some d may have no, or too few, training samples associated with them. Samples of different labels may have the same decision vector d.
© 2003 by Yu Hen Hu 38 A Decision Region Interpretation A classifier divides the feature space into disjoint decision regions. The decision vector divides the feature space into up to N^K disjoint sub-regions. The decision fusion center will assign each of the sub-regions to a class label. But the true decision boundary may not coincide with the boundaries of these sub-regions; hence errors may occur. (Figure: sub-regions of the feature space)
© 2003 by Yu Hen Hu 39 Existing Decision Fusion Approaches Weighted linear combination of decisions –Assumes local decisions are made independently –Formulated as an ML ratio testing problem –Majority voting, weighted voting, follow-the-leader Behavior-knowledge-space (BKS) –Does NOT assume independent local decisions –Empirical table-look-up method that requires a large number of training data.
© 2003 by Yu Hen Hu 40 Behavior Knowledge Space Uses training data samples to directly implement an ODF classifier. Some shortcomings: –Rejects a particular d if too few training samples yield that decision vector. –d is assigned to a class label according to a majority vote of all labels that correspond to the same d vector. –Performance is unsatisfactory when the training data size is small. Y.S. Huang and C. Y. Suen, “A Method of Combining Multiple Experts for the Recognition of Unconstrained Handwritten Numerals”, IEEE Trans. PAMI, V. 17 N. 1, Jan. 1995
© 2003 by Yu Hen Hu 41 Complementary ODF (CODF) Complement the ODF method with one of the alternative decision fusion methods when there is not a sufficient number of training samples for a particular value of d (see the sketch below). Reduce the number of table entries: –The LUT of ODF grows too large when K or N becomes large. –An ODF table entry is retained only if the ODF method is correct where the complementary method is incorrect; such entries override the complementary method.
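A sketch of the CODF fallback logic under the assumption of binary labels: trust an ODF table entry only where training support is sufficient, otherwise defer to a complementary rule (majority voting here). `min_support` and the support counts are illustrative parameters, not the selection rule of the paper.

```python
# CODF fallback sketch (binary labels): use the ODF table where training
# support suffices, else a complementary rule (majority voting here).
def codf_decide(d, odf_table, support, min_support=3):
    d = tuple(d)
    if d in odf_table and support.get(d, 0) >= min_support:
        return odf_table[d]                  # well-supported ODF entry
    return 1 if sum(d) > len(d) / 2 else 0   # complementary: majority vote

# Usage with a table built as in the ODF sketch:
# codf_decide((1, 0, 1), table, support={(1, 1, 0): 4})  # falls back to voting
```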
© 2003 by Yu Hen Hu 42 Candidate Complementary Methods Non-Weighted Threshold Voting The simplest method: establish a minimum number of votes for a class, deciding C_1 when Σ_i d_i(x) ≥ T, where d_i(x) = 1 for C_1 and 0 for C_2. For K decision makers, there are K−1 possible threshold levels; pick the one that minimizes the error (see the sketch below).
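A minimal sketch of this rule: score every candidate threshold on the training set and keep the one with the lowest error. The decision matrix and labels are illustrative.

```python
# Non-weighted threshold voting with 0/1 local decisions: try each
# candidate threshold and keep the one with the lowest training error.
import numpy as np

def best_threshold(D, labels):
    """D: (samples, K) matrix of 0/1 local decisions; labels: 0/1."""
    votes = D.sum(axis=1)
    candidates = [(t, np.mean((votes >= t) != labels))
                  for t in range(1, D.shape[1] + 1)]
    return min(candidates, key=lambda c: c[1])   # (threshold, error rate)

D = np.array([[1, 1, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1]])
labels = np.array([1, 0, 0, 1])
print(best_threshold(D, labels))  # threshold 2 achieves zero error here
```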
© 2003 by Yu Hen Hu 43 Candidate Complementary Methods Weighted Least Square Threshold Use d_i(x) = 1 for C_1 and −1 for C_2. Based on the linear least squares filter. Weight vector w = D^+ l, where l is the label vector and D is the set of training decision ensembles. Global decision δ(d) = dw. Uses the pseudoinverse X^+ = (X^T X)^{−1} X^T. Error vector e = l − Dw; minimize its magnitude.
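A sketch of this method with ±1 local decisions: the weight vector is the pseudoinverse solution w = D^+ l and the fused decision is the sign of dw; the training data is illustrative.

```python
# Weighted least-squares threshold: w = D^+ l, fused decision sign(d w).
# The training matrix and labels below are illustrative.
import numpy as np

D = np.array([[ 1,  1, -1],
              [ 1, -1, -1],
              [-1, -1, -1],
              [ 1,  1,  1]], dtype=float)   # rows: decision ensembles
l = np.array([1.0, -1.0, -1.0, 1.0])        # +/-1 class labels

w = np.linalg.pinv(D) @ l                   # w = (D^T D)^{-1} D^T l
print(np.sign(D @ w))                       # fused training decisions
```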
© 2003 by Yu Hen Hu 44 Candidate Complementary Methods Optimal Linear Threshold Use d_i(x) = 1 for C_1 and −1 for C_2. Based on the method of steepest descent (successive approximation). Random initial weights w(0); update equation w(a+1) = w(a) + g(a), where g(a) is the gradient-based correction. Global decision δ(d) = dw. Error vector e = l − Dw; the cost is the sum of squared elements of e, and the gradient vector of this error drives the update (see the sketch below).
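A hedged sketch of the steepest-descent variant, interpreting the update as w(a+1) = w(a) + μ D^T (l − Dw(a)), i.e., a gradient step on ||l − Dw||²; the step size and iteration count are assumptions.

```python
# Steepest-descent sketch: from random initial weights, repeatedly step
# along the negative gradient of ||l - D w||^2.
import numpy as np

rng = np.random.default_rng(1)
D = np.array([[ 1,  1, -1],
              [ 1, -1, -1],
              [-1, -1, -1],
              [ 1,  1,  1]], dtype=float)
l = np.array([1.0, -1.0, -1.0, 1.0])

w = rng.normal(size=3)              # random initial weights w(0)
mu = 0.05                           # learning rate (illustrative)
for _ in range(200):
    e = l - D @ w                   # error vector
    w += mu * (D.T @ e)             # gradient step on ||e||^2
print(np.sign(D @ w))               # converges to the LS fused decisions
```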
© 2003 by Yu Hen Hu 45 Candidate Complementary Methods Local Classifier Accuracy Weighting Each classifier is assigned a weight proportional to its classification rate r_i (w_i ∝ r_i). Results depend on the consistency of classifier accuracy across different sets of training/testing samples.
© 2003 by Yu Hen Hu 46 Candidate Complementary Methods Following The Leader A heuristic method. The fusion result is the decision of the classifier with the highest accuracy rate: δ(d) = d_j, where j = arg max_i r_i. Performance depends on the reliability of the individual classifiers.
© 2003 by Yu Hen Hu 47 Sensor Network Signal Processing Tasks Target Detection (CFAR + region fusion) Target Classification (ML + region fusion) Target Localization (ML, EBL) Target Tracking (Kalman Filter) D. Li, K.D. Wong, Y.H. Hu, A.M. Sayeed, “Detection, Classification and Tracking of Targets,” IEEE Signal Processing Magazine, Vol. 19, Issue 2, pp. 17–29, March 2002.
© 2003 by Yu Hen Hu 48 Experiments Each sensor calculates the energy of 0.75-sec segments of the acoustic signal. For each sensor, the energy reading is fed into a Constant False Alarm Rate (CFAR) detector and checked against a time-varying threshold. If the reading exceeds the threshold, a detection is signaled (1); otherwise, no detection occurs (0) (see the sketch below).
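A sketch of this per-node detection step; the exponential background estimate and the factor of 2 below stand in for the actual CFAR threshold rule, which the slide does not specify.

```python
# Per-node detection sketch: energy readings vs. a running threshold.
# The background tracker and factor are illustrative, not the exact CFAR rule.
import numpy as np

def detect(energies, alpha=0.95, factor=2.0):
    decisions, background = [], energies[0]
    for y in energies:
        decisions.append(1 if y > factor * background else 0)
        background = alpha * background + (1 - alpha) * y  # track noise floor
    return decisions

e = np.concatenate([np.ones(10), 5.0 * np.ones(3), np.ones(10)])
print(detect(e))  # flags the loud middle segment
```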
© 2003 by Yu Hen Hu 49 Experiments All sensors perform individual detection simultaneously. Sensors are grouped by regions; vehicle detection results for each region are desired. Decision fusion is used to produce a single region detection decision from the individual detection outputs. The lowest possible error rate in region detection is desired.
© 2003 by Yu Hen Hu 50 Experiments Experiment 1: Compare ODF against the available complementary methods on the training set. –Measure the error rate for each fusion method. Experiment 2: Compare ODF against CODF using all available complementary methods. –Measure the error rate and CODF table size for each complementary method used. Leave-one-out testing across vehicle runs is used.
© 2003 by Yu Hen Hu 51 Results: Experiment 1
© 2003 by Yu Hen Hu 52 Results: Experiment 2 Error Rate
© 2003 by Yu Hen Hu 53 Results: Experiment 2 ODF Table Size
© 2003 by Yu Hen Hu 54 Results: Comments On the training set, ODF has the lowest error rate, confirming its optimality. Performance varies among vehicle runs. CODF yields better performance than the complementary methods or ODF alone for most runs. Weighted Least Square voting performs best among all proposed complementary methods. The size of the ODF table is smaller when complementary methods are used; the change depends on the method.
© 2003 by Yu Hen Hu 55 Conclusions CODF is a low-complexity fusion method that does not depend on the success rates or the independence of the decision makers. The performance of CODF is strongly dependent on the amount of training data available. CODF can be applied to other applications and to other tasks in sensor networks.
© 2003 by Yu Hen Hu 56 Further Work Application of CODF to region classification in sensor networks. Study of the effect of training set characteristics on CODF performance. Study of efficient, effective complementary rules suitable for other applications. Website: This presentation can be found at –