Modeling spatially-correlated sensor network data Apoorva Jindal, Konstantinos Psounis Department of Electrical Engineering-Systems University of Southern California SECON 2004
Outline: Introduction; Statistical analysis of experimental data; The model; Model verification and validation; Tools to generate large synthetic traces; Conclusion
Introduction Sensors in a sensor network will be densely deployed and will detect common phenomena, so a high degree of spatial correlation is expected in sensor network data.
Introduction However, since very few real systems have been deployed, hardly any experimental data is available to test the proposed algorithms. No effort has been made to propose a model that captures the spatial correlation in sensor network data.
Introduction We propose a mathematical model to capture the spatial correlation in sensor network data. We present a method to generate large synthetic traces from a small experimental trace while preserving the correlation pattern, and a method to generate synthetic traces exhibiting arbitrary correlation patterns.
Statistical analysis of experimental data A. Data set description (1) S-Pol Radar Data Set: The resampled S-Pol radar data, provided by NCAR, records the reflectivity intensity of the atmosphere in dBZ. (2) Precipitation Data Set: This data set consists of the daily rainfall precipitation for the Pacific Northwest from
Statistical analysis of experimental data B. Statistics used to measure correlation in data Given a two-dimensional stationary process X(x,y), the autocorrelation function is defined as $R(d_1,d_2) = E[X(x,y)\,X(x+d_1,y+d_2)]$. Another statistic often used to characterize spatial correlation in data is the variogram, defined as $\gamma(d_1,d_2) = \frac{1}{2} E[(X(x,y) - X(x+d_1,y+d_2))^2]$.
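The two statistics are closely related. As a short worked step, taking the usual definitions $R(d_1,d_2) = E[X(x,y)X(x+d_1,y+d_2)]$ and $\gamma(d_1,d_2) = \frac{1}{2}E[(X(x,y)-X(x+d_1,y+d_2))^2]$ (the symbol $R$ is a naming assumption made here), stationarity gives:

```latex
\begin{align*}
\gamma(d_1,d_2)
  &= \tfrac{1}{2}\,E\big[(X(x,y) - X(x+d_1,y+d_2))^2\big] \\
  &= \tfrac{1}{2}\big(E[X(x,y)^2] + E[X(x+d_1,y+d_2)^2]\big)
     - E\big[X(x,y)\,X(x+d_1,y+d_2)\big] \\
  &= R(0,0) - R(d_1,d_2)
\end{align*}
```

The last line uses $E[X(x,y)^2] = E[X(x+d_1,y+d_2)^2] = R(0,0)$, which holds because the process is stationary. A variogram that rises slowly with distance therefore indicates strong spatial correlation.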
Statistical analysis of experimental data B. Statistics used to measure correlation in data For an isotropic random process, the variogram depends only on the distance $d = \sqrt{d_1^2 + d_2^2}$ between two nodes. For a set of samples $x(x_i, y_j)$, $i = 1, 2, \ldots$, $\gamma(d)$ can be estimated as follows
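A minimal sketch of the standard empirical variogram estimator, which averages half the squared difference over all sample pairs separated by distance d. It assumes the samples lie on a regular grid and, as a simplification, uses only axis-aligned lags. The function name `empirical_variogram` and its arguments are illustrative and do not come from the talk.

```python
import numpy as np


def empirical_variogram(grid, max_lag):
    """Estimate gamma(d) for integer lags d = 1..max_lag on a regular 2-D grid.

    Simplifying assumption: only pairs separated along a row or a column
    are used. max_lag must be smaller than both grid dimensions.
    """
    grid = np.asarray(grid, dtype=float)
    gammas = []
    for d in range(1, max_lag + 1):
        # Differences between samples that are d cells apart along each axis.
        diff_rows = grid[d:, :] - grid[:-d, :]
        diff_cols = grid[:, d:] - grid[:, :-d]
        squared = np.concatenate([diff_rows.ravel() ** 2, diff_cols.ravel() ** 2])
        # Half the mean squared difference over all pairs at this lag.
        gammas.append(0.5 * squared.mean())
    return np.array(gammas)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    field = rng.normal(size=(50, 50))  # uncorrelated placeholder field
    print(empirical_variogram(field, 3))
```

For an uncorrelated field the estimate should be roughly flat across lags, while a spatially correlated field should show small values at short lags that grow with distance.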
Statistical analysis of experimental data C. Analysis of data using variograms
The model
The parameters of the model are h, the α_i's, β, f_Y(y) and f_Z(z). For mathematical convenience, we define three random variables:
We can find the probability density function f_X(x) as follows:
Under stationarity, the node values have the same distribution. Using the above and Equation (6), the characteristic function of f_X(x) can be written as:
Without loss of generality, we assume that Z is a normal random variable, Z ~ N(0, σ_Z).
Since X and Z are independent, the characteristic function of f_A can be written as: Hence, Equation (7) reduces to:
For mathematical convenience, we define a new random variable L having characteristic function given by
The model A. Parameters of the model and correlation
The model B. Inferring model parameters Infer f_X(x) from its empirical distribution. Inferring σ_Z, the α_i's and β is more involved. Using Equation (3) leads to the following:
The model B. Inferring model parameters Equating for 1 ≤ i ≤ h+1 gives h+1 equations. These equations, along with the equation $\sum_{i=1}^{h} \alpha_i + \beta = 1$, form a system of h+2 equations. After solving the above system, we can obtain σ_Z, the α_i's, β and f_Y(y). Determining h: start from an overestimated h and lower its value until all the α_i's are positive.
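The search for h described above can be sketched as a simple loop. This is an illustration under assumptions, not the authors' code: `solve_for_h` is a hypothetical callable standing in for the solver of the h+2 equations, and `toy_solver` returns made-up placeholder numbers purely to exercise the loop.

```python
from typing import Callable, Optional, Sequence, Tuple

# (alphas, beta, sigma_z) as returned by the hypothetical solver.
Params = Tuple[Sequence[float], float, float]


def determine_h(
    solve_for_h: Callable[[int], Optional[Params]], h_max: int
) -> Optional[Tuple[int, Params]]:
    """Start from an overestimated h and lower it until all alphas are positive."""
    for h in range(h_max, 0, -1):
        params = solve_for_h(h)
        if params is None:  # the solver found no solution for this h
            continue
        alphas, _beta, _sigma_z = params
        if all(a > 0 for a in alphas):
            return h, params
    return None  # no h in 1..h_max gave all-positive alphas


def toy_solver(h: int) -> Optional[Params]:
    """Placeholder solver with made-up values; NOT results from any data set."""
    table = {
        3: ([0.5, 0.3, -0.1], 0.3, 1.0),
        2: ([0.6, -0.05], 0.45, 1.0),
        1: ([0.7], 0.3, 1.0),
    }
    return table.get(h)


if __name__ == "__main__":
    print(determine_h(toy_solver, h_max=3))
```

Passing the solver in as a callable keeps the search independent of how the system of equations is actually solved.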
Model verification and validation A. Verification (1) S-Pol radar data set
Model verification and validation A. Verification (2) Precipitation data set: The parameters inferred for the trace are h = 1, α_1 = 0.72, β = 0.28 and σ_Z = 2.61.
Model verification and validation B. Model validation Two algorithms are used: DIMENSIONS [6], which proposes wavelet-based multi-resolution summarization and drill-down querying, and Spatial Correlation based Collaborative Medium Access Control (CMAC) [10]. The evaluation metric used is the query error, which is defined as
Tools to generate large synthetic traces The tools are freely available at generateLargeTraceFromSmall generateSyntheticTraces
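The tools themselves are not reproduced here. As a rough illustration of the kind of generation step such a tool might perform, the sketch below assumes the model takes the following form: each node either copies the value of one of its h previously generated neighbors plus noise Z (with probability α_i) or draws a fresh value Y (with probability β). That form, the 1-D node layout, the name `generate_trace`, and the `sample_y` callback are all assumptions made for this example, and σ_Z is treated as a standard deviation.

```python
import numpy as np


def generate_trace(n, alphas, beta, sigma_z, sample_y, rng=None):
    """Generate n values along a line of nodes under the assumed model form.

    alphas[i] is the probability of copying the node i+1 positions back
    (plus N(0, sigma_z) noise); beta is the probability of drawing a
    fresh value from sample_y(rng).
    """
    rng = np.random.default_rng() if rng is None else rng
    alphas = np.asarray(alphas, dtype=float)
    h = len(alphas)
    probs = np.append(alphas, beta)
    if not np.isclose(probs.sum(), 1.0):
        raise ValueError("alphas and beta must sum to 1")
    probs = probs / probs.sum()  # guard against tiny floating-point drift
    trace = np.empty(n)
    for k in range(n):
        choice = rng.choice(h + 1, p=probs)
        source = k - (choice + 1)
        if choice == h or source < 0:
            # Fresh draw from Y (also used when the chosen neighbor does not exist).
            trace[k] = sample_y(rng)
        else:
            # Copy a previously generated neighbor and add noise Z.
            trace[k] = trace[source] + rng.normal(0.0, sigma_z)
    return trace


if __name__ == "__main__":
    # h, alpha_1, beta and sigma_Z as reported for the precipitation trace;
    # the distribution of Y below is a placeholder, not an inferred f_Y.
    trace = generate_trace(
        200, [0.72], 0.28, 2.61,
        sample_y=lambda r: r.normal(10.0, 3.0),
        rng=np.random.default_rng(0),
    )
    print(trace[:5])
```

A larger α_1 makes neighboring values track each other more closely, which is the knob one would turn to produce traces with stronger or weaker correlation.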
Conclusion We have proposed a model that captures spatial correlation and can be used to generate synthetic traces. We also described a mathematical procedure to extract the parameters of the model from a real data set. We verified and validated the model. Finally, we have created two freely available tools to enable researchers to generate data.