Locations. Soil Temperature Dataset Observations Data is – Correlated in time and space – Evolving over time (seasons) – Gappy (Due to failures) – Faulty.

Slides:



Advertisements
Similar presentations
Anomaly Detection in Problematic GPS Time Series Data and Modeling Dafna Avraham, Yehuda Bock Institute of Geophysics and Planetary Physics, Scripps Institution.
Advertisements

OSE meeting GODAE, Toulouse 4-5 June 2009 Interest of assimilating future Sea Surface Salinity measurements.
Principal Component Analysis Based on L1-Norm Maximization Nojun Kwak IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008.
11/11/02 IDR Workshop Dealing With Location Uncertainty in Images Hasan F. Ates Princeton University 11/11/02.
Life Under Your Feet Johns Hopkins University Computer Science Earth and Planetary Sciences Physics and Astronomy
Budapest May 27, 2008 Unifying mixed linear models and the MASH algorithm for breakpoint detection and correction Anders Grimvall, Sackmone Sirisack, Agne.
Data preprocessing before classification In Kennedy et al.: “Solving data mining problems”
Streaming Pattern Discovery in Multiple Time-Series Spiros Papadimitriou Jimeng Sun Christos Faloutsos Carnegie Mellon University VLDB 2005, Trondheim,
Compressed sensing Carlos Becker, Guillaume Lemaître & Peter Rennert
Jayant Gupchup Graduate student, Johns Hopkins University Representative Slides.
Learning With Dynamic Group Sparsity Junzhou Huang Xiaolei Huang Dimitris Metaxas Rutgers University Lehigh University Rutgers University.
An introduction to Principal Component Analysis (PCA)
“Real-time” Transient Detection Algorithms Dr. Kang Hyeun Ji, Thomas Herring MIT.
Principal Component Analysis
Distributed Regression: an Efficient Framework for Modeling Sensor Network Data Carlos Guestrin Peter Bodik Romain Thibaux Mark Paskin Samuel Madden.
UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering On-line Alert Systems for Production Plants A Conflict Based Approach.
Multi-Scale Analysis for Network Traffic Prediction and Anomaly Detection Ling Huang Joint work with Anthony Joseph and Nina Taft January, 2005.
Laurent Itti: CS599 – Computational Architectures in Biological Vision, USC Lecture 7: Coding and Representation 1 Computational Architectures in.
Privacy Preservation for Data Streams Feifei Li, Boston University Joint work with: Jimeng Sun (CMU), Spiros Papadimitriou, George A. Mihaila and Ioana.
1 BABS 502 Moving Averages, Decomposition and Exponential Smoothing Revised March 11, 2011.
כמה מהתעשייה? מבנה הקורס השתנה Computer vision.
1 Formation et Analyse d’Images Session 7 Daniela Hall 7 November 2005.
Raw data Hz HH data submitted for synthesis Flux calculation, raw data filtering Additional filtering for footprint or instrument malfunctioning.
Empirical Modeling Dongsup Kim Department of Biosystems, KAIST Fall, 2004.
Daniel Acuña Outline What is it? Statistical significance, sample size, hypothesis support and publication Evidence for publication bias: Due.
Gwangju Institute of Science and Technology Intelligent Design and Graphics Laboratory Multi-scale tensor voting for feature extraction from unstructured.
Multivariate Approaches to Analyze fMRI Data Yuanxin Hu.
Feature extraction 1.Introduction 2.T-test 3.Signal Noise Ratio (SNR) 4.Linear Correlation Coefficient (LCC) 5.Principle component analysis (PCA) 6.Linear.
Cs: compressed sensing
Life Under Your Feet. Sensor Network Design Philosophies Use low cost components No access to line power –Deployed in remote locations Radio is the.
Iterated Denoising for Image Recovery Onur G. Guleryuz To see the animations and movies please use full-screen mode. Clicking on.
Towards a Model-Based Data Collection Framework for Environmental Monitoring Networks Research Proposal Jayant Gupchup Department of Computer Science,
Recent advances for the inversion of the particulate backscattering coefficient at different wavelengths H. Loisel, C. Jamet, and D. Dessailly.
Outline What Neural Networks are and why they are desirable Historical background Applications Strengths neural networks and advantages Status N.N and.
Center for Satellite Applications and Research (STAR) Review 09 – 11 March 2010 MAP (Maximum A Posteriori) x is reduced state vector [SST(x), TCWV(w)]
IORAS activities for DRAKKAR in 2006 General topic: Development of long-term flux data set for interdecadal simulations with DRAKKAR models Task: Using.
Time Series Data Analysis - I Yaji Sripada. Dept. of Computing Science, University of Aberdeen2 In this lecture you learn What are Time Series? How to.
Basics of Neural Networks Neural Network Topologies.
Gap filling of eddy fluxes with artificial neural networks
Autocorrelation in Time Series KNNL – Chapter 12.
Ensemble Learning (1) Boosting Adaboost Boosting is an additive model
Mingyang Zhu, Huaijiang Sun, Zhigang Deng Quaternion Space Sparse Decomposition for Motion Compression and Retrieval SCA 2012.
Gap-filling and Fault-detection for the life under your feet dataset.
Data-Intensive Statistical Challenges in Astrophysics Alex Szalay The Johns Hopkins University Collaborators: T. Budavari, C-W Yip (JHU), M. Mahoney (Stanford),
Streaming Problems in Astrophysics
Analyzing wireless sensor network data under suppression and failure in transmission Alan E. Gelfand Institute of Statistics and Decision Sciences Duke.
Segmentation of Vehicles in Traffic Video Tun-Yu Chiang Wilson Lau.
ECE 8443 – Pattern Recognition ECE 8423 – Adaptive Signal Processing Objectives: Normal Equations The Orthogonality Principle Solution of the Normal Equations.
5/24/ Modeling & Diagnostics of A Furnace System This work develops an efficient diagnostic methodology for a multi-zone batch furnace system using.
Classification and Prediction: Ensemble Methods Bamshad Mobasher DePaul University Bamshad Mobasher DePaul University.
Model Based Event Detection in Sensor Networks Jayant Gupchup, Andreas Terzis, Randal Burns, Alex Szalay.
1 BABS 502 Moving Averages, Decomposition and Exponential Smoothing Revised March 14, 2010.
Irfan Ullah Department of Information and Communication Engineering Myongji university, Yongin, South Korea Copyright © solarlits.com.
A new high resolution satellite derived precipitation data set for climate studies Renu Joseph, T. Smith, M. R. P. Sapiano, and R. R. Ferraro Cooperative.
Robust Localization Kalman Filter & LADAR Scans
Tree and Forest Classification and Regression Tree Bagging of trees Boosting trees Random Forest.
Martina Uray Heinz Mayer Joanneum Research Graz Institute of Digital Image Processing Horst Bischof Graz University of Technology Institute for Computer.
How Dirty is your Data : The Duality between detecting Events and Faults J. Gupchup A. Terzis R. Burns A. Szalay Department of Computer Science Johns Hopkins.
Analyzing circadian expression data by harmonic regression based on autoregressive spectral estimation Rendong Yang and Zhen Su Division of Bioinformatics,
Enabling Real Time Alerting through streaming pattern discovery Chengyang Zhang Computer Science Department University of North Texas 11/21/2016 CRI Group.
of Temperature in the San Francisco Bay Area
Photometric redshift estimation.
of Temperature in the San Francisco Bay Area
S519: Evaluation of Information Systems
A New Approach to Track Multiple Vehicles With the Combination of Robust Detection and Two Classifiers Weidong Min , Mengdan Fan, Xiaoguang Guo, and Qing.
Outline S. C. Zhu, X. Liu, and Y. Wu, “Exploring Texture Ensembles by Efficient Markov Chain Monte Carlo”, IEEE Transactions On Pattern Analysis And Machine.
PCA based Noise Filter for High Spectral Resolution IR Observations
Seasonal Forecasting Using the Climate Predictability Tool
Outline Sparse Reconstruction RIP Condition
Presentation transcript:

Locations

Soil Temperature Dataset

Observations Data is – Correlated in time and space – Evolving over time (seasons) – Gappy (Due to failures) – Faulty (Noise, Jumps in values)

Temporal Gap Filling Data is correlated in the temporal domain Data shows – Diurnal patterns – Trends Trends are modeled using slope and intercept Correlations captured using functional PCA

Gap Filling (Connolly & Szalay) + Original signal representation as a linear combination of orthogonal vectors Optimization function - W λ = 0 if data is absent - Find coefficients, a i ’s, that minimize function. + Connolly et al., A Robust Classification of Galaxy Spectra: Dealing with Noisy and Incomplete Data,

Soil Temperature Dataset (same as slide 2)

Gap-filling using Daily Basis Only 2% of the data could be filled using daily model Good Period

Observations Hardware failures causes entire “days” of data to be missing – Temporal method (only using temporal correlation) is less effective If we look in the spatial domain, – For a given time, at least > 50% of the sensors are active – Use the spatial basis instead. – Initialize spatial basis using the period marked “good period” in previous slide

Gap-filling using sensor correlations Faulty data Too many missing readings for this period (Snow storm!)

Accuracy of gap-filling High Errors for some locations are actually due to faults Not necessarily due to the gap-filling method

Examples of Faults

Observations & Incremental Robust PCA * Data are faulty – Pollutes the basis – Impacts the quality of gap-filling Basis is time-dependent – Correlations are changing over time Simultaneous fault-detection & gap-filling – Initialize basis using reliable data – Using basis at time t, Detect and remove faulty readings Weight new vectors depending on residuals Update thresholds for declaring faults – Iterate over data to improve estimates + Budavári, T., Wild, V., Szalay, A. S., Dobos, L., Yip, C.-W Monthly Notices of the Royal Astronomical Society, 394, 1496,

Gap-filling using sensor correlations (same as slide 8)

Iterative and Robust gap-filling -Fault removal leads to more gaps and less data to work with -However, data looks much cleaner.

Impact of number of missing values Even when > 50% of the values are missing Reconstruction is fairly good

Iteration - I

Iteration - II

Accuracy using spatiotemporal gap-filling Locations

Example of running algorithm

ETC

Daily Basis

Collected measurements