Pang-Ning Tan Associate Professor Dept of Computer Science & Engineering Michigan State University
Research Overview Spatio-Temporal Data Mining Link Mining Cluster Analysis Anomaly Detection Predictive Modeling Pattern Discovery Techniques and Algorithm Development New problems and challenges New ways for problem solving
Research Overview Spatio-Temporal Data Mining Link Mining Cluster Analysis Anomaly Detection Predictive Modeling Pattern Discovery Techniques and Algorithm Development spatio-temporal data, fusion of simulation & observation data, handling of noise, multi-resolution, extremes Kernel learning, multi-task learning, transfer learning, ensemble learning, semi-supervised learning Analysis and modeling of climate and earth science data
Time Series Prediction Predictor variables (X) Predictand variable (y) Training data Test data (Future) Transfer Function f: X → y
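The setup on this slide can be sketched in a few lines. This is a minimal illustration, not the method used in the talk: it assumes the transfer function f is linear and fitted by ordinary least squares, and it uses synthetic data. The names `fit_linear_transfer`, `predict`, and `true_w` are hypothetical.

```python
import numpy as np

def fit_linear_transfer(X_train, y_train):
    """Least-squares fit of y ~ [X, 1] @ w (a linear f is an assumption)."""
    X1 = np.column_stack([X_train, np.ones(len(X_train))])
    w, *_ = np.linalg.lstsq(X1, y_train, rcond=None)
    return w

def predict(w, X):
    """Apply the fitted transfer function to predictor variables X."""
    X1 = np.column_stack([X, np.ones(len(X))])
    return X1 @ w

# Synthetic series: 3 predictor variables, 1 predictand (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.3 + 0.1 * rng.normal(size=200)

# Chronological split: the first 150 steps are training data,
# the remaining 50 steps play the role of the future test period.
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

w = fit_linear_transfer(X_train, y_train)
y_pred = predict(w, X_test)
```

The split is chronological rather than random because the test data represent the future, so no information from the test period is used when fitting f.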
Supervised vs Semi-supervised Learning
- Supervised: Construct function using training data only
- Semi-supervised: Construct function using training + test data (predictor variables only)
[Figure: predictor variables and predictand variable, split into training data and test data (future)]
Hidden Markov Regression
- State transition matrix: A = [a_ij]
- Initial probability: π = {π_1, π_2, …, π_N}
- Each state is associated with a regression function:
- Observations are generated by:
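The generative process of a hidden Markov regression model can be sketched as follows. The slide's regression and observation equations are not available here, so this sketch assumes each state has a linear regression function with Gaussian noise; that choice, the function name `simulate_hmr`, and the parameters `W` and `sigma` are assumptions made for illustration. `A` and `pi` stand for the transition matrix and the initial probabilities.

```python
import numpy as np

def simulate_hmr(A, pi, W, sigma, X, rng):
    """Sample hidden states and observations from a hidden Markov regression.

    A     : (N, N) state transition matrix, rows sum to 1
    pi    : (N,)   initial state probabilities
    W     : (N, d) per-state regression weights (linear form is assumed)
    sigma : (N,)   per-state noise standard deviation (Gaussian is assumed)
    X     : (T, d) predictor variables
    """
    T = X.shape[0]
    N = len(pi)
    states = np.empty(T, dtype=int)
    y = np.empty(T)
    for t in range(T):
        # The first state is drawn from pi, later states from the row of A
        # that corresponds to the previous state.
        p = pi if t == 0 else A[states[t - 1]]
        states[t] = rng.choice(N, p=p)
        s = states[t]
        # The observation comes from the regression function of the current state.
        y[t] = X[t] @ W[s] + sigma[s] * rng.normal()
    return states, y

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.5, 0.5])
W = np.array([[1.0, 0.0], [-1.0, 2.0]])
sigma = np.array([0.1, 0.3])
X = rng.normal(size=(100, 2))
states, y = simulate_hmr(A, pi, W, sigma, X, rng)
```

Fitting such a model would go the other way, estimating `A`, `pi`, and the per-state regression parameters from observed (X, y) pairs; that step is not shown.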
Experimental Results
- UC Riverside Time Series Repository
- Root mean square error (RMSE)
  - More than 10% improvement on Greatlake, Steam, Leaf, and C12full data sets
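For reference, RMSE can be computed as below. The helper `relative_improvement` assumes that "improvement" means the relative reduction in RMSE against a baseline; the slide does not state this, so treat it as one plausible reading. The numbers in the usage lines are made up for illustration and are not results from the talk.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def relative_improvement(rmse_baseline, rmse_new):
    """Fractional RMSE reduction relative to a baseline (assumed definition)."""
    return (rmse_baseline - rmse_new) / rmse_baseline

# Illustrative values only.
err = rmse([0.0, 0.0], [3.0, 4.0])      # sqrt((9 + 16) / 2)
gain = relative_improvement(2.0, 1.8)   # a 10% reduction in RMSE
```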
Statistical Downscaling Predictor variables (e.g., SLP, Z-500, u, v) Predictand variable (e.g., precip) Training (NCEP Reanalysis) Test (GCM) Training (OBS)
Application to Statistical Downscaling
- Data: Canadian Climate Change Scenario Network
  - Predictors: 25 variables (SLP, wind direction, etc.)
  - Predictand: mean temperature at a location
  - 40 randomly selected sites in Canada
Application to Statistical Downscaling
- Semi-HMMR is not much better than supervised HMMR
- Training and future data come from different sources (GCM vs NCEP reanalysis)
Bias Correction with Covariance Alignment
- Covariance matrices for training and future data:
- Goal is to find a transformation: X_U → X_U’ such that A and B’ are “aligned” with each other
Covariance Alignment
- Objective function:
  - Solved using gradient descent algorithm
- After alignment, apply semiHMMR on X_L and X_U
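One way to picture covariance alignment by gradient descent is the sketch below. It assumes, purely for illustration, a linear transformation X_U W and a Frobenius-norm objective ||Wᵀ B W − A||², where A and B are the sample covariances of the training and future predictors. The objective used in the talk may differ. The function names and the backtracking step rule are also assumptions.

```python
import numpy as np

def alignment_loss(W, A, B):
    """Squared Frobenius distance between W^T B W and A (assumed objective)."""
    E = W.T @ B @ W - A
    return float(np.sum(E ** 2))

def align_covariance(X_L, X_U, n_iter=200, step=1.0):
    """Gradient descent on W so that cov(X_U @ W) approaches cov(X_L)."""
    A = np.cov(X_L, rowvar=False)   # covariance of training predictors
    B = np.cov(X_U, rowvar=False)   # covariance of future predictors
    W = np.eye(A.shape[0])
    loss = alignment_loss(W, A, B)
    for _ in range(n_iter):
        E = W.T @ B @ W - A
        grad = 4.0 * B @ W @ E      # gradient of the loss for symmetric A, B
        t = step
        # Backtracking: halve the step until the loss decreases.
        while t > 1e-12:
            W_new = W - t * grad
            new_loss = alignment_loss(W_new, A, B)
            if new_loss < loss:
                W, loss = W_new, new_loss
                break
            t *= 0.5
        else:
            break                   # no improving step found, so stop
    return W, loss

# Synthetic example: future predictors have a rescaled covariance.
rng = np.random.default_rng(0)
X_L = rng.normal(size=(500, 3))
X_U = rng.normal(size=(500, 3)) @ np.diag([2.0, 0.5, 1.5])

A = np.cov(X_L, rowvar=False)
B = np.cov(X_U, rowvar=False)
initial_loss = alignment_loss(np.eye(3), A, B)

W, final_loss = align_covariance(X_L, X_U)
X_U_aligned = X_U @ W
```

This transformation matches second moments only. Any difference in means between the two sources would need separate handling, and the aligned predictors would then be passed to the semi-supervised learner in place of the raw future data.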
Semi-Supervised HMMR after Alignment
[Figure: results without covariance alignment vs. with covariance alignment]
Collaboration
- NSF-CNH: Towards an Integrated Framework for Climate Change Impact Assessments for International Market Systems with Long-Term Investments
  - Drs. J. Winkler (PI), J. Andresen, S. Zhong, R. Black, S. Thornsbury, S. Loveridge, A. Iezzoni, J. Zhao
- Other challenges in downscaling:
  - Simultaneous downscaling of multiple predictands/locales: multi-task and transfer learning
  - Skewed or zero-inflated data (e.g., precipitation): simultaneous classification & regression (Abraham & Tan, 2010)
  - Modeling of extreme values
Questions? Thank You Website:
Value of (Unlabeled) Predictor Data