AUTOMATED TROPICAL CYCLONE EYE DETECTION USING DISCRIMINANT ANALYSIS

Presentation transcript:

AUTOMATED TROPICAL CYCLONE EYE DETECTION USING DISCRIMINANT ANALYSIS Robert DeMaria

Overview: Background, Motivation, Current Method, Objective, Proposed Method, Data, Algorithm, Results, Conclusions, Future Work

Schematic of a Hurricane Eye 1. Warm ocean water evaporates near the ring of very strong winds near the center. 2. That causes a ring of thunderstorms near the center called the eyewall. Air moves upward in the eyewall (orange arrows). 3. When the storm is organized, air starts to sink in the middle, causing clouds to evaporate there and the eye to form. From “Hurricane Science and Society” web page

Infrared Image of Hurricane Joaquin October 3, 2015 Note: Coldest areas are bright colors, showing eye wall. Clear area near the storm center is the eye.

Why Do We Care About Detecting Hurricane Eyes? Initial eye formation is a sign that a tropical cyclone is getting more organized. It is often a signal that the storm is about to get much stronger. Eye detection is an important step in estimating the current strength of a hurricane.

Current Methods Hurricane hunter aircraft: low availability. Dvorak Technique: used worldwide; produces TC intensity estimates from satellite imagery; determining whether a storm has an eye is an important step in the technique; performed every 6 hours. Image source: sos.noaa.gov/Education/tracking.html

Dvorak Technique for Intensity Estimation Uses both human judgment (with strict criteria) and automated tools. Forecaster examines visible/IR/microwave imagery. Forecasters subjectively classify the satellite image as one of 4 basic patterns: Curved band, Shear pattern, Central Dense Overcast (CDO), Eye. Properties of the pattern determine intensity, e.g., thickness of eyewall, size of curved band, size of CDO.

Dvorak cont. Eye pattern is selected if there is a region near the rotational center devoid of clouds. Manual processes are time consuming. Only a small fraction of satellite data is used.

Objective Replicate human-performed eye detection with automated procedure Utilize same input routinely available to forecasters Make eye detection available at more times to supplement human-performed eye detection.

Proposed Method Use geostationary IR data and best track data. Use simple statistical computer vision/machine learning techniques: Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA). Use archive of classified images (from Dvorak) as training/truth.

Infrared Data from Geostationary Satellites 2D Image of cloud-top temperature Available every 30 minutes around the globe

IR vs Microwave vs Visible Visible: Used in addition to IR in Dvorak. Only available during the day. Adds complexity to algorithm. Microwave: 3D view inside storm. Easier to determine if eye is present. Only available every 12 hr per satellite. Microwave channels vary between satellites. Can miss a storm entirely.

Storm “Best Track” Estimates of storm position and maximum wind speed Updated every 6 hours from satellite and aircraft data

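Seven Variables Used From Best Track Position: latitude, longitude. Motion Vector: eastward and northward storm motion. Max Wind Speed: Vmax and its previous change. Date: Julian day.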

Dvorak Classifications 1996-2012 data from CIRA archive. 4109 samples at 6 hr intervals. Atlantic only. 991 (24%) eye cases.

A Note About “Truth” Dvorak method has some flexibility for forecasters to interpret what they see. Using these classifications as truth will replicate any biases present in the Dvorak classifications.

Algorithm 1) Subsect/unroll satellite imagery 2) Form training/testing sets 3) Perform PCA for dimension reduction 4) Select predictors (using sensitivity vector) 5) Train QDA/LDA 6) Use testing set to evaluate performance

Three Versions of Algorithm Model 1: Best track input only. Model 2: IR satellite data only. Model 3: Combined best track and IR data; use sensitivity vector (defined later) to pick best set of predictors.

Step 1) Subsecting Imagery Cut out an 80x80 pixel box centered on the best track estimate. The box covers 320 km x 320 km, i.e., about 4 km per pixel.

Unroll Imagery Unroll 80x80 pixel box to form 6400 element vector.
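
The cut-and-unroll step can be sketched in a few lines of NumPy. This is only an illustration: the function name `crop_and_unroll`, the synthetic image, and the assumption that the best track position has already been converted to a pixel row and column are not from the slides.

```python
import numpy as np

def crop_and_unroll(image, center_row, center_col, box=80):
    """Cut a box x box window centered on a pixel and flatten it to a vector."""
    half = box // 2
    r0, c0 = center_row - half, center_col - half
    if r0 < 0 or c0 < 0 or r0 + box > image.shape[0] or c0 + box > image.shape[1]:
        raise ValueError("window extends past the image edge")
    window = image[r0:r0 + box, c0:c0 + box]
    return window.reshape(-1)  # row-major unroll: 80 x 80 -> 6400 elements

# Synthetic stand-in for an IR brightness-temperature grid (kelvin); hypothetical data.
rng = np.random.default_rng(0)
ir_image = rng.uniform(190.0, 300.0, size=(200, 200))
vector = crop_and_unroll(ir_image, center_row=100, center_col=120)
```

Stacking one such vector per sample gives the samples-by-6400 matrix that the later steps operate on.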

Step 2) Training/Testing Sets Data randomly shuffled. 70% used for training, 30% used for testing. Great care taken not to bias results using testing data.
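
A minimal sketch of a shuffled 70/30 split, assuming the samples are rows of a NumPy array with one label per row. The helper name `shuffle_split`, the fixed seed, and the toy data are illustrative assumptions.

```python
import numpy as np

def shuffle_split(X, y, train_fraction=0.7, seed=0):
    """Shuffle sample indices once, then cut them into training and testing parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_train = int(round(train_fraction * len(X)))
    train_idx, test_idx = order[:n_train], order[n_train:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

X = np.arange(40, dtype=float).reshape(20, 2)   # 20 toy samples, 2 features each
y = (np.arange(20) % 4 == 0).astype(int)        # toy eye / no-eye labels
X_train, y_train, X_test, y_test = shuffle_split(X, y)
```

Changing `seed` produces a different shuffling, which is one way to repeat the evaluation over many random splits.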

Normalization Subtract mean image. Divide by standard deviation image.
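
The normalization can be sketched as below. Computing the mean and standard deviation images from the training set only, and reusing them for the testing set, is an assumption made here to keep testing data out of the fit; the slides do not say which samples the statistics come from. The function names and the guard against zero standard deviation are also illustrative.

```python
import numpy as np

def fit_normalizer(train_vectors, eps=1e-12):
    """Per-pixel mean and standard deviation over the training samples."""
    mean_image = train_vectors.mean(axis=0)
    std_image = train_vectors.std(axis=0)
    # Guard against constant pixels so the division below is always defined.
    std_image = np.where(std_image < eps, 1.0, std_image)
    return mean_image, std_image

def normalize(vectors, mean_image, std_image):
    """Subtract the mean image and divide by the standard deviation image."""
    return (vectors - mean_image) / std_image

rng = np.random.default_rng(1)
train = rng.normal(250.0, 20.0, size=(50, 6400))   # hypothetical unrolled images
test = rng.normal(250.0, 20.0, size=(10, 6400))
mean_image, std_image = fit_normalizer(train)
train_z = normalize(train, mean_image, std_image)
test_z = normalize(test, mean_image, std_image)
```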

Problem: Have 4109 total samples but 6400 dimensions per image. Solution: Use PCA to reduce data to most important patterns.

Step 3) Principal Component Analysis PCA generates an eigenvector for each dimension (6400 total) Eigenvalues represent variance explained by each eigenvector

Dimension Reduction (figure annotation: ~95%) Can select a subset of eigenvectors (based on variance explained). Can project data onto the subset to reduce the dimension of the data, e.g., a 6400-element vector becomes a 25-element vector.
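
One way to compute the eigenvectors and project onto the leading ones is through a singular value decomposition of the centered data. The sketch below is an assumption about implementation, not the presenter's code; the names `pca_fit` and `pca_project` and the small random data set are hypothetical.

```python
import numpy as np

def pca_fit(Z):
    """Return the mean, the eigenvectors (one per row), and the variance fraction of each."""
    mean = Z.mean(axis=0)
    # Economy SVD of the centered data: rows of Vt are covariance eigenvectors,
    # and squared singular values are proportional to the eigenvalues.
    _, s, Vt = np.linalg.svd(Z - mean, full_matrices=False)
    eigenvalues = s ** 2 / (len(Z) - 1)
    explained = eigenvalues / eigenvalues.sum()
    return mean, Vt, explained

def pca_project(Z, mean, eofs, n_keep):
    """Project samples onto the first n_keep eigenvectors."""
    return (Z - mean) @ eofs[:n_keep].T

rng = np.random.default_rng(2)
Z = rng.normal(size=(60, 400))          # toy stand-in: 60 samples of 20 x 20 pixels
mean, eofs, explained = pca_fit(Z)
pcs = pca_project(Z, mean, eofs, n_keep=25)
cumulative = np.cumsum(explained)       # running total of variance explained
```

With fewer samples than pixels, at most (number of samples - 1) eigenvalues are nonzero, so the SVD route avoids forming the full pixel-by-pixel covariance matrix. The `cumulative` array is what one would inspect to choose how many eigenvectors to keep.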

First 25 Eigenvectors 7 & 8 curved band

Dimension Reduction Can construct approximation of original data with a few EOFs
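
The approximation works by projecting a sample onto a few EOFs and mapping the amplitudes back to pixel space. The toy data below are built from exactly three patterns, an assumption chosen so the effect is easy to see; the helper name `reconstruct` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
# Toy data made of 3 underlying patterns plus a constant offset (hypothetical).
patterns = rng.normal(size=(3, 100))
amplitudes = rng.normal(size=(40, 3))
data = amplitudes @ patterns + 5.0

mean = data.mean(axis=0)
_, _, Vt = np.linalg.svd(data - mean, full_matrices=False)  # rows of Vt are EOFs

def reconstruct(x, mean, eofs, n_keep):
    """Approximate x from its amplitudes on the first n_keep EOFs."""
    pcs = (x - mean) @ eofs[:n_keep].T
    return mean + pcs @ eofs[:n_keep]

err_1 = np.linalg.norm(data[0] - reconstruct(data[0], mean, Vt, 1))
err_3 = np.linalg.norm(data[0] - reconstruct(data[0], mean, Vt, 3))
```

Because the toy data contain only three patterns, three EOFs recover each sample almost exactly; real imagery needs more, and the error shrinks as EOFs are added.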

Step 4) Select Predictors Model 1: 7 best track predictors. Model 2: 25 IR EOF predictors with highest variance explained. Model 3: Sensitivity vector used to select best predictors (10 IR EOF predictors, 4 best track predictors).

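Step 5) QDA/LDA Discriminant Functions QDA (standard form): $\delta_k(x) = -\tfrac{1}{2}\ln|\Sigma_k| - \tfrac{1}{2}(x-\mu_k)^T \Sigma_k^{-1} (x-\mu_k) + \ln \pi_k$, where $\Sigma_k$ is the covariance matrix for class k, $\mu_k$ is the mean for class k, and $\pi_k$ is the prior probability of class k. LDA (standard form): $\delta_k(x) = x^T \Sigma^{-1} \mu_k - \tfrac{1}{2}\mu_k^T \Sigma^{-1} \mu_k + \ln \pi_k$, where $\Sigma$ is the weighted average of all $\Sigma_k$. Pick the class k with the largest $\delta_k$ value. Multivariate normal distributions (bivariate in the two-predictor case) provide the probability of being in class k.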
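
A minimal sketch of the two discriminant functions in their standard textbook form, assuming Gaussian classes, priors taken from class frequencies, and a pooled covariance weighted by those priors. The weighting, the function names, and the toy data are assumptions, not details confirmed by the slides.

```python
import numpy as np

def fit_classes(X, y):
    """Per-class mean, covariance, and prior, plus the prior-weighted pooled covariance."""
    classes = np.unique(y)
    means = {k: X[y == k].mean(axis=0) for k in classes}
    covs = {k: np.cov(X[y == k], rowvar=False) for k in classes}
    priors = {k: np.mean(y == k) for k in classes}
    pooled = sum(priors[k] * covs[k] for k in classes)
    return classes, means, covs, priors, pooled

def qda_delta(x, mean, cov, prior):
    """Quadratic discriminant: uses the class's own covariance matrix."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * logdet - 0.5 * diff @ np.linalg.solve(cov, diff) + np.log(prior)

def lda_delta(x, mean, pooled, prior):
    """Linear discriminant: all classes share the pooled covariance matrix."""
    w = np.linalg.solve(pooled, mean)
    return x @ w - 0.5 * mean @ w + np.log(prior)

def predict(x, model, kind="qda"):
    """Pick the class with the largest discriminant value."""
    classes, means, covs, priors, pooled = model
    if kind == "qda":
        deltas = [qda_delta(x, means[k], covs[k], priors[k]) for k in classes]
    else:
        deltas = [lda_delta(x, means[k], pooled, priors[k]) for k in classes]
    return classes[int(np.argmax(deltas))]

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),    # class 0: no eye (toy)
               rng.normal(4.0, 1.0, size=(40, 2))])    # class 1: eye (toy)
y = np.array([0] * 100 + [1] * 40)
model = fit_classes(X, y)
```

Calling `predict(np.array([4.0, 4.0]), model, "lda")` classifies a point near the second cluster; class probabilities follow by exponentiating the discriminant values and normalizing them to sum to one.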

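Sensitivity Vector Use LDA to give insight into the importance of predictors. Calculate the change in the discriminant function difference for a 1 standard deviation change in each predictor value z: $\Delta_z(\delta_2-\delta_1) = \frac{\partial(\delta_2-\delta_1)}{\partial z}\,\sigma_z$, where $\sigma_z$ is the standard deviation of z. The LDA $\delta_k$ are linear in x, so the $\Delta_z$ are constants.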
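
For two-class LDA the difference of discriminant functions is linear in the predictors, so its gradient is the pooled inverse covariance applied to the difference of class means. The sketch below scales that gradient by each predictor's standard deviation. Labels 0 (no eye) and 1 (eye) stand in for the slide's classes 1 and 2; the count-weighted pooled covariance, the function name, and the toy data are assumptions.

```python
import numpy as np

def lda_sensitivity(X, y):
    """Change in (delta_eye - delta_no_eye) per one standard deviation of each predictor."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    n0, n1 = len(X0), len(X1)
    # Pooled covariance as a sample-count-weighted average of class covariances.
    pooled = (n0 * np.cov(X0, rowvar=False) + n1 * np.cov(X1, rowvar=False)) / (n0 + n1)
    gradient = np.linalg.solve(pooled, mu1 - mu0)   # d(delta_1 - delta_0) / dx
    return gradient * X.std(axis=0)

rng = np.random.default_rng(5)
n = 500
X0 = rng.normal(size=(n, 3))
# Toy eye class: predictor 0 shifted up strongly, predictor 1 down weakly, predictor 2 unchanged.
X1 = rng.normal(size=(n, 3)) + np.array([3.0, -1.0, 0.0])
X = np.vstack([X0, X1])
y = np.array([0] * n + [1] * n)
sens = lda_sensitivity(X, y)
```

The sign of each entry shows whether larger values of that predictor push toward the eye class, and the magnitude ranks the predictors.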

Step 6) Evaluation of Performance Confusion matrix related metrics: Fraction Correct, True Positive Fraction, True Negative Fraction, False Positive Rate, False Negative Rate.
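
The metrics can be computed from the four confusion-matrix counts. The slide does not define each metric, so the sketch below assumes the common rate definitions (for example, true positive fraction = hits divided by all actual eye cases); the function name and the example labels are hypothetical.

```python
import numpy as np

def confusion_metrics(truth, predicted):
    """Confusion-matrix counts and derived rates for a binary eye / no-eye problem."""
    truth = np.asarray(truth, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    tp = int(np.sum(truth & predicted))      # eye, classified as eye
    tn = int(np.sum(~truth & ~predicted))    # no eye, classified as no eye
    fp = int(np.sum(~truth & predicted))     # no eye, classified as eye
    fn = int(np.sum(truth & ~predicted))     # eye, classified as no eye
    return {
        "fraction_correct": (tp + tn) / (tp + tn + fp + fn),
        "true_positive_fraction": tp / (tp + fn),
        "true_negative_fraction": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }

truth =     [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = confusion_metrics(truth, predicted)
```

In this example there are 3 hits, 1 miss, 1 false alarm, and 5 correct negatives, so the fraction correct is 0.8.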

Additional Metrics Peirce Skill Score (PSS): Evaluates skill of classification compared to random guesses based on the training data distribution. Brier Skill Score (BSS): Evaluates skill of probabilities compared to constant probabilities obtained from the training data distribution. For both BSS and PSS: 1 is perfect, 0 is no better than the no-skill scheme, and < 0 is worse than no skill.
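
A sketch of both scores using their usual definitions: PSS as hit rate minus false alarm rate, and BSS as one minus the ratio of the Brier score to that of a constant reference probability. Using the 24% eye fraction as the reference base rate in the example is an assumption, as are the function names and toy labels.

```python
import numpy as np

def peirce_skill_score(truth, predicted):
    """Hit rate minus false alarm rate (also called the Hanssen-Kuipers score)."""
    truth = np.asarray(truth, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    hit_rate = np.sum(truth & predicted) / np.sum(truth)
    false_alarm_rate = np.sum(~truth & predicted) / np.sum(~truth)
    return float(hit_rate - false_alarm_rate)

def brier_skill_score(truth, prob, base_rate):
    """1 - BS / BS_ref, where the reference always forecasts the constant base_rate."""
    truth = np.asarray(truth, dtype=float)
    prob = np.asarray(prob, dtype=float)
    brier = np.mean((prob - truth) ** 2)
    brier_ref = np.mean((base_rate - truth) ** 2)
    return float(1.0 - brier / brier_ref)

truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
pss = peirce_skill_score(truth, predicted)
pss_always_eye = peirce_skill_score(truth, [1] * 10)
bss_perfect = brier_skill_score(truth, truth, base_rate=0.24)
bss_constant = brier_skill_score(truth, [0.24] * 10, base_rate=0.24)
```

A classifier that always predicts an eye scores PSS = 0, and forecasting the base rate itself scores BSS = 0, matching the no-skill interpretation of both scores.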

Model 1 Used only information derived from best track: lat, lon, vmax, Julian day, storm velocity, previous change in max winds. Compared QDA and LDA metrics. Used sensitivity vector to understand results.

Best Track Predictor Distributions look for differences between classes

Model 1 Performance Metrics Notes: About 85% correct; BSS better for LDA; PSS slightly better for QDA.

Sensitivity Vector for Model 1 Note: Sign provides physical insight. For example, stronger storms or higher latitude storms more likely to have an eye.

Two-Predictor Model Pick the top two predictors from the sensitivity vector: Vmax and lat. Verification is only a little worse than with 7 predictors. Plot class boundaries to compare QDA/LDA. Most of the signal is in Vmax. QDA helps to keep middle latitude cases in but exclude very low and high latitude cases.

Model 2 – Satellite data only Input amplitudes (PCs) for top 25 EOFs Performance not as good as Model 1 Sensitivity vector shows which patterns help divide classes

Sensitivity Vector for Model 2

Top 4 EOFs for Model 2 Note that top 4 are all very symmetric. Negative sign of sensitivity vector for 9, 4, 15 accounts for the fact that the cold part is in the middle.

Model 3 – Combined IR and Best Track Want to keep number of predictors small. Pick predictors where the sensitivity vector is about 20% or more of that of the most important predictor. 4 Best Track predictors: Vmax, lat, eastward and northward motion. 10 IR predictors.
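
The 20% rule can be sketched as a simple threshold on the magnitude of the sensitivity vector. The predictor names and sensitivity values below are made up for illustration, and `select_predictors` is a hypothetical helper.

```python
import numpy as np

def select_predictors(names, sensitivity, threshold=0.2):
    """Keep predictors whose |sensitivity| is at least threshold times the largest one."""
    magnitude = np.abs(np.asarray(sensitivity, dtype=float))
    keep = magnitude >= threshold * magnitude.max()
    order = np.argsort(-magnitude)            # most important first
    return [names[i] for i in order if keep[i]]

names = ["vmax", "lat", "lon", "eof1", "eof2"]
sensitivity = [2.0, 0.5, 0.1, -1.2, 0.3]      # hypothetical values, not from the slides
selected = select_predictors(names, sensitivity)
```

Here the cutoff is 0.4 (20% of 2.0), so `lon` and `eof2` are dropped and the rest are returned in order of importance.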

Model 3 Verification Results About 90% correct classification (close to 95% accuracy of “truth”). BSS better for LDA (accuracy of probabilities). PSS better for QDA (accuracy of classifications). QDA has fewer samples for eye covariance matrix. Note use of all data for case studies. Average performance metrics from 1000 shufflings of training/testing sets and with all cases.

Sensitivity Vector for Model 3 Max wind is top, but IR data are the next three

Class Boundaries for Two-Predictor Model Vmax and IR EOF1 Right hand side isn’t really doing anything Left hand side curve does a better job of classifying weak cases.

Case Study Analysis Pick weak, moderate and strong cases Arlene 2005, Danielle 2010, Katrina 2005 Use full sample to make sure all cases for each storm included Plot classifications, probabilities and truth for full life cycle Examine cases where method failed

Algorithm Analysis for Hurricane Danielle

Reasons for Misclassifications Center position was not accurate: possible fix by repositioning. Cold clouds were displaced from center, usually due to wind shear: wind shear vector is known, so use it as additional input or to rotate coordinates of IR data.

Conclusions LDA and QDA can be used to objectively identify tropical cyclone eyes using commonly available data (IR satellite imagery and basic storm parameters). About 90% success rate. LDA is better for estimating probabilities; QDA is better for classification. The most important input is Vmax. IR data are more important than all other best track inputs.

Future Work Develop method to re-center images. Use wind shear data. Predict eye development using probabilities. Use output to help with rapid intensification forecasting. Generalize method to estimate all four Dvorak scene types.

Extra Slides

Step 5) QDA/LDA Need an estimate of the probability of being in class k, given input vector x: P(C=k | x). Bayes rule relates P(C=k | x) to P(x | C=k). Fit multivariate normal distributions (bivariate in the two-predictor case) to get P(x | C=k) for each class. Take the natural log of P(C=k | x) to get the discriminant function for each class. For LDA, assume class covariance matrices are equal, which makes the discriminant function linear in x.
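
The steps listed on this slide can be written out as a short derivation. This is the standard textbook form, with $\pi_k$ denoting the prior probability of class k and d the number of predictors; both symbols are notation introduced here, not taken from the slides.

```latex
% Bayes rule
P(C=k \mid x) = \frac{p(x \mid C=k)\,\pi_k}{\sum_j p(x \mid C=j)\,\pi_j}

% Gaussian class density
p(x \mid C=k) = (2\pi)^{-d/2}\,|\Sigma_k|^{-1/2}
  \exp\!\left[-\tfrac{1}{2}(x-\mu_k)^T \Sigma_k^{-1} (x-\mu_k)\right]

% Natural log, dropping terms that are the same for every class (QDA)
\delta_k(x) = -\tfrac{1}{2}\ln|\Sigma_k|
  - \tfrac{1}{2}(x-\mu_k)^T \Sigma_k^{-1} (x-\mu_k) + \ln \pi_k

% Equal covariances: \ln|\Sigma| and x^T \Sigma^{-1} x are common to all classes (LDA)
\delta_k(x) = x^T \Sigma^{-1} \mu_k - \tfrac{1}{2}\mu_k^T \Sigma^{-1} \mu_k + \ln \pi_k

% Class probabilities recovered from the discriminant values
P(C=k \mid x) = \frac{e^{\delta_k(x)}}{\sum_j e^{\delta_j(x)}}
```

The quadratic term in x cancels only when every class shares the same covariance matrix, which is why the LDA boundary between two classes is a straight line (or hyperplane) while the QDA boundary is curved.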

Danielle Misclassified Images False Positives: False Negatives:

Algorithm Analysis for Hurricane Katrina

Katrina Misclassified Images False Negatives: