The 2008 Artificial Intelligence Competition

Presentation transcript:

The 2008 Artificial Intelligence Competition
Valliappa Lakshmanan, National Severe Storms Laboratory & University of Oklahoma
Elizabeth E. Ebert, Bureau of Meteorology Research Center, Australia
Sue Ellen Haupt, Penn State University, State College, PA
Sponsored by Weather Decision Technologies
12/2/2018 lakshman@ou.edu

Why a competition?
- The AI committee organizes:
  - A conference with papers
  - A tutorial session before the conference (every 2 years)
- The tutorial sessions are very popular, but:
  - They get repetitive
  - The same set of techniques is presented too often
  - Often by the same speakers!
  - It is not clear what the differences are: different datasets, etc.
  - Can I not just use a machine intelligence or neural network toolbox?
- The purpose of the competition is to replace the tutorial but still provide a learning experience
  - Same dataset, different techniques
- The competitive aspect is just a sideshow; don’t put too much stock in it!

The 2008 Artificial Intelligence Competition
- Dataset
- Results

Project 1: Skill Score By Storm Type
- Does the skill score of a forecast office, as evaluated by the NWS, depend on the type of storms that the office faced that year?
  - Is it the type of weather or is it the forecaster skill?
  - Very critical, but hard to answer based on current knowledge
- Try to answer this question (posed by Travis Smith)
  - Initially, concentrate on tornadoes
  - Based on radar imagery, classify the type of storms at every time step
  - Take NWS warnings and ground truth information for a lot of cases
  - Compute skill scores by type of storm
- Summer REU project
  - Eric Guillot, Lyndon State
  - Mentors: Travis Smith, Don Burgess, Greg Stumpf, V Lakshmanan
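The last step of Project 1, computing skill scores by type of storm, could look roughly like the sketch below. The record layout (storm type, whether a warning was issued, whether an event was observed) and the choice of POD, FAR and CSI are assumptions made for illustration; the slides do not say which scores or data format the study used.

```python
from collections import defaultdict

def skill_by_storm_type(records):
    """records: iterable of (storm_type, warned, observed) tuples (hypothetical layout)."""
    counts = defaultdict(lambda: {"hits": 0, "misses": 0, "false_alarms": 0})
    for storm_type, warned, observed in records:
        c = counts[storm_type]
        if warned and observed:
            c["hits"] += 1
        elif observed:
            c["misses"] += 1
        elif warned:
            c["false_alarms"] += 1

    scores = {}
    for storm_type, c in counts.items():
        h, m, f = c["hits"], c["misses"], c["false_alarms"]
        scores[storm_type] = {
            # Probability of detection: fraction of observed events that were warned.
            "POD": h / (h + m) if h + m else None,
            # False alarm ratio: fraction of warnings with no observed event.
            "FAR": f / (h + f) if h + f else None,
            # Critical success index: hits over all warned or observed events.
            "CSI": h / (h + m + f) if h + m + f else None,
        }
    return scores

# Made-up records, for illustration only.
example = [
    ("supercell", True, True),
    ("supercell", True, False),
    ("line", False, True),
    ("line", True, True),
]
print(skill_by_storm_type(example))
```

Grouping before scoring is what lets the weather type be separated from forecaster skill: two offices can then be compared on the same storm type.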

Project 2: National Storm Events Database
- Build a national storm events database
  - With high-resolution radar data combined from multiple radars
  - Derived products
  - Support spatiotemporal queries
- Collaboration between NSSL, NCDC and OU (CAPS, CSA)

Approach
- Project 1: How to classify lots and lots of radar imagery?
  - Need an automated way to identify storm type
  - Technique:
    - Cluster radar fields
    - Extract storm characteristics for each cluster
    - Associate storm characteristics with human-identified storm type
    - Train a learning technique (NN/decision tree) to do this automatically
    - Let it loose on the entire dataset
- Project 2: How to support spatiotemporal queries on radar data?
  - Can create polygons based on thresholding data
  - But need to tie together different data sources
  - Need an automated way to extract storm characteristics for querying
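As a rough illustration of "thresholding data" for Project 2, the sketch below labels connected regions of a gridded field that meet a threshold. It stops at labeled regions and does not trace polygon outlines; the grid, the threshold value and the function name are hypothetical and are not taken from WDSS-II.

```python
from collections import deque

def label_regions(grid, threshold):
    """Label 4-connected regions of cells with value >= threshold.

    Returns (labels, count); labels[r][c] is 0 for cells below the threshold.
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] < threshold or labels[r][c]:
                continue
            # Start a new region and flood-fill it breadth-first.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                i, j = queue.popleft()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not labels[ni][nj]
                            and grid[ni][nj] >= threshold):
                        labels[ni][nj] = count
                        queue.append((ni, nj))
    return labels, count

# Made-up reflectivity grid (dBZ), for illustration only.
field = [
    [10, 40, 0],
    [0, 45, 0],
    [50, 0, 35],
]
labels, count = label_regions(field, threshold=30)
print(count, labels)
```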

WDSS-II CONUS Grids
- In real-time, combine data from 130+ WSR-88Ds
  - Reflectivity and azimuthal shear fields
- Use these to derive products:
  - Reflectivity Composite
  - VIL
  - Echo top heights
  - Hail probability (POSH), hail size estimates (MESH), etc.
  - Low-level, mid-level shear
  - Many others (90+)
- Have the 3D reflectivity and shear products archived
  - Can use these to recreate derived products

Cluster Identification Using K-means
- Hierarchical clustering using texture segmentation and K-means clustering
  - Lakshmanan, V., R. Rabin, and V. DeBrunner, 2003: Multiscale storm identification and forecast. J. Atm. Res., 67, 367-380
- Technique yields 3 different scales of clustering
- Chose scale D to train the decision tree
  - Cluster attributes at 420 km^2 (scale D) used for our study
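The K-means step can be sketched on its own. The example below is plain one-dimensional K-means on reflectivity values with a deterministic initialization. It is a deliberate simplification: it does not implement the texture segmentation or the hierarchical, multiscale part of the cited technique, and the input values are made up.

```python
def kmeans_1d(values, k, iterations=20):
    """Plain K-means on scalar values; returns (centers, assignment)."""
    # Deterministic start: spread the initial centers evenly over the value range.
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    assignment = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest center.
        assignment = [
            min(range(k), key=lambda j: abs(v - centers[j])) for v in values
        ]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [v for v, a in zip(values, assignment) if a == j]
            if members:
                centers[j] = sum(members) / len(members)
    return centers, assignment

# Made-up reflectivity samples (dBZ), for illustration only.
reflectivity = [5, 7, 6, 30, 32, 55, 58, 57]
centers, assignment = kmeans_1d(reflectivity, k=3)
print(centers, assignment)
```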

Manual Storm Classification
- Manually classified over 1,000 storms over three days’ worth of data (March 28th, May 5th, and May 28th of 2007)
- Used all the fields ultimately available to the automated algorithm
  - VIL, POSH, MESH, Rotation Tracks, etc.
  - Available in real-time at http://wdssii.nssl.noaa.gov/ over the entire CONUS

Hail Case (Apr. 19, 2003; Kansas)
Reflectivity Composite from KDDC, KICT, KVNX and KTWX

Echo Top
Height of echo above 18 dBZ
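For a single vertical column, the echo top definition above can be sketched as the highest level whose reflectivity reaches 18 dBZ. This is a minimal illustration with made-up values; it assumes the column is given as discrete levels and does no interpolation between levels, which an operational algorithm may well do.

```python
def echo_top(heights_km, reflectivity_dbz, threshold_dbz=18.0):
    """Highest level in one vertical column with reflectivity >= threshold.

    Returns None when no level reaches the threshold.
    """
    top = None
    for height, dbz in zip(heights_km, reflectivity_dbz):
        if dbz >= threshold_dbz and (top is None or height > top):
            top = height
    return top

# Made-up column: reflectivity drops below 18 dBZ above 7 km.
print(echo_top([1, 3, 5, 7, 9], [45, 50, 30, 20, 10]))
```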

MESH
Maximum expected size of hail

VIL
Vertically Integrated Liquid
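VIL is conventionally computed by integrating a reflectivity-derived liquid water content over height. The sketch below uses the commonly cited Greene and Clark (1972) relation, VIL = Σ 3.44e-6 · ((Z_i + Z_{i+1})/2)^(4/7) · Δh, with Z in mm^6/m^3 and Δh in meters. Whether WDSS-II computes VIL exactly this way is an assumption, and the column values are made up.

```python
def vil(heights_m, reflectivity_dbz):
    """Vertically integrated liquid (kg/m^2) for one vertical column.

    heights_m must be in increasing order; reflectivity is given in dBZ.
    """
    # Convert dBZ to linear reflectivity factor Z (mm^6/m^3).
    z_linear = [10.0 ** (dbz / 10.0) for dbz in reflectivity_dbz]
    total = 0.0
    for i in range(len(heights_m) - 1):
        mean_z = 0.5 * (z_linear[i] + z_linear[i + 1])
        layer_depth = heights_m[i + 1] - heights_m[i]
        total += 3.44e-6 * mean_z ** (4.0 / 7.0) * layer_depth
    return total

# Made-up column: uniform 50 dBZ from the surface to 10 km.
print(round(vil([0, 5000, 10000], [50, 50, 50]), 2))
```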

Cluster Table
- Each identified cluster has these properties:
  - ConvectiveArea in km^2
  - MaxEchoTop and LifetimeEchoTop
  - MESH and LifetimeMESH
  - MaxVIL, IncreaseInVIL and LifetimeMaxVIL
  - Centroid, LatRadius, LonRadius, Orientation of ellipse fitted to cluster
  - MotionEast, MotionSouth in m/s
  - Size in km^2
- One set of clusters per scale
  - We used only the 420 km^2 cluster

Controlling the Cluster Table
- Can choose any gridded field for output
- From the gridded field, can compute the following statistics within a cluster:
  - Minimum value, maximum value
  - Average, standard deviation
  - Area within interval (useful to create histograms)
  - Increase in value temporally
    - Does not depend on cluster association being correct
    - Computed image-to-image
  - Lifetime maximum/minimum
    - Depends on cluster association being correct, so better on larger clusters
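The per-cluster statistics listed above can be sketched as follows. The function names, the uniform cell area and the half-open interval convention are assumptions made for illustration, not the WDSS-II interface.

```python
import math

def cluster_statistics(values, cell_area_km2, interval):
    """Statistics of one gridded field over the cells of a single cluster.

    values: field values at the cluster's grid cells
    cell_area_km2: area of one grid cell (assumed uniform)
    interval: (low, high); cells with low <= value < high are counted
    """
    low, high = interval
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    cells_in_interval = sum(1 for v in values if low <= v < high)
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "std": std,
        "area_in_interval_km2": cells_in_interval * cell_area_km2,
    }

def update_lifetime_max(previous_lifetime_max, current_max):
    """Lifetime maximum; only meaningful if the cluster was tracked correctly."""
    if previous_lifetime_max is None:
        return current_max
    return max(previous_lifetime_max, current_max)

# Made-up field values inside one cluster, for illustration only.
stats = cluster_statistics([10, 20, 30, 40], cell_area_km2=1.0, interval=(20, 40))
print(stats)
print(update_lifetime_max(35, stats["max"]))
```

Calling `cluster_statistics` once per interval yields the histogram mentioned on the slide.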

Input Parameters
- AspectRatio (dimensionless): An ellipse is fitted to the storm. This is the ratio of the length of the major axis to the length of the minor axis of the fitted ellipse.
- ConvectiveArea (km^2): Area of the storm that is convective
- LatRadius (km): Extent of the storm in the north-south direction
- LatitudeOfCentroid (degrees): Location of the storm's centroid
- LifetimeMESH (mm): Maximum expected hail size of the storm over its entire past history
- LifetimePOSH (dimensionless): Peak probability of severe hail of the storm over its entire past history
(Continued on next slide)

Input Parameters (contd.)
- LonRadius (km): Extent of the storm in the east-west direction
- LongitudeOfCentroid (degrees): Location of the storm's centroid
- LowLvlShear (s^-1): Shear closest to the ground as measured by radar
- MESH (mm): Maximum expected hail size from storm
- MaxRef (dBZ): Maximum reflectivity observed in storm
- MaxVIL (kg/m^2): Maximum vertically integrated liquid in storm
- MeanRef (dBZ): Mean reflectivity within storm
- MotionEast (MetersPerSecond): Speed of storm in easterly direction
- MotionSouth (MetersPerSecond): Speed of storm in southerly direction
(Continued on next slide)

Input Parameters (contd.)
- OrientationToDueNorth (degrees): Orientation of major axis of ellipse to due north. A value of 90 indicates a storm that is oriented east-west. The more circular a storm is (see aspect ratio), the less reliable this measure is.
- POSH (dimensionless): Peak probability of severe hail in storm
- Rot120 (s^-1): Maximum azimuthal shear observed in storm over the past 120 minutes
- Rot30 (s^-1): Maximum azimuthal shear observed in storm over the past 30 minutes
- RowName: Storm id
- Size (km^2): Storm size
- Speed (MetersPerSecond): Speed of storm
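AspectRatio and OrientationToDueNorth both come from an ellipse fitted to the storm. One common way to fit such an ellipse is through the second moments of the storm's cell positions, sketched below. The slides do not say which fitting method was used, so treat this as an assumed approach; the function name and input layout are hypothetical.

```python
import math

def fit_ellipse(points):
    """Moment-based ellipse fit for one storm.

    points: list of (east_km, north_km) positions of the storm's grid cells.
    Returns (aspect_ratio, orientation_deg), with orientation measured from
    due north in [0, 180); 90 means the major axis runs east-west.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n

    # Eigenvalues of the 2x2 covariance matrix: variances along the axes.
    half_sum = 0.5 * (sxx + syy)
    half_diff = math.hypot(0.5 * (sxx - syy), sxy)
    major_var, minor_var = half_sum + half_diff, half_sum - half_diff
    aspect_ratio = math.sqrt(major_var / minor_var) if minor_var > 0 else math.inf

    # Angle of the major axis from east, converted to an angle from north.
    angle_from_east = 0.5 * math.degrees(math.atan2(2.0 * sxy, sxx - syy))
    orientation = (90.0 - angle_from_east) % 180.0
    return aspect_ratio, orientation

# Made-up storm: 10 cells wide (east-west), 2 cells tall (north-south).
cells = [(x, y) for x in range(10) for y in range(2)]
print(fit_ellipse(cells))
```

When the two variances are nearly equal, the `atan2` arguments are both close to zero and the angle becomes unstable, which matches the slide's caveat that orientation is less reliable for circular storms.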

Types of Storms
- Four categories:
  - Not organized
  - Isolated supercell
  - Convective lines
    - Includes lines with embedded supercells
  - Pulse storms

Decision Tree Training
- Trained a decision tree using manually classified storms in order to develop a logical process for automatically classifying them
- Tested this decision tree on three additional cases (April 21st of 2007, and May 10th and 14th of 2006)
  - TSS=0.58; good enough for the NWS study to continue
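For readers unfamiliar with the TSS, the sketch below computes a multi-category True Skill Statistic (also known as the Peirce or Hanssen-Kuipers skill score) from a confusion matrix. That the study used this particular multi-category formulation is an assumption, and the matrix values are made up.

```python
def true_skill_statistic(confusion):
    """Multi-category TSS from a square confusion matrix.

    confusion[i][j]: number of cases predicted as class i and observed as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    proportion_correct = sum(confusion[i][i] for i in range(k)) / total
    predicted = [sum(confusion[i]) / total for i in range(k)]
    observed = [sum(confusion[i][j] for i in range(k)) / total for j in range(k)]
    # Agreement expected by chance given the marginal distributions.
    chance = sum(p * o for p, o in zip(predicted, observed))
    # Normalize by the skill of an unbiased perfect forecast.
    return (proportion_correct - chance) / (1.0 - sum(o * o for o in observed))

# Made-up 2x2 example: for two classes this reduces to POD minus POFD.
print(true_skill_statistic([[3, 1], [2, 4]]))
```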

Decision Tree
- Why a decision tree?
  - Didn’t know whether the dataset was tractable
  - Wanted to be able to analyze the resulting “machine”
  - Make sure extracted rules were reasonable
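To illustrate why a decision tree is easy to analyze, the sketch below learns a single split (a one-level tree, or decision stump) by minimizing Gini impurity and reports it as a readable rule. It is not the tree used in the study; the feature values and labels are made up, and a full tree would apply the same search recursively to each branch.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels, feature_names):
    """Find the single (feature, threshold) split with the lowest weighted Gini.

    Returns (score, feature_name, threshold), or None if no split is possible.
    """
    best = None
    for f, name in enumerate(feature_names):
        for threshold in sorted({row[f] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[f] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, name, threshold)
    return best

# Made-up training data: (AspectRatio, Size in km^2) per storm.
rows = [(1.2, 300), (1.1, 250), (4.0, 900), (5.5, 1200)]
labels = ["pulse", "pulse", "line", "line"]
score, feature, threshold = best_split(rows, labels, ["AspectRatio", "Size"])
print(f"if {feature} <= {threshold}: pulse, else: line (impurity {score})")
```

The printed rule can be checked against meteorological intuition, which is the "make sure extracted rules were reasonable" step.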

The 2008 Artificial Intelligence Competition
- Dataset
- Results

Entries Received
- 6 official entries and one unofficial entry received by the competition deadline
- Unofficial entry: not accompanied by an abstract or AMS manuscript
  - Neil Gordon (Met Service, New Zealand): random forest
  - Not eligible for prize, but included in comparisons
- Official entries:
  - John K. Williams and Jenny Abernethy: random forests and fuzzy logic
  - Ron Holmes: neural network
  - David Gagne and Amy McGovern: boosted decision tree
  - Jenny Abernethy and John Williams: support vector machines
  - Luna Rodriguez: genetic algorithms
  - Kimberly Elmore: discriminant analysis and support vector machines

Distribution of storm categories
(Figure: distribution of storm categories for Truth, Baseline, Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Gordon, Holmes, Rodriguez, Williams & Abernethy)

Classifications for observed class 0 (Not severe)
(Figure: classifications by Holmes, Baseline, Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Gordon, Rodriguez, Williams & Abernethy; categories: Not severe, Isolated supercell, Convective line, Pulse storm)

Classifications for observed class 1 (Isolated supercell)
(Figure: classifications by Holmes, Baseline, Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Gordon, Rodriguez, Williams & Abernethy; categories: Not severe, Isolated supercell, Convective line, Pulse storm)

Classifications for observed class 2 (Convective line)
(Figure: classifications by Holmes, Baseline, Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Gordon, Rodriguez, Williams & Abernethy; categories: Not severe, Isolated supercell, Convective line, Pulse storm)

Classifications for observed class 4 (Pulse storm)
(Figure: classifications by Holmes, Baseline, Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Gordon, Rodriguez, Williams & Abernethy; categories: Not severe, Isolated supercell, Convective line, Pulse storm)
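The per-class breakdowns in these slides amount to one column of a confusion matrix per entry: for storms observed in a given class, what share did the entry assign to each category? A minimal sketch, with made-up labels and a hypothetical function name:

```python
from collections import Counter

def predictions_for_observed_class(observed, predicted, target_class):
    """Percentage of each predicted class among cases observed as target_class."""
    counts = Counter(p for o, p in zip(observed, predicted) if o == target_class)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: 100.0 * n / total for cls, n in counts.items()}

# Made-up labels, for illustration only.
observed = [0, 0, 0, 1, 1]
predicted = [0, 0, 2, 1, 0]
print(predictions_for_observed_class(observed, predicted, target_class=0))
```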

Similarity matrix: % of identical classifications among entries
Entries: Truth, Baseline, Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Gordon, Holmes, Rodriguez, Williams & Abernethy
Values as listed: 100 74 72 67 77 76 62 53 69 84 52 70 83 80 61 75 54 55 73 93 57 58 91 32
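A similarity matrix of this kind can be computed by comparing every pair of entries case by case. The sketch below assumes each entry is a list of class labels in the same case order; the entry names and labels are made up.

```python
def percent_identical(a, b):
    """Percentage of cases on which two classification lists agree."""
    if len(a) != len(b):
        raise ValueError("both entries must classify the same cases")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

def similarity_matrix(entries):
    """entries: dict mapping entry name -> list of class labels.

    Returns a dict keyed by (name_a, name_b) with the percent agreement.
    """
    names = list(entries)
    return {
        (a, b): percent_identical(entries[a], entries[b])
        for a in names
        for b in names
    }

# Made-up classifications for four storms.
entries = {
    "Truth": [0, 1, 2, 3],
    "A": [0, 1, 2, 0],
    "B": [0, 0, 2, 0],
}
matrix = similarity_matrix(entries)
print(matrix[("Truth", "A")], matrix[("A", "B")])
```

Agreement with "Truth" is simply accuracy; agreement between two entries shows whether different techniques make the same mistakes.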

Statistical results: True Skill Statistic
(Figure: True Skill Statistic for Baseline, Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Gordon, Holmes, Rodriguez, Williams & Abernethy; annotations: Joint First, Third)

Statistical results: Accuracy and Heidke Skill Score
(Figure: Accuracy and Heidke Skill Score for Baseline, Abernethy & Williams, Elmore & Richman, Gagne & McGovern, Gordon, Holmes, Rodriguez, Williams & Abernethy)
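Accuracy and the Heidke Skill Score can both be read off a confusion matrix. The sketch below uses the standard multi-category Heidke formulation, which measures accuracy relative to the agreement expected by chance; the matrix is made up and the function name is hypothetical.

```python
def accuracy_and_heidke(confusion):
    """Return (accuracy, Heidke skill score) for a square confusion matrix.

    confusion[i][j]: number of cases predicted as class i and observed as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[i][i] for i in range(k)) / total
    predicted = [sum(confusion[i]) / total for i in range(k)]
    observed = [sum(confusion[i][j] for i in range(k)) / total for j in range(k)]
    # Proportion correct expected from random forecasts with the same marginals.
    chance = sum(p * o for p, o in zip(predicted, observed))
    heidke = (accuracy - chance) / (1.0 - chance)
    return accuracy, heidke

# Made-up 2x2 example with an unbalanced observed distribution.
print(accuracy_and_heidke([[6, 1], [2, 1]]))
```

With unbalanced classes, accuracy alone can look good for an entry that mostly predicts the common class; the Heidke score discounts that chance agreement.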

Acknowledgements
- Thanks to:
  - Weather Decision Technologies for sponsoring the prizes
  - The AMS probability and statistics committee
    - For loaning us Beth Ebert’s expertise
  - All the participants for entering the competition and explaining their methodology
    - Can be hard to find time to do “extra-curricular” work
    - Very grateful that you could enter this competition

Where to go from here?
- Please share with us your thoughts and suggestions
  - Is such a competition worth doing?
  - Was this session a learning experience?
  - How can it be improved in the future?
  - Is there something that you would have done differently? Why?
- Our thoughts:
  - Classification is not the only aspect of machine intelligence
    - Estimation, association finding, knowledge capture, clustering, …
    - Perhaps a future competition could address one of these areas
  - Address another aspect of AMS besides short-term severe weather