NGDM 2009 Panel on Climate Change
Mining Climate and Ecosystem Data: Challenges and Opportunities
Vipin Kumar, University of Minnesota
Climate Change: The Defining Issue of Our Era
- Greenhouse gas emissions are the cause of global warming
  - Human-induced ecosystem changes (e.g., deforestation)
  - Increased use of fossil fuels
- Consequences of global warming include:
  - Increased occurrence of extreme events
  - Melting ice caps and rising sea levels
  - Heat waves, droughts, and floods
  - Shocks in supplies of water and food
Need of the Day
Ability to answer questions such as:
- What is the impact of climate change on the intensity, duration, and frequency of extreme events (e.g., droughts, floods, hurricanes, heat waves)?
- What is the impact of deforestation on the global carbon cycle?
- What is the relationship of crop yield and prices to deforestation dynamics and greenhouse gas emissions?
A Golden Opportunity for the KDD Community
- Data sets needed to answer the questions above are becoming available:
  - Remote sensing data from satellites and weather radars
  - Data from in-situ sensors and sensor networks
  - Output from climate and earth system models
- Data-guided processes can complement hypothesis-guided data analysis to develop predictive insights for use by climate scientists, policy makers, and the community at large.
Challenges in Mining Earth Science Data
Analysis and discovery approaches need to be cognizant of climate and ecosystem data characteristics such as:
- Spatio-temporal autocorrelation
- Low-frequency variability
- Long-range spatial dependence
- Long-memory temporal processes (teleconnections)
- Nonlinear processes
- Multi-scale nature
- Non-stationarity
Illustrative Application: Forest Cover Change
- Changes in forests account for over 20% of greenhouse gas emissions, second only to fossil fuel emissions.
- Terrestrial carbon can provide up to 25% of the climate change solution.
- The ability to monitor changes in global forest cover over space and time is critical for enabling the inclusion of forests in carbon trading.
⇒ The need for a scalable technological solution to assess the state of forest ecosystems and how they are changing has become increasingly urgent.
Figure caption: Deforestation moves large amounts of carbon into the atmosphere in the form of CO2.
News headlines shown: "Good to Go Green: SFO Unveils Carbon Offset Kiosks"; "'Carbon Offset' Business Takes Root" by Martin Kaste.
Illustrative Application: Finding Climate Indices
- A climate index is a time series of sea surface temperature or sea level pressure.
- Climate indices capture teleconnections: the simultaneous variation in climate and related processes over widely separated points on the Earth.
Figure: El Niño events and the Niño 1+2 index (sea surface temperature anomalies in the region bounded by 80°W-90°W and 0°-10°S).
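A box-averaged index such as Niño 1+2 can be sketched in a few lines. This is a minimal illustration, not the operational definition: the function names, the `(lat, lon, series)` cell layout, the signed-degree convention (west and south negative), and the unweighted average over grid cells are all assumptions made for the example, and the data are synthetic.

```python
from statistics import mean

def monthly_anomalies(series):
    """Subtract each calendar month's long-term mean from a monthly series.

    Assumes the series starts in January and covers at least 12 months.
    """
    climatology = [mean(series[m::12]) for m in range(12)]
    return [v - climatology[i % 12] for i, v in enumerate(series)]

def box_index(cells, lat_range, lon_range):
    """Average the anomaly series of all grid cells inside a lat/lon box.

    cells: list of (lat, lon, monthly_series) tuples (hypothetical layout).
    The average is unweighted; real products usually weight by cell area.
    """
    lat_lo, lat_hi = lat_range
    lon_lo, lon_hi = lon_range
    inside = [monthly_anomalies(series) for lat, lon, series in cells
              if lat_lo <= lat <= lat_hi and lon_lo <= lon <= lon_hi]
    if not inside:
        raise ValueError("no grid cells fall inside the box")
    return [mean(values) for values in zip(*inside)]

# Synthetic data: two years of monthly SST at three grid cells.
base = [20 + m for m in range(12)]       # made-up seasonal cycle
warm = base + [v + 2 for v in base]      # second year is 2 degrees warmer
flat = [25] * 24                         # no variability at all
cells = [(-5.0, -85.0, warm), (-2.5, -82.5, flat), (30.0, -85.0, warm)]

# Box bounded by 80W-90W and 0-10S, written as signed degrees.
nino12 = box_index(cells, lat_range=(-10.0, 0.0), lon_range=(-90.0, -80.0))
```

The third cell lies outside the box and does not contribute, so the resulting series reflects only the two cells between 0° and 10°S.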
Discovery of Climate Indices Using Clustering
- Clustering is an alternative approach for finding candidate indices.
  - Clusters represent ocean regions with relatively homogeneous behavior.
  - The centroids of these clusters are time series that summarize the behavior of these ocean areas and thus represent potential climate indices.
  - Clusters are found using the Shared Nearest Neighbor (SNN) method, which eliminates “noise” points and tends to find regions of “uniform density”.
  - Clusters are filtered to eliminate those with low impact on land points.
- Many SST clusters and SLP cluster pairs reproduce well-known climate indices.
- The approach provides a better physical interpretation than indices based on the SVD/EOF paradigm, and provides candidate indices with better predictive power than known indices for some land areas.
Figure labels: DMI, SOI, NAO, AO.
Reference: Steinbach, M., Tan, P., Kumar, V., Klooster, S., and Potter, C. Discovery of climate indices using clustering. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Washington, D.C., August 2003). KDD '03. ACM, New York, NY.
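The idea of shared-nearest-neighbor grouping followed by centroid extraction can be sketched as below. This is a deliberately simplified illustration and not the SNN algorithm of the cited paper: it links two series only when each appears in the other's k-nearest-neighbor list and they share enough neighbors, then takes connected components. All function names, the use of Pearson correlation as the similarity, and the toy data are assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def knn_lists(series, k):
    """For each series, the set of indices of its k most correlated peers."""
    n = len(series)
    lists = []
    for i in range(n):
        ranked = sorted((j for j in range(n) if j != i),
                        key=lambda j: pearson(series[i], series[j]),
                        reverse=True)
        lists.append(set(ranked[:k]))
    return lists

def snn_similarity(nn, i, j):
    """Number of shared neighbors, counted only if i and j list each other."""
    if i in nn[j] and j in nn[i]:
        return len(nn[i] & nn[j])
    return 0

def snn_clusters(series, k, min_shared):
    """Connected components of the graph with SNN similarity >= min_shared."""
    nn = knn_lists(series, k)
    n = len(series)
    assigned = [False] * n
    clusters = []
    for start in range(n):
        if assigned[start]:
            continue
        assigned[start] = True
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(n):
                if not assigned[j] and snn_similarity(nn, i, j) >= min_shared:
                    assigned[j] = True
                    stack.append(j)
        clusters.append(sorted(members))
    return clusters

def centroid(series, members):
    """Point-wise mean of the member series: a candidate index."""
    length = len(series[0])
    return [sum(series[m][t] for m in members) / len(members)
            for t in range(length)]

# Toy "ocean points": two groups that follow different temporal patterns.
p = [0, 1, 2, 3, 2, 1, 0, -1]
q = [1, -1, 1, -1, 1, -1, 1, -1]
series = [
    p, [0.5] + p[1:], p[:3] + [2.5] + p[4:],     # group following p
    q, [q[0], -0.5] + q[2:], q[:4] + [0.5] + q[5:],  # group following q
]
clusters = snn_clusters(series, k=2, min_shared=1)
candidate_indices = [centroid(series, c) for c in clusters]
```

In this sketch a point that links to no other point ends up as a singleton cluster, which is one crude way to treat it as "noise"; the filtering by impact on land points described in the slide is not modeled.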
Finding New Patterns: Indian Monsoon Dipole Mode Index
- Recently discovered Indian Ocean Dipole Mode Index (DMI)*
- DMI is defined as the difference in SST anomaly between the region 5S-5N, 55E-75E and the region 0-10S, 85E-95E.
- DMI is an indicator of a weak monsoon over the Indian subcontinent and heavy rainfall over East Africa.
- The difference of SLP clusters 16 and 22 is a surrogate for the DMI, which is defined using SST.
* N. H. Saji, B. N. Goswami, P. N. Vinayachandran, and T. Yamagata, “A dipole mode in the tropical Indian Ocean,” Nature 401 (23 September 1999).
Figure: Plot of cluster 16 - cluster 22 versus the Indian Ocean Dipole Mode Index. (Indices smoothed using a 12-month moving average.)
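The two operations behind the plot, a difference of two regional series and a 12-month moving average, can be sketched as follows. The function names are hypothetical, the inputs are assumed to be already-computed regional anomaly series of equal length, and the moving average is taken as trailing (the slide does not say whether it was trailing or centered).

```python
def difference_index(first_region, second_region):
    """Element-wise difference of two equal-length regional series."""
    if len(first_region) != len(second_region):
        raise ValueError("series must have the same length")
    return [a - b for a, b in zip(first_region, second_region)]

def moving_average(series, window=12):
    """Trailing moving average; returns len(series) - window + 1 values."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    running = sum(series[:window])
    smoothed = [running / window]
    for t in range(window, len(series)):
        # Slide the window: add the newest value, drop the oldest.
        running += series[t] - series[t - window]
        smoothed.append(running / window)
    return smoothed

# Made-up anomaly series for the two regions (14 months).
west = list(range(1, 15))
east = [1] * 14
dmi_like = difference_index(west, east)
dmi_smoothed = moving_average(dmi_like, window=12)
```

The same two functions apply unchanged to a pair of cluster centroids, which is how a surrogate such as "cluster 16 minus cluster 22" would be compared against an SST-based index.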
Dynamic Climate Indices
- Most well-known indices are based on data collected at fixed land stations.
  - NAO is computed as the normalized difference between SLP at a pair of land stations in the Arctic and the subtropical Atlantic regions of the North Atlantic Ocean.
- However, the underlying phenomenon (e.g., NAO) may not occur at the exact location of the land station.
- Challenge: given sensor readings for SLP at different points in the ocean, how can we identify clusters of low/high pressure points that may move in space and time?
Source: Portis et al., Seasonality of the NAO, AGU Chapman Conference, 2000.
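A fixed-station index of this kind can be sketched as the difference of two standardized pressure series. This is one common reading of "normalized difference" and is an assumption here, as are the function names, the south-minus-north sign convention, and the made-up pressure values.

```python
from statistics import mean, pstdev

def standardize(series):
    """Convert a series to z-scores using its own mean and std deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        raise ValueError("cannot standardize a constant series")
    return [(v - mu) / sigma for v in series]

def station_pair_index(southern_slp, northern_slp):
    """Standardized southern-station SLP minus standardized northern SLP."""
    if len(southern_slp) != len(northern_slp):
        raise ValueError("series must have the same length")
    south = standardize(southern_slp)
    north = standardize(northern_slp)
    return [s - n for s, n in zip(south, north)]

# Hypothetical sea level pressure readings (hPa) at two fixed stations.
subtropical = [1010, 1020, 1030]
arctic = [1000, 990, 1010]
index = station_pair_index(subtropical, arctic)
```

Because both stations are fixed, the index cannot follow a pressure center that drifts away from them, which is the limitation the challenge above is concerned with.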
Illustrative Application: Relationship Mining
Example of a non-random association pattern between FPAR-Hi and NPP-Hi events, and the land locations where this pattern is observed frequently.
- Left: Locations that support the association pattern {abnormally high FPAR => abnormally high NPP}.
- Right: Land locations that correspond to grassland and shrubland regions.
The remarkable similarity between the two figures suggests that grasslands are vegetation that can take advantage of periodically high precipitation (and possibly solar radiation) more quickly than forests.
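How a single location "supports" a pattern such as {FPAR-Hi => NPP-Hi} can be sketched with the usual support and confidence measures from association analysis. The function name, the boolean event encoding, and the toy event series are assumptions; how "abnormally high" events are detected is not specified in the slide and is not modeled here.

```python
def support_and_confidence(antecedent, consequent):
    """Support and confidence of antecedent => consequent at one location.

    antecedent, consequent: equal-length lists of booleans, one entry per
    time step, marking whether the event occurred in that step.
    """
    if len(antecedent) != len(consequent):
        raise ValueError("event series must have the same length")
    n = len(antecedent)
    both = sum(1 for a, c in zip(antecedent, consequent) if a and c)
    antecedent_count = sum(1 for a in antecedent if a)
    support = both / n if n else 0.0
    confidence = both / antecedent_count if antecedent_count else 0.0
    return support, confidence

# Made-up event series for one land location over eight time steps.
fpar_hi = [True, True, False, True, False, False, True, False]
npp_hi = [True, True, False, False, False, True, True, False]
support, confidence = support_and_confidence(fpar_hi, npp_hi)
```

Repeating this per grid cell and keeping the cells whose support exceeds a chosen threshold would give a map of supporting locations comparable to the left-hand figure.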