Published by Benjamin Bell. Modified over 9 years ago.
1
Databases and Global Environmental Change: Information Technology for Sustainable Development Gilberto Câmara INPE, Instituto Nacional de Pesquisas Espaciais Brazilian Academy of Sciences, Annual Meeting, May 2012
2
The fundamental question of our time: How is the Earth's environment changing, and what are the consequences for human civilization? (source: IGBP)
3
Global Change Where are changes taking place? How much change is happening? Who is being impacted by the change?
4
Limits for Models (source: John Barrow, after David Ruelle). Axes: complexity of the phenomenon; uncertainty on basic equations. Fields plotted: Solar System Dynamics, Meteorology, Chemical Reactions, Hydrological Models, Particle Physics, Quantum Gravity, Living Systems, Global Change, Social and Economic Systems.
5
Limits for Models (source: John Barrow, after David Ruelle). Axes: complexity of the phenomenon; uncertainty on basic equations. Fields plotted: Solar System Dynamics, Meteorology, Chemical Reactions, Hydrological Models, Particle Physics, Quantum Gravity, Living Systems, Global Change, Social and Economic Systems. Added label: e-science.
6
Collaborative e-science: Territory (Geography), Money (Economy), Culture (Anthropology), Modelling (IT). Connect expertise from different fields. Make the different conceptions explicit.
7
Deforestation in Amazonia. Amazonia: 4,000,000 km² (the size of Europe). Map legend: up to 10%, 10-20%, 20-30%, 30-40%, 40-50%, 50-60%, 60-70%, 70-80%, 80-90%, 90-100%.
8
Data (we need a lot of it). Deforestation in Brazilian Amazonia (1988-2011) dropped from 27,000 km² to 6,200 km².
9
Real-time Deforestation Monitoring: daily warnings of newly deforested large areas.
10
How much does it take to survey Amazonia? 30 TB of data; 500,000 lines of code; 150 man-years of software development; 200 man-years of interpreters. (Figure labels: 166-112, 116-113, 116-112.)
11
TerraAmazon: open source software for large-scale land change monitoring. Spatial database (PostgreSQL with vectors and images). 2004-2008: 5 million polygons, 500 GB of images.
12
Welcome to the Age of Data-intensive Science! Diagram labels: Vantage Points (Terrestrial; Airborne; Near-Space: LEO/MEO, commercial satellites and manned spacecraft; Far-Space: L1/HEO/GEO, TDRSS and commercial satellites); Capabilities; Deployable; Permanent; Forecasts & Predictions; Aircraft/Balloon Event Tracking and Campaigns; User Community.
13
Weather and climate (source: WMO): 11,000 land stations (3,000 automated), 900 radiosondes, 3,000 aircraft, 6,000 ships, 1,300 buoys, 5 polar and 6 geostationary satellites.
14
ARGOS Data Collection System (16,000 platforms): 650,000 messages processed daily.
15
Argo buoy network
16
Data chain in Earth System Science (source: NASA)
17
Data-intensive Science = principles and applications of information technology for handling very large data sets
18
Conjectures: IT concepts are essential to global change researchers (but most of them don't know it). Global change challenges will motivate new research in IT (but most of us are not looking there).
19
Challenges for data-intensive science: Which data is out there? How to organize big data? How to get the data I need? How to model big data? How to access and use big data?
20
Stage 1 – A scientist's personal database. Local database; user interface; database creation; analysis; database access.
21
Stage 1 – A scientist's personal database. Local database; user interface; database creation; analysis; database access. The good: data is close to you (or so you think). The bad: no long-term data preservation; no data sharing.
22
Stage 2 – A scientific lab database. Corporate database; user interface; database creation; analysis; database access.
23
Stage 2 – A scientific lab database. Corporate database; user interface; database creation; analysis; database access. The good: long-term data preservation; data sharing inside the lab; reusable corporate software. The bad: substantial costs on data admin; little outside data sharing.
24
Metview (ECMWF Metview, MOPTC June 2004)
25
Metview: field plotting
26
Stage 3 – A scientific lab database in the cloud. Corporate database; user interface; database creation; analysis; database access.
27
Stage 3 – A scientific lab database in the cloud. Corporate database; user interface; database creation; analysis; database access. The good: long-term data preservation; shared costs on data admin. The bad: software must be rewritten for cloud processing; outside data sharing still not solved.
28
Risk Analysis Analysis
29
On-line data feed. DCP: rain total; fixed time and irregular (alert); point data; one file per DCP. Satellite/Radar: 4 km grid; total rain 1 h; total rain 24 h; current (mm/h); binary file. Models: ETA 40, 20, 5 km; ensemble 40 km; total rain 72 h; 72 files; ASCII grid file.
30
TerraMA2 - Natural Disasters Monitoring and Alert System
31
Stage 4 – Multidatabase access. Multiple data sources; modelling; data discovery; data access; analysis; remote analysis.
32
Stage 4 – Multidatabase access. Multiple data sources; modelling; data discovery; data access; analysis; remote analysis. The good: long-term data preservation; shared costs on data admin; access to large external databases. The bad: software must be rewritten for cloud processing; finding data is a major problem.
33
Data Access Hitting a Wall. Current science practice is based on data download. How do you download a petabyte?
34
Data Access Hitting a Wall. Current science practice is based on data download. How do you download a petabyte? You don't! Move the software to the archive.
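A back-of-the-envelope sketch of why download-based practice hits a wall. The link speeds below are illustrative assumptions, not figures from the talk, and the helper name `transfer_days` is hypothetical; the calculation also assumes the link is fully and continuously used, which real transfers rarely achieve.

```python
PETABYTE = 10**15  # bytes (decimal petabyte)


def transfer_days(size_bytes: float, link_bits_per_s: float) -> float:
    """Days needed to move size_bytes over a fully utilised link."""
    seconds = size_bytes * 8 / link_bits_per_s
    return seconds / 86_400  # seconds per day


# Assumed link speeds, chosen only for illustration.
for label, rate in [("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)]:
    print(f"{label}: {transfer_days(PETABYTE, rate):.1f} days")
```

Even under these optimistic assumptions, a single petabyte takes roughly three months at 1 Gbit/s, which is the practical argument for moving the software to the archive.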
35
Scientific Data Management in the Coming Decade (Jim Gray, 2005): Next-generation science instruments and simulations will produce peta-scale datasets. Such peta-scale datasets will be housed by science centers that provide substantial storage and processing for scientists who access the data via smart notebooks. The procedural stream-of-bytes-file-centric approach to data analysis is both too cumbersome and too serial for such large datasets. Database systems will be judged by their support of common metadata standards and by their ability to manage and access peta-scale datasets.
36
Virtual Observatory: If data is online, the internet is the world's best telescope. Scientific Data Management in the Coming Decade (Jim Gray)
37
Where are scientific databases going?
38
From tables to arrays. Relational algebra: a relation (table) with columns such as name, CPF, position; operations are selection, projection, join; SQL language, e.g. SELECT * FROM images WHERE date = "today". Array algebra: scientific data; spatial queries and math operations; AQL language, e.g. SELECT Mean(A.B) FROM Array A.
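The contrast between the two query styles can be sketched in Python. This is only an illustration under stated assumptions: the table layout, sensor names, and pixel values are invented, and the array half uses plain nested lists as a stand-in for an array database, not the AQL language named on the slide.

```python
import sqlite3
from statistics import mean

# Relational view: image metadata as rows; a selection filters by date.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, date TEXT, sensor TEXT)")
conn.executemany(
    "INSERT INTO images (date, sensor) VALUES (?, ?)",
    [("2012-05-01", "MODIS"), ("2012-05-02", "MODIS"), ("2012-05-02", "Landsat")],
)
rows = conn.execute(
    "SELECT id, sensor FROM images WHERE date = ? ORDER BY id", ("2012-05-02",)
).fetchall()

# Array view: the pixel values themselves are the data; the query is a
# math operation over cells (here, the mean of one hypothetical band).
band = [
    [1, 2, 3],
    [4, 5, 6],
]


def array_mean(array):
    """Mean over every cell of a 2-D array stored as nested lists."""
    return mean(value for row in array for value in row)


print(rows)
print(array_mean(band))
```

The relational query returns records about the images, while the array operation computes over the image contents, which is the shift the slide points to.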
39
Communicating concepts is hard: vulnerability? climate change? poverty? (image source: WMO)
40
Communicating concepts is hard. We're bad at representing meaning: deforestation? degradation? disturbance?
41
When did the Aral Sea reach the tipping point? Communicating change is very hard
42
Describing events and processes is very hard When did the flood occur?
43
Conclusions: Earth System Science data management poses a major challenge for the database community. We need new architectures and data handling techniques to deal with scientific data.