GEOSCIENCE NEEDS & CHALLENGES
Dogan Seber
San Diego Supercomputer Center, University of California, San Diego, USA
www.geongrid.org

Earth science research is moving towards a “systems approach”. To understand the Earth, we need to look at it as a whole. Scientists have expertise in specific areas within their sub-disciplines, but their knowledge of sister disciplines is limited. Can cyberinfrastructure help?

Some common IT problems in the Geosciences
- Exponential increase in data volumes
- Diversity and complexity of data sets
- Data storage, access and preservation
- Data integration (semantic and syntactic)
- Computational challenges and access to HPC
- Advanced visualization (3D/4D)
- Archiving publications with reusable components

A Scientific Effort Vector
[Figure: a scientist's effort divided among Background Research; Data Collection and Compilation / Software Issues; and Science (Analysis, Modeling, Interpretation, Discovery). Source: R. Keller]

Enabling Scientific Discoveries: Pathway to Discovery
Access -> Process -> Analyze -> Interpret -> Discovery
Data -> Knowledge

Large/Complex Data Volumes
- National/International observatories/projects
  - EarthScope: a US project to collect data across the entire US over the next 10 years; includes seismic, GPS and drill-hole data
  - LiDAR data: airborne and ground-based data collection (large volumes of data sets)
  - Global observations: a variety of satellites gathering data at different resolutions
  - Hydrology, environmental, natural resource development projects, etc.
- Small projects
  - Individual researchers maintain many data sets, such as geology maps, geochemistry databases, earthquake catalogues, etc.
  - Collectively, reusable data reach large volumes and complex dimensions
- Challenge: How to manage these data so that vast amounts of data can be used by all scientists in an easy-to-use environment?

Data Storage, Access and Preservation
- Preservation of digital and legacy data sets
- Since the research needs and styles of each scientist vary, each researcher has his/her own data with their own “flavors”
- Access to other scientists’ data is limited
- When scientists do not continue to maintain their data, it is lost forever!
- Challenge: How to build a framework to exchange data and help preserve collected data sets?

Data Integration Issues
- Integration requires both syntactic-level and semantic-level integration.
- E.g., how can a geologist merge multiple geology maps to make a seamless (“integrated”) map that spans national and international boundaries?
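The two levels of integration named above can be illustrated with a minimal Python sketch. It is not taken from the presentation or from GEON: the map records, field names, and vocabulary below are all invented for illustration, and a real system would rely on a community ontology rather than a hand-written dictionary. Syntactic integration is shown as renaming each source's fields onto one shared schema; semantic integration is shown as translating each source's rock terms onto one shared vocabulary.

```python
# Hypothetical records from two geology maps (all names and values invented).
MAP_A = [
    {"unit": "Qal", "lithology": "alluvium", "age": "Quaternary"},
    {"unit": "Kg", "lithology": "granite", "age": "Cretaceous"},
]
MAP_B = [
    {"UNIT_ID": "12", "ROCK": "granitic rock", "PERIOD": "Cretaceous"},
    {"UNIT_ID": "7", "ROCK": "stream deposits", "PERIOD": "Quaternary"},
]

# Syntactic integration: map each source's field names onto one shared schema.
FIELD_MAP = {
    "A": {"unit": "unit_id", "lithology": "rock_type", "age": "age"},
    "B": {"UNIT_ID": "unit_id", "ROCK": "rock_type", "PERIOD": "age"},
}

# Semantic integration: map each source's terms onto one shared vocabulary.
# This dictionary is a stand-in for a real ontology.
TERM_MAP = {
    "alluvium": "unconsolidated sediment",
    "stream deposits": "unconsolidated sediment",
    "granite": "plutonic rock",
    "granitic rock": "plutonic rock",
}


def harmonize(records, source):
    """Rename fields to the shared schema and translate rock terms."""
    rows = []
    for record in records:
        row = {FIELD_MAP[source][key]: value for key, value in record.items()}
        row["rock_type"] = TERM_MAP.get(row["rock_type"], "unclassified")
        row["source"] = source
        rows.append(row)
    return rows


def integrate():
    """Combine both maps into one list of harmonized rows."""
    return harmonize(MAP_A, "A") + harmonize(MAP_B, "B")


def group_by_rock_type(rows):
    """Group (source, unit_id) pairs under the shared rock-type term."""
    groups = {}
    for row in rows:
        groups.setdefault(row["rock_type"], []).append((row["source"], row["unit_id"]))
    return groups


if __name__ == "__main__":
    for rock_type, units in group_by_rock_type(integrate()).items():
        print(rock_type, units)
```

After harmonization, units that the two maps describe with different field names and different terms fall into the same group, which is the precondition for drawing them as one seamless map.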

Integrate Geologic Data From Multiple Sources
- What is available are multiple distinct data sets

Integration Across Disciplines
Earthquakes, Aquifers, Tectonics, Moho depth, Geology, Faults, Magnetics, Mines, Topography, Focal Mechanisms, Sediment thickness, Gravity

Computational Challenges in Geosciences
- Developing/accessing community codes
- Parallelizing software for efficient runs
- Accessing small to very large clusters
- Technical expertise to use high-end systems/clusters
- Challenges:
  - How to build a system that helps scientists run advanced software without having access to significant resources (computers and technical expertise)?
  - How to build a system that helps scientists focus on science rather than on technological challenges/problems?

Example: Can we build a system for 3D seismic modeling that the entire community, not only a privileged few, could use? (Goldstein 2001)

Geosciences are Visualization Oriented
- Once large-volume data sets are accessed, how can we visualize them to get a better understanding of each data set?
- Building an effective visualization environment requires powerful software and hardware.
- Challenge: How to build a visualization system that helps scientists analyze large and complex data sets dynamically?

Archiving results and publications with reusable components
- Science progresses incrementally. New knowledge is built on top of existing knowledge.
- Scientific validity is shown by repeatability.
- Challenges:
  - How to preserve scientific results and help others to repeat the analysis as efficiently as possible?
  - How to share algorithms and processing flows with others?
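One way to picture a "reusable component" is a processing flow stored as data next to its result, so that someone else can rerun it and compare. The Python sketch below is an illustration under stated assumptions, not a method described in the presentation: the step names (`remove_mean`, `scale`), the record layout, and the use of a SHA-256 fingerprint of the input are all hypothetical choices.

```python
import hashlib
import json


def remove_mean(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def scale(values, factor):
    """Multiply every value by a constant factor."""
    return [v * factor for v in values]


# Registry of the processing steps a flow may refer to by name (hypothetical).
STEPS = {"remove_mean": remove_mean, "scale": scale}


def fingerprint(values):
    """SHA-256 of the JSON-encoded input, used to identify the data set."""
    return hashlib.sha256(json.dumps(values).encode("utf-8")).hexdigest()


def run_flow(flow, values):
    """Apply each named step of the flow, in order, to the values."""
    for step in flow["steps"]:
        values = STEPS[step["name"]](values, **step.get("params", {}))
    return values


def archive(flow, values):
    """Return a JSON record holding the flow, the input fingerprint, and the result."""
    record = {
        "flow": flow,
        "input_sha256": fingerprint(values),
        "result": run_flow(flow, values),
    }
    return json.dumps(record, sort_keys=True)


def verify(record_json, values):
    """Rerun the archived flow on the given input and compare with the archived result."""
    record = json.loads(record_json)
    if fingerprint(values) != record["input_sha256"]:
        return False
    return run_flow(record["flow"], values) == record["result"]


if __name__ == "__main__":
    flow = {"steps": [{"name": "remove_mean"}, {"name": "scale", "params": {"factor": 0.5}}]}
    data = [1.0, 2.0, 3.0, 6.0]
    record = archive(flow, data)
    print(record)
    print(verify(record, data))
```

The comparison in `verify` is exact, which is adequate for this toy case; results computed on different hardware or library versions would usually need a numerical tolerance instead.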

Efforts underway…
- Numerous projects are funded to address these questions (e.g., GEON, SCEC ITR, CUAHSI, EarthChem)
- NSF funding opportunities in GEO and CISE directorates
- Professional societies getting involved in CI
  - GSA Geoinformatics Division
  - AGU Earth and Space informatics focus group
- Extensive level of outreach and learning activities taking place

Lessons Learned 1/2
- Building cyberinfrastructure resources is a “social experimentation”
- Equal partnerships between domain and IT is a must
- Understand the needs of the domain sciences
- Community outreach is critical (workshops, seminars, scientific meetings, etc.)
- Get it right the first time!
- Define the goals clearly, and publicize them
- Learn to differentiate “a system that works” and “a system that is usable”

Lessons Learned 2/2
- Work with those who are willing and interested
- Identify “killer apps”, use them to attract more interest
- Teach! Help build a community of users and resource builders
- Problems are similar. Work with other communities; solutions may already be out there