1 Envisioning Next-Generation Geographic Information Systems from a Cyberinfrastructure Perspective. Shaowen Wang (王少文)
2 What is Cyberinfrastructure? "It was six men of Indostan / To learning much inclined, / Who went to see the elephant / (Though all of them were blind), / That each by observation / Might satisfy his mind." [Figure callouts: It's Network! It's Grid! It's HPC! It's Sharing. It's Storage. It's middleware. And more!: Applications, Data, E-community, Instruments, Virtual Organization, etc.] After Charlie Catlett
3 Cyberinfrastructure Evolution [Timeline figure: Supercomputer Centers, PACI, Terascale, Cyberinfrastructure; NSF Networking] After Deborah L. Crawford
4 Integration – Holism "The whole is more than the sum of its parts." By Aristotle in the Metaphysics. Borromean rings, after Daniel E. Atkins. Image source:
5 Motivation – What's Beyond/Next? Google Earth: ESRI ArcGIS: Microsoft Virtual Earth:
6 Challenges. Problems: user interface is not intuitive (based on window, icon, menu, pointing device); single user; desktop-based; hard to collaborate; low performance (how much data can we analyze?)
7 Purpose: Illustrate how GISolve, a cyberinfrastructure-based GIS, is developed to help advance research and education in GIScience and cyberinfrastructure. Demonstrate the science impact of GISolve and its use as an education tool. Background Design Demo Implementation Education Conclusions Purpose
8 Background
  Geographic information quantity: ever increasing; application driven (GPS, location-based services, remote sensing)
  Computationally intensive geographic information analysis: heuristics and optimization; simulation; spatial statistical methods
  Cyberinfrastructure (CI): high-performance computing; virtual organization; Grid computing, middleware; data, visualization, and knowledge; education and workforce development
9 Review
  CI-based geographic analysis: Wang et al. 2008; Wang and Armstrong 2008; Wang and Zhu 2008; Wang and Armstrong 2003
  Domain-specific CI activities: GEON (Geosciences Network); LEAD (Linked Environments for Atmospheric Discovery); NEON (National Ecological Observatory Network); WATERS (WATer and Environmental Research Systems) Network
  Internet/Web-based GIS: Tsou 2004; Wang et al. 2005; Yang et al
  Ontology-driven GIS: Fonseca et al
10 CI Complexity. Cyberinfrastructure: is evolving; has many sophisticated components; has NOT been developed to directly focus on the requirements of domain-specific problem solving
11 Managing CI Complexity. Science and engineering gateway: rooted in CI. Problem solving environments: rooted in domain science and engineering
12 Wang and Zhu (2008) GISolve – Integrating CI Capabilities and GIS
13 GISolve Middleware [Architecture diagram labels: spatial computational domain; domain decomposition; task scheduling; information broker and resource discovery; data access module; problem solving environments implemented using Web 2.0 technologies; monitoring services; protocols and services for data access on the Grid, such as the Globus GridFTP; middleware such as the Globus Toolkit and Condor; resource management]
14 Computational Intensity = Wattage?! For CI-based geographic problem solving, computational intensity metrics are critically important!
15 Spatial Computational Domain. Wang, S., and Armstrong, M. P. "A Theoretical Approach to the Use of Cyberinfrastructure in Geographical Analysis." International Journal of Geographical Information Science, DOI: /
16 Information Broker and Resource Discovery
  Self-Organized Grouping method for Grid resource discovery
    Padmanabhan, A., Wang, S., Ghosh, S., and Briggs, R. "A Self-Organized Grouping (SOG) Method for Efficient Grid Resource Discovery." In: Proceedings of the Grid 2005 Workshop, Seattle, WA, November , 2005, IEEE Press, pp
  Modular Information Provider to support interoperable information brokering
    Wang, S., Shook, E., Padmanabhan, A., Briggs, R., Pearlman, L. "Developing a Modular Information Provider to Support Interoperable Grid Information Services." In: Proceedings of Grid and Cooperative Computing - GCC 2006: The Fifth International Conference, IEEE Computer Society, pp
17 Domain Decomposition and Task Scheduling [Figure: a 16-cell grid whose cells are labeled with the index pairs (0, 0), (1, 1), (2, 4), (3, 5), (4, 2), (5, 3), (6, 6), (7, 7), (8, 8), (9, 9), (10, 12), (11, 13), (12, 10), (13, 11), (14, 14), (15, 15), repeated in four panels; resources labeled Small Capacity, Medium Capacity, Large Capacity]
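The index pairs on this slide can be reproduced by pairing each cell's row-major index in a 4 x 4 grid with its Morton (Z-order) index, a common way to linearize quadtree cells. That reading is an assumption; the slide does not name the ordering. The sketch below uses hypothetical function names and made-up relative capacities (1, 2, 5). It reproduces the pairs, then cuts the Morton-ordered cells into contiguous runs sized in proportion to each resource's capacity. It is one simple illustration of capacity-aware decomposition, not GISolve's actual scheduler.

```python
def morton_index(row, col, bits):
    """Interleave the bits of row and col; col bits land in even positions."""
    index = 0
    for b in range(bits):
        index |= ((col >> b) & 1) << (2 * b)
        index |= ((row >> b) & 1) << (2 * b + 1)
    return index


def row_major_morton_pairs(size):
    """Return (row-major index, Morton index) for every cell of a size x size grid."""
    bits = (size - 1).bit_length()
    return [
        (row * size + col, morton_index(row, col, bits))
        for row in range(size)
        for col in range(size)
    ]


def partition_by_capacity(cells, capacities):
    """Split an ordered list of cells into contiguous runs proportional to capacity.

    capacities maps a (hypothetical) resource name to a relative capacity.
    The last resource takes whatever remains so that every cell is assigned.
    """
    total = sum(capacities.values())
    names = list(capacities)
    plan, start, cumulative = {}, 0, 0
    for i, name in enumerate(names):
        cumulative += capacities[name]
        if i == len(names) - 1:
            end = len(cells)
        else:
            end = round(len(cells) * cumulative / total)
        plan[name] = cells[start:end]
        start = end
    return plan


pairs = row_major_morton_pairs(4)
# Order cells along the Z-curve so each run stays spatially compact.
cells_in_z_order = [rm for rm, _ in sorted(pairs, key=lambda p: p[1])]
plan = partition_by_capacity(cells_in_z_order, {"small": 1, "medium": 2, "large": 5})
for name, cells in plan.items():
    print(name, cells)
```

Cutting along the Z-curve keeps each resource's cells close together in space, which tends to reduce the boundary data that neighboring tasks must exchange.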
18 GISolve Workflow
19 TeraGrid GIScience Gateway Based on GISolve
20 TeraGrid Image source:
21 Open Science Grid Image source:
22 GISolve Services: security; decomposition and task scheduling; geographic data access; information broker and resource discovery; workflow
23 Service-oriented approach
24 Spatio-Temporal Data Handling and Visualization: Bioenergy data portal
25 Bayesian Geostatistical Modeling – Markov chain Monte Carlo
  Communication topology management: helps split processors into groups; the processors of each group belong to the same computer; each group runs a single chain; cross-cluster communication cost is minimal
  [Figure: Nodes 1 to 12 grouped into Chain 1, Chain 2, and Chain 3 across Supercomputer A and Supercomputer B]
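The grouping rule on this slide (each chain's processors sit on one computer) can be sketched as follows. The function names, the processor-to-host layout (Nodes 1 to 8 on Supercomputer A, Nodes 9 to 12 on Supercomputer B), and the group size of four are assumptions made for illustration; the slide does not give the actual layout or the gateway's implementation.

```python
from collections import defaultdict


def group_processors_by_host(processors):
    """Map each host name to the list of processor ids located on it.

    processors is a list of (processor_id, host_name) tuples.
    """
    groups = defaultdict(list)
    for proc_id, host in processors:
        groups[host].append(proc_id)
    return dict(groups)


def assign_chains(processors, procs_per_chain):
    """Form chain groups of a fixed size without ever spanning two hosts.

    Leftover processors on a host (fewer than procs_per_chain) stay unassigned
    rather than being combined with processors from another host.
    """
    chains = []
    for host, procs in group_processors_by_host(processors).items():
        for start in range(0, len(procs) - procs_per_chain + 1, procs_per_chain):
            chains.append({
                "chain": len(chains) + 1,
                "host": host,
                "processors": procs[start:start + procs_per_chain],
            })
    return chains


# Hypothetical layout: eight nodes on one machine, four on another.
layout = [(n, "Supercomputer A") for n in range(1, 9)]
layout += [(n, "Supercomputer B") for n in range(9, 13)]

for chain in assign_chains(layout, procs_per_chain=4):
    print(chain)
```

Because each chain stays within one machine, the frequent within-chain communication is kept local and only occasional results need to cross between clusters. In an MPI program, the same grouping would typically be expressed by splitting a communicator using the chain number as the color.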
26
27 Analyses Supported by the Gateway
  Bayesian geostatistical modeling
    Yan, J., Cowles, M. K., Wang, S., and Armstrong, M. P. (2007) Parallelizing MCMC for Bayesian spatiotemporal geostatistical models. Statistics and Computing, 17 (4):
  Detection of local spatial clustering
    Wang, S., Cowles, M. K., and Armstrong, M. P. (2008) Grid computing of spatial statistics: using the TeraGrid for Gi*(d) analysis. Concurrency and Computation: Practice and Experience, forthcoming
  Spatial interpolation
    Wang, S., and Armstrong, M. P. (2003) A quadtree approach to domain decomposition for spatial interpolation in Grid computing environments. Parallel Computing, 29 (10):
  Under development
    ABM (Agent-Based Modeling)
    Spatial Genetic Algorithms
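For readers unfamiliar with the Gi*(d) analysis listed on this slide, here is a minimal sketch of the commonly used standardized (z-score) form of the Getis-Ord Gi*(d) statistic with binary distance weights. The function name, the toy data, and the choice of binary weights are assumptions for illustration. The slide does not say which variant the gateway computes, and this is not the gateway's code.

```python
import math


def getis_ord_gi_star(values, coords, d):
    """Standardized Gi*(d) for every location, using binary distance weights.

    A location j is a neighbor of i (weight 1) when its distance to i is at
    most d; location i counts as its own neighbor, which is what the star
    in Gi* denotes. Returns NaN where the statistic is undefined.
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for xi, yi in coords:
        weights = [
            1.0 if math.hypot(xi - xj, yi - yj) <= d else 0.0
            for xj, yj in coords
        ]
        w_sum = sum(weights)
        w_sq_sum = sum(w * w for w in weights)
        numerator = sum(w * v for w, v in zip(weights, values)) - mean * w_sum
        denominator = std * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator if denominator > 0 else float("nan"))
    return scores


# Toy example: five points on a line, high values clustered at the left end.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
values = [10, 10, 1, 1, 1]
for point, score in zip(coords, getis_ord_gi_star(values, coords, d=1)):
    print(point, round(score, 3))
```

Once the global mean and standard deviation are known, each location's score depends only on its own neighborhood, so the locations can be divided among processors and computed independently.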
28 Integrated CI-based Workbench for Geospatial Scientists
29 Education and Outreach
  In classrooms
    The University of Iowa, 2006, 2007: Foundations of Geographic Information Systems (undergraduate); Principles of Geographic Information Systems (undergraduate and graduate); Bayesian Statistics (undergraduate and graduate); Computing in Statistics (undergraduate and graduate)
    The University of Illinois at Urbana-Champaign, 2007, 2008: Advanced Geographic Information Systems (undergraduate and graduate); Introduction to Geographic Information Systems (undergraduate)
  TeraGrid07 student competition: high-school students
  Supercomputing 2007 education program: high-school and college teachers
30 Conclusions. GISolve principles: integrated; collaborative; distributed; high-performance; service-oriented. GISolve is effective for teaching CI, GIScience, and CI-based GIS
31 CIGI – CyberInfrastructure and Geospatial Information Laboratory / Virtual-Organization [Diagram labels: High-Performance, Distributed and Collaborative GIS; Geospatial Analysis and Modeling; Base Cyberinfrastructure; Applications; Energy, Environment, Public Health; GISolve; Computational Intensity; Open Science Grid, TeraGrid; Application Driven; Computational Thinking; Multidisciplinary Interactions]
32 Disciplines Involved in the CIGI VO: Biology; Computer Science; Geography; GIScience; Environmental engineering; History; Hydrology; Statistics
33 Global Malaria Risk
34 From James D. Myers
35 Ongoing R&D: interoperability of GISolve services; spatiotemporal computational domain; adaptive domain decomposition services; visualization services; evaluation of GISolve performance; extension of the types of geographic information analysis; provenance management
36 Acknowledgments
  CyberInfrastructure and Geospatial Information Laboratory (CIGI)
  National Center for Supercomputing Applications (NCSA) Faculty Fellowship
  Department of Energy
    Open Science Grid
  National Science Foundation
    ITR: iVDGL (International Virtual Data Grid Laboratory)
    OCI Open Science Grid
    TeraGrid SES060004N
    TeraGrid SES070004N
  Colleagues: Dr. Marc P. Armstrong (Geography, UIowa); Dr. David A. Bennett (Geography, UIowa); Mr. Tim Cockerill (NCSA, UIUC); Dr. Mary Kathryn Cowles (Statistics, UIowa); Mr. Yan Liu (CIGI/NCSA, UIUC); Mr. Doru Marcusiu (NCSA, UIUC); Dr. James D. Myers (NCSA, UIUC); Dr. Anand Padmanabhan (CIGI/NCSA, UIUC); Ms. Ruth Pordes (Open Science Grid); Dr. Brian J. Smith (Biostatistics, UIowa); Mr. Eric Shook (Geography, UIUC); Dr. Wenwu Tang (CIGI/NCSA, UIUC); Mr. John W. Towns (NCSA, UIUC); Dr. Edward Walker (TACC, UT-Austin); Ms. Nancy Wilkins-Diehr (SDSC/TeraGrid); Dr. Jun Yan (Statistics, UConn); Dr. Xin-Guang Zhu (Biology, UIUC)
37 Thank you! Comments and/or questions?