Published by Beryl Carr. Modified over 9 years ago.
1
Data- and Compute-Driven Transformation of Modern Science
How e-Infrastructure & Policy Support Paradigm Shifts in Research
Edward Seidel
Senior Vice President, Research and Innovation
Skolkovo Institute of Science and Technology
2
Part 1: We are in a period of unprecedented change in Science and Society… the crises & opportunities this creates
3
Profound Transformation of Science
Collision of Two Black Holes
1972: Hawking. 1 person, no computer, 50 KB
1994: 10 people, NCSA Cray Y-MP, 50 MB
1998: 15 people, NCSA Origin, 50 GB
4
Community Einstein Toolkit
“Einstein Toolkit: open software for astrophysics to enable new science, facilitate interdisciplinary research and use emerging petascale computers and advanced CI.”
Consortium: 92 members, 46 sites, 15 countries
Whole consortium engaged in directions, support, development
Simulation: Luciano Rezzolla, Max Planck Institut für Gravitationsphysik (AEI)
Community + software + algorithms + hardware + …
Many groups can do this: field explodes!
Major triumph of Computational Science: solving the Einstein equations (EEs)!
5
New Frontiers: Relativistic Matter
Nuclear equations of state
  Collaboration with astrophysicists
General relativistic magnetohydrodynamics
  Some groups have ideal MHD
Radiation transport (neutrinos/photons)
  Expensive and complicated!
  Requires opacities/emissivities
Chemical reactions (thermonuclear, chemical)
  SN community!
Computation: Multiphysics!!
  GRMHD: petascale problem
  Radiation transport beyond this
Schnetter et al, PetaScale Computing: Algorithms and Applications, 2007
Zoom in to just this part: post BH formation and evolution of jet
6
Schnetter, et al
Post BH Formation/Evolution of Jet
Multiphysics: GR, neutrinos, MHD, nuclear EOS
Computer Science: 10-level AMR, optimized for Blue Waters
Science: 200 s evolution, analyze it all
Blue Waters: 6M Pflop @ 1 PF sustained = 70 days!
Multiphysics framework needed for fluids in astrophysics and porous media…
Rezzolla, et al
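The 70-day figure on this slide can be checked with a back-of-the-envelope calculation. The sketch below assumes that "6M Pflop" means a total of 6 million petaflop of work (6 × 10^21 floating-point operations) and that "1 PF sustained" means a constant rate of 1 petaflop per second; both readings are interpretations of the slide's shorthand, and the function name is made up for illustration.

```python
# Rough check of the slide's estimate. ASSUMPTIONS (not stated on the slide):
#   "6M Pflop"       = 6e6 petaflop of total work
#   "1 PF sustained" = 1 petaflop per second, held for the whole run
SECONDS_PER_DAY = 24 * 60 * 60


def wall_clock_days(total_pflop: float, sustained_pflops: float) -> float:
    """Days needed to perform `total_pflop` of work at `sustained_pflops` Pflop/s."""
    seconds = total_pflop / sustained_pflops
    return seconds / SECONDS_PER_DAY


days = wall_clock_days(total_pflop=6e6, sustained_pflops=1.0)
print(f"{days:.1f} days")
```

Under these assumptions the run takes 6 × 10^6 seconds, a little over 69 days, which matches the slide's rounded "70 days".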
7
Theory and Observation of Universe
Gravitational Waves!
Complex problems in relativistic astrophysics
Relativity, hydrodynamics, nuclear physics, radiation, neutrinos, magnetic fields: globally distributed collab!
Observe (PB), compute (PF) signals
Gravity and general relativity are transformed
4 centuries of small science, small data culture
2-3 decades of radical change in both data (factors of 1000 per ~5 years) and collaboration
LIGO/VIRGO/GEO
New era of science after a century!
Data- and compute-dominated gravitational wave astronomy!
8
US Council on Competitiveness
Ping Golf
  Moved from workstation to Cray, now make prototype only at last stage of design
  Too effective: had to simulate “less effective” design!
Procter & Gamble
  Pringles “flying” off manufacturing line, causing significant lost product and revenue
  Using the CFD codes that Boeing uses, airflow over the Pringle was modeled and the design changed so Pringles did not “lift off”
9
Part 1a: The Growth of Data
Data Tsunami
“I’m still here…”
But I’m your new baby big brother…
With millions of processors…
10
Going Beyond a Community
Transient & Multi-Messenger Astronomy
New era: seeing events as they occur
Here now
  ALMA, EVLA in radio
  IceCube neutrinos
On horizon
  24-42m optical?
  LSST = SDSS (40 TB) every night!
  SKA = exabytes
Simulations integrate all physics
Data-intensive = compute-intensive
Astronomy 1500-2010 was passive. No longer!
Communities need to share data, software, knowledge, in real time
Will require integration across disciplines, end-to-end
11
Big Data vs The Long Tail of Science
Many “Big Data” projects are “special”
  Tend to be highly organized, have singular sources of data, professionally curated, a lot of attention paid to them
What about the “Long Tail” (the other 99%)?
  Thousands of biologists sequencing communities of organisms
  Thousands of chemists and materials scientists developing a “materials genome”
  Millions of people “Tweeting”…
Characteristics:
  Heterogeneous, perhaps hand generated
  Not curated, reused, served, etc…
How do we harness the power of this long tail?
News Flash! NYT 6/3/13: Drug side effects discovered by mining web logs: paroxetine + pravastatin = high blood sugar!
12
Grand Challenge Communities Combine it All...
Where is it going to go?
Same CI useful for black holes, hurricanes
13
Grand Challenge Communities for Complex Problems
Require many disciplines, all scales of collaborations
  Individuals, groups, teams, communities
Multiscale Collaborations: Beyond teams
  Are dynamic and highly multidisciplinary
  Time domain astronomy, emergency forecasting, metagenomics, materials genome…
Drive sharing technologies and methodologies
  Researchers collaborate, work by sharing data.
Places requirements on e-Infrastructure:
  Software, networks, collaborative environments, data, sharing, computing, etc.
  Scientific culture, reproducibility, access, university structures
  “Publications.” What is a modern publication?
Social, behavioral and economic sciences will be critical in helping us understand these issues…
14
Scenarios like this in all fields
NEON+GIS
15
Framing the Challenge: Science and Society Transformed by Data
Modern science
  Data- and compute-intensive
  Integrative, multiscale
  4 centuries of constancy, 4 decades of 10^9 to 10^12 change!
Multi-disciplinary Collaborations
  Individuals (Galileo!)
  Groups, teams, Grand Challenge Communities
Big Data + Long Tail
Sea of Data
Age of Observation
We still think like this…
…But such radical change cannot be adequately addressed with (current) incremental approach!
Students take note!
16
Part 2: Crises, Challenges, Opportunities
Computing
Data
Software
End-to-end Networks
Organizational structures
Education
No, we are not…
Cyber Instruments & Facilities
17
Five Crises “CDSE” Community needs to address
Computing Technology
  Multicore: processor is new transistor
  Programming model, fault tolerance, etc.
  New models: clouds, grids, GPUs, …
Data, provenance, and visualization
  How do we create “data scientists”?
  What is an international data infrastructure?
Software treated as e-Infrastructure
  Complex applications on coupled compute-data-networked environments, tools needed
  Modern apps: 10^6+ lines, many groups contribute, take decades
18
Five Crises
Organization for Multidisciplinary & Computational Science
  “Universities must significantly change organizational structures: multidisciplinary & collaborative research are needed [for US] to remain competitive in global science”
  “Itself a discipline, computational science advances all science…inadequate/outmoded structures within Federal government and the academy do not effectively support this critical multidisciplinary field”
Education
  The CI environment is running away from us!
  How do we develop a workforce to work effectively in this world?
  How do universities transition?
19
Scientific Computing and Imaging Institute, University of Utah
Data Crisis: Information Big Bang
PCAST Digital Data
NSF Experts Study
Wired, Nature
Storage Networking Industry Association (SNIA) 100 Year Archive Requirements Survey Report
“there is a pending crisis in archiving… we have to create long-term methods for preserving information, for making it available for analysis in the future.”
80% of respondents: > 50 yrs; 68%: > 100 yrs
Industry
20
The Shift Towards a “Sea of Data”
Implications
Science & society are now data-dominated
  Experiment, computation, theory
  US mobile phone traffic exceeded 1 exabyte!
Classes of data
  Collections, observations, experiments, simulations
  Software
  Publications
Totally new methodologies
  Algorithms, mathematics, culture
Data become the medium for
  Multidisciplinarity, communication, publication, science, economic development…
How do we attribute credit for this new publication form? How are data peer reviewed? What is a publication in the modern data-rich world? What is a business model for OA?
Fundamental questions become focused around data: What to curate, how to remove boundaries? How to incentivize sharing? IP?
21
Part 2a: Recommendations
22
ACCI Task Force Reports
Final recommendations presented to the NSF Advisory Committee on Cyberinfrastructure, Dec 2010
More than 25 workshops and Birds of a Feather sessions, 1300 people involved
Final reports on-line
“Permanent programmatic activities in Computational and Data-Enabled Science & Engineering (CDS&E) should be established within NSF.” (Grand Challenges Task Force)
“NSF should establish processes to collect community requirements and plan long-term software roadmaps.” (Software Task Force)
“Higher education should adopt criteria for tenure and promotion that reward…the production of digital artifacts of scholarship. Such artifacts include widely used data sets, scholarly services delivered online, and software.” (Campus Bridging Task Force)
Task forces: Software, Campus Bridging, Data & Viz, Grand Challenge, HPC, Learning
23
Recommendation of NSF Advisory Committee on Cyberinfrastructure (ACCI)
"The National Science Foundation should create a program in Computational and Data-Enabled Science and Engineering (CDS&E), based in and coordinated by the NSF Office of Cyberinfrastructure. The new program should be collaborative with relevant disciplinary programs in other NSF directorates and offices."
24
Part 3: Universities attempt to respond
We have to do all this and revolutionize the state/national economy?
25
Skoltech: Example of a 21st Century University in the Making
26
Skoltech at a Glance
A unique Russian institution in international context
– This decade: a community of 200 faculty, 300 post-docs, 1200 graduate students
Focused on science, engineering and technology
– Addressing problems and issues in IT, Energy, Biomedicine, Space and Nuclear
Interdisciplinary by design; no departments
– 15 centers organized around complex problems
With strong programs in support of innovation and entrepreneurship
– Creating a culture of innovation in every student, professor, staff member
Important part of the Skolkovo innovation ecosystem
Integrated data, compute, instrumentation infrastructure and policy under development for
– Interdisciplinary research
– Accelerating discovery
– Economic development
27
Part 3a: You can help lead this revolution
Kathryn Gray
28
Modern Research & Education Ecosystem
[Diagram labels: Software, Track 2, Campus, Data, XSEDE, Blue Waters]
Education Crisis: I need all of this to start to solve my problem!
29
The Opportunity (US picture)!
Now have emerging national Integrated, High Performance Research Architecture
Blue Waters and beyond towards exascale: high end
  Extraordinary science continues to lead at cutting edge
  Traditional and novel large data applications
  Few places can house, field, or drive such a facility
XSEDE architecture can connect…
Campus Bridging: campus to national CI…
– Campus Assets: MRI, Instruments, DNA sequencers…
– Facilities: Supercomputers, telescopes, accelerators, light sources, NEON…
  » “More silicon than steel”
– Networks: end-to-end connectivity
  » Where are those optical network apps?
30
Much to do to build CDSE on this Background: address the “5 Crises”
Education
  Many new opportunities and challenges
  CSE already has its struggles
  Now data: what is a “data scientist”?
  CDSE emerges
  Data opportunities for education and citizen science
Faculty development, curriculum development
  Needed on every campus
Talk to NSF, DOE, EC, your national agencies
  Recommendations of ACCI, MPSAC, etc.
  New programs needed: See NSF CDS&E, CI TraCS, CAREER, “LWD”, etc.
You can help make this happen
31
Key Messages
Astounding rate of change of the “Triple Helix” of Research, Education, and Innovation
  Computing and Data radically change methods
  Culture of collaboration around complex problems
These create many crises and opportunities
  From technology to methodology to culture…
  Deep integration required for science
Emergence of Computational and Data-enabled Science and Engineering as a discipline and your role!
  A key part of the paradigm shift
32
& Data