SFA-1 Best Visuals of CSME
SFA-2 Computational science challenges arise in a variety of applications l Computational science is emerging as its own discipline l Simulation is becoming a peer to theory and experiment in the process of scientific discovery l Integration is the key —domain science expert —applied mathematician —computer scientist l Application areas: turbulence, fusion, environment, biology, lasers, materials
SFA-3 Computational Science & Engineering l A “multidiscipline” on the verge of full bloom —Envisioned by von Neumann and others in the 1940s —Undergirded by theory (numerical analysis) for the past fifty years —Empowered by spectacular advances in computer architecture over the last twenty years —Enabled by powerful programming paradigms in the last decade l Adopted in industrial and government applications —Boeing 777’s computational design a renowned milestone —DOE NNSA’s “ASCI” (motivated by CTBT) —DOE SC’s “SciDAC” (motivated by Kyoto, etc.)
SFA-4 Simulation complements experimentation l Scientific simulation is valuable where experiments are expensive, dangerous, prohibited or impossible, or difficult to instrument l Environment: global climate, wildland fire spread l Energy: combustion, fusion l Physics: cosmology, radiation transport l Engineering: electromagnetics, aerodynamics
SFA-5 Niche for computational science l Has theoretical aspects (modeling) l Has experimental aspects (simulation) l Unifies theory and experiment by providing common immersive environment for interacting with multiple data sets of different sources l Provides “universal” tools, both hardware and software Telescopes are for astronomers, microarray analyzers are for biologists, spectrometers are for chemists, and accelerators are for physicists, but computers are for everyone! l Costs going down, capabilities going up every year
SFA-6 Applied Math and CS + Science and Engineering Applications = Computational Science l Domain science: biology, physics, chemistry, engineering, environmental science l Applied math: sparse linear solvers, nonlinear equations, differential equations, multilevel methods, AMR techniques, optimization, eigenproblems l Computer science: data management, data mining, visualization, programming models, languages, OS, compilers, debuggers, architectural issues l Computational scientists bring applied mathematics and computer science capabilities to bear on challenging problems in science and engineering l Computational Science & Engineering is a team effort!
SFA-7 Example: Solving PDEs on increasingly finer meshes l Traditional supercomputing applications involve the solution of a PDE on a computational grid —computational fluid dynamics —oil reservoir and groundwater management —stockpile stewardship —ICF and MFE applications l Bigger machines and smarter algorithms have allowed more realistic simulations —Moore’s Law and massively parallel computers have provided unprecedented computing power —scalable algorithms enable large-scale simulations
SFA-8 Theory, Experiment and Computation Growth in the expectations for and applications of CSE methodology has been fueled by rapid and sustained advances, over the past 30 years, in computing power and in algorithm speed and reliability, and by the emergence of software tools for the development and integration of complex software systems and the visualization of results. In many areas of science and engineering, the boundary has been crossed where simulation, or simulation in combination with experiment, is more effective (in some combination of time/cost/accuracy) than experiment alone for real needs. In addition, simulation is now a key technology in industry.
SFA-9 Growth of Capabilities of Hardware and Algorithms Updated version of chart appearing in “Grand Challenges: High Performance Computing and Communications,” OSTP Committee on Physical, Mathematical and Engineering Sciences, 1992.
SFA-10 The power of optimal algorithms l Advances in algorithmic efficiency rival advances in hardware architecture l Consider Poisson’s equation ∇²u = f on a cube of size N = n³:
Year | Method | Reference | Storage | Flops
1947 | GE (banded) | von Neumann & Goldstine | n^5 | n^7
1950 | Optimal SOR | Young | n^3 | n^4 log n
1971 | CG | Reid | n^3 | n^3.5 log n
1984 | Full MG | Brandt | n^3 | n^3
l If n = 64, this implies an overall reduction in flops of ~16 million* *On a 16 Mflop/s machine, six months is reduced to 1 s
SFA-11 Algorithms and Moore’s Law (chart: relative speedup vs. year) l This advance took place over a span of about 36 years, or 24 doubling times for Moore’s Law l 2^24 ≈ 16 million, the same as the factor from algorithms alone!
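The two factor-of-16-million claims above (algorithmic gain at n = 64, and 24 Moore’s-Law doublings) can be checked in a few lines. A small sketch, using only the asymptotic flop counts from the table (constants dropped):

```python
import math

n = 64  # grid points per dimension; N = n**3 unknowns

# Asymptotic flop counts (constants dropped) for the 3-D Poisson solve
flops = {
    "banded GE (1947)": n**7,
    "optimal SOR":      n**4 * math.log(n),
    "CG (1971)":        n**3.5 * math.log(n),
    "full MG (1984)":   n**3,
}
for method, f in flops.items():
    print(f"{method:18s} ~{f:.3g} flops")

# Reduction from banded GE to full multigrid: n**4 = 2**24 = 16,777,216
ratio = n**7 // n**3
print(ratio, ratio == 2**24)  # -> 16777216 True
```

So the algorithmic speedup at n = 64 is exactly n⁴ = 2²⁴, matching the 24 Moore’s-Law doubling times over the same 36-year span.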
SFA-12 The power of optimal algorithms l Since O(N) is already optimal, there is nowhere further “upward” to go in efficiency, but one must extend optimality “outward”, to more general problems l Hence, for instance, algebraic multigrid (AMG), obtaining O(N) in anisotropic, inhomogeneous problems l AMG framework: choose coarse grids, transfer operators, etc., based on numerical weights and heuristics, to eliminate the algebraically smooth error that is not damped by pointwise relaxation
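To make the O(N)-per-cycle idea concrete, here is a toy one-dimensional *geometric* multigrid V-cycle (not AMG; AMG automates the coarse-grid and transfer-operator choices algebraically). This is a minimal sketch for the model problem -u'' = f with zero boundary values; every quantity in it is illustrative, not from the slides:

```python
import numpy as np

def relax(u, f, h, sweeps=3):
    # Weighted-Jacobi smoothing for -u'' = f (boundary values stay fixed)
    for _ in range(sweeps):
        u[1:-1] += 0.8 * (0.5 * (u[:-2] + u[2:] + h * h * f[1:-1]) - u[1:-1])
    return u

def residual(u, f, h):
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] - (2 * u[1:-1] - u[:-2] - u[2:]) / (h * h)
    return r

def v_cycle(u, f, h):
    n = u.size - 1                    # number of grid intervals
    if n == 2:                        # one interior unknown: solve exactly
        u[1] = 0.5 * h * h * f[1]
        return u
    u = relax(u, f, h)                # pre-smooth
    r = residual(u, f, h)
    rc = np.zeros(n // 2 + 1)         # restrict residual (full weighting)
    rc[1:-1] = 0.25 * r[1:n-2:2] + 0.5 * r[2:n-1:2] + 0.25 * r[3:n:2]
    ec = v_cycle(np.zeros_like(rc), rc, 2 * h)
    e = np.zeros(n + 1)               # prolong correction (linear interpolation)
    e[::2] = ec
    e[1::2] = 0.5 * (ec[:-1] + ec[1:])
    u += e
    return relax(u, f, h)             # post-smooth

# Model problem: -u'' = pi^2 sin(pi x) on [0,1], exact solution u = sin(pi x)
n = 64
x = np.linspace(0.0, 1.0, n + 1)
f = np.pi ** 2 * np.sin(np.pi * x)
u = np.zeros(n + 1)
for _ in range(10):                   # each V-cycle costs O(N) work
    u = v_cycle(u, f, 1.0 / n)
err = np.max(np.abs(u - np.sin(np.pi * x)))
```

A few V-cycles drive the algebraic error below the discretization error, at O(N) cost per cycle; AMG’s contribution is to recover this behavior when the grid hierarchy cannot be chosen geometrically.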
SFA-13 Modeling Framework
SFA-14 Mechanistic Modeling Framework
SFA-15 The Revolution at the Microscale Behavior near walls and boundaries is critical Large molecules moving through small spaces Interaction with the macroscale world is still important
SFA-16 The Multiscale World Quasicontinuum method (Tadmor, Ortiz, Phillips, 1996) Links atomistic and continuum models through the finite element method. A separate atomistic structural relaxation calculation is required for each cell of the FEM mesh instead of using empirical constitutive information. Predicts observed mechanical properties of materials on the basis of their constituent defects Hybrid finite element/molecular dynamics/quantum mechanics method (Abraham, Broughton, Bernstein, Kaxiras, 1999) Massively parallel, but designed for systems which involve a central defective region surrounded by a region which is only slightly perturbed from equilibrium Nakano et al.
SFA-17 More Multiscale Hybrid finite element/molecular dynamics/quantum mechanics algorithm (Nakano, Kalia and Vashista, 1999) Adaptive mesh and algorithm refinement (Garcia, Bell, Crutchfield, Alder, 1999) Embeds a particle method (DSMC) within a continuum method at the finest level of an adaptive mesh refinement hierarchy – application to compressible fluid flow Coarse stability and bifurcation analysis using time-steppers (Kevrekidis, Qian, Theodoropoulos, 2000) The “patch” method This is only a small sample: There is a new journal devoted entirely to multiscale issues!
SFA-18 Engineering Meets Biology Computational Challenges: Multiscale simulation Understanding and controlling highly nonlinear network behavior (140 pages to draw a diagram for network behavior of E. coli) Uncertainty in network structure Large amounts of uncertain and heterogeneous data Identification of feedback behavior Simulation, analysis and control of hybrid systems Experimental design
SFA-19 Multiscale Simulation of Biochemical Networks In the heat-shock response in E. coli, a small number of sigma-32 molecules per cell play a key role in sensing the folding state of the cell and in regulating the production of heat shock proteins. The system cannot be simulated at the fully stochastic level, due to: Multiple time scales (stiffness) The presence of exceedingly large numbers of molecules that must be accounted for in SSA Khammash et al.
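SSA here is Gillespie’s stochastic simulation algorithm, which advances the system one reaction event at a time. A minimal sketch on a toy birth-death network (not the heat-shock model, which is far larger) shows the per-event loop whose cost explodes when molecule counts are large; the rate constants are hypothetical:

```python
import random

def ssa(propensities, update, state, t_end, seed=0):
    """Minimal Gillespie SSA: simulate one trajectory until t_end."""
    rng = random.Random(seed)
    t = 0.0
    while True:
        a = [p(state) for p in propensities]   # current reaction propensities
        a0 = sum(a)
        if a0 == 0:
            return state
        t += rng.expovariate(a0)               # exponential time to next event
        if t >= t_end:
            return state
        r = rng.random() * a0                  # pick a reaction with prob a[i]/a0
        i = 0
        while r >= a[i]:
            r -= a[i]
            i += 1
        state = update[i](state)

# Toy birth-death model (NOT the heat-shock network): X -> X+1 at rate k,
# X -> X-1 at rate g*X; the stationary mean is k/g = 10.
k, g = 10.0, 1.0
props = [lambda x: k, lambda x: g * x]
moves = [lambda x: x + 1, lambda x: x - 1]
final = ssa(props, moves, state=0, t_end=50.0)
```

Because every single reaction event is simulated, species present in huge copy numbers (or fast reactions, i.e. stiffness) make the event loop prohibitively long, which is exactly the multiscale difficulty the slide describes.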
SFA-20 Beyond Simulation: Computational Analysis Sensitivity analysis Forward and adjoint methods – ODE/DAE/PDE; hybrid systems Multiscale, stochastic, … still to come Uncertainty analysis Polynomial chaos, deterministic systems with uncertain coefficients Many other ideas – special issue in progress, SIAM SISC Design optimization/optimal control Design of experiments – to what extent can you learn something from incomplete information? Where is the most predictive power?
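The forward method mentioned above can be sketched on a scalar ODE: augment the state with the sensitivity s = dy/dp and integrate both together. A toy example with a hypothetical model dy/dt = -p·y (explicit Euler for brevity; real codes would use a stiff DAE integrator):

```python
import math

def forward_sensitivity(p, t_end, steps=20000):
    """Integrate dy/dt = -p*y, y(0)=1 together with its forward
    sensitivity s = dy/dp, which satisfies ds/dt = -p*s - y, s(0)=0."""
    h = t_end / steps
    y, s = 1.0, 0.0
    for _ in range(steps):            # explicit Euler on the augmented system
        y, s = y + h * (-p * y), s + h * (-p * s - y)
    return y, s

y, s = forward_sensitivity(p=2.0, t_end=1.0)
# Analytic check for this model: y = exp(-p*t), s = dy/dp = -t*exp(-p*t)
```

Forward sensitivities cost one extra integration per parameter; the adjoint method reverses this trade-off, giving gradients with respect to many parameters at the cost of one backward solve.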
SFA-21 More Computational Analysis Determination of nonlinear structure – multiscale, stochastic, hybrid Bifurcation Mixing Long-time behavior Invariant manifolds Chaos Control mechanisms – identifying feedback mechanisms Reduced/simplified models – deterministic, multiscale, stochastic, hybrid systems, identify the underlying structure and mechanism Data analysis – revealing the interconnectedness, dealing with complications due to data uncertainties
SFA-22 Computer Science will Play a Much Larger Role Pragmatic reasons: Significant help from software tools Source-code generation Automatic differentiation – enables greater accuracy and reliability (and saves work in writing derivative routines and especially in debugging!) in generating the Jacobian matrix Fix the dumb things we have done in codes, like ‘if’ statements in functions that are supposed to be continuous Thread safety – identify and fix the problems so that the code is ready for parallel/grid computing Some exceptions and coming developments: Matlab Semi-automatic generation of GUIs (MAUI, JMPL), for big production codes and dusty decks Component technologies (PETSc) User interfaces: by current standards in the rest of the computer world, user interfaces for scientific computing look like this: (screenshot)
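Automatic differentiation deserves a concrete illustration. A toy forward-mode AD class (illustrative only; production AD tools transform real Fortran/C codes) propagates exact derivatives through arithmetic, with no finite-difference truncation error:

```python
import math

class Dual:
    """Forward-mode AD value: carries (value, derivative) through arithmetic."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.der + o.der)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val,
                    self.der * o.val + self.val * o.der)  # product rule
    __rmul__ = __mul__
    def sin(self):
        return Dual(math.sin(self.val), math.cos(self.val) * self.der)

# d/dx [x*sin(x) + 3x] at x = 2.0, computed exactly alongside the value
x = Dual(2.0, 1.0)            # seed derivative dx/dx = 1
y = x * x.sin() + 3 * x
# y.der == sin(2) + 2*cos(2) + 3 by the product rule
```

Applying the same mechanics to every component of a vector function yields Jacobian columns, which is the use case the slide highlights.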
SFA-23 Computer Science will Play a Much Larger Role The deeper reason: At the smaller scales, we are dealing with and manipulating large amounts of discrete, stochastic, Bayesian, Boolean information. These are the foundations of Computer Science. Bioinformatics is just the tip of the iceberg.
SFA-24 Imagine the future of computational science by looking at today’s challenges l Consider the process of scientific simulation —software development —problem definition and simulation setup —data analysis and understanding l There has been no equivalent of Moore’s Law for how we develop our software l Increasingly complex simulations often require months to set up and months to analyze the results
SFA-25 Investment needed in several areas (illustrative, not exhaustive) l Multi-level methods for multi-scale problems l Rapid problem setup tools (mesh generation and discretization methods for complex geometries) l Flexible software frameworks and interoperable s/w components for rapid application development l Computer architectures & performance optimization l Information exploitation (data management, image analysis, info/data visualization, data mining) l Systems engineering to integrate simulation, sensors, and info analysis into a decision support capability l Discrete simulation (scenario planning) l Validation and Verification (coupling to experiments)
SFA-26 This workshop is about shaping CS&E programs for federal funding agencies l We should focus on how CSE can benefit the nation —enhancing national & homeland security —promoting economic vitality and energy security —improving human health l We need to emphasize the multi-disciplinary nature of CS&E and its track record in delivering! —distinguish ourselves from constituent disciplines —need to do a better job of getting the word out! l Think big: $250M, multi-agency initiative!
SFA-27 We have long-time and natural partners in the federal government l DOE has been long-time leader in CS&E —ASCI re-invigorated supercomputing —Office of Science is championing the cause with its successful SciDAC initiative l NSF has long invested in IT and CS, and is beginning to think more about CS&E l DHS has pressing needs for help in simulation and information fusion l NIH should be a bigger player than it is, but there are serious cultural obstacles
SFA-28 Computational Science Research and Education: Funding Considerations l Fellowship programs l Need for critical mass l Focus l Baseline support of sufficient duration is optimal
SFA-29 Thoughts on CSME programs l Need to teach the importance of working on teams —Rarely have a single PI —We need to recognize team efforts l Need more opportunities for students to solve “real” problems in a research environment l We need opportunities for everybody to learn new fields l Integration between agencies as well as integration across disciplines?
SFA-30 Thoughts on CSME research challenges l Biotechnology —Biophysical simulations —Data management —Stochastic dynamical systems l Nanoscience —Multiple scales (time and length) —Scalable algorithms for molecular systems —Optimization and predictability
SFA-31 Our Algorithms Run on Largest Platforms… (roadmap chart, CY ’97–’05: plan / develop / use) l Red (Sandia): 1+ Tflop / 0.5 TB l Blue (Los Alamos): 3+ Tflop / 1.5 TB l White (Livermore): 10+ Tflop / 4 TB l Then (Livermore): 30+ Tflop / 10 TB and 50+ Tflop / 25 TB l NNSA has a roadmap to go to 100 Tflop/s
SFA-32 Bringing the CS&E and Statistics Communities Together l Example: Inverse problems and validation for complex computer models l Barriers to closer association l Mechanisms for closer association
SFA-33 Barriers to Bringing the CS&E and Statistics Communities Together l To many disciplinary scientists — we are each ‘providers of tools they can use’ — we are indistinguishable quantitative experts l Program and project funding rarely encourage inclusion of both CS&E and statistical scientists. l Our traditional application areas generally differ —CS&E tradition: physical sciences and engineering —Statistics tradition: strongest – as the statistics discipline – in social sciences, medical sciences,… (This could be an organizational strength for the CS&E initiative, but is a barrier at the personal level.)
SFA-34 Mechanisms for Bringing the CS&E and Statistics Communities Together l Most important is simply to bring them together on interdisciplinary teams. l Institute programs (e.g., at SAMSI), for extended cooperation —joint workshops —joint working groups l Emphasize need for joint funding on interdisciplinary projects. l At Universities?
SFA-35 Research Challenges l Statistical computational research challenges: —MCMC development and implementation —data confidentiality and large contingency tables —dealing with large data sets –in real time –off-line —bioinformatics, gene regulation, protein folding, … —data mining —utilizing multiscale data —data fusion, data assimilation —graphical models/causal networks —open source software environments —visualization —many many more.
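MCMC, the first challenge in the list above, is worth a concrete sketch. A minimal random-walk Metropolis sampler on a toy one-dimensional target (the target, step scale, and chain length are all illustrative choices, not from the slides):

```python
import math, random

def metropolis(logp, x0, steps, scale=1.0, seed=0):
    """Random-walk Metropolis: samples from the density proportional to exp(logp)."""
    rng = random.Random(seed)
    x, lp = x0, logp(x0)
    out = []
    for _ in range(steps):
        xp = x + rng.gauss(0.0, scale)       # propose a local move
        lpp = logp(xp)
        if rng.random() < math.exp(min(0.0, lpp - lp)):  # accept/reject
            x, lp = xp, lpp
        out.append(x)
    return out

# Toy target: standard normal, known only up to its normalizing constant
draws = metropolis(lambda x: -0.5 * x * x, x0=0.0, steps=20000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
```

The point relevant to the next slide: each Metropolis step needs a fresh evaluation of the (unnormalized) density, so when that density involves running a simulation code, the statistical and simulation costs multiply.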
SFA-36 Research Challenges, Continued l Challenges in the synthesis of statistics and computer modeling: —Statistical analysis in non-linear situations can require thousands of model evaluations (e.g., using MCMC), so the ‘real’ computational problem is the product of two very intensive computational problems; this is needed for –designing effective evaluation experiments; –estimating unknown model parameters (inverse problem), with uncertainty evaluation; –assessing model bias and predictive capability of the model; –detecting inadequate model components.
SFA-37 Research Challenges, Continued —Simultaneous use of statistical and applied mathematical modeling is needed for –effective utilization of many types of data, such as –data that occurs at multiple scales; –data/models that are individual-specific. –replacing unresolvable determinism by stochastic or statistically modeled components (parameterization) This general area of validation of computer models should be a Grand Challenge.
SFA-38 Five Investment Models for CS&E to Prosper l Laboratory institutes (hosted at a lab) ICASE, ISCR (more details to come) l National institutes (hosted at a university) IMA, IPAM l Interdisciplinary centers ASCI Alliances, SciDAC ISICs, SCCM, TICAM, CAAM, … l CS&E fellowship programs CSGF, HPCF l Multi-agency funding (cyclical to be sure, but sometimes collaborative) DOD, DOE, NASA, NIH, NSF, …
SFA-39 CSE philosophy: Science is borne by people l Be “eyes and ears” for CSE by staying abreast of advances in computer and computational science l Be “hands and feet” for CSE by carrying those advances into the laboratory l Three principal means for packaging scientific ideas for transfer —papers —software —people l People are the most effective!
SFA-40 Need to pipeline people between the university and the laboratory l Universities ↔ Generic CSE Center (GCC) ↔ Lab programs (students, faculty, lab employees) —Faculty visit the GCC, bringing students —Most faculty return to the university, with lab priorities —A few faculty become lab employees —Some students become faculty, with lab priorities —Some students become lab employees
SFA-41 GCC sponsors and conducts meetings on timely topics for lab missions l Bay Area NA Day l Common Component Architecture l Copper Mountain Multigrid Conference l DOE Computational Science Graduate Fellows l Hybrid Particle-Mesh AMR Methods l Mining Scientific Datasets l Large-scale Nonlinear Problems l Overset Grids & Solution Technology l Programming ASCI White l Sensitivity and Uncertainty Quantification
SFA-42 A curricular challenge l CS&E majors without a CS undergrad need to learn to compute! l Prerequisite or co-requisite to becoming useful interns at a lab l Suggest a “bootcamp” year-long course introducing: —C/C++ and object-oriented program design —Data structures for scientific computing —Message passing (e.g., MPI) and multithreaded (e.g., OpenMP) programming —Scripting (e.g., Python) —Linux clustering —Scientific and performance visualization tools —Profiling and debugging tools l NYU’s sequence G /G is an example for CS
SFA-43 “Red skies at morning” l Difficult to get support for maintaining critical software infrastructure and “benchmarking” activities l Difficult to get support for hardware that is designed with computational science and engineering in mind l Difficult for pre-tenured faculty to find reward structures conducive to interdisciplinary efforts l Unclear how stable the market is for CS&E graduates at the entrance to a 5-year pipeline l Political necessity of creating new programs with each change of administration saps time and energy of managers and community
SFA-44 “Red skies at night” l DOE’s SciDAC model being recognized and propagated l NSF’s DMS budgets on a multi-year roll l SIAM SIAG-CSE attracting members from outside of traditional SIAM departments l CS&E programs beginning to exhibit “centripetal” potential in traditionally fragmented research universities e.g., SCCM’s “Advice” program l Computing at the large scale is weaning domain scientists from “Numerical Recipes” and MATLAB and creating thirst for core enabling technologies (NA, CS, Viz, …) l Cost effectiveness of computing, especially cluster computing, is putting a premium on graduate students who have CS&E skills
SFA-45 Opportunity: nanoscience modeling l Jul 2002 report to DOE l Proposes $5M/year theory and modeling initiative to accompany the existing $50M/year experimental initiative in nanoscience l Report lays out research in numerical algorithms and optimization methods on the critical path to progress in nanotechnology
SFA-46 Opportunity: integrated fusion modeling l Dec 2002 report to DOE l Currently DOE supports 52 codes in Fusion Energy Sciences l US contribution to ITER will “major” in simulation l Initiative proposes to use advanced computer science techniques and numerical algorithms to improve the US code base in magnetic fusion energy and allow codes to interoperate