2010-1844 C Software Life-cycle and Integration Issues for CS&E R&D Software and Experiences from Trilinos (Part I) Roscoe A. Bartlett http://www.cs.sandia.gov/~rabartl/

Slides:



Advertisements
Similar presentations
Page 1 APP + Trilinos Integration Status, Opportunities, and Challenges Roscoe A. Bartlett Department of Optimization.
Advertisements

PROC-1 3. Software Process. PROC-2 What’s a process? Set of activities in creating software It involves creativity –hard to automate –Requires human judgment.
Chapter 2 Modeling the Process and Life Cycle Shari L. Pfleeger
Alternate Software Development Methodologies
 The Rise of Computer Science ◦ Machine Language (1 st Gen) ◦ Assembly Language (2 nd Gen) ◦ Third Generation Languages (FORTRAN, BASIC, Java, C++, etc.)
Problem with Software Requirements are complex The client does not know the functional requirements in advance Requirements may be changing Technology.
An Agile View of Process
Overview of the TriBITS Lifecycle Model First Workshop on Maintainable Software Practices in e-Science (SoftwarePractice12) October 9, 2012 Dr. Roscoe.
Page 1 Software Life-cycle and Integration Issues for CS&E R&D Software and Experiences from Trilinos (Part I) Roscoe A. Bartlett
Page 1 The State of Trilinos Software Engineering Recent Progress, Current Status, and Future Issues Roscoe A. Bartlett
MS17 A Case Study on the Vertical Integration of Trilinos Solver Algorithms with a Production Application Code Organizer: Roscoe A. Bartlett Sandia National.
Chapter 3 – Agile Software Development Lecture 1 1Chapter 3 Agile software development.
Page 1 Trilinos Software Engineering Technologies and Integration Capability Area Overview Roscoe A. Bartlett Trilinos Software Engineering Technologies.
Page 1 Trilinos Software Engineering Technologies and Integration Capability Area Overview Roscoe A. Bartlett Department.
Trilinos Strategic (and Tactical) Planning Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United.
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear.
Page 1 Trilinos Software Engineering Status and Future Issues Roscoe A. Bartlett Department of Optimization & Uncertainty.
PyTrilinos: A Python Interface to Trilinos Bill Spotz Sandia National Laboratories Reproducible Research in Computational Geophysics August 31, 2006.
Page 1 Trilinos Software Engineering Technologies and Integration Capability Area Overview Roscoe A. Bartlett Department.
Page 1 Trilinos Software Engineering Technologies and Integration Numerical Algorithm Interoperability and Vertical Integration –Abstract Numerical Algorithms.
Page 1 Embedded Sensitivities and Optimization From Research to Applications Roscoe A. Bartlett Department of Optimization & Uncertainty Estimation Sandia.
Page 1 Software Life-cycle and Integration Issues for CS&E R&D Software and Experiences from Trilinos (Part II, Integration Issues) Roscoe A. Bartlett.
Page 1 Trilinos Release Improvement Issues Roscoe A. Bartlett Department of Optimization & Uncertainty Estimation Trilinos.
Strategic Goals: To align the many efforts at Sandia involved in developing software for the modeling and simulation of physical systems (mostly PDEs):
Rapid software development 1. Topics covered Agile methods Extreme programming Rapid application development Software prototyping 2.
Chapter 3 Agile Software Development (1/2) Yonsei University 2 nd Semester, 2015 Sanghyun Park.
Chapter 2 Software processes. Topics covered Software process models Process activities Coping with change.
Lecture 4 – XP and Agile 17/9/15. Plan-driven and agile development Plan-driven development A plan-driven approach to software engineering is based around.
Page 1 Almost Continuous Integration for the Co-Development of Highly Integrated Applications and Third Party Libraries Roscoe A. Bartlett
Trilinos Strategic (and Tactical) Planning Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United.
SOFTWARE ENGINEERING. Objectives Have a basic understanding of the origins of Software development, in particular the problems faced in the Software Crisis.
Requirements Engineering Requirements Engineering in Agile Methods Lecture-28.
Page 1 CMake Trilinos? Roscoe A. Bartlett Department of Optimization & Uncertainty Estimation Esteban J. Guillen Department.
Photos placed in horizontal position with even amount of white space between photos and header Sandia National Laboratories is a multi-program laboratory.
Modelling the Process and Life Cycle. The Meaning of Process A process: a series of steps involving activities, constrains, and resources that produce.
Page 1 Integration Strategies for Computational Science & Engineering Software Roscoe A. Bartlett Department of Optimization.
1Managed by UT-Battelle for the U.S. Department of Energy Roadmap for Sustainable CSE Ecosystems A Roadmap for Sustainable Ecosystems of CSE Software Roscoe.
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 17 Slide 1 Rapid software development.
Embedded Systems Software Engineering
Software engineering (cs3529) target classes: cs14: A,B
CS223: Software Engineering
Process 4 Hours.
DOE/Office of Science/ASCR (Sandia National Laboratories)
Methodologies and Algorithms
Appendix B Agile Methodologies
Lecture 3 Prescriptive Process Models
Introduction Edited by Enas Naffar using the following textbooks: - A concise introduction to Software Engineering - Software Engineering for students-
Software & Software Engineering Pertemuan-4 Dosen :Kundang K Juman
Chapter 18 Maintaining Information Systems
Software Development methodologies
Software development life cycle models
Software Life Cycle Models
Prescriptive Process Models
Introduction to Software Engineering
Introduction Edited by Enas Naffar using the following textbooks: - A concise introduction to Software Engineering - Software Engineering for students-
Trilinos Software Engineering Technologies and Integration
Chapter 2 Modeling the Process and Life Cycle Shari L. Pfleeger Joanne M. Atlee 4th Edition.
Agile Process: Overview
P Almost Continuous Integration for the Co-Development of Highly Integrated Applications and Third Party Libraries Roscoe A. Bartlett
Chapter 3 – Agile Software Development
Baisc Of Software Testing
Software Processes Process should be
Chapter 8 Software Evolution.
Appendix B Agile Methodologies
Extreme Programming.
Rapid software development
Embedded Nonlinear Analysis Tools Capability Area
System Development Methods
Presentation transcript:

2010-1844 C Software Life-cycle and Integration Issues for CS&E R&D Software and Experiences from Trilinos (Part I) Roscoe A. Bartlett http://www.cs.sandia.gov/~rabartl/ Department of Optimization & Uncertainty Estimation Trilinos Software Engineering Technologies and Integration Lead Sandia National Laboratories SIAM Parallel Processing 2010, 2/24/2010 Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000. Page 1

Overview of CS&E Software Engineering Challenge Progress in Computational Science and Engineering (CS&E) is occurring due to greater numbers of more complex algorithms and methods Discretization: a) meshing, b) advanced discretizations, c) adaptively, … Parallelization: a) parallel support, b) load balancing, … General numerics: a) automatic differentiation, … Solvers: a) linear-algebra, b) linear solvers, c) preconditioners, d) nonlinear solvers, e) time integration, … Analysis capabilities: a) error-estimation, b) stability analysis and bifurcation, c) optimization, d) UQ, … New architectures: a) multi-core, b) GPUs, … Visualization ... Each technology requires specialized PhD-level expertise Almost all technologies need to be integrated into single applications Set of algorithms/software is too large for any single organization to create Too large to be developed under single blanket of Continuous Integration (CI) Software Engineering and Software Integration are key bottlenecks for CS&E to have the fullest impact! Page 2

Overview of Trilinos

Evolving Trilinos Solution Trilinos1 is an evolving framework to address these challenges: Follow a TOOLKIT approach. Fundamental atomic unit is a package. Includes core set of vector, graph and matrix classes (Epetra/Tpetra packages). Provides a common abstract solver API (Thyra package). Provides a ready-made package infrastructure: Source code management (git [New]). Build tools (CMake [New]). Automated regression testing (CTest/CDash [New]). Communication tools (Mailman mail lists). Specifies requirements and suggested practices for package SQA. In general allows us to categorize efforts: Efforts best done at the Trilinos framework level (useful to most or all packages). Efforts best done at a package level (peculiar or important to a package). Allows package developers to focus only on things that are unique to their package. 1. Trilinos loose translation: “A string of pearls” 4

Trilinos Package Summary Objective Package(s) Discretizations Meshing & Spatial Discretizations phdMesh, Intrepid, Phalanx, Shards, Pamgen, Sundance, FEI, STK, Mesquite Time Integration Rythmos Methods Automatic Differentiation Sacado Mortar Methods Moertel Core Linear algebra objects Epetra, Jpetra, Tpetra Abstract interfaces Thyra, Stratimikos, RTOp Load Balancing Zoltan, Isorropia “Skins” PyTrilinos, WebTrilinos, ForTrilinos, Ctrilinos C++ utilities, I/O, thread API Teuchos, EpetraExt, Kokkos, Triutils, TPI, Optika Solvers Iterative (Krylov) linear solvers AztecOO, Belos, Komplex, Lyno Direct sparse linear solvers Amesos Direct dense linear solvers Epetra, Teuchos, Pliris Iterative eigenvalue solvers Anasazi ILU-type preconditioners AztecOO, IFPACK, Tifpack Multilevel preconditioners ML, CLAPS Block preconditioners Meros, Teko Nonlinear system solvers NOX, LOCA, Stalix Optimization MOOCHO, Aristos, GlobiPack, OptiPack, TriKota, Piro Stochastic PDEs Stokhos, Stalix Number of Trilinos Packages: Current = 57 Growth = 5-10 new packages per year! Number of Trilinos External Third Partly Libraries (TPLs) = 42 5

Trilinos Strategic Goals Scalable Computations: As problem size and processor counts increase, the cost of the computation will remain nearly fixed.   Hardened Computations: Never fail unless problem essentially intractable, in which case we diagnose and inform the user why the problem fails and provide a reliable measure of error. Full Vertical Coverage: Provide leading edge enabling technologies through the entire technical application software stack: from problem construction, solution, analysis and optimization. Grand Universal Interoperability: All Trilinos packages will be interoperable, so that any combination of solver packages that makes sense algorithmically will be possible within Trilinos.  Universal Accessibility: All Trilinos capabilities will be available to users of major computing environments: C++, Fortran, Python and the Web, and from the desktop to the latest scalable systems. Universal Solver RAS: Trilinos will be: Reliable: Leading edge hardened, scalable solutions for each of these applications Available: Integrated into every major application at Sandia Serviceable: Easy to maintain and upgrade within the application environment. Algorithmic Goals Software Goals 6

Automatic Intra-Package Dependency Handling http://trilinos.sandia.gov/packages/interoperability.html Page 7

CS&E Environment at Sandia National Labs for Trilinos SIERRA (APP) Largest and most complex Alegra (APP) TPL: Third Party Lib Provides functionality to multiple APPs The “Supplier” to the APP APP: Application Delivers end user functionality The “Customer” of the TPL Charon (APP) Trilinos (TPL) Xyce (APP) Titan/VTK (APP) Aleph (APP) Tramonto (APP) ... Sophisticated CS&E applications Discretized PDEs (SIERRA, Alegra, Aleph, Charon) Circuit network models (Xyce) Other types of calculations (Titian/VTK, Tramonto) (Massively) parallel MPI (Gordon Bell Winners) Almost entirely developed by non-software people Wide range of research to production (i.e. from Aleph to SIERRA) Page 8

Vision for a Confederation of CS&E Software?

The Vision: A Confederation of CS&E Codes? Develop a confederation of trusted, high-quality, reusable, compatible, software packages/components including capabilities for: Discretization: a) meshing, b) advanced discretizations, c) adaptively, … Parallelization: a) parallel support, b) load balancing, … General numerics: a) automatic differentiation, … Solvers: a) linear-algebra, b) linear solvers, c) preconditioners, d) nonlinear solvers, e) time integration, … Analysis capabilities: a) error-estimation, b) stability analysis and bifurcation, c) optimization, d) UQ, … New architectures: a) multi-core, b) GPUs, … Visualization ... APP1 Pkg A Pkg B CS&E Confederation Pkg A Pkg B Pkg C ... APP2 Pkg X Pkg Y Pkg Z Pkg A Pkg B Pkg C ... Pkg X Pkg Y Trilinos itself is a smaller example of this! Page 10

Challenges for the Reuse of External CSE Software Many CS&E organizations and individuals are adverse to using externally developed CS&E software! Using externally developed software can be as risk! External software can be hard to learn External software may not do what you need Upgrades of external software can be risky: Breaks in backward compatibility? Regressions in capability? External software may not be well supported External software may not be support over long term What can reduce the risk of depending on external software? Develop and maintain strong organizational relationships Long term commitment and support (i.e. 10-30 years) Apply strong software engineering processes and practices (high quality, low defects, frequent releases, regulated backward compatibility) 11

Overview of Modern Lean/Agile Software Engineering

Defined: Life-Cycle, Agile and Lean Software Life-Cycle: The processes and practices used to design, develop, deliver and ultimately discontinue a software product or suite of software products. Example life-cycle models: Waterfall, Spiral, Evolutionally Prototype, Agile, … Agile Software Engineering Methods: Agile Manifesto (2001) (Capital ‘A’ in Agile) Founded on long standing wisdom in SE community (40+ years) Push back against heavy plan-driven methods (CMM(I)) Focus on incremental design, development, and delivery (i.e. software life-cycle) Close customer focus and interaction and constant feedback Example methods: SCRUM, XP (extreme programming) Becoming a dominate software engineering approach Lean Software Engineering Methods: Adapted from Lean manufacturing approaches (e.g. the Toyota Production System). Focus on optimizing the value chain, small batch sizes, minimize cycle time, automate repetitive tasks, … Hard to distinguish between Lean and Agile … References: http://www.cs.sandia.gov/~rabartl/readingList.html Page 13

Relevant Lean/Agile Software Engineering Principles Agile Design: Reusable software is best designed and developed by incrementally attempting to reuse with new clients and incrementally redesigning and refactoring the software as needed keeping it simple. Technical debt in the code is management through continuous incremental (re)design and refactoring. Agile Quality: High quality defect-free software is most effectively developed by not putting defects into the software in the first place. High quality software is best developed collaboratively (e.g. pair programming and code reviews). Software is fully verified before it is even written (i.e. Test Driven Development for system verification and unit tests). High quality software is developed in small increments and with sufficient testing in between sets of changes. Agile Integration: Software needs to be integrated early and often Agile Delivery: Software should be delivered to real (or as real as we can make them) customers is short (fixed) intervals. References: http://www.cs.sandia.gov/~rabartl/readingList.html Page 14

Lean/Agile Methods: Development Stability Common Approach NOT AGILE! Code instability or #defects Regression! Time Release X Branch for Release X+1 Release X+1 Problems Cost of fixing defects increases the longer they exist in the code Difficult to sustain development productivity Broken code begets broken code (i.e. broken window phenomenon) Long time between branch and release Difficult to merge changes back into main development branch Temptation to add “features” to the release branch before a release High risk of creating a regression Page 15

Lean/Agile Methods: Development Stability The Agile way! Code instability or #defects Time Release X Branch for Release X+1 Release X+1 Advantages Defects are kept out of the code in the first place Code is kept in a near releasable state at all times Shorten time needed to put out a release Allow for more frequent releases Reduce risk of creating regressions Decrease overall development cost Page 16

Realizing a Confederation of CS&E Software

The Vision: A Confederation of CS&E Codes? Develop a confederation of trusted, high-quality, reusable, compatible, software packages/components including capabilities for: Discretization: a) meshing, b) advanced discretizations, c) adaptively, … Parallelization: a) parallel support, b) load balancing, … General numerics: a) automatic differentiation, … Solvers: a) linear-algebra, b) linear solvers, c) preconditioners, d) nonlinear solvers, e) time integration, … Analysis capabilities: a) error-estimation, b) stability analysis and bifurcation, c) optimization, d) UQ, … New architectures: a) multi-core, b) GPUs, … Visualization ... APP1 Pkg A Pkg B CS&E Confederation Pkg A Pkg B Pkg C ... APP2 Pkg X Pkg Y Pkg Z Pkg A Pkg B Pkg C ... Pkg X Pkg Y Trilinos itself is a smaller example of this! Page 18

Requirements/Challenges for Confederation of CS&E Codes Software quality and usability => Design, testing, collaborative development Building the software in a consistent way and linking => Common build approach? Reusability and interoperability of software components => Incremental Agile design, resource management, … Documentation, tutorials, user comprehension => SE education, better documentation and examples Critical new functionality development => Closer development and integration models Upgrading compatible versions of software => Frequent fixed-time releases Safe upgrades of software => Regulated backward compatibility, software quality Long term maintenance and support = > Stable organizations, stable projects, stable staff Self-sustaining software (clean design, clean implementation, well tested with unit tests and system verification tests) => Anyone can maintain it! The Trilinos is taking (baby) steps to address all of these issues at some level.

Regulated Backward Compatibility

Backward Compatibility Considerations Backward compatibility is critical for: Safe upgrades of new releases Composability and compatibility of different software collections Page 21

Example of the Need for Backward Compatibility SIERRA Y+1 (released against Trilinos SIERRA Y+1) Xyce J+1 (released against Trilinos X) VTK M+1 (released against Trilinos X+1) Trilinos SIERRA Y+1? Multiple releases of Trilinos presents a possible problem with complex applications Solution: => Provide sufficient backward compatibility of Trilinos X through Trilinos SIERRA Y+1 Page 22

Backward Compatibility Considerations Backward compatibility is critical for: Safe upgrades of new releases Composability and compatibility of different software collections Maintaining backward compatibility for all time has downsides: Testing/proving backward compatibility is expensive and costly Encourages not changing (refactoring) existing interfaces etc. => Leads to software “entropy” which kills a software product A compromise: Regulated backward compatibility (Trilinos approach) Maintain a window of “sufficient” backward compatibility over major version numbers (e.g. 1-2 years) Provide “Deprecated” compiler warnings Example: GCC’s __deprecated__ attribute enabled with –DTrilinos_SHOW_DEPRCATED_WARNINGS:BOOL=ON Provide strong automated testing of Trilinos backward compatibility Drop backward compatibility between major version numbers Page 23

Regulated Backward Compatibility in Trilinos Trilinos Version Numbering X.Y.Z: X: Defines backward compatibility set of releases Y: Major release (off the master branch) number in backward compatible set Z: Minor releases off the release branch X.Y Y and Z: Even numbers = release, odd numbers = dev Makes logic with Trilinos_version.h easier Backward comparability between releases Example: Trilinos10.6 is backward compatible with 10.0 through 10.4 Example: Trilinos 11.X is not compatible with Trilinos 10.Y Maintain backward compatibility of 11.0 with only 10.3 but drop all other deprecated code! 10.2 10.4 10.6 11.2 11.4 11.5 (Dev) 10.0 (Sept 2009) 11.0 (Sept 2011) Test backward compatibility of Dev with current release every night! Example: Major Trilinos versions change every 2 years with 2 releases per year Page 24

Summary

CS&E Software Engineering Challenge Progress in Computational Science and Engineering (CS&E) is occurring due to greater numbers of more complex algorithms and methods Discretization: a) meshing, b) advanced discretizations, c) adaptively, … Parallelization: a) parallel support, b) load balancing, … General numerics: a) automatic differentiation, … Solvers: a) linear-algebra, b) linear solvers, c) preconditioners, d) nonlinear solvers, e) time integration, … Analysis capabilities: a) error-estimation, b) stability analysis and bifurcation, c) optimization, d) UQ, … New architectures: a) multi-core, b) GPUs, … Visualization ... Each technology requires specialized PhD-level expertise Almost all technologies need to be integrated into single applications Set of algorithms/software is too large for any single organization to create Too large to be developed under single blanket of Continuous Integration (CI) Software Engineering and Software Integration are key bottlenecks for CS&E to have the fullest impact! Page 26

Requirements/Challenges for Confederation of CS&E Codes Software quality and usability => Design, testing, collaborative development Building the software in a consistent way and linking => Common build approach? Reusability and interoperability of software components => Incremental Agile design, resource management, … Documentation, tutorials, user comprehension => SE education, better documentation and examples Critical new functionality development => Closer development and integration models Upgrading compatible versions of software => Frequent fixed-time releases Safe upgrades of software => Regulated backward compatibility, software quality Long term maintenance and support = > Stable organizations, stable projects, stable staff Self-sustaining software (clean design, well tested with unit tests and system verification tests) => Anyone can maintain it! See Part II, Integration Issues at 10:50 AM Possible topics for Round Table Discussion at 6:00 PM

THE END