A Data Analysis Framework for the Neutron Community Michael M. McKerns Materials Science and Applied Physics Center for Advanced Computing Research California.


Presentation transcript:

A Data Analysis Framework for the Neutron Community
Michael M. McKerns
Materials Science and Applied Physics, Center for Advanced Computing Research
California Institute of Technology
DANSE (Distributed Data Analysis for Neutron Scattering Experiments)

Serving a Growing Community
With OPAL, SNS, and J-PARC fast approaching availability, the neutron community has the potential to undergo a large growth spurt.
Software is a vital part of scattering research; unless the software is both robust and easy to use, that growth may be limited.
Mature packages do exist (McStas, ISAW, DAVE, …), and commercial packages (Matlab, IDL, Abaqus, IGOR Pro, …) are also used in the analysis process. However, groups often use cryptic legacy code for at least one step.
To grow as a community, we need:
–a way to cultivate and maintain the valuable portions of these legacy codes
–to make legacy and community-standard codes interoperable: define common data structures and interfaces; stop duplication of effort
–to allow scientists to concentrate more on science by lowering the barriers to software engineering

There is much to do…
Software is needed to support the massive quantity of data that will be produced at modern neutron facilities.
Existing software may be incapable of utilizing the full richness of the data that will be produced.
Although the barrier to developing new software must be reduced, it is also critical that more complex software technologies (e.g. high-performance and grid-based computing) are enabled.
Time is short: we must use the best existing tools to provide a robust solution, yet be flexible enough to allow for the easy substitution of better future solutions.

Software User Stereotypes I
Instrument Scientist
–author of prepackaged and specialized tools
–wants: portable building and debugging tools; large toolkit of robust modules and support code; rapid application development; GUI builder to compose interactive widgets, forms, and wizards; to focus on supporting the instrument, not writing software
Visiting Scientist
–user of prepackaged and specialized tools
–wants: UI that is simple to understand & easy to use; reasonable defaults for most choices; well diagnosed and explained error messages; intelligently concealed complexity

Software User Stereotypes II
Established Researcher
–coordinator/author/reviewer, designer of new applications
–wants: flexible UI that enables interactive exploration; access to a comprehensive set of data transformations; access to modeling and simulation packages; tools to compare outputs of different analyses; casually useable high-end graphics
Beginning Student
–user of tools and documentation as learning environment
–wants: well documented interface and modules; access to a set of standard applications; flexible UI that enables interactive exploration

Software User Stereotypes III
Analysis Expert
–author of analysis, modeling, or simulation software
–wants: portable building and debugging tools; large toolkit of robust modules and support code; easy access to sample data; to solve physics problems, not software engineering problems
Software Engineer
–binds software to common environment, extends software to the framework
–wants: portable building and debugging tools; large toolkit of robust modules and support code; well documented access to the software and framework integration layer; validation, verification, and regression testing
Framework Maintainer
–maintains and extends the software infrastructure

What is DANSE?
a $12M five-year NSF IMR-MIP software construction project
a collaborative effort between software professionals, neutron scattering scientists, and facilities
a software engineering effort
–open-source development environment
–framework for the interoperability of modular components
–integration of legacy codes and community-standard software
–connectivity to facility databases and software repositories
a scientific endeavor
–to develop software modules for different subfields of neutron scattering
–to enhance neutron scattering research and facilitate new science
–to build tools for education, collaboration, and plausibility assessment
an integration framework for building data analysis, visualization, modeling, and instrument simulation tools for all areas of neutron scattering

The Power of Python
The fundamental commodity for neutron scattering software is found within the cores of time-tested community-standard software. Rather than rewrite or duplicate this software, we can use Python to provide an integration path into a common language.
Python is
–a modern object-oriented language
–robust, portable, mature, well-supported, well-documented
–easily extendable
–well suited to rapid application development
Python scripting enables us to
–compose computations at runtime and discover capabilities without recompilation or relinking
–organize large numbers of user-tunable parameters
Binding Python to other languages (C++, Fortran, …) allows integration without measurable impact on performance or scalability
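As a rough illustration of composing computations at runtime, the sketch below looks functions up by name and chains them without any recompilation or relinking. The helper names `load_callable` and `compose` are hypothetical and are not part of DANSE or Pyre; the standard `math` module stands in for a compiled scientific library.

```python
import importlib


def load_callable(module_name, function_name):
    """Look up a function by name at runtime; nothing is recompiled or relinked."""
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


def compose(*steps):
    """Chain single-argument callables into one computation."""
    def pipeline(value):
        for step in steps:
            value = step(value)
        return value
    return pipeline


# In practice these names might come from a user script or a configuration file.
step_names = [("math", "sqrt"), ("math", "floor")]
pipeline = compose(*(load_callable(m, f) for m, f in step_names))

# Discover what a module offers without reading its source.
available = [name for name in dir(importlib.import_module("math"))
             if not name.startswith("_")]

result = pipeline(17.0)  # floor(sqrt(17.0))
print(result)
```

Because the steps are plain data (module and function names), a different chain can be assembled by editing the list rather than the program.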

Building a Scientific Toolkit
Through Python, DANSE will have access to many tools
–basic data structures, optimization algorithms, numerical libraries
–basic data reduction library [obtain I(Q), S(Q), S(E), S(Q,E)]
–graphical/plotting environments: IDL, Matlab, Matplotlib, Gnuplot, Grace, ParaView, ACIS (AutoCAD), …
–instrument simulation: McStas, VITESS, sample simulation framework, …
–materials simulation: ABINIT, VASP, GAMESS, NWChem, NAMD, CHARMM, …
–crystallography: cctbx, FOX, ObjCryst++, …
–molecular viewers and format translators: OpenBabel, Molden, PyMol, ViewMol, DRAWxtl, VMD, AtomEye, …
–and MORE! ISAW, texture analysis (MAUD), SLD calculator, scattering intensity, …

The Power of a Framework
While a single application can be built relatively quickly without using a framework, much effort will be spent on error handling, logging, UI construction, and other services.
A software framework provides
–a specification for organization of the software
–a description of the crucial structural elements and their interfaces
–a specification of the possible collaborations of these elements
–a strategy for the composition of new elements
–flexibility and robustness under evolutionary pressures
–services: life cycle management, logging and monitoring, network client and server support, authentication. These should not be rewritten for every application, but simply reused
A framework increases reusability & decreases the development time
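To make the idea of reused services concrete, here is a minimal sketch in which a base class owns life cycle management and logging, so the application author writes only the science-facing step. The `Application` and `SumCounts` classes are hypothetical illustrations built on the standard `logging` module; they are not Pyre's actual classes.

```python
import logging


class Application:
    """Hypothetical base class: owns the life cycle and logging so subclasses do not."""

    def __init__(self, name):
        self.name = name
        self.log = logging.getLogger(name)
        self.history = []  # record of life-cycle stages, useful for monitoring

    def run(self, *args):
        self._stage("init")
        try:
            result = self.main(*args)
            self._stage("main")
            return result
        except Exception:
            # Error handling is written once here, not in every application.
            self.log.exception("application %s failed", self.name)
            self._stage("error")
            raise
        finally:
            self._stage("fini")

    def _stage(self, label):
        self.history.append(label)
        self.log.info("%s: %s", self.name, label)

    def main(self, *args):
        raise NotImplementedError


class SumCounts(Application):
    """Application-specific code: only the analysis step is written here."""

    def main(self, counts):
        return sum(counts)


app = SumCounts("sum-counts")
total = app.run([3, 5, 8])
```

A second application would subclass `Application` in the same way and inherit the same stages and logging without repeating them.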

DANSE uses Pyre Framework
Pyre software architecture
–robust, stable, open-source foundation
–>75,000 lines of Python; 30,000 lines of C++
component-based runtime environment
–components are pre-compiled and connected by the user at runtime
–user directs component interconnections using visual, script-based, or shell programming
a set of co-operating abstract services
–framework provides structural girdle
–executive layer manages application life cycle
–applications built from modular components
–components tie software cores to data streams
–UI independent of underlying framework
[Figure labels: application-general, application-specific, framework, computational engines, Component, CORE]

Modularity of Components
granularity allows reusability of object-oriented components
modularity provides flexibility and extensibility
[Figure: rebinning application. Components: NeXusReader, Selector, Bckgrnd, Selector, Energy, NeXusWriter. Labels: times, instrument info, raw counts, filename, time interval, energy bins, filename]
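The rebinning chain named on this slide (a reader, a time selector, energy binning, a writer) might be sketched as small interchangeable stages. Everything below is a hypothetical stand-in: the data are in memory rather than read from a NeXus file, the function names are invented for illustration, and the background stage is omitted.

```python
def read_events():
    """Stand-in for a file reader: returns (time, energy, counts) tuples."""
    return [(0.5, 1.2, 4), (1.5, 2.7, 6), (2.5, 2.9, 3), (3.5, 4.1, 5), (9.0, 1.0, 7)]


def select_time(events, start, stop):
    """Keep events whose time lies in [start, stop)."""
    return [event for event in events if start <= event[0] < stop]


def rebin_energy(events, edges):
    """Sum counts into energy bins [edges[i], edges[i+1])."""
    histogram = [0] * (len(edges) - 1)
    for _, energy, counts in events:
        for i in range(len(edges) - 1):
            if edges[i] <= energy < edges[i + 1]:
                histogram[i] += counts
                break
    return histogram


def write_histogram(histogram, edges):
    """Stand-in for a file writer: returns one text line per bin."""
    return [f"{edges[i]:.1f}-{edges[i + 1]:.1f}: {n}" for i, n in enumerate(histogram)]


edges = [1.0, 2.0, 3.0, 5.0]
selected = select_time(read_events(), start=0.0, stop=4.0)
histogram = rebin_energy(selected, edges)
lines = write_histogram(histogram, edges)
print("\n".join(lines))
```

Since each stage only depends on the shape of the data it receives, any one of them can be replaced (for example, a different selector) without touching the others.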

Component Data Flow Paradigm
scientific analysis codes constitute the cores of software components
components mediate interaction between cores and environment
–inherit methods (such as message passing and error handling) from environment
–responsible for initialization of programs within their component core
–access centralized mechanism for logging status, errors, and history
–negotiate data exchanges with XML-based data exchange protocols
components utilize data streams to pass information between ports
–interact with executive layer to negotiate execution flow
–facilitate physical decoupling of computation among distributed resources
[Figure: two components, each labeled Component and CORE]
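One way to picture a component mediating between a core and its environment is the sketch below: the wrapper runs the core, logs through a shared mechanism, handles errors, and pushes results to whatever is connected to its output. The `Component` class and its `connect`/`receive` methods are hypothetical names for illustration, not Pyre's interface, and the XML-based negotiation mentioned on the slide is not modeled.

```python
import logging


class Component:
    """Hypothetical wrapper that mediates between a core routine and its environment."""

    def __init__(self, name, core):
        self.name = name
        self.core = core                    # the scientific code being wrapped
        self.log = logging.getLogger(name)  # shared logging mechanism
        self.downstream = []                # components connected to the output port

    def connect(self, other):
        self.downstream.append(other)
        return other                        # allows chained connect() calls

    def receive(self, data):
        """Input port: run the core, then push the result to connected components."""
        try:
            result = self.core(data)
        except Exception:
            self.log.exception("core of %s failed", self.name)
            raise
        self.log.info("%s processed %r", self.name, data)
        for component in self.downstream:
            component.receive(result)


collected = []
scale = Component("scale", lambda counts: [2 * c for c in counts])
total = Component("total", sum)
sink = Component("sink", collected.append)

scale.connect(total).connect(sink)
scale.receive([1, 2, 3])
print(collected)
```

The cores here (`sum`, a lambda, `list.append`) know nothing about logging or connections; that separation is the point of the wrapper.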

Component Implementation
build core engine (Python, Fortran, C++, Java, Matlab, IDL, …)
–legacy or custom code and third-party libraries
–provide life-cycle management and exception handling strategy
construct Python bindings
–select entry points to expose to Python
–modularize entry points to monolithic compiled libraries
cast as a component
–extend and leverage framework services
–describe user-configurable parameters
–provide meta-data that specify the IO port characteristics
test code
–satisfy functional requirements with concurrent test development
–utilize interactive runtime testing within Python interpreter
–demonstrate integration with other components
[Figure labels: Component, CORE]
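The later steps on this slide (cast as a component, describe parameters, provide port metadata, test alongside development) could look roughly like the following. The core is plain Python here, standing in for bound Fortran or C++ code; `BackgroundComponent`, its `properties` and `ports` tables, and the test function are all hypothetical and do not reflect Pyre's real declarations.

```python
def subtract_background(counts, level):
    """Core engine: in a real project this might be bound Fortran or C++ code."""
    return [max(c - level, 0) for c in counts]


class BackgroundComponent:
    """Hypothetical component that exposes one entry point of the core."""

    # User-configurable parameters, each with a type and a default.
    properties = {"level": {"type": int, "default": 1}}
    # Metadata describing what flows through each port.
    ports = {"input": "list of raw counts", "output": "list of corrected counts"}

    def __init__(self, **settings):
        self.settings = {}
        for name, spec in self.properties.items():
            value = settings.pop(name, spec["default"])
            self.settings[name] = spec["type"](value)  # coerce to the declared type
        if settings:
            raise ValueError(f"unknown properties: {sorted(settings)}")

    def __call__(self, counts):
        return subtract_background(counts, self.settings["level"])


def test_background_component():
    """Functional test written alongside the component."""
    assert BackgroundComponent()([3, 1, 0]) == [2, 0, 0]
    assert BackgroundComponent(level="2")([3, 1, 0]) == [1, 0, 0]


test_background_component()
```

The same component can also be exercised interactively in the Python interpreter, which matches the runtime testing step listed above.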

Building Abstract Applications
DANSE uses a design pattern that enables the assembly of components at runtime under user control
Facilities are named abstract application requirements
Components are concrete named engines that satisfy the requirements
Power of an API
–the application author provides: a specification of the application facilities as part of the application definition; a component to be used as the default
–the application user can construct scripts that create alternative components that comply with the facility interface
–the end user can: configure the properties of the component; select which component is to be bound to a given facility
Abstraction is required for dynamic and distributed applications
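The facility and component pattern can be sketched in a few lines. In this hypothetical example the application author declares a `normalizer` facility with a default component, and the end user rebinds it and sets a property at construction time. The class names and the `bindings`/`properties` arguments are assumptions made for illustration; Pyre's own declarations differ.

```python
class ByTotal:
    """Default component: scale the values so they sum to one."""

    def apply(self, data):
        total = sum(data)
        return [x / total for x in data]


class ByMax:
    """Alternative component with the same interface and one property."""

    def __init__(self):
        self.peak = 1.0  # user-configurable property

    def apply(self, data):
        largest = max(data)
        return [self.peak * x / largest for x in data]


class Facility:
    """A named requirement together with the component used by default."""

    def __init__(self, name, default):
        self.name = name
        self.default = default


class NormalizeApp:
    # Author's side: declare the requirement and its default component.
    facilities = [Facility("normalizer", default=ByTotal)]

    def __init__(self, bindings=None, properties=None):
        bindings = bindings or {}
        properties = properties or {}
        for facility in self.facilities:
            # User's side: pick the component, then configure its properties.
            component = bindings.get(facility.name, facility.default)()
            for key, value in properties.get(facility.name, {}).items():
                setattr(component, key, value)
            setattr(self, facility.name, component)

    def run(self, data):
        return self.normalizer.apply(data)


default_result = NormalizeApp().run([1, 1, 2])
custom_result = NormalizeApp(
    bindings={"normalizer": ByMax},
    properties={"normalizer": {"peak": 10.0}},
).run([1, 1, 2])
```

The application code in `run` never names a concrete component, which is what allows the binding to be decided at runtime.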

Visual Programming Interface
Workflow graphs are a naturally dynamic interface due to the correspondence between logical and physical descriptions of the computation.
There are multiple views of each computation
–data flow
–control flow
–deployment of distributed components
Should allow interactive editing of component state
–access to modify component properties
–dynamic interface generation from component-supplied specifications
[Figure: workflow graph. Components: NeXusReader, Selector, Bckgrnd, Selector, Energy, NeXusWriter. Labels: times, instrument info, raw counts, filename, time interval, energy bins, filename]
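Dynamic interface generation from a component-supplied specification can be illustrated with a command-line interface standing in for a graphical one. The specification format (`SELECTOR_SPEC`) and the `build_interface` helper are hypothetical; only the standard `argparse` module is real. A GUI builder could walk the same specification to produce form fields instead of options.

```python
import argparse

# Hypothetical specification that a component might supply about itself.
SELECTOR_SPEC = {
    "name": "selector",
    "properties": [
        {"name": "start", "type": float, "default": 0.0, "help": "start of time interval"},
        {"name": "stop", "type": float, "default": 1.0, "help": "end of time interval"},
        {"name": "label", "type": str, "default": "run", "help": "output label"},
    ],
}


def build_interface(spec):
    """Generate a command-line interface from the specification alone."""
    parser = argparse.ArgumentParser(prog=spec["name"])
    for prop in spec["properties"]:
        parser.add_argument(
            f"--{prop['name']}",
            type=prop["type"],
            default=prop["default"],
            help=prop["help"],
        )
    return parser


parser = build_interface(SELECTOR_SPEC)
# Simulate a user overriding one property and accepting the other defaults.
settings = vars(parser.parse_args(["--stop", "4.5"]))
print(settings)
```

Adding a property to the specification adds the corresponding option with no change to `build_interface`.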

Distributed/Parallel Computing
Enabled by design
–component framework utilizing data streams
–requirements for building distributed and parallel computations nearly the same as those for building applications in a visual programming interface
Pyre originally designed to compose and control parallel applications
–bindings to MPI
–encapsulation of Python interpreter in MPI
Enable distributed computing with currently available technologies
–initial authentication and deployment based on ssh & scp
–authentication and security using Pyre services
–access constrained to user space
Take advantage of Grid services as they become available…
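The sketch below shows the general shape of a data-parallel reduction: the same histogramming stage runs on separate chunks of data and the partial results are merged. It uses threads from the standard library purely as a stand-in; it does not use MPI, ssh, or Pyre, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def histogram(energies, edges):
    """Count how many energies fall in each bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for energy in energies:
        for i in range(len(edges) - 1):
            if edges[i] <= energy < edges[i + 1]:
                counts[i] += 1
                break
    return counts


def merge(partials):
    """Reduce step: add the partial histograms bin by bin."""
    return [sum(column) for column in zip(*partials)]


edges = [0.0, 1.0, 2.0, 3.0]
chunks = [[0.1, 0.2, 1.5], [2.5, 2.6], [0.9, 1.1, 2.9]]

# Each chunk is processed by its own worker; threads stand in for remote processes.
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(lambda chunk: histogram(chunk, edges), chunks))

parallel_total = merge(partials)
serial_total = histogram([e for chunk in chunks for e in chunk], edges)
print(parallel_total, serial_total)
```

The stage itself is unchanged between the serial and parallel runs; only the way chunks are dispatched and merged differs, which is the property the slide attributes to a stream-based component design.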

Broad Scientific Scope
data reduction and experiment simulation
–diffraction, engineering diffraction, and inelastic scattering data reduction
–SANS/USANS and neutron reflectometry data reduction
–instrument and microstructure simulation
modeling
–full profile modeling in real and reciprocal space (GSAS, FullProf, PDFFIT)
–finite element modeling (ABAQUS); self-consistent modeling
–constrained fitting by use of data from other experimental techniques
–1D/2D model fitting; model independent peak fitting
–direct modeling of physical systems; ab-initio modeling
–scattering kernel; multiple scattering
–neutron weight correction; separation of nuclear and spin scattering
–micromagnetic simulations (OOMMF); disordered spin dynamics
–chemical spectroscopy dynamics (CLIMAX)

Facilitates New & Better Science
better data analysis
–FEM calculations of strains in microstructures
–Monte-Carlo inversions of S(Q,E) to obtain parameters of structure and dynamics models
–model refinements with multiple data sets
integration of theory
–micromechanics using correlations of local strains
–phase diagrams from thermodynamic functions
–ab-initio calculations of spin interactions
–soft matter structure using atomic force fields guided by diffraction
experiment planning and execution
–single crystals on chopper spectrometers
–feedback control and real-time assessment
–plausibility testing and contingency planning
–assessment of science/data trends from previous data

Goals & Objectives
The goal of DANSE is to provide a community supported open-source software environment for scattering research that:
–integrates the basic data reduction, analysis, modeling, and simulation capabilities that are available today
–provides powerful new applications for data reduction, analysis, modeling, and simulation
–enables new types of science in all major subfields of neutron scattering research
–provides a coherent framework onto which software components can easily be added by scientists
–lowers the barrier to software development
–minimizes duplication of effort in the scattering software community
–decreases the time and effort in creating new software applications
–provides a certification and quality assurance process to aid with facility integration

DANSE Project Information
milestones for the DANSE software
–project start: 2006
–beta release: 2008
–release
–transition to community/SNS: 2010
documentation, tutorials, and further information
–the DANSE wiki at
–the Pyre homepage at
contacts
–Brent Fultz, Michael Aivazis, Ian
–Simon Billinge, Ersan Üstündag, Paul Butler, Paul Kienzle, Tom Swain
–Michael McKerns

End Presentation