DANSE: Distributed Data Analysis for Neutron Scattering Experiments. ARCS Software Project. Components and Data Streams. Towards a National Project. Brent Fultz, California Institute of Technology
ARCS Spectrometer. Figure labels: Moderator, Shutter, Guides, Choppers, Sample, Detector Array, Beamstop. Speaker notes: Introduce self. Point out picture. Familiarize you with the different components. Hope that you’ll be able to see the logic that went into some of the decisions.
Scientific Software for Neutron Scattering: Software enables science from an inelastic neutron spectrometer such as ARCS. Some experiments are impossible with present software. Many experiments produce better science with better software (optimized beamtime usage, experimental procedures tuned on the fly). New opportunities to connect to the theory of materials.
Hardware Project Schedule
Software Development
Software Project Schedule
Software Project Milestones: Software Baseline Design, Jan. 2003; Software First Build, July 2004; Software Beta Release, Mar. 2005; Software Release 1.0, Feb. 2006; End ARCS Project, Sept. 2006.
Software Roadmap v. 1.0
Data Reduction: Account for incident flux. Remove background. Convert from time to energy. Correct for detector efficiency. Bin into rings of constant scattering angle. Convert from angle to momentum. Subtract multiphonon and multiple scattering. Correct for absorption.
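Two of the reduction steps above, time to energy and angle to momentum, follow from standard neutron kinematics. The sketch below is illustrative only and is not the ARCS reduction code; the function names and the idea of passing a single flight path and flight time are assumptions made for the example.

```python
import math

# E [meV] = E_PER_V2 * v^2, with v in m/s (from E = m v^2 / 2 for a neutron).
E_PER_V2 = 5.227e-6
# E [meV] = E_PER_K2 * k^2, with k in inverse angstroms (from E = hbar^2 k^2 / 2m).
E_PER_K2 = 2.0721


def tof_to_energy(path_length_m, tof_s):
    """Neutron kinetic energy in meV from a flight path (m) and flight time (s)."""
    velocity = path_length_m / tof_s
    return E_PER_V2 * velocity * velocity


def momentum_transfer(e_initial, e_final, scattering_angle_deg):
    """Magnitude of Q in inverse angstroms from energies (meV) and scattering angle."""
    k_i = math.sqrt(e_initial / E_PER_K2)
    k_f = math.sqrt(e_final / E_PER_K2)
    phi = math.radians(scattering_angle_deg)
    # Law of cosines on the scattering triangle Q = k_i - k_f.
    q_squared = k_i ** 2 + k_f ** 2 - 2.0 * k_i * k_f * math.cos(phi)
    return math.sqrt(max(q_squared, 0.0))
```

For a chopper spectrometer the incident energy is set by the choppers, so only the final energy would come from the measured sample-to-detector flight time; the energy transfer is then the difference of the two.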
Architectural Coherence. Left: Saint-Sernin, Toulouse, Romanesque pilgrimage church, c. 1080-1120. Right: Notre-Dame, Amiens, French Gothic cathedral, begun 1220. Inconsistencies in architecture: in cathedrals, charming; in software, annoying.
Enforce Coherence: Write the Manual Now. Define needs and scope. Reference for developers. Identify and show interrelationships. Standardize notation. Documentation, "Experimental Inelastic Neutron Scattering": theory of neutron scattering; theory of excitations in condensed matter; data analysis procedures; software architecture; reference manual.
Key Concept: Data Analysis as a Web Service. Data analysis is a service controlled by the user. Computation is arranged by the web server. The user’s laptop issues commands and receives results.
Key Concept: Data Analysis as a Web Service. XML-RPC is an open standard for remote procedure calls. The user’s web browser issues commands to a server. The server distributes the work to appropriate computers.
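The XML-RPC round trip described above can be sketched with Python's standard library. This is a minimal illustration, not the DANSE server: `scale_counts` is a hypothetical analysis routine, and the HTTP transport is skipped by handing the request straight to the dispatcher through its underscore-prefixed helper `_marshaled_dispatch`. A real deployment would put the same dispatcher behind `xmlrpc.server.SimpleXMLRPCServer`.

```python
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCDispatcher


def scale_counts(counts, factor):
    """Hypothetical analysis routine that the server exposes to users."""
    return [c * factor for c in counts]


# Server side: register the routines that remote users may call.
dispatcher = SimpleXMLRPCDispatcher(allow_none=True)
dispatcher.register_function(scale_counts, "scale_counts")

# Client side: encode the command as an XML-RPC request document.
request_xml = xmlrpc.client.dumps(([1, 2, 3], 2), methodname="scale_counts")

# Server side: decode the request, run the routine, encode the reply.
response_xml = dispatcher._marshaled_dispatch(request_xml.encode("utf-8"))

# Client side: decode the reply back into Python values.
(result,), _ = xmlrpc.client.loads(response_xml)
```

Only the small request and reply documents cross the network; the data and the computation stay on the server.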
Early User Interface (Browser/Java): Users navigate the web site with the left frame (content is generated dynamically by the web server). Users select data and calculational tools, and add them to the Java applet on the right (like “LabVIEW”). Mention: Authorized users can continually add new transformations or data sets, and since the content is dynamically generated HTML, these new items are immediately available to all users.
The User Interface (Python/Viper/Cobra): The user “wires” the boxes together to represent data flows. We intend to allow users to insert new code in empty boxes, and to archive sessions and procedures.
Data Analysis Execution: The user hits “Run”. The GUI translates the wiring diagram into XML-RPC commands. The server receives the commands, arranges a Python script, and starts data processing. This allows us to have a central repository of data and computing resources which many users can share. The users themselves do not need powerful machines or fast internet connections to process their data.
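One way to picture the translation from a wiring diagram to an ordered list of XML-RPC commands is a topological sort of the boxes. The sketch below is an assumption about how such a translation could work, not the actual GUI code; the box names and the dictionary layout are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
import xmlrpc.client
from graphlib import TopologicalSorter

# Hypothetical wiring diagram: each box maps to the boxes that feed into it.
wiring = {
    "load_run": [],
    "time_to_energy": ["load_run"],
    "rebin": ["time_to_energy"],
    "plot": ["rebin"],
}


def execution_order(diagram):
    """Order the boxes so that every box runs after all of its inputs."""
    return list(TopologicalSorter(diagram).static_order())


def to_xmlrpc_calls(diagram):
    """Encode one XML-RPC request per box, passing its input names as the argument."""
    return [
        xmlrpc.client.dumps((diagram[name],), methodname=name)
        for name in execution_order(diagram)
    ]


calls = to_xmlrpc_calls(wiring)
```

The server would then decode each request in turn and assemble the corresponding steps of the processing script.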
Big Concept of a Web Service: The server can provide access to the best combination of hardware and software. Experimental data and analysis codes reside on the servers, so little data bandwidth is needed. Computing resources can be changed without affecting the user. Computation can be local or non-local. Clean separation of GUI from analysis code. One web portal for all neutron instruments(?)
The Bigger Concept Underneath. Components: pre-compiled Python objects, called and rearranged by the Python interpreter. Data Streams: standard communication protocol between components. Note: standard streams can connect components located anywhere…
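The component and data stream idea can be illustrated with plain Python generators. This is a toy sketch under stated assumptions: the `Component` base class, the two example components, and the `connect` helper are all hypothetical names, and the "stream" here is simply a Python iterator rather than a network protocol.

```python
class Component:
    """Hypothetical base class: consumes one data stream and yields another."""

    def process(self, stream):
        raise NotImplementedError


class SubtractBackground(Component):
    def __init__(self, background):
        self.background = background

    def process(self, stream):
        for value in stream:
            yield value - self.background


class Scale(Component):
    def __init__(self, factor):
        self.factor = factor

    def process(self, stream):
        for value in stream:
            yield value * self.factor


def connect(source, *components):
    """Chain components so each one reads the stream produced by the previous one."""
    stream = iter(source)
    for component in components:
        stream = component.process(stream)
    return stream


reduced = list(connect([10.0, 12.0, 11.0], SubtractBackground(1.0), Scale(2.0)))
```

Because every component speaks the same stream interface, the interpreter can rearrange them freely, and a component could in principle be replaced by a proxy for one running on another machine.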
Tools for Programmers Component Templates Standard Data Streams
Component
Rebinner
Levels of Code Development: For using existing scripts, the entry barrier is nearly zero. For altering existing Python scripts, the entry barrier is very low. For writing new Python code, the entry barrier is modest. Performance may be comparable to IDL or Matlab. Transition to high-performance compiled code: ARCS: writing Python bindings for C++. DANSE: component templates for C++, FORTRAN, Java?
Data Rebinning: Tim’s Test. Tested the t to E (time to energy) rebin computation in several forms: 300 seconds with IDL; 60 seconds with IDL using a DLL compiled from C++; 2 seconds with the C++ “Rebinner” class.
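For readers unfamiliar with rebinning, the operation being timed can be sketched in a few lines of Python. This is not the C++ “Rebinner” class and makes no claim about its algorithm; it is a simple overlap-weighted rebin that assumes counts are spread uniformly within each original bin and that both sets of bin edges are sorted in ascending order.

```python
def rebin(old_edges, counts, new_edges):
    """Redistribute histogram counts onto new bins in proportion to bin overlap."""
    new_counts = [0.0] * (len(new_edges) - 1)
    for i, count in enumerate(counts):
        low, high = old_edges[i], old_edges[i + 1]
        width = high - low
        for j in range(len(new_counts)):
            # Length of the intersection between old bin i and new bin j.
            overlap = min(high, new_edges[j + 1]) - max(low, new_edges[j])
            if overlap > 0:
                new_counts[j] += count * overlap / width
    return new_counts


coarse = rebin([0.0, 1.0, 2.0, 3.0, 4.0], [4.0, 4.0, 4.0, 4.0], [0.0, 2.0, 4.0])
uneven = rebin([0.0, 1.0, 2.0, 3.0, 4.0], [4.0, 4.0, 4.0, 4.0], [0.0, 1.5, 4.0])
```

For a time-to-energy rebin, the time bin edges would first be converted to energies and sorted before being passed in as `old_edges`. The double loop is written for clarity, not speed, which is the kind of inner loop that benefits from compiled code.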
GUI form/function
Extension Beyond ARCS. Science: large-scale structures; diffraction; engineering diffraction; inelastic scattering; theory of structure and dynamics; neutronics and nuclear physics. Engineering: software tools; hardware tools.
Management Structure. User Base (help with requirements and testing): Virtual Instrument Development Team; user opinions and testing. Executive Committee (direct DANSE): subproject leaders; meetings every 3 months, later every 6 months. Steering Committee (institutional issues): facilities directors; Caltech Provost’s Office.
Money Is Helpful. Software tools need development (people). Hardware for multiple users (people and machines). Software subprojects at facilities must have DANSE funding. Five-Year Budget: 5 subprojects, 3 FTE @ 0.8 M$/y: 20 M$. Central Resources, 7 FTE + hardware: 10 M$. Total: 30 M$.
Agenda. Morning: 8:30, Fultz: overview of the ARCS software project and its extension to other neutron science. 9:00, Lin and Kelley: demo and commentary on distributed data analysis. 9:20, Aivazis: overview of the DANSE architecture and how it fits into future computing infrastructures. 10:00, Dan Meiron (Vice-Provost for Computing): large-scale scientific computing at Caltech. Break. Discussion of topics such as: needs of facilities, needs of Caltech, needs of the DANSE project, management of the DANSE project, organization of subprojects, proposal. Lunch. Afternoon: further discussion of DANSE management and subproject organization, white paper revisions. Dinner: 7 PM, Café Santorini, Old Pasadena.
End of Presentation
Born-von Kármán Lattice Dynamics. Simplicity: [equation not recovered]. Complexity: undergrowth of indices for tensor quantities. Crystal structure: lattice, l; basis, k. Allowed elements depend on symmetry.
J. M. Ziman, Electrons and Phonons
Electronic Publishing. Diagram labels: library, documentation, user codes, server, instrument and data; three links marked “?”.
Software Survey: 10 responses from serious users, with very detailed comments.
User Ratings of Software Packages. UO: User Opinion (–2 to +2). SO: Strength of Opinion (0 to +3). N: Number of Respondents. Rating = UO*SO/N.
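The rating formula is compact enough that its exact reading is ambiguous. The sketch below assumes one plausible interpretation: each respondent's opinion is weighted by its strength, and the weighted opinions are averaged over the N respondents. The function name and the input layout are hypothetical, and the sample numbers in the usage line are invented for illustration, not survey data.

```python
def package_rating(responses):
    """Average of opinion times strength over respondents.

    responses: list of (user_opinion, strength_of_opinion) pairs,
    with opinion in -2..+2 and strength in 0..+3.
    """
    n = len(responses)
    if n == 0:
        return 0.0
    return sum(opinion * strength for opinion, strength in responses) / n


# Invented example: three respondents rating one package.
example = package_rating([(2, 3), (1, 1), (-1, 2)])
```

Under this reading a package scores highly only when users both like it and feel strongly about it; a lukewarm or split response pulls the rating toward zero.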
Software Survey Results. Interesting: CS: ISAW, Unisoft, Computational Crystallographic Toolkit. User: ISAW, DAVE, Mslice, McStas, Gulp, Chop. Problems: 1. Code quality and user opinion are not necessarily correlated (McStas is an exception). 2. We are still looking at details, but we can already see that there will be more rewriting than expected.
Issues in Code Development: X hours to develop a code for yourself. 3X hours to make it convenient for others. 9X hours to make it run for others on several platforms. Adding personnel to a late software project makes the project later (Brooks's Law).