Analysis Environment Challenges
Lassi A. Tuura, Northeastern University, Boston
ACAT 2000, FNAL, October 2000

Presentation transcript:


Slide 2: What Is An Analysis Environment?

- Physics analysis is to a large degree an iterative process of
  - Reducing data samples to more interesting subsets
  - Distilling the sample into information at a higher abstraction level
    - By summarising lower-level information
    - By calculating statistical entities from the samples
- [Diagram labels: Experiment, Reduce, Distill, Interpret]
- A large part of the work can be done on very high-level entities in an interactive analysis and presentation tool
  - Hence the focus on tools that work on simple summary information (DSTs, N-tuples, tag databases, ...)
  - Additional tools for detector and event visualisation

Slide 3: So What Is An Analysis Environment?

- Analysis involves a lot more than just the interactive tool
- Learn from the “PAW revolution”
  - N-tuples provided new, more powerful ways to work with the data
  - New user interface
- The move towards closer integration with data continues
  - We can do much more and better than just an N-tuple today
  - Examples: ROOT added trees, CMS uses a full-blown object model
- Experiments are making big jumps in data accessibility
  - Exploiting widely used, very powerful object models, not just data
  - New levels of automation and integration are becoming available for networks, distributed computing and mass-storage systems
  - User interfaces to these new data models need to catch up!

=> The analysis environments will need considerable links with the rest of the experiment’s computing and software infrastructure

Slide 4: The Challenge

- Beyond the interactive analysis tool
  - Data analysis & presentation: N-tuples, histograms, fitting, plotting, …
- A great range of other user activities with fuzzy boundaries
  - Batch
  - Interactive, from “pointy-clicky” to Emacs-like power tool to scripting
  - Setting up configuration management tools, application frameworks and reconstruction packages
  - Data store operations: replicating entire data stores; copying runs, events, event parts between stores; not just copying but also doing something more complicated: filtering, reconstruction, analysis, …
  - Browsing data stores down to object detail level
  - 2D and 3D visualisation
  - Moving code across final analysis, reconstruction and triggers

Today this involves (too) many tools

Slide 5: Example: Distributing Your Data Store

- Problem: replicating and sharing your experiment’s data in full or in part for various analysis tasks and GRID
- Tools exist but...
  - Do I understand my experiment’s world-wide configurations well enough to use the tools confidently?
  - How do I find the data store nearest me in the first place?
  - If I want a private working store that shares the experiment data at the same time, what should I do?
  - What if I do not want just a plain file copy, but only a copy of the reconstructed calorimeter data from a certain sample that includes events in tens of files?
  - What if I want to share my analysis settings and results with a colleague for verification?
- Enquiring minds want to know!

Slide 6: What Do We Need?

- A uniform integrated interface to the whole task range (within reasonable limits)? A tool suite or a work bench?
- Wizards for common tasks to guide us through the choices, to give sensible defaults and to explain the terminology?
- Some ideas that might prove helpful
  - Showing the data store or parts of it as a directory
  - Conceptual “home directory” in the data store
  - Make it easy to put stuff related to your analyses under your “home directory” (framework and reconstruction setups, parameters etc.)
  - Make it easy to access analysis setups and results of different groups
    - Keep track of configurations, input and output data selections, …
    - A “desktop” where you can have shortcuts/links
    - Standard shortcuts for common stuff

One size never fits all: the tools need to adapt!

Slide 7: Concepts In Today’s Apps (IGUANA prototype)

[Screenshot caption: Extrapolate these to a data store…]

Slide 8: Concepts In Today’s Apps…

[Screenshot annotations:]
- Command-line interface that reflects actions in other windows
- Visualisation window
- Plus of course batch mode without pointy-clicky!

Slide 9: How To Get There?

Few can afford to develop a new interactive analysis tool, let alone coherent tools for the entire range of analysis tasks!

- Divide, conquer and co-operate
  - Divide the problem into categories, such as GUI, event and detector visualisation, and data analysis and presentation
  - We need to share: use existing modules in each category where possible; write your own only where nothing suitable exists (and don’t get attached to code, ditch it when something better is available!)
  - Integrate the lot into a user-friendly and productive environment
  - Make applications by choosing from the module pool; experiments could construct their own specific environments with customisation
- For this to work, the pool should be truly modular
  - Need to take into account all dependencies, not just the obvious ones
  - Need to think what it would take to test all the features provided by each component; those form its immediate dependencies
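
The idea of building applications by choosing from a module pool could be sketched roughly as follows. This is only an illustration under stated assumptions: `ModulePool`, its methods, and every module name below are hypothetical and do not come from IGUANA or any other package named in the talk.

```python
from typing import Callable, Dict, List


class ModulePool:
    """Hypothetical pool of interchangeable modules, grouped by category."""

    def __init__(self) -> None:
        # category -> module name -> factory that creates the module
        self._factories: Dict[str, Dict[str, Callable[[], object]]] = {}

    def register(self, category: str, name: str, factory: Callable[[], object]) -> None:
        self._factories.setdefault(category, {})[name] = factory

    def available(self, category: str) -> List[str]:
        """List what already exists, so one can check before writing new code."""
        return sorted(self._factories.get(category, {}))

    def build(self, choices: Dict[str, str]) -> Dict[str, object]:
        """Assemble an environment from one chosen module per category."""
        environment: Dict[str, object] = {}
        for category, name in choices.items():
            try:
                factory = self._factories[category][name]
            except KeyError:
                raise LookupError(f"no module {name!r} in category {category!r}") from None
            environment[category] = factory()
        return environment


# Illustrative pool contents (strings stand in for real components).
pool = ModulePool()
pool.register("gui", "shared_shell", lambda: "shared GUI shell")
pool.register("visualisation", "shared_3d", lambda: "shared 3D viewer")
pool.register("visualisation", "own_event_display", lambda: "experiment-specific event display")
pool.register("analysis", "shared_histogrammer", lambda: "shared histogramming module")

# An experiment picks shared modules where they fit and its own where they do not.
environment = pool.build({
    "gui": "shared_shell",
    "visualisation": "own_event_display",
    "analysis": "shared_histogrammer",
})
```

Because each module is reached only through its category and name, replacing `own_event_display` with something better later is a one-line change in the choices, which is the "don't get attached to code" point above.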

Slide 10: What Kind of an Architecture?

- Modular where it matters
  - Model-View-Controller and the like work to partition the domain
  - Layer to keep front-ends and back-ends separate
  - Ensure a standard for visual components to facilitate integration
- Interfaces for data access
- Narrow interfaces to link the analysis and visualisation sub-framework to the core framework
- Not everything needs an abstract interface!
  - It may be better to make a strategic choice to use a particular product if it can be contained and completely replaced in 6-9 months
  - Example: Use OpenInventor instead of inventing your own 3D API
- We need to assess and bound the risks, not aim for total safety!

Slide 11: More About Interfaces

- Example: selecting events using high-level summary data
  - Pick your favourite name for the same concept: Tags, N-tuples, DSTs, B-tree indices…
  - N-tuple was both an access paradigm and a storage method
  - Historical emphasis was on storage format
- Shift the emphasis to an access and query interface
  - Can provide the look and feel for a proven access method (N-tuple) with natural modern extensions
  - Implementation behind the interface may vary
    - Data may already be cached or accessed from deep in the event
    - May exploit advanced indexing and retrieval
    - May involve computation on demand
    - May even be necessary to read from tape
  - Other interfaces can provide access to underlying features
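
A minimal sketch of what such an access and query interface might look like is given below, assuming a row-per-event summary model. The class names (`SummarySource`, `CachedNtuple`, `OnDemandSource`), the column names and the toy data are all hypothetical and chosen only for illustration; the slides do not define a concrete API.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, float]  # one event's summary quantities, keyed by column name


class SummarySource(ABC):
    """Hypothetical narrow interface: N-tuple-like access, storage left open."""

    @abstractmethod
    def rows(self) -> Iterator[Row]:
        """Yield one summary row per event."""

    def select(self, cut: Callable[[Row], bool]) -> List[Row]:
        """Query: return the rows that pass the cut."""
        return [row for row in self.rows() if cut(row)]


class CachedNtuple(SummarySource):
    """Implementation where the summary data is already cached in memory."""

    def __init__(self, table: List[Row]) -> None:
        self._table = table

    def rows(self) -> Iterator[Row]:
        return iter(self._table)


class OnDemandSource(SummarySource):
    """Implementation that computes the summary from raw events when asked."""

    def __init__(self, events: Iterable[List[float]],
                 summarise: Callable[[List[float]], Row]) -> None:
        self._events = events
        self._summarise = summarise

    def rows(self) -> Iterator[Row]:
        for event in self._events:
            yield self._summarise(event)


# The same user-level query runs unchanged against both implementations.
high_pt = lambda row: row["pt"] > 30.0

cached = CachedNtuple([{"pt": 12.0, "eta": 0.5}, {"pt": 45.0, "eta": 1.2}])
raw_events = [[5.0, 7.0], [20.0, 25.0]]  # toy "hits" per event
on_demand = OnDemandSource(raw_events, lambda hits: {"pt": sum(hits), "eta": 0.0})

selected_cached = cached.select(high_pt)
selected_on_demand = on_demand.select(high_pt)
```

The user code only sees `select`; whether rows come from a cache, an index, a computation or tape is a detail of the implementation behind the interface.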

Slide 12: Summary

- Analysis environment includes a lot more than just the interactive data analysis and presentation tools
- As experiment complexity grows we need
  - To be able to drill down to and interact with data in many new ways
  - A good solid user interface for the whole range of tasks, all the way from batch mode operation to the quick pointy-clicky jobs
- Building all this from scratch is neither affordable nor wise
  - Exploit existing components: HEP, open source or commercial
  - Components need clearly defined responsibilities: a mission statement
  - Abstract interfaces are useful means to
    - Help people co-operate and not disturb each other too much
    - Provide hooks for all the cool new stuff we will see
    - Layer and partition the problem domain
    - Bound risks should a technology or a component fail


Slide 14: Some Architecture Ideas

- Three-tier architecture
  1. Application model (framework, reconstruction, simulation …)
  2. Specific ways of looking at objects (3D, 2D, hierarchical browser, object inspector, fitter…)
  3. Representation tier to tie the above two together
  - Dynamically load and integrate required bits together
  - (MV)²C: the representation is a view of the application model, but a model to the visualiser
  - Possible interesting result: scripting becomes “yet another view” and does not require special treatment or privilege
- A host of wizards
  - Coherent, good human interface
  - Easily adapted and expanded to new tasks
  - Should be able to leave behind scripts or other batch mode food
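
The three tiers could be sketched as below, as one possible reading of the slide rather than the actual IGUANA design. `Track`, `TrackRep`, `browser_view` and `script_view` are hypothetical names introduced for this example only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Track:
    """Tier 1, application model: knows nothing about any viewer."""
    pt: float
    charge: int


class TrackRep:
    """Tier 3, representation: a view of the Track, and the model the viewers see."""

    def __init__(self, track: Track) -> None:
        self._track = track

    def label(self) -> str:
        return f"Track pt={self._track.pt:.1f}"

    def properties(self) -> Dict[str, float]:
        return {"pt": self._track.pt, "charge": self._track.charge}


def browser_view(reps: List[TrackRep]) -> List[str]:
    """Tier 2, one way of looking at objects: lines for a hierarchical browser."""
    return [rep.label() for rep in reps]


def script_view(reps: List[TrackRep], name: str) -> List[float]:
    """Tier 2, scripting as yet another view: it reads the same representations."""
    return [rep.properties()[name] for rep in reps]


reps = [TrackRep(Track(pt=12.5, charge=1)), TrackRep(Track(pt=45.0, charge=-1))]
browser_lines = browser_view(reps)
charges = script_view(reps, "charge")
```

Neither view touches `Track` directly, so the scripting view needs no special access path, and a new viewer can be added without changing the application model.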

Slide 15: Interface Pros and Cons

Modularity and good interfaces make a big difference
- When one particular component fails, it doesn’t take others down
- Easier to add new features without disturbing existing ones
- Easier to adapt to new, sometimes radically different contexts
- Testing is manageable and actually gets done
- Easier to manage the project and for people to co-operate (often much more of the work is in communication, not coding)

…but they come at a price
- Costlier to develop up front
- A bad interface can make life really awkward
- Hard to justify if you have only one implementation
- A good interface needs one clearly defined mission; coming up with it may require considerable work, but is usually more than worth it, as doing so clarifies problem understanding and project strategy

Slide 16: Do Languages Matter?

- No: great concepts will survive in almost any language
  - Especially within a common paradigm like object-oriented languages
  - It is the paradigm changes that hurt; changing from objects to components is a more difficult change than from C++ to Java…
  - Will we see extern “Java” { class XYZ { … }; }?
- Yes: consider this scenario
  - Someone in the collaboration comes up with a new analysis cut
  - … and that cut proves very interesting
  - … so the analysis needs to get into the trigger express line

If the analysis was done by C++ code that writes out an N-tuple that was then processed with a few thousand lines of PAW KUMACs and FORTRAN, you’ll have a hard time finding volunteers to re-code it for the trigger, let alone someone willing to double-check it

It is not (just) the languages that hurt...
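
One way to read the scenario above: the cut is easy to move if it lives in one place that both the offline analysis and the trigger can call. The sketch below illustrates that reading only; `Candidate`, `interesting`, and the threshold values are hypothetical and not taken from any experiment's code.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Candidate:
    mass: float
    pt: float


def interesting(candidate: Candidate) -> bool:
    """The analysis cut, written once (threshold values are illustrative)."""
    return candidate.pt > 20.0 and 80.0 < candidate.mass < 100.0


def offline_analysis(sample: Iterable[Candidate]) -> List[Candidate]:
    """Offline use: keep every candidate that passes the cut."""
    return [c for c in sample if interesting(c)]


def trigger_accept(candidates: Iterable[Candidate]) -> bool:
    """Trigger use: accept the event if any candidate passes the same cut."""
    return any(interesting(c) for c in candidates)


sample = [Candidate(mass=91.0, pt=35.0), Candidate(mass=60.0, pt=50.0)]
kept = offline_analysis(sample)
accepted = trigger_accept(sample)
rejected = trigger_accept([Candidate(mass=91.0, pt=5.0)])
```

With the cut shared this way there is nothing to re-code for the trigger, and checking it once covers both uses.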

Slide 17: CMS Analysis Architecture At a Glance

[Diagram; labels in extraction order: Data Store (Objectivity) File Federation wizards Cmscan Data Browser Analysis job wizards Other Non-IGUANA Tools IGUANA OSCAR ORCA CARF Tony’s scripts Objytools GRIDTools]

Slide 18: Modularity Example: IgAPDlab

[Diagram caption: Could pick only a subset for some related task]

Slide 19: Current IGUANA Tools (By Origin)

[Diagram legend: IGUANA; LHC++ or HEP; Public-domain; Commercial]

Slide 20: Current IGUANA Tools (By Purpose)