GLAST Collaboration Meeting, March 2008 T.Johnson1/22 GLAST Large Area Telescope Data Access Tony Johnson Stanford Linear Accelerator Center


GLAST Large Area Telescope Data Access
Tony Johnson, Stanford Linear Accelerator Center
Gamma-ray Large Area Space Telescope

Outline
Topics Covered
  – xrootd
  – LAT Data Catalog
      Features
      Web Interface
      Tools
  – Download Manager
  – Skimmer
  – WIRED
  – Astro Server
  – Miscellaneous

xrootd
xrootd
  – System developed at SLAC to manage large datasets
  – Distributes files across disks
      Maximizes throughput
      Minimizes manual disk management
Automates archiving datasets to (and restoring from) tape
Provides more reliability and scalability than NFS
Supports access control based on GLAST collaborator list
Has been in use for OpsSim2 and “Big MC Run”
  – Mostly working smoothly
      Miscellaneous idiosyncrasies that need to be understood
      Timeout problems when reading files

LAT Data Catalog
Data catalog is a database designed for tracking LAT datasets
  – Can be used with
      Disk files in AFS, NFS, or XROOTD servers, or tape archives
      Data created inside or outside of processing pipeline
      Data created/stored at SLAC or elsewhere
      One or more locations per dataset
  – Simplifies access to data by providing a uniform view of files irrespective of their physical location
  – Allows data to be organized into a tree of “virtual” folders
      Folders don’t have to correspond to physical location of data
  – Allows data to have associated “meta-data”
      Some meta-data is required and verified by catalog
        – size, location, run range, creation date
      Other meta-data is user-defined and arbitrarily extensible
  – Data can be
      Browsed using virtual folders and “groups”
        – Folders contain arbitrary sub-folders, datasets and groups
        – Groups contain homogeneous list of datasets
      Searched using meta-data
        – E.g. DatasetType=MC && RunMin > 50 && RunMin < 100
  – Data crawler
      As new datasets are registered, the crawler validates files and extracts meta-data (file size, number of events, etc.).
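
The meta-data search described on this slide can be pictured with a small sketch. This is not the data catalog's own API: the `Dataset` record, the `search` helper and the sample entries are hypothetical, and the predicate simply mirrors the example expression DatasetType=MC && RunMin > 50 && RunMin < 100.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical catalog record: a logical name plus free-form meta-data."""
    name: str
    metadata: dict = field(default_factory=dict)


def search(datasets, predicate):
    """Return the datasets whose meta-data satisfies the predicate."""
    return [d for d in datasets if predicate(d.metadata)]


# Made-up entries, for illustration only.
catalog = [
    Dataset("run-a", {"DatasetType": "MC", "RunMin": 40}),
    Dataset("run-b", {"DatasetType": "MC", "RunMin": 75}),
    Dataset("run-c", {"DatasetType": "DATA", "RunMin": 60}),
]

# Same condition as: DatasetType=MC && RunMin > 50 && RunMin < 100
hits = search(
    catalog,
    lambda m: m.get("DatasetType") == "MC" and 50 < m.get("RunMin", 0) < 100,
)
print([d.name for d in hits])
```

Because user-defined meta-data is arbitrarily extensible, the sketch reads keys with `dict.get` so that a dataset lacking a key is simply not matched.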

LAT Data Catalog - Web Interface
Browsable tree of datasets
Events, file size, run range automatically set by “crawler”
Access/authentication handled by web
Meta-data added by creator
Supports mirroring at multiple sites
Dataset Description

LAT Data Catalog - Tools
Pipeline Tools
  – From within “Pipeline Scriptlet” datasets can be
      registered together with meta-data and multiple locations
      located using meta-data and passed to subsequent processing stages
Command Line Tools
  – Available now
      registerDataset
        – Wildcards supported for registering many datasets at once
      find
        – List/search for files
      addLocation
      addMetadata
  – Coming soon
      remove
      move
Java API
  – Programmatic access to full functionality
More Info
  – Data catalog User’s Guide

Recent Improvements
Line-mode client find command
  – datacat find -G merit /MC-Tasks/OpsSim/opssim2-GR-v13r9/runs -s RunMin
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9/merit/opssim2-GR-v13r merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9/merit/opssim2-GR-v13r merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9/merit/opssim2-GR-v13r merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9/merit/opssim2-GR-v13r merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9/merit/opssim2-GR-v13r merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9/merit/opssim2-GR-v13r merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9/merit/opssim2-GR-v13r merit.root
  – datacat find --recurse --search-groups -F 'DataType=="MERIT"&&nMetStart>= && nMetStart<= ' -S SLAC_XROOT -s TaskName -s Name /MC-Tasks/OpsSim/
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-HEAD /merit/opssim2-GR-HEAD merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-HEAD /merit/opssim2-GR-HEAD merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9/merit/opssim2-GR-v13r merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9/merit/opssim2-GR-v13r merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9p1/merit/opssim2-GR-v13r9p merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9p1/merit/opssim2-GR-v13r9p merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9p2/merit/opssim2-GR-v13r9p merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9p2/merit/opssim2-GR-v13r9p merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9p2-np/merit/opssim2-GR-v13r9p2-np merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9p2-np/merit/opssim2-GR-v13r9p2-np merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9p3/merit/opssim2-GR-v13r9p merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-GR-v13r9p3/merit/opssim2-GR-v13r9p merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-nocel/merit/opssim2-nocel merit.root
      root://glast-rdr//glast/mc/OpsSim/opssim2-nocel/merit/opssim2-nocel merit.root
  – Available now in DEV, feedback encouraged
      Dan is preparing an addition to the data catalog user’s guide
Enhancements to data catalog access in pipeline
  – Access meta-data from search results

Recent Improvements
New faster crawler
  – Original crawler was not able to keep up with MC running at full throttle.
  – New crawler processes files in parallel and can easily keep up
  – During Ops Sim2 problems discovered with files >2GB in length
      Now fixed
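
To make the "processes files in parallel" point concrete, here is a minimal sketch of a crawler that inspects several files concurrently. It is only an illustration under assumptions: the real crawler's implementation is not described on the slide, `inspect_file` and `crawl` are hypothetical names, and the only meta-data extracted here is the file size.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def inspect_file(path):
    """Extract the meta-data a crawler could record for one file (here: size only)."""
    return {"path": path, "size": os.path.getsize(path)}


def crawl(paths, workers=4):
    """Inspect many files concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(inspect_file, paths))


# Toy demonstration on three temporary files of 1, 2 and 3 bytes.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(3):
        path = os.path.join(tmp, f"file{i}.dat")
        with open(path, "wb") as fh:
            fh.write(b"x" * (i + 1))
        paths.append(path)
    records = crawl(paths)

print([r["size"] for r in records])
```

Threads suit this kind of work because inspecting a file is dominated by I/O waits rather than computation.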

Status/Problems/Plans
Problems
  – Can be painfully slow (with 5,000,000 datasets)
      New Oracle database being tested now
      Karen working on adding “materialized views”
      Further optimization of queries needed
      Sensible pagination of large datasets
  – Web interface needs to allow selection of data based on
      Run number range
      Time range
      Meta-data search (cf. line-mode client)
  – File versions
      As of Ops Sim 2 L1Proc registers multiple versions of files
        – r _v001_merit.root
        – r _v002_merit.root
      Data catalog does not know these are multiple versions of the same file
        – Sends them both to the skimmer -> duplicate events
      Propose to add versioning to data catalog (show only latest by default)
  – Need Custom Views of data
      E.g. All ASP products for run nnn source abc
Plan
  – Fix problems
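
The proposed "show only latest by default" behaviour could look roughly like the sketch below. The naming pattern `<base>_v<NNN>_<kind>.root` is an assumption modelled on the `_v001_merit.root` suffix shown above, and the sample file names and the `latest_versions` helper are hypothetical.

```python
import re

# Assumed naming scheme: <base>_v<NNN>_<kind>.root
VERSIONED = re.compile(r"^(?P<base>.+)_v(?P<version>\d+)_(?P<kind>[^_]+\.root)$")


def latest_versions(names):
    """Keep only the highest-numbered version of each (base, kind) pair."""
    latest = {}
    for name in names:
        match = VERSIONED.match(name)
        if match is None:
            # Names without a version tag are kept unchanged.
            latest[(name, "")] = (0, name)
            continue
        key = (match["base"], match["kind"])
        version = int(match["version"])
        if key not in latest or version > latest[key][0]:
            latest[key] = (version, name)
    return sorted(name for _, name in latest.values())


# Made-up file names, for illustration only.
files = [
    "r0001_v001_merit.root",
    "r0001_v002_merit.root",
    "r0002_v001_merit.root",
]
print(latest_versions(files))
```

Filtering this way before handing files to the skimmer would avoid the duplicate events mentioned above.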

Download Manager
One-click download of multiple files
Inherits authorization from web login
  – Note: no anonymous FTP in future; SLAC account will be required for data access
Works with ftp:, http: and root:
  – Validates files (length, checksum) against data catalog
Supports simultaneous download of multiple files
Does not download files which already exist in target dir
  – So easy to fetch recently added files
Can resume download of partially downloaded files
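
The skip, resume and validate behaviour listed on this slide can be sketched as a small decision function. This is an assumption-laden illustration, not the Download Manager's code: the slide does not say which checksum algorithm the catalog stores, so SHA-256 is used as a stand-in, and `checksum` and `download_action` are hypothetical names.

```python
import hashlib
import os
import tempfile


def checksum(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks (the algorithm is an assumption)."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_action(path, expected_size, expected_checksum):
    """Decide what to do with one target file: 'skip', 'resume' or 'fetch'."""
    if not os.path.exists(path):
        return "fetch"
    size = os.path.getsize(path)
    if size < expected_size:
        return "resume"  # partially downloaded file
    if size == expected_size and checksum(path) == expected_checksum:
        return "skip"  # already present and valid
    return "fetch"  # wrong length or checksum: start over


# Toy demonstration with a complete, a partial and a missing file.
payload = b"glast-merit-data"
expected = hashlib.sha256(payload).hexdigest()
with tempfile.TemporaryDirectory() as tmp:
    complete = os.path.join(tmp, "complete.root")
    partial = os.path.join(tmp, "partial.root")
    missing = os.path.join(tmp, "missing.root")
    with open(complete, "wb") as fh:
        fh.write(payload)
    with open(partial, "wb") as fh:
        fh.write(payload[:5])
    actions = [
        download_action(p, len(payload), expected)
        for p in (complete, partial, missing)
    ]

print(actions)
```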

Status/Problems/Plans
Several problems discovered during Ops Sim 2
  – 100% CPU usage after file recovery (fixed)
  – Bad error message if checksum inconsistent (fixed)
  – Problems downloading files >2GB (almost fixed)
New feature
  – Start/Pause download requested (now available)
Feature requests pending
  – Ability to download selected run/time ranges
      This will work automatically once this feature is added to data catalog web application
  – Non-GUI version for automated download/sync of data
  – Ability to select files to download from GUI (without web)

LAT Data Skimmer
Allows data to be selected using “TCut” on tuple columns
  – Can output either ROOT or FITS (FT1) files
  – Uses Pipeline II for data processing
      Allows parallel processing for large tasks
  – Output available for download for 10 days
  – Complete skim history maintained for later reuse

3 Ways to Access Data Skimmer
Directly from Data Portal
  – click on “Simple Skimmer”
Data Processing Page(s)
From the Data Catalog

LAT Data Skimmer

Status/Problems/Plans
Problems
  – Backend/root crashes
      new (compiled) backend available soon
  – Notification should include data dir even if failed
      Need to be able to navigate from pipeline <-> data dirs
Skimmer improvements in progress
  – Ability to skim more types of files
      “svac”, “cal” and “gcr” added by David Chamont
  – Web interface needs to catch up
  – Ability to output more event types
      Full Recon, Digi, MC trees
      “Extended Event” (intermediate between FT1 and Merit)
      Event Lists
        – CompositeEventLists (CEL) files
  – Access to more “expert” options

Event Display (WIRED)
WIRED allows quick look at detector response
  – Can be installed directly from Web with no additional GLAST software required.
  – Uses “HepRep” interchange format/infrastructure (shared with FRED)

Event Display (WIRED)

Status/Problems/Plans
According to rumour doesn’t work outside my office
  – Actually it doesn’t work in my office either
  – But it did work fine for DC2 data
      Invariant under spatial translations/rotations
Now being hooked up to data catalog/xrootd
  – Issue related to CEL files in gleam being investigated
  – Should be working again in next few days
  – “Event Display” link will appear in data catalog
      Will support browsing events or selection of specific events

Astro Data Server
Similar to skimmer, allows events to be selected using cuts
  – Cuts can only be on position in the sky, energy, time, and event category
  – Works much faster than Skimmer
  – Currently loaded with DC2 data
Currently being refurbished for use with Service Challenge data and beyond
  – Will load all events as soon as they are produced by L1Proc
      User will be able to select
        – all data including partial runs
        – only “complete” runs
      Loose event cuts CTBClassLevel>1
        – User can select CTBClassLevel category
      Able to output FT1, FT2, Extended event files, Merit root files
  – API for programmatic event selection
      Will be used by ASDC tools
  – Closer integration with data catalog, skimmer
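
A selection restricted to sky position, energy and time can be sketched as a cone search plus two range cuts. This does not reproduce the Astro Data Server's interface: the event fields (`ra`, `dec`, `energy`, `time`), the units and the `select_events` helper are hypothetical, and the event-category cut is left out to keep the example short.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions (degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    cos_sep = (
        math.sin(dec1) * math.sin(dec2)
        + math.cos(dec1) * math.cos(dec2) * math.cos(ra1 - ra2)
    )
    # Clamp against rounding errors before taking the arc cosine.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_sep))))


def select_events(events, ra, dec, radius_deg, e_min, e_max, t_min, t_max):
    """Keep events inside the cone and inside the energy and time ranges."""
    return [
        ev
        for ev in events
        if angular_separation_deg(ev["ra"], ev["dec"], ra, dec) <= radius_deg
        and e_min <= ev["energy"] <= e_max
        and t_min <= ev["time"] <= t_max
    ]


# Made-up events, for illustration only.
events = [
    {"id": 1, "ra": 83.6, "dec": 22.0, "energy": 500.0, "time": 100.0},
    {"id": 2, "ra": 90.0, "dec": 22.0, "energy": 500.0, "time": 100.0},
    {"id": 3, "ra": 83.6, "dec": 22.0, "energy": 50.0, "time": 100.0},
    {"id": 4, "ra": 83.7, "dec": 21.9, "energy": 1000.0, "time": 500.0},
]
selected = select_events(
    events, ra=83.63, dec=22.01, radius_deg=1.0,
    e_min=100.0, e_max=10000.0, t_min=0.0, t_max=200.0,
)
print([ev["id"] for ev in selected])
```

Limiting the cuts to a few fixed quantities is what allows a server to index on them, which is one plausible reason such a service can respond faster than a general "TCut" skim.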

Astro Data Server
Astro data server will remember the last set of parameters you used
Astro Server also has a “Favorites” page
  – Keeps a list of your “favorite” search parameters

Status/Problems/Plans
Was used for SC2 55 day run
Not used in Ops Sim 2
Still plan to
  – Load data from L1Proc
  – Add programmatic interface for use by ASP/ASDC tools
  – Better integration with Data Portal
Bottom of priority list

Miscellaneous
Data Access Restrictions
  – Starting very soon (this week hopefully) you will need to be a “glast collaborator” to access files from xrootd
  – You will need to log in to access data catalog/download manager
Need to define standard skims
  – Automate their production
      Part of RSP?
  – Automate their registration in data catalog
Access to ASP/RSP data has not been discussed here
  – But is in the plan
Feedback from Ops Sim2 has been very useful
  – Not all digested yet
Need more/better documentation
  – Data Access frequently asked questions
      Please suggest more FAQs
More feedback welcome