5/12/06, T. Kurca - D0 Meeting, FNAL

p20 Reprocessing
Tibor Kurča, IPN Lyon

- Introduction
- Computing Resources
- Architecture
- Operational Model
- Technical Issues
- Operational Issues - Status

Introduction

Goal: reprocess ~500 M RunIIb events (83 TB) with the newly calibrated detector and improved reconstruction software by the end of March '07.

Where: SAMGrid/OSG and native SAMGrid (CCIN2P3).

Issues:
- SAMGrid-OSG is a new environment, experience is missing
- fast problem identification and solution?
- importance of organization and preparation: efficiency?
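
The goal above implies an average event size and a required processing rate. A minimal sketch of that arithmetic follows. It assumes decimal units (1 TB = 1e12 bytes) and, purely for illustration, a time window running from the date of this talk (read as 5 December 2006) to 31 March 2007. The function names are hypothetical, and the real window depends on when production actually starts.

```python
from datetime import date

N_EVENTS = 500_000_000   # ~500 M RunIIb events (from the slide)
DATA_TB = 83             # total data volume in TB (from the slide)


def average_event_size_kb(n_events=N_EVENTS, data_tb=DATA_TB):
    """Average event size in kB, assuming 1 TB = 1e12 bytes."""
    return data_tb * 1e12 / n_events / 1e3


def days_available(start, end):
    """Number of days between two calendar dates."""
    return (end - start).days


def required_events_per_day(start, end, n_events=N_EVENTS):
    """Events that must be reprocessed per day to finish by `end`."""
    return n_events / days_available(start, end)


if __name__ == "__main__":
    # Assumed window: date of the talk to the stated deadline.
    start, end = date(2006, 12, 5), date(2007, 3, 31)
    print(f"average event size: {average_event_size_kb():.0f} kB")
    print(f"days available:     {days_available(start, end)}")
    print(f"events per day:     {required_events_per_day(start, end):,.0f}")
```

A later production start shortens the window and raises the required daily rate accordingly.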

Computing Resources

Needs: 2000 CPUs, 4 TB disk cache, 1 Gb links

Available:

Site          #CPU (not guaranteed)     Disk cache
Oklahoma U                              TB
Indiana U                               TB
Nebraska U    250
CMS-FNAL      250
Fermilab                                4 TB
NERSC         250
SPRACE        250
Purdue        ?
Florida       ?
CCIN2P3       500 (non-OSG, SAMGrid)
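
The listed resources can be compared with the stated need of 2000 CPUs. The sketch below assumes that the bare numbers on the slide (250, 500) are CPU counts and treats sites without a number as unknown (`None`). The dictionary layout and function names are illustrative only.

```python
CPUS_NEEDED = 2000  # from the slide

# CPU counts as listed on the slide; None marks sites with no number given.
# Assumption: the bare figures 250 and 500 are CPU counts, not disk sizes.
AVAILABLE_CPUS = {
    "Oklahoma U": None,
    "Indiana U": None,
    "Nebraska U": 250,
    "CMS-FNAL": 250,
    "NERSC": 250,
    "SPRACE": 250,
    "Purdue": None,
    "Florida": None,
    "CCIN2P3": 500,
}


def known_cpu_total(sites=AVAILABLE_CPUS):
    """Sum of CPU counts over the sites that list a number."""
    return sum(n for n in sites.values() if n is not None)


def cpu_shortfall(needed=CPUS_NEEDED, sites=AVAILABLE_CPUS):
    """CPUs that the sites without a listed number would have to provide."""
    return max(0, needed - known_cpu_total(sites))


if __name__ == "__main__":
    print("known CPUs:", known_cpu_total())
    print("to be covered by unlisted sites:", cpu_shortfall())
```

Since none of the CPU counts are guaranteed, the known total is an upper bound on what is actually usable at any time.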

Basic Architecture

[Diagram: SAM-Grid, SAM-Grid / OSG Forwarding Node, OSG, SAM-Grid VO-Specific Services (new SAM). Legend: Flow of Job Submission; Offers services to …]

Main issues to track down:
- Accessibility of the services
- Usability of the resources
- Scalability

Current Configuration

[Diagram: SAM-Grid and forwarding nodes (FW) with clusters (C) and SAM stagers (S, stg) across network boundaries. Sites shown: FNAL, IU, SPRACE, UNL, NERSC, CMS, OU. Legend: Network Boundaries, Forwarding Node, LCG Cluster, SAM Stager, VO-Service (SAM, new SAM), Job Flow, Offers Service.]

Operational Model

Production & Merging:
- production: reconstruction at each site, unmerged TMBs at FNAL
- merging preferentially at FNAL

Organization:
- define submitter teams of 2 persons each
- assign datasets to each team
- define primary and secondary OSG clusters for each team, where they should submit
- submission from the central UI installed on d0mino

Operation Problems - Solution:
- multilevel expertise
- identify the problem: SAMGrid or OSG
- SAMGrid: contact d0_reprocessing
- OSG, official way: open a ticket at GOC-Indiana
- OSG, if the first way is not working: contact the local administrators directly
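
The problem-handling procedure above amounts to a small routing rule: classify the problem as SAMGrid or OSG, then follow the matching contact path in order. A minimal sketch follows. The function name and the returned strings are illustrative and do not correspond to any existing tool.

```python
def escalation_path(domain):
    """Return the ordered contact steps for a problem in the given domain.

    `domain` is "SAMGrid" or "OSG", as identified by the submitter or an
    experienced person. Steps are tried in order; a later step is used
    only if the earlier one does not work.
    """
    paths = {
        "SAMGrid": [
            "contact d0_reprocessing",
        ],
        "OSG": [
            "open a ticket at GOC-Indiana",
            "contact the local administrators directly",
        ],
    }
    if domain not in paths:
        raise ValueError(f"unknown problem domain: {domain!r}")
    return paths[domain]


if __name__ == "__main__":
    for step_no, step in enumerate(escalation_path("OSG"), start=1):
        print(step_no, step)
```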

Multilevel Expertise

Submitters/shifters: submit jobs, check logfiles, report problems
- Mandy ROMINSKY, Sohrab HOSSAIN (University of Oklahoma)
- Joseph STEELE (Louisiana Tech University)
- Yanwen LIU (University of Science and Technology of China)
- Dag GILLBERG, Zhiyi LIU (Simon Fraser University, Canada)

Experienced people: first aid
- Daniel, Joel, Mike, Tibor … & others

SAMGrid experts: problem solving, intervention
- Andrew, Gabriele, Parag

OSG experts / local administrators: OSG-related issues
- … we have established contacts; … (subject should mention dzero reprocessing)

Technical Issues

Tests done:
- OSG clusters & CCIN2P3 tested
- deployment of storage queues done, working

To be done:
- central UI installation on a d0mino node: jim_client, d0repro tools …
- most urgent: UI for all submitters

Issue:
- binary input RTE file size is ~600 MB, to be shipped with each job
- this is 2x the raw data file size
- reducing it is very desirable
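
To see why the RTE file size matters, the sketch below estimates the total volume shipped over the whole reprocessing. It takes the ~600 MB RTE size and the 2x ratio from this slide and the 83 TB total from the Introduction. It assumes, as an illustration only, one job per raw data file and decimal units. The names are hypothetical.

```python
RTE_MB = 600              # binary input RTE file size (from the slide)
RTE_TO_RAW_RATIO = 2      # RTE is 2x the raw data file size (from the slide)
TOTAL_RAW_TB = 83         # total raw data volume (from the Introduction)


def raw_file_mb(rte_mb=RTE_MB, ratio=RTE_TO_RAW_RATIO):
    """Raw data file size implied by the RTE size and the stated ratio."""
    return rte_mb / ratio


def estimated_jobs(total_raw_tb=TOTAL_RAW_TB):
    """Number of jobs, ASSUMING one job per raw data file (1 TB = 1e6 MB)."""
    return total_raw_tb * 1e6 / raw_file_mb()


def total_rte_shipped_tb(total_raw_tb=TOTAL_RAW_TB, rte_mb=RTE_MB):
    """Total RTE volume shipped in TB if every job carries its own copy."""
    return estimated_jobs(total_raw_tb) * rte_mb / 1e6


if __name__ == "__main__":
    print(f"raw file size:     {raw_file_mb():.0f} MB")
    print(f"estimated jobs:    {estimated_jobs():,.0f}")
    print(f"RTE volume shipped: {total_rte_shipped_tb():.0f} TB")
```

Under the one-job-per-file assumption the shipped RTE volume is simply the ratio times the raw data volume, so halving the RTE size halves that overhead.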

Operational Issues - Status

Relevant information at

Grid certificates:
- most of the submitters have their DOEGrids certificates

User account at NERSC:
- procedure started

Test runs:
- submitter training, hands-on experience: job submission using d0repro tools, where to look for logs
- to be done … this Thursday? d0mino UI?
- large-scale test (scalability issues?) … early next week

Production start: early January?