MC, REPROCESSING, TRAINS EXPERIENCE FROM DATA PROCESSING.

Slides:



Advertisements
Similar presentations
Workflows in Archie IMS Support Person: Sonja Henderson
Advertisements

Lectures on File Management
1 Authority on Demand Flexible Access Control Solution.
P5, M1, D1.
July 2010 D2.1 Upgrading strategy Javier Soto Catalog Release 3. Communities.
ONYX RIP Version Technical Training General. Overview General Messaging and What’s New in X10 High Level Print and Cut & Profiling Overviews In Depth.
Adcos – one shifters wildest dreams… Wahid Bhimji.
Microsoft ® Office Outlook ® 2003 Virtually Working for You presents:
* Requisition Processing Common Problems * Budget Checking Errors * Run Controls * Process Scheduler Request & Process Monitor * Questions.
Logging In Go to web site:
An End-User Perspective On Using NatQuery Building a Datawarehouse T
Grid and CDB Janusz Martyniak, Imperial College London MICE CM37 Analysis, Software and Reconstruction.
1 Overview of Upgrade Enhancements For Requisitioners and PO Managers December 22, 2004.
©2006, CSA Using COS Funding Alert Automatic Notification of Relevant New Opportunities from the World’s Largest Funding Database ™ Easily Accessible Via.
Form Handling, Validation and Functions. Form Handling Forms are a graphical user interfaces (GUIs) that enables the interaction between users and servers.
Business Processes and Workflow How to go from idea to implementation
Systems Development Life Cycle Dirt Sport Custom.
What is Sure BDCs? BDC stands for Batch Data Communication and is also known as Batch Input. It is a technique for mass input of data into SAP by simulating.
Moodle (Course Management Systems). Assignments 1 Assignments are a refreshingly simple method for collecting student work. They are a simple and flexible.
Enrolment Services – Class Scheduling Fall 2014 Course Combinations.
Copyright © 2007, Oracle. All rights reserved. Managing Concurrent Requests.
Lead Management Tool Partner User Guide March 15, 2013
11/10/2015S.A.1 Searches for data using AMI October 2010 Solveig Albrand.
6 th Annual Focus Users’ Conference Manage Integrations Presented by: Mike Morris.
Family Resource Centers in ePlan August 2015 Adding a Budget in ePlan.
Bookkeeping Tutorial. Bookkeeping & Monitoring Tutorial2 Bookkeeping content  Contains records of all “jobs” and all “files” that are created by production.
PanDA Monitor Development ATLAS S&C Workshop by V.Fine (BNL)
Enabling Grids for E-sciencE System Analysis Working Group and Experiment Dashboard Julia Andreeva CERN Grid Operations Workshop – June, Stockholm.
The Alternative Larry Moore. 5 Nodes and Variant Input File Sizes Hadoop Alternative.
For each customer interface record, a new instance of Workflow main process is kicked off, as below Click to proceed…………
AMB HW LOW LEVEL SIMULATION VS HW OUTPUT G. Volpi, INFN Pisa.
An Introduction to Designing, Executing and Sharing Workflows with Taverna Katy Wolstencroft myGrid University of Manchester IMPACT/Taverna Hackathon 2011.
PWG3 Analysis: status, experience, requests Andrea Dainese on behalf of PWG3 ALICE Offline Week, CERN, Andrea Dainese 1.
Analysis trains – Status & experience from operation Mihaela Gheata.
5/2/  Online  Offline 5/2/20072  Online  Raw data : within the DAQ monitoring framework  Reconstructed data : with the HLT monitoring framework.
M1G Introduction to Programming 2 3. Creating Classes: Room and Item.
Bookkeeping Tutorial. 2 Bookkeeping content  Contains records of all “jobs” and all “files” that are produced by production jobs  Job:  In fact technically.
Unit 17: SDLC. Systems Development Life Cycle Five Major Phases Plus Documentation throughout Plus Evaluation…
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI Ops Portal New Requirements.
AliRoot survey: Analysis P.Hristov 11/06/2013. Are you involved in analysis activities?(85.1% Yes, 14.9% No) 2 Involved since 4.5±2.4 years Dedicated.
Pavel Nevski DDM Workshop BNL, September 27, 2006 JOB DEFINITION as a part of Production.
Observing the Current System Benefits Can see how the system actually works in practice Can ask people to explain what they are doing – to gain a clear.
TAGS in the Analysis Model Jack Cranshaw, Argonne National Lab September 10, 2009.
Data processing Offline review Feb 2, Productions, tools and results Three basic types of processing RAW MC Trains/AODs I will go through these.
Analysis Trains Costin Grigoras Jan Fiete Grosse-Oetringhaus ALICE Offline Week,
LHCbDirac and Core Software. LHCbDirac and Core SW Core Software workshop, PhC2 Running Gaudi Applications on the Grid m Application deployment o CVMFS.
Using Workflow With Dataforms Tim Borntreger, Director of Client Services.
Executive Summary - Human Factors Heuristic Evaluation 04/18/2014.
Interactions & Automations
David Adams ATLAS ATLAS Distributed Analysis (ADA) David Adams BNL December 5, 2003 ATLAS software workshop CERN.
Future of Distributed Production in US Facilities Kaushik De Univ. of Texas at Arlington US ATLAS Distributed Facility Workshop, Santa Cruz November 13,
How to complete and submit a Final Report through Mobility Tool+ Technical guidelines Authentication, Completion and Submission 1 Antonia Gogaki IT Officer.
I/O and Metadata Jack Cranshaw Argonne National Laboratory November 9, ATLAS Core Software TIM.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI GLUE 2: Deployment and Validation Stephen Burke egi.eu EGI OMB March 26 th.
Advanced Taverna Aleksandra Pawlik University of Manchester materials by Katy Wolstencroft, Aleksandra Pawlik, Alan Williams
Joe Foster 1 Two questions about datasets: –How do you find datasets with the processes, cuts, conditions you need for your analysis? –How do.
Taverna allows you to automatically iterate through large data sets. This section introduces you to some of the more advanced configuration options for.
AliRoot survey: Calibration P.Hristov 11/06/2013.
Maria Alandes Pradillo, CERN Training on GLUE 2 information validation EGI Technical Forum September 2013.
ANALYSIS TRAIN ON THE GRID Mihaela Gheata. AOD production train ◦ AOD production will be organized in a ‘train’ of tasks ◦ To maximize efficiency of full.
How to complete and submit a Final Report through
Web Portal tricks.
Training Documentation – Replacing GSPR with RFQ 2.0
AMI – Status November Solveig Albrand Jerome Fulachier
The ADC Operations Story
Online Software Status
Job Processing Database consolidation Task recovery De-cronification
FAST Administration Training
Order Processing and Requisition Accelerator
Performing Database Recovery
Presentation transcript:

MC, REPROCESSING, TRAINS EXPERIENCE FROM DATA PROCESSING

Chicago MC Production some of this requests / comments are already in JIRA but some maybe missing Request Interface Functionality : Request Placing : It would need some cosmetic changes and features : - Improve notification mail - Implementation of the default notification list for requests - Possibility to add more than one address in CC - Even though it is not clear how widely it will be used, we need a JIRA ticket for the Request - After second “Submit” the modification should be disabled for non MC managers (also only managers should be able to modify the “manager” field) 2 ADC Technical Interchange Meeting

Chicago MC Production Request Approval : - MC Pattern should be extended to include “project_mode” configuration - Finding inputs needs to be optimized for large requests, we have not yet hit a problem here but soon will happen - It is a bit confusing still the steps to perform to submit requests (evgen, fullchain). It would need to be simplified. Request Monitoring and - Would like to know from the request page list which request need attention and which not without having to look at the request details. Now it is possible to check details by click but they would need to be improved or modified. 3 ADC Technical Interchange Meeting

Chicago MC Production Request Handling : - Additions implementations to improve usability : - Delete slices that had problems or are not needed - Clone implemented but something like fix will be handy. - Full Request Clone (needed for physics validation tasks) Request Post-processing : - “Finish ASAP” will save some mails to experts Task/Job Monitoring : and - Really improved, improvements would come from better error reporting (dedicated JIRA XXXX) Mail Notifications : - Working, now we need to pay more attention to them. We could promote “failed to refine task” to the Alarm mail. 4 ADC Technical Interchange Meeting

Chicago MC Production Processing types and ProdSys-2 : Most standard configuration Full Sim MC12 tested : Event Generation : - All configurations working in prodsys tested - Creation of tarball needed to create the production e-tag (done using scripts). We could maintain this a bit longer even if prodys is gone but not desired. - We need to commission the new transform Generate_tf (not working either in prodsys). Likely will be needed for MC15 but it is not yet fully clear when JOs will be available. HITS Merging : - JEDI Merging : Not completely tested for MC Production (my bad). 5 ADC Technical Interchange Meeting

Chicago MC Production Standard Configurations : We have to review again all possible configurations. This is the status of the last ones : 6 ADC Technical Interchange Meeting TypeFull SimFast Sim MC11 okok? MC12 ok (benchmark) ok? (to be checked) MC14 Fixed on Friday Not yet available

Chicago MC Production Special Configurations : - Overlay : Needs absolutely FullChain transform. It needs a person available to develop it! - FTK : Needs implementation, uses special inputs types that need to be defined in ProdSys-2. Just recently implemented in our scripts, I can help now to implement it in ProdSys-2. We can live without FullChain, but will make processing easier. - Physics Validation : Not really a special configuration but it has special needs including the new transform for merging. - Cosmics reconstruction (even now we do not have a proper workflow with our scripts). - Other historically forgotten chains … 7 ADC Technical Interchange Meeting

Chicago Reprocessing Request definition : Ensure all transforms and workflows needed for campaigns done in 2012 can also be submitted to ProdSys2: Next week we will do this. - already known NTUPMerge_tf.py is not yet validated/debugged, is needed (given "automatic" merging)? - Automatic and custom merging (in JEDI?), followed by clean-up (deletion) - Ability to use dataset pattern as input, e.g. physics containers, COMA periods - Append and adjust an existing request (merging, a new reco step,...) - Start new request by cloning an existing request (and then modify details / tags) - Exclusion pattern to mask known problematic files / LB - Rucio and JEDI using GoodRunLists in production, to avoid bad Lumi Blocks ? - Also: what about those that *can* be processed, but will not be used for analysis - save resources ? 8 ADC Technical Interchange Meeting

Chicago Reprocessing Request definition (II) : - Preservation of lumi-blocks in merging steps (ESD, AOD, TAG formats..) - metadata from Rucio? - Ensure all "project_mode" options available individual options: no longer use LIST files fraction_ESD, cmtconfig, splitjobs, lumblock, diskcount,...) - Define input:tag:output formats, e.g. merging only for AOD even if (when available a "Merging_tf.py") transform also does ESD 9 ADC Technical Interchange Meeting

Chicago Reprocessing Request / task management : - Simple way to abort individual jobs, tasks, group of tasks or whole request - Resubmit jobs within tasks, so we don't have to unecessarily redo 1000s of jobs and waste resources - Ability to increase memory requirement for a job or all jobs in a task - Easy way to download RAW files used on jobs that fail - Creation of JIRA tickets for (groups of) tasks 10 ADC Technical Interchange Meeting

Chicago Reprocessing Monitoring (for BigPanda) - Ensure Reprocessing Monitor functionality integrated into BigPanda onitor/ - Automatic statistics needs catching (algorithm in athena, selecting error message,.. ) in log files: - Display statistics on TASK page of failed jobs - Display statistics on JOB page of failed jobs - List of jobs with maximum number of attempts per task - Last update in job count (jobs done, running, etc) Dataset handling - Automatic deletion of unmerged datasets ? - Ability to recover lost data from any step in a repro campaign 11 ADC Technical Interchange Meeting

Chicago Reprocessing In addition to submitting our slice tests in ProdSys2, to be done: - Need to test tag creation in AMI rather than Panda interface - Need full commissioning plan for reprocessing, to test system thoroughly in 3 (6) month timescale - Look at how MC production are using spreadsheets, see if it is relevant to data reprocessing 12 ADC Technical Interchange Meeting

Chicago additionally from Paul “Overlaps with other feedback expected to be 100% !” - LB boundary-awareness seemed to be a pain sometimes in Run 1, the 1k vs 10k problem, NTUP_COMMON running limited to 5000 events which then broke metadata. Are we sure we have a robust system that can handle this? LB- awareness is most important for DataPrep. Upgraded LB-accounting should help but this is still being worked on so this is one for next S&C week. - Merging (internal) - how does JEDI decide on merging parameters? Particularly for derivations (outputs can vary by orders of magnitude depending on the derivation type) this will need to be fairly dynamic - using the pilot (?) may be ok, but this will also need to be LB-aware - don’t split complete LBs - ever ! - Number of retries - set at 25 for reprocessing due to long tails, but such a large number can just mean delaying spotting a problem. - Related - better error codes (on “us” to define) - can we use error codes better in monitoring and automated responses? - Request - can we predict the time to finish based on the rate of progress ? Particularly useful for large campaigns like reprocessing. Accuracy is a different question! 13 ADC Technical Interchange Meeting