Castor-dev planning and resources for 2007 Castor Development Team Castor Delta Review –December 2006 German Cancio, Giuseppe Lo Presti, Sebastien Ponce.

Slides:



Advertisements
Similar presentations
Making the System Operational
Advertisements

Configuration management
Data Management Expert Panel. RLS Globus-EDG Replica Location Service u Joint Design in the form of the Giggle architecture u Reference Implementation.
Project Management Summary Castor Development Team Castor Readiness Review – June 2006 German Cancio, Giuseppe Lo Presti, Sebastien Ponce CERN / IT.
CERN - IT Department CH-1211 Genève 23 Switzerland t LCG Persistency Framework CORAL, POOL, COOL – Status and Outlook A. Valassi, R. Basset,
Technical Design Technology choices Core framework Castor Readiness Review – June 2006 Giuseppe Lo Presti, German Cancio, Sebastien Ponce CERN / IT.
16/9/2004Features of the new CASTOR1 Alice offline week, 16/9/2004 Olof Bärring, CERN.
1 Software & Grid Middleware for Tier 2 Centers Rob Gardner Indiana University DOE/NSF Review of U.S. ATLAS and CMS Computing Projects Brookhaven National.
D. Düllmann - IT/DB LCG - POOL Project1 POOL Release Plan for 2003 Dirk Düllmann LCG Application Area Meeting, 5 th March 2003.
Software Evolution Managing the processes of software system change
CASTOR Upgrade, Testing and Issues Shaun de Witt GRIDPP August 2010.
Software Reengineering 2003 년 12 월 2 일 최창익, 고광 원.
Configuration Management Process and Environment MACS Review 1 February 5th, 2010 Roland Moser PR a-RMO, February 5 th, 2010 R. Moser 1 R. Gutleber.
LHCC Comprehensive Review – September WLCG Commissioning Schedule Still an ambitious programme ahead Still an ambitious programme ahead Timely testing.
 To explain the importance of software configuration management (CM)  To describe key CM activities namely CM planning, change management, version management.
CERN IT Department CH-1211 Genève 23 Switzerland t Plans and Architectural Options for Physics Data Analysis at CERN D. Duellmann, A. Pace.
Operation of CASTOR at RAL Tier1 Review November 2007 Bonny Strong.
Apr 30, 20081/11 VO Services Project – Stakeholders’ Meeting Gabriele Garzoglio VO Services Project Stakeholders’ Meeting Apr 30, 2008 Gabriele Garzoglio.
CERN IT Department CH-1211 Genève 23 Switzerland t Tier0 Status - 1 Tier0 Status Tony Cass LCG-LHCC Referees Meeting 18 th November 2008.
Functional description Detailed view of the system Status and features Castor Readiness Review – June 2006 Giuseppe Lo Presti, Olof Bärring CERN / IT.
CERN - IT Department CH-1211 Genève 23 Switzerland Castor External Operation Face-to-Face Meeting, CNAF, October 29-31, 2007 CASTOR2 Disk.
CE Operating Systems Lecture 3 Overview of OS functions and structure.
CERN - IT Department CH-1211 Genève 23 Switzerland t CASTOR Status March 19 th 2007 CASTOR dev+ops teams Presented by Germán Cancio.
Owen SyngeTitle of TalkSlide 1 Storage Management Owen Synge – Developer, Packager, and first line support to System Administrators. Talks Scope –GridPP.
Report from CASTOR external operations F2F meeting held at RAL in February Barbara Martelli INFN - CNAF.
CASTOR evolution Presentation to HEPiX 2003, Vancouver 20/10/2003 Jean-Damien Durand, CERN-IT.
CASTOR status Presentation to LCG PEB 09/11/2004 Olof Bärring, CERN-IT.
Grid Deployment Enabling Grids for E-sciencE BDII 2171 LDAP 2172 LDAP 2173 LDAP 2170 Port Fwd Update DB & Modify DB 2170 Port.
CERN SRM Development Benjamin Coutourier Shaun de Witt CHEP06 - Mumbai.
CERN IT Department CH-1211 Genève 23 Switzerland t Load Testing Dennis Waldron, CERN IT/DM/DA CASTOR Face-to-Face Meeting, Feb 19 th 2009.
HNDIT23082 Lecture 06:Software Maintenance. Reasons for changes Errors in the existing system Changes in requirements Technological advances Legislation.
1 Chapter 12 Configuration management This chapter is extracted from Sommerville’s slides. Text book chapter 29 1.
The new FTS – proposal FTS status. EMI INFSO-RI /05/ FTS /05/ /05/ Bugs fixed – Support an SE publishing more than.
CERN IT Department CH-1211 Genève 23 Switzerland t HEPiX Conference, ASGC, Taiwan, Oct 20-24, 2008 The CASTOR SRM2 Interface Status and plans.
Andrea Manzi CERN On behalf of the DPM team HEPiX Fall 2014 Workshop DPM performance tuning hints for HTTP/WebDAV and Xrootd 1 16/10/2014.
Distributed Logging Facility Castor External Operation Workshop, CERN, November 14th 2006 Dennis Waldron CERN / IT.
SPI NIGHTLIES Alex Hodgkins. SPI nightlies  Build and test various software projects each night  Provide a nightlies summary page that displays all.
CASTOR project status CASTOR project status CERNIT-PDP/DM October 1999.
FTS monitoring work WLCG service reliability workshop November 2007 Alexander Uzhinskiy Andrey Nechaevskiy.
SRM-2 Road Map and CASTOR Certification Shaun de Witt 3/3/08.
Site Services and Policies Summary Dirk Düllmann, CERN IT More details at
CMS: T1 Disk/Tape separation Nicolò Magini, CERN IT/SDC Oliver Gutsche, FNAL November 11 th 2013.
BNL dCache Status and Plan CHEP07: September 2-7, 2007 Zhenping (Jane) Liu for the BNL RACF Storage Group.
LHCC Referees Meeting – 28 June LCG-2 Data Management Planning Ian Bird LHCC Referees Meeting 28 th June 2004.
Quattor tutorial Introduction German Cancio, Rafael Garcia, Cal Loomis.
D. Duellmann, IT-DB POOL Status1 POOL Persistency Framework - Status after a first year of development Dirk Düllmann, IT-DB.
CASTOR in SC Operational aspects Vladimír Bahyl CERN IT-FIO 3 2.
Dissemination and User Feedback Castor deployment team Castor Readiness Review – June 2006.
CERN IT Department CH-1211 Genève 23 Switzerland t DPM status and plans David Smith CERN, IT-DM-SGT Pre-GDB, Grid Storage Services 11 November.
Data & Storage Services CERN IT Department CH-1211 Genève 23 Switzerland t DSS CASTOR and EOS status and plans Giuseppe Lo Presti on behalf.
CERN - IT Department CH-1211 Genève 23 Switzerland Castor External Operation Face-to-Face Meeting, CNAF, October 29-31, 2007 CASTOR Overview.
CTA: CERN Tape Archive Rationale, Architecture and Status
Castor dev Overview Castor external operation meeting – November 2006 Sebastien Ponce CERN / IT.
Item 9 The committee recommends that the development and operations teams review the list of workarounds, involving replacement of palliatives with features.
CASTOR: possible evolution into the LHC era
Jean-Philippe Baud, IT-GD, CERN November 2007
CASTOR Giuseppe Lo Presti on behalf of the CASTOR dev team
StoRM: a SRM solution for disk based storage systems
Giuseppe Lo Re Workshop Storage INFN 20/03/2006 – CNAF (Bologna)
Overview – SOE PatchTT November 2015.
CASTOR-SRM Status GridPP NeSC SRM workshop
Olof Bärring LCG-LHCC Review, 22nd September 2008
LCGAA nightlies infrastructure
Ákos Frohner EGEE'08 September 2008
Francesco Giacomini – INFN JRA1 All-Hands Nikhef, February 2008
Data Management cluster summary
Leigh Grundhoefer Indiana University
Chapter 2: Operating-System Structures
Lecture 06:Software Maintenance
Chapter 2: Operating-System Structures
Presentation transcript:

Castor-dev planning and resources for 2007 Castor Development Team Castor Delta Review –December 2006 German Cancio, Giuseppe Lo Presti, Sebastien Ponce CERN / IT

German Cancio (IT/FIO/FD) 2 Outline  Evolution of development resources since  Update on development plans for (Q1/2) 2007

German Cancio (IT/FIO/FD) 3 DeveloperFromTo%Main development responsibilities Olof Barring (FIO/FS)1999now5Castor-1, tape area maintenance, bughunting Hugo Cacote (FIO/TSI)11/05now60Tape driver software (stepping down) Felix Ehm (tech student)11/0512/06100Repack Rosa Garcia Rioja03/06now100GridFTP/xrootd, 64-bit port, web pages, strong authentication Giuseppe Lo Presti03/0501/08100SRM-2, stager + DB code, ETICS integration Sebastian Lopienski11/0506/0650Strong authentication Sebastien Ponce09/03now100Project leader+main architect, stager, scheduler Miroslav Siket01/07now30Tape system optimisations (file recalls) Giulia Taurelli01/06now100Test suite, RFIO, stager Dennis Waldron11/05now50DLF Arne Wiebalck (FIO/LA)11/06now50Tape driver software (taking over from Hugo) Victor Kotlyar, Vitali Motyakov (Russia) 07/03now0.5 FTEDLF, GridFTP-2 Shaun de Witt (RAL) 04/05now0.5 FTESRM-2 Evolution of Resources since 06/06 “senior” team member former team member “junior” team member ext. collaborator (<=1 year) Few changes since Q (compared to Q4 05 ->Q2 06): Team members new in Q1/Q2 06 now trained and up-to-speed S. Lopienski reassigned to other tasks outside CASTOR H. Cacote being replaced by A. Wiebalck (in FIO/TSI) M. Siket (30%) from Q on Technical Student to replace Felix in Q1/ (TBC) Giuseppe’s contract running out beginning of 2008

German Cancio (IT/FIO/FD) 4 Planning document update  Latest version of planning document updated beginning of December  Now 33 pages, 66 tasks defined  Takes into account collected tracking information (June- November)  Added ‘ongoing’ tasks such as 3 rd - level support, releases, management  More realistic %FTE allocation for development tasks Leading to postponed tasks…  Most tasks have now allocated manpower Avoiding concentrations on Giuseppe/Sebastien  Spent manpower now accounted separately from progress/duration 

German Cancio (IT/FIO/FD) 5 R16 : Credible WBS (VI)  Pending/ongoing:  Define more formal (but lightweight..) mechanisms for reviewing development WBS and progress with Castor Ops team  Identify a light-weight effort tracking tool (for both Castor-dev and ops teams) which can be used at a (bi-)weekly basis  Include external collaborators in tracking (outside the section)  Move to more deliverable-based rather than activity-based planning e.g. GDPM  Concerns:  High level of interruptions to Castor-development team In particular for ‘senior’ team members Significantly improved with knowledge build-up on Castor-Ops side, but still too high (e.g. Tier-1 sites) A weekly support rota for developers has been put in place  Many external dependencies make planning ahead difficult  Management overhead not always appreciated Keep it to the required minimum Resources are limited and not easy to reallocate to this task Not all developers see clear benefits Yesterday’s presentation

German Cancio (IT/FIO/FD) 6 Planning support tool  Still using GanttProject for helping in the definition of tasks, and their visual representation  chart information automatically generated from WBS document medium, high, low importance ongoing task

German Cancio (IT/FIO/FD) 7 Planning support tool  Still using GanttProject for helping in the definition of tasks, and their visual representation  chart information automatically generated from WBS document  Q > Q currently looks as follows:  Limitations:  Unflexible allocation of resources to tasks not possible to vary participation of a developer to a given task for a given time interval Leading to occasional over-allocations  Time buffers look like gaps in planning  Task, not milestone oriented Much detail required on allocations and availability (e.g. holidays, weekends, etc) which is too heavy  ganntproject sometimes crashing  WBS-to-ganttproject conversion script needs improvements..

German Cancio (IT/FIO/FD) 8 Update on new+open tasks (1) TG1, TG2 (known bugs and testing) TaskStart;duration progress Who (%) T1.3 Memory leak follow-up: verify if the suspected Oracle memleak is really gone so that stager cron restart can be removed new – followup from T1.1 15/12/06; 1dSebastien(25%) T1.4 Databases Cleanup provide a PL/SQL script for cleaning up Castor DB’s having old leftover stale requests new – followup from T1.2 1/12/06; 2dSebastien(10%) T2.2 Test DB hardware configurations for stager (including RAC and DataGuard) no progress Q1 2007; 5d? medium, high, low importance

German Cancio (IT/FIO/FD) 9 Update on new+open tasks (2) TG3 (Stager) TaskStart;duration progress Who (%) T3.1 Signal handling on client part (gracefully handle signals like SIGINT) rescheduled 15/4/07; 5dGiulia (50%) T3.2 stager_query improvements: reduce load on stager by minimising query overhead (e.g. due to regexps) partly postponed 1/4/07; 3d 50% Sebastien(25%) T3.3 Automatic cleanup at restart: remove stale DB entries after crashes of components rescheduled – awaiting higher stability 1/6/07; 10dGiulia (50%) T3.4 SvcClass specific selection policy: allow different filesystem selection policies per service classes and disk pools rescheduled 1/5/07; 5dDennis (45%) T3.5 Recall policies: allow for a smarter scheduling and grouping of recall operations rescheduled 8/1/07; 5dDennis (40%) T3.8 Accounting: Modify the RH to protect the stager from (DoS) request floods rescheduled after CMS workaround 1/9/07; 10dRosa (50%) medium, high, low importance

German Cancio (IT/FIO/FD) 10 Update on new+open tasks (3) TG4 (Scheduler) TaskStart;duration progress Who (%) T4.1 LSF limitations: study and improve limitations in LSF plugin for higher job/s throughput rates work in progress 15/8/06; 20d 50% Sebastien(25%) T4.2 LSF improvements: improve error reporting, configurable # of retries for scheduling a job, study using several LSF queues with different limits new 8/1/07; 8dSebastien(25%) T4.3 Avoid scheduling prepareTo requests: bypass LSF when launching prepareTo requests (no scheduling decision needed) new 1/12/06; 5d 40% Felix (80%) medium, high, low importance

German Cancio (IT/FIO/FD) 11 Update on new+open tasks (4) TG5, TG6 (Protocols and Security) TaskStart;duration progress Who (%) T5.1 RFIO common project: provide CASTOR plugin for new modular RFIO developed by IT/GD; adapt CASTOR code base work in progress 1/8/06; 45d 75% Giulia(40%), Giuseppe(20%) T5.2 GridFTP: add support for GridFTP2 as internal+external CASTOR-2 protocol work in progress 1/9/06; 40d 90% Victor(90%),Rosa (60%) T5.3 XROOTD: assist xrootd developers in integrating/testing support for XROOTD as internal CASTOR-2 protocol work in progress 1/4/06; 20d 90% Rosa(20%) T6.1 Strong authentication: Finish the implementation of strong authentication in CASTOR-2; define a rollout plan rescheduled 8/1/07; 40d 5% Rosa(30%) T6.2 VOMS support: Add VOMS roles/groups for authorization purposes. new 1/3/07; 25dRosa(15%) medium, high, low importance

German Cancio (IT/FIO/FD) 12 Update on new+open tasks (5) TG7, TG8 (Components to improve/change/rewrite) TaskStart;duration progress Who (%) T7.2 DLF archiving GUI: enhance DLF GUI for managing DLF archives postponed 1/10/07; 10dDennis(20%) T7.3 SRM-2 updates, deployment: modify existing SRM-2 implementation as agreed by WLCG for post-SC4 work in progress 1/5/06; 40d 80% Shaun(30%), Giuseppe(4%) T7.4 New VDQM: finish development, test and deploy new VDQM postponed 8/1/07; 30d 10% Giulia(50%) T8.1 CLIPS to Perl transition: replace complex CLIPS-based expert system by a simpler perl script postponed 1/3/07; 2dSebastien(20%) T8.2 Job Dispatching: Redesign scheduler interface part of rmmaster1/2/07; 20dDennis(40%) T8.3 CUPV: redesign old/unmaintained user privilege validation module, reduce complexity and simplify authorization setup, consider using VOMS 1/4/07; 2m? (Technical student) medium, high, low importance

German Cancio (IT/FIO/FD) 13 Update on new+open tasks (6) TG9, TG10 (Tools, platforms) TaskStart;duration progress Who (%) T9.1 Repack for CASTOR-2: rewrite tape repacking tool using new CASTOR-2 API’s, address enhancement requests by castor-ops team work in progress 1/2/06; 136d 95% Felix(85%) T9.2 Makefiles and packaging: Provide a clean, modular, well- organised build/packaging schema work in progress 15/7/06; 50d 15% Giuseppe(30%) T9.3 Automation of releases and tests: automatic regular builds and execution of test suites using ETICS framework postponed 15/4/06; 20dGiuseppe(20%) T9.4 Recover tool for Repack2: Develop an easy-to-use tool for re- activating a repacked tape (in case repack went wrong) new 22/11/06; 8d 50% Giulia(40%) T10.1 Port of servers to 64bits: port CASTOR-2 server parts to 64bit (stager, scheduler, SRM, etc) work in progress 1/9/06; 40d 65% Rosa(20%) medium, high, low importance

German Cancio (IT/FIO/FD) 14 Update on new+open tasks (7) TG11,TG12 (Documentation, Tape optimizations) TaskStart;duration progress Who (%) T11.3 Guides: Write a developer guide1/10/07; 15d? T11.4 Articles: publish articles on CASTOR-2 (architecture, deployment, HA,..) postponed 8/1/07; 10dGiuseppe(25%) T12.1 Optimization of file recalls: investigate how to optimize tape recalls (e.g. ordering of file recalls within a tape, use/tune recall policies, etc) new 8/1/07; 20dMiroslav(30%) T12.2 Support for small files: investigate ways of grouping small files for optimized tape usage new 1/5/07; 30dMiroslav(30%) medium, high, low importance

German Cancio (IT/FIO/FD) 15 Update on new+open tasks (8) Ongoing activities (all TG) TaskWho (%) T9.5 Support for Repack2 Giulia(8%) T11.2 Maintain CASTOR web site: Keep web site updated Rosa(10%) T13.1 Project Management Sebastien(20%) T13.2 Tier-1 Support in particular CNAF Giuseppe(20%) T rd -level support all T13.4 Miscellaneous bugfixes all T14.1 Castor releases Giuseppe(7%), Rosa(13%) T14.1 Meetings and workshops all

German Cancio (IT/FIO/FD) 16 Comments, questions?