Documentation, Best Practices and Procedures: Roadmap (What's missing, what's hot)
Vera Hansper, CSC/NDGF
2017/12/7
Documentation
- Aims for “Documentation”?
- Topics under Documentation
- Site Certification
- WIKIs and Document Servers
- Tools
- Links and Collation
- Prioritisation
Aims
- Documentation is vital for a healthy infrastructure
- The aim is to have documentation that is
  - COHESIVE
  - CENTRALLY structured for ease of access
  - COMPLETE
  - INFORMATIVE
- It needs to be in good shape! We don't want
  - Fractured documentation
  - Widely dispersed documentation
  - Missing documentation
  - Poor/minimalistic documentation
Topics (I)
- Operational Documentation
  - Procedures
  - Best Practices
  - Training Guides
- Certification and creation guides
  - OLAs, SLDs
- WIKI Pages
  - Transferring GOCWIKI knowledge base
Op. Doc. : Procedures (I)
- Operational Procedures
  - Important for Grid Oversight (ROD) teams
    - Provide the day-to-day procedural information
  - Important for NGIs and sites
    - Describe the overall procedural relation between the parties involved in daily operations
    - Describe the role that COD plays in operations
Op. Doc. : what's needed (II)
- Currently the Operational Procedures Manuals are in bad shape
  - They have not progressed since the last EGEE release in April
  - Virtually no resources have been available to do so
  - They require updates to links, acronyms and the overall structure
- The information contained within these is still relevant
  - It needs to be presented as part of the current project (EGI)
  - This includes updates to mailing lists, new ROD teams, new links to new documents etc.
Op. Doc. : what's being done (III)
- A currently active team of 2 is looking into these manuals
  - Peter Slizik (SVABA, Slovakia)
  - Have started by reading the manuals and trying to understand the MediaWiki
  - Useful that Peter has already done some operations work
- We have also had an expression of interest in this from one other NGI
- ALL NGIs are welcome to contribute; for this part, we need about 8 volunteers
Op. Doc. : Best Practices (I)
- Provides tested methods for daily operations that
  - Have not made it to the Operational Documentation
  - Fall between the lines of standard procedures
  - Have recently been made relevant as a good method because of changes in tools or other documents
Op. Doc. : Best Practices (II)
- The Best Practices have been migrated to the EGI wiki
  - Have not been updated since Nov 2009
- A lot has changed since Nov 2009!
  - Links, acronyms, content
  - New material to add?
  - Modified procedures?
- The contributors are the Operators (ROD), COD, sysadmins, ...
- We need moderators and editors for this wiki!
Op. Doc. : Training Guides (I)
- Necessary to introduce new operators or operations teams from NGIs to the tools and procedures
- Useful as self-study/training tools OR as a training tool passed from an experienced operator to a new one
- Are effectively an executive summary of the Operational Documents merged with the Tools documents
Op. Doc. : Training Guides (II)
- Training Guides available
  - Dashboard HOWTO
  - Last Training Presentation (COD-22, January 2010, Need to Know)
- Currently the old copy is available at the CIC Portal
  - Needs to be urgently updated and made relevant for the current Operations Portal Dashboard
  - General structure of one guide is probably too complicated
  - Screenshots are out of date
Op. Doc. : Training Guides (III)
- Good to have!
  - Self-guided Training Guide to Operations
    - Can be used at general training sessions
  - DASHBOARD HOWTO
    - More in-depth guide to the dashboard
    - On par with a technical user manual
  - Operations Quick Guides
    - Something to give Operations people a quick checklist/cheat sheet
- Ideally undertaken by some volunteers who would also communicate with the Operations Portal team
Op. Doc. : Training Guides (IV)
- Should be easy to edit for local (NGI) operational procedures
- These guides aren't the last word on operations BUT they can be used by NGIs to create their own versions, in their own language, if necessary
TOPICS (II) : Certification Guides
- Certification is a necessity to ensure a good understanding between parties in the infrastructure
  - NGI creation
  - Site certification
  - NAGIOS validation
- Agreements between parties are part of certification and creation
  - OLAs, SLDs, SLAs ...
Certification Guides (II)
- Site Certification:
  - A small team of volunteers is currently in correspondence
  - Some contributions from their NGIs will make up the bulk of the manual
  - Some discussion is needed to create a viable document
  - More intensive effort is expected over the next few weeks
Certification Guides (III)
- NAGIOS validation -> CERN wiki
- Site Cert -> EGI WIKI (in progress)
- NGI Creation -> EGI WIKI
WIKI Pages
- The EGI wiki is THE place to go!
- The wiki is a powerful tool to collate and share knowledge and information
  - Easy to insert and edit information
- Currently there are TWO wikis (EGI and GOC)
  - The last GOCWIKI updates date back to 2009 or earlier
  - Having two wikis fractures the content and confuses the end user
- Best to move useful information from the GOCWIKI to EGI
  - Contributors to GOCWIKI can contact the documentation task to move useful information
TOOLS Documentation
- Tools necessary for operations
  - GGUS, Operations Portal, GOCDB, GSTAT, NAGIOS
  - Each has its own areas/URLs/wiki entries
  - Each has separate documentation
- How to create a more cohesive picture of the tools?
  - There is at least a collection of links already set up in the wiki
  - Quick sheets and training guides are useful here
Collating Miscellaneous Parts
- TOOLS Documentation
  - Tools for operations: GGUS, Operations Portal, GOCDB, GSTAT, NAGIOS
  - A lot of the collation of these occurs in the Operations Manuals
- Other documentation to be collated
  - OLD CIC Portal areas
    - Documents can be moved to the most applicable area
    - Can prune irrelevant/old documents
  - GOCWIKI
    - As in CIC Portal
  - Middleware
    - They provide the documentation; we should have good links!
Links and things
- Links to documentation collated in one area
  - The wiki is a good place for the links
  - The wiki is a good place for evolving or living documents
  - The document server is a good place for the stable documents
- ALSO
  - Replication in some cases is a good idea, although ...
  - It's important to have ONE place where people can find information, even if it is found in multiple areas, i.e. ONE reference area
Hot Topics
- Site Certification: REALLY URGENT
  - Process for creating the manual has started
  - Needs a phone conf. to put the pieces into a better format
- Operations Manuals
- OLAs: QUITE URGENT
  - Draft version of an OLA is already prepared
  - Questionnaire regarding OLAs in general has been circulated
- Nagios: URGENT
  - Important that the links to this documentation are well disseminated
Matters that are missing
- What documentation is still missing?
  - What about TPM?
  - Other User support (out of scope?)
- Better dissemination of the current documentation is required!
Prioritisation/Roadmap
- Operations Manuals
- Site Certification Process
- Best Practices
- OLAs
- NGI creation (quite mature)
- Tools Use (Nagios, Dashboard)
  - Including Training guides
QUESTIONS? (Operational Documentation) s#Validation_Process