gLite Status
Stephen Burke, RAL
GridPP 13 - Durham


July 6th 2005, gLite Status

Overview
gLite releases
gLite deployment
WMS
DMS
R-GMA
VOMS
Outstanding issues
E&OE!

Releases

gLite releases so far
Release 1.0 on April 5th
  – Released to meet deadline
  – WMS + CE + Fireman + gLite i/o + R-GMA + VOMS
  – AliEn, GAS and package manager gone
  – Several things missing or not working well
      – No SE in gLite
  – Documentation is reasonable
Release 1.1 on May 12th
  – First versions of File Transfer Service (FTS), metadata catalogue
  – Secure file catalogues
  – Bug fixes

Future releases
Release 1.2 should have been on June 1st
  – Delayed to end of June, now expected late July
      – Was expected to be in LCG July release
      – Have gLite R-GMA and VOMS as LCG upgrades
Final gLite release (2.0) for EGEE 1 by end of the year
  – Updated architecture/design/workplan documents
  – Code freeze October (?)
  – Maybe a 1.3 release (August?), time is tight

Timelines
[Timeline figure. Axis marks: June 2005, October 2005, November 2005, December 2005, March 2006. Labels as extracted: TODAY, Release 1.2, Release 2.0, Xmas vacation, Integrated 2.0, Func. freeze, Final Report, Mid Dec., Func. freeze ?, Review, End of EGEE 1]
Consequences
  – ~2.5 months of development left
  – Probably only 1 or 2 releases between 1.2 and 2.0
  – Focus on consolidation of 1.2 and little improvements as requested from applications
  – Very careful in introducing new services

Release priorities
Driven by service challenges
  – Especially data management
  – LCG Baseline Services document
No time to change anything for EGEE 1
EGEE PTF disbanded
  – Not seen as effective
  – Who collects requirements?
  – Do non-LCG VOs have influence?

Deployment

gLite deployments – JRA1
gLite prototype system
  – Used by ARDA team, biomed, some others
  – Very small, basically just CERN
  – Not properly maintained
JRA1 testing testbed
  – Was CERN, RAL and NIKHEF
  – Two sites + manpower added at Imperial
      – One person subtracted at CERN
  – Still small and under-resourced
  – Releases are not sufficiently tested
      – 928 open bugs in Savannah, 84 critical
      – 281 ready for test, but no time to test!

gLite deployments - LCG
Pre-production system now being installed
  – ~8 sites so far – more coming
      – None in UK?
  – Currently a pure gLite system
      – Role seems to change from week to week!
  – Partly working but many problems
  – Some users allowed in soon (now?)
Production system
  – Various plans considered
  – LCG 2.6 has R-GMA and VOMS
  – Next steps unclear (to me at least!)

Status as of release 1.1

Workload management
Broker is a development of the EDG/LCG RB
  – Seems to be largely backward-compatible
  – Main new feature is DAGMAN (composite jobs)
  – Push and pull job submission
  – No web services
Hybrid info system (CEMON + BDII)
  – Static configuration of WMS-CE relationships
  – Should change to R-GMA (?)
Condor-C replaces Globus gatekeeper on CE
  – Several security problems
  – Current performance is poor
      – Submissions often fail
      – Cryptic error messages
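
As an illustration of what "composite jobs" means here: a composite job can be pictured as a directed acyclic graph in which each sub-job waits for the sub-jobs it depends on. The sketch below is not gLite JDL or DAGMAN syntax. The sub-job names and the `submission_order` helper are hypothetical. It only shows how a dependency graph fixes a valid submission order, using Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical composite job: each key is a sub-job, each value is the set of
# sub-jobs that must finish before it can start.
composite_job = {
    "stage_in": set(),
    "simulate": {"stage_in"},
    "reconstruct": {"stage_in"},
    "merge": {"simulate", "reconstruct"},
}


def submission_order(dag):
    """Return one valid order in which the sub-jobs could be submitted."""
    # static_order() yields every node after all of its predecessors and
    # raises graphlib.CycleError if the graph is not acyclic.
    return list(TopologicalSorter(dag).static_order())


order = submission_order(composite_job)
print(order)
```

In this example `stage_in` always comes first and `merge` last. The two middle sub-jobs are independent of each other, so either order is valid and a real scheduler could run them in parallel.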

Data Management
First version of metadata catalogue
  – No command-line clients yet, MySQL only
Fireman file catalogue
  – Competes with new LCG File Catalogue
  – Various experiment-specific solutions
gLite i/o
  – Security model still under debate (delegation, file ownership)
  – Doesn't yet work with dCache or DPM SRMs, only Castor!
FTS – developed for service challenges
  – Point-to-point reliable file transfer
  – No interaction with Fireman catalogue
      – No File Placement Service (FPS) yet, hence no replication!
No Data Scheduler
Interaction with WMS still under discussion

R-GMA
Should be an information system
  – But both LCG and gLite still use BDII
New Service Discovery API
  – Still discussing service types and names
LCG now making substantial use of R-GMA for monitoring, accounting etc
  – Lots of pressure to fix bugs!
  – Some stability problems, needs more testing
      – Not ideal to test in production, but …
  – Seems generally in a good state

Security
gLite VOMS server now used by LCG
  – Some problems with gLite installation scripts
WMS and DMS have limited support for VOMS
  – SRM, Condor-C and R-GMA don't yet
Many test VOMS servers exist, but still not in production
  – Will probably need a long learning period to get the best use of VOMS
  – Not a panacea!
Security requirements mostly still not being addressed
  – Most date back to the start of EDG
Many known security vulnerabilities

Outstanding Issues

General
Error messages, logging and fault-tolerance
  – Still very poor
      – Proposal on common error handling by Steve Fisher
Configuration
  – gLite has a common config tool (python/XML)
  – Underlying config not unified
  – Still complex, fragile and error prone
  – Not clear if LCG will switch
      – May get many layers - YAIM -> XML -> m/w specific config files?
Monitoring
  – Getting better – but all from LCG, not in gLite
Single points of failure
  – Still have many, but some positive movement

WMS
Job submission rate too slow
  – Not tested (?), but probably no change
Failover (RB goes down -> jobs lost)
  – No change so far
Bulk job submission
  – Partial support via DAGs
  – Parameterised jobs coming
Space management on WNs
  – Not being addressed
Access to output from running jobs
  – Not yet
Advance reservation
  – Some work, but not yet available
Interaction with data management (pre-staging)
  – Discussion but nothing yet
CPU speed, memory etc requirements not passed to batch system
  – May appear in future
Job distribution is poor (ERT etc)
  – Partly addressed by new Glue schema
  – Still no direct support in broker

DMS
Need a metadata solution
  – Much discussion, seems to be converging
File catalogue performance, bulk operations
  – Partly addressed by Fireman, LFC
  – LFC seems to have better performance but no bulk operations
Catalogue replication
  – Oracle replication by LCG
  – gLite working towards local catalogues
Small files
  – Not being addressed
Reliable file replication
  – Partly addressed by FTS, need FPS as well
File pinning
  – Not yet in SRMs or FTS
Posix file access
  – May be addressed by gLite i/o
  – Security model unclear
High level data management
  – Not yet (wait for Data Scheduler in 2.0)

Information systems
Not many issues!
Glue schema not ideal
  – Minor update just released
  – Maybe new major version in ~1 year?
Stability, scalability
  – Need to test in production - test systems too small

Security
VO management, groups and roles
  – Should come with VOMS
VO policies for CEs
  – Some tools (LCAS, LCMAPS)
  – Needs experience
ACLs on files
  – Should come with gLite File Access Service (FAS)
  – Not ready yet
  – Need to check security model satisfies sites
  – No support in SRM yet
No outbound IP access
  – Some discussion, nothing yet
Secure file management
  – Not needed for HEP, but strong need for biomed
  – Some work, not there yet
Quotas
  – Some work on measurement
  – Enforcement?
Vulnerabilities
  – Many known, little work
  – New group (Linda Cornwall)

Summary
First gLite releases are out, but are buggy and incomplete
Next release is late, not much time to the end of EGEE 1
Many long-standing issues not addressed
  – Developers tend to follow their own interests rather than user/sysadmin needs
  – Functionality is less than at the end of EDG!
Probably still >~ 1 year to get production quality
  – OK for EGEE if EGEE 2 is approved
  – Mismatch with LCG timescale
LHC experiments are building their own Grids
  – How much of gLite do they need?
Who decides requirements and priorities?