Long-Term Sustainability: Services & Data – A User (Support) View

Presentation transcript:

Long-Term Sustainability: Services & Data – A User (Support) View
International Collaboration for Data Preservation and Long Term Analysis in High Energy Physics
EOSC Pilot: Science Demonstrator
Jamie.Shiers@cern.ch
Slides available at https://indico.cern.ch/event/643419/

LEP / (HL-)LHC Timeline
- Database / data management support, CERN Program Library, Distributed Computing
- DM R&D, DBs, WLCG, EGI
- Major Data Migrations(!)
- ESFRI roadmap as "landmark project"
- Robust, stable services over several decades
- Data preservation and re-use over similar periods
- "Transparent" and supported migrations
- SA3 – Services for Heavy User Communities

"Data Preservation" Demonstrator
- Goal is to demonstrate "best practices" regarding data management and their applicability to LTDP + "open" sharing + re-use
- PIDs for data & meta-data stored in TDRs; DOIs for documentation; S/W + environment
- Equivalent to the CERN Open Data Portal but using "open" – i.e. non-HEP – solutions
- These all exist and are "advertised" in some form
- But there are "questions" around: Services; Resources; Long-Term Support (& Migration)… as well as Cost of Entry / "Ownership"
- See www.eiroforum.org/downloads/20170425_federated-scientific-data-hub.pdf

Example Services – LTDP
- Trustworthy Digital Repository (TDR)
  - HEP: CERN CASTOR+EOS (ISO 16363)
  - Non-HEP: EUDAT (?) (DSA / WDS)
  - Issues: how to get access to even modest resources?
- PID / DOI systems
  - Issues: "long-term" support; availability of services (a minimal DOI-resolution sketch follows this slide)
- Digital Library
  - HEP: CERN Document Server, INSPIRE-HEP (Invenio-based)
  - Non-HEP: B2SHARE, Zenodo (Invenio-based)
  - Issues: CERNLIB documentation example (20 years)
- Software + Environment (+ build system)
  - HEP: CVMFS, CernVM
  - Non-HEP: ditto
  - Issues: "Software without environment is just bad documentation"
For a user (community) to go "shopping around" to find the right services, resources and support is a (major?) challenge / impediment. More (and more complex) services needed to support data processing, distribution and analysis (full data lifecycle = WLCG4LHC). Domain-specific and generic services for Long-Term Preservation, Sharing & Re-use.
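As a concrete illustration of the "generic" PID/DOI building block listed above, here is a minimal sketch of resolving a DOI (e.g. a Zenodo/DataCite DOI) to machine-readable metadata via standard DOI content negotiation. The DOI used is a placeholder, not a real record.

```python
# Minimal sketch: resolve a DOI to citation metadata (CSL JSON) via doi.org
# content negotiation. The DOI below is a placeholder.
import requests

def resolve_doi(doi: str) -> dict:
    """Fetch citation metadata for a DOI using standard content negotiation."""
    resp = requests.get(
        f"https://doi.org/{doi}",
        headers={"Accept": "application/vnd.citationstyles.csl+json"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    meta = resolve_doi("10.1234/placeholder-doi")  # placeholder identifier
    print(meta.get("title"), meta.get("publisher"))
```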

What is (HEP) data? (And it's not just "the bits")
- Digital information: the data themselves, with volume estimates for preservation of the order of a few to 10 EB; other digital sources such as databases also to be considered
- Software: simulation, reconstruction, analysis and user code, in addition to any external dependencies
- Meta information: hyper-news, messages, wikis, user forums…
- Publications
- Documentation: internal publications, notes, manuals, slides
- Expertise and people

User requirements / expectations
- (Large) user requirements often exceed available resources / budgets (and existing resources are typically fully utilised): a negotiation phase is needed to converge
- Service expectations (e.g. max 10' downtime) are quasi-impossible to achieve: focus instead on response targets, critical services and reporting metrics
- Regular operations meetings defuse situations before they escalate
- How do we scale these "solutions" to large numbers of communities in an EOSC? Community-based support, e.g. for ESFRIs, is probably needed; WLCG could be a successful model to look at

The Worldwide LHC Computing Grid
- October 2016: 63 MoUs; 167 sites; 42 countries
- Tier0, Tier1s & Tier2s: O(1), O(10), O(100)
- CPU: 3.8 M HepSpec06 – if today's fastest cores, ~350,000 cores (see the arithmetic sketch below)
- Disk: 310 PB; Tape: 390 PB
- Snapshot of 2 Dec 2016: running jobs 441,353; active cores 630,003; transfer rate 35.32 GiB/s
- Don't underestimate the scale of the problem! Building a production grid at the scale of WLCG took the best part of a decade (and a significant amount of investment, including from the EU)
(Ian Bird)
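A back-of-the-envelope check of the core-count figure quoted above. The ~11 HS06-per-core value is an assumption chosen to reproduce the slide's "~350,000 cores"; actual per-core scores vary by CPU generation.

```python
# Convert total pledged CPU capacity (HS06) into an equivalent core count.
TOTAL_HS06 = 3.8e6          # WLCG CPU capacity, October 2016 (from the slide)
HS06_PER_FAST_CORE = 11.0   # assumed benchmark score of a "fast" core at the time

equivalent_cores = TOTAL_HS06 / HS06_PER_FAST_CORE
print(f"~{equivalent_cores:,.0f} cores")   # ~345,455 cores, i.e. roughly 350k
```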

WLCG Service Challenges
- As much about people and collaboration as about technology
- Getting people to provide a 24 x 7 service for a machine on the other side of the planet, for no clear reason, was always going to be hard!
- Regional workshops – both motivational and technical – plus daily Operations Calls
- In a grid, something is broken all of the time!
- Clear KPIs, "critical services" & response targets: measurable improvement in service quality despite ever-increasing demands
- Targets for response, intervention and resolution based on severity; monitored regularly – targets, not guarantees! (A sketch of such a target table follows.)
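A minimal sketch of how severity-based response / intervention / resolution targets might be encoded and checked. The structure follows the description above; the specific durations are hypothetical, not the actual WLCG targets.

```python
# Hypothetical severity-to-target table and a simple check against an incident.
from dataclasses import dataclass
from datetime import timedelta

@dataclass(frozen=True)
class Targets:
    response: timedelta      # acknowledge the incident
    intervention: timedelta  # start working on it
    resolution: timedelta    # restore the service

TARGETS = {
    "critical": Targets(timedelta(minutes=30), timedelta(hours=1),  timedelta(hours=4)),
    "high":     Targets(timedelta(hours=2),    timedelta(hours=4),  timedelta(hours=24)),
    "medium":   Targets(timedelta(hours=8),    timedelta(hours=24), timedelta(days=3)),
}

def met_response_target(severity: str, elapsed: timedelta) -> bool:
    """True if the incident was acknowledged within the target for its severity."""
    return elapsed <= TARGETS[severity].response

print(met_response_target("critical", timedelta(minutes=20)))  # True
```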

DMPs, the EOSC and ESFRIs
- An EOSC must support multiple disciplines; therefore we need a lingua franca, i.e. some way of getting them to talk to each other, and/or to the service providers!
- IMHO, DMPs could provide just that – even though the guidelines would need to be broadened to cover data acquisition, processing, distribution and analysis in more detail
- A DMP workshop for ESFRI(-like) projects has been proposed: to be re-scheduled now that EOSC goals / plans are more clear
- Synergies through DMPs: save time and money; better solutions!

Benefits of collaboration: LTDP
- The elaboration of a clear "business case" for long-term data preservation
- The development of an associated "cost model"
- A common view of the Use Cases driving the need for data preservation
- Understanding how to address Funding Agencies' requirements for Data Management Plans
- Preparing for Certification of HEP digital repositories and their long-term future
Collaboration (and not "control") is key to everything CERN does.

Director Generals' Viewpoints
- "Software/Computing should not limit the detector performance and LHC physics reach; the software must be easy to use and stable, so as not to hinder the fast delivery of physics results (and a possible early discovery…)" – CHEP 2004, Interlaken
- "To find the Higgs you need the Accelerator, the Detectors and the Grid!" – "Higgs discovery day", CERN, 2012

Services are (just) services
- No matter how fantastic our { TDRs, PID services, Digital Library, Software repository } etc. are, they are there to support the users
- The users have to do the really hard work! E.g. write the software and documentation, acquire and analyse the data, write the scientific papers
- Getting the degree of public recognition seen on the Higgs discovery day was a target KPI!

~30 years of LEP – what does it tell us?
- Major migrations are unavoidable but hard to foresee!
- Data is not just "bits", but also documentation, software + environment + "knowledge"; "collective knowledge" is particularly hard to capture
- Documentation was "refreshed" after 20 years (1995) – now in the Digital Library in PDF & PDF/A formats (was PostScript)
- Today's "Big Data" may become tomorrow's "peanuts": 100 TB per LEP experiment was immensely challenging at the time; now "trivial" for both CPU and storage
- With time, hardware costs tend to zero: O(CHF 1000) per experiment per year for archive storage
- Personnel costs tend to O(1 FTE) >> CHF 1000! Perhaps as little now as 0.1 – 0.2 FTE per LEP experiment to keep data + s/w alive – no new analyses included (see the cost sketch below)
- See the DPHEP Workshop on "Full Costs of Curation", January 2014: https://indico.cern.ch/event/276820/
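A rough illustration of why personnel, not hardware, dominates the long-term cost. The CHF 1000/year storage figure and the 0.1–0.2 FTE range are from the slide; the fully loaded cost of an FTE is an assumed round number.

```python
# Compare archive storage cost with the personnel cost of keeping data + s/w alive.
ARCHIVE_STORAGE_CHF_PER_YEAR = 1_000   # order of magnitude per LEP experiment (slide)
FTE_COST_CHF_PER_YEAR = 150_000        # assumption: fully loaded cost of 1 FTE
FTE_FRACTION = 0.15                    # middle of the 0.1 - 0.2 FTE range (slide)

personnel = FTE_FRACTION * FTE_COST_CHF_PER_YEAR
print(f"storage ~ CHF {ARCHIVE_STORAGE_CHF_PER_YEAR:,}/year, "
      f"personnel ~ CHF {personnel:,.0f}/year "
      f"({personnel / ARCHIVE_STORAGE_CHF_PER_YEAR:.0f}x more)")
```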

ODBMS migration – overview (300 TB)
- A triple migration!
  - Data format and software conversion from Objectivity/DB to Oracle
  - Physical media migration from StorageTek 9940A to 9940B tapes
  - Took ~1 year to prepare; ~1 year to execute
- Could never have been achieved without extensive system, database and application support!
- Two experiments – many software packages and data sets:
  - COMPASS raw event data (300 TB): data taking continued after the migration, using the new Oracle software
  - HARP raw event data (30 TB), event collections and conditions data: data taking stopped in 2002, no need to port the event-writing infrastructure
- In both cases, the migration was during the "lifetime" of the experiment
- System integration tests validating read-back from the new storage (see the sketch below)
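To make the read-back validation step concrete, here is a minimal, illustrative sketch (not the actual migration tooling): each migrated file is re-read from the new storage and its checksum compared against a manifest recorded before the migration. Paths and the manifest format are hypothetical.

```python
# Validate read-back after a storage migration against a pre-migration manifest.
import csv
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 8 * 1024 * 1024) -> str:
    """Stream a file from the new storage and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def validate(manifest_csv: Path, new_storage_root: Path) -> list[str]:
    """Return relative paths whose read-back checksum does not match the manifest."""
    mismatches = []
    with manifest_csv.open(newline="") as f:
        for rel_path, expected in csv.reader(f):   # rows: relative_path,sha256
            if sha256_of(new_storage_root / rel_path) != expected:
                mismatches.append(rel_path)
    return mismatches

if __name__ == "__main__":
    bad = validate(Path("pre_migration_manifest.csv"), Path("/new/storage"))
    print(f"{len(bad)} mismatching files")
```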

Open Science: A 5-Star Scale?
- We have a 5-star scale for Open Data (Sir Timothy Berners-Lee)
- We have a proposed 5-star scale for FAIR data management (+ TDRs) (Peter Doorn and Ingrid Dillo)
- How about a 5-star scale for "Open Science: Open to the World"?
- In the EOSC, "Open to the world" cannot mean no accounting, authorisation, access control etc.

What are the right metrics?
- As easy to use as Amazon?
- Cheaper (and better) than doing it in-house?
- A majority of ESFRIs use it as their baseline?
- "To find dark matter, you need the EOSC"?

"Data" Preservation in HEP
- The data from the world's particle accelerators and colliders (HEP data) is both costly and time-consuming to produce
- HEP data contains a wealth of scientific potential, plus high value for educational outreach
- Many data samples are unique; it is essential to preserve not only the data but also the full capability to reproduce past analyses and perform new ones
- This means preserving data, documentation, software and "knowledge"
- This requires different (additional) services and resources to those for analysis

What Makes HEP Different?
- We throw away most of our data before it is even recorded – "triggers"
- Our detectors are relatively stable over long periods of time (years) – not "doubling every 6 or 18 months"
- We make "measurements" – not "observations"
- Our projects typically last for decades – we need to keep data usable for at least this length of time
- We have shared "data behind publications" for more than 30 years… (HEPData)
- Measurements are repeatable: observations are not

CERN Services for LTDP
- State-of-the-art "bit preservation", implementing practices that conform to the ISO 16363 standard
- "Software preservation" – a key challenge in HEP, where the software stacks are both large and complex (and dynamic)
- Analysis capture and preservation, corresponding to a set of agreed Use Cases
- Access to data behind physics publications – the HEPData portal
- An Open Data portal for released subsets of the (currently) LHC data
- A DPHEP portal that links also to data preservation efforts at other HEP institutes worldwide
These run in production at CERN and elsewhere and are being prototyped (in generic equivalents) in the EOSC Pilot. Humble pie: services are just services – the real work is in using them! (A small access sketch follows.)
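As a small example of "using the services", here is a sketch of fetching a record's metadata from the CERN Open Data portal, assuming its Invenio-style REST endpoint /api/records/<id>; the endpoint pattern and the record id are assumptions for illustration only.

```python
# Fetch JSON metadata for one CERN Open Data portal record (illustrative sketch).
import requests

def fetch_opendata_record(record_id: int) -> dict:
    """Return the JSON metadata of a CERN Open Data portal record."""
    resp = requests.get(
        f"https://opendata.cern.ch/api/records/{record_id}",  # assumed Invenio-style API path
        headers={"Accept": "application/json"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    record = fetch_opendata_record(1)   # placeholder record id
    print(record.get("metadata", {}).get("title"))
```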

Bit Preservation: Steps Include
- Controlled media lifecycle: media kept for a maximum of 2 drive generations
- Regular media verification: when a tape is written, when it is filled, every 2 years… (a verification sketch follows)
- Reducing tape mounts: reduces media wear-out & increases efficiency
- Data redundancy: for "smaller" communities, a 2nd copy can be created in a separate library in a different building (e.g. LEP – 3 copies at CERN!)
- Protecting the physical link between disk caches and tape servers
- Protecting the environment: dust sensors! (Don't let users touch tapes)
- Constant improvement: reduction in bit-loss rate to 5 x 10^-16
- See "The Lost Picture Show: Hollywood Archivists Can't Outpace Obsolescence", IEEE Spectrum
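A minimal, illustrative sketch (not CERN's actual tape tooling) of the "regular verification" idea: a periodic scrub that re-checks archived files against stored checksums. The 2-year re-check interval comes from the slide; the manifest layout and file paths are hypothetical.

```python
# Periodic scrub of an archive manifest: re-verify files not checked recently.
import json
import time
import zlib
from pathlib import Path

RECHECK_SECONDS = 2 * 365 * 24 * 3600   # "every 2 years"

def adler32_of(path: Path) -> int:
    """Streaming Adler-32 of a file (a cheap, widely used integrity check)."""
    value = 1
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8 * 1024 * 1024), b""):
            value = zlib.adler32(chunk, value)
    return value

def scrub(manifest_path: Path) -> None:
    """Re-verify every file whose last check is older than RECHECK_SECONDS."""
    # Hypothetical manifest format: {"/archive/file1": {"adler32": 123, "last_check": 0.0}, ...}
    manifest = json.loads(manifest_path.read_text())
    now = time.time()
    for file_path, entry in manifest.items():
        if now - entry["last_check"] < RECHECK_SECONDS:
            continue                                   # checked recently enough
        if adler32_of(Path(file_path)) != entry["adler32"]:
            print(f"CHECKSUM MISMATCH: {file_path}")   # would trigger recovery from the 2nd copy
        entry["last_check"] = now
    manifest_path.write_text(json.dumps(manifest, indent=2))
```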

LTDP Conclusions
- As is well known, Data Preservation is a journey and not a destination
- Can we capture sufficient "knowledge" to keep the data usable beyond the lifetime of the original collaboration?
- Can we prepare for major migrations, similar to those that happened in the past? (Or will x86 and Linux last "forever"?)
- For the HL-LHC, we may have neither the storage resources to keep all (intermediate) data, nor the computational resources to re-compute them!
- You can't share or re-use data, nor reproduce results, if you haven't first preserved it (data, software, documentation, knowledge)
- Data preservation & sharing: required by Science and Funders

Big Data: From LEP to the LHC to the FCC
- "Big data" from hundreds of TB (LEP) to hundreds of PB (LHC) to (perhaps) hundreds of EB (FCC)
- FCC-ee option: "repeat" LEP in just 1 day!
- FCC-hh: 7 times the LHC energy, 10^10 Higgs bosons
- LEP ring: 27 km; FCC – several options, including a LEP-like machine
- More on the physics case and technical options in the May 2017 CERN Courier!