Mirjam van Daalen, (Stephan Egli, Derek Feichtinger) :: Paul Scherrer Institut Status Report PSI PaNDaaS2 meeting Grenoble 12 – 13 December 2016.

Presentation transcript:

Mirjam van Daalen, (Stephan Egli, Derek Feichtinger) :: Paul Scherrer Institut Status Report PSI PaNDaaS2 meeting Grenoble 12 – 13 December 2016

Current projects at PSI
- Data Analysis Service
- Data Policy
- Remote access
- Metadata catalogue
- Petabyte archive
- Remote data transfer
PSI, 10. April 2019

Covering larger parts of the life cycle

Project Overview: Data Analysis Service
- SUK Project 142-004
- Project Manager: Dr. Stephan Egli, Dr. Derek Feichtinger, Paul Scherrer Institut
- Partner: ETHZ
- Project Duration: 04.2015 - 04.2017
- Financing Support (50% matching funds): CHF 1'618'000
- Team members: 16 (including 3 new positions financed by project)
Workpackages
- WP1: Common Tools and Services
- WP2: Data Analysis Environments for major use cases
- WP3: Identity Management, DUO, Authentication and Authorization
- WP4: Integration and development of scientific analysis codes
- WP5: Procurement, installation, operation of analysis cluster infrastructure
- WP6: Infrastructure sharing with other institutions
- WP7: Project Management

DaaS Project Status
- Main purpose: provide an integrated solution for all SLS users to do offline analysis of data taken at SLS (and later SwissFEL)
- Cluster of moderate size (~900 cores, 2 PB storage)
- Hired 3 people dedicated to this project
- Currently about 50% of the foreseen hardware is installed and in operation
- Now in test phase with invited external users and internal users
- Adjusting the system and software according to concrete use cases of these users; so far very good feedback
- Planning a storage upgrade to a total of about 3 PB by mid 2017
- Option for extending the cluster with “dedicated” resources (for paying customers), but within the same infrastructure and using centrally provided hardware choices

Data Policy Status
- Data Policy based on the PaNdata framework
- Policy was adopted by the Directorate in October 2016
- Policy applies not only to the large research facilities at PSI, but to all research activities
- Embargo period of 3 years, with easy extension to 5 years
- Implementation will be a long-term effort, carried out stepwise per facility and beamline
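The embargo rule on this slide (3 years, extendable to 5) can be illustrated with a small date calculation. This is only a sketch: the function names, the choice of the acquisition date as the starting point, and the handling of 29 February are assumptions, not part of the PSI policy text.

```python
from datetime import date

# Periods as stated on the slide; the constant names are hypothetical.
DEFAULT_EMBARGO_YEARS = 3
EXTENDED_EMBARGO_YEARS = 5


def embargo_end(acquired: date, extended: bool = False) -> date:
    """Return the date on which the embargo ends.

    Assumption: the embargo starts on the acquisition date.
    """
    years = EXTENDED_EMBARGO_YEARS if extended else DEFAULT_EMBARGO_YEARS
    try:
        return acquired.replace(year=acquired.year + years)
    except ValueError:
        # 29 February mapped onto a non-leap year: fall back to 28 February.
        return acquired.replace(year=acquired.year + years, day=28)


def is_public(acquired: date, today: date, extended: bool = False) -> bool:
    """True once the embargo period has elapsed."""
    return today >= embargo_end(acquired, extended)
```

Under these assumptions, data acquired on 12 December 2016 would leave the default embargo on 12 December 2019, or on 12 December 2021 if the extension is granted.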

Remote Access
- Use cases: online and offline analysis, remote measurements, shift operation, sharing of sessions for support tasks, sharing of sessions for collaboration
- Support for 3D hardware acceleration
- Access to the beamlines and to the DaaS cluster through a common gateway
- Architecture based on separation of “server” and “node” processes of the NoMachine software, version 5
- Added a graphical management tool to define (time-based) access to beamline resources and the offline compute cluster, with role-based management delegated to the person responsible for each resource

Data Catalogue
- Decision for an approach based on NoSQL document databases (MongoDB), taking advantage of recent developments in middleware (Loopback) and component-based graphical user interfaces (Angular2)
- Need to cover an extended set of use cases and long-term evolution at PSI and ESS; flexibility of the solution is therefore mission critical
- Currently preparing the production environment for data ingestion and recruiting for a developer position (3 years, 2017-2020)
- First 3 beamlines should be connected within the DaaS project timeframe, in spring 2017
- This is also a decision for continued collaboration with the ICAT community, e.g. working on a common API to aim for interoperability of current and future products
- Evaluate the potential to develop software (components) which can be used in a “product”-independent fashion
- Open to further suggestions…
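To make the document-database approach concrete, the sketch below shows what a dataset record might look like as a JSON-style document, together with a minimal check before ingestion. All field names and values are hypothetical and are not taken from the PSI catalogue schema; the point is only that a fixed core of fields can sit alongside free-form, beamline-specific metadata.

```python
import json

# Hypothetical core fields that every dataset document must carry.
REQUIRED_FIELDS = {
    "pid": str,
    "owner": str,
    "beamline": str,
    "creation_time": str,
    "size_bytes": int,
}


def validate_dataset(doc: dict) -> list:
    """Return a list of problems; an empty list means the core fields are fine."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in doc:
            problems.append(f"missing field: {name}")
        elif not isinstance(doc[name], expected):
            problems.append(f"wrong type for {name}: expected {expected.__name__}")
    return problems


# Illustrative record; the nested block is deliberately schema-free.
example = {
    "pid": "example-0001",
    "owner": "example-group",
    "beamline": "example-beamline",
    "creation_time": "2016-12-12T10:00:00Z",
    "size_bytes": 1024,
    "scientific_metadata": {"energy_keV": 12.4, "sample": "test sample"},
}

if __name__ == "__main__":
    print(validate_dataset(example))
    print(json.dumps(example, indent=2))
```

Keeping the facility-specific part in a nested, unvalidated block is one way to get the flexibility the slide asks for, at the cost of weaker guarantees about what that block contains.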

Interactive and Batch Data Analysis
- Support for interactive data analysis (e.g. Matlab) on the cluster; nodes can be reserved for interactive work
- Standard batch processing based on Slurm
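As an illustration of Slurm-based batch processing, the sketch below renders a minimal batch script that could be submitted with `sbatch`. The job name, command and resource values are placeholders; the actual partitions, limits and conventions of the PSI cluster are not described in the slides and are not assumed here.

```python
def render_batch_script(job_name: str, command: str,
                        ntasks: int = 1, time_limit: str = "01:00:00") -> str:
    """Build the text of a simple Slurm batch script.

    Only widely used #SBATCH directives appear; %x and %j are Slurm's
    filename patterns for the job name and the job ID.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --ntasks={ntasks}",
        f"#SBATCH --time={time_limit}",
        "#SBATCH --output=%x-%j.out",
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical analysis command, for illustration only.
    print(render_batch_script("reconstruct", "python analyse.py run_0001.h5", ntasks=4))
```

The resulting text would typically be written to a file and passed to `sbatch`.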

Petabyte Archive
- PSI must prepare for archiving the large amounts of data expected from SLS and SwissFEL over the next decades
- Strategic collaboration of PSI with the Swiss National Supercomputing Center (CSCS) in Lugano to build a petabyte tape archive solution at CSCS
- Project initiated by a PoC within the DaaS project
- Volume increase driven by detector and instrumentation advances
- Planning to leverage IBM Spectrum Scale (GPFS) AFM technology for asynchronous data transfers between the sites
- Dataflow orchestration and packaging tools are being evaluated; the selected candidate is Arema from IBM
- Definition of interfaces from and to the data catalogue is ongoing

Remote Data Transfer
- Support for rsync/scp and gridftp (Globus Online)
- Also evaluated the Aspera solution from IBM; it could be added, but only if a (paying) customer requests it
- The integration with the long-term archive will create additional requirements
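For the rsync route, a transfer from a user's side might be assembled as in the sketch below. The host name, user and paths are placeholders, and the option set is just one reasonable choice; the slides do not specify the endpoints or the recommended flags.

```python
import shlex


def build_rsync_command(source: str, user: str, host: str, destination: str,
                        compress: bool = False) -> list:
    """Return an rsync argument list for a transfer over SSH.

    --archive keeps permissions and timestamps; --partial retains partially
    transferred files so an interrupted transfer can be resumed.
    """
    cmd = ["rsync", "--archive", "--partial", "--rsh=ssh"]
    if compress:
        cmd.append("--compress")
    cmd += [source, f"{user}@{host}:{destination}"]
    return cmd


if __name__ == "__main__":
    # Hypothetical endpoint; replace with the real transfer host.
    cmd = build_rsync_command("data/run_0001/", "alice",
                              "transfer.example.org", "/archive/run_0001/")
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems if it is later passed to `subprocess.run`.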

Wir schaffen Wissen – heute für morgen (We create knowledge, today for tomorrow)
My thanks go to Stephan Egli, Derek Feichtinger, Gerd Mann