Mirjam van Daalen, (Stephan Egli, Derek Feichtinger) :: Paul Scherrer Institut Status Report PSI PaNDaaS2 meeting Grenoble 6 – 7 July 2016.

Presentation transcript:


Current projects at PSI: Data Analysis Service, data policy, remote access, metadata catalogue, petabyte archive, remote data transfer.

Covering larger parts of the life cycle

Project Overview: Data Analysis Service
SUK Project 142-004
Project managers: Dr. Stephan Egli, Dr. Derek Feichtinger, Paul Scherrer Institut
Partner: ETHZ
Project duration: 04.2015 - 04.2017
Financing support (50% matching funds): CHF 1'618'000
Team members: 16 (including 3 new positions financed by the project)
Work packages (numbers in parentheses give the number of project members involved):
  WP1: Common tools and services (4)
  WP2: Data analysis environments for major use cases (3)
  WP3: Identity management, DUO, authentication and authorization (3)
  WP4: Integration and development of scientific analysis codes (2)
  WP5: Procurement, installation and operation of analysis cluster infrastructure (4)
  WP6: Infrastructure sharing with other institutions (3)
  WP7: Project management (2)

DaaS Project Status
Main purpose: provide an integrated solution that lets all SLS users do offline analysis of data taken at SLS (and later at SwissFEL).
Cluster of moderate size (~900 cores, 2 PB storage).
Three persons hired who are dedicated to this project.
About 50% of the planned hardware is currently installed and in operation.
Now in the test phase with internal users; the system and software are being adjusted to the concrete use cases of these users. Feedback so far has been very good.
Next phase: add external users.
Storage upgrade planned to a total of about 3 PB by mid-2017.
Option to extend the cluster with "dedicated" resources (for paying customers), but within the same infrastructure and using centrally provided hardware choices.

Data Policy Status
Data policy based on the PaNdata framework.
A draft exists.
Embargo period of 3 years, with easy extension to 5 years.
Should be adopted by the PSI directorate in August 2016.
Implementation from the end of August 2016.
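The embargo rule above can be illustrated with a small date calculation. This is only a sketch: it assumes the embargo is counted from the data collection date and that an extension may be any whole number of years between 3 and 5, neither of which is stated in the slides. The function and constant names are hypothetical.

```python
from datetime import date

DEFAULT_EMBARGO_YEARS = 3  # from the slide: embargo period 3 years
MAX_EMBARGO_YEARS = 5      # from the slide: extension to 5 years


def embargo_end(collected: date, years: int = DEFAULT_EMBARGO_YEARS) -> date:
    """Return the date on which the embargo ends (assumed start: collection date)."""
    if not DEFAULT_EMBARGO_YEARS <= years <= MAX_EMBARGO_YEARS:
        raise ValueError("embargo must be between 3 and 5 years")
    try:
        return collected.replace(year=collected.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        return collected.replace(year=collected.year + years, day=28)
```

For data collected on 6 July 2016, the default call gives 6 July 2019, and `embargo_end(date(2016, 7, 6), years=5)` gives 6 July 2021.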

Remote Access
Use cases: online and offline analysis, remote measurements, shift operation, sharing of sessions for support tasks, sharing of sessions for collaboration.
Support for 3D hardware acceleration.
Access to the beamlines and to the DaaS cluster through a common gateway.
Architecture based on the separation of "server" and "node" processes in NoMachine software version 5.
Added a graphical management tool to define (time-based) access to beamline resources and the offline compute cluster, with role-based management delegated to the person responsible for each resource.
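The time-based access idea can be sketched as a list of grants, each valid for one user, one resource and one time window. This is not the PSI management tool or any NoMachine API; the `AccessGrant` class, the `may_access` function and the resource names are hypothetical and serve only to show the check.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable


@dataclass(frozen=True)
class AccessGrant:
    """One time-limited permission for a user on a resource (hypothetical model)."""
    user: str
    resource: str   # e.g. a beamline console or the offline cluster
    start: datetime
    end: datetime   # exclusive upper bound


def may_access(grants: Iterable[AccessGrant], user: str,
               resource: str, when: datetime) -> bool:
    """True if any grant covers this user, resource and point in time."""
    return any(
        g.user == user and g.resource == resource and g.start <= when < g.end
        for g in grants
    )
```

A person responsible for a resource would, in this model, simply add or remove `AccessGrant` entries for that resource.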

Data Catalogue
Currently comparing two possible approaches:
  1. ICAT/TOPCAT combination
  2. Approach based on NoSQL document databases (MongoDB), taking advantage of recent developments in middleware (StrongLoop/LoopBack) and component-based graphical user interfaces (Angular 2)
The approaches are compared with respect to:
  Ease of data ingestion
  Potential to integrate into existing IT infrastructures and storage/archive systems
  Flexibility to cope with the many and growing requirements from different facilities, research groups, beamline instruments and experimental methods in the areas of data queries, data display and data analysis
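The flexibility argument for a document database can be shown with plain Python dictionaries: records from different instruments carry different metadata fields, yet one query function covers them all. The sketch below does not use MongoDB or LoopBack; the field names, sample records and the `find` helper are invented for illustration and only imitate a simple equality filter on dotted keys.

```python
_MISSING = object()  # sentinel for "field not present in this document"

# Hypothetical records: each instrument stores its own metadata fields.
DATASETS = [
    {"owner": "alice", "instrument": {"name": "tomography", "energy_keV": 20}},
    {"owner": "bob", "instrument": {"name": "diffraction", "wavelength_A": 1.0}},
    {"owner": "alice", "instrument": {"name": "diffraction", "wavelength_A": 0.7}},
]


def get_path(doc, dotted_key):
    """Follow a dotted key such as 'instrument.name' through nested dicts."""
    current = doc
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def find(docs, query):
    """Return the documents whose fields equal every value in the query."""
    return [d for d in docs
            if all(get_path(d, key) == value for key, value in query.items())]
```

For example, `find(DATASETS, {"owner": "alice", "instrument.name": "diffraction"})` returns only the third record, and adding a new instrument with new fields requires no schema change.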

Interactive and Batch Data Analysis
Support for interactive data analysis (e.g. Matlab) on the cluster; nodes can be reserved for interactive work.
Standard batch processing based on Slurm.
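A Slurm batch job is described by a shell script whose `#SBATCH` comment lines carry the resource request. The sketch below only generates such a script as text; the job name, command and partition are placeholders, and nothing here reflects the actual configuration of the PSI cluster.

```python
def make_sbatch_script(job_name, command, ntasks=1,
                       time_limit="01:00:00", partition=None):
    """Build the text of a minimal Slurm batch script (illustrative only)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --ntasks={ntasks}",
        f"#SBATCH --time={time_limit}",
        "#SBATCH --output=%x-%j.out",  # %x = job name, %j = job ID
    ]
    if partition is not None:
        # Partition names are site-specific; this value is a placeholder.
        lines.append(f"#SBATCH --partition={partition}")
    lines += ["", command, ""]
    return "\n".join(lines)
```

The resulting text would be saved to a file and submitted with `sbatch`, for example after `make_sbatch_script("reco", "python reconstruct.py", ntasks=4)`, where `reconstruct.py` stands for a user's own analysis code.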

Petabyte Archive
PSI must prepare to archive the large amounts of data expected from SLS and SwissFEL over the next decades.
Strategic collaboration between PSI and the Swiss National Supercomputing Center (CSCS) in Lugano to build a petabyte tape archive solution at CSCS.
Project initiated by a PoC within the DaaS project.
Volume increase driven by detector and instrumentation advances.
Planning to leverage IBM Spectrum Scale (GPFS) AFM technology for asynchronous data transfers between the sites.
[Figure: Data growth for high data volume beamlines in 2016]
[Figure: Estimated projections for yearly data production]

Remote Data Transfer
Support for rsync/scp and GridFTP (Globus Online).
Also evaluated the Aspera solution from IBM; it could be added, but only if a (paying) customer requests it.
Integration with the long-term archive will create additional requirements.
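For the rsync option, a transfer over SSH can be expressed as an argument list that a script then hands to the operating system. The sketch below only builds that list and does not run it; the user, host name and paths are placeholders, not PSI endpoints.

```python
def rsync_command(source, user, host, dest, compress=False):
    """Build an rsync-over-SSH argument list (illustrative, not executed here)."""
    args = [
        "rsync",
        "--archive",   # preserve permissions, times and directory structure
        "--partial",   # keep partially transferred files so a retry can resume
        "--rsh=ssh",   # use SSH as the transport
    ]
    if compress:
        args.append("--compress")  # useful on slow links, less so for compressed data
    args += [source, f"{user}@{host}:{dest}"]
    return args
```

The list can be passed to `subprocess.run(cmd, check=True)`. Passing arguments as a list rather than a single shell string avoids quoting problems with unusual file names.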

Possible Collaborations
Remote access.
Lossy compression: the TOMCAT group at SLS is developing solutions here. Should be driven by the scientific community.
Umbrella federated identity management.

Wir schaffen Wissen – heute für morgen (We create knowledge, today for tomorrow)
My thanks go to Stephan Egli, Derek Feichtinger and Gerd Mann.