Services for Sensitive Research Data Gard Thomassen, PhD Head of Research Support Services Group Leader of the ”Services for Sensitive Data” project University.

Presentation transcript:

Services for Sensitive Research Data Gard Thomassen, PhD Head of Research Support Services Group Leader of the ”Services for Sensitive Data” project University Center for Information Technology (USIT) University of Oslo

Outline
- What is sensitive data?
- Who has sensitive data?
- Project background
- Collaborators and reference group
- System requirements
- System outline
- Technical and security details
- Maintenance
- Advantages and current status
- International collaborations
Gard Thomassen, TSD 2.0

Who has sensitive data?
- Faculty of Medicine / Oslo University Hospital
- Faculty of Theology
- Faculty of Educational Sciences
- Faculty of Social Sciences
- And the list continues, also outside UiO

Project background
- UiO has an open network structure, but still with a high level of security
- Most of the UiO data is open
- Various UiO/OUS researchers approached USIT asking for an eInfrastructure for sensitive data (the majority was MR images and NGS data)
- The pilot project TSD 1.0 was run

Lessons learned
- The need for our services far exceeded the scalability of our system
- Too much hands-on maintenance and manual setup of new projects and new users
- There is a need for a High Performance Computing (HPC) resource within a secure environment
- Not very user friendly (at both ends)

Main collaborators on TSD 2.0
Collaborators
- Norwegian Storage Infrastructure (NorStore)
- Norwegian Genetics Analysis Platform (GenAP)
- Norwegian Dietary Registry (Faculty of Medicine)
- Institute of Psychology (Faculty of Social Sciences)
- Norwegian Cancer Genomics Consortium (NCGC)
Reference group
- Oslo University Hospital, NorStore, Regional Ethical Committee, National Institute of Public Health, Norwegian Cancer Registry, Research Network at OUS, Elixir Norway, NCGC, GenAP and Institute of Psychology, UiO

System requirements
- Security, isolation and access control as given by law
- Large storage capacity
- Multiple users
- High performance computing resource
- High bandwidth
- Easy to maintain
- Easy to use (including audio and video)
- Some freedom within user space
- Accessible from anywhere through authentication
- A variety of software and public DBs must be available
- Windows and Linux support (OS X if possible)
- Data collection service
- Data sharing service
- National scope (so far)

Solution outline (diagram not reproduced in this transcript)

System outline (diagram)
- Components shown: Internet, gateway, VM server, storage, HPC (Colossus)
- Secure encrypted network to special high-volume data production sites
- Cardinality labels in the figure: 1 (project), 1 (storage area), n, 1

Using TSD 2.0 for analysis (diagram)
- Labels shown: User B1 P1, User B2 P1, GW, VM B1 P1, VM B2 P1, P1 TSD disk, P1 DB, Colossus front end, Colossus, Colossus disk

Data import and export using the TSD "sluice server" (diagram)
- Labels shown: virtual "sluice server", virtual project server, "sluice HD", project HD, NFS mount
- Data is copied to the sluice by ssh+scp or web-drive (2-factor authentication); the data is encrypted if sensitive

Data collection using TSD "Nettskjema" (diagram)
- Labels shown: minID, encrypted XML (PGP), import mechanism, project VM, project disk

Data import for NGS centers and other large-scale data producers (diagram)
- Labels shown: HiSEQ, TSD-controlled box on-site, /tmp/ storage, encrypted connection, GW, project VM, project disk

Technical outline: closed network at USIT (diagram)
- Admin services: provisioning system, AD, surveillance, software repo, Cfengine, vCenter, backup, antivirus, log service
- Storage / DBs: PostgreSQL, archiving, compartmentalized disk
- HPC resource
- Management: mgmt of storage, mgmt of network, mgmt of hardware, mgmt of VMs
- Clients (2-factor login): remote desktop clients, thin clients on dedicated network, special network for large-scale data production centers
- Publicly available network segment through "minID": web questionnaire, web portal, electronic consent
- Project types: clinical health data projects, other sensitive data projects
- Access network: National Health network, terminal servers, thin client servers, VPN

Technical details
- KVM for virtualization (RedHat Linux)
- Cerebrum as provisioning (a USIT application)
- AD system administration guided by the provisioning system (duplicated)
- FreeBSD firewall and gateway (duplicated)
- Integration with IDporten (Norwegian governmental eID system) for web enquiries and applications
- Storage with separation between projects (Hitachi disk system and encrypted backup to tape)
- IPv6 on the inside (and private IPv4)

HPC resource: Colossus
- At present about 500 cores
- No project users are to log in on any nodes
- One global job daemon to control data integrity (to ensure project data separation)
- /tmp/ and /work/ will be per project and cleaned after each job finishes
- As similar to Abel as possible
- Separate disk and more nodes will come soon
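The per-project scratch handling described for Colossus can be illustrated with a small sketch. This is not the Colossus job daemon; `project_scratch`, the directory layout and the permission mode are assumptions made only for illustration. It shows a scratch directory that lives under a project's own area and is removed when the job ends, whether the job succeeds or fails.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def project_scratch(base: Path, project: str, job_id: str):
    """Yield a scratch directory at base/project/job_id and delete it afterwards.

    Hypothetical helper: the path layout is an assumption, not the TSD layout.
    """
    scratch = base / project / job_id
    # The 0o700 mode applies to the final directory only (subject to umask).
    scratch.mkdir(parents=True, mode=0o700)
    try:
        yield scratch
    finally:
        # Clean up even if the job raised an exception.
        shutil.rmtree(scratch, ignore_errors=True)
```

A job wrapper would run the job inside `with project_scratch(Path("/work"), "p1", "job42") as d:` so that nothing from one project's job is left behind for the next job on the node.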

Security details
- OATH TOTP 2-factor authentication
  - Smart phones or programmable hardware tokens
- Special roles for those allowed to export data
- Import/export is under strict control
- No open connection to the internet
- Strong separation between projects (VLAN)
- Special security measures with remote desktops
- Extremely hardened FreeBSD gateway and firewall
- Encrypted backup, one key per project
- Sys admins are individual users (traceability)
- Sys admins have to use the same authentication process
- Most hardware is physically separated from other UiO hardware
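For readers unfamiliar with OATH TOTP, the following is a minimal sketch of how a one-time code is computed, following RFC 4226 (HOTP) and RFC 6238 (TOTP). It is not the TSD implementation. The function names, the SHA-1 hash, the 30-second step, the 6-digit length and the one-step acceptance window are common defaults assumed here; the slides do not state which parameters TSD uses.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA1 over an 8-byte big-endian counter."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of the last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP with the counter derived from Unix time."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step), digits)


def verify(secret: bytes, candidate: str, now: Optional[float] = None,
           step: int = 30, window: int = 1) -> bool:
    """Accept codes from the current step and `window` steps either side."""
    if now is None:
        now = time.time()
    counter = int(now // step)
    return any(
        hmac.compare_digest(hotp(secret, counter + d), candidate)
        for d in range(-window, window + 1)
        if counter + d >= 0
    )
```

The server and the user's phone or hardware token share the secret; both compute the code from the current time, so a stolen password alone is not enough to log in.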

Maintenance
- Reuse as much as possible from the USIT eInfrastructure
- Virtualize as much as possible
- Management/surveillance data can be pushed, but not pulled (Nagios, Collectd)
- Surveillance based on existing systems
- Sys admins have different access levels

Opportunities enabled by TSD 2.0
- NGS research on humans is possible
- Large-scale imaging studies are possible
- "HUNT-like" studies online for the respondents and the scientists
- Off-site analysis of sensitive data
- Secure storage for verification of published research
- Electronic consent
- Possible work area for making exams?
- TSD to host all human NGS research data from UiO/OUS?

Nordic collaboration opportunities
- Laws are fairly similar (Norway's are very strict)
- It is difficult to exchange data for research
- We should learn from each other, as these systems demand very specialized IT knowledge
- System development and system administration are non-sensitive and may be shared
- Building TSD raises many novel security questions in a university setting, which others can learn from
- Large DBs of health data may enable very interesting research in the future (NeGI)
- NeIC has shown interest in TSD 2.0
- TSD collaborates with CSC in Finland and with BILS / Elixir Sweden; BBMRI is interested

Current status
- Pilot project data is being transferred now
- System is being prepared and finished for setting up new projects and going into production
- Storage is up
- Secure Nettskjema is up
- Working on risk evaluation
- Project registration when risk evaluation is finished
- HPC resource 4th quarter 2013
- Video and sound will be the main target during further work
- System whitepaper (v1.0) written

People involved
Project group / developers
- Dag-Erling Smørgrav
- Petter Reinholtsen
- Elisabeth Ytterdal
- Tor Fuglerud
- DBA (PostgreSQL team)
- Cerebrum team
- Morten Werner Forsbring
- Espen Grøndahl
- HPC (Colossus team)
- Gard Thomassen
Administration / associated
- IT-dir Lars Oftedal
- Hans A. Eide
- Märtha Felton

Cost per project
- First year establishment price (per project)
- Regular yearly project fee
- License cost (licensed software usage)
- Storage cost for storage exceeding basic allocation
- Cost of DB administration (if DB needed)
- Cost of CPU hours on Colossus
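As a rough illustration of how these cost components might add up, here is a small sketch. The slides give no prices or billing rules, so everything here is an assumption: the function name, the parameters, the idea that the establishment price is charged on top of the yearly fee in the first year, and linear pricing for extra storage and CPU hours.

```python
def project_cost(*, establishment: float, yearly_fee: float, licenses: float,
                 storage_tb: float, basic_storage_tb: float,
                 price_per_extra_tb: float, db_admin: float,
                 cpu_hours: float, price_per_cpu_hour: float,
                 first_year: bool = False) -> float:
    """Sum the cost components for one project and one year (hypothetical model)."""
    # Only storage beyond the basic allocation is charged.
    extra_tb = max(0.0, storage_tb - basic_storage_tb)
    total = (yearly_fee + licenses + extra_tb * price_per_extra_tb
             + db_admin + cpu_hours * price_per_cpu_hour)
    if first_year:
        # Assumption: the establishment price is a one-off first-year charge.
        total += establishment
    return total
```

All amounts are passed in by the caller; no actual TSD prices are implied.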

Project administration in TSD (technical)
- Application through the National ID-portal + Nettskjema
- The project is created in Cerebrum with role categories
- The project is connected to resources (VM + disk + VLAN + DB + HPC)
- Users are created and given their roles
- Username, password and one-time passwords are distributed
- Accounts are kept on storage, HPC CPU time and additional VMs to enable control and bookkeeping
- NorStore may offer "free" storage within TSD (there might be a small security mgmt overhead cost)
- In the future there will be some level of self-service through a web portal within TSD
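The provisioning steps above can be sketched as a tiny data model. This is not Cerebrum's API or data model. The class, the function names and the role names ("admin", "member", "exporter") are hypothetical; the slides only say that projects have role categories, that they are connected to resources, and that exporting data requires a special role.

```python
from dataclasses import dataclass, field

# Hypothetical role names, assumed for illustration only.
ROLES = {"admin", "member", "exporter"}


@dataclass
class Project:
    name: str
    resources: dict = field(default_factory=dict)  # e.g. {"vm": ..., "vlan": ...}
    members: dict = field(default_factory=dict)    # username -> set of roles
    usage: dict = field(default_factory=dict)      # resource -> accumulated amount


def attach_resources(project: Project, **resources: str) -> None:
    """Connect the project to its resources (VM, disk, VLAN, DB, HPC)."""
    project.resources.update(resources)


def add_user(project: Project, username: str, roles: set) -> None:
    """Create a project user; reject role names that are not defined."""
    unknown = set(roles) - ROLES
    if unknown:
        raise ValueError(f"unknown roles: {sorted(unknown)}")
    project.members[username] = set(roles)


def may_export(project: Project, username: str) -> bool:
    """Only users holding the special export role may export data."""
    return "exporter" in project.members.get(username, set())


def record_usage(project: Project, resource: str, amount: float) -> None:
    """Keep accounts on storage, CPU hours or extra VMs for bookkeeping."""
    project.usage[resource] = project.usage.get(resource, 0) + amount
```

Keeping roles and usage on the project record is one simple way to support both the export restriction and the per-project bookkeeping mentioned in the slide.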

Conclusion
- It is very hard to make something secure and user-friendly at the same time
  - Researchers want the freedom of using the internet while doing research on sensitive data
- A thorough risk assessment must be made during and after the planning and implementation phase to make the best choices
- What you cannot avoid should at least be detected by some surveillance mechanism
- More (inter)national / local cooperation wanted

Pilot project (TSD 1.0)
- Secure storage for large amounts of NGS data and MR images (>100 TB)
- Secure Windows "research server" enabling usage of MS Office, STATA, SPSS etc. on sensitive data
- Research server is based on an isolated system using VMware ESX
- Two-factor login system
- Encrypted backup

"The ultimate goal is to be able to provide the same services that are available for researchers working with non-sensitive data, with the necessary security, with minimum impact on the user experience, and minimum extra overhead and cost."
Hans Eide, 2012 (my boss)