The CLoud Infrastructure for Microbial Bioinformatics

Presentation transcript:

The CLoud Infrastructure for Microbial Bioinformatics
Dr Tom Connor, Cardiff University
Genome Science 2015
www.climb.ac.uk
@tomrconnor ; @mrcclimb

The CLIMB Consortium are:
Professor Mark Pallen (Warwick) and Dr Sam Sheppard (Swansea): Joint PIs
Professor Mark Achtman (Warwick), Professor Steve Busby FRS (Birmingham), Dr Tom Connor (Cardiff)*, Professor Tim Walsh (Cardiff), Dr Robin Howe (Public Health Wales): Co-Is
Dr Nick Loman (Birmingham)* and Dr Chris Quince (Warwick): MRC Research Fellows
Simon Thompson (Birmingham, Project Technical/OpenStack lead), Marius Bakke (Warwick, Systems administrator), Simon Thompson (Swansea, Systems administrator)
* Principal bioinformaticians architecting and designing the system

The CLoud Infrastructure for Microbial Bioinformatics (climb.ac.uk)
We are creating:
  A one-stop shop for microbial bioinformatics
  A public/private cloud for use by UK academics
  Standardised cloud images that implement key pipelines
  A storage repository for data and images, available online and from anywhere within our system (‘eduroam for microbial genomics’)
We will also provide access to other databases from within the system, as well as a place to support orphan databases and tools.

System Outline
4 sites, connected over Janet
Different sizes of VM available: personal, standard, large memory, huge memory
Able to support >1,000 VMs simultaneously (1:1 vCPUs/vRAM : CPUs/RAM)
7-8PB of object storage across the 4 sites (~2-3PB usable with erasure coding)
400-500TB of local high-performance storage per site
A single system, with common login and between-site data replication
The system has been designed to enable the addition of extra nodes / universities
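The gap between raw and usable object storage quoted above can be illustrated with a small calculation. This is only a sketch: the slides do not state the erasure-coding profile or how many copies are kept across sites, so `k`, `m` and `copies` below are hypothetical parameters chosen for illustration, not CLIMB's actual configuration.

```python
def usable_capacity_pb(raw_pb, k, m, copies=1):
    """Estimate usable capacity for a k+m erasure-coded pool.

    raw_pb : raw capacity in PB
    k, m   : data and coding chunks per object (hypothetical values)
    copies : number of full copies kept across sites (hypothetical)
    """
    if k <= 0 or m < 0 or copies <= 0:
        raise ValueError("k and copies must be positive, m must be non-negative")
    # Each object occupies (k + m) / k times its size, once per copy.
    return raw_pb * k / (k + m) / copies


if __name__ == "__main__":
    # Illustrative only: 7 PB raw, an assumed 8+3 profile, two copies.
    print(round(usable_capacity_pb(7.0, k=8, m=3, copies=2), 2))
```

Under these assumed values the estimate falls inside the ~2-3PB range quoted on the slide, but other combinations of profile and replication would do so as well.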

CLIMB Overview: GS Update
4 sites, running OpenStack
Hardware procured in a three-stage process
IBM/OCF provided compute, Dell/Red Hat provided storage
Networks provided by Brocade
We are defining a reference architecture so that other sites can be added easily

Hardware (per site)
2 router/firewalls (capable of routing >80Gb each)
3 controllers
21x 64 vCore, 512GB RAM nodes
3x 192 vCore, 3TB RAM nodes
~500TB GPFS (local)
  4 controllers
  Infiniband, with 10Gb failover
~2PB total Ceph (shared)
  27x 64TB nodes/site
  Cross-site replication
  10Gb backbone
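As a rough cross-check of the per-site figures above against the system-wide numbers quoted earlier, the totals can be worked out directly. This is a sketch only: it assumes 1 TB = 1024 GB for the large-memory nodes, counts compute and Ceph storage nodes only, and the function names are illustrative.

```python
# Per-site figures taken from the hardware slide.
STANDARD_NODES = 21   # 64 vCores, 512 GB RAM each
LARGE_NODES = 3       # 192 vCores, 3 TB RAM each
CEPH_NODES = 27       # 64 TB each
SITES = 4


def per_site_totals():
    """Return aggregate vCores, RAM (GB) and raw Ceph capacity (TB) for one site."""
    vcores = STANDARD_NODES * 64 + LARGE_NODES * 192
    ram_gb = STANDARD_NODES * 512 + LARGE_NODES * 3 * 1024  # assumes 1 TB = 1024 GB
    ceph_raw_tb = CEPH_NODES * 64
    return {"vcores": vcores, "ram_gb": ram_gb, "ceph_raw_tb": ceph_raw_tb}


def system_totals(sites=SITES):
    """Scale the per-site totals to the whole system, assuming identical sites."""
    return {name: value * sites for name, value in per_site_totals().items()}


if __name__ == "__main__":
    print(per_site_totals())  # {'vcores': 1920, 'ram_gb': 19968, 'ceph_raw_tb': 1728}
    print(system_totals())    # {'vcores': 7680, 'ram_gb': 79872, 'ceph_raw_tb': 6912}
```

The 6,912 TB of raw Ceph capacity across four sites is close to the lower end of the 7-8PB quoted in the system outline, and 7,680 vCores is comfortably consistent with supporting more than 1,000 VMs at a 1:1 ratio.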

Overview: 4 sites, (virtually) identical hardware
(Diagram labels: External clouds, External databases)
Each site is connected to the others over VPN tunnels.
Sites can be easily added. The system can use free router software and commodity hardware, paid-for software, or dedicated router/firewalls.
Our intention is for the system to be presented to users as a single system, with a single login, via Shibboleth. We are currently working on that bit.
A single system makes it easy(er) to share methods and data!

Flavours
User configurable, with standard flavours:
  Regular: up to 8 vCPUs, 64GB RAM
  xlarge: up to 16 vCPUs, 256GB RAM
  Huge: up to 192 vCPUs, 3TB RAM
System also supports a scalable virtual cluster:
  2+ nodes with 2+ vCPUs, 2-4GB RAM/vCPU
Also provides for Long Term Hosting (for new or orphan datasets/tools)
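To make the flavour limits above concrete, the sketch below picks the smallest standard flavour that fits a requested VM size. It is illustrative only: the `Flavour` class and `smallest_flavour` function are hypothetical helpers, not part of OpenStack or the CLIMB dashboard, and 3TB is taken as 3072GB.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Flavour:
    name: str
    max_vcpus: int
    max_ram_gb: int


# Upper limits as listed on the slide, ordered from smallest to largest.
FLAVOURS = [
    Flavour("regular", 8, 64),
    Flavour("xlarge", 16, 256),
    Flavour("huge", 192, 3 * 1024),  # assumes 3 TB = 3072 GB
]


def smallest_flavour(vcpus: int, ram_gb: int) -> Optional[Flavour]:
    """Return the first (smallest) flavour that satisfies both limits, or None."""
    for flavour in FLAVOURS:
        if vcpus <= flavour.max_vcpus and ram_gb <= flavour.max_ram_gb:
            return flavour
    return None


if __name__ == "__main__":
    print(smallest_flavour(4, 32).name)    # regular
    print(smallest_flavour(8, 128).name)   # xlarge
    print(smallest_flavour(200, 16))       # None
```

A request that exceeds either limit of a flavour moves up to the next one, so an 8 vCPU job needing 128GB of RAM lands on xlarge even though its CPU count would fit a regular VM.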

Access
Microbial researchers will be able to access the system in one of two ways:
  Externally, via a federated access system: login with a .ac.uk user account in the first instance, later (hopefully) open to anyone who uses Shibboleth
  Internally, via user accounts set up by the consortium for collaborators

Where are we now?
Computational hardware was procured by March 2015 (~6 month process)
Ahead of schedule: the system is now online and in use for research
Adopting two models for access:
  Access for registered users to the “pro” dashboard, online now (intended for bioinformaticians/developers)
  A “version 1.0” system providing universal access to predefined images, starting with the GVL, by Winter 2015 (intended for those who just want a single server with predefined software)

Live Demo
Dashboard Login: http://birmingham.climb.ac.uk
Wiki Login: http://wiki.climb.ac.uk

VMs are already up

Users are already using CLIMB to do research

CLoud Infrastructure for Microbial Bioinformatics
A multi-site system to provide a one-stop bioinformatics shop, designed specifically to support microbial researchers
For both bioinformaticians and wet lab scientists
Combines hardware with training
Free, simple interface, easy to use
Common login
Easy data and method sharing
Already has multiple users from across UK academia and healthcare

CLoud Infrastructure for Microbial Bioinformatics (CLIMB)
MRC-funded project to develop cloud infrastructure for microbial bioinformatics
~£4M of hardware, capable of supporting >1,000 individual virtual servers
An Amazon/Google-style cloud for academics
Already in production, and in use by researchers