Published by Alisha Lloyd. Modified over 8 years ago.
1
SARA Reken- en Netwerkdiensten
Experiences running an HPC Cloud
Ron Trompert, on behalf of the SARA Cloud team
2
HPC Cloud - What
Service models at SARA:
- Cloud Infrastructure as a Service (IaaS)
- Cloud Software as a Service (SaaS)
  - Currently experimenting with portals (R portal)
- Cloud Platform as a Service (PaaS)
  - Not (yet)
SARA:
- Allow users to freely instantiate a personal environment
- Leap from laptop (small scale) to HPC (large scale)
3
HPC Cloud - Why
World:
- Better utilization of infrastructure
- "Green IT" (power off when under-utilized)
- Easy management
SARA:
- Free OS & software environment
- Locked software can be used
- Rapid availability
HPC cloud for the academic world:
- Massive interest and multiple early adopters prove the need for an academic HPC Cloud environment.
- Early cloud P.O.C. running “production”
4
HPC Cloud – How
Evolving thanks to continued support from BiG-Grid:
Claudia: ~1 year POC/beta running 2010/2011
- Limited infrastructure:
  - CPU: 16 × 8 = 128
  - Memory: 16 × 24 GB = 384 GB
  - 1 Gbps I/O, local disk storage
  - Single VM limit
Calligo: HPC infrastructure
  - CPU: 19 × 32 = 608 (Intel Xeon E7 "Westmere-EX")
  - Memory: 19 × 256 GB = 4,864 GB = 4.75 TB
  - 10GE, 1-hop, non-blocking interconnect
  - 400 TB shared storage (iSCSI, NFS, CIFS, CDMI, ...)
  - Virtual clusters
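The totals on this slide follow from multiplying a node count by per-node resources. The sketch below reproduces that arithmetic. It assumes that the first factor in each product (16 and 19) is the number of nodes and that the CPU figures count cores; the class `Cluster` and its field names are illustrative and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical description of a cluster: node count and per-node resources."""
    name: str
    nodes: int
    cores_per_node: int
    mem_gb_per_node: int

    @property
    def total_cores(self) -> int:
        return self.nodes * self.cores_per_node

    @property
    def total_mem_gb(self) -> int:
        return self.nodes * self.mem_gb_per_node


# Figures taken from the slide; reading them as "nodes x per-node" is an assumption.
claudia = Cluster("Claudia", nodes=16, cores_per_node=8, mem_gb_per_node=24)
calligo = Cluster("Calligo", nodes=19, cores_per_node=32, mem_gb_per_node=256)

for c in (claudia, calligo):
    tb = c.total_mem_gb / 1024  # binary terabytes, matching 4,864 GB = 4.75 TB
    print(f"{c.name}: {c.total_cores} cores, {c.total_mem_gb} GB ({tb:.2f} TB)")
```

Dividing by 1024 rather than 1000 is what makes 4,864 GB come out as 4.75 TB.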
5
HPC Cloud – How
Technology challenges (1/2):
Hypervisor
- VMware: data-center centric
- Citrix (Xen): para-virtualization
- KVM: full virtualization, standard Linux kernel based, no commercial backing
Networking
- Virtual switches: slow
- Para-virtual drivers: limited to 1 Gbps (KVM/virtio)
- SR-IOV: bright new future
6
HPC Cloud – How
Technology challenges (2/2):
Storage
- Concurrent use of different protocols:
  - Local disk -> fast but no file locking
  - iSCSI -> fast but no file locking
  - SSH -> slower, secure, local account required
  - NFS(v4) -> slower, shared, host-based, load?
  - CIFS -> slow, shared, user based
  - WebDAV/CDMI -> slow, user based, clients?
Compute
- All mainstream CPUs support virtualization
- Support for network IOMMU/VT-d and SR-IOV is scarce
7
HPC Cloud – Trust
Security is of major importance
- Cloud user confidence
- Infrastructure provider confidence
Protect
- the outside from the cloud users
- the cloud users from the outside
- the cloud users from each other
Not possible to protect the cloud user from himself
- User has full access/control/responsibility
- e.g. virus research must be possible
8
HPC Cloud – Trust
Firewall
- Fine-grained access rules (“closed port” policy)
- By default everything is closed; non-standard ports can be opened by the user
Scanning of new virtual templates
- Catches initial problems, but once the VM is live...
Port scanning
- Catches well-known problems
Stateful Packet Inspection
- Random sample based
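The "closed port" policy can be illustrated with a minimal sketch: every inbound port is rejected unless the user has explicitly opened it. The class `VmFirewall` and its methods are hypothetical names used for illustration only; the slides do not describe how the actual firewall is implemented.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class VmFirewall:
    """Hypothetical default-closed inbound policy for one VM."""
    open_ports: Set[int] = field(default_factory=set)

    def open_port(self, port: int) -> None:
        """Open a single port at the user's request."""
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid port: {port}")
        self.open_ports.add(port)

    def allows(self, port: int) -> bool:
        """A port is reachable only if it was explicitly opened."""
        return port in self.open_ports


fw = VmFirewall()
closed_by_default = not fw.allows(8080)  # nothing is open initially
fw.open_port(8080)                       # user opens a non-standard port
open_after_request = fw.allows(8080)
```

The point of the design is that the allow-list starts empty, so forgetting a rule fails closed rather than open.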
9
Philips site visit | 2 February 2012
HPC Cloud – Calligo
Diagram (components of a template), labels: Virtual Machine, Disk Image, Private network, Network filter, ISO Image, CPU, Memory, Public network, Template. Legend: User Defined, Predefined.
10
HPC Cloud – Calligo
Diagram (software stack), labels: Admin Portal, CLI, [EC2], [OCCI], User Portal, [XML-RPC], ONED, mm_sched, Libvirt, KVM-Qemu, OpenNebula, Linux, virsh.
11
HPC Cloud – Calligo
Diagram (Non Persistent Disk Image): a Template with network(s), Disk Image and CPU/Memory; four Virtual Machines, each with its own Copy of Disk Image.
12
HPC Cloud – Calligo
Diagram (Persistent Disk Image): a Template with network(s), Disk Image and CPU/Memory; one Virtual Machine.
13
HPC Cloud – Calligo
14
HPC Cloud – Calligo
15
HPC Cloud – Calligo
Diagram (node networking), labels:
- Node interfaces: eth0, eth1, eth2, eth3, bond0, DRAC
- Networks: MGT-NW, CONS-NW (static), DATA-NW
- Bridges: br0 (DHCP), br1, br2, br3, br56, br100, br101, br102
- VLAN interfaces: bond0.56, bond0.100, bond0.101
- VMs: UserA VM01 (VLAN_ID=100), UserA VM02 (VLAN_ID=100), UserB VM01 (VLAN_ID=101)
- VM interfaces: eth0, eth1
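The labels in this diagram suggest a naming pattern: a VLAN with ID n appears as a VLAN interface `bond0.n` and a bridge `brn` (for example `bond0.100` and `br100`), and both VMs of UserA share VLAN 100 while UserB has VLAN 101. The sketch below encodes that pattern. The pattern itself is inferred from the labels and is an assumption, as are the function name `vlan_devices` and the `vm_vlans` mapping.

```python
from typing import Dict, Tuple


def vlan_devices(vlan_id: int, bond: str = "bond0") -> Tuple[str, str]:
    """Return (VLAN interface, bridge) names for a VLAN ID, e.g. ('bond0.100', 'br100')."""
    if not 1 <= vlan_id <= 4094:  # valid IEEE 802.1Q VLAN IDs
        raise ValueError(f"invalid VLAN ID: {vlan_id}")
    return f"{bond}.{vlan_id}", f"br{vlan_id}"


# VM-to-VLAN assignment as labelled in the diagram.
vm_vlans: Dict[str, int] = {
    "UserA VM01": 100,
    "UserA VM02": 100,
    "UserB VM01": 101,
}

for vm, vlan_id in vm_vlans.items():
    vlan_if, bridge = vlan_devices(vlan_id)
    print(f"{vm}: VLAN {vlan_id} -> {vlan_if} attached to {bridge}")
```

Under this reading, VMs of the same user land on the same bridge and can reach each other, while VMs of different users are separated at the VLAN level.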
16
HPC Cloud – Accounting
- Log VMs starting and stopping
- Take periodic snapshots to see which VMs are running
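Both accounting sources can be turned into core hours. The following is a minimal sketch, not SARA's accounting code: the record layout `(vm_id, action, timestamp)`, the snapshot representation (a set of running VM IDs per sample) and both function names are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def core_hours_from_log(events, cores):
    """Sum core hours per VM from (vm_id, action, timestamp) start/stop records."""
    started = {}
    usage = defaultdict(float)
    for vm_id, action, ts in sorted(events, key=lambda e: e[2]):
        if action == "start":
            started[vm_id] = ts
        elif action == "stop" and vm_id in started:
            hours = (ts - started.pop(vm_id)).total_seconds() / 3600
            usage[vm_id] += hours * cores[vm_id]
    return dict(usage)


def core_hours_from_snapshots(snapshots, cores, interval):
    """Estimate core hours by charging one interval for each snapshot a VM appears in."""
    hours = interval.total_seconds() / 3600
    usage = defaultdict(float)
    for running in snapshots:
        for vm_id in running:
            usage[vm_id] += hours * cores[vm_id]
    return dict(usage)


t0 = datetime(2012, 1, 1, 9, 0)
cores = {"vm01": 2}
events = [("vm01", "start", t0), ("vm01", "stop", t0 + timedelta(hours=3))]
log_usage = core_hours_from_log(events, cores)

# Hourly samples: the VM is seen running three times, then it is gone.
snapshots = [{"vm01"}, {"vm01"}, {"vm01"}, set()]
snap_usage = core_hours_from_snapshots(snapshots, cores, timedelta(hours=1))
```

The log gives exact durations but misses VMs whose stop record is lost; the snapshots are coarser but provide an independent cross-check of what is actually running.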
17
HPC Cloud – Users
- Can ask for core hours
- Each core hour is accompanied by 8 GB of memory
- This means:
  - When you need two cores for your VM, you get 16 GB of memory
  - When you need 32 GB of memory, you get 4 cores
- Maps nicely onto the deployed hardware
- We did this just to start out with. Maybe we get better ideas on this later on.
- No overcommitting
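The coupling of 8 GB of memory to each core can be written as a small rule: allocate enough cores to satisfy both the core request and the memory request, then grant 8 GB per core. This is a sketch of that rule; the function name `allocate` is hypothetical, and rounding a memory request up to whole cores is an assumption the slide does not state.

```python
import math
from typing import Tuple

GB_PER_CORE = 8  # from the slide: each core comes with 8 GB of memory


def allocate(cores_needed: int = 1, mem_gb_needed: float = 0) -> Tuple[int, int]:
    """Return (cores, memory in GB) satisfying both requests at 8 GB per core."""
    if cores_needed < 1 or mem_gb_needed < 0:
        raise ValueError("requests must be at least 1 core and non-negative memory")
    cores = max(cores_needed, math.ceil(mem_gb_needed / GB_PER_CORE))
    return cores, cores * GB_PER_CORE


two_cores = allocate(cores_needed=2)       # the slide's first example
need_32_gb = allocate(mem_gb_needed=32)    # the slide's second example
```

With 32 cores and 256 GB per Calligo node, the 8 GB ratio fills a node's cores and memory at the same rate, which is presumably what "maps nicely" refers to.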
18
HPC Cloud – Users
- Started in January this year with the production infrastructure
- Now 37 users
- Applications: Galaxy, DNA sequencing, CLARIN, EUDAT, eSiBayes, Historic Map Collections, UrbanFlood, Transcriptomics, ...
- Now 2 out of every 3 grant requests to the Dutch NGI (BiG-Grid) are Cloud requests
- Our users have been happy campers so far
19
HPC Cloud – Experiences
- Had to do some tweaking ourselves. This is the motivation for using an open source product like OpenNebula. Examples are accounting and firewalling.
- With SR-IOV we got 8-9 Gbps between VMs
- Support for network IOMMU/VT-d and SR-IOV is scarce
20
HPC Cloud – Future
- OVF, OCCI
- Uploading own images
- Access to HSM environment, dCache grid storage
- Concurrent storage protocols:
  - iSCSI
  - SSH
  - CIFS
  - WebDAV/CDMI
21
HPC Cloud - Summary
SARA HPC CLOUD
FREEDOM OF CHOICE
Would you like to know more?
https://www.cloud.sara.nl
https://twitter.com/#!/sara_escience
Questions?