1 Applied CyberInfrastructure Concepts ISTA 420/520 Fall Nirav Merchant Bio Computing & iPlant Collaborative Eric Lyons Plant Sciences & iPlant Collaborative University of Arizona or Will Computers Crash Genomics? Science Vol 331 Feb 2011
Topic Coverage Building your tool chest for: – Development Environment – Test Environment – Deployment Environment Why do you need different Environments ? Virtualization, Containers and Virtual Env Preparing VM for Thu. hands on.
3 + = Simple Formula
The Reality 4 ++ PERL Python Java Ruby Fortran C C# C++ R Matlab etc. PERL Python Java Ruby Fortran C C# C++ R Matlab etc. Amazon Azure Rackspace Campus HPC XSEDE Etc. Amazon Azure Rackspace Campus HPC XSEDE Etc. and lots of glue…..
+ = Simple Formula
Too many environments ? What is “dependency hell” ? Why create these environments ? The laptop syndrome Reproducibility challenges Making things “cloudy”
7 SaaS: Software as a Service (e.g. Clustering/Assembly is a service) IaaS : Infrastructure as a Service (get computer time with a credit card and with a Web interface like EC2) PaaS : Platform as a Service IaaS plus core software capabilities on which you build SaaS (e.g. Hadoop/MapReduce is a Platform) Cyberinfrastructure Is “ Research as a Service ” Arrival of “ As a Service ” models Pain Flexibility Productivity
Your new best friend (for this course) Virtualization (in all layers of our cake !) 8
These slides are a subset from Carl Waldspurger (Vmware R&D) Introduction to Virtual Machines To learn more about virtualization visit:
Overview Virtualization and VMs Processor Virtualization Memory Virtualization I/O Virtualization Network Virtualization How does Virtualization relate to cloud ?
Types of Virtualization Process Virtualization – Language-level Java,.NET, Smalltalk – OS-level processes, Solaris Zones, BSD Jails, Virtuozzo – Cross-ISA emulation Apple 68K-PPC-x86, Digital FX!32 Device Virtualization – Logical vs. physical VLAN, VPN, NPIV, LUN, RAID System Virtualization – “Hosted” Virtual Box, VMware Workstation, Microsoft VPC, Parallels – “Bare metal” VMware ESX, Xen, KVM, Microsoft Hyper-V
Starting Point: A Physical Machine Physical Hardware – Processors, memory, chipset, I/O devices, etc. – Resources often grossly underutilized Software – Tightly coupled to physical hardware – Single active OS instance – OS controls hardware
What is a Virtual Machine? Software Abstraction – Behaves like hardware – Encapsulates all OS and application state Virtualization Layer – Extra level of indirection – Decouples hardware, OS – Enforces isolation – Multiplexes physical hardware across VMs Host OS and Guest OS
Virtualization Properties Isolation – Fault isolation – Performance isolation Encapsulation – Cleanly capture all VM state – Enables VM snapshots, clones Portability – Independent of physical hardware – Enables migration of live, running VMs Interposition – Transformations on instructions, memory, I/O – Enables transparent resource overcommitment, encryption, compression, replication …
What is a Virtual Machine Monitor (VMM)? Classic Definition (Popek and Goldberg ’74) VMM Properties – Fidelity – Performance – Safety and Isolation Note: VMM = Hypervisor
Classic Virtualization and Applications Classical VMM – IBM mainframes: IBM S/360, IBM VM/370 – Co-designed proprietary hardware, OS, VMM – “Trap and emulate” model Applications – Timeshare several single-user OS instances on expensive hardware – Compatibility From IBM VM/370 product announcement, ca. 1972
Modern Virtualization Renaissance Recent Proliferation of VMs – Considered exotic mainframe technology in 90s – Now pervasive in datacenters and clouds – Huge commercial success Why? – Introduction on commodity x86 hardware – Ability to “do more with less” saves $$$ – Innovative new capabilities – Extremely versatile technology
Modern Virtualization Applications Server Consolidation – Convert underutilized servers to VMs – Significant cost savings (equipment, space, power) – Increasingly used for virtual desktops Simplified Management – Datacenter provisioning and monitoring – Dynamic load balancing Improved Availability – Automatic restart – Fault tolerance – Disaster recovery Test and Development
Improve the software lifecycle Develop, debug, deploy and maintain applications in virtual machines Power tool for software developers – record/replay application execution deterministically – trace application behavior online and offline – model distributed hardware for multi-tier applications Application and OS flexibility – run any application or operating system Virtual appliances – a complete, portable application execution environment
Increase application availability Fast, automated recovery – automated failover/restart within a cluster – disaster recovery across sites – VM portability enables this to work reliably across potentially different hardware configurations Fault tolerance – hypervisor-based fault tolerance against hardware failures [Bressoud and Schneider, SOSP 1995] – run two identical VMs on two different machines, backup VM takes over if primary VM’s hardware crashes – commercial prototypes beginning to emerge (2008)
Clouds and VM Frist step to learning about cloud is Virtualization Taking VM from your desktop to cloud is our goal (which will not be easy, but we will make it happen) Scaling and why it matters to have many VM ? Connecting VM’s and what is a appliance ? Discussion on VM Cloud Containers (Docker and others in next class)!
Virtual Box We will use (Sun now Oracle) ver (current as of today) Guest OS will be CentOS (Number 13 on the list on this page) Use manual to install this on your laptop and learn about Virtual Box (end user docs) tml tml
Virtual Box (next class) You should have it running with the the centos 6.3 image for next class (and have logged in) We will learn about basic linux system admin tasks/duties, machine performance etc. for getting started odules.pdf (main text for next few sessions) odules.pdf Use Snapshots to save states Working with VM Appliances and exchanging images.
Starting to build VM from scratch Before coming to class on Thu. Download the current Ubuntu desktop ISO (ver LTS and get 64 bit version) This is a large file so make sure you do it BEFORE you come to class