Scott McNealy
Scott McNealy is an American business executive. He co-founded Sun Microsystems in 1982 (Sun was later acquired by Oracle).
SPbSU and Institute for High Performance Computing and Integrated Systems, St. Petersburg, Russia.
What is the Grid?
Scott McNealy: "The network is the computer."
Our vision: "The computational-center area network is the Grid."
Greg Papadopoulos
Greg Papadopoulos, Ph.D., was Executive Vice President and Chief Technology Officer (CTO) of Sun Microsystems.
Service-Oriented Architecture
Roles: service provider, service broker, service requestor.
Operations: publish (provider to broker), find (requestor to broker), bind (requestor to provider).
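The publish/find/bind triangle can be sketched in a few lines of Python. This is a minimal illustration only; the broker class, the service name, and the callable used as an "endpoint" are hypothetical, not part of any real SOA toolkit:

```python
# Minimal sketch of the SOA triangle: a provider publishes a service with a
# broker, a requestor finds it there and then binds to (invokes) it directly.

class ServiceBroker:
    """Registry mediating between service providers and requestors."""
    def __init__(self):
        self._registry = {}

    def publish(self, name, endpoint):
        self._registry[name] = endpoint      # provider -> broker: Publish

    def find(self, name):
        return self._registry.get(name)      # requestor -> broker: Find


# Provider side: publish a service (a plain callable stands in for the endpoint).
broker = ServiceBroker()
broker.publish("matrix-multiply",
               lambda a, b: [[sum(x * y for x, y in zip(row, col))
                              for col in zip(*b)] for row in a])

# Requestor side: find the service, then bind to it with a direct call.
service = broker.find("matrix-multiply")
result = service([[1, 2]], [[3], [4]])       # Bind: invoke the found endpoint
print(result)
```

In a real deployment the broker would be a registry service (e.g. UDDI in classic web-services SOA) and the endpoint a network address, but the three interactions are the same.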
Larry Smarr
Larry Smarr is a leader in scientific computing, supercomputer applications, and Internet infrastructure. He was a director of the National Computational Science Alliance.
Technologies
- Metacomputing
- Service-oriented architecture (SOA)
- Virtualization
Metacomputing
A metacomputer is a network of heterogeneous computational resources linked by software in such a way that they can be used as easily as a personal computer.
- A way to organize distributed computations
- Boosts the usage of networks, both local and global
- Promotes the computational facility of the next generation
Cloud computing
The idea of cloud computing is to move the organization of data storage and processing from personal computers to servers on the World Wide Web.
Cloud Computing: Basics
What tasks make us use a cloud?
- Tasks that cannot be solved in any other way
- APIs and web interfaces can be used even for everyday tasks that do not require large resources
- Shared use of resources on demand, and only in the amount needed
What tasks make us use a cloud?
Traditional methods of using computational resources vs. cloud methods.
Our principles
- The cloud is determined completely by its API
- The operating environment must be UNIX-like
Our principles
- The cloud uses protocols compatible with popular public clouds
- The cloud processes data on top of distributed file systems
- Data consolidation is achieved by a distributed federated database
Our principles
- Load balancing is achieved by virtual processors with a controlled rate
- Large data sets are processed via shared virtual memory
- The cloud uses complex grid-like security mechanisms
Basic principle of a Metacomputer
Shared Memory Programming in a Metacomputing Environment
SOA Metamodel
Convergence of virtualization, grid, and SOA
Virtual ecosystem; corporate IT grid architecture.
Linux clusters
ADVANTAGES:
- An order of magnitude better: faster, cheaper, simpler
- Significant decrease in the price/performance ratio
- Simple to use
Linux clusters
DISADVANTAGES:
- Poor scalability for large numbers of processors (> 64)
- Once the cost of the specialized software is taken into account, the price/performance ratio worsens considerably
ARCHITECTURES:
- Circulant graphs
- Fat tree / 16x16 crossbars
- Circulant graphs / 16x16 crossbars
Linux clusters
PURPOSE: a Linux cluster is an effective tool for specialized computer systems designed for specific applications.
Formal model of cluster acceleration
Definition of acceleration: S_n = T_1 / T_n, where T_1 is the execution time on one processor and T_n the execution time on n processors.
Let us propose that d is the part of the algorithm carried out only on one processor (the serial fraction).
Amdahl's law: S_n = 1 / (d + (1 - d)/n).
Typical shape of cluster acceleration (figure: S_n as a function of n).
Efficiency of cluster solutions (figure): theoretical maximum acceleration S_n and optimal processor count n_opt for serial fractions d = 0, 0.01, 0.02, 0.05; efficiency curves for d = 0.01, 0.05, 0.2, 0.5.
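The curves summarized above follow directly from Amdahl's law. The short sketch below (with illustrative values of d; the 64-processor count is an assumption for the example) shows how even a small serial fraction caps the achievable acceleration at 1/d:

```python
# Amdahl's law: with serial fraction d, the speedup on n processors is
#   S_n = 1 / (d + (1 - d) / n),
# which is bounded above by 1/d as n grows without limit.

def speedup(d, n):
    return 1.0 / (d + (1.0 - d) / n)

def efficiency(d, n):
    # Efficiency = speedup per processor; it decays as n grows.
    return speedup(d, n) / n

# Compare S_64 with the asymptotic limit 1/d for a few serial fractions.
for d in (0.01, 0.05, 0.2):
    print(d, round(speedup(d, 64), 1), round(1.0 / d, 1))
```

With d = 0.2, for example, no number of processors can accelerate the computation more than fivefold, which is why the slides stress keeping the serial fraction small.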
Fundamental problems of increasing cluster system performance
- Processor balancing problem (traffic passes through more than one router in configurations with more than 128 processors)
- The ratio of processor speed to link speed must not be very high
Motivation for a Virtual Private Supercomputer
- Make a distributed computing system easier to use and to manage
- Allocate as many resources as applications need
- For some users, a typical computer cluster imposes constraints, e.g. a fixed operating system, a fixed configuration, and a fixed set of libraries
Goal: provide user applications with access to as many resources as needed, in a way preferable to the user.
Approach
Build a tailored virtual computing environment:
- Tune the computing infrastructure to optimize application performance and optimally distribute virtualized physical resources between applications (an application-centric approach)
- Virtualization of resources: create virtual clusters that match application profiles (configurable CPU, memory, network); use lightweight virtualization with less overhead; this enables flexible configuration of the infrastructure
- Different applications have different profiles and requirements
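The application-centric idea can be illustrated with a toy allocator: each application states a resource profile, and a virtual cluster is carved out of a host only if the remaining capacity can hold it. The host capacity and the application profiles below are assumed figures for illustration, not measurements from this work:

```python
# Application-centric allocation sketch: match virtual clusters to profiles.

HOST = {"cpu": 32, "mem_gb": 128}      # one physical node (assumed figures)

def allocate(host_free, profile):
    """Reserve a virtual cluster matching the profile, or refuse it whole."""
    if all(host_free[k] >= v for k, v in profile.items()):
        for k, v in profile.items():
            host_free[k] -= v          # claim the resources
        return True
    return False                       # profile does not fit; nothing claimed

free = dict(HOST)
app_profiles = {
    "cfd_solver": {"cpu": 16, "mem_gb": 64},   # compute-heavy profile
    "db_backend": {"cpu": 4,  "mem_gb": 48},   # memory-heavy profile
    "batch_jobs": {"cpu": 16, "mem_gb": 32},   # no longer fits after the others
}
placed = {name: allocate(free, p) for name, p in app_profiles.items()}
print(placed, free)
```

A real scheduler would also model I/O and network bandwidth and could migrate or resize clusters, but the fit-then-claim step is the core of matching infrastructure to application profiles.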
Typical configuration
Proposed configuration
Virtual private clusters
- A collection of virtual machines working together to solve a computational problem
- Can be configured by advanced users; they know exactly what they want (CPU, memory, I/O, network)
- Can be flexibly adjusted to the needs of an application
(Figure: VMs on hosts 1-4 grouped into virtual clusters VC1-VC3.)
Why virtual clusters?
- Precise control over allocated resources (CPU, memory, etc.)
- Applications get exactly what they need (or what they request): one app needs fast disk I/O and not much CPU, another one a fast network and a fast CPU with no disk I/O, etc.
- The capacity of unclaimed resources is available to other applications on a limited set of hardware
(Figure: per-application CPU, memory, I/O, and network shares on one host.)
Tried earlier: virtual clusters with OpenStack
- OpenStack with KVM, governed by cloud infrastructure management and automation software
- Significant overhead observed both in the startup time of the VMs composing the cluster and in calculation time
- Test application: matrix multiplication
Virtual clusters with OS-level virtualization
Requirements: zero-overhead virtualization for an HPC cluster.
Solution: application containers instead of VMs (on copy-on-write object storage).
Virtual clusters with OS-level virtualization
Virtualization technologies:
- Xen
- Linux containers (LXC): OS-level virtualization; a lightweight lxc-init instead of /sbin/init
- Docker, to manage containers
Measurements:
- time of automatic creation
- application runtime
Docker
Docker allows you to package an application with all of its dependencies into a standardized unit for software development. Docker containers wrap a piece of software in a complete filesystem that contains everything it needs to run: code, runtime, system tools, system libraries; anything you can install on a server. This guarantees that it will always run the same, regardless of the environment in which it is running.
Docker/host/VM Startup time
Virtual clusters: time of automatic creation
Tools and technologies
- Linux containers (LXC)
- Docker: automates the deployment of containers
- cgroups: limits, accounts for, and isolates resource usage (CPU, memory, disk I/O, network, etc.)
- Open vSwitch: network virtualization
- Ansible: application deployment and configuration management
Test applications
- NAS Parallel Benchmarks (NPB): derived from computational fluid dynamics (CFD) applications; consist of several kernels and pseudo-applications
- HPC Challenge Benchmark (HPCC): combines several benchmarks to test a number of independent attributes of the performance of high-performance computing (HPC) systems
NAS Parallel Benchmarks results
Results
- Decreased startup time and overheads with LXC compared to para-virtualization
- Limiting a resource (e.g. memory) can strongly impact performance (e.g. cause unneeded swapping)
- Applications need to be profiled (or modeled) to specify realistic requirements depending on the input data
- Flexible configuration of containers with standard tools helps allocate the proper amount of resources and keep track of the free resources that remain
Virtual Laboratory layer
(Figure: Application layer and Grid layer; applications include MACS Lab, DNA-Array, and Radiology.)
VLAM functional view
(Figure: VLAM Science Portal and Workbench on top; applications such as DNA-array genome expression, material analysis micro-beam, FTIR, and biomedical MRI scanning; the Virtual Laboratory layer (VLAM-G middleware) with the VLAM RTS; the Grid middleware (Globus) layer; Grid resources such as farms and microscopes.)
Using the testbed
Parallel jobs (HEP prototype using MPICH-G2) running across sites.
(Figure: Grid services (LIP), sites 1 through i, the network, JSS, LB, Globus.)
Conceptual scheme of the information system
- Data sources
- Preprocessing
- Analysis and model-based assimilation of information
- Dynamic databases
- Reference catalogues
- Model synthesis
- Scenarios
- Decision support
- Special knowledge base
- Replenishment of the databases
Structure of virtual testbed
Characteristics of virtual testbed operation Functional diagram of complex environment organization
Practical realization of dynamic object modelling
"The difference between having power and using it" (Steve Wallach)