System G And CHECS Cal Ribbens

Presentation transcript:

System G And CHECS Cal Ribbens Center for High-End Computing Systems (CHECS) Department of Computer Science Virginia Tech

Introducing System G System G (Green) was sponsored in part by the National Science Foundation and VT CoE (CHECS) to address the gap in scale between research and production machines. The purpose of System G is to provide a research platform for developing high-performance software tools and applications with extreme efficiency at scale. A primary goal was to demonstrate that supercomputers can be both fast and environmentally green. System G is the largest power-aware research system and one of the largest computer-science systems-research clusters in the world.

System G Stats
* 325 Mac Pro nodes, each with two 4-core 2.8 GHz Intel Xeon processors (2592 cores).
* Each node has eight gigabytes (GB) of RAM and 6 MB of cache.
* Mellanox 40 Gb/s (QDR) InfiniBand interconnect.
* LINPACK result: 22.8 TFLOPS.
* Over 10,000 power and thermal sensors.
* Variable power modes: DVFS control (2.4 and 2.8 GHz), fan-speed control, concurrency throttling, etc. (see the sketch below).
* Intelligent power distribution unit: Raritan Dominion PX, which remotely controls the servers and network devices and monitors current, voltage, power, and temperature through Raritan's KVM switches and secure console servers.
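As an illustration only (not part of the original slides), the following is a minimal sketch of how a core could be toggled between System G's two DVFS frequencies (2.4 and 2.8 GHz) through the standard Linux cpufreq sysfs interface. It assumes the "userspace" cpufreq governor is active and the program runs with root privileges; exact sysfs paths vary by kernel version.

```c
/*
 * Minimal sketch: switching one core between 2.4 and 2.8 GHz via the
 * Linux cpufreq sysfs interface. Assumes the "userspace" governor is
 * enabled and root privileges; paths may differ on other kernels.
 */
#include <stdio.h>
#include <stdlib.h>

static int set_cpu_khz(int cpu, long khz)
{
    char path[128];
    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_setspeed", cpu);

    FILE *f = fopen(path, "w");
    if (!f) {
        perror("fopen");
        return -1;
    }
    fprintf(f, "%ld\n", khz);   /* frequency is written in kHz */
    fclose(f);
    return 0;
}

int main(void)
{
    /* Throttle core 0 to 2.4 GHz; writing 2800000 would restore 2.8 GHz. */
    return set_cpu_khz(0, 2400000) ? EXIT_FAILURE : EXIT_SUCCESS;
}
```

In practice the same write would be repeated for every core of an 8-core node, and command-line tools such as cpufreq-set (from cpufrequtils) provide the same control.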

Center for High-End Computing Systems Srinidhi Varadarajan, Director

Vision Our goal is to build a world-class research group focused on high-end systems research. This involves research in architectures, networks, power optimization, operating systems, compilers and programming models, algorithms, scheduling, and reliability. Our faculty hiring in systems is targeted to cover the breadth of these research areas. The center is involved in research and development work, including the design and prototyping of systems and the development of production-quality systems software. The goal is to design and build the software infrastructure that makes HPC systems usable by the broad computational science and engineering community. The center also provides support to high-performance computing users on campus; this involves the center in supporting actual applications, which are then profiled to gauge the performance impact of its research.

Research Labs
* Computing Systems Research Lab (CSRL)
* Distributed Systems and Storage Lab (DSSL)
* Laboratory for Advanced Scientific Computing and Applications (LASCA)
* Scalable Performance Laboratory (SCAPE)
* Systems, Networking and Renaissance Grokking Lab (SyNeRGY)
* Computational Science Laboratory
* Software Innovations Lab

Faculty/Students Godmar Back (04), Barbara Ryder (08), Ali Butt (06), Adrian Sandu (03), Kirk Cameron (05), Eli Tilevich (06), Wu Feng (05), Srinidhi Varadarajan (99), Dennis Kafura (82), Layne Watson (78), Cal Ribbens (87), Danfeng Yao (10). Ph.D. students: 35; MS students: 20+.

Deployment Details
* 13 racks total, with 24 nodes in each rack and 8 nodes on each layer.
* 5 PDUs per rack (Raritan PDU model DPCS12-20). Each PDU in System G has a unique IP address, and users can use IPMI to access and retrieve information from the PDUs and to control them, e.g., remotely shutting down and restarting machines or recording system AC power (see the sketch below).
* There are two types of switch: 1) Ethernet switch: 1 Gb/s Ethernet; 36 nodes share one Ethernet switch. 2) InfiniBand switch: 40 Gb/s InfiniBand; 24 nodes (one rack) share one IB switch.
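As a rough illustration (not from the original slides), the sketch below shows one way to pull current, voltage, power, and temperature readings from a PDU over IPMI by driving the standard ipmitool utility from C. The hostname and credentials are placeholders, and the exact sensor names reported depend on the Dominion PX firmware.

```c
/*
 * Minimal sketch: reading PDU sensors over IPMI by invoking the
 * standard ipmitool CLI. Hostname and credentials are placeholders.
 */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *cmd =
        "ipmitool -I lanplus -H pdu-rack01.example.edu "
        "-U admin -P secret sensor list";

    FILE *pipe = popen(cmd, "r");
    if (!pipe) {
        perror("popen");
        return EXIT_FAILURE;
    }

    char line[512];
    while (fgets(line, sizeof(line), pipe))   /* one sensor reading per line */
        fputs(line, stdout);

    return pclose(pipe) == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

The same interface supports remote power control (e.g., ipmitool's power on/off/status subcommands), which is how machines can be shut down or restarted remotely as described above.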

More Information About System G A. Question: Why not purchase regular blades for System G? 1. The current Mac Pro machines have the same system configuration at a much lower price. 2. Every node has two extra PCI Express x16 slots; we already use one for InfiniBand. 3. Better heat dissipation for thermal research. B. InfiniBand vs. Ethernet: 1. Users can always run MPI programs over regular Ethernet. 2. Compared to Ethernet, the InfiniBand fabric is much faster (40x), but it requires proper use of InfiniBand APIs and related programming techniques. 3. For MPI programs, the InfiniBand counterparts of the standard implementations are MVAPICH (InfiniBand) for MPICH and MVAPICH2 (InfiniBand) for MPICH2. Users can compile MPI programs with the MVAPICH implementations to run parallel programs over InfiniBand if necessary (see the sketch below).
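To make the MVAPICH/MPICH point concrete, here is a minimal MPI program (not from the original slides): the same source builds unchanged with either stack, so switching between Ethernet (MPICH/MPICH2) and InfiniBand (MVAPICH/MVAPICH2) is only a matter of which mpicc and launcher are used.

```c
/*
 * Minimal MPI sketch: compile with MVAPICH2's mpicc to run over
 * InfiniBand, or with MPICH2's mpicc to fall back to Ethernet.
 * No source changes are needed either way.
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of ranks */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}
```

A typical build-and-run sequence would be something like "mpicc hello.c -o hello" followed by "mpirun -np 16 ./hello" (MVAPICH also ships mpirun_rsh as its launcher); the exact launcher and options depend on the local installation.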

Useful Links and Contact Info
* System G reservation page (wiki): http://www.csrl.cs.vt.edu/wiki/index.php/Main_Page
* System G administrator email: rhunter@vt.edu
* System G mailing list: SYSG@listserv.vt.edu
* MVAPICH and MVAPICH2: http://mvapich.cse.ohio-state.edu/overview/

A Power Profile for the HPCC Benchmark Suite