"SKIF-GRID": Supercomputing Project of the Union State of Russia and Belarus
Short Overview of Current Status
A. A. Moskovsky, Program Systems Institute, Russian Academy of Sciences
IKI-MSR Research Workshop, Moscow, 10-12 June 2009
Slide 2: Pereslavl-Zalessky
- Russian Golden Ring city, 857 years old
- Hometown of Great Dukes of Russia
- Site of the first shipyard of Peter the Great's navy
- Ancient capital of the Russian Orthodox church
- 120 km from Moscow
Slide 3: "SKIF-GRID" Project Timeline
1. 2000-2004 - SKIF project; SKIF K-1000 ranks #98 in the Top500
2. June 2004 - first proposal filed for the "SKIF-GRID" project
3. March 2007 - approved by the Government
4. March 2008 - SKIF-MSU supercomputer deployed (#36 in the June 2008 Top500)
5. May 2008 - "SKIF-Testbed" federation created
6. March 2009 - alliance agreement signed for SKIF Series 4 development
Slide 4: Project Organization, 2007-2008
Project directions:
1. Grid technology
2. Supercomputers (software and hardware)
3. Security
4. Pilot projects: applications of HPC and grid technology
Slide 5: "SKIF MSU"
Slide 6: SKIF MSU
- Theoretical peak performance: 60 TFlops (47 TFlops Linpack)
- Advanced clustering solutions: diskless computational nodes
- Original blade design

CPU architecture:   x86-64
CPU model:          Intel Xeon E5472, 3.0 GHz (4 cores)
Nodes (dual CPU):   625
CPU cores total:    5,000
Interconnect:       InfiniBand DDR, fat tree
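The figures above are internally consistent, which a quick sketch can confirm (the 4 double-precision flops per cycle per core is an assumption based on the SSE units of the Harpertown-generation Xeon E5472; it is not stated in the deck):

```python
nodes = 625            # dual-CPU nodes
sockets_per_node = 2
cores_per_cpu = 4      # quad-core Xeon E5472
clock_hz = 3.0e9
flops_per_cycle = 4    # assumption: 2 adds + 2 muls per cycle (SSE, double precision)

cores = nodes * sockets_per_node * cores_per_cpu
peak_flops = cores * clock_hz * flops_per_cycle

print(cores)                  # 5000 cores, matching the slide
print(peak_flops / 1e12)      # 60.0 TFlops theoretical peak
print(47 / 60)                # ~0.78 Linpack efficiency
```

The 47 TFlops Linpack result thus corresponds to roughly 78% of theoretical peak, a typical efficiency for an InfiniBand cluster of that era.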
Slide 7: "SKIF-Testbed" (a.k.a. "SKIF-Polygon")
- Federation of HPC centers, ~100 TFlops aggregate
- 4 machines in the current Top500:
  - MSU (#35 in the Top500)
  - South Ural State University
  - Tomsk State University
  - Ufa State Technical University
Slide 8: Middleware
- Middleware platform: UNICORE 6.1
- X.509 certificates for security; Certificate Authority at Pereslavl-Zalessky (PyCA)
- Site platform: UNICORE 6.1, Java 1.5, Linux, Torque
- Experimental sites: UNICORE is complemented with additional services/modules
Slide 9: Applications (2007-2008)
HPC applications:
- Drug design (MSU Belozersky Institute, SRCC, Chelyabinsk SU)
- Inverse problems in soil remote sensing (SRCC)
- Computational chemistry (MSU Chemistry Department)
- Geophysical data services
- Mammography database prototype (N. N. Semenov Institute of Chemical Physics, RAS)
- Text mining (PSI RAS)
- Engineering (South Ural State University, ...)
- Space Research Institute ...
Slide 10: SKIF-Aurora
2009-2010: second phase of the SKIF-GRID project
Slide 11: SKIF Series 4: Original R&D Goals
- Highest performance density (as many CPUs per 1U as possible):
  - Lower latency
  - Fewer cables and connectors, hence better reliability
  - But more heat dissipated per 1U: a new cooling technology is needed. How?
- Improved interconnect: better scalability, bandwidth, and latency than the best available solutions (e.g. InfiniBand QDR) provide
- A new approach to monitoring and management of the supercomputer
- Combining standard CPUs and accelerators in the computational nodes
Slide 12: Spring 2008: SKIF Series 4 - How To?
Slide 13: Summer 2008: SKIF Series 4 - Know How!
- Italian-Russian cooperation: "SKIF Series 4" == "SKIF-Aurora Project"
- Designed by an alliance of Eurotech, PSI RAS, and RSC SKIF, with support from Intel
- To be presented at ISC'09
Slide 14: SKIF-Aurora Distinctive Features
- No moving parts
- Liquid cooling for power efficiency
- x86-64 processors (Intel Nehalem)
- 3D-torus interconnect
- Redundant management/monitoring subsystem
- FPGA on board (optional)
- SSD disks (optional)
- QDR InfiniBand
Slide 15: SKIF-Aurora Packaging
- 32 nodes per chassis: 64 CPUs in 6U
- Up to 8 chassis per rack: up to 512 CPUs and 2,048 cores per rack
- 10 kW per chassis
- To build 500 TFlops in 2009: 21 racks
- Scalable thanks to the 3D-torus interconnect
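The packaging arithmetic can be checked in a few lines (a sketch; the 4-cores-per-CPU figure is an assumption based on the quad-core Nehalem parts named on slide 14, not stated here):

```python
nodes_per_chassis = 32
cpus_per_node = 2       # dual-socket nodes: 64 CPUs per 6U chassis
chassis_per_rack = 8
cores_per_cpu = 4       # assumption: quad-core Intel Nehalem

cpus_per_rack = nodes_per_chassis * cpus_per_node * chassis_per_rack
cores_per_rack = cpus_per_rack * cores_per_cpu

print(cpus_per_rack)    # 512 CPUs per rack, matching the slide
print(cores_per_rack)   # 2048 cores per rack, matching the slide
print(500 / 21)         # ~23.8 TFlops required per rack for the 500 TFlops target
```

The 500 TFlops / 21 racks target thus implies roughly 24 TFlops per rack, i.e. about 3 TFlops per 6U chassis at 10 kW.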
Slide 16: SKIF-Aurora: Designed by the Alliance of Eurotech, PSI RAS, and RSC SKIF
- PCBs, mechanics, power supply, cooling, levels 1 and 2 of the management system
- Level 3 of the management system; interconnect (3D torus: firmware, routing, drivers, MPI-2, ...); FPGA as accelerator
Slide 17: SKIF-Aurora Management Subsystem
Slide 18: 3D-Torus Interconnect Implementation
[Diagram: each node's CPU attaches to the FPGA-based system interconnect (3D torus, non-standard part) and to a subsidiary InfiniBand interconnect (standard part)]
- Only QCD-specific functionality is implemented by the Italian team
- Russian teams are to upgrade the network to a general-purpose interconnect (MPI 2.0), due to appear in fall 2009
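For intuition on why the deck calls the 3D torus scalable (a hypothetical sketch, not part of the original presentation): every dimension wraps around, so the minimal hop count between two nodes is a sum of wrap-aware per-dimension distances, and the network diameter grows only with the cube root of the node count.

```python
def torus_hops(a, b, dims):
    """Minimal hop count between nodes a and b in a torus with the
    given per-dimension sizes, taking wraparound links into account."""
    return sum(min(abs(x - y), d - abs(x - y))
               for x, y, d in zip(a, b, dims))

# In an 8x8x8 torus (512 nodes) the worst case is 4 hops per dimension,
# so the diameter is 12, while wraparound makes far "corners" adjacent:
dims = (8, 8, 8)
print(torus_hops((0, 0, 0), (4, 4, 4), dims))  # 12, the network diameter
print(torus_hops((0, 0, 0), (7, 0, 0), dims))  # 1, via the wraparound link
```

Doubling every dimension multiplies the node count by 8 but only doubles the diameter, which is the scaling property the slide alludes to.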
Slide 19: R&D Directions Using FPGA
- Collective MPI operations offloaded to the FPGA
- FPGA support for PGAS languages (UPC, Titanium, etc.)
- FPGA+CPU hybrid computing
Slide 20: Conclusions
The SKIF-Aurora project:
- Is based on collaboration between international teams
- Harnesses shared expertise and results
- Aims to develop a family of petascale supercomputers with innovative techniques:
  - Higher CPU density (flops per volume)
  - Efficient water cooling system
  - Scalable, powerful 3D-torus interconnect
  - Etc.
Slide 21: Datacenter visualization
Slide 22: Datacenter visualization
Slide 23: Thanks
SKIF-GRID web site: http://skif-grid.botik.ru