
SX-9, a good Choice?
Steinbuch Centre for Computing (SCC)
KIT – The cooperation of Forschungszentrum Karlsruhe GmbH and Universität Karlsruhe (TH)
Frank Schmitz | NUG XXI

Introduction
Karlsruhe Institute of Technology (KIT)
Steinbuch Centre for Computing (SCC) at KIT
The SCC environment
Why SX-9
SX-9 problems!
Benchmarks
Do we have a software or hardware problem?
Further investments to be done by NEC
Conclusion

Universität Karlsruhe (TH) + Forschungszentrum Karlsruhe = Karlsruhe Institute of Technology (KIT)
CC of the University + IWR of the Research Centre = SCC
After the merger, the Karlsruhe Institute of Technology will be a single legal entity.

Steinbuch Centre for Computing (SCC)
Founded on January 1st, 2008
Information Technology Center of KIT
Merger of the Computing Center of Karlsruhe University and the Institute of Scientific Computing of the Research Center Karlsruhe
One of the largest scientific computing centers in Europe
Two sites: SCC at the Research Center and SCC at Karlsruhe University

Mission of the SCC
IT services under one roof, together with own research and development
Promotion of research, teaching, study, further education and administration at KIT through excellent services
Major center for modelling, simulation and optimization
Leading role in Scientific Computing, HPC, Grids and Clouds, as well as Large Scale Data Management and Analysis

SCC – Competencies and Synergies
Synergies between R&D and IT services:
IT Management & Process Integration
Distributed Systems, Grid & Virtualization
IT Security & Service Management, innovative Network Technologies
HPC & Cluster Computing, Visualization
(Diagram: Grid/Cloud, HPC, SOA, Virtualization, Numerical Simulation & Optimization, Communication Technologies, IT Security, Large-Scale Data)

SCC – Main Topics of Research
High Performance Computing, Cluster Computing, HW-aware Computing
Grid and Cloud Computing, Virtualized Service Environments, Service-Oriented Architecture (SOA)
Large-Scale Data Management & Analysis, Large-Scale Data Facility (e.g. WLCG, genome research data)
Energy-efficient large system environments (Green IT)
Process Integration and Service Management

Environment at SCC
HPC clusters based on Intel or AMD processors with InfiniBand or Quadrics interconnects (XC1, XC2, IC1, OPUS IB, bwGRiD, ~9000 cores)
GridKa cluster for the Large Hadron Collider (LHC) project at CERN (~8500 cores, PBytes of disk storage and tape, no InfiniBand interconnect)
2 x SX-8R, each with 8 processors and 256 GByte of main memory
An SX-9 system with 16 processors and 1 TB of main memory
80 TByte of GFS disk storage shared by all three systems
A Power6 blade system (JS22) running AIX, with 112 cores and InfiniBand

Why SX-9
A large memory node: 1 TB and 16 processors; 16 processors is not a must, 8 would be better, and even 4 could be enough!
Application CPU performance should be a factor of 2-3 over the SX-8R; this should be realistic because the SX-8R has a memory bottleneck (100 GFlop/s peak for the SX-9 compared with 35 GFlop/s for the SX-8R).
The scalar unit should be a factor of 2 faster on the SX-9.
Power consumption should be lower.
Floor space should be smaller.
Seamless integration into the existing SX-8R world, including the front-end environment; a technology upgrade of the SX-8R.
Sustainable development for the KIT users up to 2011.
A pre-stage for users to move on to larger NEC vector systems.
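A quick check of that expected factor, using only the peak figures quoted on this slide (sustained application performance is of course lower):

$$
\frac{P_{\text{peak}}(\text{SX-9})}{P_{\text{peak}}(\text{SX-8R})} = \frac{100~\text{GFlop/s}}{35~\text{GFlop/s}} \approx 2.9
$$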

SX-9 problems!
Five processors of the SX-9 node failed in the first quarter.
To replace a failed processor, the system has to be shut down, repaired and topped up with cooling liquid.
Scalar CPU performance is worse than assumed.
The achieved vector performance is poor compared with the SX-8R.
SUPER-UX 17.1 runs on the SX-8R and SUPER-UX 18.1 on the SX-9.
At the moment the users have two different front-end systems, two different environments and different schedulers.
Air cooling problems (35 kW), although we have enough cold water!
Cooling problem with one memory hub (last Sunday!)

Benchmarks

Triad (d = a + b*c):

  Vector length   SX-8R (vector)   SX-9 (vector)
  …               … s              0.08 s
  …               … s              0.30 s
  …               … s              3.20 s
  …               … s              31.90 s
  …               … s              319.00 s

Solution of a band matrix:

            SX-8R      SX-9
  vector    15.8 s     11.3 s (5.7 s)*
  scalar    716.9 s    925.6 s

  * You can't use this code on the SX-8R, which has no ADB!
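For readers unfamiliar with the kernel, the triad above is simply d[i] = a[i] + b[i] * c[i] over the given vector length. A minimal, self-contained timing sketch in C is shown below; the vector length N and the repeat count are placeholders chosen for illustration, not the values measured on the SX machines.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Minimal triad microbenchmark: d[i] = a[i] + b[i] * c[i].
 * N and REPEAT are illustrative placeholders, not the vector
 * lengths from the measurements above. */
#define N      1000000L
#define REPEAT 100

int main(void) {
    double *a = malloc(N * sizeof *a), *b = malloc(N * sizeof *b);
    double *c = malloc(N * sizeof *c), *d = malloc(N * sizeof *d);
    if (!a || !b || !c || !d) return 1;

    for (long i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; c[i] = 0.5; }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int r = 0; r < REPEAT; r++)
        for (long i = 0; i < N; i++)      /* vectorizable inner loop */
            d[i] = a[i] + b[i] * c[i];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
    /* one multiply and one add per element per repetition */
    printf("triad: %.3f s, %.2f GFlop/s (d[0] = %f)\n",
           secs, 2.0 * N * REPEAT / secs / 1e9, d[0]);

    free(a); free(b); free(c); free(d);
    return 0;
}
```

On a vector machine the inner loop vectorizes fully, so the measured rate is essentially limited by memory bandwidth, which is why this kernel is used to compare the SX-8R and the SX-9.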

Benchmarks: COSMO/LM weather code on different systems (chart).

Do we have a software or hardware problem?
The scalar unit of the SX-9 is slow.
The compiler does not use the ADB automatically; directives are needed (see the sketch below).
We do not know what information the compiler places in the ADB → a tuning guide is needed.
It seems the libraries are not yet optimized for the SX-9.
By now the hardware looks stable. ☺
The newest SUPER-UX is not supported on the SX-8R.
KIT users have jobs with vector operation ratios of 94% to 99% and long average vector lengths, but in all cases this is not enough to reach a speed-up > 2 over the SX-8R.
Memory usage close to the full shared main memory, combined with heavy communication, increases the elapsed time of all jobs on the node → bank conflicts.
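To make the ADB remark above concrete: loops that re-read the same small array are candidates for the SX-9's Assignable Data Buffer, and the compiler only places data there when directed. The sketch below shows where such a directive would go in C; the pragma spelling is an assumption modelled on NEC's ON_ADB directive, so the exact syntax should be taken from the SX-9 compiler and tuning documentation.

```c
/* Sketch: hinting that a small, re-read array should be kept in the
 * SX-9 Assignable Data Buffer (ADB). The directive below is an
 * assumed spelling modelled on NEC's ON_ADB directive; check the
 * SX-9 tuning guide for the exact pragma syntax. */
void weighted_sum(int n, int m,
                  const double *restrict w,   /* small array, re-read for every i */
                  const double *restrict x,   /* length n + m - 1 */
                  double *restrict y)         /* length n */
{
#pragma cdir on_adb(w)                        /* assumed directive: keep w[] in the ADB */
    for (int i = 0; i < n; i++) {
        double s = 0.0;
        for (int j = 0; j < m; j++)           /* w[] is reused in every outer iteration */
            s += w[j] * x[i + j];
        y[i] = s;
    }
}
```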

Do we have a software or hardware problem? (continued)
DGEMM should run at more than 90 GFlop/s on a single CPU (Schoenemeyer, November 2008), but only 84 GFlop/s were achieved!
Is a tuning guide for the SX-9 needed?
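For orientation, the kind of measurement behind such a single-CPU DGEMM figure can be reproduced with the sketch below, written against the generic CBLAS interface rather than any vendor-tuned BLAS; the matrix size is a placeholder chosen for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <cblas.h>   /* link against the platform's BLAS library */

/* Generic DGEMM throughput check: C = alpha*A*B + beta*C.
 * The matrix size n = 4096 is an illustrative placeholder. */
int main(void) {
    const int n = 4096;
    double *A = malloc((size_t)n * n * sizeof *A);
    double *B = malloc((size_t)n * n * sizeof *B);
    double *C = malloc((size_t)n * n * sizeof *C);
    if (!A || !B || !C) return 1;
    for (size_t i = 0; i < (size_t)n * n; i++) { A[i] = 1.0; B[i] = 0.5; C[i] = 0.0; }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                n, n, n, 1.0, A, n, B, n, 0.0, C, n);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
    /* DGEMM performs 2*n^3 floating-point operations (multiply + add) */
    printf("DGEMM %d x %d: %.2f s, %.1f GFlop/s (C[0] = %f)\n",
           n, n, secs, 2.0 * n * n * n / secs / 1e9, C[0]);

    free(A); free(B); free(C);
    return 0;
}
```

Dividing the measured rate by the CPU's peak rate gives the efficiency that the 84 versus 90 GFlop/s discussion above is about.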

Further investments to be done by NEC
Speed up the existing scalar units at customer sites by a factor of two.
Further investigation into the compiler technology in use: ADB techniques, a tuning handbook.
Harmonize the SX-8R and SX-9 environments: same compiler versions, libraries, operating system, scheduler, ...
Support customers through the NEC benchmark team.
Cool the system using the on-site cold water installation.
Better explanation of the hardware and software and how they interact.

Conclusion
Why should customers use NEC vector systems in the future?
Because of reliability?
GFlop/s per Watt?
GFlop/s per m² of floor space?
Serial application performance?
General-purpose use?
Main memory?
Price?
TCO of the solution for the main applications?

Thank you for your attention.