Computing Labs CL5 / CL6 Multi-/Many-Core Programming with Intel Xeon Phi Coprocessors Rogério Iope São Paulo State University (UNESP)

MPC Lab Sessions
Hands-on activities divided into topics, in two sessions of about 3 hours each
Learners proceed from one topic to the next at their own speed
First session
– Starts with a live demo on how to access the remote system
– Finishes with a high-performance test-drive, in which the participant is guided through extracting the maximum performance from a coprocessor
Second session
– Provides extra coverage of introductory aspects of programming the Intel manycore coprocessor
– Concludes with an example of how to improve the performance efficiency of applications developed for the Xeon Phi

Lab Session 1 (CL5)
1. Introduction to the Intel Xeon Phi Coprocessor
– Overview of the hardware architecture
– Overview of the system software and programming models
2. Compiling and running simple applications
3. High-performance Test-Drive
4. Running a basic N-body simulation (optional)

Lab Session 2 (CL6)
1. Task Parallelism with OpenMP and Cilk Plus
– Overview of OpenMP
– Overview of Cilk Plus
2. Intel MPI Programming Models
3. Using Intel Math Kernel Library (MKL)
4. Optimizing a real-world code example

Intel / Unesp Manycore Testing Lab
A special remote system that allows faculty and students to work with computers with lots of cores
One of the first manycore labs outside the U.S.
– Server donated by Intel
  • Intel Xeon Phi coprocessors
  • Suite of Intel's software development tools
– Users
  • test the performance of their codes in a highly parallel system
  • are registered as guests in the CSC user database (LDAP)
– Authentication / authorization
  • controlled by digital certificates issued by ANSP Grid CA
First results: hands-on activities at
– INFIERI Summer School 2013 (University of Oxford)
– Intel Software Conference 2013 (UNESP/SP and COPPE/RJ)
– SBAC-PAD 2013 Parallel Programming Marathon

Intel / Unesp Manycore Testing Lab
Host Server
– 2x Intel Xeon processor, 8-core, 2.3 GHz (E5-2670)
– 64 GB memory, 1+1 TB disk
– 2 network links:
  • University network (commodity, shared)
  • High-speed optical network (dedicated)
– 3x Xeon Phi 57-core, 1.1 GHz, 6 GB GDDR mem (3120A)
Total number of cores
– Xeon processors: 16 cores, 32 threads
– Xeon Phi coprocessors: 171 cores, 684 threads
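
The core and thread totals on the host-server slide can be reproduced with a few lines of arithmetic. The sketch below is only an illustration and is not part of the lab material. The `Device` class and its field names are hypothetical. The threads-per-core values (2 for the Xeon, 4 for the Xeon Phi) are not stated on the slide and are inferred from its totals (32/16 and 684/171).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical helper describing one kind of processor in the host."""
    name: str
    count: int             # number of chips or cards installed (from the slide)
    cores_each: int        # physical cores per chip or card (from the slide)
    threads_per_core: int  # assumption: inferred from the slide's totals

    @property
    def cores(self) -> int:
        # Total physical cores across all installed units
        return self.count * self.cores_each

    @property
    def threads(self) -> int:
        # Total hardware threads across all installed units
        return self.cores * self.threads_per_core


xeon = Device("Xeon E5-2670", count=2, cores_each=8, threads_per_core=2)
phi = Device("Xeon Phi 3120A", count=3, cores_each=57, threads_per_core=4)

for dev in (xeon, phi):
    print(f"{dev.name}: {dev.cores} cores, {dev.threads} threads")
# Prints:
# Xeon E5-2670: 16 cores, 32 threads
# Xeon Phi 3120A: 171 cores, 684 threads
```

Knowing the hardware thread count per device is useful when choosing how many threads to request for a run on the host or on a coprocessor.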

Intel Xeon Phi coprocessors portfolio

Formal Agreement with Intel
28/Aug/13, HEP Workshop
