CSC 7600 Lecture 28 : Final Exam Review Spring 2010 HIGH PERFORMANCE COMPUTING: MODELS, METHODS, & MEANS FINAL EXAM REVIEW Daniel Kogler, Chirag Dekate.

Presentation transcript:

CSC 7600 Lecture 28: Final Exam Review, Spring 2010
HIGH PERFORMANCE COMPUTING: MODELS, METHODS, & MEANS
FINAL EXAM REVIEW
Daniel Kogler, Chirag Dekate & Timur Gilmanov
Department of Computer Science
Louisiana State University
May 5th, 2011

Lecture 2: Parallel Computer Architecture
HPC System Stack (5)
Performance Factors (7, 8, 9, 10)
Scalability (21, 22)
MIMD, SIMD, Vector Processing (Pipelining – Important), Shared Memory

Lecture 3: Commodity Clusters
Commodity clusters vs Constellations (8)
Key parameters for cluster computing (24)
Where is the parallelism (25)
Constituent hardware elements (27)
Decoupled Work Queue Model (46)

Lecture 4: Benchmarking
Basic Performance metrics (4)
Benchmarking Definition (5)
Purpose of Benchmarking (6)
Linpack and HPL

Lecture 5: Capacity Computing
Speedup & Efficiency (5)
Capacity, Capability, Cooperative – Important (7, 8, 11)
Ideal Speedup (18, 19)
Granularities in parallelism (20)
Overhead (21, 22, 23, 24)
Condor Class Ads (30, 31)
Condor MatchMaker (32)
Condor commands (37)
Capacity Computing Performance issues (53)

Lecture 6: Communicating Sequential Processes (CSP)
Scalability, Strong Scaling, Weak Scaling (7, 8, 9, 10)
Cooperative Computing (12)
Cooperative Computing Highlights:
–Data decomposition Goals in CSP (13)
–Distributed Concurrent Processes (14)
–Data Exchange (15)
–Synchronization (16)
Performance issues in CSP (64, 65)

Lecture 7: MPI
Point to point communication in-depth
Deadlock & how to resolve deadlocks
Be able to understand MPI programs, detect conceptual flaws in them, and correct the errors

Lecture 8: MPI Collective Calls
Be able to understand MPI programs, detect conceptual flaws in them, and correct the errors

Lecture 9: SMP
IMPORTANT - AMDAHL'S LAW (9, 10, 11, 12, 13)
Levels of Memory Hierarchy (30)
Cache Measures (31, 32)
IMPORTANT - CACHE PERFORMANCE (33): refer to Lecture 10, which covers this topic more completely

Lecture 10: Enabling Technologies
Logic technology metrics (12)
What is bisection bandwidth (67)

Lecture 11: Pthreads
IMPORTANT – CPI – Cycles Per Instruction (7, 8, 9, 10)
Race Conditions, Critical Sections (15, 16)
Thread Synchronization Mechanisms (17 – 22)
Important - Deadlock, Livelock, Starvation (29)
Priority inversion (30)

Lecture 12: OpenMP
Where are we? (4) – use as a guide for material to study
HPC Modalities (5)
OpenMP data environment (28)
OpenMP work-sharing directives (29)
OpenMP thread synchronizations (40, 41, 42, 43)
OpenMP reduction
Be able to read and understand OpenMP C source code, detect anomalies, and correct the errors

Lecture 13: Performance
Performance Counters (11, 12, 13, 14)
Performance Analysis Tools (15)
Gprof (20)
PerfSuite (24, 25)
PAPI (28, 29, 30, 31, 32)
TAU (49, 50, 51, 78, 79)
SMP to MPP (60, 62, 63)

Lecture 14: Visualization
Basic gnuplot commands (48 – 56)

Lecture 15: Parallel Algorithms 1
OpenMP and MPI Matrix Multiplication (54 – 70)
Review source code

Lecture 16: Parallel Algorithms 2
Parallel Matrix Processing and Locality (5, 6, 7, 8, 9)
Matrix Transpose (18-19, 21-23)

Lecture 17: Parallel Algorithms 3
Review Parallel Sorting Algorithms

Lecture 18: Parallel Algorithms 4
Not Testable

Lecture 19: Parallel File I/O 1
RAID (9, 10, 11, 12, 13, 14)
Distributed File Systems: NFS (16, 17, 18)
Parallel File Systems (20, 21, 22, 23)

Lecture 20: Parallel File I/O 2
Not Testable

Lecture 21: Operating System 1
Operating System (7, 12, 13, 14, 15, 16)
Process Management (18, 19, 20, 23, 24, 25)
Threads (27, 28)
Memory Management (31, 32)
Storage Management (35, 36)
OS Kernel (39)
Modern Operating Systems (41, 42, 44, 45)
Unix and Linux (57, 67, 68)

Lecture 22: Libraries 1
Lecture 19:
–Static & Dynamic libraries (27, 28)

Lecture 23: PFIO3 + Libraries 2
Not Testable

Lecture 24: Operating Systems 2
Linux and Unix Concurrency mechanisms (24, 25)
Linux and Unix Scheduling (26, 27, 28)
Unix & Linux I/O (32, 33)
Lightweight Kernels (49, 50)
Compute Node Kernels (55, 56, 57)

Lecture 25: Scheduling
Job Scheduling (4, 5)
CPU Scheduling algorithms (9 – 15)
Workload Management Systems (20 – 27)
Scheduling Algorithms for WMS: FIFO, FIFO with Backfill, EASY, Conservative (29 – 46)

Lecture 26: Checkpointing and System Administration
Not Testable

Lecture 27: Beyond and Beyond
Not Testable
