Presentation transcript:

Fall 2007 CS 668/788 Parallel Computing Fred Annexstein

Lecture 1: Welcome
–Goals of this course
–Syllabus, policies, grading
–Blackboard resources
–LINC Linux cluster
–Introduction/motivation for HPPC
–Scope of the problems in parallel computing

Goals
Primary:
–Provide an introduction to the computing systems, programming approaches, and some common numerical and algorithmic methods used for high-performance parallel computing
Secondary:
–Offer a course meeting the competency requirements of the RRSCS
–Provide hands-on parallel programming experience

Official Syllabus
Available on Blackboard
Recommended textbooks:
–1. Parallel Programming in C with MPI and OpenMP, Michael J. Quinn
–2. Parallel Programming with MPI, Peter Pacheco
–3. Introduction to Parallel Computing: Design and Analysis of Algorithms, Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar
–4. Using MPI, 2nd Edition: Portable Parallel Programming with the Message Passing Interface, William Gropp

Workload/Grading
Exams (2)
–Graded; 40% of grade
Written exercises (3-4)
–May or may not be graded
Programming assignments (3-4)
–May be done in groups
–MPI programming, performance measurement
Research papers (1)
–Discussion of research questions, strengths, weaknesses, interesting points, contemporary bibliography
Final project (1)
–Group programming project and report

Policies
Missed exams:
–Missed exams cannot be made up unless pre-approved. Please see the instructor as soon as possible in the event of a conflict.
Academic misconduct:
–Plagiarism on assignments, quizzes, or exams will not be tolerated. See your student code of conduct for more on the consequences of academic misconduct. There are no “small” offenses.

Blackboard
–Syllabus and my contact info
–Announcements
–Lecture slides
–Assignment handouts
–Web resources relevant to the course
–Discussion board
–Grades

What is the Ralph Regula School?
The Ralph Regula School of Computational Science is a statewide, virtual school focused on computational science. It is a collaborative effort of the Ohio Board of Regents, the Ohio Supercomputer Center, the Ohio Learning Network, and Ohio's colleges and universities. With funding from NSF, the school acts as a coordinating entity for a variety of computational science education activities aimed at making education in computational science available to students across Ohio, as well as to workers seeking continuing education about this technology. Website:

CS LINC Cluster
Michal Kouril’s links
–See the README file for instructions on running MPI code on beowulf.linc.uc.edu
Accounts
–ECE/CS students should already have an account
–I can request accounts for the non-ECE/CS students
Access
–Remote access only; the cluster is in the ECE/CS server/machine room on the 8th floor of Rhodes, visible through windows in the 890’s hallway

Why HPPC?
Who needs a roomful of computers anyway?
–My PC and Xbox run at GFLOP rates (billions of floating-point operations per second)
–NCSA TeraGrid IA-64 Linux Cluster ( Hardware/TGIA64LinuxCluster/)

Needed by People Who Study Science and Engineering
–Materials / superconductivity
–Fluid flow
–Weather / climate
–Structural deformation
–Genetics / protein interactions
–Seismic
Many research projects in the natural sciences and engineering cannot exist without HPPC.

Why are the problems so large?
3-dimensional
–If you want to increase the level of resolution by a factor of 10, the problem size increases by 10^3
Many length scales (both time and space)
–If you want to observe the interactions between very small, local phenomena and larger, more global phenomena
The number of relationships between data items grows quadratically.
–Example: the human genome has about 3.2 G base pairs, so the number of pairwise relations is roughly (3.2 × 10^9)^2 / 2 ≈ 5E18

How can you solve these problems?
Take advantage of parallelism
–Large problems generally have many operations that can be performed concurrently
Parallelism can be exploited at many levels by the computer hardware
–Within the CPU core: multiple functional units, pipelining
–Within the chip: many cores
–On a node: multiple chips
–In a system: many nodes

However…
Parallelism has overheads
–At the core and chip level, the cost is complexity and money
–Most applications get only a fraction of peak performance (10%-20%)
–At the chip and node level, the memory bus can get saturated if there are too many cores
–Between nodes, the communication infrastructure is typically much slower than the CPU

Necessity Yields Modest Success
–Power of CPUs keeps growing exponentially
–Parallel programming environments are changing very slowly – much harder than sequential
Two standards have emerged:
–MPI library, for processes that do not share memory
–OpenMP directives, for processes that do share memory

Why MPI?
–MPI = “Message Passing Interface”
–Standard specification for message-passing libraries
–Very portable
–Libraries available on virtually all parallel computers
–Free libraries also available for networks of workstations or commodity clusters
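To make the message-passing model concrete, here is a minimal sketch in C (an illustration added to these notes, not part of the original slides; the file name and token value are arbitrary). Each process reports its rank, and process 0 sends a single integer to process 1:

```c
/* hello_mpi.c - minimal MPI sketch: each process reports its rank, and
 * rank 0 sends a token to rank 1 to illustrate point-to-point messaging. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);                 /* start the MPI runtime   */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I?     */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many processes?     */

    printf("Hello from process %d of %d\n", rank, size);

    if (size > 1) {
        int token;
        if (rank == 0) {
            token = 42;                     /* arbitrary example value */
            MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("Process 1 received token %d from process 0\n", token);
        }
    }

    MPI_Finalize();                         /* shut down the MPI runtime */
    return 0;
}
```

On a typical MPI installation this would be built with mpicc and launched with mpirun (for example, mpirun -np 4 ./hello_mpi); see the LINC cluster README mentioned earlier for the exact procedure on beowulf.linc.uc.edu.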

Why OpenMP?
–OpenMP is an application programming interface (API) for shared-memory systems
–Based on a model of creating and scheduling multi-threaded computations
–Supports high-performance parallel programming of symmetric multiprocessors
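For comparison, here is a minimal OpenMP sketch in C (again an added illustration, assuming a compiler with OpenMP support such as gcc -fopenmp). A single directive parallelizes an ordinary loop, and the reduction clause combines each thread's partial sum:

```c
/* sum_openmp.c - minimal OpenMP sketch: a compiler directive turns an
 * ordinary loop into a multi-threaded one.
 * Build with, e.g., gcc -fopenmp sum_openmp.c -o sum_openmp */
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void)
{
    static double a[N];
    double sum = 0.0;

    for (int i = 0; i < N; i++)      /* initialize the data */
        a[i] = 0.001 * i;

    /* The pragma splits the iterations among threads; reduction(+:sum)
     * gives each thread a private partial sum and adds them at the end. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %f (using up to %d threads)\n", sum, omp_get_max_threads());
    return 0;
}
```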

“All About the Benjamins”
Commercial parallel systems
–Relatively costly per processor
–Primitive programming environments
–Focus on commercial sales
–Scientists looked for an alternative
Beowulf concept
–NASA (Sterling and Becker)
–Commodity processors
–Commodity interconnect
–Linux operating system
–Message Passing Interface (MPI) library
–High performance/$ for certain applications

Programmer Desperately Seeking Concurrency
Task dependence graph
–Directed graph
–Vertices = tasks
–Edges = dependences
Data parallelism
–Independent tasks apply the same operation to different elements of a data set
–Okay to perform operations concurrently
Functional parallelism
–Independent tasks apply different operations to different data elements
–First and second statements
–Third and fourth statements
Pipelining
–Divide a process into stages
–Produce several items simultaneously
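A small sketch (added here for illustration, using the OpenMP directives introduced above) shows data parallelism and functional parallelism side by side; pipelining is omitted for brevity:

```c
/* parallelism_forms.c - illustrative sketch of two forms of concurrency.
 * Build with, e.g., gcc -fopenmp parallelism_forms.c -lm */
#include <omp.h>
#include <math.h>
#include <stdio.h>

#define N 8

int main(void)
{
    double a[N];
    double sum = 0.0, product = 1.0;

    /* Data parallelism: the same operation (square root) is applied
     * independently to different elements, so iterations can run
     * concurrently. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = sqrt((double)(i + 1));

    /* Functional parallelism: two independent computations (a sum and a
     * product over the same data) can be assigned to different threads. */
    #pragma omp parallel sections
    {
        #pragma omp section
        for (int i = 0; i < N; i++) sum += a[i];

        #pragma omp section
        for (int i = 0; i < N; i++) product *= a[i];
    }

    printf("sum = %f, product = %f\n", sum, product);
    return 0;
}
```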

Why not just use a Compiler?
Parallelizing compiler
–Detects parallelism in a sequential program
–Produces a parallel executable program
Advantages
–Can leverage millions of lines of existing serial programs
–Saves time and labor; requires no retraining of programmers
–Sequential programming is easier than parallel programming
Disadvantages
–Parallelism may be irretrievably lost when programs are written in sequential languages
–Simple example: compute all partial sums in an array (see the sketch below)
–Performance of parallelizing compilers on a broad range of applications is still up in the air
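The partial-sums example can be made concrete with a short sketch (added here for illustration). Written the natural sequential way, each iteration depends on the previous one, so a parallelizing compiler will normally leave the loop sequential, even though efficient logarithmic-depth parallel prefix-sum algorithms exist:

```c
/* prefix_sum.c - the "compute all partial sums" example, written the
 * obvious sequential way. */
#include <stdio.h>

#define N 8

int main(void)
{
    double a[N] = {1, 2, 3, 4, 5, 6, 7, 8};
    double partial[N];

    /* Loop-carried dependence: partial[i] needs partial[i-1], which was
     * produced in the previous iteration. Expressed this way, the
     * parallelism of the underlying problem is hidden from the compiler. */
    partial[0] = a[0];
    for (int i = 1; i < N; i++)
        partial[i] = partial[i - 1] + a[i];

    for (int i = 0; i < N; i++)
        printf("partial[%d] = %g\n", i, partial[i]);
    return 0;
}
```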

Or We Could Extend Languages
Programmer can give directives or clues to the compiler about how to parallelize
Advantages
–Easiest, quickest, and least expensive
–Allows existing compiler technology to be leveraged
–New libraries can be ready soon after new parallel computers are available
Disadvantages
–Lack of compiler support to catch errors
–Easy to write programs that are difficult to debug

Or Create New Parallel Languages
Advantages
–Allows the programmer to communicate parallelism to the compiler directly
–Improves the probability that the executable will achieve high performance
Disadvantages
–Requires development of new compilers
–New languages may not become standards
–Programmer resistance

Where are we in 2007?
–Low-level approach is most popular
–Augment an existing language with low-level parallel constructs and directives
–MPI and OpenMP are prime examples
Advantages
–Efficiency
–Portability
Disadvantages
–More difficult to program and debug

Programming Assignment #1
Log into beowulf.linc.uc.edu and run the sample programs.

Reading Assignment #1 on Blackboard