IEEE-HKN Chapter at Wichita State


IEEE-HKN Chapter at Wichita State, Wichita, KS. “High-Performance Computing Systems and Our Future.” Understanding the world’s biggest mysteries (such as the human genome and climate change) would be impossible without computers. Supercomputers operate at Peta (10^15) FLOPS but require tens of millions of dollars. High-performance computing (HPC) offers comparable processing speed (Tera, 10^12 FLOPS) with far greater affordability, accessibility, and ease of use. This presentation will introduce HPC systems, give examples of how HPC solves big-data problems effectively, and discuss the societal benefits of research on HPC systems. Presenter: Abu Asaduzzaman, Associate Professor of Computer Engineering, Director of CAPPLab, and Advisor of the IEEE Student Branch. November 16, 2017

“High-Performance Computing Systems and Our Future” Outline
► Introduction: We and Our Future; High-Performance Computing (HPC) Activities
HPC Systems: Computer Systems; SMT-Capable CPU-GPU Systems
Examples: HPC Systems for Big Data – Matrix Multiplication, Graph
CAPPLab Research
Concluding Remarks

“High-Performance Computing Systems and Our Future” We and Our Future
When and where was Eta Kappa Nu (HKN) founded? In 1904 at UIUC, by Maurice L. Carr [1]
What is the origin of the name HKN? The Greek word for electron: ΗΛΕΚΤΡΟΝ [1]
What was the original goal of HKN? … to help electrical engineering graduates find employment [2] …
Which fields does HKN recognize today? The IEEE-designated fields: Engineering, Computer Sciences and Information Technology, Physical Sciences, Biological and Medical Sciences, Mathematics, Technical Communications, Education, Management, and Law and Policy. [2]
[1] “About HKN,” http://hkn.ieee.org/about/
[2] “Qualifications for IEEE Membership,” https://www.ieee.org/membership_services/membership/qualifications.html

“High-Performance Computing Systems and Our Future” We and Our Future (+)
Industry: apply and research.
Academia (and other organizations): teach and research.

“High-Performance Computing Systems and Our Future” We and Our Future (+) National Association of Colleges and Employers (NACE) SALARY SURVEY 2017 [1] [1] https://www.naceweb.org/uploadedfiles/files/2017/publication/executive-summary/2017-nace-salary-survey-fall-executive-summary.pdf

“High-Performance Computing Systems and Our Future” We and Our Future (+) Michigan Tech – 2018 Engineering Salary Statistics – Engineers Get Top Pay [1] [chart: starting salaries for computer engineering (hardware and software), electrical engineering, and mechanical engineering] [1] http://www.mtu.edu/engineering/outreach/welcome/salary/

“High-Performance Computing Systems and Our Future” High-Performance Computing (HPC) Activities What is high-performance computing? HPC generally refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation in order to solve large problems in science, engineering, or business. [1] Multicore central processing unit (CPU) and many-core graphics processing unit (GPU) cards are typically used to build HPC systems. [1] https://insidehpc.com/hpc-basic-training/what-is-hpc/

“High-Performance Computing Systems and Our Future” HPC in Academia [images: Google search results for HPC in academia]

“High-Performance Computing Systems and Our Future” HPC Research in NSF [images: Google search results for NSF-funded HPC research]

“High-Performance Computing Systems and Our Future” U.S. Leadership in HPC [1, 2] [1] https://obamawhitehouse.archives.gov/blog/2015/07/29/advancing-us-leadership-high-performance-computing [2] https://www.nitrd.gov/nitrdgroups/images/b/b4/NSA_DOE_HPC_TechMeetingReport.pdf

“High-Performance Computing Systems and Our Future” HPC Research at WSU

“High-Performance Computing Systems and Our Future” Outline
► Introduction: We and Our Future; High-Performance Computing (HPC) Activities
HPC Systems: Computer Systems; SMT-Capable CPU-GPU Systems
Examples: HPC Systems for Big Data – Matrix Multiplication, Graph
CAPPLab Research
Concluding Remarks

“High-Performance Computing Systems and Our Future” Computer Systems [images: Google search results for computer system architectures]

“High-Performance Computing Systems and Our Future” SMT-Capable CPU-GPU Systems = HPC Systems
SMT – Simultaneous Multi-Threading; CPU – Central Processing Unit; GPU – Graphics Processing Unit
[diagrams: a pure Harvard architecture and a von Neumann architecture]
Name of the Game: performance, energy consumption, cost, …

“High-Performance Computing Systems and Our Future” CPU-GPU Systems (+) Time-Efficient Computing
SMT – Simultaneous Multi-Threading; CPU – Central Processing Unit; GPU – Graphics Processing Unit
Compute Unified Device Architecture (CUDA) – a parallel computing platform and application programming interface (API) model created by NVIDIA
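As a concrete taste of the CUDA model (a minimal vector-add sketch of our own; names such as vecAdd, h_a, and d_a are illustrative, not from the talk): the CPU (host) allocates GPU memory, copies data over, launches a kernel that runs one thread per element, and copies the result back.

```cuda
#include <stdio.h>
#include <stdlib.h>

// One GPU thread per element: c[i] = a[i] + b[i].
__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main(void)
{
    const int n = 1 << 20;                   // one million elements
    size_t bytes = n * sizeof(float);
    float *h_a = (float *)malloc(bytes), *h_b = (float *)malloc(bytes),
          *h_c = (float *)malloc(bytes);     // host (CPU) buffers
    for (int i = 0; i < n; i++) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    float *d_a, *d_b, *d_c;                  // device (GPU) buffers
    cudaMalloc((void **)&d_a, bytes);
    cudaMalloc((void **)&d_b, bytes);
    cudaMalloc((void **)&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    vecAdd<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);  // 256 threads/block
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);

    printf("c[0] = %f\n", h_c[0]);           // expect 3.0
    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```

The launch line is where the time efficiency comes from: the million additions are spread across thousands of GPU cores instead of looping on one CPU core.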

“High-Performance Computing Systems and Our Future” SMT-Capable CPU-GPU System CPU-GPU Systems (+)
SMT – Simultaneous Multi-Threading; CPU – Central Processing Unit; GPU – Graphics Processing Unit
[diagram: multicore CPU with a many-core GPU card, instruction execution]
A process is a running program. A process can spawn many threads. …

“High-Performance Computing Systems and Our Future” SMT-Capable CPU-GPU Systems = HPC Systems
CL1 – Level-1 Cache; CL2 – Level-2 Cache
Cache and memory are very power-hungry: more energy consumption means more heat dissipation!
HPC: CPU i7-980X 130 W, GPU Tesla K80 300 W [1, 2]
Tianhe-1A consumes 4.04 MW; at $0.10/kWh, 4 MW (4,000 kW) costs $400 per hour, or about $3.5 million per year. [3]
Name of the Game: performance, energy consumption, cost, …
[1] https://ark.intel.com/products/series/79666/Legacy-Intel-Core-Processors
[2] https://images.nvidia.com/content/pdf/kepler/Tesla-K80-BoardSpec-07317-001-v05.pdf
[3] https://en.wikipedia.org/wiki/Supercomputer

“High-Performance Computing Systems and Our Future” SMT-Capable CPU-GPU Systems = HPC Systems
HPC Systems: If an SMT-capable 16-core CPU and a 5,000-core GPU card are used to build an HPC system, it offers about 9 Tera (10^12) FLOPS and costs about $5K. [1] HPC: CPU i7-980X 130 W, GPU Tesla K80 300 W
Supercomputers: A supercomputer may have roughly 300,000 processing cores and operate at Peta (10^15) FLOPS; however, it costs tens of millions of dollars. [2, 3] Tianhe-1A: 4.04 MW (about $3.5 million per year)
Name of the Game: performance, energy consumption, cost, …
[1] https://insidehpc.com/hpc-basic-training/what-is-hpc/
[2] https://en.wikipedia.org/wiki/Supercomputer
[3] https://www.anandtech.com/show/8729/nvidia-launches-tesla-k80-gk210-gpu

“High-Performance Computing Systems and Our Future” Outline
► Introduction: We and Our Future; High-Performance Computing (HPC) Activities
HPC Systems: Computer Systems; SMT-Capable CPU-GPU Systems
Examples: HPC Systems for Big Data – Matrix Multiplication, Graph
CAPPLab Research
Concluding Remarks

“High-Performance Computing Systems and Our Future” HPC Systems for Big Data
Matrix Multiplication (MM): [C] = [A] [B]
For a 2 x 2 matrix: 8 (i.e., 2 * 2^2) multiplications and 4 (i.e., 1 * 2^2) additions. For example, c11 = a11*b11 + a12*b21 costs 2 multiplications and 1 addition; over the 4 elements of C, that is 8 multiplications and 4 additions.

“High-Performance Computing Systems and Our Future” HPC Systems for Big Data
Matrix Multiplication (MM)
2 x 2 matrix: (2^3) 8 multiplications
3 x 3 matrix: (3^3) 27 multiplications
4 x 4 matrix: (4^3) 64 multiplications
Are we reducing the number of multiplications? (No.) What is the message? With many 2 x 2 solvers working in parallel (8 MULT each), a 4 x 4 product takes “only” 2 * 8 = 16 MULT time units (64 multiplications in the time of 16); see the serial baseline sketched below. Do we have many solvers/cores? (Yes, GPU.)
[figure: matrices A, B, C]
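To ground the counts above, here is a minimal serial baseline (plain C host code; matmul and its argument names are our illustration, not from the talk). The triple loop performs exactly n^3 multiplications, which is why 2 x 2, 3 x 3, and 4 x 4 inputs cost 8, 27, and 64 multiplies:

```cuda
#include <stddef.h>

/* Naive serial matrix multiply: c = a * b, all n x n, row-major.
 * Performs n*n*n multiplications and n*n*(n-1) useful additions,
 * so doubling n multiplies the work by 8 (2^3). */
void matmul(const float *a, const float *b, float *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];  /* one multiply per k */
            c[i * n + j] = sum;
        }
    }
}
```

The (i, j) iterations are independent of one another, which is exactly what the GPU exploits on the next slide: each of the n^2 output elements can be handed to its own core.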

“High-Performance Computing Systems and Our Future” HPC Systems for Big Data GPGPU/CUDA Technology
A GPU (the chip itself) consists of a group of Streaming Multiprocessors (SMs). Inside each SM:
32 cores (sharing the same instruction)
64 KB shared memory (shared among the 32 cores)
32K 32-bit registers
2 warp schedulers (to schedule instructions)
4 special function units
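A minimal CUDA sketch of how the matrix multiply maps onto these SMs (our own illustration; dA, dB, and dC are assumed device pointers, not names from the talk): one thread computes one element of C, and the hardware issues each block's threads in 32-thread warps on an SM.

```cuda
// One thread computes one element of C = A * B (square n x n, row-major).
// Each group of 32 threads in a block executes as a warp on one SM's cores.
__global__ void matmulKernel(const float *A, const float *B, float *C, int n)
{
    int row = blockIdx.y * blockDim.y + threadIdx.y;  // which row of C
    int col = blockIdx.x * blockDim.x + threadIdx.x;  // which column of C
    if (row < n && col < n) {
        float sum = 0.0f;
        for (int k = 0; k < n; k++)
            sum += A[row * n + k] * B[k * n + col];
        C[row * n + col] = sum;
    }
}

// Hypothetical launch: 16 x 16 = 256 threads per block (8 warps per block),
// with enough blocks to cover all n x n elements of C.
// dim3 threads(16, 16);
// dim3 blocks((n + 15) / 16, (n + 15) / 16);
// matmulKernel<<<blocks, threads>>>(dA, dB, dC, n);
```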

“High-Performance Computing Systems and Our Future” HPC Systems for Big Data Matrix Multiplication with CUDA for Graph [1]
A graph is a representation of a set of objects (i.e., vertices) where some pairs of objects are connected by links (i.e., edges); graphs are used to solve engineering, science, and math problems.
Is there any path between vertices C and J? Yes: C-D-F-G-J (length 4), C-E-H-G-J (length 4), C-E-H-J (length 3).
How many paths of length 3 are there between two vertices? GPU computing / matrix multiplication helps solve graph problems effectively.
[1] “Matrix Multiplication with CUDA — A Basic Introduction to the CUDA Programming Model,” by Robert Hochberg

“High-Performance Computing Systems and Our Future” HPC Systems for Big Data Matrix Multiplication with CUDA for Graph [1] (+) Number of paths of length 4 between C and J? [1] “Matrix Multiplication with CUDA — A Basic Introduction to the CUDA Programming Model,” by Robert Hochberg

“High-Performance Computing Systems and Our Future” HPC Systems for Big Data Matrix Multiplication with CUDA for Graph [1] (+)
Number of paths of length 4 between C and J? 7.
Matrix multiplication: Row (# of paths of length 1) x Column (# of paths of length 3) → (# of paths of length 4); that is, Length 1 x Length 3 → Length 4.
[figure: the length-1, length-3, and length-4 matrices]
[1] “Matrix Multiplication with CUDA — A Basic Introduction to the CUDA Programming Model,” by Robert Hochberg
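A host-side sketch of this idea (our own toy example on a hypothetical 4-vertex cycle graph 0-1-2-3-0, not the graph in the slides): entry (i, j) of the k-th power of the adjacency matrix A counts the paths of length k from vertex i to vertex j, so repeated matrix multiplication answers these queries.

```cuda
#include <stdio.h>

#define N 4  /* hypothetical 4-vertex cycle graph: 0-1-2-3-0 */

/* B = B * A for N x N integer matrices. */
void matpow_step(int B[N][N], const int A[N][N])
{
    int T[N][N] = {0};
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                T[i][j] += B[i][k] * A[k][j];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            B[i][j] = T[i][j];
}

int main(void)
{
    /* Adjacency matrix of the cycle 0-1-2-3-0 (1 = edge present). */
    const int A[N][N] = {
        {0, 1, 0, 1},
        {1, 0, 1, 0},
        {0, 1, 0, 1},
        {1, 0, 1, 0},
    };
    int P[N][N];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            P[i][j] = A[i][j];           /* P = A^1 */

    for (int len = 2; len <= 4; len++) {
        matpow_step(P, A);               /* P = A^len */
        printf("paths of length %d from 0 to 2: %d\n", len, P[0][2]);
    }
    return 0;
}
```

For this cycle graph the first line printed is easy to verify by hand: there are 2 paths of length 2 from vertex 0 to vertex 2 (0-1-2 and 0-3-2). Each matrix product in the loop is exactly the operation the GPU kernels above accelerate.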

“High-Performance Computing Systems and Our Future” HPC Systems for Big Data NVIDIA Tesla M2070 Performance
14 multiprocessors, 448 cores.
For 1000 x 1000 matrices, the shared-memory solution approaches a speedup ratio of 1.7 over the global-memory version; for 8000 x 8000 matrices, the shared-memory solution is faster by a factor of 2.6. [1]
[1] “Matrix Multiplication with CUDA — A Basic Introduction to the CUDA Programming Model,” by Robert Hochberg
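Where does that factor come from? In the naive kernel every thread re-reads A and B from slow global memory; the shared-memory version stages tiles in the SM's 64 KB on-chip memory, so each global element is loaded once per tile rather than once per thread. A minimal sketch of the standard tiling pattern (our illustration, assuming n is a multiple of the tile size; this is not Hochberg's exact code):

```cuda
#define TILE 16  /* 16 x 16 tile = 256 threads per block */

// Tiled matrix multiply: each block stages one tile of A and one tile of B
// in shared memory before using them, cutting global-memory traffic.
// Assumes n is a multiple of TILE and the grid exactly covers C.
__global__ void matmulShared(const float *A, const float *B, float *C, int n)
{
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float sum = 0.0f;

    for (int t = 0; t < n / TILE; t++) {
        // Each thread loads one element of each tile.
        As[threadIdx.y][threadIdx.x] = A[row * n + t * TILE + threadIdx.x];
        Bs[threadIdx.y][threadIdx.x] = B[(t * TILE + threadIdx.y) * n + col];
        __syncthreads();                 // wait until both tiles are loaded
        for (int k = 0; k < TILE; k++)
            sum += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();                 // wait before overwriting the tiles
    }
    C[row * n + col] = sum;
}
```

The larger the matrices, the more each staged tile is reused, which is consistent with the bigger speedup reported at 8000 x 8000 than at 1000 x 1000.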

“High-Performance Computing Systems and Our Future” CAPPLab Research Activities Asaduzzaman, A., Gummadi, D., and Yip, C.M., “A Talented CPU-to-GPU Memory Mapping Technique,” in IEEE SoutheastCon 2014, Lexington, KY, March 13-16, 2014. Asaduzzaman, A., Yip, C.M., Kumar, S., and Asmatulu, R., “Fast, Effective, and Adaptable Computer Modeling and Simulation of Lightning Strike Protection on Composite Materials,” in IEEE SoutheastCon Conference 2013, Jacksonville, FL, April 4-7, 2013. Asaduzzaman, A., Mitra, P., Chidella, K.K., Saeed, K.A., Cluff, K., and Mridha, M.F., “A Computer-Assisted Mammography Technique for Analyzing Breast Cancer,” IEEE International Conference on Advances in Electrical Engineering (ICAEE), Dhaka, Bangladesh, Sept. 28-30, 2017.

“High-Performance Computing Systems and Our Future” CAPPLab Research Activities (+) Asaduzzaman, A., Chidella, K.K., and Vardha, D., “An Energy-Efficient Directory Based Multicore Architecture with Wireless Routers to Minimize the Communication Latency,” IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 28, No. 2, pp. 374-385, May 2016. … Distributed Directory … CAEE Journal, under review, Nov. 2017. Chidella, K.K. and Asaduzzaman, A., “A Novel Directory Based WNoC-GPU System to Enhance Performance,” under preparation, 2017.

“High-Performance Computing Systems and Our Future” CAPPLab Research Activities (+) Research Grants/Awards
(2016–2017) “High Performance Computing at Low (Room) Temperature,” CybertronPC; total $10,000 for one year (includes hardware donation).
(2016–2017) “NVIDIA GPU Research Center at Wichita State,” NVIDIA Corporation; total $10,000 for two years (includes hardware donation).
(2015–2016) “Collaborative Research: NetApp NFS Connector for Apache Spark Systems,” NetApp, Inc.; total $60,000 for nine months.
(2014–2015) “An Empirical Application of High-Performance Pattern Recognition and Protein Binding to Treat Cancer,” WSU Flossie E. West Memorial Foundation; total $24,984 for one year.
(2014–2015) “Discovering CUDA-Accelerated New Programming Paradigm to Address the Growing Low-Power High Performance Computing Requirements,” WSU URCP Award; total $4,498 for one year.
(2014–2015) “Xilinx University Program (XUP) Award,” Xilinx; total $1,644 (hardware donation).
(2014) “Wiktronics-WSU Embedded Systems Research Project 2014,” Wiktronics Collaborative Project 2014; total $11,466 for six months.
(2013–2014) “A novel task and data regrouping based parallel approach to solve massive problems faster on multithreaded computing systems,” Kansas NSF EPSCoR First Award; total $105,296 for 15 months.
(2012) “M2SYS-WSU Biometric Cloud Computing Research Project,” M2SYS Technology; total $2,875 for four months.

“High-Performance Computing Systems and Our Future” Outline
► Introduction: We and Our Future; High-Performance Computing (HPC) Activities
HPC Systems: Computer Systems; SMT-Capable CPU-GPU Systems
Examples: HPC Systems for Big Data – Matrix Multiplication, Graph
CAPPLab Research
Concluding Remarks

“High-Performance Computing Systems and Our Future” Concluding Remarks
Understanding the world’s biggest mysteries (such as the human genome and climate change) would be impossible without computers. Supercomputers operate at Peta (10^15) FLOPS but require tens of millions of dollars. High-performance computing (HPC) offers comparable processing speed (Tera, 10^12 FLOPS) with far greater affordability, accessibility, and ease of use.
This presentation introduced HPC systems, gave examples of how HPC solves big-data problems effectively, and discussed some societal benefits of research on HPC systems.
Computing, Sensing, Communicating, Monitoring, and Controlling (CSCMC) – all-in-one for future computing!!! [1]
[1] The Future of Computing Performance: Game Over or Next Level? by Fuller and Millett, NAP (2011), http://www.nap.edu/download.php?record_id=12980

Thank You! IEEE-HKN Chapter at Wichita State “High-Performance Computing Systems and Our Future” QUESTIONS/COMMENTS? Contact: Abu Asaduzzaman E-mail: Abu.Asaduzzaman@wichita.edu Phone: +1-316-978-5261