IPDPS Looking Back Panel Uzi Vishkin, University of Maryland.

Presentation transcript:

IPDPS Looking Back Panel Uzi Vishkin, University of Maryland

Moderator: What has gone well?
- IPDPS is a big success. It has become the top facilitator for:
  - Specific technical contributions, and
  - Open debate of challenges, e.g., this panel
- Warm congrats to Viktor and all others!

What also has gone well
- Parallel PRAM algorithmic theory: second in magnitude only to the serial algorithmic theory
- Won the “battle of ideas” in the 1980s
- Repeatedly challenged without success → no real alternative!

Dream opportunity
- Limited interest in parallel computing evolved into a quest for general-purpose parallel computing in mainstream machines. Example: many-core desktops
- So much for the good news
- Are we doing everything we can to ensure that many-cores are not rejected by programmers?
- Recall Einstein’s observation: “A perfection of means, and confusion of aims, seems to be our main problem”
- What should be our aims?

“If you find yourself in a hole, stop digging”
Moderator: What has gone wrong?
- We found ourselves in a hole: most programmers can’t handle today’s (multicore) desktops
Moderator: What was surprising?
- We keep digging
Why are we in trouble?
- 1940s: stored-program & program-counter → for serial computing, knowledge of algorithms was low priority for architects (& knowledge of architecture was low priority for algorithms people)
- No such agreed “bridge” for many-cores. Still:
  - Industry realizes the need to reinvent computing for parallelism, but is stuck with short-term pressures/culture
  - Academia: hold on
  - Education
- Architects pay too limited attention to (this time parallel) algorithms. How will they know to build machines that are easy to program? Instead, funding guides using problematic designs

What follows
- Are many-core architectures doomed to mismatch parallel algorithms and ease-of-programming (EoP)?
- What difference can a matching architecture make?
- What is feasible?

How come most programmers can’t handle today’s (multicore) desktops?
Hypothesis: flawed architecture foundation
- Origin: ‘build-first, figure-out-how-to-program-later’
- Parallel languages: fitted flawed architectures, then standardized
Who can save the field and promote the aim of ease-of-programming (EoP)?
Industry (perfecting means)
- Follow-up architectures fit language standards → remain flawed
- Insufficient competition
Academia (perfecting means)
- Consider a vendor-backed flawed system. Wonderful opportunity for our originality-seeking publications culture:
  * The simplest problem requires creativity → more papers
  * Cite one another if on similar systems → high # citations coupled with ‘industry impact’
- Ultimate job security: by the time the ink dries on these papers, next flawed ‘modern’ ‘state-of-the-art’ system. Culture of short-term impact

Anecdotal Validation (?)
Breadth-first-search (BFS) example. 42 students: joint UIUC/UMD course
- <1X speedups using OpenMP on 8-processor SMP
- 7X-25X speedups on 64-processor XMT FPGA prototype [built at UMD]
What’s the big deal of 64 processors beating 8? Silicon area of 64 XMT processors ~= 1-2 SMP processors
Questionnaire: rank approaches for achieving (hard) speedups. All students but one: XMTC ahead of OpenMP
Order-of-magnitude advantage on teachability (MS, HS & up, SIGCSE’10)
SPAA’11: >100X speedup on max-flow relative to 2.5X on GPU (IPDPS’10)
Fleck/Kuhn: research too esoteric to be reliable → exoteric validation!
What has gone wrong: “Only heroic programmers can exploit the vast parallelism in current machines” (The Future of Computing Performance: Game Over or Next Level?, report by NAE)
Conclusion: Fund power.. Reward alert: try to publish a paper boasting easy-to-obtain results → EoP: 1. Badly needed. Yet, 2. A lose-lose proposition.
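For readers unfamiliar with the BFS example, the parallel formulation usually meant in this setting is level-synchronous: all vertices on the current frontier are expanded in the same round. The Python sketch below is only an illustration of that idea, not the course's XMTC or OpenMP code; the function name `bfs_levels` and the adjacency-dict input format are assumptions, and the per-round work is simulated sequentially.

```python
def bfs_levels(adj, source):
    """Return {vertex: BFS level}. `adj` maps each vertex to a list of neighbours.

    Illustrative only: each loop iteration stands for one parallel round in
    which every frontier vertex would inspect its edges concurrently.
    """
    level = {source: 0}
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        # Conceptually parallel step: collect all unvisited neighbours of the frontier.
        candidates = {w for v in frontier for w in adj[v] if w not in level}
        # Conceptually concurrent writes: every new vertex gets the same level.
        for w in candidates:
            level[w] = depth
        frontier = list(candidates)
    return level
```

The number of rounds equals the number of BFS levels, which is why graphs with few levels and wide frontiers expose a lot of parallelism even though the access pattern is irregular.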

Parallel Random-Access Machine/Model
PRAM: n synchronous processors, all having unit-time access to a shared memory.
Reactions: You got to be kidding, this is way:
- Too easy
- Too difficult: Why even mention processors? What to do with n processors? How to allocate processors to instructions?
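As a rough illustration of the PRAM model (not taken from the slides), the Python sketch below simulates synchronous steps over a shared array: in each step all reads happen before any writes, as if every processor acted in lockstep. The function name `pram_sum` and the choice of summation as the example are assumptions made for illustration.

```python
def pram_sum(values):
    """Sum `values` in synchronous rounds; return (total, number_of_rounds).

    Sequential simulation of a PRAM-style tree summation: in each round,
    "processor" i adds cell i+stride into cell i, for every active i at once.
    """
    cells = list(values)  # stands in for the shared memory
    n = len(cells)
    rounds = 0
    stride = 1
    while stride < n:
        # Read phase: every active processor reads its two cells.
        partial = [(i, cells[i] + cells[i + stride])
                   for i in range(0, n - stride, 2 * stride)]
        # Write phase: all results are written back together.
        for i, s in partial:
            cells[i] = s
        stride *= 2
        rounds += 1
    return (cells[0] if n else 0), rounds
```

For n values the loop runs ceil(log2 n) rounds, which is the kind of step count the model is meant to expose; how many physical processors actually carry out each round is a separate scheduling question.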

Immediate Concurrent Execution
- ‘Work-Depth framework’ [SV82], adopted in parallel algorithms texts [J92, KKT01]
- ICE basis for architecture specs: V, Using simple abstraction to reinvent computing for parallelism, CACM 1/2011
- Similar to the role of stored-program & program-counter in architecture specs for serial computing
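A hedged sketch of what a Work-Depth description buys: an algorithm is characterized by its total number of operations (work, W) and its number of parallel rounds (depth, D), without mentioning processors. The standard scheduling bound often attributed to Brent then gives a running time of at most about W/p + D on p processors. The function names below, and the use of balanced-tree summation as the example, are assumptions for illustration and do not come from the slides.

```python
def tree_sum_work_depth(n):
    """Work and depth of a balanced-tree summation of n >= 1 values."""
    work = n - 1                   # one addition per internal tree node
    depth = (n - 1).bit_length()   # equals ceil(log2 n) for n >= 1
    return work, depth


def brent_bound(work, depth, p):
    """Upper bound on parallel time with p processors: ceil(W / p) + D."""
    return -(-work // p) + depth   # integer ceiling division, then add depth


work, depth = tree_sum_work_depth(1024)
for p in (1, 64, 1024):
    print(p, brent_bound(work, depth, p))
```

The point of the abstraction is that the algorithm designer reasons only about W and D; allocating processors to operations is left to the compiler, runtime, or hardware.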

Algorithms-aware many-core is feasible
- Programmer’s workflow: Algorithms, Programming
- Rudimentary yet stable compiler
- PRAM-On-Chip HW prototypes:
  - 64-core, 75MHz FPGA of XMT [SPAA98..CF08]
  - Toolchain: compiler + simulator HIPS’
  - core interconnection network, IBM 90nm: 9mmX5mm, MHz [HotI07]
  - FPGA design → ASIC, IBM 90nm: 10mmX10mm, 150 MHz
- Architecture scales to cores on-chip
- XMT homepage: or search: ‘XMT’

Where are your specs? What is your parallel algorithms abstraction?
‘First-specs, then-build’ is “not uncommon”.. for engineering
I see only 2 options for architects:
A. 1. Go through parallel algorithms immersion 2. Develop abstraction that meets EoP 3. Develop specs 4. Build
B. Start from abstraction with proven EoP

Sociologists of science
Debates between adherents of different thought styles consist almost entirely of misunderstandings. Members of both parties are talking of different things (though they are usually under an illusion that they are talking about the same thing). They are applying different methods and criteria of correctness (although they are usually under an illusion that their arguments are universally valid and if their opponents do not want to accept them, then they are either stupid or malicious).