Computer Organization TI1400 Alexandru Iosup (lecturer) Henk Sips (original slides) Parallel and Distributed Systems

Slides:



Advertisements
Similar presentations
Challenge the future Delft University of Technology Overprovisioning for Performance Consistency in Grids Nezih Yigitbasi and Dick Epema Parallel.
Advertisements

R.A.I.D. By: Ben Segall. What is R.A.I.D.? In 1987, Patterson, Gibson and Katz at the University of California Berkeley, published a paper entitled "A.
Computer Organization TI1400 Alexandru Iosup (lecturer) Parallel and Distributed Systems
Peer-to-peer systems and Distributed Hash Tables (DHTs) COS 461: Computer Networks Spring 2009 (MW 1:30-2:50 in COS 105) Mike Freedman Teaching Assistants:
IN2305-II Embedded Programming prof.dr.ir. Arjan J.C. van Gemund Embedded Software Lab Software Technology Dept.
CS252 Graduate Computer Architecture Spring 2014 Lecture 9: VLIW Architectures Krste Asanovic
EECC756 - Shaaban #1 lec # 1 Spring Systolic Architectures Replace single processor with an array of regular processing elements Orchestrate.
Computer Organization TI1400 Alexandru Iosup (lecturer) Parallel and Distributed Systems
OPNET Technologies, Inc. Performance versus Cost in a Cloud Computing Environment Yiping Ding OPNET Technologies, Inc. © 2009 OPNET Technologies, Inc.
C-Store: Data Management in the Cloud Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY Jun 5, 2009.
Stephen Dine Datasource Consulting, LLC Tuesday August 4, 2009.
Parallel Processing1 Parallel Processing (CS 676) Overview Jeremy R. Johnson.
The Evolution of RISC A Three Party Rivalry By Jenny Mitchell CS147 Fall 2003 Dr. Lee.
Challenge the future Delft University of Technology Evaluating Multi-Core Processors for Data-Intensive Kernels Alexander van Amesfoort Delft.
Computer Organization TI1400 Alexandru Iosup (lecturer) Parallel and Distributed Systems
CS2214 Recitation Presented By Veejay Sani. Benchmarking SPEC CPU2000 Integer Benchmark Floating Point Benchmark We will only deal with Integer Benchmark.
1 NetGames 2010 – CAMEO: Continuous Analytics for Massively Multiplayer Online Games CAMEO : Enabling Social Networks for Massively Multiplayer Online.
Supporting Systolic and Memory Communication in iWarp (Borkar et al. 1990) presented by Vasily Volkov CS258, Spring 2008, UC Berkeley.
Computer Organization TI1400 Alexandru Iosup (lecturer) Parallel and Distributed Systems
Computer Organization TI1400 Alexandru Iosup (lecturer) Parallel and Distributed Systems
1 Rainer Leupers, University of Dortmund, Computer Science Dept. ISSS ´98 HDL-based Modeling of Embedded Processor Behavior for Ret. Compilation Rainer.
Computer Organization TI1400 Alexandru Iosup (lecturer) Henk Sips (original slides) Parallel and Distributed Systems
High Performance Computing with cloud Xu Tong. About the topic Why HPC(high performance computing) used on cloud What’s the difference between cloud and.
August 28, Performance Analysis of Cloud Computing Services for Many-Tasks Scientific Computing Berkeley, CA, USA Alexandru Iosup, Nezih Yigitbasi,
1 Cloud Computing Research at TU Delft – A. Iosup Alexandru Iosup Parallel and Distributed Systems Group Delft University of Technology The Netherlands.
Course Information. Course resources All course materials (slides, links to recorded lectures, online quiz, assignments, course project, and online exams)
UIUC CSL Global Technology Forum © NVIDIA Corporation 2007 Computing in Crisis: Challenges and Opportunities David B. Kirk.
Last Time Performance Analysis It’s all relative
Company LOGO High Performance Processors Miguel J. González Blanco Miguel A. Padilla Puig Felix Rivera Rivas.
A Broker for Cost-efficient QoS aware Resource Allocation in EC2. Kurt Vermeersch Coordinator: Kurt Vanmechelen.
Cansys West International Conference February , 2013Panama City, Panama An easier way to deliver APPX applications.
CSE 691: Energy-Efficient Computing Lecture 7 SMARTS: custom-made systems Anshul Gandhi 1307, CS building
computer
J. J. Rehr & R.C. Albers Rev. Mod. Phys. 72, 621 (2000) A “cluster to cloud” story: Naturally parallel Each CPU calculates a few points in the energy grid.
Presented by: Mostafa Magdi. Contents Introduction. Cloud Computing Definition. Cloud Computing Characteristics. Cloud Computing Key features. Cost Virtualization.
Computer Organization TI1400 Alexandru Iosup (lecturer) Parallel and Distributed Systems
1 ROIA 2009 – CAMEO: Continuous Analytics for Massively Multiplayer Online Games CAMEO: Continuous Analytics for Massively Multiplayer Online Games Alexandru.
Computer Organization TI1400 Alexandru Iosup (lecturer) Parallel and Distributed Systems
Computer Organization TI1400 Alexandru Iosup (lecturer) Parallel and Distributed Systems
November 29, Our team: Undergrad Thomas de Ruiter, Anand Sawant, Ruben Verboon, … Grad Siqi Shen, Guo Yong, Nezih Yigitbasi Staff Henk Sips, Dick.
4/25/2013 CS152, Spring 2013 CS 152 Computer Architecture and Engineering Lecture 22: Putting it All Together Krste Asanovic Electrical Engineering and.
CSE 451: Operating Systems Autumn 2010 Module 25 Cloud Computing Ed Lazowska Allen Center 570.
Computer Organization TI1400 Alexandru Iosup (lecturer) Parallel and Distributed Systems
#1. #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12.
Parallel & Distributed Computing Fall 2006 Comments About Final.
Biowulf: Molecular Dynamics and Parallel Computation Susan Chacko Scientific Computing Branch, Division of Computer System Services CIT, NIH.
亚洲的位置和范围 吉林省白城市洮北区教师进修学校 郑春艳. Q 宠宝贝神奇之旅 —— 亚洲 Q 宠快递 你在网上拍的一套物理实验器材到了。 Q 宠宝贝打电话给你: 你好,我是快递员,有你的邮件,你的收货地址上面 写的是学校地址,现在学校放假了,能把你家的具体 位置告诉我吗? 请向快递员描述自己家的详细位置!
Emerging applications in cloud High performance computing E-Commerce Media hosting Web hosting Content delivery... –from Amazon AWS survey 1 Emulated network.
Computer Organization TI1400 Alexandru Iosup (lecturer) Parallel and Distributed Systems
Computer Organization TI1400 Alexandru Iosup (lecturer) Parallel and Distributed Systems
Sobolev(+Node 6, 7) Showcase +K20m GPU Accelerator.
Parallel Computing Lecture
continued on next slide
Chapter 6 Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism Topic 11 Amazon Web Services Prof. Zhang Gang
(Parallel and) Distributed Systems Group
                                                                                                                                                                                                                                                
continued on next slide
continued on next slide
Parallel & Distributed Computing Fall 2008
Programming Massively Parallel Processors Lecture Slides for Chapter 9: Application Case Study – Electrostatic Potential Calculation © David Kirk/NVIDIA.
ВОМР Подмярка 19.2 Възможности за финансиране
Споразумение за партньорство
,. . ' ;; '.. I I tI I t : /..: /.. ' : ····t I 'h I.;.; '..'.. I ' :".:".
Parallel Computing Architectures (048874) Fall
Introduction to Heterogeneous Parallel Computing
Human Information Processor
CINECA HIGH PERFORMANCE COMPUTING SYSTEM
Parallel & Distributed Computing Fall 2006
continued on next slide
continued on next slide
Presentation transcript:

Computer Organization TI1400 Alexandru Iosup (lecturer) Henk Sips (original slides) Parallel and Distributed Systems

Peak vs Real Performance “One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a GHz 2007 Opteron or 2007 Xeon processor.” Amazon EC2 Source: Original code w/o SIMD ACML ~90% ~10% Peak <50%

ATARI VCS (1977) Home entertainment system Video, Sound, Games Pluggables: joystick, dance mat, … Today we use Wiis and XBox360s Ultra-low-cost platform CPU: MOS 6502 at 1MHz low cost, low mem GFX, sound: 4 KB ROM (max 8K) State, score: 128 bytes RAM (Apple II had 4 kilo-bytes) RAM/Input/Output/Timer controller “Any mistake in timing produced visual artifacts, a problem programmers called racing the beam.”

Doom (1993) “For me, while I do take a lot of pride in shipping a great product, the achievements along the way are more memorable. I don't remember any of our older product releases, but I remember the important insights all the way back to using CRTC wraparound for infinite smooth scrolling in Keen (actually, all the way back to understanding the virtues of structures over parallel arrays in apple II assembly language…) Knowledge builds on knowledge.” John Carmack,.plan, Feb 1998

Why iPads and Android GUIs Don’t Mix About Android GUIs “They’re too slow” Matt Buchanan, Why You Should Avoid Phones With Android Skins, June 8, 2010, core smartphones coming up—the throw more hardware solution

Know Thy Platform Programming hot functions Programming optimized functions Linear algebra (BLAS) Video, sound, and security codecs Vertex and pixel shaders on GPUs Building low-power devices Real-time platforms Games, Simulations, Navigation systems, Medical equipment Matching algorithm and platform: improve performance up to a factor of 100 (Michael Abrash: Doom, Quake, …)

TU-Delft TI1400/11-PDS 7 Computer Organization Classes 2 hours a week (see College Rooster) Instructions 1 hour per week (see College Rooster)

TU-Delft TI1400/11-PDS 8 Course Material Class Material V.C. Hamacher, Z.G. Vranesic, S.G. Zaky, Computer Organization, 2002, McGraw-Hill, 5 th Ed. Reader: Computer Organization, January 2007 Lecture Slides Info: BlackBoard ATTN: textbook via CH (second-hand) Amazon.com or bol.com McGraw Hill link (e-version)

TU-Delft TI1400/11-PDS 9 Lab Course (in1705_II/ti1400) Programming in assembler IA-32 assembly language Start: first week of Q4 (see “rooster”) 3 lab assignments MUST be finished for entering in1805_II project ST-2 Presence is mandatory !!!!

TU-Delft TI1400/11-PDS 10 Examination(1) Two exams per year Each exam consists of 30 multiple choice questions Old examinations can be found at the in1705 web pages Additional exercises are on BlackBoard

TU-Delft TI1400/11-PDS 11 Quiz bonus In week 7 of Q3 there is a “quiz” test Results contribute to first exam ti1400 according to the following formula: max{r, 2r/3+test/3} where r is result first exam ti1400. In other words: The test can count at most 1/3 result. The exam (r) counts at least 2/3 result. You can only win by taking the quiz!

TU-Delft TI1400/11-PDS 12 Overview ti1400 [1/4] Week 1-2 : Basic concepts, digital logic, memory elements, finite state machines Week 1 : Hamacher A.1 to A.6 Week 2 : Hamacher A.7 to A.14 Week 3-4 : Number representations and arithmetic Week 3 : Hamacher 2.1, 6.1 Week 4 : Hamacher 6.3 to 6.8

TU-Delft TI1400/11-PDS 13 Overview ti1400 [2/4] Week 5-6 : Instructions, addressing modes, assembler Week 5 : Hamacher 2.2 to 2.4 Week 6 : Hamacher 2.5 to 2.13 Week 7 : IA-32 architecture Week 7 : Hamacher 3.16 to 3.25

TU-Delft TI1400/11-PDS 14 Overview ti1400 [3/4] Week 8-9 : The Processing Unit Week 8 : Hamacher 7 Week 9 : Hamacher 7 Week : Input/Output and Memory Organization Week 10 : Hamacher 2.7 and 4 Week 11 : Hamacher 5

TU-Delft TI1400/11-PDS 15 Overview ti1400 [4/4] Week 12 : Pipelining Week 12 : Hamacher 8 Week 13 : Language levels and translation Week 13 : Reader Week 14 : Large Systems Week 14 : Hamacher 12