NEW TRENDS IN COMPUTER ARCHITECTURE DESIGN Saeid Nooshabadi Arthur Sale University of Tasmania.

Slides:



Advertisements
Similar presentations
Intel Pentium 4 ENCM Jonathan Bienert Tyson Marchuk.
Advertisements

ARCHITECTURE OF APPLE’S G4 PROCESSOR BY RON WEINWURZEL MICROPROCESSORS PROFESSOR DEWAR SPRING 2002.
AMD OPTERON ARCHITECTURE Omar Aragon Abdel Salam Sayyad This presentation is missing the references used.
Embedded System Lab. What is an embedded systems? An embedded system is a computer system designed for specific control functions within a larger system,
Future of Microprocessors
Pentium microprocessors CAS 133 – Basic Computer Skills/MS Office CIS 120 – Computer Concepts I Russ Erdman.
Microprocessor Microarchitecture Multithreading Lynn Choi School of Electrical Engineering.
The Central Processing Unit: What Goes on Inside the Computer.
Intel® Core™ Duo Processor Behrooz Jafarnejad Winter 2006.
1 Network Performance Model Sender Receiver Sender Overhead Transmission time (size ÷ band- width) Time of Flight Receiver Overhead Transport Latency Total.
Room: E-3-31 Phone: Dr Masri Ayob TK 2123 COMPUTER ORGANISATION & ARCHITECTURE Lecture 4: Computer Performance.
EECS 470 Superscalar Architectures and the Pentium 4 Lecture 12.
EE314 Basic EE II Silicon Technology [Adapted from Rabaey’s Digital Integrated Circuits, ©2002, J. Rabaey et al.]
1 Introduction Background: CS 3810 or equivalent, based on Hennessy and Patterson’s Computer Organization and Design Text for CS/EE 6810: Hennessy and.
CPE 731 Advanced Computer Architecture Multiprocessor Introduction
1 Chapter 4 The Central Processing Unit and Memory.
Semester One 2001/2002 Sheffield Hallam University1 The Motherboard Major circuit board in PC Holds CPU where calculations and instructions on data are.
1 Chapter 01 Authors: John Hennessy & David Patterson.
Computer performance.
Lecture 2: Technology Trends and Performance Evaluation Performance definition, benchmark, summarizing performance, Amdahl’s law, and CPI.
Computer Architecture CST 250 INTEL PENTIUM PROCESSOR Prepared by:Omar Hirzallah.
2015/9/4System Arch 2008 (Fire Tom Wada) 1 SEMICONDUCTOR TECHNOLOGY -CMOS- Fire Tom Wada.
1 Copyright © 2011, Elsevier Inc. All rights Reserved. Appendix E Authors: John Hennessy & David Patterson.
Motivation Mobile embedded systems are present in: –Cell phones –PDA’s –MP3 players –GPS units.
2/6: CPUs & Memory CPUs –Parts of a sample CPU –Types of CPUs available ROM RAM –different kinds & uses inc. VRAM, SRAM image courtesy of How Computers.
8 – Simultaneous Multithreading. 2 Review from Last Time Limits to ILP (power efficiency, compilers, dependencies …) seem to limit to 3 to 6 issue for.
Writer:-Rashedul Hasan Editor:- Jasim Uddin
Simultaneous Multithreading: Maximizing On-Chip Parallelism Presented By: Daron Shrode Shey Liggett.
XP Practical PC, 3e Chapter 16 1 Looking “Under the Hood”
Practical PC, 7th Edition Chapter 17: Looking Under the Hood
Computer Performance Computer Engineering Department.
Current Computer Architecture Trends CE 140 A1/A2 29 August 2003.
Lecture 03: Fundamentals of Computer Design - Trends and Performance Kai Bu
Company LOGO High Performance Processors Miguel J. González Blanco Miguel A. Padilla Puig Felix Rivera Rivas.
Led the WWII research group that broke the code for the Enigma machine proposed a simple abstract universal machine model for defining computability devised.
Recap Technology trends Cost/performance Measuring and Reporting Performance What does it mean to say “computer X is faster than computer Y”? E.g. Machine.
Pre-Pentium Intel Processors /
Sogang University Advanced Computing System Chap 1. Computer Architecture Hyuk-Jun Lee, PhD Dept. of Computer Science and Engineering Sogang University.
1 Recap (from Previous Lecture). 2 Computer Architecture Computer Architecture involves 3 inter- related components – Instruction set architecture (ISA):
C OMPUTER O RGANIZATION AND D ESIGN The Hardware/Software Interface 5 th Edition Chapter 1 Computer Abstractions and Technology Sections 1.5 – 1.11.
High Performance Computing Processors Felix Noble Mirayma V. Rodriguez Agnes Velez Electric and Computer Engineer Department August 25, 2004.
[Tim Shattuck, 2006][1] Performance / Watt: The New Server Focus Improving Performance / Watt For Modern Processors Tim Shattuck April 19, 2006 From the.
3/15/2002CSE Final Remarks Concluding Remarks SOAP.
Advanced Computer Architecture Fundamental of Computer Design Instruction Set Principles and Examples Pipelining:Basic and Intermediate Concepts Memory.
1 Latest Generations of Multi Core Processors
SEMICONDUCTOR TECHNOLOGY -CMOS-
IBM/Motorola/Apple PowerPC
Chapter 5: Computer Systems Design and Organization Dr Mohamed Menacer Taibah University
Succeeding with Technology Chapter 2 Hardware Designed to Meet the Need The Digital Revolution Integrated Circuits and Processing Storage Input, Output,
Lecture # 10 Processors Microcomputer Processors.
The Pentium Series CS 585: Computer Architecture Summer 2002 Tim Barto.
CS203 – Advanced Computer Architecture Performance Evaluation.
Hardware Architecture
History of Computers and Performance David Monismith Jan. 14, 2015 Based on notes from Dr. Bill Siever and from the Patterson and Hennessy Text.
SPRING 2012 Assembly Language. Definition 2 A microprocessor is a silicon chip which forms the core of a microcomputer the concept of what goes into a.
CS203 – Advanced Computer Architecture
Microarchitecture.
Visit for more Learning Resources
Lynn Choi School of Electrical Engineering
Morgan Kaufmann Publishers
CC 423: Advanced Computer Architecture Limits to ILP
Vector Processing => Multimedia
CS775: Computer Architecture
Alpha Microarchitecture
Performance of computer systems
Performance of computer systems
Computer Evolution and Performance
Performance of computer systems
8 – Simultaneous Multithreading
Presentation transcript:

NEW TRENDS IN COMPUTER ARCHITECTURE DESIGN Saeid Nooshabadi Arthur Sale University of Tasmania

Outline u Desktop/Server Microprocessor State of the Art u Current Processors Limit u Embedded Processors Market u Mobile Multimedia Computing as New Direction u Conclusion

Computer in the News Technology Marches on (1) SANTA CLARA, Calif., March 8, Intel Corporation today introduced the Intel® Pentium® III processor 1.0 GHz (GigaHertz or 1,000 MegaHertz), the world's highest performance microprocessor for PCs. The Pentium III processor at 1 GHz delivers a 15 percent performance gain over the fastest processors on the market today. Source:

Computer in the News Technology Marches on (2) INTEL DEVELOPER FORUM, Calif., Feb. 15, Intel Corporation Chairman Andrew S. Grove today kicked off the semi-annual Intel Developer Forum by demonstrating the company's fastest microprocessor: a chip running at 1.5 GHz, or 1.5 billion clock cycles per second, at room temperature. Based on a new microarchitecture from Intel, the chip is code-named "Willamette." (To be marketed towards end of the year) Source: Who needs 1.5 GHz Processor?

State of the Art: Alpha u 15M transistors u 2 x 64KB caches on chip; 16MB L2 cache off chip u Clock 600 MHz (Fastest Cray Supercomputer: T nsec) u 90 watts u Superscalar: fetch up to 6 instructions/clock cycle, retires up to 4 instruction/clock cycle u Execution out-of-order

Processor Limit: DRAM Gap

Processor-Memory Performance Gap “Tax” (1) Processor % Area %Transistors (cost*)(power) Alpha %77% StrongArm SA11061%94% Pentium Pro64%88% 2 dies per package: Proc/I$/D$ + L2$ u Caches have no inherent value, only try to close performance gap * COST =F(Area 4 )

Processor-Memory Performance Gap “Tax” (2) u Microprocessor-DRAM performance gap >time of a full cache miss in instructions executed 1st Alpha (7000): 340 ns/5.0 ns = 68 clks x 2 or 136 2nd Alpha (8400):266 ns/3.3 ns = 80 clks x 4 or 320 3rd Alpha ( ):180 ns/1.7 ns =108 clks x 6 or 648 >1/2X latency x 3X clock rate x 3X Instr/clock  5X

Today’s Situation: Microprocessor MIPS MPUsR5000R k/5k Clock Rate 200 MHz 195 MHz 1.0x On-Chip Caches 32K/32K 32K/32K 1.0x Instructions/Cycle 1(+ FP)4 4.0x Pipe stages x Model In-orderOut-of-order --- Die Size (mm 2 ) x without cache, TLB x Development (man yr.) x SPECint_base x

Processors Evaluation Metrics u SPECint95: Suit of Integer Programs u SPECft95: Suit of Floating Point Programs u TCP-C: On Line Transaction Processing Programs (OLTP) u All state of the arts processors perform well for SPECint95 and SPECft95 (scientific and technical applications) u TCP-C ?

Processor Limits for TPC-C SPEC- Pentium Pro int95 TPC-C >Multilevel Caches: Miss rate 1MB L2 cache 0.5% 5% >Superscalar (2-3 instr. retired/clock): % clks 40% 10% >Out-of-Order Execution speedup 2.0X 1.4X >Clocks per Instruction u % Peak performance 40% 10% source: Bhandarkar, D.; Ding, J. “Performance characterization of the Pentium Pro processor.” Proc. 3rd Int'l. Symp. on High-Performance Computer Architecture, Feb p

Embedded Processor Market u Over 97% of the processors fabricated u 50% of the revenues from processor sales u Embedded devices cover wide range products >simple devices such as thermostats and toasters >complex and mission-critical applications such as avionics systems. >In between are phones, facsimile machines, ATM switches, digital cameras, automotive applications, set-top boxes,...

Embedded Processor Design u Drives the technology “Post-PC” era u Embedded processors incorporate capabilities traditionally associated with the conventional CPUs. u They are subject to challenging >cost, >power consumption, >and application- imposed constraints.

Intel Embedded Mobile Celeron Processor u Available at 600, 566, 533, 500 and 466 MHz. u Dynamic Execution technology. u Includes Intel MMX™ media enhancement technology. u Intel Streaming SIMD Extensions (available on the Intel Celeron Processor at 566 and 600 MHz). u 32 Kbyte (16 Kbyte/16 Kbyte) Level 1 cache. u 128 Kbyte integrated Level 2 cache. u 66 MHz Intel P6 micro-architecture's multitransaction system bus. u Intel Chipset support: Intel® 810 chipset, Intel® 810E chipset, Intel® 440BX, Intel® 440EX and the Intel® 440ZX-66 AGPset. u Power WattsSource:

Desktop/Server Processors Summary (1) u SPEC performance doubling / 18 months >Growing CPU-DRAM performance gap & tax >Running out of ideas, competition? Back to 2X / 2.3 yrs? u Benchmarks: SPEC-int, SPEC-ft, TPC (for OLTP) >Benchmark highest optimization, ship lowest optimization? u Processor tricks not as useful for transactions? >Clock rate increase compensated by CPI increase? >When > 100 MIPS on TPC-C?

Desktop/Server Processors Summary (2) u Embedded processors promising >Strong ARM 110: 233 MHz, 268 MIPS, 0.36W typ., $49 >1/10 cost, 1/100 power, 1/2 integer performance? u Consolidation of desktop industry? Innovation? u Time to look for the computing trends and applications of tomorrow?

Billion Transistor Architectures and “Stationary Computer” Metrics SS++TraceSMTCMPIA-64*RAW SPEC Int+++=+= SPEC FP+++++= TPC (DataBse)==++=– SW Effort++===– Design Scal. –=–=== Physical–=–==+ Design Complexity (See IEEE Computer (9/97), Special Issue on Billion Transistor Microprocessors) >*Very Long Instruction Word (Intel,HP IA-64/Merced) –multiple ops/ instruction, compiler controls parallelism –Coined as the next generation Intel/HP processor –Renamed Itanium™ (October 99)

Current Computer Design with the Bias for the Past u Most Billion Transistor Architectures show high physical design complexity u Most show impressive performance for SPEC suits of programs u Suitablity: >suitable for high end traditonal applications >unsuitable for pervasive computing environment of the future; >high power budget (>180 Watts), >expensive (>$500) u Applications of past to design computers of future

Challenge for Future Microprocessors u “...wires are not keeping pace with scaling of other features. … In fact, for CMOS processes below 0.25 micron... an unacceptably small percentage of the die will be reachable during a single clock cycle.” u “Architectures that require long-distance, rapid interaction will not scale well...” >“Will Physical Scalability Sabotage Performance Gains?” Matzke, IEEE Computer (9/97)

Computer in the News Expert Talking “ Intel specializes in designing microprocessors for the desktop PC, which in five years may no longer be the most important type of computer. Its successor may be a personal mobile computer that integrates the portable computer with a cellular phone, digital camera, and video game player… Such devices require low- cost, energy- efficient microprocessors, and Intel is far from a leader in that area.” -David Patterson, NY Times, June 9, 1998* *David Patterson led the design of Berkeley RISC Machine, the first RISC computer. He is also the author/co-author of two of most popular Textbooks on Computer Architecture.

Post PC Motivation u Next generation fixes problems of last gen.  1960s: batch processing + slow turnaround  Timesharing >15-20 years of performance improvement, cost reduction (minicomputers, semiconductor memory)  1980s: Time sharing + inconsistent response times  Workstations/Personal Computers >15-20 years of performance improvement, cost reduction (microprocessors, DRAM memory, disk)  2000s: PCs + difficulty of use/high cost of ownership  ???

Computing Trends Post-PC Era u Multimedia Applications >real time data types; video, speech, animation, & music >90% of desktop cycles will be spent on media applications by end of >Multimedia workloads will continue in importance >Image, handwriting, and speech recognition will pose other major challenges. u Pervasive Mobile Computing Devices >support an expanding range of functions >challenge is in converging them into a single device >keeping the size, weight, and power consumption constant.

Sony Playstation 2000 u Emotion Engine: 6.2 GFLOPS, 75 million polygons per second (Microprocessor Report, 13:5) >Superscalar MIPS core + vector coprocessor + graphics/DRAM >Claim: Toy Story realism brought to games!

Intelligent PDA ( 2005?) Pilot PDA u gameboy, cell phone, radio, timer, camera, TV remote, am/fm radio, garage door opener,... u Wireless data (WWW) u Speech, vision recog. u Voice output for conversations -Speech control of all devices - Vision to see, - Scan documents, - read bar code,... - Measure room

Billion Transistor Architectures and “Mobile Multimedia” Metrics SS++ Trace SMT CMP IA-64 RAW Design Scal. –=–=== Energy/power–––==– Code Size====–= Real-time––==== Cont. Data====== Memory BW==== == Fine-grain Par.=====+ Coarse-gr.Par. ==++ =+ >“Direction for Computer Architecture Research”, Kozyrakis, Patterson IEEE Computer (11/98)

New Architecture Directions u “…media processing will become the dominant force in computer arch. & microprocessor design.” u “... new media-rich applications... involve significant real-time processing of continuous media streams, and make heavy use of vectors of packed 8-, 16-, and 32-bit integer and Fl. Pt.” u “Needs include high memory BW, high network BW, continuous media data types, real-time response, fine grain parallelism” “How Multimedia Workloads Will Change Processor Design”, Diefendorff & Dubey, IEEE Computer (9/97)

Some Media-Processing Functions KernelVector length Matrix transpose/multiply (3D Gr.)# vertices at once DCT (video, comm.)image width FFT (audio) Motion estimation (video)image width, i.w./16 Gamma correction (video)image width Haar transform (media mining)image width Median filter (image process.)image width (from

Challenges for Mobile Multimedia u High performance for multimedia functions u Energy and power efficiency (<1 Watt) u Small size (fit in pocket) u Low design complexity and high degree of scalability (costs few tens of $)

A Better Mobile Multimedia MPUs: Logic+DRAM u Embedded DRAM processors one possibility u Faster logic in DRAM process >DRAM vendors offer faster transistors + same number metal layers as good logic ≈ 20% higher cost per wafer? u Called Intelligent RAM (“IRAM”) since most of transistors will be DRAM u Leave for another presentation >“A Case for Intelligent RAM”Patterson, Anderson, …. IEEE Computer (3/97)

u 10000X cost-performance increase in “stationary” computers, consolidation of industry => time for architecture/OS/compiler researchers declare victory, search for new horizons? u Mobile Multimedia offer many new challenges: energy efficiency, size, real time performance,... u Apps/metrics of future to design computer of future! >Suppose PDA replaces desktop as primary computer? >Work on FPPP on PC vs. Speech on PDA? Mobile Multimedia Conclusion

u “Personal mobile computing offers a vision of the future with a much richer and more exciting set of architecture research challenges than extrapolations of the current desktop architectures and benchmarks.” u “Put another way, which problem would you rather work on: improving performance of PCs running FPPPP—a 1982 Fortran benchmark used in SPECfp95—or making speech input practical for PDAs? “ “ Direction for Computer Architecture Research”, Kozyrakis, Patterson IEEE Computer (11/98) From the Horse Mouth

References u IEEE Computers; Sept. 97, Jan. 98, Aug. 98, Nov. 98, u IEEE Micro: Dec. 96, Mar. 97, Sept. 97

Acknowledgement u Thanks to Dr. Vishv Malhotra for lending me some of his IEEE Computer issues. u Thanks to Prof. Sale for going through the slides and making useful suggestions. u WAIT FOR THE NEXT TWO SLIDES

Purpose of This Talk u To get Staff and Students excited about the new opportunities for research. u What would you be doing as a graduate? >Service Windows NT, and if lucky perhaps UNIX? >Develop web pages? >Do more of the same? u Or rather do something really exciting? u We need you if you choose the LATTER! u 50 Post Graduate Scholarship for IT up for grab

Our Vision and Aim u Achieve Critical Mass in Research u Create a Group of Staff & Students Working on the Problems of Future. u Pulling Australian IT Research Community Together u Identifying Niches Where We Can Make International Contribution.