Published by Alexander Benson. Modified over 9 years ago.
1
NEW TRENDS IN COMPUTER ARCHITECTURE DESIGN
Saeid Nooshabadi, Arthur Sale
University of Tasmania
2
Outline
- Desktop/Server Microprocessor State of the Art
- Current Processor Limits
- Embedded Processor Market
- Mobile Multimedia Computing as a New Direction
- Conclusion
3
Computer in the News: Technology Marches on (1)
SANTA CLARA, Calif., March 8, 2000 -- Intel Corporation today introduced the Intel® Pentium® III processor 1.0 GHz (GigaHertz or 1,000 MegaHertz), the world's highest performance microprocessor for PCs. The Pentium III processor at 1 GHz delivers a 15 percent performance gain over the fastest processors on the market today.
Source: http://www.intel.com
4
Computer in the News: Technology Marches on (2)
INTEL DEVELOPER FORUM, Calif., Feb. 15, 2000 - Intel Corporation Chairman Andrew S. Grove today kicked off the semi-annual Intel Developer Forum by demonstrating the company's fastest microprocessor: a chip running at 1.5 GHz, or 1.5 billion clock cycles per second, at room temperature. Based on a new microarchitecture from Intel, the chip is code-named "Willamette." (To be marketed towards the end of the year.)
Source: http://www.intel.com
Who needs a 1.5 GHz processor?
5
State of the Art: Alpha 21264
- 15M transistors
- 2 x 64KB caches on chip; 16MB L2 cache off chip
- Clock 600 MHz (fastest Cray supercomputer: T90, 2.2 nsec)
- 90 watts
- Superscalar: fetches up to 6 instructions/clock cycle, retires up to 4 instructions/clock cycle
- Out-of-order execution
6
Processor Limit: DRAM Gap
7
Processor-Memory Performance Gap “Tax” (1)
Processor          % Area (cost*)   % Transistors (power)
Alpha 21164        37%              77%
StrongArm SA110    61%              94%
Pentium Pro        64%              88%
  (2 dies per package: Proc/I$/D$ + L2$)
- Caches have no inherent value; they only try to close the performance gap
* Cost = f(Area^4)
8
Processor-Memory Performance Gap “Tax” (2)
- Microprocessor-DRAM performance gap
  - Time of a full cache miss, in instructions executed:
    1st Alpha (7000):   340 ns / 5.0 ns =  68 clks x 2, or 136
    2nd Alpha (8400):   266 ns / 3.3 ns =  80 clks x 4, or 320
    3rd Alpha (21264):  180 ns / 1.7 ns = 108 clks x 6, or 648
  - 1/2X latency x 3X clock rate x 3X instr/clock => about 5X
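The arithmetic behind these rows can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original slides; the function name `miss_cost_in_instructions` is hypothetical. It divides the memory latency by the cycle time to get clocks per miss, then multiplies by the issue width to get the instruction slots lost. Note that the slide appears to truncate the second row (80.6 clocks shown as 80), and that the third row's 108 clocks matches the unrounded 600 MHz cycle time (about 1.667 ns) rather than the rounded 1.7 ns.

```python
def miss_cost_in_instructions(latency_ns, cycle_ns, issue_width):
    """Return (clocks per full cache miss, instruction slots lost per miss)."""
    clocks = latency_ns / cycle_ns
    return clocks, clocks * issue_width


# (name, memory latency in ns, cycle time in ns, instructions per clock),
# all taken from the slide above.
ALPHAS = [
    ("1st Alpha (7000)", 340.0, 5.0, 2),
    ("2nd Alpha (8400)", 266.0, 3.3, 4),
    ("3rd Alpha (21264)", 180.0, 1.7, 6),
]

if __name__ == "__main__":
    for name, latency, cycle, width in ALPHAS:
        clocks, slots = miss_cost_in_instructions(latency, cycle, width)
        print(f"{name}: {clocks:.1f} clocks, {slots:.0f} instruction slots per miss")
```

The point of the slide survives the rounding: latency improved by about half while the cost of a miss, counted in lost instructions, grew roughly fivefold.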
9
Today’s Situation: Microprocessor
MIPS MPUs                   R5000      R10000         10k/5k
Clock rate                  200 MHz    195 MHz        1.0x
On-chip caches              32K/32K    32K/32K        1.0x
Instructions/cycle          1 (+ FP)   4              4.0x
Pipe stages                 5          5-7            1.2x
Model                       In-order   Out-of-order   ---
Die size (mm^2)             84         298            3.5x
  without cache, TLB        32         205            6.3x
Development (man yr.)       60         300            5.0x
SPECint_base95              5.7        8.8            1.6x
10
Processor Evaluation Metrics
- SPECint95: suite of integer programs
- SPECfp95: suite of floating-point programs
- TPC-C: online transaction processing (OLTP) programs
- All state-of-the-art processors perform well on SPECint95 and SPECfp95 (scientific and technical applications)
- TPC-C?
11
Processor Limits for TPC-C
Pentium Pro                                        SPECint95   TPC-C
Multilevel caches: miss rate, 1MB L2 cache         0.5%        5%
Superscalar (2-3 instr. retired/clock): % clks     40%         10%
Out-of-order execution speedup                     2.0X        1.4X
Clocks per instruction                             0.8         3.4
% Peak performance                                 40%         10%
Source: Bhandarkar, D.; Ding, J. “Performance characterization of the Pentium Pro processor.” Proc. 3rd Int'l. Symp. on High-Performance Computer Architecture, Feb 1997, p. 288-97.
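The "% peak performance" row can be roughly reproduced from the clocks-per-instruction row. The sketch below is illustrative only and rests on one assumption not stated on the slide: that peak throughput is 3 instructions per clock, the upper end of the "2-3 instr. retired/clock" figure. The function name `fraction_of_peak` is hypothetical.

```python
def fraction_of_peak(cpi, peak_ipc=3.0):
    """Fraction of peak throughput achieved.

    Actual instructions per clock (1 / CPI) divided by the assumed peak
    instructions per clock. peak_ipc=3.0 is an assumption for illustration.
    """
    return (1.0 / cpi) / peak_ipc


if __name__ == "__main__":
    for workload, cpi in [("SPECint95", 0.8), ("TPC-C", 3.4)]:
        print(f"{workload}: {fraction_of_peak(cpi):.0%} of peak")
```

Under that assumption a CPI of 0.8 gives roughly 40% of peak and a CPI of 3.4 roughly 10%, in line with the slide's last row.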
12
Embedded Processor Market
- Over 97% of the processors fabricated
- 50% of the revenues from processor sales
- Embedded devices cover a wide range of products
  - simple devices such as thermostats and toasters
  - complex and mission-critical applications such as avionics systems
  - in between are phones, facsimile machines, ATM switches, digital cameras, automotive applications, set-top boxes, ...
13
Embedded Processor Design
- Drives the technology of the “Post-PC” era
- Embedded processors incorporate capabilities traditionally associated with conventional CPUs
- They are subject to challenging
  - cost,
  - power consumption,
  - and application-imposed constraints
14
Intel Embedded Mobile Celeron Processor
- Available at 600, 566, 533, 500 and 466 MHz
- Dynamic Execution technology
- Includes Intel MMX™ media enhancement technology
- Intel Streaming SIMD Extensions (available on the Intel Celeron Processor at 566 and 600 MHz)
- 32 Kbyte (16 Kbyte/16 Kbyte) Level 1 cache
- 128 Kbyte integrated Level 2 cache
- 66 MHz Intel P6 micro-architecture's multitransaction system bus
- Intel chipset support: Intel® 810 chipset, Intel® 810E chipset, Intel® 440BX, Intel® 440EX and the Intel® 440ZX-66 AGPset
- Power: 17 - 30 Watts
Source: http://www.intel.com
15
Desktop/Server Processors Summary (1)
- SPEC performance doubling every 18 months
  - Growing CPU-DRAM performance gap & tax
  - Running out of ideas, competition? Back to 2X every 2.3 years?
- Benchmarks: SPECint, SPECfp, TPC (for OLTP)
  - Benchmark at highest optimization, ship at lowest optimization?
- Processor tricks not as useful for transactions?
  - Clock rate increase compensated by CPI increase?
  - When will we see > 100 MIPS on TPC-C?
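To make the two doubling rates concrete, they can be converted to annual growth factors. This is an illustrative sketch, not from the slides; the helper names `annual_growth` and `growth_over` are hypothetical.

```python
def annual_growth(doubling_time_years):
    """Yearly multiplier implied by a given performance doubling time."""
    return 2.0 ** (1.0 / doubling_time_years)


def growth_over(years, doubling_time_years):
    """Total multiplier accumulated over a span of years."""
    return 2.0 ** (years / doubling_time_years)


if __name__ == "__main__":
    for label, doubling in [("2X / 18 months", 1.5), ("2X / 2.3 years", 2.3)]:
        yearly = annual_growth(doubling)
        decade = growth_over(10, doubling)
        print(f"{label}: {yearly - 1:.0%} per year, {decade:.0f}x per decade")
```

Doubling every 18 months works out to about 59% per year, while doubling every 2.3 years is about 35% per year, so the gap compounds to roughly a factor of five over a decade.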
16
Desktop/Server Processors Summary (2)
- Embedded processors promising
  - StrongARM 110: 233 MHz, 268 MIPS, 0.36 W typ., $49
  - 1/10 cost, 1/100 power, 1/2 integer performance?
- Consolidation of desktop industry? Innovation?
- Time to look for the computing trends and applications of tomorrow?
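The StrongARM figures translate directly into performance-per-watt and performance-per-dollar numbers. The sketch below is illustrative; the function names are hypothetical, and the relative ratios simply take the slide's "1/10 cost, 1/100 power, 1/2 integer performance" at face value.

```python
def efficiency(mips, watts, dollars):
    """Return (MIPS per watt, MIPS per dollar) for one processor."""
    return mips / watts, mips / dollars


def relative_efficiency(perf_ratio, power_ratio, cost_ratio):
    """Return (perf/watt, perf/dollar) relative to a baseline processor."""
    return perf_ratio / power_ratio, perf_ratio / cost_ratio


if __name__ == "__main__":
    per_watt, per_dollar = efficiency(mips=268, watts=0.36, dollars=49)
    print(f"StrongARM 110: {per_watt:.0f} MIPS/W, {per_dollar:.2f} MIPS/$")

    rel_watt, rel_dollar = relative_efficiency(0.5, 0.01, 0.1)
    print(f"Versus desktop part: {rel_watt:.0f}x perf/W, {rel_dollar:.0f}x perf/$")
```

If the slide's ratios hold, the embedded part gives up half the integer performance for roughly 50 times the performance per watt and 5 times the performance per dollar.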
17
Billion Transistor Architectures and “Stationary Computer” Metrics
                             SS++   Trace   SMT   CMP   IA-64*   RAW
SPEC Int                     +      +       +     =     +        =
SPEC FP                      +      +       +     +     +        =
TPC (Database)               =      =       +     +     =        –
SW Effort                    +      +       =     =     =        –
Design Scal.                 –      =       –     =     =        =
Physical Design Complexity   –      =       –     =     =        +
(See IEEE Computer (9/97), Special Issue on Billion Transistor Microprocessors)
* Very Long Instruction Word (Intel, HP IA-64/Merced)
  - multiple ops/instruction, compiler controls parallelism
  - Coined as the next-generation Intel/HP processor
  - Renamed Itanium™ (October 99)
18
Current Computer Design with the Bias for the Past
- Most Billion Transistor Architectures show high physical design complexity
- Most show impressive performance for the SPEC suites of programs
- Suitability:
  - suitable for high-end traditional applications
  - unsuitable for the pervasive computing environment of the future
  - high power budget (>180 Watts)
  - expensive (>$500)
- Applications of the past used to design computers of the future
19
Challenge for Future Microprocessors
- “...wires are not keeping pace with scaling of other features. … In fact, for CMOS processes below 0.25 micron... an unacceptably small percentage of the die will be reachable during a single clock cycle.”
- “Architectures that require long-distance, rapid interaction will not scale well...”
Source: “Will Physical Scalability Sabotage Performance Gains?” Matzke, IEEE Computer (9/97)
20
Computer in the News: Expert Talking
“Intel specializes in designing microprocessors for the desktop PC, which in five years may no longer be the most important type of computer. Its successor may be a personal mobile computer that integrates the portable computer with a cellular phone, digital camera, and video game player… Such devices require low-cost, energy-efficient microprocessors, and Intel is far from a leader in that area.”
- David Patterson, NY Times, June 9, 1998*
* David Patterson led the design of the Berkeley RISC Machine, the first RISC computer. He is also the author or co-author of two of the most popular textbooks on computer architecture.
21
Post-PC Motivation
- Next generation fixes problems of last generation
- 1960s: batch processing + slow turnaround => Timesharing
  - 15-20 years of performance improvement, cost reduction (minicomputers, semiconductor memory)
- 1980s: time sharing + inconsistent response times => Workstations/Personal Computers
  - 15-20 years of performance improvement, cost reduction (microprocessors, DRAM memory, disk)
- 2000s: PCs + difficulty of use/high cost of ownership => ???
22
Computing Trends in the Post-PC Era
- Multimedia Applications
  - real-time data types: video, speech, animation, & music
  - 90% of desktop cycles will be spent on media applications by end of 2000
  - Multimedia workloads will continue to grow in importance
  - Image, handwriting, and speech recognition will pose other major challenges
- Pervasive Mobile Computing Devices
  - support an expanding range of functions
  - challenge is in converging them into a single device
  - while keeping the size, weight, and power consumption constant
23
Sony Playstation 2000
- Emotion Engine: 6.2 GFLOPS, 75 million polygons per second (Microprocessor Report, 13:5)
  - Superscalar MIPS core + vector coprocessor + graphics/DRAM
  - Claim: Toy Story realism brought to games!
24
Intelligent PDA (2005?)
Pilot PDA
- gameboy, cell phone, radio, timer, camera, TV remote, am/fm radio, garage door opener, ...
- Wireless data (WWW)
- Speech, vision recognition
- Voice output for conversations
  - Speech control of all devices
  - Vision to see,
  - scan documents,
  - read bar code, ...
  - measure room
25
Billion Transistor Architectures and “Mobile Multimedia” Metrics
                    SS++   Trace   SMT   CMP   IA-64   RAW
Design Scal.        –      =       –     =     =       =
Energy/power        –      –       –     =     =       –
Code Size           =      =       =     =     –       =
Real-time           –      –       =     =     =       =
Cont. Data          =      =       =     =     =       =
Memory BW           =      =       =     =     =       =
Fine-grain Par.     =      =       =     =     =       +
Coarse-gr. Par.     =      =       +     +     =       +
Source: “A New Direction for Computer Architecture Research”, Kozyrakis, Patterson, IEEE Computer (11/98)
26
New Architecture Directions
- “…media processing will become the dominant force in computer arch. & microprocessor design.”
- “... new media-rich applications... involve significant real-time processing of continuous media streams, and make heavy use of vectors of packed 8-, 16-, and 32-bit integer and Fl. Pt.”
- “Needs include high memory BW, high network BW, continuous media data types, real-time response, fine grain parallelism”
Source: “How Multimedia Workloads Will Change Processor Design”, Diefendorff & Dubey, IEEE Computer (9/97)
27
Some Media-Processing Functions
Kernel                                    Vector length
Matrix transpose/multiply (3D Gr.)        # vertices at once
DCT (video, comm.)                        image width
FFT (audio)                               256-1024
Motion estimation (video)                 image width, i.w./16
Gamma correction (video)                  image width
Haar transform (media mining)             image width
Median filter (image process.)            image width
(from http://www.research.ibm.com/people/p/pradeep/tutor.html)
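As one illustration of why the vector length for these kernels equals the image width, here is a minimal sketch of gamma correction on a single row of 8-bit pixels. It is not taken from the slides or the cited tutorial; the lookup-table approach and the function names `build_gamma_lut` and `gamma_correct_row` are assumptions chosen for clarity. Every pixel in the row is transformed independently by the same operation, which is exactly the pattern a vector or packed-SIMD unit can process in parallel.

```python
def build_gamma_lut(gamma):
    """Precompute the corrected output for every possible 8-bit pixel value."""
    return [round(255 * (value / 255) ** (1.0 / gamma)) for value in range(256)]


def gamma_correct_row(row, lut):
    """Apply the same table lookup to each pixel of one image row.

    No pixel depends on its neighbours, so the natural vector length
    is the width of the row.
    """
    return [lut[pixel] for pixel in row]


if __name__ == "__main__":
    lut = build_gamma_lut(2.2)  # 2.2 is a common display gamma, used here as an example
    row = [0, 16, 64, 128, 255]
    print(gamma_correct_row(row, lut))
```

In plain Python the loop runs one pixel at a time; on a media processor the same loop body would map onto packed 8-bit operations across the whole row.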
28
Challenges for Mobile Multimedia
- High performance for multimedia functions
- Energy and power efficiency (<1 Watt)
- Small size (fit in pocket)
- Low design complexity and high degree of scalability (costs a few tens of $)
29
Better Mobile Multimedia MPUs: Logic+DRAM
- Embedded DRAM processors are one possibility
- Faster logic in DRAM process
  - DRAM vendors offer faster transistors + same number of metal layers as a good logic process? At ≈ 20% higher cost per wafer?
- Called Intelligent RAM (“IRAM”) since most of the transistors will be DRAM
- Leave for another presentation
Source: “A Case for Intelligent RAM”, Patterson, Anderson, …. IEEE Computer (3/97)
30
Mobile Multimedia Conclusion
- 10000X cost-performance increase in “stationary” computers, consolidation of industry => time for architecture/OS/compiler researchers to declare victory and search for new horizons?
- Mobile Multimedia offers many new challenges: energy efficiency, size, real-time performance, ...
- Apps/metrics of the future to design the computer of the future!
  - Suppose the PDA replaces the desktop as the primary computer?
  - Work on FPPPP on PC vs. speech on PDA?
31
From the Horse's Mouth
- “Personal mobile computing offers a vision of the future with a much richer and more exciting set of architecture research challenges than extrapolations of the current desktop architectures and benchmarks.”
- “Put another way, which problem would you rather work on: improving performance of PCs running FPPPP, a 1982 Fortran benchmark used in SPECfp95, or making speech input practical for PDAs?”
Source: “A New Direction for Computer Architecture Research”, Kozyrakis, Patterson, IEEE Computer (11/98)
32
References
- IEEE Computer: Sept. 97, Jan. 98, Aug. 98, Nov. 98
- IEEE Micro: Dec. 96, Mar. 97, Sept. 97
33
Acknowledgement
- Thanks to Dr. Vishv Malhotra for lending me some of his IEEE Computer issues.
- Thanks to Prof. Sale for going through the slides and making useful suggestions.
- WAIT FOR THE NEXT TWO SLIDES
34
Purpose of This Talk
- To get staff and students excited about the new opportunities for research.
- What would you be doing as a graduate?
  - Service Windows NT, and if lucky perhaps UNIX?
  - Develop web pages?
  - Do more of the same?
- Or would you rather do something really exciting?
- We need you if you choose the LATTER!
- 50 postgraduate scholarships for IT up for grabs
35
Our Vision and Aim
- Achieve critical mass in research
- Create a group of staff & students working on the problems of the future
- Pull the Australian IT research community together
- Identify niches where we can make an international contribution