Published by Edward Sims. Modified over 9 years ago.
1
CSIT 301 (Blum)1 Processor Specs
2
Pentium 4 Processor Specs
3
The above list of processor specifications includes CPU speed, bus speed, manufacturing technology, stepping, cache size, and package type.
4
Some more recent processor specs
5
Even more recent
6
CPU Speed
7
CPU Speed
8
CPU Speed
The activities of the processor are kept in sync by the clock, which goes through a regular, repetitive action. In a binary system, a cycle consists of a 1 and a 0 (a high followed by a low).
The clock is usually a quartz oscillator that is external to the microprocessor. So the CPU speed is not something built into the chip, but rather the maximum rate at which the chip can be expected to perform normally.
9
CPU Speed (Cont.)
Sometimes differently rated chips are made by the same manufacturing process, and each chip's CPU speed is determined by testing after manufacture.
Some people try to operate the processor faster than its rated speed. This is known as “overclocking.”
10
CPU Speed (Cont.)
The speed is measured in hertz (Hz), which is cycles per second.
–Kilohertz (kHz) is thousands (10^3) of cycles per second
–Megahertz (MHz) is millions (10^6) of cycles per second
–Gigahertz (GHz) is billions (10^9) of cycles per second
–What’s next?
11
CPU Speed (Cont.)
The clock speed is also known as the clock’s frequency (the number of cycles per second). A related quantity is the period, which is the time required for one cycle (a.k.a. a clock tick).
A clock’s frequency and period are reciprocals.
–f = 1/T or T = 1/f, where f is frequency and T is period
–E.g., a frequency of 60 hertz (cycles per second) corresponds to a period of 1/60 ≈ 0.0167 seconds per cycle
12
CPU Speed (Cont.)
A frequency of 1 kHz [a thousand cycles per second] corresponds to a period (tick) of 1 millisecond (ms) [a thousandth (10^-3) of a second per cycle].
A frequency of 1 MHz [a million cycles per second] corresponds to a period (tick) of 1 microsecond (µs) [a millionth (10^-6) of a second per cycle].
A frequency of 1 GHz [a billion cycles per second] corresponds to a period (tick) of 1 nanosecond (ns) [a billionth (10^-9) of a second per cycle].
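The reciprocal relationship between frequency and period can be checked with a few lines of Python. This is only an illustrative sketch; the function name `period_seconds` is made up for this example and does not come from the slides.

```python
def period_seconds(frequency_hz: float) -> float:
    """Return the clock period T = 1/f in seconds (hypothetical helper)."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz


# The frequencies used as examples on the slides.
for label, f in [("60 Hz", 60.0), ("1 kHz", 1e3), ("1 MHz", 1e6), ("1 GHz", 1e9)]:
    print(f"{label}: period = {period_seconds(f):.3e} s")
```

Each factor of a thousand in frequency shrinks the period by a factor of a thousand, which is why kHz, MHz, and GHz pair with ms, µs, and ns.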
13
New to Intel: Turbo Boost
14
Turbo Boost
Instead of having a fixed top speed, the microprocessor has built-in sensors for current, power usage, and temperature. Based on these factors and computational need, it will push the speed of one or more cores to its limit.
15
Bus Speed
16
Bus Speed
17
Bus Speed
There is a hierarchy of buses in a computer, but in a discussion of processors, the buses of interest are the front-side bus and the back-side bus.
In early processors the CPU speed and bus speed (and thus the speed of interactions with memory, etc.) were the same. But a bottleneck (the von Neumann bottleneck) arose because memory speeds could not keep up with processor speeds, so accessing memory held the processor back.
18
Front-side Bus (FSB)
The front-side bus (a.k.a. the memory bus or system bus) connects the processor to other parts via the chipset. It allows communication between the processor and main memory (RAM), the system chipset, PCI devices, the AGP card, and other peripheral buses.
When the “bus speed” is given as one of the processor’s specs, it refers to the front-side bus speed.
19
The Northbridge
A chipset is simply a group of chips that work together to perform related functions.
The Northbridge chip communicates with the processor (using the FSB) and controls interaction with memory, the PCI bus, and AGP.
The Northbridge’s partner in the chipset is the Southbridge, which handles the I/O functions.
–The Intel Hub Architecture (IHA) is replacing the Northbridge/Southbridge chipset.
20
Back-side Bus
The back-side bus (a.k.a. the cache bus) connects the processor to the L2 cache. The term back-side bus is reserved for cases in which the L2 cache is packaged with the microprocessor.
–If the L2 cache is separate from the processor, the front-side bus connects the processor to the L2 cache.
Cache (SRAM) operates faster than memory (DRAM). The back-side bus operates at higher speeds than the front-side bus, sometimes at the full processor speed.
21
FSB Speeds
The ratio between the CPU speed and the bus speed is a simple fraction.
–For example, a CPU speed of 3.2 GHz and a bus speed of 800 MHz give a ratio of 4.
With the Pentium III, 100 and 133 MHz FSB speeds became standard. That clock rate has stayed roughly fixed for a few years; what is changing is the amount of data transferred in each clock cycle. This is where one begins to talk of “DDR” or “quad-pumped.”
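As a small sketch of the "simple fraction" idea, the ratio can be computed exactly with Python's `fractions` module. The helper name `clock_ratio` is hypothetical, and the second pair of speeds below is an invented value chosen only to show a non-integer ratio.

```python
from fractions import Fraction


def clock_ratio(cpu_mhz: int, bus_mhz: int) -> Fraction:
    """Return CPU speed divided by bus speed as an exact fraction."""
    return Fraction(cpu_mhz, bus_mhz)


# The example from the slide: 3.2 GHz CPU on an 800 MHz bus.
print(clock_ratio(3200, 800))   # 4

# A made-up pair of speeds, only to show a ratio that is not a whole number.
print(clock_ratio(2800, 800))   # 7/2
```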
22
Edge-triggering
23
Edge triggering
The clock keeps the various circuit elements working in unison. Elements are typically designed to be active on an “edge” of the clock, either
–when it is rising (the positive edge)
–or when it is falling (the negative edge)
This is more precise than level activation, where the action takes place while the clock has a certain state or level (e.g. when the clock is high).
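To make the idea of clock edges concrete, here is a minimal sketch that scans a sampled 0/1 clock signal and reports where the rising and falling edges occur. The function `find_edges` and the sample list are assumptions made for illustration; real hardware detects edges in circuitry, not in software.

```python
def find_edges(samples):
    """Return (index, kind) for each transition in a sampled 0/1 clock signal."""
    edges = []
    for i in range(1, len(samples)):
        prev, curr = samples[i - 1], samples[i]
        if prev == 0 and curr == 1:
            edges.append((i, "rising"))    # positive edge
        elif prev == 1 and curr == 0:
            edges.append((i, "falling"))   # negative edge
    return edges


# An invented clock trace: two full cycles sampled twice per half-cycle.
clock = [0, 1, 1, 0, 0, 1, 1, 0]
print(find_edges(clock))
# [(1, 'rising'), (3, 'falling'), (5, 'rising'), (7, 'falling')]
```

An edge is a single instant per transition, whereas the "high" level lasts for several samples, which is one way to see why edge triggering is the more precise of the two.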
24
DDR
Double Data Rate (DDR) allows data to be fetched on both the positive and negative edges of the clock.
–Thus it is essentially the equivalent of doubling the clock rate.
–E.g., a 100 MHz DDR bus transfers as much data as a 200 MHz SDR (single data rate) bus.
25
Quad pumped
A quad-pumped bus allows four signals to be communicated per clock cycle. This is sometimes called QDR (Quad Data Rate).
The Pentium 4 uses a quad-pumped FSB.
–The 400 MHz FSB is a 100 MHz bus with four signals per cycle.
–The 533 MHz FSB is a quad-pumped 133 MHz bus.
Quad pumping is one of the features of the Pentium 4 NetBurst micro-architecture.
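The DDR and quad-pumped figures are both the base clock multiplied by the number of transfers per cycle. The sketch below uses a hypothetical helper, `effective_rate_mhz`, and assumes that the nominal "133 MHz" clock is 133⅓ MHz (400/3), which is why the quad-pumped figure is quoted as 533 rather than 532.

```python
def effective_rate_mhz(base_clock_mhz: float, transfers_per_cycle: int) -> float:
    """Effective transfer rate = base clock x transfers per clock cycle."""
    return base_clock_mhz * transfers_per_cycle


print(effective_rate_mhz(100, 2))             # DDR on a 100 MHz clock -> 200
print(effective_rate_mhz(100, 4))             # quad-pumped 100 MHz   -> 400
# Assumption: the nominal 133 MHz clock is 400/3 MHz.
print(round(effective_rate_mhz(400 / 3, 4)))  # quad-pumped "133 MHz" -> 533
```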
26
New to Intel: QuickPath
With QuickPath, the memory controller is now built into the chip (the Integrated Memory Controller). In a multi-processor system, each processor has its own controller and is assigned part of the system memory.
27
Manufacturing Technology
28
Manufacturing technology
29
Manufacturing Technology
30
“Tick Tock”
Intel refers to its progress as “tick-tock.”
Tick is an improvement in manufacturing technology: a decrease in the component size (Moore’s Law).
Tock is a change in the architecture: new instructions, more controllers, more registers, etc.
31
Manufacturing technology
The next specification found in the table is manufacturing technology, which indicates the size of the components (mainly transistors) and thus the number of components that can be placed on the chip.
For earlier microprocessors, one used terms like large-scale integration (LSI), very large-scale integration (VLSI) and ultra large-scale integration (ULSI).
–But as Moore’s Law continued to hold true, we ran out of adjectives.
32
Manufacturing Technology
Today the manufacturing technology is given in terms of microns or nanometers (e.g. the 0.13-micron or the 90-nm technology).
–A nanometer (nm) is a billionth of a meter (10^-9 m).
The same chip may be made using different technologies; this is done to perfect the newer technology so that more components can be added to later chips.
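The two units on this slide are easy to relate: one micron is 1000 nm. The sketch below converts the slide's 0.13-micron figure to nanometers and then compares it with 90 nm. The area comparison assumes, as a rough idealization not stated on the slides, that the area a component occupies scales with the square of the feature size; the helper names are made up.

```python
NM_PER_MICRON = 1000


def microns_to_nm(microns: float) -> float:
    """Convert a feature size from microns to nanometers."""
    return microns * NM_PER_MICRON


def relative_area(new_nm: float, old_nm: float) -> float:
    """Idealized area per component at the new size relative to the old size."""
    return (new_nm / old_nm) ** 2


old_nm = microns_to_nm(0.13)
print(round(old_nm))                         # 130
print(f"{relative_area(90, old_nm):.2f}")    # 0.48
```

Under that assumption, a component at 90 nm takes roughly half the area it took at 0.13 micron, so about twice as many fit in the same space.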
33
Stepping
34
Stepping
As with software, mistakes (errata) in hardware are found and revisions are needed. However, hardware mistakes are more difficult to fix.
The stepping identifies the revision of the chip, so one wants a higher stepping, which presumably has fewer bugs.
–AMD uses the term “revision number.”
Although the circuitry cannot be changed on an existing chip, it may be possible to work around a processor bug by updating the BIOS, which can be changed (flashed).
35
Pentium 4 Product Information
36
Document on Specification Update (Stepping Levels)
37
Cache size
38
Cache size
39
Cache
Recall that there are three levels of cache (L1, L2 and L3) associated with the processor.
The cache specification on the previous slide refers to L2 cache.
A more detailed set of specifications will reveal the amount of L1 and L2 as well as the amount of L3 that can be supported.
40
L3
The term L3 is starting to be used for cache on the chip. In addition to speed and use, another distinction is that each core now has its own L2, whereas L3 is shared.
41
Package Type
42
Package Type
43
Form Factor and Package
The term form factor applies to many devices, including processors. It refers to their size and shape, and in the case of processors it also includes how they connect to the motherboard.
–The motherboard has a slot or socket.
A related term is the “package”: an enclosure for a chip (integrated circuit).
44
Pinning
The pins or leads are how a chip interfaces with the outside world. There are various ways to arrange the pins on a chip.
Furthermore, several chips can be brought together into a unit called a module (common in memory).
45
PGA/DIP/SIP
PGA: pin grid array, a chip in which the pins are located on the bottom in concentric squares.
–Used in some microprocessors.
DIP: dual in-line package, a rectangular chip with two rows of pins, one on each side.
SIP: single in-line package, a chip with pins protruding from one side.
46
SEPP
Single-Edge Processor Package
With the S.E.P.P. form factor, the processor is not completely covered by the black plastic (as it is in S.E.C.C. and S.E.C.C.2). The circuit board (substrate) can be seen from the bottom side.
An outdated processor packaging scheme.
47
SECC
Single Edge Contact Cartridge
With the S.E.C.C. form factor, processors have a plastic shroud covering with an active heatsink and fan.
Identifiable by the goldfinger contacts, which in this case are inside the plastic housing.
Another outdated processor packaging scheme.
48
Heat
Recall that in the history of processors the number of transistors continues to grow (Moore’s Law) while the relative size of the chip stays fixed. With more transistors carrying current, more heat is produced.
Various developments have occurred to deal with the issue of heat. One is a reduction in the working voltage (5 V → 3.3 V → 2 V). Another has been the introduction of the heatsink and fan.
49
Heat Sink
The computer has had a fan for some time to deal with heat. But starting with the 486, the processor needed special consideration.
A heat sink is an element designed to take heat away from the processor. In this case, heat is dissipated mainly via convection: the heat is transferred to the nearby air and is carried away with the air as it moves.
–Convection is why a breeze feels nice on a hot summer day.
50
Desired Effects
A heat sink should have a large surface area, since this is where the heat is transferred to the air. But the heat sink should not block the air flow, since this is how the heat is carried away.
Heat sinks often have very strange shapes to balance these two competing requirements.
–Typically made of aluminum
–May have “fins”
51
Heat Sinks
52
Passive and Active
All modern processors have a heat sink. Some also require a fan.
–Without a fan: passive heat sink
–With a fan: active heat sink
Because the heat sink’s purpose is to dissipate heat, it is important that the heat can get from the processor to the heat sink. The material “gluing” the heat sink to the processor must conduct heat well.
A heat slug is a piece of metal that connects the processor core to the processor package and/or heatsink.
53
SECC2
As with SECC, SECC2 processors have a plastic housing with an active heatsink (meaning it has a fan).
It is distinct from SECC in that the goldfinger contacts are exposed.
54
PPGA
Plastic Pin Grid Array
With PPGA the processors have pins arranged in a square pattern. They fit into Socket 370 motherboards.
Look for the square pattern (Pin Grid Array) on the bottom. Slot connectors do not have pins.
55
FC-PGA
Flip-Chip Pin Grid Array
The chip is designed so that the processor “core,” which is the part that gets the hottest, is on top (closer to the heat sink).
It also fits into a Socket 370 motherboard, but the motherboard must be FC-PGA compliant for an FC-PGA processor to work.
56
Pentium 4 Form Factors
Pentium 4s also come in an FC-PGA form factor.
–The package uses 478 pins, which are 2.03 mm long and 0.32 mm in diameter.
FC-BGA (Flip-Chip Ball Grid Array)
–Instead of pins, FC-BGA uses small balls, which act as contacts for the processor. Pins bend, balls don’t.
–The package uses 479 balls, which are 0.78 mm in diameter.
57
The LGA
"Intel’s new LGA, or Land Grid Array, 775 processor socket takes a step away from traditional implementations in that the package no longer features pins, rather the bottom of the LGA 775 processors only have small gold contacts. With the LGA package, Intel has moved the pins into the bottom portion of the processor socket, something that will make installation of the processor easier in that there is no need to watch for bent pins on the package...although it will make it more difficult as well. You no longer need to worry about bent or damaged pins on the processor, rather now you have to worry twice as much about bent pins within the processor socket itself."
http://rootprompt.org/article.php3?article=7115
58
Micro-architecture
A processor’s architecture refers to its instruction set, the number and type of registers, and the memory-resident data structures (e.g. stacks) that are available to a programmer (at least at the assembly level).
A processor’s micro-architecture refers to the hardware implementation of the architecture (the transistors).
Backward compatibility is maintained at the level of the architecture (which is more of a logical level). The micro-architecture (implementation) may change dramatically and is not necessarily compatible with previous versions.
59
60
Intel® Wide Dynamic Execution
A combination of techniques (data flow analysis, speculative execution, out-of-order execution, and superscalar execution) that enables the processor to execute more instructions in parallel.
–Pipelining ideas
Delivers more instructions per clock cycle to improve execution time and energy efficiency.
61
Same for Nehalem
Allows up to four instructions per clock cycle.
Has improved “out of order” instruction handling.
–They have made the window/buffer of instructions to be examined for possible pipelining or parallelizing larger.
62
Pipelining
Recall that to execute an instruction, one must fetch it, decode it, fetch any data required, execute the instruction, write the answer to the appropriate place, and possibly check for any interrupt requests that occurred during the previous steps.
In pipelining, a processor can begin executing a second instruction before the first has been completed. Thus, many instructions are in the pipeline at the same time, though at various processing stages.
63
Pipelining
The pipeline is divided into segments. Each segment can perform its duty at the same time as the other segments. When a segment completes its task, it passes the result to the next segment and fetches the next operation from the preceding segment.
Once a feature of only high-end processors, pipelining is now standard.
–A Pentium had up to six instructions in the pipeline.
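The benefit of overlapping instructions can be estimated with a standard idealized model. The sketch below assumes every segment takes exactly one clock cycle and that nothing ever stalls, which real processors do not guarantee; the function names and the example numbers are invented for illustration.

```python
def unpipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """The first instruction fills the pipeline; each later one finishes one cycle after the previous."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


n, k = 100, 5  # made-up workload: 100 instructions, 5-segment pipeline
print(unpipelined_cycles(n, k))  # 500
print(pipelined_cycles(n, k))    # 104
```

For a long run of instructions the pipelined count approaches one instruction per cycle, no matter how many segments there are.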
64
Hyper-Pipelined Technology
The Pentium 4’s Hyper-Pipelined Technology uses a 20-stage pipeline.
Having so many instructions in the works can be a problem if the program branches and one has the wrong instructions in the pipeline.
For long pipelines to be effective there must be good “branch prediction.”
BPU – Branch Prediction Unit
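A rough textbook-style estimate shows why a long pipeline depends on good branch prediction. The model below is an assumption made for illustration, not Intel's own figures: it charges a fixed number of wasted cycles for every mispredicted branch and takes that penalty to be about the pipeline depth. The branch fraction and misprediction rates are invented.

```python
def effective_cpi(branch_fraction: float, mispredict_rate: float,
                  flush_penalty_cycles: float, base_cpi: float = 1.0) -> float:
    """Average cycles per instruction when each misprediction wastes a fixed number of cycles."""
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


# Assumed: 20% of instructions are branches, penalty about 20 cycles (a 20-stage pipeline).
poor = effective_cpi(0.20, 0.10, 20)   # 10% of branches mispredicted
good = effective_cpi(0.20, 0.02, 20)   # 2% of branches mispredicted
print(f"{poor:.2f}")  # 1.40
print(f"{good:.2f}")  # 1.08
```

Under these assumed numbers, improving the predictor from 90% to 98% accuracy removes most of the slowdown caused by flushing the wrong instructions.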
65
Intel® Wide Dynamic Execution (Cont.)
A wider execution core allows each core to fetch, dispatch, execute and retire up to four full instructions simultaneously.
More accurate branch prediction.
Deeper instruction buffers for greater execution flexibility.
66
Nehalem
The Nehalem micro-architecture has added a “Second-Level Branch Prediction Target Buffer,” which improves prediction and also improves the situation when a prediction is incorrect.
67
Intel® Advanced Smart Cache
The Intel Advanced Smart Cache is a multi-core optimized cache.
Reduces latency to frequently used data.
–Improves performance and efficiency by increasing the probability that each execution core can quickly access data.
68
Intel® Smart Memory Access
Intel Smart Memory Access optimizes the use of the available data bandwidth from the memory subsystem.
Includes an important new capability called "memory disambiguation,"
–which increases the efficiency of out-of-order processing by providing the execution cores with the built-in intelligence to speculatively load data for instructions that are about to execute before all previous store instructions are executed.
–(I.e., get what you need when you need it.)
69
Intel® Advanced Digital Media Boost
Intel® Advanced Digital Media Boost is a feature that significantly improves performance when executing Intel® Streaming SIMD Extension (SSE/SSE2/SSE3) instructions.
Accelerates video, speech, image and photo processing, encryption, and financial, engineering and scientific applications.
Enables 128-bit instructions to be executed at a throughput rate of one per clock cycle, doubling the speed of previous generations.
70
Internet Streaming SIMD Extensions
SSE is an acronym within an acronym: it stands for Streaming SIMD Extensions, where SIMD is Single Instruction Multiple Data.
SSE consists of 70 SIMD instructions for integer and floating-point operations.
It helps with high-resolution images, audio and video viewing, speech recognition, etc.
The Pentium 4 actually uses SSE2, which adds 144 new instructions.
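The "single instruction, multiple data" idea can be pictured in plain Python, although this is only an emulation: real SSE instructions operate on hardware registers, and ordinary Python code does not use them directly. The sketch assumes a 128-bit register holding four 32-bit floats, and the name `packed_add` is made up for this example.

```python
import struct

LANES = 4  # assumption: four 32-bit floats fill one 128-bit register


def packed_add(a, b):
    """Emulate one SIMD add: the same operation is applied to every lane at once."""
    if len(a) != LANES or len(b) != LANES:
        raise ValueError(f"expected {LANES} values per operand")
    return [x + y for x, y in zip(a, b)]


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
print(packed_add(a, b))                    # [11.0, 22.0, 33.0, 44.0]

# Four 32-bit floats packed together occupy 16 bytes, i.e. 128 bits.
print(len(struct.pack("<4f", *a)) * 8)     # 128
```

One conceptual instruction produces four results here, where a scalar instruction would produce one.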
71
72
73
SSE4 (Cont.)
74
Intel® Virtualization Technology
Intel® Virtualization Technology (Intel® VT) improves traditional software-based virtualization solutions.
–“These integrated features give virtualization software the ability to take advantage of offloading workload to the system hardware, enabling more streamlined virtualization software stacks and ‘near native’ performance characteristics.”
75
Virtualization
Using virtualization, one computer system can function as multiple "virtual" systems.
–Can run multiple operating systems (simultaneously)
–One machine can be used as a number of independent virtual machines
–Allows consolidation and balancing of multiple workloads on one physical server system
–Lowers hardware acquisition costs
–Improves data center performance efficiency
76
Execute Disable Bit
Intel's Execute Disable Bit allows the processor to distinguish between areas in memory where an application can execute and where it cannot.
Can be used to disable certain worm attacks.
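The idea behind the Execute Disable Bit can be pictured with a toy model in which each region of memory carries a flag saying whether code may run from it. Everything below (the `Page` class, the `execute` function, the exception name) is hypothetical and only illustrates the concept; the real check is done by the processor together with the operating system.

```python
from dataclasses import dataclass


class ExecuteDisabledError(Exception):
    """Raised when code tries to run from a region marked non-executable."""


@dataclass
class Page:
    name: str
    executable: bool  # stands in for the per-region execute-disable setting


def execute(page: Page) -> str:
    """Refuse to run code from a page whose execute permission is off."""
    if not page.executable:
        raise ExecuteDisabledError(f"execution blocked in {page.name}")
    return f"running code in {page.name}"


print(execute(Page("code", executable=True)))
try:
    # A worm that injected instructions into a data area would be stopped here.
    execute(Page("stack", executable=False))
except ExecuteDisabledError as err:
    print(err)
```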
77
References
PC Hardware in a Nutshell, Thompson and Thompson
http://www.webopedia.com
http://www.intel.com
http://www.anandtech.com
http://www.mbreview.com/lga775.php
78
References (Cont.)
http://www.intel.com/technology/architecture-silicon/next-gen/whitepaper.pdf
http://en.wikipedia.org/wiki/SSE4