Chris Foster Brian Moore Scott Thibaudeau
Overview
  I/O
  EE – Emotion Engine!!
  Graphics Synthesizer
  Comparison
System Overview (block diagram)
  Components: RDRAM, SRAM, Emotion Engine, IO Processor, Graphics Synthesizer, Sound Processor, DVD, OS ROM
  Interfaces: PCMCIA Interface, IEEE1394, USB (x2), Controller
  Outputs: Digital RGB, Analog RGB, Digital L/R, Analog L/R
Controllers
  Two controller ports
    –PS2 Dual Shock 2!
  Directional Pad
  Force Feedback
  15 buttons
    –All but Start/Select are ANALOG!
Memory Cards/Ports
  8 MB storage capacity
  Transfer rate 250 times that of PSX memory cards
  Incorporates “MagicGate”
DVD/CD-ROM
  Supports DVDs, Audio CDs, PSX CDs, PS2 DVDs
  4.7 GB Proprietary DVD
  4x DVD-ROM
  24x CD-ROM
  DVD video playback
    –Built into hardware
Miscellaneous Ports
  IEEE1394 Standard – i.LINK™ (FireWire)
  Expansion Unit for network interface
  One Type III PCMCIA card slot
    –Flash Memory
    –Small Hard Drives
  2 USB Ports
Input/Output Processor
  Handles all input and output
  PSX Core Processor
  MHz or MHz (Selectable)
  2 MB Memory
  32-bit Bus
  Interfaces with i.LINK™, USB, Controller and Memory ports
The Emotion Engine
  Designed by Sony and Toshiba
  Combination CPU and DSP Processor
  Simulates Massive 3D worlds
The Emotion Engine
  MIPS IV CPU core
  Two Vector Processing Units
  Floating-point Coprocessor (FPU)
  Image Processing Unit (MPEG2 decoder)
  10-channel DMA controller
  Graphics Interface Unit (GIF)
  RDRAM interface and I/O interface
  Handles: Geometry Calculations, Physics Calculations, AI, Data Transfers
Geometry + Perspective Transformation: 66 Million Polygons/sec
  + Lighting: 38 Million Polygons/sec
  + Fog: 36 Million Polygons/sec
  Curved Surface Generation (Bezier): 16 Million Polygons/sec
Image Processing Unit
  MPEG2 Macroblock Layer Decoder
  Image Processing Performance: 150 Million Pixels/sec
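To put the peak geometry rates in per-frame terms, the short sketch below divides each quoted figure by a frame rate. The 60 frames/sec default and the names `GEOMETRY_RATES` and `polygons_per_frame` are assumptions for illustration; the rates themselves are the peak numbers from the slide, not sustained in-game figures.

```python
# Peak geometry rates quoted on the slide, in polygons per second.
GEOMETRY_RATES = {
    "perspective transformation": 66_000_000,
    "lighting": 38_000_000,
    "fog": 36_000_000,
    "bezier surfaces": 16_000_000,
}


def polygons_per_frame(rate_per_sec: int, frames_per_sec: int = 60) -> int:
    """Whole polygons available per frame at a given peak rate.

    frames_per_sec defaults to 60, which is an assumption, not a slide figure.
    """
    if frames_per_sec < 1:
        raise ValueError("frames_per_sec must be at least 1")
    return rate_per_sec // frames_per_sec


if __name__ == "__main__":
    for name, rate in GEOMETRY_RATES.items():
        print(f"{name}: {polygons_per_frame(rate):,} polygons/frame")
```

At the assumed 60 frames/sec, the 66 million polygons/sec peak works out to 1.1 million polygons per frame.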
DSP Features
  Most Important DSP Feature Today: the MAC (multiply-accumulate)
  Vector Calculations: Dot Product
    –Summing of Products: x0*x1 + y0*y1 + z0*z1
  PS2 has 10 FMACs
    –Itanium Contains Only 4!
  Each FMAC Computes One 32-bit Floating-point MAC Operation per Clock Cycle
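The dot product on this slide is a chain of multiply-accumulate steps. A minimal sketch in Python follows, using ordinary floats rather than 32-bit hardware arithmetic; `mac` and `dot3` are hypothetical names chosen for the example.

```python
from typing import Sequence


def mac(acc: float, a: float, b: float) -> float:
    """One multiply-accumulate step: returns acc + a * b."""
    return acc + a * b


def dot3(v0: Sequence[float], v1: Sequence[float]) -> float:
    """Dot product of two 3-component vectors as three chained MACs."""
    if len(v0) != 3 or len(v1) != 3:
        raise ValueError("both vectors must have exactly 3 components")
    acc = 0.0
    for a, b in zip(v0, v1):
        acc = mac(acc, a, b)  # x0*x1, then + y0*y1, then + z0*z1
    return acc


if __name__ == "__main__":
    print(dot3((1.0, 2.0, 3.0), (4.0, 5.0, 6.0)))
```

Each pass through the loop corresponds to one MAC operation, which is the unit of work the slide attributes to a single FMAC per clock cycle.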
DSP Processor Requirement 2
  Memory Bandwidth
  Memory Availability
  New Caching Method
    –Static Application
    –Dynamic Media App
  Solution: Increase the Pipe (bandwidth) while Decreasing the Bucket (cache)
Static Application
  High Degree of Instruction-level Parallelism
Dynamic Media App
  High Degree of Data Parallelism
PS2 has Moved to SIMD (Single Instruction stream, Multiple Data streams)
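The SIMD idea can be sketched as one operation applied to several data lanes at once. The example below models this in plain Python; the lane count of 4 and the name `simd_madd` are assumptions for illustration, and a list comprehension only imitates what hardware would do in a single instruction.

```python
from typing import List

LANES = 4  # assumed lane count, for illustration only


def simd_madd(acc: List[float], a: List[float], b: List[float]) -> List[float]:
    """Apply the same multiply-add to every lane: acc[i] + a[i] * b[i]."""
    if not (len(acc) == len(a) == len(b) == LANES):
        raise ValueError(f"expected {LANES} lanes per operand")
    return [acc_i + a_i * b_i for acc_i, a_i, b_i in zip(acc, a, b)]


if __name__ == "__main__":
    # One "instruction", four independent results.
    print(simd_madd([0.0] * 4, [1.0, 2.0, 3.0, 4.0], [2.0] * 4))
```

The point of the sketch is the shape of the work: a single instruction stream (one call) drives multiple data streams (the lanes), which suits media code with a high degree of data parallelism.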
Caching on the PS2
  DMAC Coordinates Transfers on the 128 and 64-bit Data Paths
  L1 Cache: 16K Instruction, 8K Data
  SPRAM: 16K
  VU0 Cache: 8K
  VU1 Cache: 32K
  PIII has 3X the Amount of Cache as the PS2
Busting Out All Over
  VU Caches Too Small to Hold Models
  PS2 Code & Data Constantly Streamed Over Wide Internal Buses
  Results in 10 Channel DMAC Being Used at Near Capacity
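When a model does not fit in a small local memory, it has to be fed through in pieces. The sketch below shows that batching pattern only; it does not model the real DMAC. `stream_batches`, `BYTES_PER_VERTEX`, and the 12-byte vertex layout (three 32-bit floats) are all hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple

Vertex = Tuple[float, float, float]
BYTES_PER_VERTEX = 12  # assumption: three 32-bit floats per vertex


def stream_batches(vertices: Sequence[Vertex],
                   buffer_bytes: int) -> Iterator[List[Vertex]]:
    """Yield consecutive vertex batches that each fit in buffer_bytes."""
    per_batch = buffer_bytes // BYTES_PER_VERTEX
    if per_batch < 1:
        raise ValueError("buffer too small for a single vertex")
    for start in range(0, len(vertices), per_batch):
        yield list(vertices[start:start + per_batch])


if __name__ == "__main__":
    model = [(float(i), 0.0, 0.0) for i in range(10)]
    # A deliberately tiny 48-byte buffer holds 4 vertices at a time.
    for batch in stream_batches(model, buffer_bytes=48):
        print(len(batch), "vertices in this transfer")
```

The smaller the buffer, the more transfers a model needs, which is one way to read the slide's point about the DMAC running near capacity.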
Two Independent Vector Processing Units
  CPU + FPU: Basic Program Control and Housekeeping
  CPU + FPU + VU0: Behavior Synthesis and Physics Calculations
  VU1: Generates Display Lists
  IPU: Image Decompression
  FPU and VU0: Coprocessors for CPU Core
  GS and VU1: Tied Together with GIF
Parallel Connection (block diagram: Main Memory, VPU1, VPU0, CPU, SPRAM, Rendering Engine)
  Both Groups (CPU/FPU/VU0 and VU1) Sending Display Lists to GIF
Serial Connection (block diagram: VPU0, CPU, VPU1, SPRAM, Rendering Engine)
  CPU/FPU/VU0 Acting as a Preprocessor for VU1
CPU core: 128-bit RISC (MIPS IV subset)
Clock Frequency: 300 MHz
Integer Unit: 64-bit (2-way Superscalar)
Multimedia extended instructions: 107 instructions at 128-bit width
Integer General Purpose Registers: 32 at 128-bit width
TLB: 48 double entries
Instruction Cache: 16 KB (2-way)
Data Cache: 8 KB (2-way)
Scratch Pad RAM: 16 KB (Dual port)
Main Memory: 32 MB (Direct RDRAM)
Memory bandwidth: 3.2 GB/sec
DMA: 10 channels
Co-processor1: FPU (FMAC x 1, FDIV x 1)
Co-processor2: VU0 (FMAC x 4, FDIV x 1)
Vector Processing Unit: VU1 (FMAC x 5, FDIV x 2)
Floating Point Performance: 6.2 GFLOPS
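The FMAC counts and clock frequency listed here give a rough cross-check on the quoted floating-point figure. The sketch below counts each MAC as two floating-point operations (one multiply, one add), which is an assumption, and ignores the FDIV units entirely; `peak_mac_gflops` is a hypothetical helper name.

```python
# FMAC counts per unit and clock frequency, as listed in the spec table.
FMAC_UNITS = {"FPU": 1, "VU0": 4, "VU1": 5}
CLOCK_HZ = 300_000_000
FLOPS_PER_MAC = 2  # assumption: one multiply plus one add per MAC


def peak_mac_gflops(fmac_units=FMAC_UNITS, clock_hz=CLOCK_HZ) -> float:
    """Peak GFLOPS from the FMAC units alone, one MAC per unit per cycle."""
    total_fmacs = sum(fmac_units.values())
    return total_fmacs * FLOPS_PER_MAC * clock_hz / 1e9


if __name__ == "__main__":
    print(f"{sum(FMAC_UNITS.values())} FMACs -> {peak_mac_gflops()} GFLOPS")
```

Under these assumptions the ten FMACs account for 6.0 GFLOPS; the remaining gap to the quoted 6.2 GFLOPS presumably comes from the divide units, which this sketch does not model.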
Gate Width: 0.18 micron
VDD Voltage: 1.8 V
Power Consumption: 15 Watts
Metal Layers: 4
Total Transistors: 10.5 Million
Die Size: 240 mm²
Package: 540-pin PBGA
Sound Processor Specifications At A Glance
  Sound: "SPU2+CPU"
  Number of Voices: ADPCM (Adaptive Differential Pulse Code Modulation), 48ch on SPU2 plus definable, software programmable voices
  Sound Memory: 2 MB
  Output Frequency: Variable up to 48 kHz (DAT quality)
  “Speech processing is important, sir!”
Graphics Synthesizer Specifications At A Glance
  Clock Frequency: MHz
  DRAM Bus bandwidth: *Very High Throughput*
  Pixel Configuration: RGB:Alpha:Z Buffer (24:8:32)
  Polygon Drawing Rate: 75 Million Polygons Per Second
  Screen Resolution: Variable from 256x224 to 1280x1024
  ON DIE MAIN MEMORY!
  Redundancy for smoothness/error correction
  “With Z-buffering, textures, lighting and alpha blending (transparency), a sustained rate of 20 Million polygons per second can be drawn continuously.” -Ars Technica
4MB On Die VDRAM
  Polygon Drawing Rate (Z = Z-buffer, A = alpha, T = texture)
    –75 Million/sec (small polygon)
    –50 Million/sec (48 Pixel quad with Z and A)
    –30 Million/sec (50 Pixel triangle with Z and A)
    –25 Million/sec (48 Pixel quad with Z, A and T)
  VRAM is accessible from on die register files
  VRAM itself is on die
  VDRAM bus? Register to Main memory all on die?
  Allows for fast data transfer: 48 GB/s
  2,560 bit wide VDRAM bus!!!!
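The bus width and bandwidth on this slide can be related with simple arithmetic. The sketch below works out the transfer rate that a 2,560-bit bus would need to reach 48 GB/s, assuming one full-width transfer per clock and decimal gigabytes (10^9 bytes); both are assumptions, and `implied_clock_mhz` is a hypothetical helper name.

```python
BUS_WIDTH_BITS = 2560                     # from the slide
BANDWIDTH_BYTES_PER_SEC = 48_000_000_000  # 48 GB/s, assuming 1 GB = 10**9 bytes


def implied_clock_mhz(bus_width_bits: int = BUS_WIDTH_BITS,
                      bandwidth: int = BANDWIDTH_BYTES_PER_SEC) -> float:
    """Transfers per second (in MHz) needed to hit the bandwidth,
    assuming one full-width transfer per clock cycle."""
    bytes_per_transfer = bus_width_bits // 8
    return bandwidth / bytes_per_transfer / 1e6


if __name__ == "__main__":
    print(f"{BUS_WIDTH_BITS // 8} bytes per transfer")
    print(f"implied rate: {implied_clock_mhz()} MHz")
```

Under those assumptions, 320 bytes per transfer at 150 million transfers per second gives the quoted 48 GB/s.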
Process Modes
  Rendering Engine Visualization
    -3D Polygon (triangle, quad, mesh)
    -2D Sprite
    -2D/3D Line
    -Particle
    -Point
    -Photo/Movie (MPEG-2)
  Sony optimizes their instruction set to speed up these specific operations
Inside the Beast
  No. of Pixel Engines: 16 (in Parallel)
  Display output: NTSC/PAL
  Silicon process technology: 0.25 µ, 4-level metal
  Total number of transistors: 43 Million
  Die size: 279 mm²
  Embedded DRAM: 4 MB of multi-port DRAM
  PE n=2,560
  Input: Highly Parallel Pixel Process Instructions
  Output: Geometry Engine
Talking Shop: Pixel Configuration
  Geometry engine: Alpha channel, Anti-aliasing, Bezier surfacing, Gouraud shading, Mip mapping, Z-buffer
  Pixel Configuration: 64-bit (RGB, Alpha, Z-Buffer: 24, 8, 32)
    –24 bits are used for RGB colors
    –8 bits are used for transparency effects
    –32 bits are saved for the Z-Buffer, which keeps track of the depth of everything in the 3D environment
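The 24:8:32 split adds up to one 64-bit word per pixel. The sketch below packs and unpacks such a word and computes how many bytes a frame would take at that full configuration. The field order inside the word (Z in the high 32 bits, then alpha, then RGB) is an assumption made for illustration, not the documented hardware layout, and the function names are hypothetical.

```python
RGB_BITS, ALPHA_BITS, Z_BITS = 24, 8, 32  # pixel configuration from the slide


def pack_pixel(rgb: int, alpha: int, z: int) -> int:
    """Pack RGB, alpha and Z into one 64-bit integer.

    Assumed layout (illustrative only): [Z:32 | alpha:8 | RGB:24].
    """
    if not 0 <= rgb < (1 << RGB_BITS):
        raise ValueError("rgb must fit in 24 bits")
    if not 0 <= alpha < (1 << ALPHA_BITS):
        raise ValueError("alpha must fit in 8 bits")
    if not 0 <= z < (1 << Z_BITS):
        raise ValueError("z must fit in 32 bits")
    return (z << (RGB_BITS + ALPHA_BITS)) | (alpha << RGB_BITS) | rgb


def unpack_pixel(word: int):
    """Inverse of pack_pixel: returns (rgb, alpha, z)."""
    rgb = word & ((1 << RGB_BITS) - 1)
    alpha = (word >> RGB_BITS) & ((1 << ALPHA_BITS) - 1)
    z = word >> (RGB_BITS + ALPHA_BITS)
    return rgb, alpha, z


def frame_bytes(width: int, height: int) -> int:
    """Bytes for one frame when every pixel uses the full 64 bits."""
    return width * height * (RGB_BITS + ALPHA_BITS + Z_BITS) // 8


if __name__ == "__main__":
    word = pack_pixel(0x123456, 0x80, 7)
    print(hex(word), unpack_pixel(word))
    print(frame_bytes(256, 224), "bytes at 256x224")
```

At the smallest listed resolution, 256x224, a full 64-bit-per-pixel frame comes to 458,752 bytes.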
Talking Shop: GEOMETRY ENGINE – Alpha Channel/Aliasing
  Alpha channel: Transparency effects are created using the alpha bits (8 per pixel), often pictured as a grayscale mask.
  Aliasing: The jagged “stair steps” that occur when lines and edges are painted from pixels.
  Anti-aliasing: Drawing intermediate (gray) pixels along the lines of an image “blurs” the lines, minimizing the stair steps and making an object appear more realistic.
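Transparency from an 8-bit alpha value is commonly computed as a weighted mix of the incoming color and the color already on screen. The sketch below uses that standard blend formula with integer rounding; it is a generic illustration, not a description of how the Graphics Synthesizer implements blending, and the function names are hypothetical.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def blend_channel(src: int, dst: int, alpha: int) -> int:
    """Blend one 8-bit channel: alpha=255 shows src, alpha=0 shows dst."""
    for value in (src, dst, alpha):
        if not 0 <= value <= 255:
            raise ValueError("channel and alpha values must be in 0..255")
    # Weighted average with +127 for round-to-nearest integer division.
    return (src * alpha + dst * (255 - alpha) + 127) // 255


def blend_rgb(src: RGB, dst: RGB, alpha: int) -> RGB:
    """Blend a source color over a destination color, channel by channel."""
    r, g, b = (blend_channel(s, d, alpha) for s, d in zip(src, dst))
    return (r, g, b)


if __name__ == "__main__":
    # Roughly half-transparent red drawn over blue.
    print(blend_rgb((255, 0, 0), (0, 0, 255), 128))
```

The same mixing idea underlies anti-aliasing: an edge pixel that is only partly covered gets a color between the object and the background.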
Talking Shop: GEOMETRY ENGINE – Bezier Surfacing/Gouraud Shading
  Bezier Surfacing: A DSP technique that approximates a 3D real-world object as a 2D polygon structure. This involves a pixel -> triangle -> polygon -> 3D-object transformation.
  Gouraud Shading: A technique (also DSP) that applies smooth shading to a Bezier-surfaced 3D object by interpolating colors between polygon vertices.
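Turning a Bezier shape into polygons starts with evaluating points on it. The sketch below evaluates a Bezier curve (a curve, not a full surface, to keep the example short) with de Casteljau's algorithm and samples it into straight segments. This is the textbook method, not necessarily what the Emotion Engine does internally; `lerp`, `bezier_point`, and `tessellate` are hypothetical names.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def lerp(p: Point, q: Point, t: float) -> Point:
    """Linear interpolation between two points."""
    return tuple(a + (b - a) * t for a, b in zip(p, q))


def bezier_point(control: Sequence[Point], t: float) -> Point:
    """Evaluate a Bezier curve at t in [0, 1] by repeated interpolation
    (de Casteljau's algorithm)."""
    points = list(control)
    if not points:
        raise ValueError("need at least one control point")
    while len(points) > 1:
        points = [lerp(p, q, t) for p, q in zip(points, points[1:])]
    return points[0]


def tessellate(control: Sequence[Point], segments: int) -> List[Point]:
    """Sample the curve at segments + 1 evenly spaced parameter values."""
    if segments < 1:
        raise ValueError("segments must be at least 1")
    return [bezier_point(control, i / segments) for i in range(segments + 1)]


if __name__ == "__main__":
    ctrl = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
    for point in tessellate(ctrl, 4):
        print(point)
```

More segments give a smoother approximation at the cost of more polygons, which is the trade-off behind quoting a separate polygon rate for curved surfaces.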
Talking Shop: GEOMETRY ENGINE – Mip Mapping
  Mip Mapping: When it doesn’t look the same up close as it does from far away, you’re using MIP Mapping!
  Lower-resolution copies of a texture are used as a surface gets farther from the viewer.
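Mip mapping stores a texture at a series of progressively smaller sizes, so that distant surfaces can use a smaller copy. The sketch below lists the sizes in such a chain by halving each dimension until it reaches 1x1. This is the conventional scheme, and `mip_chain` is a hypothetical helper, not part of any PS2 API.

```python
from typing import List, Tuple


def mip_chain(width: int, height: int) -> List[Tuple[int, int]]:
    """Return texture sizes from full resolution down to 1x1,
    halving each dimension per level (never going below 1)."""
    if width < 1 or height < 1:
        raise ValueError("texture dimensions must be positive")
    levels = [(width, height)]
    while width > 1 or height > 1:
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels.append((width, height))
    return levels


if __name__ == "__main__":
    for level, size in enumerate(mip_chain(256, 256)):
        print(f"level {level}: {size[0]}x{size[1]}")
```

A 256x256 texture produces nine levels, from 256x256 down to 1x1.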
Compare and Believe:
  GAMECUBE: 12 million polygons per second
  PLAYSTATION2: 75 million polygons per second
  XBOX: 45 million polygons per second
Did we mention that you can run Linux on a PS2?
  Open Market – January
  Linux Beta – This year
  Linux Emulation for Cluster Parallel Proc. – *Now*