Graphics Hardware: Specialty Memories, Simple Framebuffers


1 Graphics Hardware: Specialty Memories, Simple Framebuffers
Bob Reese - ECE, MSU EE Computer Graphics Hardware Spring 98

2 Introduction
Prof. Moorhead has previously given overviews of some graphics accelerator architectures:
  Simple framebuffers
  2D & 3D accelerator architectures
We will review some of this material and concentrate on memories built especially for use with graphics accelerators.
12/8/2018

3 Memory Issues: Density
Lots of pixels => lots of memory.
1280 x 1024 x 32 bits/pixel => about 5.2 MB (5 MiB).
Double buffering => double the memory: two frames of the above => about 10.5 MB (10 MiB).
Storage is also needed for things other than pixels, e.g. textures.
The Voodoo2 card has 12 MB (high-end gaming).
The Intergraph Realizm 3D has up to 32 MB: 100 bits per pixel with 2.5 million pixels (1824 x 1368).
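
The density arithmetic on this slide can be checked with a few lines of Python. This is only a sketch; the function name `framebuffer_bytes` is made up for illustration and does not come from the slides.

```python
def framebuffer_bytes(width, height, bits_per_pixel, buffers=1):
    """Bytes needed to store `buffers` complete frames."""
    return width * height * bits_per_pixel // 8 * buffers

single = framebuffer_bytes(1280, 1024, 32)
double = framebuffer_bytes(1280, 1024, 32, buffers=2)
print(single, double)  # 5242880 10485760

# 100 bits per pixel at 1824 x 1368, as quoted for the Realizm 3D
print(framebuffer_bytes(1824, 1368, 100))  # 31190400
```

The last figure is a little under 32 MB, which agrees with the capacity quoted for that board.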

4 Memory Issues: Performance
Performance in this instance means BANDWIDTH: how fast can I get data in and out?
Bandwidth is affected by bus width, bus rate, and contention for memory.
There are at least two contenders for graphics memory:
  CPU: write mostly, random access
  Video controller: read only, sequential access

5 Memory Issues: Cost
Cost is primarily an issue for low- to mid-range boards intended for consumers.
We would like to use standard DRAM for memory because it offers the highest density at the lowest cost; the high volume of standard DRAM also means low cost.
BUT... will performance suffer because it only has one port?
Newer single-port synchronous DRAMs have enough performance for both 2D & 3D apps.

6 Single Port FrameBuffer
[Block diagram: CPU, Memory, MUX, RAMDAC, Monitor, Control]
Control arbitrates between video and CPU access.
The simplest scheme only allows CPU access during horizontal or vertical retrace.

7 Double Buffering Can Help
[Block diagram: CPU, Memory FB #1, Memory FB #2, MUX, RAMDAC, Monitor, Control]
While the CPU is preparing the next frame, video is accessing the current frame.
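
A minimal software sketch of the double-buffering idea follows. The class and method names are hypothetical; in the hardware on this slide the selection is done by the MUX, not by software.

```python
class DoubleBuffer:
    """Two frame buffers: one is scanned out while the other is drawn into."""

    def __init__(self, width, height):
        self.buffers = [[0] * (width * height) for _ in range(2)]
        self.front = 0  # index of the buffer the video side is reading

    @property
    def back(self):
        # The buffer the CPU is allowed to write
        return 1 - self.front

    def draw(self, index, value):
        # CPU writes only ever touch the back buffer
        self.buffers[self.back][index] = value

    def scanout(self):
        # Video reads only ever touch the front buffer
        return self.buffers[self.front]

    def swap(self):
        # Exchange roles, typically done during vertical retrace
        self.front = self.back


fb = DoubleBuffer(4, 2)
fb.draw(0, 7)
print(fb.scanout()[0])  # 0, the new pixel is not visible yet
fb.swap()
print(fb.scanout()[0])  # 7
```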

8 A 2D Accelerator with Single Port DRAM
[Block diagram: CPU, DRAM (single ported)]

9 Specialty Memories for Graphics
SGRAM (Synchronous Graphics RAM): single-ported synchronous DRAM with support for fill operations and fast operation.
VRAM (Video RAM): dual-ported DRAM, with a parallel port for random access and a serial output port (locations accessed sequentially).
WRAM (Windowed RAM): VRAM with better fill operation support.
3D-RAM: ASM (Application Specific Memory) with support for the rasterization portion of the OpenGL pipeline.

10 IBM 256K x 16 VRAM

11 VRAM Details
SAM - Serial Access Memory:
  arranged as 256 x 16 registers connected to 16 serial outputs (SDQ0 - SDQ15)
  registers are read in sequential order
  the entire SAM can be loaded from the DRAM array in one memory cycle
Supports a fill operation in which 8 columns can be written with data from a color register; the high and low bytes can be separately masked in each column.

12 Mapping Pixels to Memory
Simplest case: assume a 512 x 512 screen with 16bpp.
Memory location = pixel number: each memory location holds one pixel, and memory locations are in scanline order (pixel 0 ... pixel 511 on the first scanline).
[Diagram: 16 bit planes, Bit 0 to Bit 15, each 512 x 512 bits. Numbers are memory locations. The 1st plane is Bit 0 of each pixel.]
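
As a rough illustration of this mapping, the sketch below converts between screen coordinates and memory locations. It assumes the 256K x 16 array is organized as 512 rows by 512 columns; that organization and the function names are assumptions for illustration, not taken from the slide.

```python
WIDTH = 512    # pixels per scanline
COLUMNS = 512  # assumed number of columns per DRAM row

def pixel_address(x, y, width=WIDTH):
    """Memory location of pixel (x, y) when locations are in scanline order."""
    return y * width + x

def row_column(address, columns=COLUMNS):
    """Split a memory location into (row, column) of the DRAM array."""
    return divmod(address, columns)

print(pixel_address(511, 0))  # 511, last pixel of the first scanline
print(pixel_address(0, 1))    # 512, first pixel of the second scanline
print(row_column(513))        # (1, 1)
```

With these assumptions one scanline occupies exactly one DRAM row.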

13 Multiple pixels per word
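If we go to 8bpp, then we can double the number of pixels.
Increase 50% in both X and Y => 768 x 768 (512 + 256 = 768). Note that 768 x 768 is 2.25 times the pixel count, slightly more than double; an exact doubling would be, for example, 1024 x 512.
Now each memory location contains two pixels.
[Diagram: bits 15-8 of each location hold the odd-numbered pixel (pixel 1 ... pixel 1023); bits 7-0 hold the even-numbered pixel (pixel 0 ... pixel 1022).]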
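
Packing two 8-bit pixels into one 16-bit location can be sketched as below. The even pixel is placed in the low byte and the odd pixel in the high byte, following the diagram on this slide; the helper names are hypothetical.

```python
def locate(pixel):
    """Return (word index, bit shift) for an 8bpp pixel number."""
    word = pixel // 2
    shift = 8 * (pixel % 2)  # even pixel: bits 7-0, odd pixel: bits 15-8
    return word, shift

def write_pixel(mem, pixel, value):
    word, shift = locate(pixel)
    keep = ~(0xFF << shift) & 0xFFFF       # mask that preserves the other pixel
    mem[word] = (mem[word] & keep) | ((value & 0xFF) << shift)

def read_pixel(mem, pixel):
    word, shift = locate(pixel)
    return (mem[word] >> shift) & 0xFF

mem = [0] * 512
write_pixel(mem, 0, 0x12)
write_pixel(mem, 1, 0x34)
print(hex(mem[0]))    # 0x3412
print(locate(1023))   # (511, 8)
```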

14 Loading the SAM
Use a 512 x 512 screen with 16bpp.
Two 128-bit segments of the selected row are loaded into the SAM (256 x 16).
[Diagram: Row 0 of the DRAM array, with column boundaries marked at 127, 128, and 255, feeding the SAM.]
When the SAM is read sequentially, pixels are read in scanline order.

15 Fill Operation Support
Block fills are a common operation in 2D graphics.
8 columns from the selected row can be filled with the value from the color register (16 bit) in one operation.
[Diagram: the bit-plane layout from the previous slides; colored locations show the fill operation.]
The starting column address has its lower 3 bits = 0.
(512 x 512) / 8 => 32,768 ops to fill the entire array.
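
The operation count can be reproduced with a small software model of the block fill. This is a sketch only: the real device does the 8-column write in a single memory cycle, and the function names here are invented for illustration.

```python
COLUMNS_PER_FILL = 8

def block_fill(row, start_col, color):
    """Write the color register value to 8 adjacent columns of one row."""
    if start_col % COLUMNS_PER_FILL != 0:
        raise ValueError("starting column must have its lower 3 bits = 0")
    row[start_col:start_col + COLUMNS_PER_FILL] = [color] * COLUMNS_PER_FILL

def fill_array(rows, cols, color):
    """Fill a rows x cols array and count the block-fill operations used."""
    mem = [[0] * cols for _ in range(rows)]
    ops = 0
    for row in mem:
        for col in range(0, cols, COLUMNS_PER_FILL):
            block_fill(row, col, color)
            ops += 1
    return mem, ops

mem, ops = fill_array(512, 512, 0xF800)
print(ops)  # 32768
```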

16 Fill can also do Stenciling
The fill can mask individual bits and entire columns.
[Diagram: masked fill; only 8 bits are shown for each location.]
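
One way to picture the masked fill is the sketch below. It assumes an 8-bit column mask (one bit per column) and a 16-bit write mask (one bit per bit plane); the exact mask registers of the real part are not given on the slide, so these parameters are assumptions.

```python
def masked_fill(row, start_col, color, column_mask=0xFF, bit_mask=0xFFFF):
    """Fill 8 columns, skipping masked-out columns and masked-out bits."""
    for i in range(8):
        if (column_mask >> i) & 1:          # column i is enabled
            old = row[start_col + i]
            # Keep the old bits where bit_mask is 0, take color bits where it is 1
            row[start_col + i] = (old & ~bit_mask & 0xFFFF) | (color & bit_mask)

row = [0x0000] * 8
# Enable only columns 0 and 2, and only the low byte of each location
masked_fill(row, 0, 0xFFFF, column_mask=0b00000101, bit_mask=0x00FF)
print([hex(w) for w in row])
# ['0xff', '0x0', '0xff', '0x0', '0x0', '0x0', '0x0', '0x0']
```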

17 Getting Pixels on the Screen
Need to hook up a RAMDAC to convert pixels to RGB.
RAMDAC = RAM (palette table) + Digital-to-Analog Converter.
We will look at a Brooktree RAMDAC as an example:
  8-bit pixel input is used to address a 256 x 24 lookup table
  15, 16, or 24/32 bpp true color is supported (lookup table bypassed)
  dual-edge clocking is supported to reduce the number of load clocks for true color pixels
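
The palette lookup can be sketched in software as follows. The layout of each 24-bit entry (red in the top byte, then green, then blue) is an assumption made for this example, not a statement about the Brooktree part, and the function names are hypothetical.

```python
def make_grayscale_palette():
    """Build a 256-entry table of 24-bit values (assumed layout 0xRRGGBB)."""
    return [(i << 16) | (i << 8) | i for i in range(256)]

def lookup(palette, pixel):
    """Map an 8-bit pixel to (R, G, B) through the 256 x 24 lookup table."""
    entry = palette[pixel & 0xFF]
    return (entry >> 16) & 0xFF, (entry >> 8) & 0xFF, entry & 0xFF

palette = make_grayscale_palette()
print(lookup(palette, 128))  # (128, 128, 128)

palette[1] = 0xFF0000        # reprogram entry 1 to pure red
print(lookup(palette, 1))    # (255, 0, 0)
```

In true color mode the table is bypassed, so the pixel bits would drive the three DACs directly.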

18 Brooktree RAMDAC

19 DAC Bit Assignments
[Table of DAC bit assignments not reproduced.]
These bit assignments are used in true color mode.

20 VRAM to DAC
[Block diagram: VRAM SAM (16 bits) -> MUX (16 to 8 bits) -> RAMDAC -> R, G, B]
Assume no dual-edge clocking.
Define the pixel rate as the RGB update rate.
If 16BPP: SAM clk = pixel rate; RAMDAC clk = 2 * pixel rate.
If 8BPP: SAM clk = 1/2 pixel rate; RAMDAC clk = pixel rate.
If 32BPP: SAM clk = 2 * pixel rate; RAMDAC clk = 4 * pixel rate.
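
The three cases follow one rule: each clock moves a fixed number of bits, so the clock rate scales with bits per pixel divided by the port width. A small sketch, assuming a 16-bit SAM output and an 8-bit RAMDAC input as in the diagram (the function name is made up):

```python
SAM_WIDTH = 16     # bits delivered per SAM clock
RAMDAC_WIDTH = 8   # bits accepted per RAMDAC load clock (no dual-edge clocking)

def clock_rates(bpp, pixel_rate):
    """Return (SAM clock, RAMDAC clock) in the same units as pixel_rate."""
    sam_clk = pixel_rate * bpp / SAM_WIDTH
    ramdac_clk = pixel_rate * bpp / RAMDAC_WIDTH
    return sam_clk, ramdac_clk

for bpp in (8, 16, 32):
    print(bpp, clock_rates(bpp, 100.0))
# 8 (50.0, 100.0)
# 16 (100.0, 200.0)
# 32 (200.0, 400.0)
```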

21 Two Frame Buffers
[Block diagram: two VRAMs, each with a 16-bit SAM output and a SAM CE input; MUX (16 to 8 bits); RAMDAC (R, G, B); Frame Select Bit]

22 IBM SGRAM 8 Mb (256K x 32)

23 IBM SGRAM Features
A synchronous DRAM with a few extra features for graphics.
Fill operation support as in VRAM.
Two color registers.
Pipelined architecture that allows the column address to change every cycle.
Precharging one bank while accessing the other bank allows continuous access. This implies that rows would be accessed alternately between banks.
83 MHz, 100 MHz, and 133 MHz clock speeds.

24 Sample Configuration
Assume 83 MHz bandwidth and a 1280 x 1024 x 32 display.
Each SGRAM can hold 256K pixels: (1280 x 1024) / 256K => 5 SGRAM chips.
Assume no double buffering.
Assume refresh rate = 72 Hz (13.9 ms per frame).
Each chip is accessed for 13.9 / 5 = 2.8 ms per frame (1/5, or 20%, of my bandwidth is used for screen refresh).
80% * 83 MHz => 66 MHz bandwidth!
Previous generation DRAMs had only 20 MHz bandwidth.
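
The configuration numbers can be recomputed as follows. This is a sketch of the slide's arithmetic only; the function name and return layout are made up for illustration.

```python
def sgram_config(width, height, pixels_per_chip, refresh_hz, clock_mhz):
    """Chip count, frame time, video time per chip, and leftover bandwidth."""
    chips = -(-(width * height) // pixels_per_chip)  # ceiling division
    frame_ms = 1000.0 / refresh_hz
    video_ms_per_chip = frame_ms / chips             # time each chip spends on refresh
    free_mhz = clock_mhz * (1 - 1 / chips)           # bandwidth not used by refresh
    return chips, frame_ms, video_ms_per_chip, free_mhz

chips, frame_ms, video_ms, free_mhz = sgram_config(
    1280, 1024, 256 * 1024, refresh_hz=72, clock_mhz=83)
print(chips)               # 5
print(round(frame_ms, 1))  # 13.9
print(round(video_ms, 1))  # 2.8
print(round(free_mhz, 1))  # 66.4
```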

25 More Numbers
What can be done with 66 MHz bandwidth? What can be done in 4 * 2.8 ms => 11.2 ms?
Assume I want to read each pixel, modify it, and write it back within one screen refresh time.
If I assume no pipelining of operations, but the operation is done by an attached accelerator, then each pixel takes 3 clocks (1 clk read, 1 clk op, 1 clk write).
83 MHz => 12 ns per clock.
256K pixels * 12 ns * 3 => 9.5 ms.
Plenty of time! Still have 1.7 ms left over.
Double buffering would help; then I would have 10 SGRAM chips and 9 * 2.8 ms => 25.2 ms.
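
The time budget can be checked without the intermediate rounding used on the slide. The sketch below assumes 5 chips, a 72 Hz refresh, and 3 clocks per pixel as stated above; the function name is hypothetical.

```python
def rmw_budget(pixels, clocks_per_pixel, clock_mhz, refresh_hz, chips):
    """Return (time needed, time available) in ms for one chip per frame."""
    needed_ms = pixels * clocks_per_pixel / (clock_mhz * 1e6) * 1e3
    frame_ms = 1000.0 / refresh_hz
    available_ms = frame_ms * (chips - 1) / chips  # frame time minus the refresh share
    return needed_ms, available_ms

needed, available = rmw_budget(256 * 1024, 3, 83, 72, 5)
print(f"needed {needed:.2f} ms, available {available:.2f} ms")
print(available > needed)  # True
```

Unrounded, the margin comes out slightly smaller than the 1.7 ms on the slide, but the conclusion is the same: the read-modify-write pass fits within one refresh period.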

26 Parallel Pixel Ops
The previous example assumed parallel pixel ops to each SDRAM device.
[Diagram, Time 1: Video accesses SDRAM1, so the pixel op to #1 is blocked; PixelOp2 through PixelOp5 access SDRAM2 through SDRAM5.]
[Diagram, Time 2: Video accesses SDRAM2, so the pixel op to #2 is blocked; the other pixel ops proceed.]

27 Non-Parallel Pixel Ops
If the pixel ops are non-parallel, then we only have 2.8 ms! This is the obvious advantage of parallel pixel ops.
[Diagram, Time 1: PixelOp and Video access different SDRAMs; SDRAMs 3, 4, 5 are idle.]
[Diagram, Time 2: PixelOp and Video access move to the next SDRAM; SDRAMs 1, 4, 5 are idle.]
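
The difference between the two schemes can be sketched by counting, for each chip, how many of the five time slots in a frame are usable for pixel ops. The scheduling rule for the non-parallel case (the single pixel op works on the chip after the one video is reading) is read off the diagram and is an assumption of this sketch.

```python
def pixel_op_slots(chips, parallel):
    """Count pixel-op time slots each chip receives during one frame."""
    slots = [0] * chips
    for t in range(chips):
        video = t  # chip being read for screen refresh in this slot
        if parallel:
            # Every chip except the one video is using gets a pixel op
            for c in range(chips):
                if c != video:
                    slots[c] += 1
        else:
            # A single pixel op unit works on the next chip only
            slots[(video + 1) % chips] += 1
    return slots

print(pixel_op_slots(5, parallel=True))   # [4, 4, 4, 4, 4]
print(pixel_op_slots(5, parallel=False))  # [1, 1, 1, 1, 1]
```

At 2.8 ms per slot this gives 11.2 ms per chip with parallel ops versus 2.8 ms without.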

28 Samsung WRAM

