1
High Dynamic Range
Emeka Ezekwe M11
Christopher Thayer M12
Shabnam Aggarwal M13
Charles Fan M14
Manager: Matthew Russo
6/26/2015
2
Agenda
Project Description: Charles
Marketing: Shabnam
Behavioral Description: Emeka
Design Process: Chris
Floorplan Evolution: Shabnam
Design Specifications: Chris
Layout: Charles
Conclusion: Emeka
3
Charles Fan
Project Description
4
High Dynamic Range??
Bright colors are BRIGHT
Dark colors are DARK
Details are seen CLEARLY
Otherwise… colors and lights look distorted & bland
FP HDR format requires 48 bits per pixel
Problem: Too much storage space & memory bandwidth!!
Solution: HDR encoding yields 6:1 compression
OUR GOAL: Implement efficient HDR decoding in hardware
6:1 pixel compression
  Increases usable storage space 6-fold
  Decreases memory bandwidth 6-fold
  Effectively increases performance
6
Shabnam Aggarwal
Marketing
7
AMD’s ATI Mobility Radeon X1900
  48-bit floating point HDR
  HDR compression is currently NOT supported
  Performance hit deters developers
Windows Vista also now requires a high-end GPU to realize its full graphics potential.
Laptops & portable devices are using dedicated processors for graphics
OLED (Organic Light Emitting Diode) displays are being developed by Sony
  Contrast ratio: 1000000:1
9
Marketing: Our HDR Decoder!!
Our decoder is designed to interface between specially encoded textures stored in the GPU’s memory and one of the GPU’s texture caches that feed into the shader processor. Each ROP on (**ATI) is capable of processing 4 pixels per clock cycle. We plan for our hardware to decode the texture information for 4 pixels during each clock cycle. This decoder will allow smaller textures to be stored in the GPU’s memory, which will allow graphics cards to provide the same functions with less memory. Ultimately, this decoder can provide savings in cost, power consumption, heat dissipation, and size in current graphics cards.
10
Marketing
Our HDR Decoder:
  Smaller textures stored in GPU’s memory
  Same functions…less memory
  Savings in:
    Cost
    Power consumption
    Heat dissipation
    Size
HDR is the next generation of display technology
11
Emeka Ezekwe
Behavioral & Algorithmic Description
12
Algorithmic Description: Encoding
Break the texture into 4x4 pixel blocks.
Extract the luminance value of each pixel.
Normalize the red and blue values and average them over each 2x2 block.
Green can be recalculated while decoding.
Allocate more bits to luminance values.
After encoding, a 4x4 block of pixels can be compressed from 48 bpp to 8 bpp.
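The encoding steps can be sketched roughly as follows. This is only an illustration under stated assumptions: the slides do not give the luminance formula or the exact way red and blue are normalized, so the Rec. 709 weights, the chroma definition, and all function names here are hypothetical. Quantization and bit packing (the step that actually reaches 8 bpp) are not modelled.

```python
# Illustrative sketch only; weights, chroma definition and names are assumptions.

# Assumed luminance weights (Rec. 709). The slides do not say which are used.
WR, WG, WB = 0.2126, 0.7152, 0.0722

# Bit budget for one 4x4 block, from the figures on the slide.
UNCOMPRESSED_BITS = 16 * 48  # 768 bits at 48 bpp
COMPRESSED_BITS = 16 * 8     # 128 bits at 8 bpp


def luminance(r, g, b):
    """Weighted sum of the colour channels (assumed definition)."""
    return WR * r + WG * g + WB * b


def encode_block(block):
    """Encode one 4x4 block given as 4 rows of 4 (r, g, b) float tuples.

    Returns the 16 per-pixel luminances (as 4 rows) and one averaged
    (red, blue) chroma pair for each 2x2 sub-block.
    """
    lum = [[luminance(*px) for px in row] for row in block]
    chroma = []
    for by in (0, 2):
        for bx in (0, 2):
            cr_sum = cb_sum = 0.0
            for y in (by, by + 1):
                for x in (bx, bx + 1):
                    r, _, b = block[y][x]
                    l = lum[y][x]
                    if l > 0.0:  # black pixels contribute zero chroma
                        cr_sum += WR * r / l
                        cb_sum += WB * b / l
            chroma.append((cr_sum / 4.0, cb_sum / 4.0))
    return lum, chroma


def decode_pixel(l, cr, cb):
    """Rebuild (r, g, b); green is whatever share red and blue leave over."""
    cg = 1.0 - cr - cb
    return (cr / WR * l, cg / WG * l, cb / WB * l)


def decode_block(lum, chroma):
    """Inverse of encode_block for the unquantized representation."""
    out = []
    for y in range(4):
        row = []
        for x in range(4):
            cr, cb = chroma[(y // 2) * 2 + (x // 2)]
            row.append(decode_pixel(lum[y][x], cr, cb))
        out.append(row)
    return out
```

Because green is never stored in this sketch, a block whose pixels share one chromaticity round-trips exactly (up to floating-point error), while blocks with varying colour inside a 2x2 group lose chroma detail to the averaging.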
13
Algorithmic Description: Decoding (Luminance values)
Reconstruct Lp
  1 logical shift
  1 integer addition
Calculate GQ
  1 integer addition
Calculate final pixel values
  3 floating-point multiplications
Total calculations
  1 logical shift + 2 integer additions + 3 floating-point multiplications
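The operation counts above can be tallied for a whole block. The sketch below assumes the listed totals apply to each decoded pixel, which the slide does not state explicitly; the dictionary keys and the helper name are illustrative.

```python
# Operation counts as listed on the slide, assumed here to be per pixel.
OPS_PER_PIXEL = {
    "logical_shift": 1,  # reconstruct Lp
    "integer_add": 2,    # one for Lp, one for GQ
    "fp_multiply": 3,    # final pixel values
}

PIXELS_PER_BLOCK = 4 * 4


def ops_for(pixels):
    """Scale the per-pixel counts to a given number of pixels."""
    return {name: count * pixels for name, count in OPS_PER_PIXEL.items()}


per_block = ops_for(PIXELS_PER_BLOCK)
per_four_pixels = ops_for(4)

print("per 4x4 block:", per_block)
print("per 4 pixels: ", per_four_pixels)
```

Under that assumption, decoding four pixels at once needs twelve floating-point multiplications, which is why the multiplier dominates the design.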
14
Data Flow
[Figure: data-flow diagram. Recoverable labels: "Find G"; an input "Reg" with labels 7, 7, 4, 4, 4, 4, 8; four "Compute 1 pixel" units; "Int to FP"; twelve "Reg 16" blocks; four "Serialize output" units.]
15
Chris Thayer
Design Process
16
Design Process
Goal: Speed
  400 MHz
  4 pixels per cycle, 4 cycles per block
Architectural decisions
  No denormal support in Floating Point Multiplier
  Pipelined design
  Storing input values
  Integer Multiplication
    Wallace trees
    Booth encoding
  Critical adders
    Carry select
  Integer-Floating Point Conversion
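The speed goal translates into pixel and bandwidth figures with a little arithmetic. The sketch below combines the 400 MHz, 4 pixels per cycle and 4 cycles per block goal with the 8 bpp encoded and 48 bpp decoded sizes quoted in the algorithmic description; the function and key names are illustrative, not part of the design.

```python
# Back-of-the-envelope rates for the stated design goal.
ENCODED_BPP = 8    # bits per pixel after encoding
DECODED_BPP = 48   # bits per pixel of floating-point HDR output


def decoder_rates(clock_hz, pixels_per_cycle=4, cycles_per_block=4):
    """Return throughput figures for a decoder running at clock_hz."""
    pixels_per_second = clock_hz * pixels_per_cycle
    return {
        "pixels_per_second": pixels_per_second,
        "blocks_per_second": clock_hz / cycles_per_block,
        "input_bits_per_second": pixels_per_second * ENCODED_BPP,
        "output_bits_per_second": pixels_per_second * DECODED_BPP,
    }


goal = decoder_rates(400e6)
for name, value in goal.items():
    print(f"{name}: {value:.3g}")
```

At the 400 MHz goal this gives 1.6 billion pixels per second, with the encoded input stream needing one sixth of the bandwidth of the decoded output.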
17
Design Process
Circuit level decisions
  Mirror FA’s to reduce carry-chain delay
  Two different HA’s
  AOI/OAI gates
  Gate sizing along critical paths
  Utilize Q and ~Q outputs from registers
  Clock buffers built into register blocks
  Double/Triple strapped VDD and GND
  Repeaters to break up long wires
  Balanced clock tree
  Device Folding
18
Verification Process
C Implementation
Structural Verilog
Gate Level Schematic
Layout
  Major Modules
  Pipeline Stages
  Global Signals
19
Shabnam Aggarwal
Floorplan Evolution
20
Floorplan Evolution
21
Chris Thayer
Design Specifications
22
Design Specifications
Delays
  Stage one pipeline: 1.8 ns
  Stage two pipeline: 1.53 ns
  Stage three pipeline: 2.479 ns
Skew
  Stage one: x
  Stage two: x
  Stage three: x
Resulting Clock Speed: 500 MHz
2 BILLION pixels per second
Size: 442x453 microns
Aspect Ratio: 1:1.024
Transistors: 42,772
Density: 0.21 T/micron^2
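The derived figures on this slide can be recomputed from the raw numbers. The sketch below does that and, for comparison, also prints the clock frequency that the slowest listed stage delay would allow on its own; the slide lists skew only as "x" and does not say how the 500 MHz figure was derived, so that last value is just a bound implied by the delays. The 4 pixels per cycle comes from the design goal, and the variable names are illustrative.

```python
# Raw figures from the slide.
STAGE_DELAYS_NS = [1.8, 1.53, 2.479]
WIDTH_UM, HEIGHT_UM = 442, 453
TRANSISTORS = 42_772
STATED_CLOCK_HZ = 500e6
PIXELS_PER_CYCLE = 4  # from the design goal

area_um2 = WIDTH_UM * HEIGHT_UM
density = TRANSISTORS / area_um2             # transistors per square micron
aspect_ratio = HEIGHT_UM / WIDTH_UM          # long side over short side
pixels_per_second = STATED_CLOCK_HZ * PIXELS_PER_CYCLE

# Frequency implied by the slowest stage alone, ignoring skew and setup time.
slowest_ns = max(STAGE_DELAYS_NS)
delay_limited_mhz = 1000.0 / slowest_ns

print(f"area: {area_um2} um^2")
print(f"density: {density:.2f} T/um^2")
print(f"aspect ratio: 1:{aspect_ratio:.3f}")
print(f"pixels per second at stated clock: {pixels_per_second:.3g}")
print(f"clock implied by slowest stage: {delay_limited_mhz:.0f} MHz")
```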
23
Charles Fan
Layout
24
Floating Point Multiplier Layout
Pretty beautiful
25
Floating Point Multiplier Data Flow
26
Poly Layer
27
Metal One Layer
28
Metal Two Layer
29
Metal Three Layer
30
Metal Four Layer
31
Questions?