NATIONAL POLYTECHNIC INSTITUTE
COMPUTING RESEARCH CENTER
IPN-CIC MICROSE Lab

Design of a Multimedia Extension for RISC Processor

Ing. Eduardo Jonathan Martínez Montes
Ph.D. Marco Antonio Ramírez Salinas
OUTLINE (Part 1)
I. Thesis Requirements
   1. Tutorial Committee
   2. Objective
   3. Justification
   4. Problem Overview
II. Overview
   1. RISC Processor
   2. Architectures
   3. Vector Processing
   4. SIMD
   5. SIMD vs SISD
   6. Example
   7. State of the Art
III. Work Done
   1. RISC Segmented Processor
   2. Debugger
   3. Program Memory
   4. Data Memory
   5. Register Alias Table
   6. LCD Controller
   7. UART
   8. SRAM Controller
   9. Two Instruction Decode
   10. Two Instruction Queue
OUTLINE (Part 2)
IV. Current Work
   1. Looking for a Multiplier
   2. Redesigning the Rename Unit
   3. Completing the Instruction Set
V. Work as a Research Team
   1. uClinux
VI. Future Work
   1. Implement a Data Bus
THESIS REQUIREMENTS: Tutorial Committee
Name                                      Expertise Area
Ph.D. Marco Antonio Ramírez Salinas       Computer Architecture
Ph.D. Luis Alfonso Villa Vargas           Computer Architecture
Ph.D. Herón Molina Lozano                 VLSI
Ph.D. José Luis Oropeza Rodríguez         Operating Systems
THESIS REQUIREMENTS: Objective
General Objective
Design a multimedia extension unit for a RISC processor (Alligator).
Specific Objectives
Design a vector adder with saturation arithmetic.
Design a multiplier with saturation arithmetic.
Design a divider with saturation arithmetic.
Implement the complete instruction set of the MIPS Digital Media Extension (MDMX).
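To make the "saturation arithmetic" objective concrete, the following is a minimal software sketch of a saturating vector add, not the hardware design itself. The lane layout (eight unsigned 8-bit lanes packed into one 64-bit value) and the helper names `pack`, `unpack`, `add_saturate`, and `add_wrap` are assumptions chosen for illustration.

```python
# Assumed layout for illustration: eight unsigned 8-bit lanes in one 64-bit vector.
LANES = 8
LANE_BITS = 8
LANE_MAX = (1 << LANE_BITS) - 1  # 255


def unpack(vec):
    """Split a packed vector into a list of lane values, lane 0 first."""
    return [(vec >> (LANE_BITS * i)) & LANE_MAX for i in range(LANES)]


def pack(lanes):
    """Pack lane values (lane 0 first) into a single integer."""
    vec = 0
    for i, value in enumerate(lanes):
        vec |= (value & LANE_MAX) << (LANE_BITS * i)
    return vec


def add_saturate(a, b):
    """Lane-wise add; a lane that would overflow is clamped to LANE_MAX."""
    return pack(min(x + y, LANE_MAX) for x, y in zip(unpack(a), unpack(b)))


def add_wrap(a, b):
    """Lane-wise add with ordinary wrap-around, shown for comparison."""
    return pack((x + y) & LANE_MAX for x, y in zip(unpack(a), unpack(b)))
```

With wrap-around, a bright pixel of 250 plus 10 becomes 4 (nearly black), whereas the saturating version clamps it to 255. This is why saturation is preferred for image and audio data.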
THESIS REQUIREMENTS: Justification
Alligator is a superscalar embedded processor currently under development.
It is intended to support research and teaching.
The processor requires many blocks to be designed and built, so this thesis is one part of a larger project.
THESIS REQUIREMENTS: Problem Overview
The Multimedia Extension is a vector machine embedded on the same chip as the main superscalar processor; it is used to handle multimedia applications.
Integrate the Multimedia Extension architecture as a coprocessor to the superscalar processor.
Integrate the MDMX instruction set as part of the ISA in the Decode stage.
Deal with the memory challenges of sharing data.
OVERVIEW: RISC Processor
Reduced Instruction Set Computing (RISC).
The main idea is to keep the design simple.
OVERVIEW: Architectures
SISD: Scalar processor; operates on only one datum at a time.
MIMD: Superscalar processor; exploits parallelism in the instruction stream.
SIMD: Vector processor; exploits parallelism in the data stream.
OVERVIEW: SIMD
Single Instruction, Multiple Data: this architecture performs the same operation on multiple data elements in parallel.
OVERVIEW: SIMD vs SISD
OVERVIEW: SIMD Example (part 1)
Example: computing the negative of an image
OVERVIEW: SIMD Example (part 2)
Normal Processing
OVERVIEW: SIMD Example (part 3)
Parallel Processing
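The negative-image example can be sketched in software as follows. This is only an illustration under stated assumptions: pixels are 8-bit grayscale values, and the "parallel" path emulates a 64-bit SIMD register by packing eight pixels into one integer and inverting them with a single XOR. The function names are hypothetical.

```python
WORD_BYTES = 8                 # assumed: eight 8-bit pixels per 64-bit word
WORD_MASK = (1 << 64) - 1      # all ones across the eight lanes


def negative_scalar(pixels):
    """Normal processing: one subtraction per pixel."""
    return [255 - p for p in pixels]


def negative_packed(pixels):
    """Parallel processing: one XOR inverts eight packed pixels at once.

    For an 8-bit value p, 255 - p equals p XOR 0xFF, so XOR-ing the whole
    word with all ones negates every lane without any carry between lanes.
    """
    data = bytes(pixels)
    out = bytearray()
    for offset in range(0, len(data), WORD_BYTES):
        chunk = data[offset:offset + WORD_BYTES]
        word = int.from_bytes(chunk, "little")
        inverted = word ^ WORD_MASK
        # Keep only as many bytes as the chunk held (the last one may be short).
        out += inverted.to_bytes(WORD_BYTES, "little")[:len(chunk)]
    return list(out)
```

Both functions return the same image; the packed version needs roughly one eighth as many operations, which is the benefit the SIMD slides illustrate.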
OVERVIEW: State of the Art
AVX2 - Intel, 2013
Sandy Bridge and Bulldozer - Intel and AMD, 2011
Advanced Vector Extensions (AVX) - Intel, 2008
SSE4 - Intel, 2006
SSE and SSE2 - AMD, 2004
SSE3 - Intel, 2004
Advanced 3DNow! (3DNow! 2) - AMD, 2003
AltiVec - IBM, 2002
SSE2 - Intel
3DNow! - AMD, 2000
Streaming SIMD Extensions (SSE) - Intel, 1999
Pentium II (MMX) - Intel, 1998
AltiVec - Motorola
WORK DONE: RISC Segmented Processor
WORK DONE: Debugger (part 1)
Definition
Every time you create something new, such as a program or, in this case, new hardware, you need a way to test it, trace faults, and fix them.
Developers and engineers alike are familiar with debugger tools.
WORK DONE: Debugger (part 2)
Features
Friendly GUI
Load and download the Program Memory
Load and download the Data Memory
View the registers
Reset the processor
Pause the processor
Run the processor step by step
Use breakpoints
Change the clock frequency
In fact, it can work without a GUI!
WORK DONE: Debugger (part 3)
WORK DONE: The Program Memory (L1 cache)
Implemented in dedicated memory (M9K)
1 write port
2 read ports
Size: 512 bytes
LC Combinationals / LC Registers / Memory Bytes: 6591,024
WORK DONE: The Data Memory (L1 cache)
Implemented in dedicated memory (M9K)
2 write ports
5 read ports
Size: 512 bytes
LC Combinationals / LC Registers / Memory Bytes: 96326,144
WORK DONE: Register Alias Table
Implemented in dedicated memory
6 write ports
12 read ports
128 registers of 32 bits
LC Combinationals / LC Registers / Memory Bytes: 1, ,544
WORK DONE: LCD Controller
A state machine reads a 32-register memory (32x8).
Characters only need to be written to this memory; the controller does the rest.
LC Combinationals / LC Registers / Memory Bytes
WORK DONE: UART
9600 bps, 8N1
LC Combinationals / LC Registers / Memory Bytes: 44540
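For readers unfamiliar with the notation, "9600 bps 8N1" means each byte travels as one start bit, eight data bits, no parity bit, and one stop bit. The sketch below models that framing in software; it is an illustration of the line format only, not the controller's implementation, and the function names are hypothetical.

```python
BAUD = 9600  # bits per second, as stated on the slide


def frame_8n1(byte):
    """Return the 10 line bits for one byte: start (0), 8 data bits LSB first, stop (1)."""
    data_bits = [(byte >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]


def unframe_8n1(bits):
    """Recover the byte from a 10-bit frame, rejecting bad start or stop bits."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))


# Derived figures: every byte costs 10 bit times on the line.
BIT_TIME_US = 1_000_000 / BAUD   # about 104.17 microseconds per bit
BYTES_PER_SECOND = BAUD // 10    # 960 bytes per second at most
```

The derived figures show the practical limit of this link: at most 960 bytes per second.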
WORK DONE: SRAM Controller
LC Combinationals / LC Registers / Memory Bytes: 4900
WORK DONE: Two Instruction Decode
LC Combinationals / LC Registers / Memory Bytes
WORK DONE: Two Instruction Queue
Implemented in dedicated memory
2 write ports
2 read ports
16 registers
Circular queue
LC Combinationals / LC Registers / Memory Bytes
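The behaviour of a 16-entry circular queue with two write ports and two read ports can be modelled in a few lines. This is a behavioural sketch under assumptions: the class name `TwoWideQueue`, its methods, and the policy of accepting as many of the two writes as fit are illustrative choices, not details taken from the actual design.

```python
class TwoWideQueue:
    """Behavioural model of a circular queue that moves up to two entries per cycle."""

    DEPTH = 16  # 16 registers, as listed on the slide

    def __init__(self):
        self.slots = [None] * self.DEPTH
        self.head = 0    # index of the oldest entry
        self.tail = 0    # index of the next free slot
        self.count = 0   # number of valid entries

    def push(self, items):
        """Write up to two items in one cycle; return how many were accepted."""
        accepted = 0
        for item in items[:2]:
            if self.count == self.DEPTH:
                break  # queue full: remaining writes are refused
            self.slots[self.tail] = item
            self.tail = (self.tail + 1) % self.DEPTH  # wrap around
            self.count += 1
            accepted += 1
        return accepted

    def pop(self):
        """Read up to two items in one cycle, oldest first."""
        out = []
        while len(out) < 2 and self.count > 0:
            out.append(self.slots[self.head])
            self.head = (self.head + 1) % self.DEPTH  # wrap around
            self.count -= 1
        return out
```

The modulo on `head` and `tail` is what makes the queue circular: slots freed at the front are reused by later writes without moving any data.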
CURRENT WORK: Looking for a Multiplier
Requirements: fast, signed, unsigned
Comparison columns: Logical Elements, Propagation Time (ns)
Candidates compared:
Ripple Carry Adder
Kogge-Stone Adder
Operator "x"
LPM Soft
LPM Hardware
Operator "+"
Parallel Adder
Lookahead (4 bits)
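Two of the candidates above, the ripple carry adder and the Kogge-Stone adder, differ in how the carry travels. The following bit-level sketch illustrates the difference under assumptions: a 32-bit width is chosen for illustration, and the function names are hypothetical. It models the logic only and gives no timing or area figures for the actual FPGA.

```python
WIDTH = 32                  # assumed operand width for illustration
MASK = (1 << WIDTH) - 1


def ripple_carry_add(a, b):
    """Add bit by bit; the carry passes through all WIDTH positions in sequence."""
    carry = 0
    result = 0
    for i in range(WIDTH):
        x = (a >> i) & 1
        y = (b >> i) & 1
        result |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return result, carry


def kogge_stone_add(a, b):
    """Compute all carries with a parallel prefix network of log2(WIDTH) stages."""
    a &= MASK
    b &= MASK
    generate = a & b       # bit i creates a carry on its own
    propagate = a ^ b      # bit i passes an incoming carry along
    half_sum = propagate
    shift = 1
    while shift < WIDTH:   # 5 stages for 32 bits, instead of 32 ripple steps
        generate |= propagate & (generate << shift)
        propagate &= propagate << shift
        shift <<= 1
    carries_in = (generate << 1) & MASK       # carry into each bit position
    carry_out = (generate >> (WIDTH - 1)) & 1
    return half_sum ^ carries_in, carry_out
```

Both functions return the same `(sum, carry_out)` pair. The Kogge-Stone version reaches it in a logarithmic number of stages at the cost of more logic per stage, which is the trade-off between logical elements and propagation time that the comparison measures.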
WORK AS A RESEARCH TEAM: Booting uClinux
Booting uClinux on the Alligator processor
FUTURE WORK: Data Bus (part 1)
SRAM controller
SDRAM controller
Flash controller
UART controller
LCD controller
FUTURE WORK: Data Bus (part 2)
Q&A