NATIONAL POLYTECHNIC INSTITUTE, COMPUTING RESEARCH CENTER (IPN-CIC), MICROSE Lab. Design of a Multimedia Extension for RISC Processor. Ing. Eduardo Jonathan Martínez Montes.


Presentation transcript:

NATIONAL POLYTECHNIC INSTITUTE, COMPUTING RESEARCH CENTER
IPN-CIC MICROSE Lab
Design of a Multimedia Extension for RISC Processor
Ing. Eduardo Jonathan Martínez Montes
Ph.D. Marco Antonio Ramírez Salinas

OUTLINE (Part 1)
I. Thesis Requirements
   1. Tutorial Committee
   2. Objective
   3. Justification
   4. Problem Overview
II. Overview
   1. RISC Processor
   2. Architectures
   3. Vector Processing
   4. SIMD
   5. SIMD vs SISD
   6. Example
   7. State of the Art
III. Work Done
   1. RISC Segmented Processor
   2. Debugger
   3. Program Memory
   4. Data Memory
   5. Register Alias Table
   6. LCD Controller
   7. UART
   8. SRAM Controller
   9. Two Instruction Decode
   10. Two Instruction Queue

OUTLINE (Part 2)
IV. Current Work
   1. Looking for a Multiplier
   2. Redesigning the Rename Unit
   3. Completing the Instruction Set
V. Work as a Research Team
   1. uClinux
VI. Future Work
   1. Implement a Data Bus

THESIS REQUIREMENTS: Tutorial Committee
Name                                      Expertise Area
Ph.D. Marco Antonio Ramírez Salinas       Computer Architecture
Ph.D. Luis Alfonso Villa Vargas           Computer Architecture
Ph.D. Herón Molina Lozano                 VLSI
Ph.D. José Luis Oropeza Rodríguez         Operating Systems

THESIS REQUIREMENTS: Objective
General Objective
Design a multimedia extension unit for a RISC processor (Alligator).
Specific Objectives
- Design a vector adder with saturation arithmetic.
- Design a multiplier with saturation arithmetic.
- Design a divider with saturation arithmetic.
- Implement the complete instruction set of the MIPS Digital Media eXtension (MDMX).
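To illustrate what "saturation arithmetic" means for the vector adder, the following is a minimal C sketch, not the Alligator hardware design. It assumes eight unsigned 8-bit lanes packed into one 64-bit word; the lane layout and the function names `sat_add_u8` and `vec_sat_add_u8x8` are hypothetical and chosen only for this example.

```c
#include <stdint.h>

/* Saturating unsigned 8-bit add: the result clamps at 255 instead of
 * wrapping around, which is what avoids bright pixels turning dark. */
uint8_t sat_add_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* Lane-wise saturating add over eight unsigned bytes packed in 64 bits.
 * Assumed layout: lane 0 is the least significant byte. A hardware unit
 * would compute all lanes at once; the loop only models the result. */
uint64_t vec_sat_add_u8x8(uint64_t x, uint64_t y)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 8; lane++) {
        uint8_t a = (uint8_t)(x >> (8 * lane));
        uint8_t b = (uint8_t)(y >> (8 * lane));
        result |= (uint64_t)sat_add_u8(a, b) << (8 * lane);
    }
    return result;
}
```

With wrapping arithmetic, 200 + 100 in an 8-bit lane would give 44; with saturation it gives 255.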

THESIS REQUIREMENTS: Justification
Alligator is a superscalar embedded processor, currently under development. It is intended to support research and teaching. The processor requires many blocks to be designed and built, so this project is one part of a larger project.

THESIS REQUIREMENTS: Problem Overview
The multimedia extension is a vector machine embedded on the same chip as the main superscalar processor; it is used to handle multimedia applications.
- Integrate the multimedia extension architecture as a coprocessor of the superscalar processor.
- Integrate the MDMX instruction set into the ISA at the decode stage.
- Deal with the memory challenges of sharing data.

THESIS REQUIREMENTS: Problem Overview (figure)

THESIS REQUIREMENTS: Problem Overview (figure)

OVERVIEW: RISC Processor
Reduced Instruction Set Computing (RISC). The main idea is to keep the design simple.

OVERVIEW: Architectures
- SISD: scalar processor; operates on only one datum at a time.
- MIMD: superscalar processor; exploits parallelism in the instruction stream.
- SIMD: vector processor; exploits parallelism in the data stream.

OVERVIEW: SIMD
Single Instruction, Multiple Data: this architecture performs the same operation on multiple data elements in parallel.

OVERVIEW: SIMD vs SISD (figure)

OVERVIEW: SIMD Example (part 1)
Example: computing the negative of an image.

OVERVIEW: SIMD Example (part 2)
Normal processing (figure)

OVERVIEW: SIMD Example (part 3)
Parallel processing (figure)
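The negative-image example can be sketched in C as follows. This is only an illustration of the idea, assuming 8-bit grayscale pixels; the function names are hypothetical, and the "parallel" version uses a plain 64-bit word to stand in for a vector register rather than any real MDMX instruction. It relies on the fact that, for an 8-bit value p, 255 - p equals the bitwise complement of p.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Normal (scalar) processing: one pixel per operation. */
void negate_scalar(uint8_t *pixels, size_t n)
{
    for (size_t i = 0; i < n; i++)
        pixels[i] = (uint8_t)(255 - pixels[i]);
}

/* Parallel processing: eight pixels per operation. Complementing the
 * 64-bit word complements each of its eight bytes independently, so one
 * operation produces eight negated pixels. */
void negate_packed(uint8_t *pixels, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        uint64_t word;
        memcpy(&word, pixels + i, sizeof word);  /* load 8 pixels  */
        word = ~word;                            /* negate all 8   */
        memcpy(pixels + i, &word, sizeof word);  /* store 8 pixels */
    }
    for (; i < n; i++)                           /* leftover pixels */
        pixels[i] = (uint8_t)(255 - pixels[i]);
}
```

Both functions produce the same image; the packed version simply needs about one eighth as many arithmetic operations, which is the benefit the SIMD vs SISD comparison points at.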

OVERVIEW: State of the Art
- AVX2 - Intel, 2013
- Sandy Bridge and Bulldozer - Intel and AMD, 2011
- Advanced Vector Extensions (AVX) - Intel, 2008
- SSE4 - Intel, 2006
- SSE and SSE2 - AMD, 2004
- SSE3 - Intel, 2004
- Advanced 3DNow! (3DNow! 2) - AMD, 2003
- AltiVec - IBM, 2002
- SSE2 - Intel
- 3DNow! - AMD, 2000
- Streaming SIMD Extensions (SSE) - Intel, 1999
- Pentium II (MMX) - Intel, 1998
- AltiVec - Motorola

WORK DONE: RISC Segmented (Pipelined) Processor (figure)

WORK DONE: Debugger (part 1)
Definition
Whenever you create something new, such as a program or, in this case, new hardware, you need a way to test it, trace faults, and fix them. Every developer and engineer knows what a debugger is.

WORK DONE: Debugger (part 2)
Features
- Friendly GUI
- Load and download the program memory
- Load and download the data memory
- View the registers
- Reset the processor
- Pause the processor
- Run the processor step by step
- Use breakpoints
- Change the clock frequency
In fact, it can work without a GUI!

WORK DONE: Debugger (part 3) (figure)

WORK DONE: The Program Memory (L1 cache)
- Implemented in dedicated memory (M9K)
- 1 write port
- 2 read ports
- Size: 512 bytes
Resource usage (LC Combinationals, LC Registers, Memory Bytes): 6591,024

WORK DONE: The Data Memory (L1 cache)
- Implemented in dedicated memory (M9K)
- 2 write ports
- 5 read ports
- Size: 512 bytes
Resource usage (LC Combinationals, LC Registers, Memory Bytes): 96326,144

WORK DONE: Register Alias Table
- Implemented in dedicated memory
- 6 write ports
- 12 read ports
- 128 registers of 32 bits
Resource usage (LC Combinationals, LC Registers, Memory Bytes): 1, ,544

WORK DONE: LCD Controller
- A state machine reads a 32-entry register memory (32x8).
- Characters only need to be written to that memory; the controller does the rest.
Resource usage (LC Combinationals, LC Registers, Memory Bytes): values missing

WORK DONE: UART
- 9600 bps
- 8N1
Resource usage (LC Combinationals, LC Registers, Memory Bytes): 44540
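As a reminder of what the 9600 bps 8N1 setting implies, here is a small C sketch of the frame layout. It is a software model under the usual 8N1 convention (one start bit, eight data bits sent least significant bit first, no parity, one stop bit), not the UART block itself; the function names are hypothetical.

```c
#include <stdint.h>

/* An 8N1 frame has 10 bit times: 1 start + 8 data + 0 parity + 1 stop. */
enum { UART_FRAME_BITS = 10 };

/* Build the frame as a 10-bit value, transmitted from bit 0 upward:
 *   bit 0     = start bit (always 0)
 *   bits 1..8 = data, least significant bit first
 *   bit 9     = stop bit (always 1) */
uint16_t uart_frame_8n1(uint8_t data)
{
    return (uint16_t)(((unsigned)data << 1) | (1u << 9));
}

/* Payload throughput for back-to-back frames at a given baud rate. */
unsigned uart_bytes_per_second(unsigned baud)
{
    return baud / UART_FRAME_BITS;
}
```

At 9600 bps this gives at most 960 bytes per second, since every byte costs ten bit times on the line.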

WORK DONE: SRAM Controller
Resource usage (LC Combinationals, LC Registers, Memory Bytes): 4900

WORK DONE: Two Instruction Decode
Resource usage (LC Combinationals, LC Registers, Memory Bytes): values missing

WORK DONE: Two Instruction Queue
- Implemented in dedicated memory
- 2 write ports
- 2 read ports
- 16 registers
- Circular queue
Resource usage (LC Combinationals, LC Registers, Memory Bytes): values missing
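The circular-queue behavior can be modeled in C as below. This is a behavioral sketch only: the type and function names are hypothetical, entries are assumed to be 32-bit instruction words, and the two write and two read ports of the hardware are modeled simply as up to two `queue_push` and two `queue_pop` calls per cycle.

```c
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_SLOTS 16u  /* 16 registers, as on the slide */

typedef struct {
    uint32_t slot[QUEUE_SLOTS];
    unsigned head;   /* index of the oldest entry (next read)    */
    unsigned tail;   /* index of the next free slot (next write) */
    unsigned count;  /* number of valid entries                  */
} instr_queue;

void queue_init(instr_queue *q)
{
    q->head = 0;
    q->tail = 0;
    q->count = 0;
}

/* Enqueue one instruction; returns false if the queue is full. */
bool queue_push(instr_queue *q, uint32_t instr)
{
    if (q->count == QUEUE_SLOTS)
        return false;
    q->slot[q->tail] = instr;
    q->tail = (q->tail + 1u) % QUEUE_SLOTS;  /* wrap around */
    q->count++;
    return true;
}

/* Dequeue the oldest instruction; returns false if the queue is empty. */
bool queue_pop(instr_queue *q, uint32_t *instr)
{
    if (q->count == 0)
        return false;
    *instr = q->slot[q->head];
    q->head = (q->head + 1u) % QUEUE_SLOTS;  /* wrap around */
    q->count--;
    return true;
}
```

Because the indices wrap, no entry ever has to be moved: a full queue stalls the producer and an empty queue stalls the consumer.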

CURRENT WORK: Looking for a Multiplier
Fast, signed, unsigned.
Comparison columns: Logic Elements, Propagation Time (ns).
Rows: Ripple Carry Adder, Kogge-Stone Adder, Operator "x", LPM Soft, LPM Hardware, Operator "+", Parallel Adder, Lookahead (4 bits).
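To show why the ripple-carry and Kogge-Stone adders in the comparison differ in propagation time, here is a bit-level C model of both for 32-bit operands. It is a software sketch of the two carry schemes, not the evaluated hardware, and the function names are hypothetical. The ripple-carry version resolves carries one bit after another (32 dependent steps), while the Kogge-Stone version combines generate and propagate signals over distances 1, 2, 4, 8 and 16 (5 steps).

```c
#include <stdint.h>

/* Ripple-carry: the carry into bit i depends on the carry out of bit i-1. */
uint32_t ripple_carry_add(uint32_t a, uint32_t b)
{
    uint32_t sum = 0, carry = 0;
    for (unsigned i = 0; i < 32; i++) {
        uint32_t ai = (a >> i) & 1u;
        uint32_t bi = (b >> i) & 1u;
        sum |= (ai ^ bi ^ carry) << i;
        carry = (ai & bi) | (carry & (ai ^ bi));
    }
    return sum;
}

/* Kogge-Stone: parallel-prefix computation of all carries. */
uint32_t kogge_stone_add(uint32_t a, uint32_t b)
{
    uint32_t p = a ^ b;          /* per-bit propagate */
    uint32_t g = a & b;          /* per-bit generate  */
    uint32_t half_sum = p;       /* sum bits before carries are applied */

    for (unsigned d = 1; d < 32; d <<= 1) {
        g |= p & (g << d);       /* group generate over a span of 2*d bits  */
        p &= p << d;             /* group propagate over a span of 2*d bits */
    }
    /* g now holds the carry out of every bit; shift to get carries in. */
    return half_sum ^ (g << 1);
}
```

In hardware the Kogge-Stone structure trades more logic elements for a shorter critical path, which is the trade-off the two comparison columns capture.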

CURRENT WORK: Looking for a Multiplier (figure)

WORK AS A RESEARCH TEAM: Booting uClinux
Booting uClinux on the Alligator processor.

FUTURE WORK: Data Bus (part 1)
- SRAM controller
- SDRAM controller
- Flash controller
- UART controller
- LCD controller

FUTURE WORK: Data Bus (part 2) (figure)

Q&A