More Realistic 16-Tap FIR. Presented by Lihua DONG and Deyan LIU.

Presentation transcript:

Overview
  - FIR Architecture
  - Original Design
  - Arithmetic Improvement
      - Parallel Multiplication
      - Tree Addition
      - Carry-Save-Adder
  - Implemented Datapath
  - Simulation Waveform
  - Synthesis Results

FIR Architecture
  - FIR ASIC Design Overview

FIR Architecture (cont'd)
  - FIR Basic Structure

Original Design
  - Sequential arithmetic operations on “+” and “*”: each step performs one multiplication and adds the product to the accumulator.

      acc <= rin * c0;
      acc <= rs1 * c1 + acc;
      acc <= rs2 * c2 + acc;
      ...
      acc <= rs15 * c15 + acc;
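The sequential scheme above can be sketched as a small software model. This is only a behavioral illustration in Python, not the presenters' HDL; the function name `fir_sequential` and the sample and coefficient values are assumptions made for the example.

```python
def fir_sequential(taps, coeffs):
    """Accumulate one product per step, mirroring acc <= rs_k * c_k + acc."""
    acc = taps[0] * coeffs[0]              # acc <= rin * c0
    steps = 1
    for k in range(1, len(coeffs)):
        acc = taps[k] * coeffs[k] + acc    # acc <= rs_k * c_k + acc
        steps += 1
    return acc, steps


# Hypothetical data: taps[0] stands for rin, taps[1..15] for rs1..rs15.
taps = list(range(1, 17))
coeffs = [1] * 16
result, steps = fir_sequential(taps, coeffs)
print(result, steps)
```

Each step depends on the previous value of `acc`, so a 16-tap filter needs 16 dependent multiply-add steps in this model.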

Arithmetic Improvement
  Observation
  - The “+” and “*” operations are needlessly interleaved.
  - There is no data dependency among the “*” operations:

      rin * c0;
      rs1 * c1;
      rs2 * c2;
      ...
      rs15 * c15;

Arithmetic Improvement (cont'd)
  Improvement Strategy I
  - Partition the “+” and “*” operations
  - 16 parallel “*”

      rin <= sample;
      tmp0 <= sample * c0;
      tmp1 <= rs1 * c1;
      tmp2 <= rs2 * c2;
      tmp3 <= rs3 * c3;
      ...
      tmp15 <= rs15 * c15;
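A minimal Python sketch of Strategy I, assuming the products are simply held in a list that stands in for the registers tmp0..tmp15. The function names are hypothetical. The point it illustrates is that no product reads another product, so all of them could be computed at the same time in hardware.

```python
def parallel_products(taps, coeffs):
    """Compute every tmp_k <= rs_k * c_k; no entry depends on another."""
    return [x * c for x, c in zip(taps, coeffs)]


def combine(products):
    """Separate addition stage that consumes the registered products."""
    return sum(products)


# Hypothetical 4-tap data to keep the example short.
tmp = parallel_products([1, 2, 3, 4], [4, 3, 2, 1])
print(tmp, combine(tmp))
```

Splitting the work this way moves the multiplications off the accumulator's dependency chain, leaving only the additions to be optimized.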

Arithmetic Improvement (cont'd)
  Critical Path
  - A very long single “+” statement:

      result <= tmp0 + tmp1 + tmp2 + ... + tmp15;

Arithmetic Improvement (cont'd)
  Improvement Strategy II
  - Partition the “+” operations by level
  - Tree-structured “+”
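To illustrate why a tree shortens the critical path, here is a hedged Python model of pairwise, level-by-level addition. The function name `adder_tree` is an assumption, and the sketch expects a non-empty input. It returns the sum together with the number of adder levels a value passes through.

```python
def adder_tree(values):
    """Sum a non-empty sequence pairwise; return (total, number of levels)."""
    level = list(values)
    depth = 0
    while len(level) > 1:
        # Add neighbours in pairs; every pair in a level is independent.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])   # odd element passes through to the next level
        level = nxt
        depth += 1
    return level[0], depth


total, depth = adder_tree(range(16))
print(total, depth)
```

For 16 operands the tree still uses 15 additions, but the longest path crosses 4 adder levels instead of the 15 that a single left-to-right chain would need.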

Arithmetic Improvement (cont'd)
  Improvement Strategy III
  - Carry-Save-Adder (CSA)
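The slides do not show how the CSAs were wired, so the following is only a generic sketch of the carry-save idea in Python, assuming non-negative integer operands. A 3:2 carry-save stage turns three operands into a sum word and a carry word without propagating carries; one ordinary addition is needed only at the very end. The names `csa` and `csa_sum` are hypothetical.

```python
def csa(a, b, c):
    """3:2 compressor: returns (sum_word, carry_word) with a+b+c == sum+carry."""
    sum_word = a ^ b ^ c                            # bitwise sum, carries ignored
    carry_word = ((a & b) | (a & c) | (b & c)) << 1  # carries, shifted into place
    return sum_word, carry_word


def csa_sum(values):
    """Reduce operands with CSA stages, then do one carry-propagate add."""
    ops = list(values)
    while len(ops) > 2:
        a, b, c = ops.pop(), ops.pop(), ops.pop()
        ops.extend(csa(a, b, c))   # three operands become two
    return sum(ops)                # the single final carry-propagate addition


print(csa(5, 6, 7), csa_sum(range(1, 17)))
```

This model reduces the operands one stage at a time for simplicity; a hardware design would typically arrange the stages in parallel levels.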

Implemented Datapath
  - Combinational logic for addition
  - Sequential logic for multiplication
  - Two-state FIR filter design
      - 1st state: data-waiting and multiplication
      - 2nd state: addition and register-shifting
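The two states can be mimicked in a small behavioral model. This Python sketch is an assumption-laden illustration, not the implemented datapath: the class `TwoStateFir`, its method names, and the use of `rs[0]` to play the role of `rin` are all invented for the example.

```python
class TwoStateFir:
    """Behavioral model: state 1 multiplies, state 2 adds and shifts."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.rs = [0] * len(self.coeffs)    # rs[0] stands in for rin
        self.tmp = [0] * len(self.coeffs)   # registered products

    def multiply_state(self, sample):
        # 1st state: latch the new sample and compute all products.
        self.rs[0] = sample
        self.tmp = [x * c for x, c in zip(self.rs, self.coeffs)]

    def add_shift_state(self):
        # 2nd state: add the products and shift the sample registers.
        result = sum(self.tmp)
        self.rs = [0] + self.rs[:-1]        # rs[k+1] <= rs[k]
        return result

    def step(self, sample):
        self.multiply_state(sample)
        return self.add_shift_state()


fir = TwoStateFir([1, 2, 3])                 # hypothetical 3-tap coefficients
print([fir.step(x) for x in [1, 0, 0, 0]])   # impulse input
```

Feeding an impulse returns the coefficients in order, which is a quick sanity check that the shift and multiply states cooperate as a FIR filter.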

Simulation Waveform
  - Only one cycle of input-to-output delay

Synthesis Results
  - One Clock Cycle == 6 ns
  - Clock Frequency == 167 MHz
  - Total Cell Area ==