Design and VLSI implementation of a digital audio-specific DSP core for MP3/AAC Kyoung Ho Bang, Nam Hun Jeong, Joon Seok Kim, Young Cheol Park and Dae.

Presentation transcript:

Design and VLSI implementation of a digital audio-specific DSP core for MP3/AAC
Kyoung Ho Bang, Nam Hun Jeong, Joon Seok Kim, Young Cheol Park and Dae Hee Youn
IEEE Transactions on Consumer Electronics, page(s):
Presenter: 陳世偉 (Shi-Wei Chen)
Instructor: Prof. 黃英哲

92/03/24 Seminar – Shi-Wei Chen
Outline
• Introduction
• System architecture
• Instruction set
• System efficiency
• Conclusion

Introduction
• Standardized audio compression methods
  – MP3 (MPEG-1 Layer 3), AAC (Advanced Audio Coding)
• The consumer market demands
  – High compression ratio
  – Transparent quality
• Hardware performance is also rising.
  – DSP, ASIC, microprocessor/microcontroller

Comparison with CPUs

DSP
• Advantages: 1. Suited to signal processing tasks 2. High performance 3. Advanced control techniques 4. Additional functions
• Disadvantages: 1. Limited peripherals 2. High power consumption 3. High hardware cost

Microprocessor / Microcontroller
• Advantages: 1. On-chip peripherals 2. Supervisory functions 3. Familiar architecture 4. Low power consumption
• Disadvantages: 1. Low performance 2. Computation delay 3. Numerical problems 4. Limited hardware resources

ASIC
• Advantages: 1. Tailored to a particular application 2. High performance 3. Built by connecting existing circuits
• Disadvantages: 1. Hard to modify once the target algorithm changes (no flexibility) 2. Time-consuming and error-prone to design 3. Unsuitable for reuse

DSP + ASIC + accelerator = digital audio-specific DSP core

Focus aspects

Aspect: High-quality audio coding
• Method: 1. MPEG-1/2 Layer II / III 2. MPEG-2 AAC 3. AC-3

Aspect: Low power consumption
• Method:
  1. Minimum hardware resources
     – Single ALU
     – 3-stage pipelining
     – Harvard architecture
  2. Disable unused hardware resources
     – Using latches

Aspect: Easy programming
• Method: 1. Operation scheduling 2. Hardware resource allocation

Audio-specific DSP features
• Data processing unit
  – 20-bit data, 48-bit ALU, accelerator
  – Multiplier: 20-bit × 20-bit
    » signed × signed, signed × unsigned, unsigned × unsigned
  – Convergent rounder, limiter
• More than 18-bit PCM output
• One-cycle MAC for F/T transform processing
• 2048 modulo addressing for management of the 2048-size buffer of AAC
• 512-point FFT for AC-3 and AAC
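The data-path features on this slide (20-bit operands, a 48-bit accumulator, a convergent rounder and a limiter) can be illustrated with a small bit-accurate model. This is only a sketch: the function names, the Q1.19 fractional interpretation of the 20-bit words and the two's-complement wrap of the accumulator are assumptions made for illustration, not details taken from the paper.

```python
DATA_BITS = 20   # data word width (from the slide)
ACC_BITS = 48    # ALU / accumulator width (from the slide)


def wrap_signed(value, bits):
    """Wrap an integer into the signed two's-complement range of `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac(acc, x, y):
    """Multiply-accumulate: acc + x * y, held in a 48-bit accumulator (assumed to wrap)."""
    return wrap_signed(acc + x * y, ACC_BITS)


def convergent_round(value, shift):
    """Drop `shift` low bits, rounding to nearest with ties going to the even result."""
    if shift == 0:
        return value
    half = 1 << (shift - 1)
    quotient = value >> shift              # floor division, also for negatives
    remainder = value & ((1 << shift) - 1)
    if remainder > half or (remainder == half and quotient & 1):
        quotient += 1
    return quotient


def limit(value, bits=DATA_BITS):
    """Limiter: saturate to the signed range of a `bits`-bit word."""
    high = (1 << (bits - 1)) - 1
    low = -(1 << (bits - 1))
    return max(low, min(high, value))


# Hypothetical usage: 0.5 * 0.5 in an assumed Q1.19 format.
half_q19 = 1 << 18
acc = mac(0, half_q19, half_q19)
result = limit(convergent_round(acc, 19))
print(result == 1 << 17)  # 0.25 in Q1.19
```

Convergent rounding avoids the small positive bias that plain round-half-up accumulates over many samples, which is one plausible reason it appears next to the limiter in an audio data path.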

System architecture
• 3-stage pipeline architecture
  – Instruction fetch stage
  – Instruction decode stage
  – Execution stage
  – One instruction per clock cycle
    » Except branch instructions (2 clock cycles)
• Harvard architecture
  – Program memory
  – Data memory
• Load-store architecture
• MAC = MU + ALU
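The timing rule on this slide (one instruction per clock cycle, two for a branch) is enough for a rough cycle estimate of a code sequence. The sketch below assumes a hypothetical trace of instruction kinds and ignores pipeline fill latency; the instruction names are placeholders, not mnemonics from the actual instruction set.

```python
# Cycle costs stated on the slide: 1 cycle per instruction, 2 for a branch.
BRANCH_CYCLES = 2
DEFAULT_CYCLES = 1


def count_cycles(trace):
    """Estimate execution cycles for a sequence of instruction kinds.

    Pipeline fill latency is ignored (an assumption of this sketch).
    """
    return sum(BRANCH_CYCLES if kind == "branch" else DEFAULT_CYCLES
               for kind in trace)


# Hypothetical loop body: four single-cycle instructions and one branch.
loop_body = ["load", "mac", "mac", "store", "branch"]
print(count_cycles(loop_body))  # 6
```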

DSP architecture
[Figure: block diagram of the DSP core. Blocks: Instruction Fetch Unit, Instruction Decoder, Execution Unit, Data Processing Unit, Data Addressing Generator, Program Memory, X Data Memory, Y Data Memory. Signals: instruction, control signal, condition code, D1 bus, D2 bus.]
• PMD: Program Memory Data Bus
• PMA: Program Memory Address Bus
• XAB: X Data Memory Address Bus
• XDB: X Data Memory Data Bus
• YAB: Y Data Memory Address Bus
• YDB: Y Data Memory Data Bus

Instruction fetch
• Instruction address generator
  – PC
  – Branch instruction (immediate address)
  – Loop instruction
• Execution control

Instruction decoder
• Operation control of all core units and additional off-core functional units

Data address generator
• One index register file, two identical address calculation units.
• It generates two independent data addresses on each cycle.
• Example: ro[des_line/18][des_line%18] = xr[src_line/18][src_line%18]
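The example on this slide maps a linear sample index to a row and column of an 18-wide array, once for the source and once for the destination, which is the kind of paired address computation two address units could perform in the same cycle. The sketch below mirrors that mapping in software. The 32 × 18 array shape (an MP3 granule of 576 samples) and all function names are assumptions for illustration; `modulo_step` additionally sketches the 2048 modulo addressing listed among the DSP features, on the assumption that it is the address generator that wraps the pointer.

```python
ROW_LEN = 18        # row width used in the slide's example
BUFFER_SIZE = 2048  # AAC buffer size named on the features slide


def to_row_col(line):
    """Split a linear index into (row, column) of an 18-wide array."""
    return divmod(line, ROW_LEN)


def copy_sample(ro, xr, des_line, src_line):
    """ro[des_line/18][des_line%18] = xr[src_line/18][src_line%18]."""
    des_row, des_col = to_row_col(des_line)
    src_row, src_col = to_row_col(src_line)
    ro[des_row][des_col] = xr[src_row][src_col]


def modulo_step(addr, step, size=BUFFER_SIZE):
    """Advance a circular-buffer address, wrapping at `size`."""
    return (addr + step) % size


# Hypothetical usage on an assumed 32 x 18 block of samples.
xr = [[row * ROW_LEN + col for col in range(ROW_LEN)] for row in range(32)]
ro = [[0] * ROW_LEN for _ in range(32)]
copy_sample(ro, xr, des_line=37, src_line=100)
print(ro[2][1])              # value copied from xr[5][10]
print(modulo_step(2047, 1))  # wraps back to the start of the buffer
```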

Data processor unit
[Figure: block diagram of the data processing unit. Functional blocks: ALU, Shifter, MU, DC (Data Converter). Registers: X0, X1, Y0, Y1, A, B, P reg, S reg; AR files; MRX, MRY, AMRX, AMRY, OMR, SR. Buses: XDB, YDB, D1 bus, D2 bus, A1 bus, A2 bus, S bus. Other input: immediate data.]

Instruction set
• Special instructions: UNPACK, HUFFMAN

System efficiency
• Design tool
  – VHDL
• Compile and simulate tool
  – SYNOPSYS tool
• 0.35 μm, 3.3 V CMOS technology
  – 40 MHz

Module | Gate Count
Predictor |
AAC Huffman Decoder |
MP3 Huffman Decoder |
DSP Core |

Quality test of MP
• NL: Noise level
• MER: Maximum error ratio
• Np: The number of processing bits
• No: The number of output PCM bits
1.09
ISO/IEC : NL < -101 dB, MER < 1

Clock cycles for MP3 decoder
• Total sum: 348,
• Available: (40 MHz / 48 kHz sample rate) × 1152 = 960,000 cycles/frame
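The per-frame budget on this slide follows from the clock rate, the sample rate and the 1152 samples in an MP3 frame. The sketch below reproduces that arithmetic and adds a helper that converts a measured cycle count into a MIPS figure, relying on the one-instruction-per-cycle property of the core. The function names are made up for illustration, and the 480,000-cycle input is a hypothetical value, not the decoder's actual total.

```python
def cycles_per_frame(clock_hz, sample_rate_hz, samples_per_frame):
    """Clock cycles available while one frame of audio plays out."""
    return clock_hz * samples_per_frame // sample_rate_hz


def required_mips(cycles_used, sample_rate_hz, samples_per_frame):
    """Instruction rate needed to finish `cycles_used` cycles per frame in
    real time, assuming one instruction per cycle."""
    frames_per_second = sample_rate_hz / samples_per_frame
    return cycles_used * frames_per_second / 1e6


budget = cycles_per_frame(40_000_000, 48_000, 1152)
print(budget)  # 960000, matching the slide

# Hypothetical workload of 480,000 cycles per frame (not a figure from the paper).
print(required_mips(480_000, 48_000, 1152))
```

At 48 kHz a 1152-sample frame lasts 24 ms, so any decoder total below 960,000 cycles leaves headroom at a 40 MHz clock.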

Memory requirement

Decoder | H/W Resource   | Size (words)
MP3     | Program memory | 2.2k
MP3     | Data ROM       | 1.4k
MP3     | Data RAM       | 5.7k
AAC     | Program memory | 4.1k
AAC     | Data ROM       | 5.5k
AAC     | Data RAM       | 7.1k

Evaluation board

Conclusion
• The system consists of a 20-bit fixed-point DSP core for the software implementation and a hardware accelerator.
• The decoding system can decode MP3 using only MIPS with high efficiency.
• The digital audio-specific DSP core is suitable for embedded systems.