Augmented von Neumann Processors

Augmented von Neumann Processors
Binu K. Mathew, Al Davis
School of Computing, University of Utah

Guide to the future!
"What is it?" asked Arthur. "The Hitchhiker's Guide to the Galaxy. It's a sort of electronic book… a million "pages" could be summoned at a moment's notice... A screen, about three inches by four, lit up and characters began to flicker across the surface. The words Vogon Constructor Fleets flared in green across the screen. At the same time, the book began to speak the entry as well in a still quiet measured voice. This is what the book said.
- Douglas Adams, The Hitch Hiker's Guide to the Galaxy

Future Applications
Projected performance requirement: 10 GOPS
- Continuous speech recognition
- Handwriting and gesture recognition
- Computer vision
- Heuristic searches in multimedia databases
- Video conferencing
Power consumption of typical processors
- Intel StrongARM SA-110 @ 233 MHz: 1 W (max)
- Intel embedded Pentium @ 233 MHz: 7.9 W to 17 W
- AMD Athlon @ 800 MHz: 45.5 W (max core power)
Can conventional architectures provide the required performance? At a low enough power budget?

Primitives for Future Applications
- Hidden Markov Model solvers: speech recognition, handwriting and gesture recognition
- FFT: audio processing
- DCT: image compression
- Block data difference: compression, motion detection
- Pattern matching: database searches, feature recognition
- Generalized filters: image and audio processing, array transformations
- Encryption/decryption, block data transfer, heuristic processing of bulk data …
- Reduction operators, block math units: image statistics, finite element analysis, logic simulation, neural nets
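As an illustration of one primitive from the list, the block data difference used for motion detection can be sketched in software as a sum of absolute differences (SAD) over fixed-size blocks. This is a minimal sketch, not the coprocessor design from the slides: the function names, the flat-list frame layout, the block size, and the threshold are all assumptions made for the example.

```python
def sum_abs_diff(block_a, block_b):
    """Sum of absolute differences between two equal-length sample blocks."""
    if len(block_a) != len(block_b):
        raise ValueError("blocks must have the same length")
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def changed_blocks(prev_frame, curr_frame, block_size, threshold):
    """Return indices of blocks whose SAD against the previous frame exceeds threshold.

    Frames are flat lists of sample values (a hypothetical layout chosen
    for brevity); a real implementation would work on 2-D pixel blocks.
    """
    if len(prev_frame) != len(curr_frame):
        raise ValueError("frames must have the same length")
    changed = []
    for start in range(0, len(curr_frame), block_size):
        sad = sum_abs_diff(prev_frame[start:start + block_size],
                           curr_frame[start:start + block_size])
        if sad > threshold:
            changed.append(start // block_size)
    return changed


# Two 8-sample "frames" split into blocks of 4: only the second block moves.
prev_frame = [10, 10, 10, 10, 50, 50, 50, 50]
curr_frame = [10, 11, 10, 9, 90, 20, 50, 50]
print(changed_blocks(prev_frame, curr_frame, block_size=4, threshold=8))  # [1]
```

The inner loop is a subtract, absolute value, and accumulate over bulk data with no data-dependent control flow, which is the kind of regular work a dedicated unit could plausibly perform at lower power than a general-purpose core.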

Augmented von Neumann Processors
- Multiple threads of execution, task-level parallelism
- Domain-specific coprocessors provide high performance at low power
[Figure: block diagram. Labels: Language model from memory, Pat Match, HMM, GFU-1, GFU-2, Bulk Math, Bulk Diff, CPU Core, Scratch SRAM, Block Transfer, Enc/Decrypt, FFT/DCT, Audio]

Conclusion: Power, Area, Challenges
Power
- Pihl et al.'s HMM coprocessor consumes 853 mW @ 154 MHz in a 5 V, 0.8 µm technology
- 41 mW estimated power @ 1 GHz on a 1.2 V, 0.1 µm process
- Indication that domain-specific coprocessors win!
Area
- AMD K-7 die area is 184 mm² in a 0.25 µm process
- The same K-7 is estimated to be 4.7-12.5% of the die area of a microprocessor of 2005 in a 0.1 µm process
- Total area of all our coprocessors is less than 184 mm²
- Domain-specific coprocessors win again!
Challenges
- Identify core primitives, generalize
- Power-efficient implementation
- Provide plumbing between units and overall framework
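The slides do not say how the power and area estimates were derived. The sketch below shows one first-order way to reproduce figures of that magnitude, under assumptions that are not taken from the slides: dynamic power scales as C·V²·f, switched capacitance scales linearly with feature size, and die area scales with the square of feature size. The function names are hypothetical.

```python
def scale_dynamic_power(power_mw, v_old, v_new, f_old_mhz, f_new_mhz,
                        feature_old_um, feature_new_um):
    """Scale dynamic power with P ~ C * V^2 * f.

    Assumption: switched capacitance C shrinks linearly with feature size.
    """
    cap_ratio = feature_new_um / feature_old_um
    voltage_ratio = (v_new / v_old) ** 2
    freq_ratio = f_new_mhz / f_old_mhz
    return power_mw * cap_ratio * voltage_ratio * freq_ratio


def scale_area(area_mm2, feature_old_um, feature_new_um):
    """Scale die area with the square of the feature-size ratio (assumption)."""
    return area_mm2 * (feature_new_um / feature_old_um) ** 2


# HMM coprocessor: 853 mW @ 154 MHz, 5 V, 0.8 um -> 1 GHz, 1.2 V, 0.1 um
hmm_mw = scale_dynamic_power(853, 5.0, 1.2, 154, 1000, 0.8, 0.1)
print(f"scaled HMM coprocessor power: {hmm_mw:.1f} mW")  # 39.9 mW

# K-7 die: 184 mm^2 in 0.25 um -> 0.1 um
k7_mm2 = scale_area(184, 0.25, 0.1)
print(f"scaled K-7 area: {k7_mm2:.1f} mm^2")  # 29.4 mm^2

# Total die sizes implied if that area is 12.5% or 4.7% of a 2005 die
print(f"implied die area: {k7_mm2 / 0.125:.0f} to {k7_mm2 / 0.047:.0f} mm^2")
```

Under these assumptions the scaled power comes out near 40 mW, close to the 41 mW quoted above, and the shrunken K-7 occupies roughly 29 mm². Leakage power and wire capacitance are ignored, so the numbers are an order-of-magnitude check only.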