ENCM 515 Review talk on 2001 Final A. Wong, Electrical and Computer Engineering, University of Calgary, Canada ucalgary.ca.



To Be Tackled Today
  Review
    Important concepts of DSP
  2001 ENCM 515 Final Exam
    Question 1
    Question 2

Disclaimer
  The answers given in this presentation are the views of the presenter and not necessarily the answers accepted by Dr. Smith.

Requirements for “perfect” DSP architecture - 1
  Fast instruction cycle -- not clock speed
  Fast hardware multiplier
  Floating point for easier design -- avoids scaling and overflow
  High precision wide busses for register, memory, processing units
  Fast loop operation

Requirements for “perfect” DSP architecture - 2
  Several data buses available to reduce memory bus conflict/transfer overhead
  Harvard architecture and/or instruction caches to avoid instruction and data-fetch clashes
  Duplicate resources for parallel computation
  Dedicated address calculation hardware

Requirements for “perfect” DSP architecture - 3
  Extensive temporary registers to avoid unnecessary fetches of continually used data
  Architecture allows easy parallel operation in multiprocessor systems -- NEW
  Cycle time adjustable by instruction -- UNCOMMON
  Duplicate resources for parallel computation of real and imaginary components -- UNCOMMON -- SIMD?

2001 Final Exam - 1
Assume that non-volatile registers have been saved as needed and that the DAG registers I4, M4, B4, L4, I3, M3, I12, M12 have been set correctly.
  A – circle the compute component of ONE 21k instruction
  B – circle the first totally parallel instruction in code
  C – circle the instructions that demonstrate filling the algorithm pipeline
    1  F9 = F9 - F9, R2 =
    2  F1 = dm(I4,M4), F5 = pm(I12,M12)
    3  lcntr = R2, do (pc, END_DEMOD - 1)
    4  F13 = F1 * F5, F9 = F9 + F13, F1 = dm(I4,M4), F5 = pm(I12,M12)
    END_DEMOD:
    5  F13 = F1 * F5, F9 = F9 + F13
    6  dm(I3,M3)

2001 Final Exam – 1 -- DSA
  A – circle the compute component of ONE 21k instruction -- OK
  B – circle the first totally parallel instruction in code -- OK
  C – circle the instructions that demonstrate filling the algorithm pipeline – the dm and pm in 2 and the + and * in 4
    1  F9 = F9 - F9, R2 =
    2  F1 = dm(I4,M4), F5 = pm(I12,M12)
    3  lcntr = R2, do (pc, END_DEMOD - 1)
    4  F13 = F1 * F5, F9 = F9 + F13, F1 = dm(I4,M4), F5 = pm(I12,M12)
    END_DEMOD:
    5  F13 = F1 * F5, F9 = F9 + F13
    6  dm(I3,M3)
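The idea of filling an algorithm pipeline can be sketched in C. This is a reference model under stated assumptions, not the exam's code: the function names mac_simple and mac_pipelined and the array names x and c are hypothetical. The pipelined version splits each element's work into fetch, multiply and add, and overlaps the add for element i with the multiply for element i+1 and the fetch for element i+2. The statements before the loop fill the pipeline (compare the dm and pm fetches in instruction 2), and the statements after the loop drain it (compare instruction 5).

```c
#include <stddef.h>

/* Straightforward version: fetch, multiply and add one element at a time. */
float mac_simple(const float *x, const float *c, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += x[i] * c[i];
    }
    return sum;
}

/* Software-pipelined version of the same sum (hypothetical illustration). */
float mac_pipelined(const float *x, const float *c, size_t n)
{
    float sum = 0.0f;
    if (n == 0) {
        return sum;
    }

    /* Fill: the first fetch has nothing to overlap with. */
    float a = x[0];
    float b = c[0];

    /* Fill: the first multiply overlaps the second fetch, if there is one. */
    float prod = a * b;
    if (n > 1) {
        a = x[1];
        b = c[1];
    }

    /* Steady state: add element i, multiply element i+1, fetch element i+2. */
    for (size_t i = 0; i + 2 < n; i++) {
        float next_prod = a * b;   /* multiply element i+1 */
        sum += prod;               /* add product of element i */
        prod = next_prod;
        a = x[i + 2];              /* fetch element i+2 */
        b = c[i + 2];
    }

    /* Drain: no more fetches, finish the work still in flight. */
    if (n > 1) {
        float last_prod = a * b;   /* multiply element n-1 */
        sum += prod;               /* add product of element n-2 */
        prod = last_prod;
    }
    sum += prod;                   /* add product of element n-1 */
    return sum;
}
```

Both functions add the products in the same order, so they return the same value. In C the three steps inside the loop still run one after another; on a processor with parallel instructions they could share a single instruction, which is the point of restructuring the loop this way.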

2001 Final Exam - 2
Briefly explain, using the context of this code, the concept of pipeline in parallel instruction processors.
Answer – Pipelines are necessary for parallelizing the above code, since it uses the same registers at different stages of the instruction cycle (Fetch, Decode, and Execute).

2001 Final Exam - 3
The code would be more understandable if the first instruction had been written as F9 = 0, R2 = 256, but that was not possible. Explain.
Answer – There is a set number of bits on the data bus. If the instruction uses too many constants, there may not be enough bits to store the numbers.

2001 Final Exam – 3 – D.S.A
The code would be more understandable if the first instruction had been written as F9 = 0, R2 = 256, but that was not possible. Explain.
Answer – There is a set number of bits on the data bus. If the instruction uses too many constants, there may not be enough bits to store the numbers.
Answer – Incomplete. Better: each constant takes 32 bits, so a total of 64 bits is needed, and there is only a 48-bit program bus to carry instructions.

2001 Final Exam - 4
The code will not provide the correct synchronous detection result. There are a number of ways of fixing the code. Would changing instruction 2 to F13=F13–F13, F1=dm(I4,M4), F5=pm(I12,M12); be one of them?
Answer – Yes. Because F13 is not set to 0 at first, it may contain “garbage” when it is first used, resulting in an error.
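The effect of the uninitialized F13 can be checked with a register-level C model of the six-instruction listing. This is a sketch under stated assumptions: the function name demod_model and its parameters are hypothetical, each parallel instruction is modelled as reading all of its source registers before writing any result, and instruction 6 is assumed to store F9 (the source of that store is missing from the transcript). The parameter f13_initial stands for whatever F13 holds on entry.

```c
#include <stddef.h>

/*
 * Hypothetical model of the exam listing. dm and pm must each have
 * loop_count + 1 readable elements, because the last pass of the loop
 * fetches one more pair than the sum actually uses.
 */
float demod_model(const float *dm, const float *pm,
                  size_t loop_count, float f13_initial)
{
    float f9 = 0.0f;              /* instruction 1: F9 = F9 - F9 */
    float f13 = f13_initial;      /* never cleared in the original listing */
    float f1 = dm[0];             /* instruction 2: F1 = dm(I4,M4) */
    float f5 = pm[0];             /* instruction 2: F5 = pm(I12,M12) */
    size_t next = 1;

    for (size_t k = 0; k < loop_count; k++) {   /* instruction 3 */
        /* instruction 4: all sources are read before any result is written */
        float product = f1 * f5;  /* F13 = F1 * F5 */
        float sum = f9 + f13;     /* F9 = F9 + F13, using the OLD F13 */
        f13 = product;
        f9 = sum;
        f1 = dm[next];
        f5 = pm[next];
        next++;
    }

    /* instruction 5: same compute operations, no fetches */
    float product = f1 * f5;
    f9 = f9 + f13;
    f13 = product;

    return f9;                    /* assumed to be the value instruction 6 stores */
}
```

In this model the first pass of the loop adds the entry value of F13 to the accumulator before any product exists, so the returned sum is offset by exactly that value. Passing f13_initial = 0 corresponds to the proposed change to instruction 2 and gives the sum of the first loop_count products.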

2001 Final Exam - 5
Explain the differences and relative advantages between processors with a von Neumann and Harvard architecture.
[Diagram: von Neumann (CPU, Address Bus, Data Bus) and Harvard (CPU, ROM, Data, Data Bus, Address Bus)]

2001 Final Exam – 5 – D.S.A.
Pictures are nice, but N.Q.A. The question said “Relative advantages and disadvantages” and you never discussed these at all.
[Diagram: von Neumann (CPU, Address Bus, Data Bus) and Harvard (CPU, ROM, Data, Data Bus, Address Bus)]

2001 Final Exam - 6
Using processors discussed in ENCM 515, provide examples of processors with a von Neumann and with a Harvard architecture.
Answer – von Neumann (68k), Harvard (29k)

2001 Final Exam - 7
The SHARC 21k does not have a Harvard architecture but a Super Harvard ARChitecture. What are the advantages of having a super Harvard over the normal type, and under what circumstances will these advantages disappear?
Answer – The 21k allows caching of instructions for fast access. The advantage disappears when the cache is full or when cache thrash occurs.

2001 Final Exam - 8
Consider the code given earlier. Will instruction 6 be cached? If it is, how do you know? If not, why?
Answer – No, caching only occurs when a data access on the PM bus conflicts with an instruction access on the PM bus.

2001 Final Exam – 8 – D.S.A
Answer – No, caching only occurs when a data access on the PM bus conflicts with an instruction access on the PM bus.
ANSWER Yes -- 4 inside the loop clashes with 6 outside the loop
    1  F9 = F9 - F9, R2 =
    2  F1 = dm(I4,M4), F5 = pm(I12,M12)
    3  lcntr = R2, do (pc, END_DEMOD - 1)
    4  F13 = F1 * F5, F9 = F9 + F13, F1 = dm(I4,M4), F5 = pm(I12,M12)
    END_DEMOD:
    5  F13 = F1 * F5, F9 = F9 + F13
    6  dm(I3,M3)

Homework
Saturation arithmetic – Design, write and document a 21k assembly language code segment that accesses N points of a floating point array PMarray[] over the PM data bus, TRIPLES each value and sets all results above to be equal to before storing the result into a floating point array DMarray[] over the DM data bus.
    .segment/pm seg_pmda;
    .var PMarray[256];    // The initial array
    .endseg;
    .segment/dm seg_dmda;
    .var DMarray[512];    // The final array
    .var N;               // The number of values to be converted
    .endseg;
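Before writing the assembly, it can help to have a C reference version to check results against. The following is a minimal sketch, not a solution in 21k assembly. The function name triple_and_saturate and the parameters threshold and clamp_value are hypothetical: the two numeric limits are missing from the transcript, so they are left as arguments rather than guessed.

```c
#include <stddef.h>

/*
 * Hypothetical C reference for the homework: triple each input value and
 * replace any result above `threshold` with `clamp_value`.
 * Both limits are assumptions passed in by the caller.
 */
void triple_and_saturate(const float *pm_array, float *dm_array, size_t n,
                         float threshold, float clamp_value)
{
    for (size_t i = 0; i < n; i++) {
        float tripled = 3.0f * pm_array[i];
        dm_array[i] = (tripled > threshold) ? clamp_value : tripled;
    }
}
```

For example, with a threshold and clamp value of 10, the inputs 1, 2, 5 and -4 would be stored as 3, 6, 10 and -12. The homework statement only mentions results above a limit, so this sketch leaves large negative values unchanged.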