System-on-Chip Design Homework Solutions


System-on-Chip Design Homework Solutions Hao Zheng Comp Sci & Eng U of South Florida

HW 4

[CIS 6930] 9.1 CPU1 -> CPU2 communication speed/delay: Each bus transaction on the high-speed bus takes 50 ns. Each bus transaction on the low-speed bus takes 200 ns. Each memory access (read or write) takes 80 ns. Each bridge transfer takes 100 ns. The CPUs are much faster than the bus system and can read/write data on the bus at any chosen data rate.

[CIS 6930] 9.3 A C function takes 1000 cycles to execute in SW. It has 10 inputs and 10 outputs, each of which is an integer (word). Evaluate under what conditions a co-processor implementation brings a performance benefit. K cycles are needed to execute the function in the HW co-processor. Q cycles are needed to transfer 1 word between SW and the co-processor. Total runtime in the co-processor implementation: 10*Q + K + 10*Q = 20*Q + K cycles.

[CIS 6930] 9.3 The co-processor implementation is better if 10*Q + K + 10*Q <= 1000, i.e., 20*Q + K <= 1000.
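The break-even condition can be checked numerically. The following is a minimal sketch, assuming hypothetical helper names (coproc_cycles, coproc_is_better) that are not part of the homework; only the constants 1000, 10, and 10 come from the problem statement.

```c
#include <assert.h>

#define SW_CYCLES 1000  /* cycles for the pure-SW version (from the problem) */
#define N_IN      10    /* input words transferred to the co-processor */
#define N_OUT     10    /* output words transferred back */

/* Total cycles of the co-processor version: 10*Q + K + 10*Q. */
long coproc_cycles(long q, long k) {
    assert(q >= 0 && k >= 0);   /* cycle counts cannot be negative */
    return N_IN * q + k + N_OUT * q;
}

/* Returns 1 when the co-processor version meets the slide's condition. */
int coproc_is_better(long q, long k) {
    return coproc_cycles(q, k) <= SW_CYCLES;
}
```

For example, with Q = 10 and K = 100 the co-processor version needs 300 cycles and wins, while Q = 50 and K = 100 needs 1100 cycles and loses: the transfer cost dominates as soon as Q grows.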

[CIS 6930] 9.4 Draw an FSM implementing the 2-way handshake protocol (signals: req, ack). Optimize it by sending two data items in a single transaction.

[CIS 6930] 9.4 FSM implementing the 2-way handshake protocol. State 1 -> State 2 on ack=0 / send data; req <= 1. State 2 -> State 1 on ack=1 / req <= 0.

[CIS 6930] 9.4 Optimize it by sending two data items in a single transaction. State 1 -> State 2 on ack=0 / send data1; req <= 1. State 2 -> State 1 on ack=1 / send data2; req <= 0.
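The two FSMs above can be sketched as a small C model of the sender. This is an illustration only: the type and function names (hs_sender_t, hs_sender_step) and the dual flag are hypothetical, and the caller is assumed to provide enough items in src.

```c
#include <assert.h>

typedef enum { ST1, ST2 } hs_state_t;

typedef struct {
    hs_state_t state;  /* current FSM state */
    int req;           /* request line driven by the sender */
    int data;          /* value currently driven on the data lines */
    int sent;          /* number of items taken from src so far */
} hs_sender_t;

/* One evaluation of the sender FSM.
 * dual = 0: basic protocol, one item per req/ack cycle.
 * dual = 1: optimized protocol, a second item is sent when req falls. */
void hs_sender_step(hs_sender_t *s, int ack, const int *src, int dual) {
    assert(ack == 0 || ack == 1);
    if (s->state == ST1 && ack == 0) {
        s->data = src[s->sent++];    /* send data (data1) */
        s->req = 1;
        s->state = ST2;
    } else if (s->state == ST2 && ack == 1) {
        if (dual)
            s->data = src[s->sent++]; /* send data2 on the falling edge */
        s->req = 0;
        s->state = ST1;
    }
    /* Otherwise: wait in the current state. */
}
```

In the optimized variant both edges of req carry a data item, so each full req/ack cycle moves two items instead of one.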

P2: 1- or 2-Way Handshake Protocols. Read section 9.2.3 in the CoDesign book. The 1-way handshake assumes that the sender runs faster than the receiver. The 2-way handshake can handle senders and receivers of differing speeds.

P3: Steps of Basic Bus Transfers. Read section 10.2.1 in the CoDesign book. Steps: (1) The master gets bus access by negotiating with the bus arbiter. (2) The master issues address/data/command. (3) The slave acknowledges the transfer. (4) The master releases the bus.

10.3 Memory address of i: 0x3F68. Memory address of a[0]: 0x3F6C.

11.1 Address Decoder. A decoder that maps the register to the range 0x3F000000 – 0x3F00FFFF: a 16-bit AND gate (only the upper 16 address bits are decoded). A decoder that maps the register to the single address 0x3F000000: a 32-bit AND gate.
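The difference between the two decoders can be sketched in C. The function names (sel_range, sel_exact, decoder_check) are hypothetical; each comparison stands in for an AND gate whose inputs are inverted wherever the address bit must be 0.

```c
#include <assert.h>
#include <stdint.h>

/* Range decoder: selects 0x3F000000..0x3F00FFFF.
 * Only the upper 16 address bits are compared (16-input AND). */
int sel_range(uint32_t addr) {
    return (addr >> 16) == 0x3F00u;
}

/* Exact decoder: selects only 0x3F000000.
 * All 32 address bits are compared (32-input AND). */
int sel_exact(uint32_t addr) {
    return addr == 0x3F000000u;
}

/* Small self-check showing how the two decoders differ. */
void decoder_check(void) {
    assert(sel_range(0x3F000000u) && sel_exact(0x3F000000u));
    assert(sel_range(0x3F001234u) && !sel_exact(0x3F001234u));
    assert(!sel_range(0x3F010000u));
}
```

With the range decoder the register is aliased across the whole 64 KB window, which is the price of the smaller gate.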

11.2 Double the data transfer rate for the following design.

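P5: Complete the Figure. [Figure: blocks CPU, FIFO, and HW Module.]

P5: Complete the Figure. [Figure: CPU, FIFO, and HW Module; the FIFO has signals reqin, din, ackin on its input side and reqout, dout, ackout on its output side; Master and Slave roles are labeled.]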

P5: FSMs for Interfaces. Slave interface FSM (reqin/din/ackin side): reqin & !full / store din; ackin <= ‘1’. !reqin / ackin <= ‘0’. Interface FSM on the reqout/dout/ackout side: !ackout & !empty / load dout; reqout <= ‘1’. ackout / reqout <= ‘0’.
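The two interface FSMs can be sketched as a C model of the FIFO. This is a sketch under assumptions: the names (fifo_t, fifo_in_step, fifo_out_step), the depth of 4, and the extra guards on ackin and reqout (which keep one request from storing or loading a word twice) are not taken from the slides.

```c
#include <assert.h>
#include <stdint.h>

#define FIFO_DEPTH 4  /* assumed depth */

typedef struct {
    uint32_t buf[FIFO_DEPTH];
    int head, count;
    int ackin;      /* acknowledge to the producer */
    int reqout;     /* request to the consumer */
    uint32_t dout;  /* data presented to the consumer */
} fifo_t;

/* Input side: reqin & !full / store din; ackin <= 1.  !reqin / ackin <= 0. */
void fifo_in_step(fifo_t *f, int reqin, uint32_t din) {
    assert(f->count >= 0 && f->count <= FIFO_DEPTH);
    if (reqin && !f->ackin && f->count < FIFO_DEPTH) {
        f->buf[(f->head + f->count) % FIFO_DEPTH] = din;
        f->count++;
        f->ackin = 1;
    } else if (!reqin) {
        f->ackin = 0;
    }
}

/* Output side: !ackout & !empty / load dout; reqout <= 1.  ackout / reqout <= 0. */
void fifo_out_step(fifo_t *f, int ackout) {
    assert(f->count >= 0 && f->count <= FIFO_DEPTH);
    if (!ackout && !f->reqout && f->count > 0) {
        f->dout = f->buf[f->head];
        f->head = (f->head + 1) % FIFO_DEPTH;
        f->count--;
        f->reqout = 1;
    } else if (ackout) {
        f->reqout = 0;
    }
}
```

Each side completes a full 2-way handshake per word, so the producer and the consumer can run at unrelated speeds as long as the FIFO is neither full nor empty.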

[CIS 6930] P6 A custom HW connected to a CPU through a 32-bit bus. The HW has a single 128-bit input port. How does CPU send data to HW before HW is activated? Use a FIFO as in P5 to buffer data, or CPU waits for a ready signal from HW.

P6 A custom HW connected to a CPU through a 32-bit bus. The HW has a single 128-bit input port. Interface design of the HW module: use a serial-in parallel-out shift register and a counter. FSM: start / read d1 -> read d2 -> read d3 -> read d4 -> read_done <= 1.
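The shift register and counter can be sketched in C as follows. The names (sipo128_t, sipo_start, sipo_write) and the word order (the first word written ends up in the highest position) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t word[4];  /* 128-bit parallel output, as four 32-bit words */
    int count;         /* number of words received since start */
    int read_done;     /* set to 1 once all four words have arrived */
} sipo128_t;

/* start: clear the counter and the done flag. */
void sipo_start(sipo128_t *s) {
    s->count = 0;
    s->read_done = 0;
}

/* One 32-bit bus write: shift the register by one word and insert d. */
void sipo_write(sipo128_t *s, uint32_t d) {
    assert(s->count < 4);  /* no writes expected after read_done */
    for (int i = 3; i > 0; i--)
        s->word[i] = s->word[i - 1];
    s->word[0] = d;
    if (++s->count == 4)
        s->read_done = 1;
}
```

After four writes of d1 to d4, word[3] holds d1 and word[0] holds d4, and read_done tells the HW module that its 128-bit input is valid.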

P6 A custom HW connected to a CPU through a 32-bit bus. The HW has a single 128-bit input port. Draw the timing diagram for data transfer The answer varies depending on the interface and protocol used to connect CPU and HW.

HW 3

Problem 4.1 [Figure: graph with nodes 1 to 5.] The longest path in the DFG has length 4.

Problem 4.1

Problem 4.2 [Figure: CFG and DFG over nodes 1 to 4; edges are labeled with the variables a and b.]

Problem 4.3
int mysqrt(int N) {
  int x = 0, j;
  for (j = 1 << 7; j != 0; j >>= 1) {
    x = x + j;
    if (x * x > N) x = x - j;
  }
  return x;
}
[Figure: graph for mysqrt with numbered nodes; j starts at 0x80.]

Problem 4.3 [Figure: control FSM for the mysqrt function of Problem 4.3, with transitions guarded by flag_zero, !flag_zero, flag_if, and !flag_if.]

Problem 4.7
int mysqrt(int N) {
  int x = 0, j;
  for (j = 1 << 7; j != 0; j >>= 1) {
    x = x + j;
    if (x * x > N) x = x - j;
  }
  return x;
}

Problem 4.7 [Figure: datapath with the constant 0x80 and the status flags flag_zero and flag_if.]
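The split into a controller and a datapath can be sketched in C as an explicit state machine. This is not the slide's solution: the state names are hypothetical, and the sketch assumes that flag_zero means j == 0, that flag_if means x*x > N, and that the loop update shifts j right by one bit.

```c
#include <assert.h>

/* mysqrt written as a controller (state) driving a datapath (x, j). */
int mysqrt_fsm(int N) {
    enum { S_LOOP, S_BODY, S_DONE } state = S_LOOP;
    int x = 0, j = 0x80;          /* datapath registers; j = 1 << 7 */

    assert(N >= 0);
    while (state != S_DONE) {
        int flag_zero = (j == 0); /* status flag from the datapath */
        switch (state) {
        case S_LOOP:
            if (flag_zero) {
                state = S_DONE;   /* flag_zero / leave the loop */
            } else {
                x = x + j;        /* !flag_zero / x = x + j */
                state = S_BODY;
            }
            break;
        case S_BODY: {
            int flag_if = (x * x > N);
            if (flag_if)
                x = x - j;        /* flag_if / undo the trial bit */
            j = j >> 1;           /* advance to the next bit */
            state = S_LOOP;
            break;
        }
        case S_DONE:
            break;
        }
    }
    return x;
}
```

The function tries the result bits from 0x80 down to 0x01, so it returns the integer square root for inputs up to 255*255; for example, mysqrt_fsm(16) and mysqrt_fsm(17) both return 4.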

HW 2

P2.1 (Figure 2.24) The values of the tokens into the snk actor are 2, 4, 8, …

P2.2 Fibonacci Sequence Figure 2.24

P2.3: Original SDF

P2.3: Transformed SDF [Figure: SDF graph with fork actors, an add actor, and snk2.]

P2.4: Accumulator with Adder [Figure: SDF graph with actors src and add.]

P2.5 The topology matrix (Figure 2.26). For a PASS (periodic admissible sequential schedule) to exist, the rank must be 2. Set X = 2 and Y = 1; then a combination of the first two columns gives the third one.

HW 1

P1: Structural Models at System and Processor Levels

P2: Why is a system-level structural model more abstract than a processor-level structural model? Each component in a system-level structural model represents a design at the processor level, which can take many forms, such as a behavioral model, the structural model shown before, or a different structural model. Communication over buses is defined in terms of messages, not bits.

P2: Why is a system-level structural model more abstract than a processor-level structural model? Components in a processor-level structural model are described at the more detailed register-transfer, cycle-accurate level. Component interfaces and buses are bit-accurate.

P3: Differences between the behavioral models at the cycle-accurate level and the instruction level. In a cycle-accurate model, the design behavior captures how registers are updated on individual clock edges. Each instruction typically takes a number of cycles to execute. An instruction-accurate model captures how memory/registers are updated after the execution of each individual instruction.

P4: Benefits of using instruction-accurate models compared to cycle-accurate models. Each instruction typically takes a number of cycles to execute. An instruction-accurate model captures how memory/registers are updated after the execution of each individual instruction, without considering the register updates at each cycle. Therefore, simulating an instruction-accurate model is much faster. It allows early development and validation of SW.

P4: Benefits of using transaction-level models compared to instruction-accurate models. Each transaction represents a sequence of instructions, e.g., printf(). Simulating a transaction-accurate model is much faster. This is important for early exploration of the system design space. It provides a function-accurate system prototype for early development and evaluation of SW.

P5 What is system-level synthesis? A process that converts a system behavioral model to a system-level structural model.

P5 What is the input model to the synthesis like?

P5 What are the key elements in the generated models? Processing elements such as CPUs, DSPs, memory controller, buses, communication/peripheral interfaces, custom HW logic components, etc.

P6(a) What would the design model look like if the system behavioral model is implemented in software completely? CPU DSP

P6(b) What would the design model look like if the system behavioral model is implemented in hardware completely? HW/ASIC HW/ASIC HW/ASIC

P7 pros and cons of pure software or pure hardware implementations for a given system. Fig. 1.6 Driving factors in HW/SW co-design

P8 Differences between concurrency and parallelism Concurrency is the ability to execute simultaneous operations because these operations are completely independent. Related to behavior of applications. Parallelism is the ability to execute simultaneous operations because the operations can run on different processors or circuit elements. Related to HW implementations.