Published by Constance Ada Robertson. Modified over 9 years ago.
1
HW/SW Co-design
Lecture 4: Lab 2 – Passive HW Accelerator Design
Course material designed by Professor Yarsun Hsu, EE Dept, NTHU
RA: Yi-Chiun Fang, EE Dept, NTHU
2
Outline
- Introduction to AMBA Bus System
- Passive Hardware Design
- Interrupt Service Routine
- Environment Configuration
- Co-designed System with GHDL Simulation
- Co-designed System on FPGA
3
INTRODUCTION TO AMBA BUS SYSTEM
4
AMBA 2.0 Bus System (1/7)
- Established by ARM
- Advanced High-performance Bus (AHB)
  - For high-performance, high-clock-frequency system modules such as embedded processors, DMA controllers, and memory controllers
- Advanced Peripheral Bus (APB)
  - Optimized for minimal power consumption and reduced interface complexity to support peripheral functions
- For more details, please refer to the following documents
  - AMBA 2.0 Specification
  - Introduction to AMBA Bus System
  - GRLIB AHBCTRL - AMBA AHB controller with plug&play support
5
AMBA 2.0 Bus System (2/7)
[Figure: AMBA 2.0 system diagram. The annotated AHB-to-APB bridge is a slave on AHB and the only master on APB.]
6
AMBA 2.0 Bus System (3/7)
- AMBA AHB is designed to be used with a central multiplexor interconnection scheme
  - Avoids tri-state buses
7
AMBA 2.0 Bus System (4/7)
- An AHB transfer consists of two distinct sections
  - The address phase, which lasts only a single cycle
  - The data phase, which may require several cycles
    - The data phase is extended using the HREADY signal
8
AMBA 2.0 Bus System (5/7)
- A slave may insert wait states into any transfer
  - For write operations, the bus master will hold the data stable throughout the extended cycles
  - For read transfers, the slave does not have to provide valid data until the transfer is about to complete
[Figure: AHB transfer timing diagram with the wait states marked.]
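The overlap of address and data phases can be made concrete with a little cycle arithmetic. The sketch below is a simplified model, not GRLIB code: it assumes a single master issuing back-to-back transfers with no idle cycles or arbitration delay, and the function name `ahb_burst_cycles` is hypothetical. Because the address phase of each transfer overlaps the data phase of the previous one, only the first address phase costs a cycle of its own; every data phase costs one cycle plus its wait states.

```c
#include <stddef.h>

/*
 * Simplified model (assumption: back-to-back transfers, one master,
 * no idle or busy cycles). wait_states[i] is the number of cycles the
 * slave holds HREADY low during the data phase of transfer i.
 */
unsigned ahb_burst_cycles(const unsigned *wait_states, size_t n)
{
    unsigned cycles;
    size_t i;

    if (n == 0)
        return 0;

    cycles = 1;                      /* address phase of the first transfer */
    for (i = 0; i < n; i++)
        cycles += 1 + wait_states[i]; /* data phase plus its wait states */

    return cycles;
}
```

Under this model, four zero-wait transfers (such as the four writes to a no-wait-state slave) occupy five bus cycles in total.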
9
AMBA 2.0 Bus System (6/7)
- GRLIB implements AMBA AHB with slight modifications
- Please refer to the GRLIB User's Manual and GRLIB IP Cores Manual for detailed information
10
AMBA 2.0 Bus System (7/7)
- The GRLIB implementation of AHB includes a mechanism to provide plug&play support
  - The implementation is located at grlib-gpl-1.0.19-b3188/lib/grlib/amba/
- The configuration record from each AHB unit is sent to the AHB bus controller via the HCONFIG signal. It provides
  - identification of attached units
  - address mapping of slaves
  - interrupt routing

    type ahb_config_type is array (0 to NAHBCFG-1) of amba_config_word;
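To see what "identification of attached units" and "interrupt routing" look like in practice, the first configuration word can be split into fields in software. The sketch below assumes the identification-word layout as we recall it from the GRLIB User's Manual (vendor ID in bits 31..24, device ID in bits 23..12, version in bits 9..5, IRQ in bits 4..0); please verify it against the manual before relying on it. The type `pnp_id` and the function `pnp_decode_id` are hypothetical names, not part of GRLIB.

```c
#include <stdint.h>

/* Decoded fields of a plug&play identification word (assumed layout). */
typedef struct {
    uint8_t  vendor;   /* bits 31..24 */
    uint16_t device;   /* bits 23..12 */
    uint8_t  version;  /* bits 9..5   */
    uint8_t  irq;      /* bits 4..0   */
} pnp_id;

/* Split a 32-bit identification word into its fields. */
pnp_id pnp_decode_id(uint32_t word)
{
    pnp_id id;

    id.vendor  = (uint8_t)((word >> 24) & 0xFFu);
    id.device  = (uint16_t)((word >> 12) & 0xFFFu);
    id.version = (uint8_t)((word >> 5) & 0x1Fu);
    id.irq     = (uint8_t)(word & 0x1Fu);
    return id;
}
```

A tool that lists the attached units would look up the vendor field in a table of known vendors; a value missing from that table is the kind of entry that gets reported as an unknown vendor.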
11
PASSIVE HARDWARE DESIGN
12
Passive HW Accelerators
- The accelerator (bus slave) does not actively send signals to the bus
  - It only responds to the master
- The master gives commands to the slave via its control registers and probes its status registers
[Figure labels: master, slave]
13
Passive 1-D IDCT HW Acc. (1/4)
- A simple 2-stage design
  - Gate delay
    - Stage 1: ~1 mult
    - Stage 2: ~3 add
- Action register
  - Write '1' to start; the accelerator resets it to 0 automatically when done
- Mode register
  - Row/column mode
- No wait states
  - Immediate response
[Figure labels: action, mode]
14
Passive 1-D IDCT HW Acc. (2/4)
- Data packing
  - Since the values in the 8x8 blocks are of type short (16-bit), each value occupies only half of the 32-bit data bus
  - We pack two values together to increase data bus utilization and reduce the communication overhead
  - The action bit and mode bit are also packed together
[Figure: control register layout. Bit 0: action, bit 1: mode, bits 2..31: UNUSED.]
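The packing scheme can be sketched with explicit shifts. This is an illustration only: the helper names are hypothetical, and placing the first value in the upper halfword is an assumption that matches how a big-endian processor such as LEON3 would see two adjacent shorts when they are read as one 32-bit word. The control word follows the layout implied by `(mode << 1) | 0x1` in the driver code: action in bit 0, mode in bit 1.

```c
#include <stdint.h>

#define CTRL_ACTION (1u << 0)  /* write 1 to start, cleared when done */
#define CTRL_MODE   (1u << 1)  /* row/column mode select */

/* Pack two 16-bit samples into one 32-bit bus word
 * (assumption: first sample in the upper halfword). */
uint32_t pack_pair(int16_t first, int16_t second)
{
    return ((uint32_t)(uint16_t)first << 16) | (uint32_t)(uint16_t)second;
}

/* Recover the two samples from a packed word. */
int16_t unpack_first(uint32_t word)
{
    return (int16_t)(uint16_t)(word >> 16);
}

int16_t unpack_second(uint32_t word)
{
    return (int16_t)(uint16_t)(word & 0xFFFFu);
}

/* Build the control word that selects a mode and starts the accelerator. */
uint32_t make_ctrl(unsigned mode)
{
    return ((mode & 1u) ? CTRL_MODE : 0u) | CTRL_ACTION;
}
```

With this packing, the eight 16-bit inputs of one 1-D IDCT fit in four bus transfers instead of eight.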
15
Passive 1-D IDCT HW Acc. (3/4)
- 1-D IDCT calculation
  - STEP1: Write Y registers (4 transfers)
  - STEP2: Write mode bit & action bit
  - STEP3: Poll the action bit
  - STEP4: Read x registers after the action bit is reset
16
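Passive 1-D IDCT HW Acc. (4/4)

    static void hw_idct_1d(short *dst, short *src, unsigned int mode)
    {
        long *long_ptr = (long *)src;

        /* STEP1: write the packed inputs to the Y registers */
        Y_array_base[0] = long_ptr[0];
        Y_array_base[1] = long_ptr[1];
        ...
        /* STEP2: write the mode bit (bit 1) and set the action bit (bit 0) */
        *c_reg = (long)((mode << 1) | 0x1);
        /* STEP3: poll until the accelerator clears the action bit */
        while (*c_reg & 0x1) { /* busy waiting loop */ }
        /* STEP4: read the results from the x registers */
        dst[ 0] = ((short *)x_array_base)[0];
        dst[ 8] = ((short *)x_array_base)[1];
        ...
    }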
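The write, start, poll, read handshake can be tried on a host machine by replacing the hardware with a small software mock. Everything in this sketch is hypothetical: `mock_acc` stands in for the accelerator's register file, the mock stays busy for a fixed two polls, and its "computation" merely copies the inputs to the outputs rather than performing an IDCT. It is meant only to illustrate the protocol, not the lab's actual driver or hardware.

```c
#include <stdint.h>

/* Hypothetical software stand-in for the accelerator's registers. */
typedef struct {
    uint32_t y[4];        /* packed input registers  */
    uint32_t x[4];        /* packed output registers */
    uint32_t ctrl;        /* bit 0: action, bit 1: mode */
    unsigned busy_polls;  /* polls left before the mock reports done */
} mock_acc;

static void acc_write_ctrl(mock_acc *a, uint32_t v)
{
    a->ctrl = v;
    if (v & 1u)
        a->busy_polls = 2;           /* arbitrary busy time for the mock */
}

static uint32_t acc_read_ctrl(mock_acc *a)
{
    if (a->ctrl & 1u) {
        if (a->busy_polls > 0) {
            a->busy_polls--;         /* still busy */
        } else {
            int i;
            for (i = 0; i < 4; i++)  /* placeholder result: copy Y to x */
                a->x[i] = a->y[i];
            a->ctrl &= ~1u;          /* device clears the action bit */
        }
    }
    return a->ctrl;
}

/* Runs the four-step sequence; returns how many polls saw the busy bit. */
unsigned mock_idct_1d(mock_acc *a, const uint32_t y[4], uint32_t x[4],
                      unsigned mode)
{
    unsigned polls = 0;
    int i;

    for (i = 0; i < 4; i++)                      /* STEP1: write Y */
        a->y[i] = y[i];
    acc_write_ctrl(a, ((mode & 1u) << 1) | 1u);  /* STEP2: mode + action */
    while (acc_read_ctrl(a) & 1u)                /* STEP3: poll action */
        polls++;
    for (i = 0; i < 4; i++)                      /* STEP4: read x */
        x[i] = a->x[i];
    return polls;
}
```

On real hardware the registers would be reached through pointers to the accelerator's bus addresses, and those pointers should be volatile-qualified so the compiler cannot optimize the polling loop away.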
17
INTERRUPT SERVICE ROUTINE
18
GRLIB GPTIMER (1/2)
- General Purpose Timer Unit
  - Timers are present in almost any electronic device which needs timing functions (e.g. timekeeping & time measurement)
- Acts as a slave on AMBA APB
- Provides a common decrementing prescaler (clocked by the system clock) and decrementing timers
- Capable of asserting an interrupt on timer underflow
  - We initialize timer 2 for 1 ms resolution (i.e. an interrupt will be asserted every 1 ms)
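The 1 ms interrupt period follows from two reload values: one for the shared prescaler and one for the timer itself. The sketch below assumes, as we read the GPTIMER description, that a counter loaded with N underflows after N + 1 decrements, so each reload value is the desired ratio minus one; check the GRLIB IP Cores Manual for the exact behavior. The function names and the 40 MHz clock in the usage note are hypothetical, not taken from the lab material.

```c
#include <stdint.h>

/* Prescaler reload that turns the system clock into a tick of tick_hz.
 * Assumption: reload value N gives one tick every N + 1 clock cycles. */
uint32_t scaler_reload(uint32_t sys_clk_hz, uint32_t tick_hz)
{
    return sys_clk_hz / tick_hz - 1u;
}

/* Timer reload that turns the tick into an interrupt rate of irq_hz.
 * Assumption: reload value N gives one underflow every N + 1 ticks. */
uint32_t timer_reload(uint32_t tick_hz, uint32_t irq_hz)
{
    return tick_hz / irq_hz - 1u;
}
```

For example, with a hypothetical 40 MHz system clock, `scaler_reload(40000000, 1000000)` gives 39 for a 1 MHz tick, and `timer_reload(1000000, 1000)` gives 999 for one interrupt per millisecond.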
19
GRLIB GPTIMER (2/2)
- Please refer to the GRLIB IP Cores Manual for detailed information
20
eCos ISR (1/3)
- When an interrupt occurs, the processor jumps to a specific address to execute the Interrupt Service Routine (ISR)
- One of the key concerns in embedded systems with respect to interrupts is latency, which is the interval of time from when an interrupt occurs until the ISR begins to execute
[Figure label: interrupt latency]
21
eCos ISR (2/3)
- Basic API for implementing an ISR
- Please refer to the eCos Reference Manual for detailed information

    #include <cyg/kernel/kapi.h>

    void cyg_interrupt_create(cyg_vector_t vector,
                              cyg_priority_t priority,
                              cyg_addrword_t data,
                              cyg_ISR_t* isr,
                              cyg_DSR_t* dsr,
                              cyg_handle_t* handle,
                              cyg_interrupt* intr);
    void cyg_interrupt_delete(cyg_handle_t interrupt);
    void cyg_interrupt_attach(cyg_handle_t interrupt);
    void cyg_interrupt_detach(cyg_handle_t interrupt);
    void cyg_interrupt_acknowledge(cyg_vector_t vector);
    void cyg_interrupt_mask(cyg_vector_t vector);
    void cyg_interrupt_unmask(cyg_vector_t vector);
22
eCos ISR (3/3)
- An ISR is a C function which takes the following form
- An ISR should complete as soon as possible

    cyg_uint32 isr_function(cyg_vector_t vector, cyg_addrword_t data)
    {
        ... /* do the service routine */
        return CYG_ISR_HANDLED;
    }
23
Program Profiling (1/2)
- We use GPTIMER for time measurement
- Every time the timer asserts an interrupt, the timer ISR increments a global variable time_tick

    cyg_uint32 timer_isr(cyg_vector_t vector, cyg_addrword_t data)
    {
        /* data carries the address of the global tick counter */
        unsigned long *time_tick = (unsigned long *) data;
        (*time_tick)++;
        cyg_interrupt_acknowledge(vector);
        return CYG_ISR_HANDLED;
    }
24
Program Profiling (2/2)
- We record the latency of every function block by monitoring the time_tick variable

    void func()
    {
        unsigned long local_timer = time_tick;
        ...
        time_elapsed += (time_tick - local_timer);
    }
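The subtraction `time_tick - local_timer` stays correct even if the counter wraps around between the two reads, because unsigned arithmetic in C is modular. The host-side sketch below demonstrates this with a simulated ISR; `fake_timer_isr` and `profiled_block` are hypothetical names, and a fixed-width `uint32_t` replaces `unsigned long` so the wraparound point is the same on every host.

```c
#include <stdint.h>

/* Tick counter; volatile because an ISR would modify it asynchronously. */
volatile uint32_t time_tick;
uint32_t time_elapsed;

/* Stand-in for the timer ISR: one call represents one 1 ms interrupt. */
static void fake_timer_isr(void)
{
    time_tick++;
}

/* Same pattern as func() on the slide, with the work simulated by
 * letting simulated_ms timer interrupts occur. */
void profiled_block(unsigned simulated_ms)
{
    uint32_t local_timer = time_tick;
    unsigned i;

    for (i = 0; i < simulated_ms; i++)
        fake_timer_isr();

    /* Modular subtraction: correct across counter wraparound. */
    time_elapsed += (time_tick - local_timer);
}
```

Note that with a 1 ms tick, any block that finishes in well under a millisecond will usually contribute zero ticks, so short functions are better measured over many iterations.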
25
ENVIRONMENT CONFIGURATION
26
Build SW Application
- Copy the files in lab_pkg/lab2/sw to your original Lab 1 directory
  - Replace the Makefile and modify the path for ECOSDIR in the Makefile
- Type "make" to build
  - The -D_HW_ACC_ flag links the co-designed version of hw_idct_2d() in idct_hw.c with the testbench
    - Without this flag, hw_idct_2d() will be identical to sw_idct_2d()
  - The -D_PROFILING_ flag enables profiling using the timer interrupt and reports the results at the end
27
Install IDCT Accelerator
- Copy lab_pkg/lab2/hw/devices.vhd to grlib-gpl-1.0.19-b3188/lib/grlib/amba/ and replace the original file
- Copy lab_pkg/lab2/hw/libs.txt and the whole lab_pkg/lab2/hw/esw folder to grlib-gpl-1.0.19-b3188/lib/
  - The 1-D IDCT passive accelerator is located at lab_pkg/lab2/hw/esw/idct_acc/idct_1x8.vhd
- Copy lab_pkg/lab2/hw/leon3mp.vhd to grlib-gpl-1.0.19-b3188/designs/leon3-gr-xc3s-1500/ and replace the original file
28
CO-DESIGNED SYSTEM WITH GHDL SIMULATION
29
GHDL Simulation (1/6)
- We compile our program as a virtual SDRAM for the LEON3 processor
  - LEON3 will fetch the instructions and perform the corresponding operations
- All the hardware signals can be recorded and dumped by GHDL
30
GHDL Simulation (2/6)
- To perform GHDL simulation, the program must not be linked with eCos
  - Remove -D__ECOS & -I$(ECOSDIR)/include from CFLAGS
  - Remove -Ttarget.ld, -nostdlib, & -L$(ECOSDIR)/lib from LFLAGS
  - Remove the -D_PROFILING_ flag
- You can remove -D_VERBOSE_ for faster simulation
- You can modify the NUM_BLKS macro in idct_test.c to reduce the number of testbench iterations
- Type "make" to build
  - You should see a file named sdram.srec
31
GHDL Simulation (3/6)
- Start Cygwin
- cd grlib-gpl-1.0.19-b3188/designs/leon3-gr-xc3s-1500/
- make distclean
- make soft
- Copy the sdram.srec we built into this directory and replace the original one
- make ghdl
  - You can check for syntax errors through GHDL
32
GHDL Simulation (4/6)
- Type "./testbench.exe --vcd=waveform.vcd" after compilation to begin simulation
- You should see an AHB slave with "Unknown vendor" appear, which is our IDCT accelerator
33
GHDL Simulation (5/6)
- The dump file waveform.vcd can be viewed on the fly using GTKWave
  - Drag waveform.vcd and drop it over the gtkwave.exe icon to open it
  - You can also open it from the Windows cmd prompt
- Use "File → Reload Waveform" in GTKWave to update the dump file
34
GHDL Simulation (6/6)
[Figure: simulation waveform annotated with the address phase, data phase, stage 1, stage 2, and probing of the control register.]
35
CO-DESIGNED SYSTEM ON FPGA
36
Build FPGA Bitstream (1/2)
- Type "make ise | tee ise_log" under grlib-gpl-1.0.19-b3188/designs/leon3-gr-xc3s-1500/ after you install the accelerator
  - It is strongly suggested that you verify the hardware with GHDL simulation first
  - It is also suggested that you take a look at ise_log for more information
- Configure your FPGA with leon3mp.bit after generating the bitstream
37
Build FPGA Bitstream (2/2)
- After entering GRMON, check the system configuration using "info sys"
  - You should see a device with "Unknown vendor" appear
38
Profiling Results
- Build the program with the -D_PROFILING_ flag on
- Compare the computation results of sw_idct_2d() and hw_idct_2d()
- Compare the computation results with and without the -D_VERBOSE_ flag