Download presentation
Presentation is loading. Please wait.
1
Introduction to Microcontrollers
Lecture 1 Introduction to Microcontrollers Dr. Konstantinos Tatas 1
2
Components of a microprocessor/controller
CPU: Central Processing Unit I/O: Input /Output Bus: Address bus & Data bus Memory: RAM & ROM Timer Interrupt Serial Port Parallel Port ACOE343 - Real-Time Embedded Processor Systems - Frederick University 2
3
General-purpose microprocessor:
CPU for Computers Commonly no RAM, ROM, I/O on CPU chip itself Many chips on motherboard Data Bus CPU General- Purpose Micro- processor Serial COM Port I/O Port Intel’s x86: 8086,8088,80386,80486, Pentium Motorola’s 680x0: 68000, 68010, 68020,68030,6040 RAM ROM Timer Address Bus ACOE343 - Real-Time Embedded Processor Systems - Frederick University 3 3
4
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
Microcontroller : A single-chip computer On-chip RAM, ROM, I/O ports... Example:Motorola’s 6811, Intel’s 8051, Zilog’s Z8 and PIC 16X CPU RAM ROM A single chip Serial COM Port I/O Port Timer Microcontroller ACOE343 - Real-Time Embedded Processor Systems - Frederick University 4
5
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
Microprocessor vs. Microcontroller Microcontroller CPU, RAM, ROM, I/O and timer are all on a single chip fixed amount of on-chip ROM, RAM, I/O ports for applications in which cost, power and space are critical single-purpose (control-oriented) Low processing power Low power consumption Bit-level operations Instruction sets focus on control and bit-level operations Typically 8/16 bit Typically single-cycle/two-stage pipeline Microprocessor CPU is stand-alone, RAM, ROM, I/O, timer are separate designer can decide on the amount of ROM, RAM and I/O ports. expensive versatility general-purpose High processing power High power consumption Instruction sets focus on processing-intensive operations Typically 32/64 – bit Typically deep pipeline (5-20 stages) versatility 多用途的: any number of applications for PC ACOE343 - Real-Time Embedded Processor Systems - Frederick University 5 5
6
Some Popular Microcontrollers…
8051 Microchip Technology PIC Atmel AVR Texas Instruments MSP430 (16-bit) ACOE343 - Real-Time Embedded Processor Systems - Frederick University 6
7
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
Review questions What are the main differences between a microprocessor and a microcontroller in terms of Architecture Applications Instruction set ACOE343 - Real-Time Embedded Processor Systems - Frederick University 7
8
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
Example A uP running at 600 MHz has an average CPI of 1.2 and a average power consumption of 400 mW, while a uC running at 12 MHz with a two cycle datapath has a power consumption of 24 mW. Calculate their respective MIPS Which one is more efficient in MIPS/mW? ACOE343 - Real-Time Embedded Processor Systems - Frederick University 8
9
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
Example 2 The previous uP costs 100$, while the respective uC costs 0.96 $ Which is more efficient in MIPS/$? ACOE343 - Real-Time Embedded Processor Systems - Frederick University 9
10
The 8051 Microcontroller architecture
Lecture 2 The 8051 Microcontroller architecture
11
Contents: Introduction Block Diagram and Pin Description of the 8051
Registers Some Simple Instructions Structure of Assembly language and Running an 8051 program Memory mapping in 8051 8051 Flag bits and the PSW register Addressing Modes 16-bit, BCD and Signed Arithmetic in 8051 Stack in the 8051 LOOP and JUMP Instructions CALL Instructions I/O Port Programming 11
12
Three criteria in Choosing a Microcontroller
meeting the computing needs of the task efficiently and cost effectively speed, the amount of ROM and RAM, the number of I/O ports and timers, size, packaging, power consumption easy to upgrade cost per unit availability of software development tools assemblers, debuggers, C compilers, emulator, simulator, technical support wide availability and reliable sources of the microcontrollers.
13
The 8051 microcontroller a Harvard architecture (separate instruction/data memories) single chip microcontroller (µC) developed by Intel in 1980 for use in embedded systems. today largely superseded by a vast range of faster and/or functionally enhanced compatible devices manufactured by more than 20 independent manufacturers
14
Block Diagram External interrupts On-chip ROM for program code
Timer/Counter Interrupt Control On-chip RAM Timer 1 Counter Inputs Timer 0 CPU Serial Port Bus Control 4 I/O Ports OSC P0 P1 P2 P3 TxD RxD Address/Data 14
15
Comparison of the 8051 Family Members
Feature ROM (program space in bytes) 4K K K RAM (bytes) Timers I/O pins Serial port Interrupt sources
17
Pin Description of the 8051 8051 (8031) 1 2 3 4 5 6 7 8 9 10 11 12
13 14 15 16 17 18 19 20 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 P1.0 P1.1 P1.2 P1.3 P1.4 P1.5 P1.6 P1.7 RST (RXD)P3.0 (TXD)P3.1 (T0)P3.4 (T1)P3.5 XTAL2 XTAL1 GND (INT0)P3.2 (INT1)P3.3 (RD)P3.7 (WR)P3.6 Vcc P0.0(AD0 ) P0.1(AD1) P0.2(AD2 ) P0.3(AD3) P0.4(AD4) P0.5(AD5) P0.6(AD6) P0.7(AD7) EA/VPP ALE/PROG PSEN P2.7(A15) P2.6(A14 ) P2.5(A13 ) P2.4(A12 ) P2.3(A11) P2.2(A10) P2.1(A9) P2.0(A8) 8051 (8031)
18
Vcc provides supply voltage to the chip. The voltage source is +5V.
Pins of 8051(1/4) Vcc(pin 40): Vcc provides supply voltage to the chip. The voltage source is +5V. GND(pin 20):ground XTAL1 and XTAL2(pins 19,18): These 2 pins provide external clock. Way 1:using a quartz crystal oscillator Way 2:using a TTL oscillator Example 4-1 shows the relationship between XTAL and the machine cycle.
19
Pins of 8051(2/4) RST(pin 9):reset
It is an input pin and is active high(normally low). The high pulse must be high at least 2 machine cycles. It is a power-on reset. Upon applying a high pulse to RST, the microcontroller will reset and all values in registers will be lost. Reset values of some 8051 registers Way 1:Power-on reset circuit Way 2:Power-on reset with debounce
20
Pins of 8051(3/4) /EA(pin 31):external access
There is no on-chip ROM in 8031 and The /EA pin is connected to GND to indicate the code is stored externally. /PSEN & ALE are used for external ROM. For 8051, /EA pin is connected to Vcc. “/” means active low. /PSEN(pin 29):program store enable This is an output pin and is connected to the OE pin of the ROM. See Chapter 14.
21
Pins of 8051(4/4) ALE(pin 30):address latch enable
It is an output pin and is active high. 8051 port 0 provides both address and data. The ALE pin is used for de-multiplexing the address and data by connecting to the G pin of the 74LS373 latch. I/O port pins The four ports P0, P1, P2, and P3. Each port uses 8 pins. All I/O pins are bi-directional.
22
Figure 4-2 (a). XTAL Connection to 8051
3 0 p F C 1 X T A L 2 X T A L 1 G N D Figure 4-2 (a). XTAL Connection to 8051 Using a quartz crystal oscillator We can observe the frequency on the XTAL2 pin.
23
Figure 4-2 (b). XTAL Connection to an External Clock Source
Using a TTL oscillator XTAL2 is unconnected. N C EXTERNAL OSCILLATOR SIGNAL XTAL2 XTAL1 GND
24
RESET Value of Some 8051 Registers:
PC 0000 ACC 0000 B 0000 PSW 0000 SP 0007 DPTR 0000 RAM are all zero.
25
Figure 4-3 (a). Power-On RESET Circuit
Vcc + 10 uF 31 EA/VPP X1 30 pF 19 MHz 8.2 K X2 18 30 pF RST 9
26
Figure 4-3 (b). Power-On RESET with Debounce
Vcc 31 EA/VPP X1 10 uF 30 pF X2 RST 9 8.2 K
27
Pins of I/O Port The 8051 has four I/O ports
Port 0 (pins 32-39):P0(P0.0~P0.7) Port 1(pins 1-8) :P1(P1.0~P1.7) Port 2(pins 21-28):P2(P2.0~P2.7) Port 3(pins 10-17):P3(P3.0~P3.7) Each port has 8 pins. Named P0.X (X=0,1,...,7), P1.X, P2.X, P3.X Ex:P0.0 is the bit 0(LSB)of P0 Ex:P0.7 is the bit 7(MSB)of P0 These 8 bits form a byte. Each port can be used as input or output (bi-direction). Program is to read data from P0 and then send data to P1 27
28
Some 8-bitt Registers of the 8051
A B R0 R1 R3 R4 R2 R5 R7 R6 DPH DPL PC DPTR Some bit Register Some 8-bitt Registers of the 8051
29
Memory Map (RAM)
30
CPU timing Most 8051 instructions are executed in one cycle.
MUL (multiply) and DIV (divide) are the only instructions that take more than two cycles to complete (four cycles) Normally two code bytes are fetched from the program memory during every machine cycle. The only exception to this is when a MOVX instruction is executed. MOVX is a one-byte, 2-cycle instruction that accesses external data memory. During a MOVX, the two fetches in the second cycle are skipped while the external data memory is being addressed and strobed.
31
8051 machine cycle
32
Example : Find the machine cycle for (a) XTAL = 11.0592 MHz
(b) XTAL = 16 MHz. Solution: (a) MHz / 12 = kHz; machine cycle = 1 / kHz = s (b) 16 MHz / 12 = MHz; machine cycle = 1 / MHz = 0.75 s
33
Edsim51 emulator diagram
34
KitCON-515 schematic
35
Timers 8051 has two 16-bit on-chip timers that can be used for timing durations or for counting external events The high byte for timer 1 (TH1) is at address 8DH while the low byte (TL1) is at 8BH The high byte for timer 0 (TH0) is at 8CH while the low byte (TL0) is at 8AH. Timer Mode Register (TMOD) is at address 88H
36
Timer Mode Register Bit 7: Gate bit; when set, timer only runs while \INT high. (T0) Bit 6: Counter/timer select bit; when set timer is an event counter when cleared timer is an interval timer (T0) Bit 5: Mode bit 1 (T0) Bit 4: Mode bit 0 (T0) Bit 3: Gate bit; when set, timer only runs while \INT high. (T1) Bit 2: Counter/timer select bit; when set timer is an event counter when cleared timer is an interval timer (T1) Bit 1: Mode bit 1 (T1) Bit 0: Mode bit 0 (T1)
37
Timer Modes M1-M0: 00 (Mode 0) – 13-bit mode (not commonly used)
M1-M0: 01 (Mode 1) - 16-bit timer mode M1-M0: 10 (Mode 2) - 8-bit auto-reload mode M1-M0: 11 (Mode 3) – Split timer mode
38
8051 Interrupt Vector Table
39
The Stack and Stack Pointer
The Stack Pointer, like all registers except DPTR and PC, may hold an 8-bit (1-byte) value. The Stack Pointer is used to indicate where the next value to be removed from the stack should be taken from. When you push a value onto the stack, the 8051 first increments the value of SP and then stores the value at the resulting memory location. When you pop a value off the stack, the 8051 returns the value from the memory location indicated by SP, and then decrements the value of SP. This order of operation is important. When the 8051 is initialized SP will be initialized to 07h. If you immediately push a value onto the stack, the value will be stored in Internal RAM address 08h. This makes sense taking into account what was mentioned two paragraphs above: First the 8051 will increment the value of SP (from 07h to 08h) and then will store the pushed value at that memory address (08h). SP is modified directly by the 8051 by six instructions: PUSH, POP, ACALL, LCALL, RET, and RETI. It is also used intrinsically whenever an interrupt is triggered
40
Programming the 8051 Microcontroller
Lecture 3 Programming the 8051 Microcontroller Dr. Konstantinos Tatas
41
A B R0 R1 R3 R4 R2 R5 R7 R6 DPH DPL PC DPTR Some bit Register Some 8-bit Registers of the 8051 A: Accumulator B: Used specially in MUL/DIV R0-R7: GPRs Registers ACOE343 - Embedded Real-Time Processor Systems - Frederick University 41
42
8051 Programming using Assembly
43
The MOV Instruction – Addressing Modes
MOV dest,source ; dest = source MOV A,#72H ;A=72H MOV A, #’r’ ;A=‘r’ OR 72H MOV R4,#62H ;R4=62H MOV B,0F9H ;B=the content of F9’th byte of RAM MOV DPTR,#7634H MOV DPL,#34H MOV DPH,#76H MOV P1,A ;mov A to port 1 Note 1: MOV A,#72H ≠ MOV A,72H After instruction “MOV A,72H ” the content of 72’th byte of RAM will replace in Accumulator. MOV AL,72H MOV A,#72H MOV AL,’r’ MOV A,#’r’ MOV BX,72H MOV AL,[BX] MOV A,72H Note 2: MOV A,R3 ≡ MOV A,3 ACOE343 - Embedded Real-Time Processor Systems - Frederick University 43
44
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Arithmetic Instructions ADD A, Source ;A=A+SOURCE ADD A,#6 ;A=A+6 ADD A,R6 ;A=A+R6 ADD A,6 ;A=A+[6] or A=A+R6 ADD A,0F3H ;A=A+[0F3H] ACOE343 - Embedded Real-Time Processor Systems - Frederick University 44
45
Set and Clear Instructions
SETB bit ; bit=1 CLR bit ; bit=0 SETB C ; CY=1 SETB P0.0 ;bit 0 from port 0 =1 SETB P3.7 ;bit 7 from port 3 =1 SETB ACC.2 ;bit 2 from ACCUMULATOR =1 SETB 05 ;set high D5 of RAM loc. 20h Note: CLR instruction is as same as SETB i.e: CLR C ;CY=0 But following instruction is only for CLR: CLR A ;A=0 ACOE343 - Embedded Real-Time Processor Systems - Frederick University 45
46
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
SUBB A,source ;A=A-source-CY SETB C ;CY=1 SUBB A,R5 ;A=A-R5-1 ADC A,source ;A=A+source+CY ADC A,R5 ;A=A+R5+1 ACOE343 - Embedded Real-Time Processor Systems - Frederick University 46
47
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
DEC byte ;byte=byte-1 INC byte ;byte=byte+1 INC R7 DEC A DEC 40H ; [40]=[40]-1 CPL A ;1’s complement Example: MOV A,#55H ;A= B L01: CPL A MOV P1,A ACALL DELAY SJMP L01 NOP & RET & RETI All are like 8086 instructions. CALL ACOE343 - Embedded Real-Time Processor Systems - Frederick University 47
48
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Logic Instructions ANL byte/bit ORL byte/bit XRL byte EXAMPLE: MOV R5,#89H ANL R5,#08H ACOE343 - Embedded Real-Time Processor Systems - Frederick University 48
49
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Rotate Instructions RR A Accumulator rotate right RL A Accumulator Rotate left RRC A Accumulator Rotate right through the carry. RLC A Accumulator Rotate left through the carry. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 49
50
Structure of Assembly language and Running an 8051 program
EDITOR PROGRAM ASSEMBLER LINKER OH Myfile.asm Myfile.obj Other obj file Myfile.lst Myfile.abs Myfile.hex ORG 0H MOV R5,#25H MOV R7,#34H MOV A,#0 ADD A,R5 ADD A,#12H HERE: SJMP HERE END ACOE343 - Embedded Real-Time Processor Systems - Frederick University 50
51
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Memory mapping in 8051 ROM memory map in 8051 family 0000H 0FFFH 1FFFH 7FFFH 8751 AT89C51 8752 AT89C52 4k 8k 32k DS from Atmel Corporation from Dallas Semiconductor ACOE343 - Embedded Real-Time Processor Systems - Frederick University 51
52
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
RAM memory space allocation in the 8051 7FH 30H 2FH 20H 1FH 17H 10H 0FH 07H 08H 18H 00H Register Bank 0 (Stack) Register Bank 1 Register Bank 2 Register Bank 3 Bit-Addressable RAM Scratch pad RAM ACOE343 - Embedded Real-Time Processor Systems - Frederick University 52
53
8051 Flag bits and the PSW register
CY AC F0 RS1 OV RS0 P -- CY PSW.7 Carry flag AC PSW.6 Auxiliary carry flag -- PSW.5 Available to the user for general purpose RS1 PSW.4 Register Bank selector bit 1 RS0 PSW.3 Register Bank selector bit 0 OV PSW.2 Overflow flag -- PSW.1 User define bit P PSW.0 Parity flag Set/Reset odd/even parity RS1 RS0 Register Bank Address H-07H H-0FH H-17H H-1FH ACOE343 - Embedded Real-Time Processor Systems - Frederick University 53
54
Instructions that Affect Flag Bits:
Note: X can be 0 or 1 54 ACOE343 - Embedded Real-Time Processor Systems - Frederick University
55
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example: MOV A,#88H ADD A,#93H 11B CY=1 AC=0 P=0 Example: MOV A,#9CH ADD A,#64H 9C CY=1 AC=1 P=0 Example: MOV A,#38H ADD A,#2FH +2F CY=0 AC=1 P=1 ACOE343 - Embedded Real-Time Processor Systems - Frederick University 55
56
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Addressing Modes Immediate Register Direct Register Indirect Indexed ACOE343 - Embedded Real-Time Processor Systems - Frederick University 56
57
Immediate Addressing Mode
MOV A,#65H MOV A,#’A’ MOV R6,#65H MOV DPTR,#2343H MOV P1,#65H Example : Num EQU 30 … MOV R0,Num MOV DPTR,#data1 ORG 100H data1: db “Example” ACOE343 - Embedded Real-Time Processor Systems - Frederick University 57
58
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example Write the decimal value 4 on the SSD in the following figure. Switch the decimal point off. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 58
59
Register Addressing Mode
MOV Rn, A ;n=0,..,7 ADD A, Rn MOV DPL, R6 MOV DPTR, A MOV Rm, Rn ACOE343 - Embedded Real-Time Processor Systems - Frederick University 59
60
Direct Addressing Mode
Although the entire of 128 bytes of RAM can be accessed using direct addressing mode, it is most often used to access RAM loc. 30 – 7FH. MOV R0, 40H MOV 56H, A MOV A, 4 ; ≡ MOV A, R4 MOV 6, 2 ; copy R2 to R6 ; MOV R6,R2 is invalid ! SFR register and their address MOV 0E0H, #66H ; ≡ MOV A,#66H MOV 0F0H, R2 ; ≡ MOV B, R2 MOV 80H,A ; ≡ MOV P1,A ACOE343 - Embedded Real-Time Processor Systems - Frederick University 60
61
Register Indirect Addressing Mode
In this mode, register is used as a pointer to the data. MOV ; move content of RAM loc.Where address is held by Ri into A ( i=0 or 1 ) MOV @R1,B In other word, the content of register R0 or R1 is sources or target in MOV, ADD and SUBB insructions. Example: Write a program to copy a block of 10 bytes from RAM location sterting at 37h to RAM location starting at 59h. Solution: MOV R0,37h ; source pointer MOV R1,59h ; dest pointer MOV R2,10 ; counter L1: MOV MOV @R1,A INC R0 INC R1 DJNZ R2,L1 jump ACOE343 - Embedded Real-Time Processor Systems - Frederick University 61
62
Indexed Addressing Mode And On-Chip ROM Access
This mode is widely used in accessing data elements of look-up table entries located in the program (code) space ROM at the 8051 MOVC A= content of address A +DPTR from ROM Note: Because the data elements are stored in the program (code ) space ROM of the 8051, it uses the instruction MOVC instead of MOV. The “C” means code. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 62
63
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example: Assuming that ROM space starting at 250h contains “Hello.”, write a program to transfer the bytes into RAM locations starting at 40h. Solution: ORG 0 MOV DPTR,#MYDATA MOV R0,#40H L1: CLR A MOVC JZ L2 MOV @R0,A INC DPTR INC R0 SJMP L1 L2: SJMP L2 ; ORG 250H MYDATA: DB “Hello”,0 END Notice the NULL character ,0, as end of string and how we use the JZ instruction to detect that. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 63
64
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example: Write a program to get the x value from P1 and send x2 to P2, continuously . Solution: ORG 0 ;code segment MOV DPTR, #TAB1 ;moving data segment to data pointer MOV A,#0FFH ;configuring P1 as input port MOV P1,A L01: MOV A,P1 ;reading value from P1 MOVC MOV P2,A SJMP L01 ; ORG 300H ;data segment TAB1: DB 0,1,4,9,16,25,36,49,64,81 END ACOE343 - Embedded Real-Time Processor Systems - Frederick University 64
65
External Memory Addressing
MOVX ; A [R1] (in external memory) MOVX A ACOE343 - Embedded Real-Time Processor Systems - Frederick University 65
66
16-bit, BCD and Signed Arithmetic in 8051
Exercise: Write a program to add n 16-bit number. Get n from port 1. And sent Sum to SSD a) in hex b) in decimal Write a program to subtract P1 from P0 and send result to LCD (Assume that “ACAL DISP” display A to SSD ) ACOE343 - Embedded Real-Time Processor Systems - Frederick University 66
67
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
MUL & DIV MUL AB ;B|A = A*B MOV A,#25H MOV B,#65H MUL AB ;25H*65H=0E99 ;B=0EH, A=99H MUL AB ;A = A/B, B = A mod B MOV A,#25 MOV B,#10 MUL AB ;A=2, B=5 ACOE343 - Embedded Real-Time Processor Systems - Frederick University 67
68
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Stack in the 8051 The register used to access the stack is called SP (stack pointer) register. The stack pointer in the is only 8 bits wide, which means that it can take value 00 to FFH. When powered up, the SP register contains value 07. 7FH 30H 2FH 20H 1FH 17H 10H 0FH 07H 08H 18H 00H Register Bank 0 (Stack) Register Bank 1 Register Bank 2 Register Bank 3 Bit-Addressable RAM Scratch pad RAM ACOE343 - Embedded Real-Time Processor Systems - Frederick University 68
69
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example: MOV R6,#25H MOV R1,#12H MOV R4,#0F3H PUSH 6 PUSH 1 PUSH 4 0BH 0AH 09H 08H Start SP=07H 25 0BH 0AH 09H 08H SP=08H 12 25 0BH 0AH 09H 08H SP=09H F3 12 25 0BH 0AH 09H 08H SP=10H ACOE343 - Embedded Real-Time Processor Systems - Frederick University 69
70
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example (cont.) POP 4 POP 1 POP 6 F3 12 25 0BH 0AH 09H 08H SP=10H 12 25 0BH 0AH 09H 08H SP=09H 25 0BH 0AH 09H 08H SP=08H 0BH 0AH 09H 08H Start SP=07H ACOE343 - Embedded Real-Time Processor Systems - Frederick University 70
71
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
How to use the stack You can use the stack as temporary storage for variables when calling functions RLC A ;you can only rotate A Call function DIV AB ; A has the wrong value!!!!! … function: MOV A, #5 ;values are for example sake MOV B, #10 MUL AB ;you can only multiply on A RET ACOE343 - Embedded Real-Time Processor Systems - Frederick University 71
72
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example (correct) RLC A ;you can only rotate A PUSH A ;saving A and B on the stack before PUSH B ;calling function Call function POP B ;restoring B POP A ;and A (POP in reverse order) DIV AB ; A has the wrong value!!!!! … function: MOV A, #5 ;values are for example sake MOV B, #10 MUL AB ;you can only multiply on A RET ACOE343 - Embedded Real-Time Processor Systems - Frederick University 72
73
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Saving PSW The Program Status Word registers contains flags that are often important for correct program flow You can push PSW on the stack before calling a function ADD A, R0 PUSH PSW PUSH A ;saving A and R0 on the stack before PUSH R0 ;calling function Call function POP R0 ;restoring R0 POP A ;and A (POP in reverse order) POP PSW JC loop ;If this means the carry from the ;function then don’t push PSW … function: MOV A, #5 ;values are for example sake ADD A, R2 ;the flags are set according to ADD result RET ACOE343 - Embedded Real-Time Processor Systems - Frederick University 73
74
LOOP and JUMP Instructions
DJNZ: Write a program to clear ACC, then add 3 to the accumulator ten times Solution: MOV A,#0; MOV R2,#10 AGAIN: ADD A,#03 DJNZ R2,AGAING ;repeat until R2=0 (10 times) MOV R5,A ACOE343 - Embedded Real-Time Processor Systems - Frederick University 74
75
Other conditional jumps :
JZ Jump if A=0 JNZ Jump if A/=0 DJNZ Decrement and jump if A/=0 CJNE A,byte Jump if A/=byte CJNE reg,#data Jump if byte/=#data JC Jump if CY=1 JNC Jump if CY=0 JB Jump if bit=1 JNB Jump if bit=0 JBC Jump if bit=1 and clear bit 75 ACOE343 - Embedded Real-Time Processor Systems - Frederick University
76
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
SJMP and LJMP: LJMP(long jump) LJMP is an unconditional jump. It is a 3-byte instruction in which the first byte is the opcode, and the second and third bytes represent the 16-bit address of the target location. The 20byte target address allows a jump to any memory location from 0000 to FFFFH. SJMP(short jump) In this 2-byte instruction. The first byte is the opcode and the second byte is the relative address of the target location. The relative address range of 00-FFH is divided into forward and backward jumps, that is , within -128 to +127 bytes of memory relative to the address of the current PC. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 76
77
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
CJNE , JNC Exercise: Write a program that compare R0,R1. If R0>R1 then send 1 to port 2, else if R0<R1 then send 0FFh to port 2, else send 0 to port 2. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 77
78
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
CALL Instructions Another control transfer instruction is the CALL instruction, which is used to call a subroutine. LCALL(long call) In this 3-byte instruction, the first byte is the opcode an the second and third bytes are used for the address of target subroutine. Therefore, LCALL can be used to call subroutines located anywhere within the 64K byte address space of the 8051. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 78
79
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
ACALL (absolute call) ACALL is 2-byte instruction in contrast to LCALL, which is 13 bytes. Since ACALL is a 2-byte instruction, the target address of the subroutine must be within 2K bytes address because only 11 bits of the 2 bytes are used for the address. There is no difference between ACALL and LCALL in terms of saving the program counter on the stack or the function of the RET instruction. The only difference is that the target address for LCALL can be anywhere within the 64K byte address space of the while the target address of ACALL must be within a 2K- byte range. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 79
80
Example A B R5 R7 Address Data ORG 0H VAL1 EQU 05H MOV R5,#25H
LOOP: MOV R7,#VAL1 MOV A,#0 ADD A,R5 ADD A,#12H RRC A DJNZ A, LOOP SETB ACC.3 CLR A CJNE A, #0, LOOP HERE: SJMP HERE END 80 ACOE343 - Embedded Real-Time Processor Systems - Frederick University
81
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
I/O Port Programming Port 1(pins 1-8) Port 1 is denoted by P1. P1.0 ~ P1.7 We use P1 as examples to show the operations on ports. P1 as an output port (i.e., write CPU data to the external pin) P1 as an input port (i.e., read pin data into CPU bus) ACOE343 - Embedded Real-Time Processor Systems - Frederick University 81
82
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
A Pin of Port 1 D Q Clk Q Vcc Load(L1) Read latch Read pin Write to latch Internal CPU bus M1 P1.X pin P1.X TB1 TB2 P0.x 8051 IC ACOE343 - Embedded Real-Time Processor Systems - Frederick University 82
83
Hardware Structure of I/O Pin
Each pin of I/O ports Internal CPU bus:communicate with CPU A D latch store the value of this pin D latch is controlled by “Write to latch” Write to latch=1:write data into the D latch 2 Tri-state buffer: TB1: controlled by “Read pin” Read pin=1:really read the data present at the pin TB2: controlled by “Read latch” Read latch=1:read value from internal latch A transistor M1 gate Gate=0: open Gate=1: close ACOE343 - Embedded Real-Time Processor Systems - Frederick University 83
84
Tri-state Buffer Output Input Tri-state control (active high) L L H
Low Highimpedance (open-circuit) H H ACOE343 - Embedded Real-Time Processor Systems - Frederick University 84
85
Writing “1” to Output Pin P1.X
D Q Clk Q Vcc Load(L1) Read latch Read pin Write to latch Internal CPU bus M1 P1.X pin P1.X TB2 2. output pin is Vcc 1. write a 1 to the pin 1 output 1 TB1 8051 IC ACOE343 - Embedded Real-Time Processor Systems - Frederick University 85
86
Writing “0” to Output Pin P1.X
D Q Clk Q Vcc Load(L1) Read latch Read pin Write to latch Internal CPU bus M1 P1.X pin P1.X TB2 2. output pin is ground 1. write a 0 to the pin output 0 1 TB1 8051 IC ACOE343 - Embedded Real-Time Processor Systems - Frederick University 86
87
Port 1 as Output(Write to a Port)
Send data to Port 1: MOV A,#55H BACK: MOV P1,A ACALL DELAY CPL A SJMP BACK Let P1 toggle. You can write to P1 directly. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 87 87
88
Reading Input v.s. Port Latch
When reading ports, there are two possibilities: Read the status of the input pin. (from external pin value) MOV A, PX JNB P2.1, TARGET ; jump if P2.1 is not set JB P2.1, TARGET ; jump if P2.1 is set Figures C-11, C-12 Read the internal latch of the output port. ANL P1, A ; P1 ← P1 AND A ORL P1, A ; P1 ← P1 OR A INC P ; increase P1 Figure C-17 Table C-6 Read-Modify-Write Instruction (or Table 8-5) See Section 8.3 ACOE343 - Embedded Real-Time Processor Systems - Frederick University 88
89
Reading “High” at Input Pin
D Q Clk Q Vcc Load(L1) Read latch Read pin Write to latch Internal CPU bus M1 P1.X pin P1.X 2. MOV A,P1 external pin=High TB2 write a 1 to the pin MOV P1,#0FFH 1 1 TB1 3. Read pin=1 Read latch=0 Write to latch=1 8051 IC ACOE343 - Embedded Real-Time Processor Systems - Frederick University 89
90
Reading “Low” at Input Pin
D Q Clk Q Vcc Load(L1) Read latch Read pin Write to latch Internal CPU bus M1 P1.X pin P1.X 2. MOV A,P1 external pin=Low TB2 write a 1 to the pin MOV P1,#0FFH 1 TB1 3. Read pin=1 Read latch=0 Write to latch=1 8051 IC ACOE343 - Embedded Real-Time Processor Systems - Frederick University 90
91
Port 1 as Input(Read from Port)
In order to make P1 an input, the port must be programmed by writing 1 to all the bit. MOV A,#0FFH ;A= B MOV P1,A ;make P1 an input port BACK: MOV A,P ;get data from P0 MOV P2,A ;send data to P2 SJMP BACK To be an input port, P0, P1, P2 and P3 have similar methods. Program is to read data from P0 and then send data to P1 ACOE343 - Embedded Real-Time Processor Systems - Frederick University 91 91
92
Instructions For Reading an Input Port
Following are instructions for reading external pins of ports: Mnemonics Examples Description MOV A,PX MOV A,P2 Bring into A the data at P2 pins JNB PX.Y,.. JNB P2.1,TARGET Jump if pin P2.1 is low JB PX.Y,.. JB P1.3,TARGET Jump if pin P1.3 is high MOV C,PX.Y MOV C,P2.4 Copy status of pin P2.4 to CY 92 ACOE343 - Embedded Real-Time Processor Systems - Frederick University
93
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Reading Latch Exclusive-or the Port 1: MOV P1,#55H ;P1= ORL P1,#0F0H ;P1= 1. The read latch activates TB2 and bring the data from the Q latch into CPU. Read P1.0=0 2. CPU performs an operation. This data is ORed with bit 1 of register A. Get 1. 3. The latch is modified. D latch of P1.0 has value 1. 4. The result is written to the external pin. External pin (pin 1: P1.0) has value 1. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 93
94
Reading the Latch D Q Clk Q
1. Read pin=0 Read latch=1 Write to latch=0 (Assume P1.X=0 initially) D Q Clk Q Vcc Load(L1) Read latch Read pin Write to latch Internal CPU bus M1 P1.X pin P1.X TB2 2. CPU compute P1.X OR 1 4. P1.X=1 1 1 3. write result to latch Read pin= Read latch= Write to latch=1 TB1 8051 IC ACOE343 - Embedded Real-Time Processor Systems - Frederick University 94
95
Read-modify-write Feature
Read-modify-write Instructions Table C-6 This features combines 3 actions in a single instruction: 1. CPU reads the latch of the port 2. CPU perform the operation 3. Modifying the latch 4. Writing to the pin Note that 8 pins of P1 work independently. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 95
96
Port 1 as Input(Read from latch)
Exclusive-or the Port 1: MOV P1,#55H ;P1= AGAIN: XOR P1,#0FFH ;complement ACALL DELAY SJMP AGAIN Note that the XOR of 55H and FFH gives AAH. XOR of AAH and FFH gives 55H. The instruction read the data in the latch (not from the pin). The instruction result will put into the latch and the pin. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 96 96
97
Read-Modify-Write Instructions
Mnemonics Example SETB P1.4 SETB PX.Y CLR P1.3 CLR PX.Y MOV P1.2,C MOV PX.Y,C DJNZ P1,TARGET DJNZ PX, TARGET INC P1 INC CPL P1.2 CPL JBC P1.1, TARGET JBC PX.Y, TARGET XRL P1,A XRL ORL P1,A ORL ANL P1,A ANL DEC P1 DEC ANL: Latch data AND with A , then save back to latch and write to the external pin ORL: OR XRL: XOR JBC: jump to TARGET if bit set and clear bit CPL: complement INC: increase DEC: decrease DJNZ: decrease P1 and jump if P1 not zero MOV the latch value to carry CLR: clear bit, SETB: set bit ACOE343 - Embedded Real-Time Processor Systems - Frederick University 97 97
98
You are able to answer this Questions:
How to write the data to a pin? How to read the data from the pin? Read the value present at the external pin. Why we need to set the pin first? Read the value come from the latch(not from the external pin). Why the instruction is called read-modify write? ACOE343 - Embedded Real-Time Processor Systems - Frederick University 98
99
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Other Pins P1, P2, and P3 have internal pull-up resisters. P1, P2, and P3 are not open drain. P0 has no internal pull-up resistors and does not connects to Vcc inside the 8051. P0 is open drain. Compare the figures of P1.X and P0.X. However, for a programmer, it is the same to program P0, P1, P2 and P3. All the ports upon RESET are configured as output. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 99
100
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
A Pin of Port 0 D Q Clk Q Read latch Read pin Write to latch Internal CPU bus M1 P0.X pin P1.X TB1 TB2 P1.x 8051 IC ACOE343 - Embedded Real-Time Processor Systems - Frederick University 100
101
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Port 0(pins 32-39) P0 is an open drain. Open drain is a term used for MOS chips in the same way that open collector is used for TTL chips. When P0 is used for simple data I/O we must connect it to external pull-up resistors. Each pin of P0 must be connected externally to a 10K ohm pull-up resistor. With external pull-up resistors connected upon reset, port 0 is configured as an output port. Open drain is a term used for MOS chips in the same way that open collector is used for TTL chips. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 101 101
102
Port 0 with Pull-Up Resistors
DS5000 8751 8951 Vcc 10 K Port 0 ACOE343 - Embedded Real-Time Processor Systems - Frederick University 102
103
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Dual Role of Port 0 When connecting an 8051/8031 to an external memory, the uses ports to send addresses and read instructions. 8031 is capable of accessing 64K bytes of external memory. 16-bit address:P0 provides both address A0-A7, P2 provides address A8-A15. Also, P0 provides data lines D0-D7. When P0 is used for address/data multiplexing, it is connected to the 74LS373 to latch the address. There is no need for external pull-up resistors as shown in Chapter 14. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 103
104
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
74LS373 D 74LS373 ALE P0.0 P0.7 PSEN A0 A7 D0 D7 P2.0 P2.7 A8 A15 OE OC EA G 8051 ROM ACOE343 - Embedded Real-Time Processor Systems - Frederick University 104
105
Reading ROM (1/2) latches the address and send to ROM 1. Send address to ROM D 74LS373 ALE P0.0 P0.7 PSEN A0 A7 D0 D7 P2.0 P2.7 A8 A12 OE OC EA G 8051 ROM Address ACOE343 - Embedded Real-Time Processor Systems - Frederick University 105
106
Reading ROM (2/2) latches the address and send to ROM D 74LS373 ALE P0.0 P0.7 PSEN A0 A7 D0 D7 P2.0 P2.7 A8 A12 OE OC EA G 8051 ROM Address 3. ROM send the instruction back ACOE343 - Embedded Real-Time Processor Systems - Frederick University 106
107
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
ALE Pin The ALE pin is used for de-multiplexing the address and data by connecting to the G pin of the 74LS373 latch. When ALE=0, P0 provides data D0-D7. When ALE=1, P0 provides address A0-A7. The reason is to allow P0 to multiplex address and data. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 107
108
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Port 2(pins 21-28) Port 2 does not need any pull-up resistors since it already has pull-up resistors internally. In an 8031-based system, P2 are used to provide address A8-A15. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 108
109
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Port 3(pins 10-17) Port 3 does not need any pull-up resistors since it already has pull-up resistors internally. Although port 3 is configured as an output port upon reset, this is not the way it is most commonly used. Port 3 has the additional function of providing signals. Serial communications signal:RxD, TxD(Chapter 10) External interrupt:/INT0, /INT1(Chapter 11) Timer/counter:T0, T1(Chapter 9) External memory accesses in 8031-based system:/WR, /RD(Chapter 14) ACOE343 - Embedded Real-Time Processor Systems - Frederick University 109 109
110
Port 3 Alternate Functions
17 RD P3.7 16 WR P3.6 15 T1 P3.5 14 T0 P3.4 13 INT1 P3.3 12 INT0 P3.2 11 TxD P3.1 10 RxD P3.0 Pin Function P3 Bit ACOE343 - Embedded Real-Time Processor Systems - Frederick University 110
111
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Generating Delays You can generate short delays using a register and incrementing or decrementing its value Example: mov r1, #0ah loop: djnz r1, loop How much delay is that? Djnz is a 2-byte instruction it takes two machine cycles One machine cycle is 1/12 of the system clock period For a 12 MHz system clock that is: Machine cycle = 12/12 = 1 MHz Machine period = 1/(1 MHz) = 10^(-6) s = 1 μs Loop time = 10*2*1 μs = 20 μs ACOE343 - Embedded Real-Time Processor Systems - Frederick University 111
112
Generating longer delays
Each register is 8 bits long, so it can increment 256 times before overflowing For larger delays, or when interrupts are required 8051 uses two timers ACOE343 - Embedded Real-Time Processor Systems - Frederick University 112
113
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
113
114
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
TMOD Register: Gate : When set, timer only runs while INT(0,1) is high. C/T : Counter/Timer select bit. M1 : Mode bit 1. M0 : Mode bit 0. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 114
115
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
TCON Register: TF1: Timer 1 overflow flag. TR1: Timer 1 run control bit. TF0: Timer 0 overflag. TR0: Timer 0 run control bit. IE1: External interrupt 1 edge flag. IT1: External interrupt 1 type flag. IE0: External interrupt 0 edge flag. IT0: External interrupt 0 type flag. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 115
116
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Timer Mode Register Bit 7: Gate bit; when set, timer only runs while \INT high. (T0) Bit 6: Counter/timer select bit; when set timer is an event counter when cleared timer is an interval timer (T0) Bit 5: Mode bit 1 (T0) Bit 4: Mode bit 0 (T0) Bit 3: Gate bit; when set, timer only runs while \INT high. (T1) Bit 2: Counter/timer select bit; when set timer is an event counter when cleared timer is an interval timer (T1) Bit 1: Mode bit 1 (T1) Bit 0: Mode bit 0 (T1) ACOE343 - Embedded Real-Time Processor Systems - Frederick University 116
117
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Timer Modes M1-M0: 00 (Mode 0) – 13-bit mode (not commonly used) M1-M0: 01 (Mode 1) - 16-bit timer mode M1-M0: 10 (Mode 2) - 8-bit auto-reload mode M1-M0: 11 (Mode 3) – Split timer mode ACOE343 - Embedded Real-Time Processor Systems - Frederick University 117
118
Timer Control Register (TCON)
Bit 7 (TF1) 8FH : Timer 1 overflow flag; set by hardware upon overflow, cleared by software Bit 6 (TR1) 8EH: Timer 1 run-control bit; manipulated by software - setting starts timer 1, resetting stops timer 1 Bit 5 (TF0) 8DH: Timer 0 overflow flag; set by hardware upon overflow, cleared by software. Bit 4 (TR0) 8CH: Timer 0 run-control bit; manipulated by software - setting starts timer 0, resetting stops timer 0 Bit 3 (IE1) 8BH: External 1 Interrupt flag bit Bit 2 (IT1) 8AH: Bit 1 (IE0) 89H: External 0 Interrupt flag bit Bit 0 (IT0) 88H: ACOE343 - Embedded Real-Time Processor Systems - Frederick University 118
119
Initializing and stopping timers
MOV TMOD, #16H ;initialization SETB TR0 ;starting timers SETB TR1 CLR TR0 ; stop timer 0 CLR TR1 ; stop timer 1 MOV R7, TH0 ; reading timers MOV R6, TL0 ACOE343 - Embedded Real-Time Processor Systems - Frederick University 119
120
Reading timers on the fly
Reading the Timers on the Fly To read the contents of a timer while it is running (ie; on the fly) poses a problem. MOV A, TH0 ;Let TH0 = 07 and TL0 = FF MOV R6, TL0 ;Now TH0=08 and TL0=00 but A=07 and R6=00! The solution is to read the high byte, then read the low byte, then read the high byte again. If the two readings of the low-byte are not the same repeat the procedure. The code for this method is detailed below. tryAgain:MOV A, TH0 MOV R6, TL0 CJNE A, TH0, tryAgain if the first reading of the hi-byte (in A) is not equal to current reading in the hi-byte (TH0) try againMOV R7, A; if both readings of the hi- byte are the same move the first reading into R7 - the overall reading is now in R7R6 ACOE343 - Embedded Real-Time Processor Systems - Frederick University 120
121
Generating delays using the timers
To generate a 50 ms (or 50,000 us) delay we start the timer counting from 15,536. Then, 50,000 steps later it will overflow. Since each step is 1 us (the timer's clock is 1/12 the system frequency) the delay is 50,000 us. 0 MOV TMOD, #10H; set up timer 1 as 16-bit interval timer CLR TR1 ; stop timer 1 (in case it was started in some other subroutine) MOV TH1, #3CH MOV TL1, #0B0H ; load 15,536 (3CB0H) into timer 1 SETB TR1 ; start timer 1 JNB TF1, $; repeat this line while timer 1 overflow flag is not set CLR TF1; timer 1 overflow flag is set by hardware on transition from FFFFH - the flag must be reset by software CLR TR1 ; stop timer 1 ACOE343 - Embedded Real-Time Processor Systems - Frederick University 121
122
Generating long delays
If the microcontroller has a system clock frequency of 12 MHz then the longest delay we can get from either of the timers is 65,536 usec. To generate delays longer than this, we need to write a subroutine to generate a delay of (for example) 50 ms and then call that subroutine a specific number of times. ... MOV TMOD, #10H; set up timer 1 as 16-bit interval timer fiftyMsDelay:CLR TR1 ; stop timer 1 (in case it was started in some other subroutine) MOV TH1, #3CH MOV TL1, #0B0H ; load 15,536 (3CB0H) into timer 1 SETB TR1 ; start timer 1 JNB TF1, $; repeat this line while timer 1 overflow flag is not set CLR TF1; timer 1 overflow flag is set by hardware on transition from FFFFH - the flag must be reset by software CLR TR1 ; stop timer 1 RET oneSecDelay:PUSH PSW PUSH AR0 ; save processor status MOV R0, #20 ; move 20 (in decimal) into R0 loop:CALL fiftyMsDelay ; call the 50 ms delay DJNZ R0, loop ; 20 times - resulting in a 1 second delay POP AR0 POP PSW ; retrieve processor status RET ACOE343 - Embedded Real-Time Processor Systems - Frederick University 122
123
Using timers to measure execution time
Timers are often used to measure the execution time of a program ORG 0H MOV TMOD, #16H ;initialization SETB TR0 ;starting timer 0 … ;main … ;program CLR TR0 ; stop timer 0 MOV R7, TH0 ; reading timer 0 MOV R6, TL0 ACOE343 - Embedded Real-Time Processor Systems - Frederick University 123
124
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Interrupt : ACOE343 - Embedded Real-Time Processor Systems - Frederick University 124
125
Interrupt Enable Register :
EA : Global enable/disable. : Undefined. ET2 :Enable Timer 2 interrupt. ES :Enable Serial port interrupt. ET1 :Enable Timer 1 interrupt. EX1 :Enable External 1 interrupt. ET0 : Enable Timer 0 interrupt. EX0 : Enable External 0 interrupt. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 125
126
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Interrupt handling 8051 Interrupt Vector Table ACOE343 - Embedded Real-Time Processor Systems - Frederick University 126
127
Interrupt Service Routines
ORG 0 JMP main ORG 0003H ; external interrupt 0 vector …. ; interrupt handler code for external interrupt 0 RETI ORG 0013H ; external interrupt 1 vector …. ;interrupt handler code for external interrupt 1 RETI ORG 0030H ; main program main:SETB IT0 ; set external interrupt 0 as edge activated SETB IT1 ; set external interrupt 1 as edge activated SETB EX0 ; enable external interrupt 0 SETB EX1 ; enable external interrupt 1 SETB EA ; global interrupt enable … ACOE343 - Embedded Real-Time Processor Systems - Frederick University 127
128
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Examples Write a 8051 assembly program that matches 8 switches with 8 LEDs Write a 8051 assembly program that uses a two-digit SSD to display the temperature as input from an ADC. Assume that the 0- 5V range corresponds to 0-50 °C. The ADC uses RD, WR and INT pins. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 128
129
8051 Programming Using C
130
Programming microcontrollers using high-level languages
Most programs can be written exclusively using high-level code like ANSI C Extensions To achieve low-level (Assembly) efficiency, extensions to high-level languages are required Restrictions Depending on the compiler, some restrictions to the high-level language may apply ACOE343 - Embedded Real-Time Processor Systems - Frederick University 130
131
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Keil C keywords data/idata: Description: The variable will be stored in internal data memory of controller. example: unsigned char data x; //or unsigned char idata y; bdata: Description: The variable will be stored in bit addressable memory of controller. example: unsigned char bdata x; //each bit of the variable x can be accessed as follows x ^ 1 = 1; //1st bit of variable x is set x ^ 0 = 0; //0th bit of variable x is cleared xdata: Description: The variable will be stored in external RAM memory of controller. example: unsigned char xdata x; ACOE343 - Embedded Real-Time Processor Systems - Frederick University 131
132
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Keil C keywords code: Description: This keyword is used to store a constant variable in code and not data memory. example: unsigned char code str="this is a constant string"; _at_: Description: This keyword is used to store a variable on a defined location in ram. example: CODE: unsigned char idata x _at_ 0x30; // variable x will be stored at location 0x30 // in internal data memory sbit: Description: This keyword is used to define a special bit from SFR (special function register) memory. example: sbit Port0_0 = 0x80; // Special bit with name Port0_0 is defined at address 0x80 ACOE343 - Embedded Real-Time Processor Systems - Frederick University 132
133
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Keil C keywords sfr: Description: sfr is used to define an 8-bit special function register from sfr memory. example: sfr Port1 = 0x90; // Special function register with name Port1 defined at addrress 0x90 sfr16: Description: This keyword is used to define a two sequential 8-bit registers in SFR memory. example: sfr16 DPTR = 0x82; // 16-bit special function register starting at 0x82 // DPL at 0x82, DPH at 0x83 using: Description: This keyword is used to define register bank for a function. User can specify register bank 0 to example: void function () using 2{ // code } // Funtion named "function" uses register bank 2 while executing its code Interrupt: Description: defines interrupt service routine void External_Int0() interrupt 0{ //code } ACOE343 - Embedded Real-Time Processor Systems - Frederick University 133
134
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Pointers //Generic Pointer char * idata ptr; //character pointer stored in data memory int * xdata ptr1; //Integer pointer stored in external data memory //Memory Specific pointer char idata * xdata ptr2; //Pointer to character stored in Internal Data memory //and pointer is going to be stored in External data memory int xdata * data ptr3; //Pointer to character stored in External Data memory //and pointer is going to be stored in data memory ACOE343 - Embedded Real-Time Processor Systems - Frederick University 134
135
Writing hardware-specific code
#include <REGx51.h> //header file for 89C51 void main(){ //main function starts unsigned int i; //Initializing Port1 pin1 P1_1 = 0; //Make Pin1 o/p while(1){ //Infinite loop main application //comes here for(i=0;i<1000;i++) ; //delay loop P1_1 = ~P1_1; //complement Port1.1 //this will blink LED connected on Port1.1 } } ACOE343 - Embedded Real-Time Processor Systems - Frederick University 135
136
C and Assembly together
extern unsigned long add(unsigned long, unsigned long); void main(){ unsigned long a; a = add(10,30); //calling Assembly function while(1); } ACOE343 - Embedded Real-Time Processor Systems - Frederick University 136
137
C and Assembly together
name asm_test ?PR?_add?asm_test segment code ?DT?_add?asm_test segment data ;let other function use this data space for passing variables public ?_add?BYTE ;make function public or accessible to everyone public _add ;define the data segment for function add rseg ?DT?_add?asm_test ?_add?BYTE: parm1: DS 4 ;First Parameter parm2: ds 4 ;Second Parameter ;either you can use parm1 for reading passed value as shown below ;or directly use registers used to pass the value. rseg ?PR?_add?asm_test _add: ;reading first argument mov parm1+3,r7 mov parm1+2,r6 mov parm1+1,r5 mov parm1,r4 ;param2 is stored in fixed location given by param2 ;now adding two variables mov a,parm2+3 add a,parm1+3 ;after addition of LSB, move it to r7(LSB return register for Long) mov r7,a mov a,parm2+2 addc a,parm1+2 ;store second LSB mov r6,a mov a,parm2+1 addc a,parm1+1 ;store second MSB mov r5,a mov a,parm2 addc a,parm1 mov r4,a ret end ACOE343 - Embedded Real-Time Processor Systems - Frederick University 137
138
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
The infinite loop A loop with no termination condition or one that will never be met may be unwanted in computer systems, but common in embedded systems. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 138
139
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example 1 Generate a 5V peek-to-peek 200μs period square waveform on the DAC output ACOE343 - Embedded Real-Time Processor Systems - Frederick University 139
140
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example 2 Generate a 5V peek-to-peek 200μs period sawtooth waveform on the DAC output ACOE343 - Embedded Real-Time Processor Systems - Frederick University 140
141
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example 3 Generate a 5V peek-to-peek 2ms period sine waveform on the DAC output code unsigned char Sine[180] = { /* Sine values */ 127,131,136,140,145,149,153,158,162,166,170,175, 179,183,187,191,194,198,202,205,209,212,215,218, 221,224,227,230,232,235,237,239,241,243,245,246, 248,249,250,251,252,253,253,254,254,254,254,254, 253,253,252,251,250,249,248,246,245,243,241,239, 237,235,232,230,227,224,221,218,215,212,209,205, 202,198,194,191,187,183,179,175,170,166,162,158, 153,149,145,140,136,131,127,123,118,114,109,105, 101, 96, 92, 88, 84, 79, 75, 71, 67, 64, 60, 56, 52, 49, 45, 42, 39, 36, 33, 30, 27, 24, 22, 19, 17, 15, 13, 11, 9, 8, 6, 5, 4, 3, 2, 1, 1, 0, 0, 0, 0, 0, 1, 1, 2, 3, 4, 5, 6, 8, 9, 11, 13, 15, 17, 19, 22, 24, 27, 30, 33, 36, 39, 42, 45, 49, 52, 56, 60, 63, 67, 71, 75, 79, 84, 88, 92, 96, 101,105,109,114,118 }; /************************************************************ * START of the PROGRAM * ************************************************************ / void main (void) { unsigned char i; * Enable the D/A Converter * ENDAC0 = 1; /* Enable DAC0 */ * Create the waveforms on DAC0 * while(1){ /* Run for ever */ for(i = 0; i < 179; i++) DAC0 = Sine[i]; } * while(1) */ } /* main() */ } ACOE343 - Embedded Real-Time Processor Systems - Frederick University 141
142
Mixed C/Assembly code (μVision Version 2.06)
Parameter passing in registers Examples: ACOE343 - Embedded Real-Time Processor Systems - Frederick University 142
143
Function return values
ACOE343 - Embedded Real-Time Processor Systems - Frederick University 143
144
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example extern unsigned char add2_func(unsigned char, unsigned char); void main(){ unsigned char a; a = add2_func(10,30); //a will have 40 after execution while(1); } ;assembly file “add2.asm” NAME _Add2_func ?PR?add2_func?Add2 SEGMENT CODE PUBLIC add2_func RSEG ?PR?Add2_func?Add2 add2_func: mov a, r7 ;first parameter passed to r7 add a, r5 ;second parameter passed to r5 mov r7, a ;return parameter must be in r7 RET END ACOE343 - Embedded Real-Time Processor Systems - Frederick University 144
145
Calling C from Assembly
NAME A_FUNC ?PR?a_func?A_FUNC SEGMENT CODE EXTRN CODE (c_func) PUBLIC a_func RSEG ?PR?a_func?A_FUNC a_func: USING 0 LCALL c_func RET END void c_func (void) { } ACOE343 - Embedded Real-Time Processor Systems - Frederick University 145
146
Data Converters Analog to Digital Converters (ADC)
Convert an analog quantity (voltage, current) into a digital code Digital to Analog Converters (DAC) Convert a digital code into an analog quantity (voltage, current) Dr. Konstantinos Tatas and Dr. Costas Kyriacou 146
147
Video (Analog - Digital)
Modulator Amplifier Filters Analog Pre- amplifier A/D Digital Image enhancement and coding ACOE343 - Embedded Real-Time Processor Systems - Frederick University 147 147
148
Temperature Recording by a Digital System
Time Temperature (ºC) Time Sampling & quantization Coding ACOE343 - Embedded Real-Time Processor Systems - Frederick University 148 148
149
Need for Data Converters
Digital processing and storage of physical quantities (sound, temperature, pressure etc) exploits the advantages of digital electronics Better and cheaper technology compared to the analog More reliable in terms of storage, transfer and processing Not affected by noise Processing using programs (software) Easy to change or upgrade the system (e.g. Media Player 7 Media Player 8 ή Real Player) Integration of different functions (π.χ. Mobile = phone + watch + camera + games + + ACOE343 - Embedded Real-Time Processor Systems - Frederick University 149 149
150
Signals (Analog - Digital)
2 4 6 8 10 12 14 16 u(V) 1 7 3 5 9 t (S) Analog Signal can take infinity values can change at any time 1010 1110 1111 1100 1000 Digital Signal can take one of 2 values (0 or 1) can change only at distinct times Reconstruction of an analog signal from a digital one (Can take only predefined values) 1001 0110 0101 0100 2 4 6 8 10 12 14 16 u(V) 1 7 3 5 9 t (S) ADC 1001 0110 0101 1010 1111 1110 1000 1100 0100 D3 D2 D1 D0 1 1 1 1 1 DAC ACOE343 - Embedded Real-Time Processor Systems - Frederick University 150 150
151
QUANTIZATION ERROR The difference between the true and quantized value of the analog signal Inevitable occurrence due to the finite resolution of the ADC The magnitude of the quantization error at each sampling instant is between zero and half of one LSB. Quantization error is modeled as noise (quantization noise) 151 ACOE343 - Embedded Real-Time Processor Systems - Frederick University
152
SAMPLING FREQUENCY (RATE)
The frequency at which digital values are sampled from the analog input of an ADC A low sampling rate (undersampling) may be insufficient to represent the analog signal in digital form A high sampling rate (oversampling) requires high bitrate and therefore storage space and processing time A signal can be reproduced from digital samples if the sampling rate is higher than twice the highest frequency component of the signal (Nyquist-Shannon theorem) Examples of sampling rates Telephone: 4 KHz (only adequate for speech, ess sounds like eff) Audio CD: 44.1 KHz Recording studio: 88.2 KHz ACOE343 - Embedded Real-Time Processor Systems - Frederick University 152
153
Digital to Analog Converters
Vout (mV) 1 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 The analog signal at the output of a D/A converter is linearly proportional to the binary code at the input of the converter. If the binary code at the input is 0001 and the output voltage is 5mV, then If the binary code at the input becomes 1001, the output voltage will become 45mV If a D/A converter has 4 digital inputs then the analog signal at the output can have one out of …… values. 16 If a D/A converter has N digital inputs then the analog signal at the output can have one out of ……. values. 2Ν ACOE343 - Embedded Real-Time Processor Systems - Frederick University 153 153
154
Characteristics of Data Converters
Number of digital lines The number bits at the input of a D/A (or output of an A/D) converter. Typical values: 8-bit, 10-bit, 12-bit and 16-bit Can be parallel or serial Microprocessor Compatibility Microprocessor compatible converters can be connected directly on the microprocessor bus as standard I/O devices They must have signals like CS, RD, and WR Activating the WR signal on an A/D converter starts the conversion process. Polarity Polar: the analog signals can have only positive values Bipolar: the analog signals can have either a positive or a negative value Full-scale output The maximum analog signal (voltage or current) Corresponds to a binary code with all bits set to 1 (for polar converters) Set externally by adjusting a variable resistor that sets the Reference Voltage (or current) ACOE343 - Embedded Real-Time Processor Systems - Frederick University 154 154
155
Characteristics of Data Converters (Cont…)
Resolution The analog voltage (or current) that corresponds to a change of 1LSB in the binary code It is affected by the number of bits of the converter and the Full Scale voltage (VFS) For example if the full-scale voltage of an 8-bit D/A converter is 2.55V the the resolution is: VFS/(2N-1) = 2.55 /(28-1) 2.55/255 = 0.01 V/LSB = 10mV/LSB Conversion Time The time from the moment that a “Start of Conversion” signal is applied to an A/D converter until the corresponding digital value appears on the data lines of the converter. For some types of A/D converters this time is predefined, while for others this time can vary according to the value of the analog signal. Settling Time The time needed by the analog signal at the output of a D/A converter to be within 10% of the nominal value. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 155 155
156
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
ADC RESPONSE TYPES Linear Most common Non-linear Used in telecommunications, since human voice carries more energy in the low frequencies than the high. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 156
157
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
ADC TYPES Direct Conversion Fast Low resolution Successive approximation Low-cost Slow Not constant conversion delay Sigma-delta High resolution, low-cost, high accuracy ACOE343 - Embedded Real-Time Processor Systems - Frederick University 157
158
Interfacing with Data Converters
Microprocessor compatible data converters are attached on the microprocessor’s bus as standard I/O devices. ACOE343 - Embedded Real-Time Processor Systems - Frederick University 158 158
159
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Programming Example 1 Write a program to generate a positive ramp at the output of an 8-bit D/A converter with a 2V amplitude and a 1KHz frequency. Assume that the full scale voltage of the D/A converter is 2.55V. The D/A converter is in P0 and the WR signal is in P1.1 main() { do { for (i=0;i<200;i++) P1_0=1; P0=i; delayu(5); } } while (1) ACOE343 - Embedded Real-Time Processor Systems - Frederick University 159 159
160
D/A Converters example
Write a program to generate the waveform, shown below, at the output of an 8-bit digital to analog converter. The period of the waveform should be approximately 8 ms. Assume that a time delay function with a 1 μs resolution is available. The full scale output of the converter is 5.12 V and the address of the DAC is P0, while the WR signal is in P1.1. Assuming that an 8-bit A/D converter is used to interface a temperature sensor measuring temperature values in the temperature range , specify: The resolution in of the system in The digital output word for a temperature of 32.5 The temperature corresponding to a digital output word of ACOE343 - Embedded Real-Time Processor Systems - Frederick University 160 160
161
Introduction to Digital Signal Processors (DSPs)
Lecture 4 Introduction to Digital Signal Processors (DSPs) Dr. Konstantinos Tatas
162
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Outline/objectives Identify the most important DSP processor architecture features and how they relate to DSP applications Understand the types of code appropriate for DSP implementation ACOE343 - Embedded Real-Time Processor Systems - Frederick University 162
163
What is a DSP? A specialized microprocessor for real- time DSP applications Digital filtering (FIR and IIR) FFT Convolution, Matrix Multiplication etc 163 ACOE343 - Embedded Real-Time Processor Systems - Frederick University
164
Hardware used in DSP ASIC FPGA GPP DSP Performance Very High High
Medium Medium High Flexibility Very low Power consumption low Low Medium Development Time Long Short 164 ACOE343 - Embedded Real-Time Processor Systems - Frederick University
165
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Common DSP features Harvard architecture Dedicated single-cycle Multiply-Accumulate (MAC) instruction (hardware MAC units) Single-Instruction Multiple Data (SIMD) Very Large Instruction Word (VLIW) architecture Pipelining Saturation arithmetic Zero overhead looping Hardware circular addressing Cache DMA ACOE343 - Embedded Real-Time Processor Systems - Frederick University 165
166
Harvard Architecture Physically separate memories and paths for instruction and data 166 ACOE343 - Embedded Real-Time Processor Systems - Frederick University
167
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Single-Cycle MAC unit Can compute a sum of n- products in n cycles ACOE343 - Embedded Real-Time Processor Systems - Frederick University 167
168
Single Instruction - Multiple Data (SIMD)
A technique for data-level parallelism by employing a number of processing elements working in parallel ACOE343 - Embedded Real-Time Processor Systems - Frederick University 168
169
Very Long Instruction Word (VLIW)
A technique for instruction-level parallelism by executing instructions without dependencies (known at compile-time) in parallel Example of a single VLIW instruction: F=a+b; c=e/g; d=x&y; w=z*h; 169 ACOE343 - Embedded Real-Time Processor Systems - Frederick University
170
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
CISC vs. RISC vs. VLIW ACOE343 - Embedded Real-Time Processor Systems - Frederick University 170
171
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Pipelining DSPs commonly feature deep pipelines TMS320C6x processors have 3 pipeline stages with a number of phases (cycles): Fetch Program Address Generate (PG) Program Address Send (PS) Program ready wait (PW) Program receive (PR) Decode Dispatch (DP) Decode (DC) Execute 6 to 10 phases ACOE343 - Embedded Real-Time Processor Systems - Frederick University 171
172
Saturation Arithmetic
fixed range for operations like addition and multiplication normal overflow and underflow produce the maximum and minimum allowed value, respectively Associativity and distributivity no longer apply 1 signed byte saturation arithmetic examples: = 127 -127 – 5 = -128 ( ) – 25 = 122 ≠ 64 + (70 -25) = 109 ACOE343 - Embedded Real-Time Processor Systems - Frederick University 172
173
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Examples Perform the following operations using one-byte saturation arithmetic 0x77 + 0x99 = 0x4*0x42= 0x3*0x51= ACOE343 - Embedded Real-Time Processor Systems - Frederick University 173
174
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Zero Overhead Looping Hardware support for loops with a constant number of iterations using hardware loop counters and loop buffers No branching No loop overhead No pipeline stalls or branch prediction No need for loop unrolling ACOE343 - Embedded Real-Time Processor Systems - Frederick University 174
175
Hardware Circular Addressing
A data structure implementing a fixed length queue of fixed size objects where objects are added to the head of the queue while items are removed from the tail of the queue. Requires at least 2 pointers (head and tail) Extensively used in digital filtering y[n] = a0x[n]+a1x[n-1]+…+akx[n-k] 175 ACOE343 - Embedded Real-Time Processor Systems - Frederick University
176
Direct Memory Access (DMA)
The feature that allows peripherals to access main memory without the intervention of the CPU Typically, the CPU initiates DMA transfer, does other operations while the transfer is in progress, and receives an interrupt from the DMA controller once the operation is complete. Can create cache coherency problems (the data in the cache may be different from the data in the external memory after DMA) Requires a DMA controller ACOE343 - Embedded Real-Time Processor Systems - Frederick University 176
177
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Cache memory Separate instruction and data L1 caches (Harvard architecture) Cache coherence protocols required, since most systems use DMA ACOE343 - Embedded Real-Time Processor Systems - Frederick University 177
178
DSP vs. Microcontroller
Harvard Architecture VLIW/SIMD (parallel execution units) No bit level operations Hardware MACs DSP applications Microcontroller Mostly von Neumann Architecture Single execution unit Flexible bit-level operations No hardware MACs Control applications ACOE343 - Embedded Real-Time Processor Systems - Frederick University 178
179
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Examples Estimate how long will the following code fragment take to execute on A general purpose processor with 1 GHz operating frequency, five-stage pipelining and 5 cycles required for multiplication, 1 cycle for addition A DSP running at 500 MHz, zero overhead looping and 6 independent ALUs and 2 independent single- cycle MAC units? for (i=0; i<8; i++) { a[i] = 2*i + 3; b[i] = 3*i + 5; } ACOE343 - Embedded Real-Time Processor Systems - Frederick University 179
180
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Review Questions Which of the following code fragments is appropriate for SIMD implementation? a[0]=b[0]+c[0]; a[0]=b[0]&c[0]; a[2]=b[2]+c[2]; a[0]=b[0]%c[0]; a[4]=b[4]+c[4]; a[0]=b[0]+c[0]; a[6]=b[6]+c[6]; a[0]=b[0]/c[0]; Can the following instructions be merged into one VLIW instruction? If not in how many? a=b+c; d=c/e; f=d&a; g=b%c; ACOE343 - Embedded Real-Time Processor Systems - Frederick University 180
181
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Review Questions Which of the following is not a typical DSP feature? Dedicated multiplier/MAC Von Neumann memory architecture Pipelining Saturation arithmetic Which implementation would you choose for lowest power consumption? ASIC FPGA General-Purpose Processor DSP ACOE343 - Embedded Real-Time Processor Systems - Frederick University 181
182
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Examples How many VLIW instructions does the following program fragment require if there two independent data paths (a,b), with 3 ALUs and 1 MAC available in each and 8 instructions/word? How many cycles will it take to execute if they are the first instructions in the program and all instructions require 1 cycle, assuming the pipelining architecture of slide 10 with 6 phases of execution? ADD a1,a2,a3 ;a3 = a1+a2 SUB b1,b3,b4 ;b4 = b1-b3 MUL a2,a3,a5 ;a5 = a2-a3 MUL b3,b4,b2 ;b2 = b3*b4 AND a7,a0,a1 ;a1 = a7 AND a0 MUL a3,a4,a5 ;a5 = a3*a4 OR a6,a3,a2 ;a2 = a6 OR a3 ACOE343 - Embedded Real-Time Processor Systems - Frederick University 182
183
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
References DR. Chassaing, “DSP Applications using C and the TMS320C6x DSK”, Wiley, 2002 Texas Instruments, TMS320C64x datasheets Analog Devices, ADSP-21xx Processors ACOE343 - Embedded Real-Time Processor Systems - Frederick University 183
184
The TMS320C6x Family of DSPs
Lecture 5 The TMS320C6x Family of DSPs
185
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
Features High-Performance Fixed-Point Digital Signal Processor (TMS320C6413/C6410) − TMS320C6413 2-ns Instruction Cycle Time 500-MHz Clock Rate 4000 MIPS − TMS320C6410 2.5-ns Instruction Cycle Time 400-MHz Clock Rate 3200 MIPS Eight 32-Bit Instructions/Cycle ACOE343 - Real-Time Embedded Processor Systems - Frederick University
186
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
Features Eight Highly Independent Functional Units Six ALUs (32-/40-Bit), Each Supports Single 32- Bit, Dual 16-Bit, or Quad 8-Bit Arithmetic per Clock Cycle Two Multipliers Support Four 16 x 16-Bit Multiplies (32-Bit Results) per Clock Cycle or Eight 8 x 8-Bit Multiplies (16-Bit Results) per Clock Cycle Load-Store Architecture 64 32-Bit General-Purpose Registers Instruction Packing Reduces Code Size All Instructions Conditional ACOE343 - Real-Time Embedded Processor Systems - Frederick University
187
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
Features L1/L2 Memory Architecture − 128K-Bit (16K-Byte) L1P Program Cache (Direct Mapped) − 128K-Bit (16K-Byte) L1D Data Cache (2-Way Set- Associative) − 2M-Bit (256K-Byte) L2 Unified Mapped RAM/Cache [C6413] − 1M-Bit (128K-Byte) L2 Unified Mapped RAM/Cache [C6410] Endianess: Little Endian, Big Endian − 512M-Byte Total Addressable External Memory Space Enhanced Direct-Memory-Access (EDMA) Controller (64 Independent Channels) 16 prioritized interrupts ACOE343 - Real-Time Embedded Processor Systems - Frederick University
188
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
Block Diagram ACOE343 - Real-Time Embedded Processor Systems - Frederick University
189
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
Cache ACOE343 - Real-Time Embedded Processor Systems - Frederick University
190
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
Timers Two 32-bit timers used as timers/event/counters/interrupt sources Configuring a timer requires four basic steps: If the timer is not currently in the hold state, place the timer in hold (HLD = 0). Note that after device reset, the timer is already in the hold state Write the desired value to the timer period register (PRD). Write the desired value to the timer control register (CTL). Do not change the GO and HLD bits in CTL. Start the timer by setting the GO and HLD bits in CTL to 1. ACOE343 - Real-Time Embedded Processor Systems - Frederick University
191
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
Interrupts 16 interrupt sources 2 timer interrupts 4 external interrupts 4 McBSP interrupts 4 DMA interrupts ACOE343 - Real-Time Embedded Processor Systems - Frederick University
192
Interrupt control registers
CSR: control status register IER: Interrupt enable register IFR: interrupt flag register ISR: interrupt set register ICR: interrupt clear register ISTP: interrupt service table pointer IRP: interrupt return pointer NRP: NMI return pointer ACOE343 - Real-Time Embedded Processor Systems - Frederick University
193
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
Interrupt Priority ACOE343 - Real-Time Embedded Processor Systems - Frederick University
194
Interrupt Service Table (single fetch packet)
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
195
Interrupt Service Table (branch to additional code)
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
196
ACOE343 - Real-Time Embedded Processor Systems - Frederick University
DMA ACOE343 - Real-Time Embedded Processor Systems - Frederick University
197
Programming the TMS320C6x Family of DSPs
Lecture 6 Programming the TMS320C6x Family of DSPs
198
Programming the TMS320C6x Family of DSPs
Programming model Assembly language Assembly code structure Assembly instructions C/C++ Intrinsic functions Optimizations Software Pipelining Inline Assembly Calling Assembly functions Using Interrupts Using DMA ACOE343 - Embedded Real-Time Processor Systems - Frederick University
199
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Programming model Two register files: A and B 16 registers in each register file (A0-A15), (B0-B15) A0, A1, B0, B1 used in conditions A4-A7, B4-B7 used for circular addressing ACOE343 - Embedded Real-Time Processor Systems - Frederick University
200
Assembly language structure
A TMS320C6x assembly instruction includes up to seven items: Label Parallel bars Conditions Instruction Functional unit Operands Comment Format of assembly instruction: Label: parallel bars [condition] instruction unit operands ;comment ACOE343 - Embedded Real-Time Processor Systems - Frederick University
201
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Parallel bars || : indicates that current instruction executes in parallel with previous instruction, otherwise left blank ACOE343 - Embedded Real-Time Processor Systems - Frederick University
202
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Condition All assembly instructions are conditional If no condition is specified, the instruction executes always If a condition is specified, the instruction executes only if the condition is valid Registers used in conditions are A1, A2, B0, B1, and B2 Examples: [A] ;executes if A ≠ 0 [!A] ;executes if A = 0 [B0] ADD .L1 A1,A2,A3 || [!B0] ADD .L2 B1,B2,B3 ACOE343 - Embedded Real-Time Processor Systems - Frederick University
203
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Instruction Either directive or mnemonic Directives must begin with a period (.) Mnemonics should be in column 2 or higher Examples: .sect data ;creates a code section .word value ;one word of data ACOE343 - Embedded Real-Time Processor Systems - Frederick University
204
Functional units (optional)
L units: 32/40 bit arithmetic/compare and 32 bit logic operations S units: 32-bit arithmetic operations, 32/40-bit shifts and 32-bit bit-field operations, 32-bit logical operations, Branches, Constant generation, Register transfers to/from control register file (.S2 only) M units: 16 x 16 multiply operations D units: 32-bit add, subtract, linear and circular address calculation, Loads and stores with 5-bit constant offset, Loads and stores with 15-bit constant, offset (.D2 only) ACOE343 - Embedded Real-Time Processor Systems - Frederick University
205
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Operands All instructions require a destination operand. Most instructions require one or two source operands. The destination operand must be in the same register file as one source operand. One source operand from each register file per execute packet can come from the register file opposite that of the other source operand. Example: ADD .L1 A0,A1,A3 ADD .L1 A0,B1,A2 ACOE343 - Embedded Real-Time Processor Systems - Frederick University
206
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Instruction format Fetch packet The same functional unit cannot be used in the same fetch packet ADD .S1 A0, A1, A2 ;.S1 is used for || SHR .S1 A3, 15, A4 ;...both instructions ACOE343 - Embedded Real-Time Processor Systems - Frederick University
207
Arithmetic instructions
Add/subtract/multiply: ADD .L1 A3,A2,A1 ;A1←A2+A3 SUB .S1 A1,1,A1 ;decrement A1 MPY .M2 A7,B7,B6 ;multiply LSBs || MPYH .M1 A7,B7,A6 ;multiply MSBs ACOE343 - Embedded Real-Time Processor Systems - Frederick University
208
Move and Load/store Instructions- Addressing Modes
Loading constants: MVK .S1 val1, A4 ;move low halfword MVKH .S1 val1, A4 ;move high halfword Indirect Addressing Mode: LDH .D2 *B2++, B7 ;load halfword B7←[B2], increment B2 || LDH .D1 *A2++, A7 ; load halfword A7←[A2], increment A2 STW .D2 A1, *+A4[20] ;store [A4]+20 words ← A2, ;preincrement/don’t modify A4 ACOE343 - Embedded Real-Time Processor Systems - Frederick University
209
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example Calculate the values of register and memory for the following instructions: A2= 0x , MEM[0x ] = 0x0, MEM[0x ] = 0x1, MEM[0x ] = 0x2, MEM[0x C] = 0x3, LDH .D1 *++A2, A7 A2= ? A7= ? LDH .D1 *A2--[2], A7 A2= ? A7= ? LDH .D1 *-A2, A7 A2= ? A7= ? LDH .D1 *++A2[2], A7 A2= ? A7= ? ACOE343 - Embedded Real-Time Processor Systems - Frederick University
210
Branch and Loop Instructions
Loop example: MVK .S1 count, A1 ;loop counter || MVKH .S2 count, A1 LOOP MVK .S1 val1, A4 ;loop MVKH .S1 val1, A4 ;body SUB .S1 A1,1,A1 ;decrement counter [A1] B .S2 Loop ;branch if A1 ≠ 0 NOP 5 ;5 NOPs for branch ACOE343 - Embedded Real-Time Processor Systems - Frederick University
211
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Assembler Directives .short : initiates 16-bit integer .int (.word .long) : initiates 32-bit integer .float : 32-bit single-precision floating-point .double : 64-bit double-precision floating-point .trip : .bss .far .stack ACOE343 - Embedded Real-Time Processor Systems - Frederick University
212
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Programming Using C Data types Intrinsic functions Inline assembly Linear assembly Calling assembly functions Code optimizations Software pipelining ACOE343 - Embedded Real-Time Processor Systems - Frederick University
213
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Data types char, signed char 8 bits ASCII unsigned char Short 16 bits 2's complement unsigned short 16 bits binary int, signed int 32 bits 2's complement unsigned int 32 bits binary long, signed long 40 bits 2's complement unsigned long 40 bits binary Enum Float 32 bits IEEE 32-bit Double 64 bits IEEE 64-bit long double Pointers 3 ACOE343 - Embedded Real-Time Processor Systems - Frederick University
214
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Intrinsic functions Available C functions used to increase efficiency int_mpy(): MPY instruction, multiplies 16 LSBs int_mpyh(): MPYH instruction, multiplies 16 MSBs int_mpylh(): MPYHL instruction, multiplies 16 LSBs with 16 MSBs int_mpyhl(): MPYHL instruction, multiplies 16 MSBs with 16 LSBs ACOE343 - Embedded Real-Time Processor Systems - Frederick University
215
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Inline Assembly Assembly instructions and directives can be incorporated within a C program using the asm statement asm (“assembly code”); ACOE343 - Embedded Real-Time Processor Systems - Frederick University
216
Calling Assembly Functions
An external declaration of an assembly function can be called from a C program extern int func(); ACOE343 - Embedded Real-Time Processor Systems - Frederick University
217
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example Program that calculates S=n+(n-1)+…+1 by calling assembly function #include <stdio.h> main() { short n=6; short result; result = sumfunc(n); printf(“sum = %d”, result); } ACOE343 - Embedded Real-Time Processor Systems - Frederick University
218
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example (continued) Assembly function: .def _sumfunc _sumfunc: MV .L1 A4,A1 ;n is loop counter SUB .S1 A1,1,A1 ;decrement n LOOP: ADD .L1 A4,A1,A4 ;A4 is accumulator [A1] B .S2 LOOP ;branch if A1 ≠ 0 NOP 5 ;branch delay nops B .S2 B3 ;return from calling NOP 5 ;five NOPS for delay .end ACOE343 - Embedded Real-Time Processor Systems - Frederick University
219
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example Write a program that calculates the first 6 Fibonacci numbers by calling an assembly function ACOE343 - Embedded Real-Time Processor Systems - Frederick University
220
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Linear Assembly enables writing assembly-like programs without worrying about register usage, pipelining, delay slots, etc. The assembler optimizer program reads the linear assembly code to figure out the algorithm, and then it produces an optimized list of assembly code to perform the operations. Source file extension is .sa The linear assembly programming lets you: use symbolic names forget pipeline issues ignore putting NOPs, parallel bars, functional units, register names more efficiently use CPU resources than C. ACOE343 - Embedded Real-Time Processor Systems - Frederick University
221
Linear Assembly Example
_sumfunc: .cproc np ;.cproc directive starts a C callable procedure .reg y ;.reg directive use descriptive names for values that will be stored in registers MVK np,cnt loop: .trip 6 ; trip count indicates how many times a loop will iterate SUB cnt,1,cnt ADD y,cnt,y [cnt] B loop .return y .endproc ; .endproc to end a C procedure Equivalent assembly function .def _sumfunc _sumfunc: MV .L1 A4,A1 ;n is loop counter LOOP: SUB .S1 A1,1,A1 ;decrement n ADD .L1 A4,A1,A4 ;A4 is accumulator [A1] B .S2 LOOP ;branch if A1 ≠ 0 NOP 5 ;branch delay nops B .S2 B3 ;return from calling NOP 5 ;five NOPS for delay .end ACOE343 - Embedded Real-Time Processor Systems - Frederick University
222
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Software Pipelining A loop optimization technique so that all functional units are utilized within one cycle. Similar to hardware pipelining, but done by the programmer or the compiler, not the processor Three stages: Prolog (warm-up): instructions needed to build up the loop kernel (cycle) Loop kernel (cycle): all instructions executed in parallel. Entire kernel executed in one cycle. Epilog (cool-off): Instructions necessary to complete all iterations ACOE343 - Embedded Real-Time Processor Systems - Frederick University
223
Software pipelining procedure
Draw a dependency graph Draw nodes and paths Write number of cycles for each instruction Assign functional units Set up a scheduling table Obtain code from scheduling table ACOE343 - Embedded Real-Time Processor Systems - Frederick University
224
Software pipelining example
for (i=0; i<16; i++) sum = sum + a[i]*b[i]; ACOE343 - Embedded Real-Time Processor Systems - Frederick University
225
Dependency Graph LDH: 5 cycles MPY: 2 cycles ADD: 1 cycle SUB: 1 cycle
LOOP: 6 cycles ACOE343 - Embedded Real-Time Processor Systems - Frederick University
226
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Scheduling Table Unit C1, C9.. C2, C10… C3, C11.. C4, C12… C5, C13… C6, C14… C7, C15… C8, C16… .D1 LDH .D2 .M1 MPY .L1 ADD .L2 SUB .S2 B Unit C1, C9.. C2, C10… C3, C11.. C4, C12… C5, C13… C6, C14… C7, C15… C8, C16… Prolog Kernel .D1 LDH .D2 .M1 MPY .L1 ADD .L2 SUB .S2 B ACOE343 - Embedded Real-Time Processor Systems - Frederick University
227
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Assembly Code ;cycle 1 MVK .L2 16,B1 ;loop count || ZERO .L1 A7 ;sum || LDH .D1 *A4++,A2 ;input in A2 || LDH .D2 *B4++,B2 ;input in B2 ;cycle 2 LDH .D1 *A4++,A2 ;input in A2 || [B1] SUB .L2 B1,1,B1 ;decrement count ;cycle 3 || [B1] SUB .L2 B1,1,B1 ;decrement || [B1] B .S2 LOOP ;cycle 4 ;cycle 5 ACOE343 - Embedded Real-Time Processor Systems - Frederick University
228
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Assembly code ;cycle 6 LDH .D1 *A4++,A2 ;input in A2 || LDH .D2 *B4++,B2 ;input in B2 || [B1] SUB .L2 B1,1,B1 ;decrement || [B1] B .S2 LOOP || MPY .M1x A2,B2,A6 ;cycle 7 ;cycles 8-21(loop kernel) LOOP: LDH .D1 *A4++,A2 ;input in A2 || MPY .M1x A2,B2,A6 ;multiplication || ADD .L1 A6,A7,A7 ;cycle 22 (epilog) ADD .L1 A6,A7,A7 ;final sum ACOE343 - Embedded Real-Time Processor Systems - Frederick University
229
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Example Use software pipelining in the following example: for (i=0; i<16; i++) sum = sum + a[i]*b[i]; ACOE343 - Embedded Real-Time Processor Systems - Frederick University
230
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Loop unrolling A technique for reducing the loop overhead The overhead decreases as the unrolling factor increases at the expense of code size Doesn’t work with zero overhead looping hardware DSPs for (i=0; i<64; i++) { sum +=*(data++); } for (i=0; i<64/4; i++) { sum +=*(data++); } ACOE343 - Embedded Real-Time Processor Systems - Frederick University
231
Loop Unrolling example
Unroll the following loop by a factor of 2, 4, and eight for (i=0; i<64; i++) { a[i] = b[i] + c[i+1]; } ACOE343 - Embedded Real-Time Processor Systems - Frederick University
232
Code optimization steps
When code performance is not satisfactory the following steps can be taken: Use intrinsic functions Use compiler optimization levels Use profiling then convert functions that need optimization to linear ASM Optimize code in ASM ACOE343 - Embedded Real-Time Processor Systems - Frederick University
233
Profiling using profiling tool
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
234
Profiling using clock function
#include <time.h> /* in order to call clock()*/ main() { … clock_t start, stop, overhead; start = clock(); /* Calculate overhead of calling clock*/ stop = clock(); /* and subtract this value from The results*/ overhead = stop − start; start = clock(); /* code to be profiled */ stop = clock(); printf(”cycles: %d\n”, stop − start − overhead); } ACOE343 - Embedded Real-Time Processor Systems - Frederick University
235
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Code optimization Use instructions in parallel Eliminate NOPs Unroll loops Use software pipelining ACOE343 - Embedded Real-Time Processor Systems - Frederick University
236
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Using Interrupts 16 interrupt sources 2 timer interrupts 4 external interrupts 4 McBSP interrupts 4 DMA interrupts ACOE343 - Embedded Real-Time Processor Systems - Frederick University
237
Loop program with interrupt
interrupt void c_int11 //ISR { int sample_data; sample_data = input_sample(); //input data output_sample(sample_data); //output data } void main() comm_intr(); //init DSK, codec, McBSP //enable INT11 and GIE while(1); //infinite loop ACOE343 - Embedded Real-Time Processor Systems - Frederick University
238
ACOE343 - Embedded Real-Time Processor Systems - Frederick University
Using DMA ACOE343 - Embedded Real-Time Processor Systems - Frederick University
239
Real-time Digital Signal Processing with the TMS320C6x
Lecture 7 Real-time Digital Signal Processing with the TMS320C6x Dr. Konstantinos Tatas
240
Outline Digital Signal Processing Basics
ADC and sampling Aliasing Discrete-time signals Digital Filtering FFT Effects of finite fixed-point wordlength Efficient DSP application implementation on TMS320C6x processors FIR filter implementation IIR filter implementation
241
Digital Signal Processing Basics
A basic DSP system is composed of: An ADC providing digital samples of an analog input A Digital Processing system (μP/ASIC/FPGA) A DAC converting processed samples to analog output Real-time signal processing: All processing operation must be complete between two consecutive samples
242
ADC and Sampling An ADC performs the following:
Quantization Binary Coding Sampling rate must be at least twice as much as the highest frequency component of the analog input signal
243
Aliasing When sampling at a rate of fs samples/s, if k is any positive or negative integer, it’s impossible to distinguish between the sampled values of a sinewave of f0 Hz and a sinewave of (f0+kfs) Hz.
244
Discrete-time signals
A continuous signal input is denoted x(t) A discrete-time signal is denoted x(n), where n = 0, 1, 2, … Therefore a discrete time signal is just a collection of samples obtained at regular intervals (sampling frequency)
245
Common Digital Sequences
Unit-step sequence: Unit-impulse sequence:
246
The z transform Discrete equivalent of the Laplace transform
247
z-transform properties
Linear Shift theorem Note:
248
Transfer Function z-transform of the output/z transfer of the input
Pole-zero form
249
Pole-zero plot
250
System Stability Position of the poles affects system stability
The position of zeroes does not
251
Example 1 A system is described by the following equation:
y(n)=0.5x(n) + 0.2x(n-1) + 0.1y(n-1) Plot the system’s transfer function on the z plane Is the system stable? Plot the system’s unit step response Plot the system’s unit impulse response
252
The Discrete Fourier Transform (DFT)
Discrete equivalent of the continuous Fourier Transform A mathematical procedure used to determine the harmonic, or frequency, content of a discrete signal sequence
253
The Fast Fourier Transform (FFT)
FFT is not an approximation of the DFT, it gives precisely the same result
254
Digital Filtering In signal processing, the function of a filter is to remove unwanted parts of the signal, such as random noise, or to extract useful parts of the signal, such as the components lying within a certain frequency range Analog Filter: Input: electrical voltage or current which is the direct analogue of a physical quantity (sensor output) Components: resistors, capacitors and op amps Output: Filtered electrical voltage or current Applications: noise reduction, video signal enhancement, graphic equalisers Digital Filter: Input: Digitized samples of analog input (requires ADC) Components: Digital processor (PC/DSP/ASIC/FPGA) Output: Filtered samples (requires DAC)
255
Averaging Filter
256
Ideal Filter Frequency Response
257
Realistic vs. Ideal Filter Response
258
FIR filtering Finite Impulse Response (FIR) filters use past input samples only Example: y(n)=0.1x(n)+0.25x(n-1)+0.2x(n-2) Z-transform: Y(z)=0.1X(z)+0.25X(z)z^(- 1)+0.2X(z)(z^-2) Transfer function: H(z)=Y(z)/X(z)= z^(-1)+0.2(z^-2) No poles, just zeroes. FIR is stable
259
FIR filter design Inverse DFT of H(m)
260
FIR Filter Implementation
y(n)=h(0)x(n)+h(1)x(n-1)+h(2)x(n-2)+h(3)x(n-3)
261
Example 2 A filter is described by the following equation:
y(n)=0.5x(n) + 1x(n-1) + 0.5x(n-2), with initial condition y(-1) = 0 What kind of filter is it? Plot the filter’s transfer function on the z plane Is the filter stable? Plot the filter’s unit step response Plot the filter’s unit impulse response
262
IIR Filtering Infinite Impulse Response (IIR) filters use past outputs together with past inputs
263
IIR Filter Implementation
y(n)=b(0)x(n)+b(1)x(n-1)+b(2)x(n-2)+b(3)x(n-3) + a(0)y(n)+a(1)y(n-1)+a(2)y(n-2)+a(3)y(n-3)
264
FIR - IIR filter comparison
Simpler to design Inherently stable Can be designed to have linear phase Require lower bit precision IIR Need less taps (memory, multiplications) Can simulate analog filters
265
Example 3 A filter is described by the following equation:
y(n)=0.5x(n) + 0.2x(n-1) + 0.5y(n-1) + 0.2y(n-2), with initial condition y(-1)=y(-2) = 0 What kind of filter is it? Plot the filter’s transfer function on the z plane Is the filter stable? Plot the filter’s unit step response Plot the filter’s unit impulse response
266
Software and Hardware Implementation of FIR filters
267
Fixed-Point Binary Representation
Representation of a number with integer and fractional part: This is denoted as Qnm representation The binary point is implied It will affect the accuracy (dynamic range and precision) of the number Purely a programmer’s convention and has no relationship with the hardware.
268
Examples x = b Q0.15 => x= 2^(-1) + 2^(-4) + 2^(-11)+2^(-12) Q1.14 => x= 2^0 + 2^(−3) + 2^(−10) + 2^(−11) Q2.13 => x = 2^1 + 2^(−2) + 2^(−9) + 2^(−10) Q7.8 => x = ? Q12.3 => x = ?
269
EFFECTS OF FINITE FIXED-POINT BINARY WORD LENGTH
Quantization Errors ADC Coefficients Truncation Rounding Data Overflow
270
ADC Quantization Error
ADC converts an analog signal x(t) into a digital signal x(n), through sampling, quantization and encoding Assuming x(n) is interpreted as the Q15 fractional number such that −1 ≤ x(n) < 1 dynamic range of fractional numbers is 2. Since the quantizer employs B bits, the number of quantization levels available is 2B The spacing between two successive quantization levels is Δ = 2/2^B = 2^(1-B) Therefore the quantization error is |e(n)|≤Δ/2
271
Coefficient Quantization Error
Effects on FIR filters Location of zeroes changes Therefore, frequency response changes Effects on IIR filters Location of poles and zeroes change Could move poles outside of unit circle, leading to unstable implementations
272
Roundoff error
273
Overflow error signals and coefficients normalized in the range of −1 to 1 for fixed-point arithmetic, the sum of two B-bit numbers may fall outside the range of −1 to 1. Severely distorts the signal Overflow handling Saturation arithmetic “Clips” the signal, although better than overflow Should only be used to guarantee no overflow, but should not be the only solution Scaling of signals and coefficients
274
Coefficient representation
Fractional 2’s complement (Q) representation is used To avoid overflow, often scaling down by a power of two factor (S) (right shift) is used. The scaling factor is given by the equation: S=Imax(|h(0)|+|h(1)|+|h(2)|+…) Furthermore, filter coefficient larger than 1, cause overflow and are scaled down further by a factor B, in order to be less than 1
275
Example 1 Given the FIR filter Solution:
y(n)=0.1x(n)+0.25x(n-1)+0.2x(n-2) Assuming the input range occupies ¼ of the full range Develop the DSP implementation equations in Q-15 format. What is the coefficient quantization error? Solution: S=1/4((|h(0)|+|h(1)|+|h(2)|) = ¼( )=3.25/4 Overflow cannot occur, no input (S) scaling required No coefficents > 1, no coefficient (B) scaling required
276
Example 2 Given the FIR filter Solution:
y(n)=0.8x(n)+3x(n-1)+0.6x(n-2) Assuming the input range occupies ¼ of the full range Develop the DSP implementation equations in Q-15 format. What is the coefficient quantization error? Solution: S=1/4((|h(0)|+|h(1)|+|h(2)|) = ¼( )=4.4/4 = 1.1 Therefore: S=2 Largest coefficient: h(1) = 3, therefore B=4 ys(n)=0.2xs(n)+0.75xs(n-1)+0.15xs(n-2)
277
Simple FIR Hardware implementation
VHDL 4-tap filter entity ENTITY my_fir is Port (clk, rst: in std_logic; sample_in: in std_logic_vector(length-1 downto 0); sample_out: out std_logic_vector(length-1 downto 0) ); END ENTITY my_fir;
278
Example (continued) VHDL 4-tap filter architecture
ARCHITECTURE rtl of my_fir is type taps is array 0 to 3 of std_logic_vector(length-1 downto 0); Signal h: taps_type; Signal x_p1, x_p2, x_p3: std_logic_vector(length-1 downto 0); --past samples Signal y: std_logic_vector(2*length-1 downto 0); Begin Process (clk, rst) If rst=‘1’ then x_p1 <= (others => ‘0’); x_p2 <= (others => ‘0’); x_p3 <= (others => ‘0’); Elsif rising_edge(clk) then x_p1 <= sample_in; --delay registers x_p2 <= x_p1; x_p3 <= x_p2; End if; End process; y <= sample_in*h(0)+x_p1*h(1)+x_p2*h(2)+x_p3*h(3); Sample_out <= y(2*length-1 downto length); End;
279
FIR Software Implementation
int yn=0; //filter output initialization short xdly[N+1]; //input delay samples array interrupt void c_int11() //ISR { short i; yn=0; short h[N] = { //coefficients }; xdly[0]=input_sample(); for (i=0; i<N; i++) yn += (h[i]*xdly[i]); for (i=N-1; i>0; i--) xdly[i] = xdly[i-1]; output_sample(yn >> 15); //filter output return; //return from ISR }
280
IIR C implementation int yn=0; //filter output initialization
short xdly[N+1]; //input delay samples array Short ydly[M]; //output delay array interrupt void c_int11() //ISR { short i; yn=0; short a[N] = { //coefficients }; short b[M] = { //coefficients }; xdly[0]=input_sample(); for (i=0; i<N; i++) yn += (a[i]*xdly[i]); for (i=0; i<M; i++) yn += (b[i]*ydly[i]); for (i=N-1; i>0; i--) xdly[i] = xdly[i-1]; ydly[0] = yn >> 15; for (i=M-1; i>0; i--) ydly[i] = ydly[i-1]; output_sample(yn >> 15); //filter output return; //return from ISR }
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.