Lecture 4: Load/Store Architectures CS 2011 Fall 2014, Dr. Rozier.

LADIES AND TIGERS

The Lady and the Tiger Two doors containing either Ladies or Tigers

The Lady and the Tiger You will be shown two doors, to two rooms. – Each could contain either a lady or a tiger… – It could be that both rooms contain a lady, or that both rooms contain a tiger! You will need to reason carefully and logically to survive! Each question, pick a door, or decide not to open a door. – You score one point for picking a lady, or for refusing to pick if both doors contain tigers. – Three points available for your homework/projects grade today – If you answer wrong, you may write a short paper describing what you did wrong, and how to find the right answer, due next class.

The Lady and the Tiger Form up into groups On a sheet of paper, list the first and last names of each student in the group, and pick a team name – Discuss your answers, and record them – Each group will then give their answers to the class

The Lady and the Tiger Q1
One of these signs is true, the other is false.
– Sign on door 1: "In this room, there is a lady, and in the other room there is a tiger."
– Sign on door 2: "In one of these rooms there is a lady, and in one of these rooms there is a tiger."

The Lady and the Tiger Q2
Either both signs are false, or both are true.
– Sign on door 1: "At least one of these rooms contains a lady."
– Sign on door 2: "A tiger is in the other room."

The Lady and the Tiger Q3
Either both signs are false, or both are true.
– Sign on door 1: "Either a tiger is in this room, or a lady is in the other room."
– Sign on door 2: "A lady is in the other room."

What does this have to do with CS?

CS and CE What are the disciplines? – Computer Engineering? – Computer Science?

What it isn’t "What would we like our children, the general public of the future, to learn about computer science in schools? We need to do away with the myth that computer science is about computers. Computer science is no more about computers than astronomy is about telescopes, biology is about microscopes or chemistry is about beakers and test tubes. Science is not about tools, it is about how we use them and what we find out when we do." -- Ian Parberry

What it isn’t A confusion of even longer standing came from the fact that the unprepared included the electronic engineers that were supposed to design, build, and maintain the machines. The job was actually beyond the electronic technology of the day, and, as a result, the question of how to get and keep the physical equipment more or less in working condition became in the early days the all-overriding concern. As a result, the topic became —primarily in the USA— prematurely known as "computer science" —which, actually is like referring to surgery as "knife science"— and it was firmly implanted in people's minds that computing science is about machines and their peripheral equipment. -- Edsger Dijkstra

What it really is Computer science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems. Computer scientists invent algorithmic processes that create, describe, and transform information and formulate suitable abstractions to model complex systems. Computer engineering is the process of analyzing, designing, and integrating the hardware and software systems needed for information processing or computation. Computer engineers are saddled with the difficult tasks of modeling, designing, and analyzing cyberphysical systems which solve interdisciplinary problems in a wide variety of domains.

BASIC LOAD STORE

ARMv6 Remember! – RISC architecture – Load/Store architecture

RISC Load/Store Architecture
[Diagram: a processor containing registers and operations (Add, Cmp, etc.), connected to memory by Load and Store.]

Loading and Storing ARM, MIPS, and other Load/Store Architectures – Do not support processing data in memory – Must first move data into registers before processing. Sound inefficient? – In practice it isn’t! – Memory is slow, registers are fast!

Loading and Storing
The Load/Store architecture paradigm
– LOAD data values you need from memory into registers
– Process data in registers
– STORE the results from the registers into memory
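
The three steps above can be sketched as a toy machine in Python. This is only an illustration, not ARM code: the `memory` list, the `regs` dictionary, and the helper names `load`, `store`, and `add` are all made up for this example. The point it shows is that the arithmetic helper never touches memory directly.

```python
# Toy model of a load/store machine (hypothetical names, not a real ISA).
memory = [5, 2, 0]                    # three words of memory
regs = {"r0": 0, "r1": 0, "r2": 0}    # a tiny register file

def load(rd, addr):
    """LOAD: copy a word from memory into register rd."""
    regs[rd] = memory[addr]

def store(rs, addr):
    """STORE: copy register rs out to memory."""
    memory[addr] = regs[rs]

def add(rd, rn, rm):
    """Process: operands and result are all registers, never memory."""
    regs[rd] = regs[rn] + regs[rm]

load("r0", 0)            # 1. LOAD the values we need
load("r1", 1)
add("r2", "r0", "r1")    # 2. process in registers
store("r2", 2)           # 3. STORE the result

print(memory)            # [5, 2, 7]
```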

Single register data transfer
– STR – store a word from a register into memory
  STR r0, [r1]    ; store r0 to the location pointed to by r1
– LDR – load a word from memory into a register
  LDR r0, [r1]    ; load the contents of the location pointed to by r1 into r0

Offsets
Our offset can be
– An unsigned 12-bit immediate value
– A register
Offset can be
– Added (default)
– Subtracted (prefix with a ‘-’)

Offsets
Can be done:
– Prefix: str r0, [r1, r2]    ; store r0 to [r1+r2]
– Prefix, increment: str r0, [r1, r2]!    ; store r0 to [r1+r2], then r1 = r1 + r2
– Postfix: str r0, [r1], r2    ; store r0 to [r1], then r1 = r1 + r2
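
As a rough sketch of how the three forms differ, the Python below models each `str` variant. The function names, the dictionary-based memory, and the starting register values are assumptions made for illustration. Each call starts from the same state so the results can be compared directly.

```python
def str_offset(regs, mem, rt, rn, rm):
    """str rt, [rn, rm]  -> store to rn+rm; rn is left unchanged."""
    mem[regs[rn] + regs[rm]] = regs[rt]

def str_pre_writeback(regs, mem, rt, rn, rm):
    """str rt, [rn, rm]! -> store to rn+rm, then rn = rn+rm."""
    regs[rn] += regs[rm]
    mem[regs[rn]] = regs[rt]

def str_post(regs, mem, rt, rn, rm):
    """str rt, [rn], rm  -> store to rn, then rn = rn+rm."""
    mem[regs[rn]] = regs[rt]
    regs[rn] += regs[rm]

def demo(op):
    # Hypothetical starting state: r0 = value to store, r1 = base, r2 = offset.
    regs = {"r0": 9, "r1": 4, "r2": 2}
    mem = {}
    op(regs, mem, "r0", "r1", "r2")
    return regs["r1"], mem

for op in (str_offset, str_pre_writeback, str_post):
    base_after, mem = demo(op)
    print(op.__name__, "r1 =", base_after, "mem =", mem)
```

Only the postfix form stores to the unmodified base address, and only the plain prefix form leaves r1 unchanged.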

Load/Store with Offset Prefix

Load/Store with Offset Postfix

A basic example int a[4]; a[3] = a[0] + a[1] + a[2]

A basic example
int a[4]; a[3] = a[0] + a[1] + a[2]
Let’s say r0 contains the BASE address of the array a[] (x00 here). Memory starts as 0x0: x05, 0x1: x02, 0x2: x03, 0x3: ??. Registers r1 through r15 start out unknown (??). Each row shows the instruction added at that step and the state after it executes.

Step  Instruction         Comment                 r0   r1   r2   r3   MEM[0x3]
 1    mov r1, #0          ; Go for a[0+0]         x00  x00  ??   ??   ??
 2    mov r2, #0          ; Initialize sum to 0   x00  x00  x00  ??   ??
 3    ldr r3, [r0, r1]    ; r3 = a[0+0]           x00  x00  x00  x05  ??
 4    add r2, r2, r3      ; r2 = r2 + r3          x00  x00  x05  x05  ??
 5    add r1, r1, #1      ; Go for a[0+1]         x00  x01  x05  x05  ??
 6    ldr r3, [r0, r1]    ; r3 = a[0+1]           x00  x01  x05  x02  ??
 7    add r2, r2, r3      ; r2 = r2 + r3          x00  x01  x07  x02  ??
 8    add r1, r1, #1      ; Go for a[0+2]         x00  x02  x07  x02  ??
 9    ldr r3, [r0, r1]    ; r3 = a[0+2]           x00  x02  x07  x03  ??
10    add r2, r2, r3      ; r2 = r2 + r3          x00  x02  x0A  x03  ??
11    add r1, r1, #1      ; Increment to a[0+3]   x00  x03  x0A  x03  ??
12    str r2, [r0, r1]    ; a[0+3] = r2           x00  x03  x0A  x03  x0A

Memory locations 0x0 through 0x2 are unchanged throughout, and r4 through r15 stay unknown.
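
The walkthrough can be replayed with a small Python model to check the final state. The helper names below are invented for this sketch. Like the slides, it treats memory as one array element per address (0x0 to 0x3). On real ARM hardware an `int` occupies 4 bytes, so the index register would step by 4 rather than 1.

```python
mem = [0x05, 0x02, 0x03, None]   # a[0..3]; a[3] has not been written yet
r = [0] * 16                     # r0..r15; r0 = base address of a[] = 0

def mov(rd, imm):          r[rd] = imm
def add(rd, rn, rm):       r[rd] = r[rn] + r[rm]
def add_imm(rd, rn, imm):  r[rd] = r[rn] + imm
def ldr(rd, rn, rm):       r[rd] = mem[r[rn] + r[rm]]   # ldr rd, [rn, rm]
def str_(rs, rn, rm):      mem[r[rn] + r[rm]] = r[rs]   # str rs, [rn, rm]

mov(1, 0)          # mov r1, #0        ; go for a[0+0]
mov(2, 0)          # mov r2, #0        ; initialize sum to 0
ldr(3, 0, 1)       # ldr r3, [r0, r1]  ; r3 = a[0]
add(2, 2, 3)       # add r2, r2, r3
add_imm(1, 1, 1)   # add r1, r1, #1    ; go for a[0+1]
ldr(3, 0, 1)       # ldr r3, [r0, r1]  ; r3 = a[1]
add(2, 2, 3)       # add r2, r2, r3
add_imm(1, 1, 1)   # add r1, r1, #1    ; go for a[0+2]
ldr(3, 0, 1)       # ldr r3, [r0, r1]  ; r3 = a[2]
add(2, 2, 3)       # add r2, r2, r3
add_imm(1, 1, 1)   # add r1, r1, #1    ; increment to a[0+3]
str_(2, 0, 1)      # str r2, [r0, r1]  ; a[3] = r2

print(mem, r[:4])  # [5, 2, 3, 10] [0, 3, 10, 3]
```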

Improving Performance!
int a[4]; a[3] = a[0] + a[1] + a[2]
  mov r1, #1          ; Step of one element
  mov r2, #0          ; Initialize sum to 0
  ldr r3, [r0], r1    ; r3 = a[0], then r0 = r0 + r1
  add r2, r2, r3      ; r2 = r2 + r3
  ldr r3, [r0], r1    ; r3 = a[1], then r0 = r0 + r1
  add r2, r2, r3      ; r2 = r2 + r3
  ldr r3, [r0], r1    ; r3 = a[2], then r0 = r0 + r1
  add r2, r2, r3      ; r2 = r2 + r3
  str r2, [r0]        ; a[3] = r2
MEM 0x0: x05, 0x1: x02, 0x2: x03, 0x3: x0A
REG r0: x03, r1: x01, r2: x0A, r3: x03, r4 … r15: ??
The postfix form advances r0 as part of each load, so the three separate "add r1, r1, #1" instructions are no longer needed. From 12 instructions to 9 instructions, a 25% reduction in instruction count!
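
One way to check the 9-instruction count is to write the program as data and run it through a tiny interpreter. This is a sketch under stated assumptions: r1 holds a constant step of 1, the separate index increments are dropped, memory is indexed one element per address as in the slides, and the opcode names (`ldr_post` and so on) are made up for the example.

```python
def run(program, mem):
    """Execute a list of (opcode, operands...) tuples; return the registers."""
    r = [0] * 16                        # r0 = base address of a[] = 0
    for op, *args in program:
        if op == "mov":                 # mov rd, #imm
            rd, imm = args
            r[rd] = imm
        elif op == "add":               # add rd, rn, rm
            rd, rn, rm = args
            r[rd] = r[rn] + r[rm]
        elif op == "ldr_post":          # ldr rd, [rn], rm
            rd, rn, rm = args
            r[rd] = mem[r[rn]]
            r[rn] += r[rm]
        elif op == "str":               # str rs, [rn]
            rs, rn = args
            mem[r[rn]] = r[rs]
    return r

program = [
    ("mov", 1, 1),          # step of one element
    ("mov", 2, 0),          # sum = 0
    ("ldr_post", 3, 0, 1),  # r3 = a[0], r0 += 1
    ("add", 2, 2, 3),
    ("ldr_post", 3, 0, 1),  # r3 = a[1], r0 += 1
    ("add", 2, 2, 3),
    ("ldr_post", 3, 0, 1),  # r3 = a[2], r0 += 1
    ("add", 2, 2, 3),
    ("str", 2, 0),          # a[3] = sum
]

mem = [0x05, 0x02, 0x03, None]
regs = run(program, mem)
print(len(program), mem)    # 9 [5, 2, 3, 10]
```

Going from 12 instructions to 9 removes 3 of 12, which is the 25% reduction.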

Going further, Block Data Transfer LDM/STM – Load/Store Multiple – Allow between 1 and 16 registers to be transferred to or from memory.
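
A minimal Python sketch of the idea follows. It models only the "increment after" flavour (LDMIA/STMIA in ARM assembly), which the slide does not name. The function names and the one-word-per-address memory are assumptions. In this form the lowest-numbered register pairs with the lowest address.

```python
def ldmia(regs, mem, base, reg_list, writeback=False):
    """Load consecutive words starting at regs[base] into reg_list."""
    assert 1 <= len(reg_list) <= 16
    addr = regs[base]
    for reg in sorted(reg_list):    # lowest register <- lowest address
        regs[reg] = mem[addr]
        addr += 1                   # toy memory; real ARM steps by 4 bytes
    if writeback:
        regs[base] = addr

def stmia(regs, mem, base, reg_list, writeback=False):
    """Store reg_list to consecutive words starting at regs[base]."""
    assert 1 <= len(reg_list) <= 16
    addr = regs[base]
    for reg in sorted(reg_list):
        mem[addr] = regs[reg]
        addr += 1
    if writeback:
        regs[base] = addr

regs = [0] * 16
mem = [5, 2, 3, 0, 0, 0]
ldmia(regs, mem, 0, [1, 2, 3])                  # one instruction, three loads
regs[0] = 3
stmia(regs, mem, 0, [1, 2, 3], writeback=True)  # copy them to mem[3..5]
print(mem, regs[:4])    # [5, 2, 3, 5, 2, 3] [6, 5, 2, 3]
```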

BASIC DATA PROCESSING

Architecture of ARM

Data Processing
Basic data processing instructions take three fields: Destination Register, Operand 1 Register, Operand 2

Data Processing
Basic data processing instructions
  ADD    Rd = Rn + Operand2
  SUB    Rd = Rn - Operand2
  RSB    Rd = Operand2 - Rn
  MOV    Rd = Operand2
  MVN    Rd = NOT Operand2 (bitwise complement, not arithmetic negation)
Operand2 is 12 bits long, and can be an immediate or a register. How does the ARM know?
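
The five operations can be modelled on 32-bit values in a few lines of Python. The function names simply mirror the mnemonics and the masking constant is an assumption of the sketch. Note that MVN on ARM is a bitwise NOT, and that arithmetic negation can be had from RSB with an operand of 0.

```python
MASK = 0xFFFFFFFF   # keep results to 32 bits, like an ARM register

def ADD(rn, op2): return (rn + op2) & MASK
def SUB(rn, op2): return (rn - op2) & MASK
def RSB(rn, op2): return (op2 - rn) & MASK   # reverse subtract
def MOV(op2):     return op2 & MASK
def MVN(op2):     return ~op2 & MASK         # bitwise NOT

print(hex(ADD(0xFFFFFFFF, 1)))  # 0x0         (wraps around)
print(hex(SUB(3, 5)))           # 0xfffffffe  (-2 in two's complement)
print(hex(RSB(5, 0)))           # 0xfffffffb  (0 - 5 = -5)
print(hex(MVN(5)))              # 0xfffffffa  (NOT 5, which is -6, not -5)
```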

Operand2 is Versatile!
Immediate value
– An 8-bit constant
Register
– How many bits to address our registers r0 – r15?
At most 8 bits for our immediate or 4 bits for a register. We have 4 more unaccounted-for bits…
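
For the immediate case, the classic 32-bit ARM encoding uses those 4 remaining bits as a rotate field: the 8-bit constant is rotated right by twice the field's value. The Python below is a sketch of that rule. The function names are made up, and it searches for a valid (rotate, constant) pair rather than mirroring how any particular assembler does it.

```python
def ror32(x, n):
    """Rotate a 32-bit value right by n bits."""
    n %= 32
    return ((x >> n) | (x << (32 - n))) & 0xFFFFFFFF

def encode_immediate(value):
    """Return (rot, imm8) with value == ror32(imm8, 2 * rot), or None."""
    for rot in range(16):                    # 4-bit rotate field
        imm8 = ror32(value, 32 - 2 * rot)    # undo the rotate right
        if imm8 <= 0xFF:                     # fits in 8 bits?
            return rot, imm8
    return None

print(encode_immediate(0xFF))         # (0, 255)
print(encode_immediate(0xFF000000))   # (4, 255)
print(encode_immediate(0x101))        # None: set bits are 9 positions wide
```

Constants that cannot be expressed this way have to be built in more than one instruction or loaded from memory.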

The ARM Barrel Shifter
ARM architectures make distinctive use of a piece of hardware known as a barrel shifter.
– Device moves bits in a word left or right.
Most processors have standalone instructions for shifting bits. ARM allows shifts as part of regular instructions. Allows for quick multiplication and division by powers of two.

The ARM Barrel Shifter

Reality of the hardware – There are no shift instructions – Barrel shifter can be controlled WITH an instruction – Can only be applied to operand 2 on instructions which use the ALU

Types of Shifting Logical Shifts – lsl – left – lsr – right Arithmetic Shifts – asr – right Rotates – ror – right – rrx – right with extend

Example mov r0, r1, lsl #1 This would perform a logical shift left of 1 bit on r1, and then copy the result into r0. mov r0, r1, lsl r2 This would do the same as before, but use the value of r2 for the shift amount.
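
The two instructions can be mimicked in Python as below. The register values are hypothetical, and the last line (an `add` with a shifted second operand) is an extra illustration that is not on the slide.

```python
def lsl(value, amount):
    """Logical shift left on a 32-bit value."""
    return (value << amount) & 0xFFFFFFFF

r = {"r0": 0, "r1": 5, "r2": 3}       # hypothetical starting values

r["r0"] = lsl(r["r1"], 1)             # mov r0, r1, lsl #1
print(r["r0"])                        # 10  (5 * 2)

r["r0"] = lsl(r["r1"], r["r2"])       # mov r0, r1, lsl r2
print(r["r0"])                        # 40  (5 * 2**3)

# Not from the slide: the shift rides along with an ordinary add.
r["r0"] = (r["r1"] + lsl(r["r1"], 2)) & 0xFFFFFFFF   # add r0, r1, r1, lsl #2
print(r["r0"])                        # 25  (5 + 5 * 4 = 5 * 5)
```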

Logical Shifts
Logical shifting a number left or right by one bit has the effect of doubling or halving it.
lsl
– Highest order bit shifts into the carry flag
– Lowest order bit is filled with 0.
lsr
– Lowest order bit shifts into the carry flag
– Highest order bit is filled with 0.
[Diagram: carry flag C and bits b7 … b0, before and after the shift.]
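
A small Python sketch of both shifts, returning the result together with the bit that falls into the carry flag. The function names and the `width` parameter are assumptions for the sketch. Width 8 matches the b7 … b0 diagram, while a real ARM register is 32 bits wide.

```python
def lsl_c(value, n, width=32):
    """Logical shift left by n (1 <= n <= width): returns (result, carry)."""
    carry = (value >> (width - n)) & 1          # last bit shifted out the top
    result = (value << n) & ((1 << width) - 1)  # zeros enter at the bottom
    return result, carry

def lsr_c(value, n, width=32):
    """Logical shift right by n (1 <= n <= width): returns (result, carry)."""
    carry = (value >> (n - 1)) & 1              # last bit shifted out the bottom
    return value >> n, carry                    # zeros enter at the top

print(lsl_c(0b10010110, 1, width=8))   # (44, 1): 0b00101100, b7 went to carry
print(lsr_c(0b10010110, 1, width=8))   # (75, 0): 0b01001011, b0 went to carry
print(lsl_c(3, 1), lsr_c(7, 1))        # (6, 0) (3, 1): doubling and halving
```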

Arithmetic Shift
Preserves the sign bit.
asr
– Copies the sign bit into the second most significant bit
– Shifts the least significant bit into the carry flag.
Why isn’t there an Arithmetic Shift Left?
[Diagram: carry flag C and bits b7 … b0, before and after the shift.]
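
The sketch below models `asr` in Python on a configurable width; the function name and the 8-bit test values are assumptions. It also hints at an answer to the question on the slide: a left shift fills the low end with zeros whether the value is signed or not, so `lsl` already does the job.

```python
def asr_c(value, n, width=32):
    """Arithmetic shift right by n (1 <= n < width): returns (result, carry)."""
    sign = (value >> (width - 1)) & 1
    carry = (value >> (n - 1)) & 1       # last bit shifted out the bottom
    result = value >> n
    if sign:                             # refill the vacated top bits with the sign
        result |= ((1 << n) - 1) << (width - n)
    return result, carry

# 0xF8 is -8 as an 8-bit two's complement number.
print(asr_c(0xF8, 1, width=8))   # (252, 0): 0xFC is -4, the sign is preserved
print(0xF8 >> 1)                 # 124: a logical shift would give +124 instead
print(asr_c(0x7E, 1, width=8))   # (63, 0): positive values just halve
```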

Rotations
Rotates bits from low order to high order
ror
– Moves bits from the lowest order to the highest, setting the carry bit in the process with the last bit rotated out.
rrx
– Always and only rotates by one position.
– Carry flag is dropped into the highest order bit. Lowest order bit is moved to the carry flag.
[Diagram: carry flag C and bits b7 … b0, before and after the rotate.]
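
Both rotates can be sketched in Python as follows. The function names and the 8-bit width used in the examples are assumptions chosen to match the b7 … b0 diagram.

```python
def ror_c(value, n, width=32):
    """Rotate right by n (1 <= n < width): returns (result, carry)."""
    mask = (1 << width) - 1
    result = ((value >> n) | (value << (width - n))) & mask
    carry = (result >> (width - 1)) & 1   # last bit rotated out is now the top bit
    return result, carry

def rrx_c(value, carry_in, width=32):
    """Rotate right by one through the carry flag: returns (result, carry_out)."""
    carry_out = value & 1                               # lowest bit goes to carry
    result = (value >> 1) | (carry_in << (width - 1))   # old carry enters at the top
    return result, carry_out

print(ror_c(0b00000011, 1, width=8))     # (129, 1): 0b10000001
print(rrx_c(0b00000010, 1, width=8))     # (129, 0): carry came in at b7
print(rrx_c(0b00000001, 0, width=8))     # (0, 1):   b0 left via the carry
```

With `rrx` the carry flag acts as an extra bit in the rotation, so no bit is lost across consecutive operations.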

Rotations ror rrx

Adding a Shift or Rotate Shifts and rotates can be used with: – adc, add, and – bic – cmn, cmp – eor – mov, mvn – orr – rsb – sbc, sub – teq, tst

For next time
Homework 1 will be posted tonight. Continue discussion of Chapter 2 on Thursday.