CS/COE 0447 Jarrett Billingsley Memory and Addresses CS/COE 0447 Jarrett Billingsley
Class announcements btw if you look at the "notes" alongside each slide, there are often answers to questions I ask and other extra/fun info repeat after me: loads copy from memory to registers stores copy from registers to memory it'll make more sense today. project 1 will be released this week… it's a small thing but it'll give you practice with basic assembly CS447
Memory CS447
Running programs, variables What is the memory? Control Registers Datapath Processor System Memory Persistent Storage Running programs, variables Files!!! smaller, but faster volatile big, but slow non-volatile the CPU can only run programs in memory. the CPU can only operate on data that is… where? - volatile = loses data when powered off - non-volatile = keeps data even without power, often for years - maybe in the future memory won't be volatile, or the line between system memory and persistent storage will go away… - the CPU can only operate on data in registers – so that means we have to be able to copy data between mem and regs! CS447
Bytes, bytes, bytes the memory is a big one-dimensional array of bytes what do these bytes mean? ¯\_(ツ)_/¯ i dunno lol every byte value has an address addresses start at 0, like arrays in C/Java when each byte has its own address, we call it a byte-addressable machine we do not measure memory data or addresses in bits Addr 0000 0001 0002 0003 0004 0005 0006 0007 0008 0009 000A 000B 000C 000D Val 00 30 04 DE C0 EF BE 6C 34 01 02 03 - addresses start at 0 because it's a measure of distance from the beginning (same for arrays, really) - you can think of addresses as indexes into the array of memory bytes - not many non-byte-addressable machines these days… virtually all are byte-addressable CS447
How much memory? on an n-bit byte-addressable architecture addresses are n-bit numbers* each address refers to one byte so how many bytes can your CPU access? that is, what's the range of an n-bit number? 2n bytes! machines with 32-bit addresses can access 232 B = 4GiB this is why we went to 64-bit architectures: 4GiB ain't enough - usually addresses are n bits, but it's entirely possible to have addresses bigger/smaller than the word size - 64-bit architectures can address 2^64 = 16 EiB of memory, which is ridiculous CS447
Prefixes are hard if you see KB, that means "kilobyte"… …is that 210 bytes, or 103 bytes? historically, 210 this is technically incorrect, but… most people still say "kilo, mega" to mean the powers of 2 even if they write KiB, MiB you will see powers of ten in: hard drive sizes (1TB = 909GiB) network speeds (1Mb/s = 122KiB/s) file sizes on recent versions of macOS Prefix Value Kilo (K) 103 Mega (M) 106 (1 million) Giga (G) 109 (1 billion) Tera (T) 1012 (1 trillion) Prefix Value Kibi (Ki) 210 (1,024) Mebi (Mi) 220 (1,048,576) Gibi (Gi) 230 (~1 billion) Tebi (Ti) 240 (~1 trillion) - 1000 vs. 1024 isn't too big of a difference, but the bigger the prefix, the bigger the discrepancy - 1024 bits = 1 kibibit… lmao - 1TB = 1012B ÷ 240 = 909 GiB - 1Mb/s = 125,000 B/s ÷ 210 = 122KiB/s - THANKS FOR MAKING THIS SITUATION MORE OF A PAIN, APPLE CS447
Values bigger than a byte CS447
Words, words, words (animated) our memory only holds bytes, so how do we store 32-bit words? Addr 00 01 02 03 04 05 06 07 08 09 0A 0B 0C Val 00 30 04 DE C0 EF BE 6C 34 01 02 6B 35 FF 02 we make groups. when we load a value, the CPU reassembles the bytes into a word. t0 6B35FF02 t0 6C340001 when we store a value, the CPU splits the word into bytes. - this works for anything bigger than a byte: halfwords (16-bit values), 64-bit values, etc. - but the order of the bytes is important (see in a few slides) this happens transparently: the CPU handles this for you. CS447
Addresses of things bigger than a byte the address of any value is the address of its first byte. Addr 00 01 02 03 04 05 06 07 08 09 0A 0B 0C Val 00 30 04 DE C0 EF BE 6C 34 01 02 what are the addresses of these values? - 0, 4, 7, and 8 respectively. CS447
A matter of perspective let's say there's a word at address 4 what word do those 4 bytes represent? if we think of addresses increasing downward… Addr Val ... 0004 DE 0005 C0 0006 EF 0007 BE Addr Val ... 0007 BE 0006 EF 0005 C0 0004 DE if we think of addresses increasing upward… …is it 0xDEC0EFBE? …is it 0xBEEFC0DE? - you might think of addresses increasing upward because 0 is a "low" address, and 4.1 billion is a "high" address - you will see both kinds of memory diagrams… be conscious of the labels! - beefcode. CS447
Endianness CS447
0xDEC0EFBE 0xBEEFC0DE 0 1 2 3 DE C0 EF BE Endianness big-endian: endianness is a rule for what order to put bytes in in larger values big-endian: first byte is the most significant byte little-endian: first byte is the least significant byte 0 1 2 3 DE C0 EF BE 0xDEC0EFBE 0xBEEFC0DE first byte = byte at smallest (first) address - big endian is "read it straight out in the same order as in memory" - little endian seems silly but is more common and does have some advantages in memory hardware design - nothing to do with value of bytes, only order CS447
What DOESN'T endianness affect? the arrangement of the bits within a byte only changes meaning of order of the bytes 1-byte values, arrays of bytes, ASCII strings… single bytes are not affected the ordering of bytes inside the CPU there's no need for "endian" arithmetic it only affects moving larger than 1-byte values: between the CPU and memory or between multiple computers 0xBEEFC0DE 0x7B03F77D 0xED0CFEEB 0xDEC0EFBE l l e H o H e l l o - note the bytes are still DE, C0 etc. - the CPU works with whole words CS447
Which is better: little or big? as long as you're consistent, it doesn't matter endianness pops up whenever you have sequences of bytes: files networks hardware buses and… memory! which one is MIPS? it's bi-endian – can work in either configuration but MARS uses the endianness of the computer it's running on so little-endian for everyone. - if you're designing memory hardware, little-endian has a slight advantage. - big-endian usually feels nicer to our brains, but our brains are weird. - most computers today are little-endian. but we still have to deal with it. - if you are designing serial hardware (or serial software protocols), then there is also bit endianness but that's not as common - unless you're somehow using a pre-intel macbook?? CS447
Variables, Loads, Stores loads copy from ___ to ___ Variables, Loads, Stores loads copy from ___ to ___. stores copy from ___ to ___. CS447
every variable really has two parts: an address and a value Variable addresses everything in memory has an address a super important concept: every variable really has two parts: an address and a value if you want to create a variable in memory… first you need to give it an address this extremely tedious task is handled by assemblers and compilers - EVERYTHING has an address. variables, functions, objects, arrays etc. - a variable's name is really just a mnemonic for its address. CS447
Load-store architectures in a load-store architecture, all memory accesses are done with two kinds of instructions: loads and stores (like in MIPS) load loads copy data from memory into CPU registers Registers Memory store stores copy data from CPU registers into memory - in some architectures, many instructions can access memory CS447
Operating on variables in memory (animated) we want to increment a variable that is in memory where do values have to be for the CPU to operate on them? what do we want the overall outcome to be? so, what three steps are needed to increment that variable? load the value from memory into a register add 1 to the value in the register store the value back into memory every variable access works like this!!! HLLs just hide this from you 5 4 x 4 5 t0 CS447
MIPS ISA: Variables in Memory CS447
Putting a variable in memory we can declare a global variable like this: .data x: .word 4 the Java equivalent would be static int x = 4; .data says "I'm gonna declare variables" you can declare as many as you want! to go back to writing code, use .text name type initial value - in 449, you'll see these names as well – machine code files have .data and .text segments - you should get errors if you try to write instructions in the data segment but lemme know if MARS doesn’t ok CS447
Loading and storing words you can load and store entire 32-bit words with lw and sw the instructions look like this (variable names not important): lw t1, x # loads from variable x into t1 sw t1, x # stores from t1 into variable x in MIPS, stores are written with the destination on the right. !? you can remember it with this diagram… the memory is "on the right" for both loads and stores Registers Memory lw sw - load WORD and store WORD - ARM does this too, I don't really understand it CS447
lw t0, x add t0, t0, 1 sw t0, x Read, modify, write you now know enough to increment x! first we load x into a register then add 1… and then store. let's see what values are in t0 and memory after this program runs lw t0, x add t0, t0, 1 sw t0, x CS447
Loading and storing bytes and halfwords some values are tiny to load/store bytes, we use lb/sb to load/store 16-bit (halfword) values, we use lh/sh these look and work just like lw/sw, like: lb t0, tiny # loads a byte into t0 sb t0, tiny # stores a byte into tiny …or DO THEY?!?!?!? how big are registers? what should go in those extra 16/24 bits then? - registers are 32 bits… CS447
Extension and Truncation CS447
can I get an extension sometimes you need to widen a number with fewer bits to more zero extension is easy: put 0s at the beginning. 10012 to 8 bits 0000 10012 but there are also signed numbers which we didn't talk about yet the top bit (MSB) of signed numbers is the sign (+/-) sign extension puts copies of the sign bit at the beginning 10012 to 8 bits 1111 10012 00102 to 8 bits 0000 00102 like spreading peanut butter… 111111111111 1011 - sign extension is important because it preserves the sign and value of signed numbers! - but we'll learn more about signed numbers later. CS447
lh and lhu behave similarly. E X P A N D V A L U E if you load a byte into a register… 31 00000000 10010000 If the byte is signed… what should it become? 31 11111111 10010000 lb does sign extension. If the byte is unsigned… what should it become? 31 00000000 10010000 lbu does zero extension. lh and lhu behave similarly. CS447
How does the CPU know whether it's signed or unsigned do YOU think the CPU knows this? no you have to use the right instruction. if you declare a variable as .word, use lw/sw. if you declare a variable as .half, use lh OR lhu, and sh. if you declare a variable as .byte, use lb OR lbu, and sb. in particular, lbu is usually what you want for byte variables but lb is one character shorter and just looks so nice and consistent… CS447
Truncation if we go the other way, the upper part of the value is cut off. the sign issue doesn't exist when storing, cause we're going from a larger number of bits to a smaller number therefore, there are no sbu/shu instructions this may or may not be what you wanted! sh 31 10100010 00001110 11111111 00000100 31 10100010 00001110 11111111 00000100 11111111 00000100 - THIS is what compilers are warning you about when they say "possible loss of precision" or whatever! CS447