Machine-Level Programming IV: Machine Data 9th Lecture

Slides:



Advertisements
Similar presentations
Machine/Assembler Language Putting It All Together Noah Mendelsohn Tufts University Web:
Advertisements

Dynamic Memory Allocation I Topics Basic representation and alignment (mainly for static memory allocation, main concepts carry over to dynamic memory.
Structured Data I: Homogenous Data Sept. 17, 1998 Topics Arrays –Single –Nested Pointers –Multilevel Arrays Optimized Array Code class08.ppt “The.
Structured Data II Heterogenous Data Sept. 22, 1998 Topics Structure Allocation Alignment Operating on Byte Strings Unions Byte Ordering Alpha Memory Organization.
Data Representation and Alignment Topics Simple static allocation and alignment of basic types and data structures Policies Mechanisms.
1 1 Machine-Level Programming IV: x86-64 Procedures, Data Andrew Case Slides adapted from Jinyang Li, Randy Bryant & Dave O’Hallaron.
Machine-Level Programming 6 Structured Data Topics Structs Unions.
Ithaca College 1 Machine-Level Programming VIII: Advanced Topics: alignment & Unions Comp 21000: Introduction to Computer Systems & Assembly Lang Systems.
Machine-Level Programming IV: Structured Data Topics Arrays Structs Unions CS 105 “Tour of the Black Holes of Computing”
Carnegie Mellon 1 Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition Machine-Level Programming IV: Data : Introduction.
Carnegie Mellon 1 Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition Machine-Level Programming V: Unions and Memory layout.
University of Amsterdam Computer Systems – Data in C Arnoud Visser 1 Computer Systems New to C?
University of Washington Today Lab 3 out  Buffer overflow! Finish-up data structures 1.
Carnegie Mellon 1 Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition Machine-Level Programming IV: Data Topics: Arrays.
Fabián E. Bustamante, Spring 2007 Machine-Level Prog. IV - Structured Data Today Arrays Structures Unions Next time Arrays.
Carnegie Mellon 1 Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition Machine-Level Programming IV: Data MCS284: Computer.
Assembly - Arrays תרגול 7 מערכים.
Carnegie Mellon 1 Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition Machine-Level Programming IV: Data Slides adapted.
Lecture 9 Aggregate Data Topics Arrays Structs Unions again Lab 02 comments Vim, gedit, February 22, 2016 CSCE 212 Computer Architecture.
Instructor: Fatma CORUT ERGİN
Data in Memory variables have multiple attributes symbolic name
Machine-Level Programming V: Miscellaneous Topics
Machine-Level Programming IV: Data
Machine-Level Programming IV: Data
Arrays CSE 351 Spring 2017 Instructor: Ruth Anderson
C Tutorial (part 5) CS220, Spring 2012
Machine-Level Programming IV: Structured Data
Machine-Level Programming IV: Data
Arrays.
Machine-Level Programming IV: Structured Data Sept. 30, 2008
Heterogeneous Data Structures & Alignment
Machine-Level Programming IV: x86-64 Procedures, Data
Machine-Level Programming IV: Data CSCE 312
Arrays.
Machine-level Programming IV: Data Structures
Arrays CSE 351 Winter
Feb 2 Announcements Arrays/Structs in C Lab 2? HW2 released
Carnegie Mellon Machine-Level Programming III: Procedures : Introduction to Computer Systems October 22, 2015 Instructor: Rabi Mahapatra Authors:
Machine-Level Programming 6 Structured Data
Machine-Level Programming IV: Structured Data Sept. 19, 2002
Machine-Level Programming IV: Data
Roadmap C: Java: Assembly language: OS: Machine code: Computer system:
Instructor: Phil Gibbons
Roadmap C: Java: Assembly language: OS: Machine code: Computer system:
Instructors: Majd Sakr and Khaled Harras
Machine-Level Programming V: Control: loops Comp 21000: Introduction to Computer Organization & Systems Systems book chapter 3* * Modified slides from.
Structured Data II Heterogenous Data Feb. 15, 2000
Machine-Level Programming IV: Data
Machine-Level Programming IV: Data /18-213/14-513/15-513: Introduction to Computer Systems 8th Lecture, September 20, 2018.
Instructors: Majd Sakr and Khaled Harras
Machine Level Representation of Programs (III)
Machine Level Representation of Programs (IV)
Machine-Level Programming VIII: Data Comp 21000: Introduction to Computer Systems & Assembly Lang Spring 2017 Systems book chapter 3* * Modified slides.
Arrays CSE 351 Winter 2018 Instructor: Mark Wyse Teaching Assistants:
Machine-Level Programming 5 Structured Data
Executables & Arrays CSE 351 Autumn 2018
Machine-Level Programming 5 Structured Data
Machine-Level Representation of Programs (x86-64)
Machine-Level Programming IV: Structured Data
Program Optimization CSE 238/2038/2138: Systems Programming
Structured Data I: Homogenous Data Feb. 10, 2000
Machine-Level Programming V: Control: loops Comp 21000: Introduction to Computer Organization & Systems Systems book chapter 3* * Modified slides from.
Instructor: Fatma CORUT ERGİN
Machine-Level Programming VIII: Data Comp 21000: Introduction to Computer Systems & Assembly Lang Spring 2017 Systems book chapter 3* * Modified slides.
Data Spring, 2019 Euiseong Seo
Machine-Level Programming VIII: Data Comp 21000: Introduction to Computer Systems & Assembly Lang Spring 2017 Systems book chapter 3* * Modified slides.
Machine-Level Programming IV: Structured Data
Machine-Level Programming IV: Data
Machine-Level Programming IV: Structured Data
Machine-Level Programming IV: Structured Data
Presentation transcript:

Machine-Level Programming IV: Machine Data 9th Lecture Instructor: David Trammell Slides modified from those provided by textbook authors Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective (3rd Edition)

Today Arrays Structures Floating Point One-dimensional Multi-dimensional (nested) Multi-level Structures Allocation Access Alignment Floating Point

Machine Memory is a Byte Array How do we allocate arrays of multi-byte types? Given array of data type T and length L We contiguously allocate a region of L * sizeof(T) bytes char s[12]; x x + 12 int arr[5]; x x + 4 x + 8 x + 12 x + 16 x + 20 double data[3]; x + 24 x x + 8 x + 16 char *cpa[3]; x x + 8 x + 16 x + 24

Array Access in Byte Addressable Memory Basic Principle: T A[L] Array of data type T and length L Identifier A can be used as a pointer of type (T *) to array element 0 Reference Type Value arr[4] int 3 //element 4 arr int * x //pointer to arr arr+1 int * x + 4 //pointer to arr[1] &arr[2] int * x + 8 //pointer to arr[2] arr[5] int (garbage) //(out of bounds) *(arr+1) int 5 //the value at M[x+4] &arr[i] int * ? arr + i int * ? int arr[5]; 1 5 2 3 x x + 4 x + 8 x + 12 x + 16 x + 20

Array Access in Byte Addressable Memory Basic Principle: T A[L] Array of data type T and length L (let type T have a size of K bytes) Identifier A can be used as a pointer of type (T *) to array element 0 Reference Type Value arr[4] int 3 //element 4 arr int * x //pointer to arr arr+1 int * x + 4 //pointer to arr[1] &arr[2] int * x + 8 //pointer to arr[2] arr[5] int (garbage) //(out of bounds) *(arr+1) int 5 //the value at M[x+4] &arr[i] int * x + 4i //the value at M[x+4i] arr + i int * x + K*i // int arr[5]; 1 5 2 3 x x + 4 x + 8 x + 12 x + 16 x + 20

Array Example Array elements are guaranteed to be contiguous in memory #define M 5 int a[M] = { 1, 5, 2, 1, 3 }; int b[M] = { 0, 2, 1, 3, 9 }; int c[M] = { 9, 4, 7, 2, 0 }; int a[] 1 5 2 3 16 20 24 28 32 36 int b[] 2 1 3 9 36 40 44 48 52 56 int c[] 9 4 7 2 56 60 64 68 72 76 Array elements are guaranteed to be contiguous in memory Separately declared arrays a, b and c might NOT be contiguous (although they are in this example)

Array Accessing Example int a[] 1 5 2 3 16 20 24 28 32 36 int get_digit (int *z, int digit) { return z[digit]; } Register %rdi contains starting address of array Register %rsi contains array index Desired digit at %rdi + 4*%rsi Use memory reference (%rdi,%rsi,4) X86-64 # %rdi = z (1st arg) # %rsi = digit (2nd arg) movl (%rdi,%rsi,4),%eax

Array Loop Example void inc_z(int *z) { size_t i; for (i = 0; i < 5; i++) z[i]++; } # %rdi = z (1st arg) movl $0, %eax # i = 0 jmp .L3 # goto .L3 .L4: # .L4 (loop start) addl $1, (%rdi,%rax,4) # z[i]++ addq $1, %rax # i++ .L3: # .L3 (loop test) cmpq $4, %rax # cmp i to 4 jbe .L4 # if i <= 4 goto .L4 rep; ret

Multidimensional (Nested) Arrays Declaration: T A[R][C]; 2D array of data type T R rows, C columns Type T element requires K bytes Array Size is: R * C * K bytes Arrangement in memory Row-Major Ordering (that is, all of row 0, then all of row 1 … etc.) int A[R][C]; Board: show 3D example: a[2][3][2] to illustrate the idea of row major as enumerating indices from right to left … A[0][0] A[0][C-1] A[1][0] A[1][C-1] A[R-1][0] A[R-1][C-1] 4*R*C Bytes

Nested Array Example “Row-Major” ordering in memory int MAT[4][5] = {{1, 5, 2, 0, 6}, {1, 5, 2, 1, 3 }, {1, 5, 2, 1, 7 }, {1, 5, 2, 2, 1 }}; 1 5 2 6 1 5 2 3 1 5 2 7 1 5 2 MAT[4][5] 76 96 116 136 156 MAT[] is an array of 4 elements (contiguously allocated) Each element in MAT[] is an array of 5 ints (allocated contiguously) “Row-Major” ordering in memory

Nested Array Row Access Address to the start of Each Row If A[i] is an array of C elements … Starting address A + i * (C * K) 1 5 2 6 1 5 2 3 1 5 2 7 1 5 2 76 96 116 136 156

Nested Array, Row Access: MAT[4][5] 1 5 2 6 3 7 int *get_row(int index, int MAT[][]){ return MAT[index]; } # %rdi = index # %rsi = MAT //pointer to start start of matrix leaq (%rdi,%rdi,4),%rax #index + index*4 == 5*index leaq (%rsi,%rax,4),%rax # MAT + (20 * index) Row Vector MAT[index] is an array of 5 int’s Starting address: MAT +(20*index) Machine Code 1st leaq is for math, 2nd computes an address. How can we tell? Address is: MAT + 4*(5*index) == MAT + 20*index

Nested Array Element Access Array Elements A[i][j] is element of type T, which uses K bytes &A[i][j] is: A + i * (C * K) + j * K == A + K*( C*i + j )

Nested Array Element Access Array Elements A[i][j] is element of type T, which uses K bytes &A[i][j] is: A + i * (C * K) + j * K == A + K*( C*i + j ) MAT 1 5 2 6 3 7 int get_elem(int index, int digit, int MAT[][]){ return MAT[index][digit]; } # %rdi = index # %rsi = digit # %rdx = MAT //pointer to start start of matrix leaq (%rdi,%rdi,4), %rax # rax = 5*index addl %rax, %rsi # rsi = 5*index + digit movl (%rdx,%rsi,4), %eax # Mem[MAT+4*(5*index+digit)]

Multi-Level Array Example Variable ap denotes array of 3 elements Each element is a pointer 8 bytes Each pointer points to array of int’s int a[5] = { 1, 5, 2, 1, 3 }; int b[5] = { 0, 2, 1, 3, 9 }; int c[5] = { 9, 4, 7, 2, 0 }; int *ap[3] = {b, a, c}; //pointers! 36 160 16 56 168 176 ap a[] b[] c[] 1 5 2 3 20 24 28 32 9 40 44 48 52 4 7 60 64 68 72 76

Element Access in Multi-Level Array int get_ap_digit (size_t index, size_t digit, int ap[][]) { return ap[index][digit]; } # salq $2, %rsi # 4*digit addq (%rdx,%rdi,8),%rsi # rsi = ap[index] + 4*digit movl (%rsi), %eax # return *p ret Computation Element access Mem[Mem[ap+8*index]+4*digit] Must do two memory reads First get pointer to row array Then access element within array

Array Element Accesses Nested array Multi-level array int get_MAT_digit (size_t index, size_t digit, int MAT[][]) { return MAT[index][digit]; } int get_ap_digit (size_t index, size_t digit, int *ap[]) { return ap[index][digit]; } Accesses looks similar in C, but address computations very different: Mem[MAT+20*index+4*digit] Mem[Mem[ap+8*index]+4*digit]

Next Arrays Structures Unions Floating Point One-dimensional Multi-dimensional (nested) Multi-level Structures Allocation Access Alignment Unions Floating Point

Structure Representation next 16 24 32 struct rec { int a[4]; size_t i; struct rec *next; }; a Structure represented as block of memory Big enough to hold all of the fields Fields ordered according to declaration Even if another ordering could yield a more compact representation Compiler determines overall size + positions of fields

Generating Pointer to Structure Member next 16 24 32 r+4*idx struct rec { int a[4]; size_t i; struct rec *next; }; a int *get_ap (struct rec *r, size_t idx) { return &r->a[idx]; } Generating Pointer to Array Element Offset of each structure member determined at compile time Compute as r + 4*idx # r in %rdi # idx in %rsi leaq (%rdi,%rsi,4), %rax ret

Structures & Alignment Unaligned Data Aligned Data Primitive data type requires K bytes Address must be multiple of K struct S1 { char c; int i[2]; double v; } *p; c i[0] i[1] v p p+1 p+5 p+9 p+17 c 3 bytes i[0] i[1] 4 bytes v p+0 p+4 p+8 p+16 p+24 Multiple of 4 Multiple of 8 Multiple of 8 Multiple of 8

Alignment Principles Aligned Data Why Align? If a primitive data type requires K bytes Then address must be multiple of K This is required on some machines X86-64 doesn’t require it, but it is advised Why Align? Memory is accessed in chunks for efficiency Chunks are aligned on 4 or 8 byte boundaries Inefficient to access datum that spans boundaries Compiler puts gaps in structure to ensure alignment

Specific Cases of Alignment (x86-64) 1 byte: char, … no restrictions on address 2 bytes: short, … lowest 1 bit of address must be 02 4 bytes: int, float, … lowest 2 bits of address must be 002 8 bytes: double, long, pointers, … lowest 3 bits of address must be 0002 16 bytes: long double (GCC on Linux) lowest 4 bits of address must be 00002 (or lowest hex digit since a hex digit is 4 bits)

Satisfying Alignment with Structures Within structure: Must satisfy each element’s alignment requirement Overall structure placement Each structure has alignment requirement K K = Largest alignment of any element Initial address & structure length must be multiples of K Example: K = 8, due to double element struct S1 { char c; int i[2]; double v; } *p; c 3 bytes i[0] i[1] 4 bytes v p+0 p+4 p+8 p+16 p+24 Multiple of 4 Multiple of 8 Multiple of 8 Multiple of 8

Meeting Overall Alignment Requirement For largest alignment requirement K Overall structure must be multiple of K struct S2 { double v; int i[2]; char c; } *p; v i[0] i[1] c 7 bytes p+0 p+8 p+16 p+24 Multiple of K=8

Arrays of Structures Overall structure length multiple of K struct S2 { double v; int i[2]; char c; } a[10]; Overall structure length multiple of K Satisfy alignment requirement for every element a[0] a[1] a[2] • • • a+0 a+24 a+48 a+72 v i[0] i[1] c 7 bytes a+24 a+32 a+40 a+48

Saving Space in a Structure Always put largest data types first Effect (K=4) struct S4 { char c; int i; char d; } *p; struct S5 { int i; char c; char d; } *p; c 3 bytes i d 3 bytes i c d 2 bytes

Union Allocation Allocate according to largest element union U1 { char c; int i[2]; double v; } *up; c i[0] i[1] v up+0 up+4 up+8 struct S1 { char c; int i[2]; double v; } *sp; c 3 bytes i[0] i[1] 4 bytes v sp+0 sp+4 sp+8 sp+16 sp+24

Using Union to Determine Endianness typedef union { char c; int i; } char_int_u; c 3 bytes i 0 4 int isLittleEndian() { charint_u test; test.i = 1; return test.c; }

Using Union to Access Bit Patterns union uf_union { unsigned u; float f; }; u f 4 union uf_union uf; unsigned s, exp, frac; uf.f = -1.5; s = uf.u >> 31; exp = (uf.u >> 23) & 0xFF; frac = uf.u & 0x007FFFFF; printf("Float %f\n", uf.f); printf("sign: 0x%X\n", s); printf("exp: 0x%X\n", exp); printf("frac: 0x%X\n", frac); Not the same as casting (unsigned)uf.f !

Byte Ordering Example x86 x86_64 union { unsigned char c[8]; unsigned short s[4]; unsigned int i[2]; unsigned long l[1]; } example; x86 c[0] c[1] c[2] c[3] c[4] c[5] c[6] c[7] s[0] s[1] s[2] s[3] i[0] i[1] l[0] x86_64 c[0] c[1] c[2] c[3] c[4] c[5] c[6] c[7] s[0] s[1] s[2] s[3] i[0] i[1] l[0]

Byte Ordering on IA32 Little Endian Output on IA32 (x86_32): f0 f1 f2 c[0] c[1] c[2] c[3] c[4] c[5] c[6] c[7] s[0] s[1] s[2] s[3] i[0] i[1] l[0] LSB MSB LSB MSB Print Output on IA32 (x86_32): Characters 0-7 == [0xf0,0xf1,0xf2,0xf3,0xf4,0xf5,0xf6,0xf7] Shorts 0-3 == [0xf1f0,0xf3f2,0xf5f4,0xf7f6] Ints 0-1 == [0xf3f2f1f0,0xf7f6f5f4] Long 0 == [0xf3f2f1f0]

Byte Ordering on Sun SPARC Big Endian f0 f1 f2 f3 f4 f5 f6 f7 c[0] c[1] c[2] c[3] c[4] c[5] c[6] c[7] s[0] s[1] s[2] s[3] i[0] i[1] l[0] MSB LSB MSB LSB Print Output on Sun: Characters 0-7 == [0xf0,0xf1,0xf2,0xf3,0xf4,0xf5,0xf6,0xf7] Shorts 0-3 == [0xf0f1,0xf2f3,0xf4f5,0xf6f7] Ints 0-1 == [0xf0f1f2f3,0xf4f5f6f7] Long 0 == [0xf0f1f2f3]

Byte Ordering on x86-64 Little Endian Output on x86-64: f0 f1 f2 f3 f4 c[0] c[1] c[2] c[3] c[4] c[5] c[6] c[7] s[0] s[1] s[2] s[3] i[0] i[1] l[0] LSB MSB Print Output on x86-64: Characters 0-7 == [0xf0,0xf1,0xf2,0xf3,0xf4,0xf5,0xf6,0xf7] Shorts 0-3 == [0xf1f0,0xf3f2,0xf5f4,0xf7f6] Ints 0-1 == [0xf3f2f1f0,0xf7f6f5f4] Long 0 == [0xf7f6f5f4f3f2f1f0]

Last … Arrays Structures Floating Point in x86-64 One-dimensional Multi-dimensional (nested) Multi-level Structures Allocation Access Alignment Floating Point in x86-64

XMM Registers: 16 registers (each holding 16 bytes) Sixteen 8-bit integers Eight 16-bit integers Four 32-bit integers Four single-precision floats Two double-precision floats One single-precision float One double-precision float

Scalar or SIMD (single instructiom multiple data) + %xmm0 %xmm1 addss %xmm0,%xmm1 Scalar Operations: Single Precision SIMD Operations: Single Precision Scalar Operations: Double Precision + %xmm0 %xmm1 addps %xmm0,%xmm1 + %xmm0 %xmm1 addsd %xmm0,%xmm1

FP Basics Arguments passed in %xmm0, %xmm1, ... Result returned in %xmm0 All XMM registers caller-saved (no callee saved XMM regs) addss means add scalar single-precision addsd means add scalar double-precision float fadd(float x, float y) { return x + y; } double dadd(double x, double y) { return x + y; } # x in %xmm0, y in %xmm1 addss %xmm1, %xmm0 ret # x in %xmm0, y in %xmm1 addsd %xmm1, %xmm0 ret

FP Memory Referencing Integer (and pointer) arguments passed in regular registers FP values passed in XMM registers Different mov instructions to move between XMM registers, and between memory and XMM registers double dincr(double *p, double v) { double x = *p; *p = x + v; return x; } # p in %rdi, v in %xmm0 movapd %xmm0, %xmm1 # Copy v movsd (%rdi), %xmm0 # x = *p addsd %xmm0, %xmm1 # t = x + v movsd %xmm1, (%rdi) # *p = t ret

Other Aspects of FP Code Lots of instructions Floating-point comparisons Instructions ucomiss and ucomisd Set condition codes CF, ZF, and PF (parity flag) FP constants: stored in memory like a string Set XMM reg to 0.0 with xorpd %xmm0,%xmm0 addsd .LC0(%rip), %xmm0 addsd .LC1(%rip), %xmm0 … .LC0: .long 2920577761 .long 1079533895 .LC1: .long 1202590843 .long 1079597793

Summary Arrays Structures Floating Point Elements packed into contiguous region of memory Use index arithmetic to locate individual elements Structures Elements packed into single region of memory Access using offsets determined by compiler Padding may be required to to ensure alignment Floating Point Data held and operated on in XMM registers Similar to Integer machine code, but more complex instructions

Reading Read: 3.8 and 3.9 (pages 255  ) Skip or Skim 3.10 Skim 3.11 floating point This Thursday: mid-term exam review Mid-Term is Tuesday March 6th