Nicholas Weaver & Vladimir Stojanovic

Slides:



Advertisements
Similar presentations
CS1061: C Programming Lecture 21: Dynamic Memory Allocation and Variations on struct A. O’Riordan, 2004, 2007 updated.
Advertisements

CSCI 171 Presentation 11 Pointers. Pointer Basics.
1 Memory Allocation Professor Jennifer Rexford COS 217.
Kernighan/Ritchie: Kelley/Pohl:
CS 61C: Great Ideas in Computer Architecture Introduction to C, Part III Instructors: Krste Asanovic & Vladimir Stojanovic
Introduction to C Programming in Unix Environment - II Abed Asi Extended System Programming Laboratory (ESPL) CS BGU Fall 2014/2015 Some slides.
Memory Arrangement Memory is arrange in a sequence of addressable units (usually bytes) –sizeof( ) return the number of units it takes to store a type.
CS61C L04 Introduction to C (pt 2) (1) Garcia, Fall 2011 © UCB Lecturer SOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C.
CS 61C L4 Structs (1) A Carle, Summer 2005 © UCB inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #4: Strings & Structs
CS 61C L03 C Arrays (1) A Carle, Summer 2005 © UCB inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #3: C Pointers & Arrays
CS61C L06 C Memory Management (1) Garcia, Fall 2006 © UCB Lecturer SOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C : Machine.
1 CS 201 Dynamic Data Structures Debzani Deb. 2 Run time memory layout When a program is loaded into memory, it is organized into four areas of memory.
Run-time Environment and Program Organization
CS 61C: Great Ideas in Computer Architecture Lecture 4: Introduction to C, Part III Instructor: Sagar Karandikar
Pointers Applications
CS 61C L03 C Arrays (1) A Carle, Summer 2006 © UCB inst.eecs.berkeley.edu/~cs61c/ CS61C : Machine Structures Lecture #3: C Pointers & Arrays
CS 61C L04 C Structures, Memory Management (1) Garcia, Fall 2004 © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c.
CS 61C Great Ideas in Computer Architecture Lecture 3: Introduction to C, Part II Instructor: Sagar Karandikar
Arrays and Pointers in C Alan L. Cox
CS 11 C track: lecture 5 Last week: pointers This week: Pointer arithmetic Arrays and pointers Dynamic memory allocation The stack and the heap.
17. ADVANCED USES OF POINTERS. Dynamic Storage Allocation Many programs require dynamic storage allocation: the ability to allocate storage as needed.
1 C - Memory Simple Types Arrays Pointers Pointer to Pointer Multi-dimensional Arrays Dynamic Memory Allocation.
Chapter 0.2 – Pointers and Memory. Type Specifiers  const  may be initialised but not used in any subsequent assignment  common and useful  volatile.
CS 61C: Great Ideas in Computer Architecture Introduction to C, Part II Instructors: John Wawrzynek & Vladimir Stojanovic
University of Washington Today Finished up virtual memory On to memory allocation Lab 3 grades up HW 4 up later today. Lab 5 out (this afternoon): time.
CS 61C: Great Ideas in Computer Architecture Introduction to C, Part III Instructors: John Wawrzynek & Vladimir Stojanovic
Pointers: Basics. 2 What is a pointer? First of all, it is a variable, just like other variables you studied  So it has type, storage etc. Difference:
Pointers in C Computer Organization I 1 August 2009 © McQuain, Feng & Ribbens Memory and Addresses Memory is just a sequence of byte-sized.
Cs641 pointers/numbers.1 Overview °Stack and the Heap °malloc() and free() °Pointers °numbers in binary.
Computer Organization and Design Pointers, Arrays and Strings in C Montek Singh Sep 18, 2015 Lab 5 supplement.
Topics memory alignment and structures typedef for struct names bitwise & for viewing bits malloc and free (dynamic storage in C) new and delete (dynamic.
Topic 4: C Data Structures CSE 30: Computer Organization and Systems Programming Winter 2011 Prof. Ryan Kastner Dept. of Computer Science and Engineering.
1 Lecture07: Memory Model 5/2/2012 Slides modified from Yin Lou, Cornell CS2022: Introduction to C.
CS 61C: Great Ideas in Computer Architecture Introduction to C, Part II Instructors: Krste Asanovic & Vladimir Stojanovic
MORE POINTERS Plus: Memory Allocation Heap versus Stack.
CMPSC 16 Problem Solving with Computers I Spring 2014 Instructor: Lucas Bang Lecture 11: Pointers.
Chapter 5 Pointers and Arrays Ku-Yaw Chang Assistant Professor, Department of Computer Science and Information Engineering Da-Yeh.
C Tutorial - Pointers CS 537 – Introduction to Operating Systems.
Pointers: Basics. 2 Address vs. Value Each memory cell has an address associated with it
CS 61C: Great Ideas in Computer Architecture C Pointers Instructors: Vladimir Stojanovic & Nicholas Weaver 1.
DYNAMIC MEMORY ALLOCATION. Disadvantages of ARRAYS MEMORY ALLOCATION OF ARRAY IS STATIC: Less resource utilization. For example: If the maximum elements.
Arrays and Pointers (part 1) CSE 2031 Fall July 2016.
Variables Bryce Boe 2012/09/05 CS32, Summer 2012 B.
1 Binghamton University Exam 1 Review CS Binghamton University Birds eye view -- Topics Information Representation Bit-level manipulations Integer.
CSE 220 – C Programming malloc, calloc, realloc.
Dynamic Allocation in C
EGR 2261 Unit 11 Pointers and Dynamic Variables
Stack and Heap Memory Stack resident variables include:
Computer Organization and Design Pointers, Arrays and Strings in C
Winter 2009 Tutorial #6 Arrays Part 2, Structures, Debugger
C Primer.
Run-time organization
Pointers and Memory Overview
CSCI206 - Computer Organization & Programming
The Hardware/Software Interface CSE351 Winter 2013
CS 61C: Great Ideas in Computer Architecture Lecture 3: Pointers
Clear1 and Clear2 clear1(int array[], int size) { int i; for (i = 0; i < size; i += 1) array[i] = 0; } clear2(int *array, int size) {
Lecturer SOE Dan Garcia
Memory Allocation CS 217.
Outline Defining and using Pointers Operations on pointers
Dynamic Memory Allocation (and Multi-Dimensional Arrays)
Dynamic Memory A whole heap of fun….
Pointers The C programming language gives us the ability to directly manipulate the contents of memory addresses via pointers. Unfortunately, this power.
Dynamic Allocation in C
Dynamic Memory A whole heap of fun….
Homework Continue with K&R Chapter 5 Skipping sections for now
C Programming - Lecture 5
Dynamic Memory – A Review
15213 C Primer 17 September 2002.
Module 13 Dynamic Memory.
Presentation transcript:

Nicholas Weaver & Vladimir Stojanovic CS 61C: Great Ideas in Computer Architecture C Arrays, Strings and Memory Management Instructors: Nicholas Weaver & Vladimir Stojanovic http://inst.eecs.Berkeley.edu/~cs61c/sp16

Reminder! C Arrays are Very Primitive An array in C does not know its own length, and its bounds are not checked! Consequence: We can accidentally access off the end of an array Consequence: We must pass the array and its size to any procedure that is going to manipulate it Segmentation faults and bus errors: These are VERY difficult to find; be careful! (You’ll learn how to debug these in lab) But also “fun” to exploit: “Stack overflow exploit”, maliciously write off the end of an array on the stack “Heap overflow exploit”, maliciously write off the end of an array on the heap If you write programs in C, you will write code that has array-bounds errors!

Use Defined Constants Array size n; want to access from 0 to n-1, so you should use counter AND utilize a variable for declaration & incrementation Bad pattern int i, ar[10]; for(i = 0; i < 10; i++){ ... } Better pattern const int ARRAY_SIZE = 10; int i, a[ARRAY_SIZE]; for(i = 0; i < ARRAY_SIZE; i++){ ... } SINGLE SOURCE OF TRUTH You’re utilizing indirection and avoiding maintaining two copies of the number 10 DRY: “Don’t Repeat Yourself” And don’t forget the < rather than <=: When Nick took 60c, he lost a day to a “segfault in a malloc called by printf on large inputs”: Had a <= rather than a < in a single array initialization!

When Arrays Go Bad: Heartbleed In TLS encryption, messages have a length… And get copied into memory before being processed One message was “Echo Me back the following data, its this long...” But the (different) echo length wasn’t checked to make sure it wasn’t too big... So you send a small request that says “read back a lot of data” And thus get web requests with auth cookies and other bits of data from random bits of memory… M 5 HB L=5000 107:Oul7;GET / HTTP/1.1\r\n Host: www.mydomain.com\r\nCookie: login=1 17kf9012oeu\r\nUser-Agent: Mozilla….

Pointing to Different Size Objects Modern machines are “byte-addressable” Hardware’s memory composed of 8-bit storage cells, each has a unique address Type declaration tells compiler how many bytes to fetch on each access through pointer E.g., 32-bit integer stored in 4 consecutive 8-bit bytes But we actually want “Byte alignment” Some processors will not allow you to address 32b values without being on 4 byte boundaries Others will just be very slow if you try to access “unaligned” memory. short *y int *x char *z 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 Byte address 8-bit character stored in one byte 16-bit short stored in two bytes 32-bit integer stored in four bytes

sizeof() operator sizeof(type) returns number of bytes in object But number of bits in a byte is not standardized In olden times, when dragons roamed the earth, bytes could be 5, 6, 7, 9 bits long By definition, sizeof(char)==1 C does not play well with Unicode (unlike Python), so no char c = ‘💩’ Can take sizeof(arg), or sizeof(structtype) Structure types get padded to ensure structures are also aligned We’ll see more of sizeof when we look at dynamic memory management

Pointer Arithmetic pointer + number pointer – number e.g., pointer + 1 adds 1 something to a pointer char *p; char a; char b; p = &a; p += 1; int *p; int a; int b; In each, p now points to b (Assuming compiler doesn’t reorder variables in memory. Never code like this!!!!) Adds 1*sizeof(char) to the memory address Adds 1*sizeof(int) to the memory address Pointer arithmetic should be used cautiously (and by Nick’s standard, “cautious” == Almost NEVER!)

Arrays and Pointers a[i]  *(a+i) Passing arrays: Array  pointer to the initial (0th) array element a[i]  *(a+i) An array is passed to a function as a pointer The array size is lost! Usually bad style to interchange arrays and pointers Avoid pointer arithmetic! Must explicitly pass the size Really int *array int foo(int array[], unsigned int size) { … array[size - 1] … } main(void) int a[10], b[5]; … foo(a, 10)… foo(b, 5) …

Arrays and Pointers What does this print? 4 foo(int array[], unsigned int size) { … printf(“%d\n”, sizeof(array)); } main(void) int a[10], b[5]; … foo(a, 10)… foo(b, 5) … printf(“%d\n”, sizeof(a)); What does this print? 4 ... because array is really a pointer (and a pointer is architecture dependent, but likely to be 4 or 8 on modern 32-64 bit machines!) One difference is that you cannot reassign the name a to point to another object, so the compiler knows at compile time the size of the array, which {10*8}, which is why it prints 80 What does this print? 40

Arrays and Pointers These code sequences have the same effect! int i; int array[10]; for (i = 0; i < 10; i++) { array[i] = …; } int *p; int array[10]; for (p = array; p < &array[10]; p++) { *p = …; } These code sequences have the same effect! But the former is much more readable:

Clickers/Peer Instruction Time int x[] = { 2, 4, 6, 8, 10 }; int *p = x; int **pp = &p; (*pp)++; (*(*pp))++; printf("%d\n", *p); Result is: A: 2 B: 3 C: 4 D: 5 E: None of the above

Administrivia hw0 - edX is due on Sunday 1/31, hw0 mini bio is due in lab next week, turn in to your lab TA MT1 will be Thursday, 2/25 from 6-8PM MT2 will be Monday, 4/4 from 7-9 PM Email William and Fred if you have conflicts (except for 16B, that is currently being resolved)

61C Instructor in the News… Nick presented at the Enigma conference: “The Golden Age of Bulk Surveillance” https://www.youtube.com/watch?v=zqnKdGnzoh0

Concise strlen() int strlen(char *s) { char *p = s; while (*p++) ; /* Null body of while */ return (p – s – 1); } What happens if there is no zero character at end of string?

Point past end of array? Array size n; want to access from 0 to n-1, but test for exit by comparing to address one element past the array int ar[10], *p, *q, sum = 0; ... p = &ar[0]; q = &ar[10]; while (p != q) /* sum = sum + *p; p = p + 1; */ sum += *p++; Is this legal? C defines that one element past end of array must be a valid address, i.e., not cause an error BUT DO NOT DO THIS: This is ONLY valid for arrays declared in this manner, NOT arrays declared using dynamic allocation (malloc)!

Valid Pointer Arithmetic Add an integer to a pointer. Subtracting an integer from a pointer Subtract 2 pointers (in the same array) Compare pointers (<, <=, ==, !=, >, >=) Compare pointer to NULL (indicates that the pointer points to nothing) Everything else illegal since makes no sense: adding two pointers multiplying pointers subtract pointer from integer

Arguments in main() To get arguments to the main function, use: int main(int argc, char *argv[]) What does this mean? argc contains the number of strings on the command line (the executable counts as one, plus one for each argument). Here argc is 2: unix% sort myFile argv is a pointer to an array containing the arguments as strings

Example foo hello 87 "bar baz" argc = 4 /* number arguments */ argv[0] = "foo", argv[1] = "hello", argv[2] = "87", argv[3] = "bar baz", Array of pointers to strings

C Memory Management How does the C compiler determine where to put all the variables in machine’s memory? How to create dynamically sized objects? To simplify discussion, we assume one program runs at a time, with access to all of memory. Later, we’ll discuss virtual memory, which lets multiple programs all run at same time, each thinking they own all of memory.

C Memory Management stack heap static data code Memory Address (32 bits assumed here) ~ FFFF FFFFhex stack Program’s address space contains 4 regions: stack: local variables inside functions, grows downward heap: space requested for dynamic data via malloc(); resizes dynamically, grows upward static data: variables declared outside functions, does not grow or shrink. Loaded when program starts, can be modified. code: loaded when program starts, does not change heap static data code ~ 0000 0000hex 21

Where are Variables Allocated? If declared outside a function, allocated in “static” storage If declared inside function, allocated on the “stack” and freed when function returns main() is treated like a function int myGlobal; main() { int myTemp; }

The Stack fooA frame fooB frame fooC frame fooD frame Stack Pointer Every time a function is called, a new frame is allocated on the stack Stack frame includes: Return address (who called me?) Arguments Space for local variables Stack frames uses contiguous blocks of memory; stack pointer indicates start of stack frame When function ends, stack pointer moves up; frees memory for future stack frames We’ll cover details later for MIPS processor fooA() { fooB(); } fooB() { fooC(); } fooC() { fooD(); } fooA frame fooB frame fooC frame fooD frame Stack Pointer

Stack Animation Last In, First Out (LIFO) data structure stack main () } Stack grows down Stack Pointer Stack Pointer void a (int m) { b(1); } Stack Pointer void b (int n) { c(2); } Stack Pointer void c (int o) { d(3); } Stack Pointer void d (int p) { }

Managing the Heap C supports functions for heap management: malloc() allocate a block of uninitialized memory calloc() allocate a block of zeroed memory free() free previously allocated block of memory realloc() change size of previously allocated block careful – it might move!

Malloc() void *malloc(size_t n): Allocate a block of uninitialized memory NOTE: Subsequent calls probably will not yield adjacent blocks n is an integer, indicating size of requested memory block in bytes size_t is an unsigned integer type big enough to “count” memory bytes Returns void* pointer to block; NULL return indicates no more memory Additional control information (including size) stored in the heap for each allocated block. Examples: int *ip; ip = (int *) malloc(sizeof(int)); typedef struct { … } TreeNode; TreeNode *tp = (TreeNode *) malloc(sizeof(TreeNode)); sizeof returns size of given type in bytes, produces more portable code “Cast” operation, changes type of a variable. Here changes (void *) to (int *)

Managing the Heap void free(void *p): Releases memory allocated by malloc() p is pointer containing the address originally returned by malloc() int *ip; ip = (int *) malloc(sizeof(int)); ... .. .. free((void*) ip); /* Can you free(ip) after ip++ ? */ typedef struct {… } TreeNode; TreeNode *tp = (TreeNode *) malloc(sizeof(TreeNode)); free((void *) tp); When insufficient free memory, malloc() returns NULL pointer; Check for it! if ((ip = (int *) malloc(sizeof(int))) == NULL){ printf(“\nMemory is FULL\n”); exit(1); /* Crash and burn! */ } When you free memory, you must be sure that you pass the original address returned from malloc() to free(); Otherwise, system exception (or worse)! And never use that memory again “use after free”

Using Dynamic Memory Root Key=10 Left Right Key=5 Left Right Key=16 typedef struct node { int key; struct node *left; struct node *right; } Node; Node *root = NULL; Node *create_node(int key, Node *left, Node *right) { Node *np; if ( (np = (Node*) malloc(sizeof(Node))) == NULL) { printf("Memory exhausted!\n"); exit(1); } else { np->key = key; np->left = left; np->right = right; return np; } void insert(int key, Node **tree) if ( (*tree) == NULL) { (*tree) = create_node(key, NULL, NULL); return; } if (key <= (*tree)->key) insert(key, &((*tree)->left)); insert(key, &((*tree)->right)); insert(10, &root); insert(16, &root); insert(5, &root); insert(11 , &root); Root Key=10 Left Right Key=5 Left Right Key=16 Left Right Key=11 Left Right

Observations Code, Static storage are easy: they never grow or shrink Stack space is relatively easy: stack frames are created and destroyed in last-in, first-out (LIFO) order Managing the heap is tricky: memory can be allocated / deallocated at any time If you forget to deallocate memory: “Memory Leak” Your program will eventually run out of memory If you call free twice on the same memory: “Double Free” Possible crash or exploitable vulnerability If you use data after calling free: “Use after free”

Clickers/Peer Instruction! int x = 2; int result; int foo(int n) { int y; if (n <= 0) { printf("End case!\n"); return 0; } else { y = n + foo(n-x); return y; } result = foo(10); Right after the printf executes but before the return 0, how many copies of x and y are there allocated in memory? A: #x = 1, #y = 1 B: #x = 1, #y = 5 C: #x = 5, #y = 1 D: #x = 1, #y = 6 E: #x = 6, #y = 6

And In Conclusion, … C has three main memory segments in which to allocate data: Static Data: Variables outside functions Stack: Variables local to function Heap: Objects explicitly malloc-ed/free-d. Heap data is biggest source of bugs in C code