Data Structure 김용성 Data Structure in C
Chapter 1 BASIC CONCEPTS

1.1 Overview: System Life Cycle
Foundations: data abstraction, algorithm specification, and performance analysis and measurement

System life cycle
(1) Requirements: define the purpose of the project
(2) Analysis: break the problem down into manageable pieces
    Approaches: bottom-up and top-down
System life cycle (cont.)
(3) Design: abstract data types (language independent) and specification of algorithms
(4) Refinement and coding: choose representations for the data objects and write algorithms for each operation on them; the efficiency of the algorithms depends on the representation chosen for the data objects
(5) Verification: correctness proofs for the program, testing the program with a variety of input data, and removing errors
1.2 Algorithm Specification

Def: an algorithm is a finite set of instructions that accomplishes a particular task.
Criteria
    Input: zero or more quantities are externally supplied
    Output: at least one quantity is produced
    Definiteness: each instruction is clear and unambiguous
    Finiteness: the algorithm terminates after a finite number of steps
    Effectiveness: each instruction is not only definite but also feasible
An algorithm can be described in many ways
    natural language
    flowchart
Ex 1.1> Selection sort: sort a set of n >= 1 integers

Simple solution
    "From those integers that are currently unsorted, find the smallest and place it next in the sorted list."
This statement describes the sorting problem but is not an algorithm: it does not say where the integers are stored initially or where the result is placed.

The i-th integer is stored in the i-th array position, list[i], 0 <= i < n:

    for (i = 0; i < n; i++) {
        Examine list[i] to list[n-1] and suppose that the
        smallest integer is at list[min];
        Interchange list[i] and list[min];
    }
Swap: function version (a macro version can be used instead)

    void swap(int *x, int *y)   /* both parameters are pointers to ints */
    {
        int temp = *x;   /* declares temp as an int and assigns to it the contents of what x points to */
        *x = *y;         /* stores what y points to into the location where x points */
        *y = temp;       /* places the contents of temp in the location pointed to by y */
    }

Implementation: selection sort
Theorem 1.1: Function sort(list, n) correctly sorts a set of n >= 1 integers. The result remains in list[0], ..., list[n-1] such that list[0] <= list[1] <= ... <= list[n-1].

pf> When the outer for loop completes its iteration for i = q, list[q] <= list[r] for every r with q < r < n. Moreover, in every later iteration (i > q), list[0] through list[q] are unchanged. Hence, after the last iteration of the outer for loop (i = n-2), list[0] <= list[1] <= ... <= list[n-1].

Ex 1.2> [Binary search]: the integers are already sorted and stored in the array list[]. Compare searchnum with list[middle]:
(1) searchnum < list[middle]: right = middle - 1
(2) searchnum = list[middle]: return middle
(3) searchnum > list[middle]: left = middle + 1
Binary search algorithm

    while (there are more integers to check) {
        middle = (left + right) / 2;
        if (searchnum < list[middle])
            right = middle - 1;
        else if (searchnum == list[middle])
            return middle;
        else
            left = middle + 1;
    }

Implementation of compare: macro version

    #define COMPARE(x, y) (((x) < (y)) ? -1 : ((x) == (y)) ? 0 : 1)

Implementation of compare: function version
    int compare(int x, int y)
    {
        /* compare x and y, return -1 for less than, 0 for equal, 1 for greater */
        if (x < y) return -1;
        else if (x == y) return 0;
        else return 1;
    }

Searching an ordered list

    int binsearch(int list[], int searchnum, int left, int right)
    {
        /* search list[0] <= list[1] <= ... <= list[n-1] for searchnum.
           Return its position if found. Otherwise return -1 */
        int middle;
        while (left <= right) {
            middle = (left + right) / 2;
            switch (COMPARE(list[middle], searchnum)) {
                case -1: left = middle + 1;     /* list[middle] < searchnum */
                         break;
                case 0:  return middle;
                case 1:  right = middle - 1;    /* list[middle] > searchnum */
            }
        }
        return -1;
    }
1.2.2 Recursive algorithms

Kinds of recursion
    direct recursion: a function calls itself
    indirect recursion: a function calls another function that in turn invokes the calling function
Advantage of recursion
    powerful: expresses an otherwise complex process very simply
Recursive examples
    n! = n * (n-1)!                  :  f(n) = n * f(n-1)
    nCm = (n-1)C(m-1) + (n-1)Cm      :  f(n, m) = f(n-1, m-1) + f(n-1, m)

Implementation ex

    int binsearch(int list[], int searchnum, int left, int right)
    {
        /* search list[0] <= list[1] <= ... <= list[n-1] for searchnum.
           Return its position if found. Otherwise return -1 */
        int middle;
        if (left <= right) {
            middle = (left + right) / 2;
            switch (COMPARE(list[middle], searchnum)) {
                case -1: return binsearch(list, searchnum, middle + 1, right);
                case 0:  return middle;
                case 1:  return binsearch(list, searchnum, left, middle - 1);
            }
        }
        return -1;
    }
Ex 1.4> [Permutations]: given a set of n >= 1 elements, print all possible permutations of the set; there are n! of them.

Call tree for perform(list, 0, 2) with list = a b c:

    perform(list, 0, 2)
        i = 0, j = 0: perform(list, 1, 2)
            i = 1, j = 1: perform(list, 2, 2): a b c
            i = 1, j = 2: perform(list, 2, 2): a c b
        i = 0, j = 1: perform(list, 1, 2)
            perform(list, 2, 2): b a c
            perform(list, 2, 2): b c a
        i = 0, j = 2: perform(list, 1, 2)
            perform(list, 2, 2): c a b
            perform(list, 2, 2): c b a
1.3 Data Abstraction

Basic data types: char, int, float (predefined data types)
Grouping mechanisms: user-defined data types
    array: elements of the same (homogeneous) data type
    struct: elements of possibly different (heterogeneous) data types

Ex1>
    struct student {
        char last_name;
        int student_id;
        char grade;
    };

Ex2> pointer type
    int i, *pi;
Def: a data type is a collection of objects and a set of operations that act on those objects.

Ex3> int
    objects: {0, +1, -1, +2, -2, ..., INT_MAX, INT_MIN}
    operations: {+, -, *, /, %}

Def: an abstract data type (ADT) is a data type in which the specification of the objects and of the operations on the objects is separated from the representation of the objects and the implementation of the operations.

Distinction between specification and implementation: language support
    Ada: package
    C++: class
    C: no explicit ADT construct, but ADTs can still be written using the same notation
Specification: the names of every function, the types of its arguments, and the type of its result
An ADT is independent of its implementation and is described by a few categories of functions:
(1) Creator/constructor: creates a new instance of the designated type
(2) Transformers: create a new instance by using one or more other instances
(3) Observers/reporters: provide information about an instance but do not change it

Ex 1.5> [ADT Natural_Number]

    structure Natural_Number is
      objects: an ordered subrange of the integers starting at zero and
               ending at the maximum integer (INT_MAX) on the computer
      functions:
        for all x, y in Nat_Number; TRUE, FALSE in Boolean
        and where +, -, <, and == are the usual integer operations
Ex 1.5> [ADT Natural_Number] (cont.)

        Nat_No  Zero()          ::= 0
        Boolean Is_Zero(x)      ::= if (x) return FALSE
                                    else return TRUE
        Nat_No  Add(x, y)       ::= if ((x + y) <= INT_MAX) return x + y
                                    else return INT_MAX
        Boolean Equal(x, y)     ::= if (x == y) return TRUE
                                    else return FALSE
        Nat_No  Successor(x)    ::= if (x == INT_MAX) return x
                                    else return x + 1
        Nat_No  Subtract(x, y)  ::= if (x < y) return 0
                                    else return x - y
    end Natural_Number
1.4 Performance Analysis

Criteria for judging a program
    Does the program meet the original specifications?
    Does it work correctly?
    Does it contain documentation that shows how to use it and how it works?
    Does it use functions effectively to create logical units?
    Is the program's code readable?
Performance analysis (complexity theory)
    Does the program use primary and secondary storage efficiently?
    Is its running time acceptable for the task?
Measures of performance: space complexity and time complexity
Def: space complexity: the amount of memory a program needs to run to completion
Def: time complexity: the amount of computer time a program needs to run to completion
1.4.1 Space Complexity

The space complexity of a program is the sum of its fixed and variable space requirements.
(1) Fixed space requirements: do not depend on the number and size of the program's inputs and outputs
    instruction space, simple variables, fixed-size structured variables, constants
(2) Variable space requirements: structured variables whose size depends on the particular instance I of the problem being solved
    S_P(I): space requirement of a program P working on an instance I
    I: characterized by the number, size, and values of the inputs
    Ex> If the input is an array of n elements, n is the instance characteristic, and the requirement is written S_P(n).
Total space requirement: S(P) = c + S_P(I), where c is the fixed space requirement
Ex 1.6> S_abc(I) = 0: abc uses only simple variables, so its variable space requirement S_P(I) is zero

    float abc(float a, float b, float c)
    {
        return a + b + b * c + (a + b - c) / (a + b) + 4.00;
    }

Ex 1.7> Iterative function with an array input

    float sum(float list[], int n)
    {
        float tempsum = 0;
        int i;
        for (i = 0; i < n; i++)
            tempsum += list[i];
        return tempsum;
    }

With call by value, the array itself would be copied, so S_sum(I) = S_sum(n) depends on n.
C passes only the starting address of the array, so in C S_sum(n) = 0.
Ex 1.8> Recursive function

    float rsum(float list[], int n)
    {
        if (n) return rsum(list, n - 1) + list[n - 1];
        return 0;
    }

Number of bytes required for one recursive call:

    Type                                Name      Number of bytes
    parameter: float                    list[]    2
    parameter: integer                  n         2
    return address (used internally)              2 (unless a far address)
    TOTAL per recursive call                      6

If the array has n = MAX_SIZE elements, S_rsum(MAX_SIZE) = 6 * MAX_SIZE.
The recursive version has far greater overhead than its iterative version.
1.4.2 Time Complexity

T(P): the sum of the compile time and the run (execution) time of program P
In general only the run time is considered, since the compile time is fixed.
Ex> T_P(n) = c1 * ADD(n) + c2 * SUB(n) + c3 * LDA(n) + c4 * STA(n)
Def: program step: a syntactically or semantically meaningful program segment whose execution time is independent of the instance characteristics

Ex 1.9> [Iterative summing of a list of numbers] with count statements inserted

    float sum(float list[], int n)
    {
        float tempsum = 0;
        int i;
        count++;                /* for assignment */
        for (i = 0; i < n; i++) {
            count++;            /* for the for loop */
            tempsum += list[i];
            count++;            /* for assignment */
        }
        count++;                /* last execution of for */
        count++;                /* for return */
        return tempsum;
    }

Simplified version (step count 2n + 3)

    float sum(float list[], int n)
    {
        float tempsum = 0;
        int i;
        for (i = 0; i < n; i++)
            count += 2;
        count += 3;
        return 0;
    }
Ex 1.10> [Recursive summing of a list of numbers]

    float rsum(float list[], int n)
    {
        count++;                /* for if conditional */
        if (n) {
            count++;            /* for return and rsum invocation */
            return rsum(list, n - 1) + list[n - 1];
        }
        count++;                /* for return */
        return 0;
    }

When n = 0, the total step count is 2.
When n > 0, the total step count is 2n + 2.
The step count does not say how much time each step takes: although the count is smaller, the recursive version takes more time than the iterative one.
Ex 1.11> [Matrix addition] c = a + b, where the matrices are rows * cols (simplified count version)

    void add(int a[][MAX_SIZE], int b[][MAX_SIZE],
             int c[][MAX_SIZE], int rows, int cols)
    {
        int i, j;
        for (i = 0; i < rows; i++) {
            for (j = 0; j < cols; j++)
                count += 2;
            count += 2;
        }
        count++;
    }

Starting from count = 0, the final value is count = 2 * rows * cols + 2 * rows + 1.
When rows > cols, it is advantageous to interchange the two for statements so that the outer loop runs over the columns.
Tabular method (s/e: steps per execution)
total steps = s/e * frequency (number of times the statement is executed)

Ex 1.12> [Iterative function to sum a list of numbers]
The for statement is evaluated for i = 0 through n, so its frequency is n + 1.
The tempsum statement has frequency n.

    Statement                            s/e    Frequency    Total steps
    float sum(float list[], int n)       0      0            0
    {                                    0      0            0
        float tempsum = 0;               1      1            1
        int i;                           0      0            0
        for (i = 0; i < n; i++)          1      n+1          n+1
            tempsum += list[i];          1      n            n
        return tempsum;                  1      1            1
    }                                    0      0            0
    Total                                                    2n+3
Tabular method (s/e: steps per execution) (cont.)

Ex 1.13> [Recursive function to sum a list of numbers]

    Statement                                      s/e    Frequency    Total steps
    float rsum(float list[], int n)                0      0            0
    {                                              0      0            0
        if (n)                                     1      n+1          n+1
            return rsum(list, n-1) + list[n-1];    1      n            n
        return 0;                                  1      1            1
    }                                              0      0            0
    Total                                                              2n+2
Tabular method (s/e: steps per execution) (cont.)

Ex 1.14> [Matrix addition]
The table shows how the computing time grows as the input size grows.

    Statement                                  s/e    Frequency        Total steps
    void add(int a[][MAX_SIZE] ...)            0      0                0
    {                                          0      0                0
        int i, j;                              0      0                0
        for (i = 0; i < rows; i++)             1      rows+1           rows+1
            for (j = 0; j < cols; j++)         1      rows*(cols+1)    rows*cols + rows
                c[i][j] = a[i][j] + b[i][j];   1      rows*cols        rows*cols
    }                                          0      0                0
    Total                                                              2*rows*cols + 2*rows + 1
Tabular method (s/e: steps per execution) (cont.)

Basic operation: binsearch [Program 1.6]
Even when the number of elements n is chosen as the only parameter, the number of steps executed still varies with searchnum.
    Best case step count: the minimum number of steps
    Worst case step count: the maximum number of steps
    Average case step count: the average number of steps
Asymptotic notation

Break-even point: for constants $c_1$, $c_2$, $c_3$, the value of $n$ from which on $c_3 n \le c_1 n^2 + c_2 n$.

Def [Big "oh"]: at most (upper bound)
    $f(n) = O(g(n))$ iff there exist positive constants $c$ and $n_0$ such that $f(n) \le c\,g(n)$ for all $n \ge n_0$.
Ex 1.15> $3n + 2 \le 4n$ for all $n \ge 2$, so $3n + 2 = O(n)$.

$O(1)$: constant time
$O(1) < O(\log n) < O(n) < O(n \log n) < O(n^2) < O(n^3) < O(2^n) < O(n!)$

$f(n) = O(g(n))$ is not the same as $O(g(n)) = f(n)$: the symbol "=" is read as "is", not as "equals".
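Theorem 1.2: if $f(n) = a_m n^m + \cdots + a_1 n + a_0$, then $f(n) = O(n^m)$.
pf> $f(n) \le \sum_{i=0}^{m} |a_i|\, n^i \le n^m \sum_{i=0}^{m} |a_i|\, n^{i-m} \le n^m \sum_{i=0}^{m} |a_i|$ for $n \ge 1$, so $f(n) = O(n^m)$.

Def [Omega]: at least (lower bound)
    $f(n) = \Omega(g(n))$ iff there exist positive constants $c$ and $n_0$ such that $f(n) \ge c\,g(n)$ for all $n \ge n_0$.
Ex 1.16> $3n + 2 = \Omega(n)$, as $3n + 2 \ge 3n$ for $n \ge 1$.

Theorem 1.3: if $f(n) = a_m n^m + \cdots + a_1 n + a_0$ and $a_m > 0$, then $f(n) = \Omega(n^m)$.

Def [Theta]: both upper and lower bound
    $f(n) = \Theta(g(n))$ iff there exist positive constants $c_1$, $c_2$, and $n_0$ such that $c_1 g(n) \le f(n) \le c_2 g(n)$ for all $n \ge n_0$.

$T_{sum}(n) = 2n + 3$, so $T_{sum}(n) = \Theta(n)$
$T_{add}(rows, cols) = 2\,rows \cdot cols + 2\,rows + 1 = \Theta(rows \cdot cols)$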
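As a worked instance of the theta definition, the claim that the step count 2n + 3 of the iterative sum is Θ(n) can be checked with explicit constants. The values c_1 = 2, c_2 = 5, and n_0 = 1 are just one workable choice, picked here for illustration.

```latex
\begin{align*}
  2n \;\le\; 2n + 3                 &\quad \text{for all } n \ge 0, \\
  2n + 3 \;\le\; 5n
     \iff 3 \le 3n \iff n \ge 1,    & \\
  \text{hence}\quad 2\,n \;\le\; T_{sum}(n) \;\le\; 5\,n
                                    &\quad \text{for all } n \ge n_0 = 1 .
\end{align*}
```

The left inequality alone gives the Ω(n) bound and the right one alone gives the O(n) bound; having both with the same g(n) = n is what makes the bound Θ(n).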
Ex 1.18> [Complexity of matrix addition]
Enter the asymptotic complexity of each statement (0 for non-executable statements), then take the maximum of the line complexities as the time complexity.

    Statement                                  Asymptotic complexity
    void add(int a[][MAX_SIZE] ...)            0
    {                                          0
        int i, j;                              0
        for (i = 0; i < rows; i++)             Θ(rows)
            for (j = 0; j < cols; j++)         Θ(rows.cols)
                c[i][j] = a[i][j] + b[i][j];   Θ(rows.cols)
    }                                          0
    Total                                      Θ(rows.cols)

Figure 1.5 Time complexity of matrix addition

Ex 1.19> binary search
Ex 1.20> magic square
1.4.4 Practical Complexities

If P is Θ(n) and Q is Θ(n^2), is P always better than Q?

                               Instance characteristic n
    Time       Name            1    2    4     8       16                32
    1          Constant        1    1    1     1       1                 1
    log n      Logarithmic     0    1    2     3       4                 5
    n          Linear          1    2    4     8       16                32
    n log n    Log linear      0    2    8     24      64                160
    n^2        Quadratic       1    4    16    64      256               1024
    n^3        Cubic           1    8    64    512     4096              32768
    2^n        Exponential     2    4    16    256     65536             4294967296
    n!         Factorial       1    2    24    40320   20922789888000    26313 * 10^31

Figure 1.7 Function values
1.5 Performance Measurement

#include <time.h>: used to measure the CPU time consumed

Time for f(n) instructions on a 10^9 instructions/sec computer:

    n            f(n)=n      f(n)=n log2 n   f(n)=n^2    f(n)=n^3    f(n)=n^4       f(n)=n^10        f(n)=2^n
    10           .01 μs      .03 μs          .1 μs       1 μs        10 μs          10 sec           1 μs
    20           .02 μs      .09 μs          .4 μs       8 μs        160 μs         2.84 hr          1 ms
    30           .03 μs      .15 μs          .9 μs       27 μs       810 μs         6.83 d           1 sec
    40           .04 μs      .21 μs          1.6 μs      64 μs       2.56 ms        121.36 d         18.3 min
    50           .05 μs      .28 μs          2.5 μs      125 μs      6.25 ms        3.1 yr           13 d
    100          .10 μs      .66 μs          10 μs       1 ms        100 ms         3171 yr          4*10^13 yr
    1,000        1.00 μs     9.96 μs         1 ms        1 sec       16.67 min      3.17*10^13 yr    32*10^283 yr
    10,000       10.00 μs    130.03 μs       100 ms      16.67 min   115.7 d        3.17*10^23 yr
    100,000      100.00 μs   1.66 ms         10 sec      11.57 d     3171 yr        3.17*10^33 yr
    1,000,000    1.00 ms     19.92 ms        16.67 min   31.71 yr    3.17*10^7 yr   3.17*10^43 yr

μs = microsecond = 10^-6 seconds, ms = millisecond = 10^-3 seconds
sec = seconds, min = minutes, hr = hours, d = days, yr = years

Figure 1.9 Times on a 1 billion instruction per second computer
Ex 1.22> [Worst case performance of the selection function]
Worst case of selection sort: input in reverse order
Data sizes: 10, 20, ..., 1600
The resulting plot closely follows Θ(n^2).

    #include <stdio.h>
    #include <time.h>
    #define MAX_SIZE 1601
    #define ITERATIONS 26
    #define SWAP(x, y, t) ((t) = (x), (x) = (y), (y) = (t))

    void sort(int list[], int n);   /* selection sort, defined elsewhere */

    int main(void)
    {
        int i, j, position;
        int list[MAX_SIZE];
        int sizelist[] = {0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200,
                          300, 400, 500, 600, 700, 800, 900, 1000, 1100,
                          1200, 1300, 1400, 1500, 1600};
        clock_t start, stop;
        double duration;

        printf("     n     time\n");
        for (i = 0; i < ITERATIONS; i++) {
            /* fill list[0..sizelist[i]-1] in reverse (descending) order */
            for (j = 0; j < sizelist[i]; j++)
                list[j] = sizelist[i] - j;
            start = clock();
            sort(list, sizelist[i]);
            stop = clock();
            /* CLOCKS_PER_SEC = number of clock ticks per second
               (called CLK_TCK on older compilers) */
            duration = ((double) (stop - start)) / CLOCKS_PER_SEC;
            printf("%6d    %f\n", sizelist[i], duration);
        }
        return 0;
    }
    n            Time        n       Time
    30 ... 100   .00         900     1.86
    200          .11         1000    2.31
    300          .22         1100    2.80
    400          .38         1200    3.35
    500          .60         1300    3.90
    600          .82         1400    4.54
    700          1.15        1500    5.22
    800          1.48        1600    5.93

Figure 1.11: Worst case performance of selection sort (in seconds)