Sorting Tutorial Using C On Linux.

Requirements
- Sort data read from a file (e.g. students.csv).
- The file contains student names and percentage marks separated by a comma.
- Sort the data by the Name field.
- Store the output in another file. The output file name is the input file name suffixed with ".sorted" (e.g. students.csv.sorted).
- There is no limit on the number of records in the file. The code must handle very large data, e.g. 1 million records.
- A -D option enables the debug log.
- High-level commentary shows the progress and the total time taken by the program.

Goals
- How to design modules
- Medium to complex C programming concepts
- Code formatting, indentation, comments
- File handling C functions: fopen(), fclose(), fgets(), fwrite(), etc.
- Argument checking using getopt()
- Sorting using qsort()
- Memory allocation for dynamically sized arrays using malloc() and realloc()
- Variable argument handling in a function (debug_log)
- How to design test cases and test data

Prerequisites
- Basic and some advanced concepts of C
- Linux tutorial
- Man page study
- Basic knowledge of test case design

Design
- Man page (manual / document) for the program
- Input / Output
- Methods
- Data Structures
- Unix C APIs

Guidelines
- Keep global variables to a minimum.
- Divide the code into simple, self-contained modules.
- Add error checking everywhere.
- Give meaningful names to all variables, functions, macros and input arguments, not names like tmp or tmp1.
- Call the debug log at appropriate places.
- Write a file header, function headers, block comments and line comments.
- Comment each variable.
- Give meaningful, grammatically correct error and debug messages.

Man Page of the Tool
Syntax: c_sort_tutorial -f <file name> -D <option>
e.g. c_sort_tutorial -f students.csv -D 1 (or -D 2)
- The -f argument must be present, along with the file name.
- The -D argument is optional. If present, the debug log must be enabled.
  - If given with 1, only the high-level log is written.
  - If given with 2, the detailed log is written.

Input / Output
Input:
- The file name is given by the user in the arguments (e.g. students.csv).
- The file contains the name and percentage marks of each student, separated by a comma.
Output:
- A file named after the input file with the suffix .sorted (e.g. students.csv.sorted), containing the records sorted by name.

Sample Input / Output
Input file: students.csv
  Nick Massa,99.99
  Mohit Garg,100.00
  Anil Kumar,95.55
Output file: students.csv.sorted

Methods

main(int argc, char *argv[])
Purpose: Main function of the program.
Input:
- int argc: number of arguments.
- char *argv[]: array of arguments.
Tasks:
- Take the start time stamp using get_cur_date_time().
- Call check_arguments().
- Call sort_data().
- Take the end time stamp using get_cur_date_time().
Output: Exits with 0 on success.

void check_arguments(int argc, char *argv[])
Purpose: Parse, extract and validate the input arguments.
Input: Same as the main function.
- int argc: number of arguments.
- char **argv: array of arguments.
Task:
- Check the -f argument. It must be present exactly once and be followed by a file name (e.g. -f students.csv).
- Save the file name in a global variable.
- Check that the -D argument, if present, comes with a valid option.
- Check that no other argument is given.
- If any error occurs, call usage() to show the error and exit.

static void sort_data()
Purpose: Sort the data by the Name field.
Input: None
Task:
- Open the input file using fopen() in read mode.
- If an error occurs, call usage() with the message "Error occurred while opening file", print the strerror() text for the corresponding error number, and exit with -1.

sort_data() continued
- In a loop, read the file line by line using fgets().
- Check for any error from fgets().
- Tokenize each line using get_tokens().
- If the number of fields is not 2, give a warning and skip that line.
- Add the record to the array using add_record().
- Close the file using fclose(). If an error occurs, print an error message that includes the strerror() text.
- Sort the data using the Unix qsort() call.
- Save the sorted data using save_sorted_data().
Output: None

static int get_tokens(char *line, char *fields[], char *token)
Purpose: Split each record of the file into tokens so that the caller can check whether the record is valid.
Input:
- Pointer to the line string.
- Array of fields.
- The token (separator) string.
Task: Split the line at the separator and store the resulting tokens in the fields array.
Output: Returns the number of tokens parsed.
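A minimal sketch of such a tokenizer follows. It is an assumption-laden illustration rather than the tutorial's get_tokens(): the name split_fields, the max_fields parameter and the single-character separator are all hypothetical choices. It splits in place with strchr() instead of strtok(), so that empty fields are still counted.

```c
#include <string.h>

/*
 * Split `line` in place at each `delim` and store pointers to the pieces in
 * `fields`. At most `max_fields` pointers are stored. Returns the number of
 * fields found, or max_fields + 1 if the line holds more fields than fit.
 */
int split_fields(char *line, char *fields[], int max_fields, char delim)
{
    int count = 0;
    char *start = line;
    char *sep;

    /* fgets() keeps the trailing newline; cut it off first */
    line[strcspn(line, "\r\n")] = '\0';

    while (count < max_fields) {
        fields[count++] = start;
        sep = strchr(start, delim);
        if (sep == NULL)
            return count;      /* no more separators: this was the last field */
        *sep = '\0';           /* terminate the current field */
        start = sep + 1;       /* the next field begins after the separator */
    }
    return max_fields + 1;     /* a separator was found after the last slot */
}
```

With max_fields set to 2, a return value other than 2 marks the record as invalid, which matches the "warn and skip" rule in sort_data().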

void add_record(char *student_name, double student_percentage)
Purpose: Add a record to the dynamically allocated array.
Input:
- Name: char array holding the name, e.g. Ram.
- Marks: double holding the percentage marks, e.g. 78.90.
Task:
- Call create_table_entry() to check the array space.
- Add the entry to the array.
Output: None

int create_table_entry(int *row_num, int *total, int *max, char **ptr, int size, char *name)
Purpose: Create array entries dynamically.
Input:
- Integers: row number, total entries, maximum entries, size.
- Char arrays: ptr, name.
Task:
- Check for space in the array.
- Reallocate the array if required.
- Check the debug_flag argument. If it is greater than 1, the detailed log must be enabled.
Output: Returns 0 on success.
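The "check for space, reallocate if required" step can be sketched as a growable array. The struct and function names below (student, student_table, table_add, table_free) are hypothetical, as are the starting capacity of 16 and the doubling policy; the tutorial's own create_table_entry() uses a different parameter list.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record and table types, for illustration only. */
struct student {
    char *name;          /* heap copy of the student's name */
    double percentage;   /* percentage marks */
};

struct student_table {
    struct student *rows;  /* dynamically sized array */
    size_t count;          /* entries in use */
    size_t capacity;       /* entries allocated */
};

/* Append one record, growing the array when it is full. Returns 0 or -1. */
int table_add(struct student_table *t, const char *name, double percentage)
{
    if (t->count == t->capacity) {
        /* Doubling keeps the number of realloc() calls logarithmic. */
        size_t new_cap = t->capacity ? t->capacity * 2 : 16;
        struct student *grown = realloc(t->rows, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;     /* the old block is still valid and unchanged */
        t->rows = grown;
        t->capacity = new_cap;
    }

    size_t len = strlen(name) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, len);

    t->rows[t->count].name = copy;
    t->rows[t->count].percentage = percentage;
    t->count++;
    return 0;
}

/* Release every name and the array itself. */
void table_free(struct student_table *t)
{
    for (size_t i = 0; i < t->count; i++)
        free(t->rows[i].name);
    free(t->rows);
    t->rows = NULL;
    t->count = 0;
    t->capacity = 0;
}
```

Assigning the result of realloc() to a temporary first avoids losing the original pointer when the allocation fails.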

static void save_sorted_data()
Purpose: Store the sorted entries in the output file.
Input: None
Task:
- Open the output file (e.g. students.csv.sorted) using fopen(). If an error occurs, call usage().
- Store all entries from the array in the file one by one.
- Close the file using fclose(). If an error occurs, print an error message.
Output: None
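As a rough sketch of this step, the code below builds the ".sorted" file name and writes one record per line. The names make_output_name and save_students, the struct student layout and the "%.2f" output format are assumptions made for the example; the tutorial's function takes no arguments and works on its global array.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, for illustration only. */
struct student {
    char *name;
    double percentage;
};

/* Returns a malloc'd "<input_name>.sorted" string, or NULL on failure. */
char *make_output_name(const char *input_name)
{
    const char *suffix = ".sorted";
    size_t len = strlen(input_name) + strlen(suffix) + 1;
    char *out = malloc(len);

    if (out != NULL)
        snprintf(out, len, "%s%s", input_name, suffix);
    return out;
}

/* Write `count` records to "<input_name>.sorted". Returns 0 or -1. */
int save_students(const char *input_name, const struct student *rows, size_t count)
{
    int status = 0;
    char *out_name = make_output_name(input_name);
    FILE *fp;

    if (out_name == NULL)
        return -1;
    fp = fopen(out_name, "w");
    free(out_name);
    if (fp == NULL)
        return -1;

    for (size_t i = 0; i < count; i++) {
        if (fprintf(fp, "%s,%.2f\n", rows[i].name, rows[i].percentage) < 0) {
            status = -1;       /* stop at the first write error */
            break;
        }
    }
    if (fclose(fp) == EOF)     /* buffered data may fail to flush here */
        status = -1;
    return status;
}
```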

void usage(char *err_msg)
Purpose: Print the error and show the usage of the command.
Input: The error message.
Output: Prints the error message and the usage.

void debug_log(char *file_name, int line, char *fname, char *format, ...)
Purpose: Create the debug log file and print log messages to it.
Input: string file_name, int line, string fname, string format, followed by the variable arguments.
Task:
- Call open_log() to open or create the debug.log file.
- Print the log message to the file.
Output: None
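Variable argument handling of this kind is usually built on va_list and the v-variants of the printf family. The sketch below formats one log line into a caller-supplied buffer; the function name format_log_line and the "[file:line function] message" layout are assumptions, and the real debug_log() would write to debug.log instead.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Format "[file:line function] message" into buf.
 * Returns the number of characters written, or -1 if buf is too small.
 */
int format_log_line(char *buf, size_t size, const char *file_name, int line,
                    const char *func_name, const char *format, ...)
{
    va_list args;
    int prefix_len;
    int msg_len;

    /* Fixed prefix identifying where the message came from */
    prefix_len = snprintf(buf, size, "[%s:%d %s] ", file_name, line, func_name);
    if (prefix_len < 0 || (size_t)prefix_len >= size)
        return -1;

    /* Hand the caller's variable arguments to vsnprintf() */
    va_start(args, format);
    msg_len = vsnprintf(buf + prefix_len, size - (size_t)prefix_len, format, args);
    va_end(args);
    if (msg_len < 0 || (size_t)msg_len >= size - (size_t)prefix_len)
        return -1;

    return prefix_len + msg_len;
}
```

Callers would typically pass `__FILE__`, `__LINE__` and `__func__` for the first three location arguments, often through a small wrapper macro.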

static void open_log(char *name, int *fd, unsigned long max_size, char *header)
Purpose: Rotate the log: keep the previous log as debug.log.prev and open a fresh debug.log.
Input: string name, int *fd, long max_size, string header
Task:
- Close debug.log if it is open and rename it to debug.log.prev.
- Create a new debug.log file to print the log to.
Output: None

static char *get_cur_date_time()
Purpose: Get the date and time at the moment of the function call.
Input: None
Task: Fetch the current system date and time.
Output: Returns the date and time in string format.
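One way to produce such a time stamp is with time(), localtime() and strftime(). This is a sketch under assumptions: the name format_time, the caller-supplied buffer and the "YYYY-MM-DD HH:MM:SS" layout are not from the tutorial, whose function returns a char pointer instead.

```c
#include <time.h>

/*
 * Write `when` as "YYYY-MM-DD HH:MM:SS" (local time) into buf.
 * Returns the number of characters written, or 0 if it did not fit.
 */
size_t format_time(time_t when, char *buf, size_t size)
{
    struct tm *parts = localtime(&when);

    if (parts == NULL)
        return 0;
    return strftime(buf, size, "%Y-%m-%d %H:%M:%S", parts);
}
```

For the total run time, main() could keep the two time_t values from time(NULL) at start and end and pass them to difftime(), which returns the elapsed seconds as a double.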

int comparator(const void *rec1, const void *rec2)
Purpose: Compare two entries and report which one is greater.
Input: Pointers to the two elements to compare.
Task: Compare the two values and return the result.
Output: An integer:
- -1 if rec1 < rec2
- 0 if rec1 = rec2
- +1 if rec1 > rec2
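A comparator with this contract, used with qsort(), might look like the sketch below. The struct student layout and the names compare_by_name and sort_students are assumptions made for the example; ordering is by plain strcmp(), so it is case-sensitive and byte-wise.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, for illustration only. */
struct student {
    char *name;
    double percentage;
};

/* qsort() comparator: orders records by name, returning -1, 0 or +1. */
int compare_by_name(const void *rec1, const void *rec2)
{
    const struct student *a = rec1;
    const struct student *b = rec2;
    int r = strcmp(a->name, b->name);

    /* strcmp() only promises the sign; normalise it to -1 / 0 / +1 */
    return (r > 0) - (r < 0);
}

/* Sort `count` records in place by name. */
void sort_students(struct student *rows, size_t count)
{
    qsort(rows, count, sizeof rows[0], compare_by_name);
}
```

qsort() itself only needs a negative, zero or positive result, so the normalisation is there purely to match the -1 / 0 / +1 contract stated above.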

Data Structure
Dynamic Array
- A dynamic array is used to store and sort the data from the input file.
- It is an array data structure that can be resized at run time.
- Any number of elements can be added and removed during execution.

C APIs
int getopt(int argc, char * const argv[], const char *optstring)
- Parses the command line arguments.
- Its arguments argc and argv are the argument count and array as passed to the main() function on program invocation.
- When getopt() is called repeatedly, it returns successively each of the option characters from each of the option elements.
- If there are no more option characters, getopt() returns -1.

FILE *fopen(const char *path, const char *mode)
- Opens the file whose name is the string pointed to by path and associates a stream with it.
- Depending on the mode string, it opens an existing file for reading, creates or overwrites a file, or appends to it.
- On success it returns a FILE pointer for the opened stream. If an error occurs, it returns a NULL pointer.
int fclose(FILE *stream)
- Closes the file associated with the stream and disassociates it.
- On successful completion it returns 0; otherwise EOF is returned and the global variable errno is set to indicate the error.

char *strerror(int errnum)
- Returns a pointer to the string describing the error code passed in the argument errnum, or an "unknown error" message if the error code is unknown.
- The error strings produced by strerror() depend on the platform and compiler.
char *fgets(char *s, int size, FILE *stream)
- Reads at most size - 1 characters from stream and stores them in the buffer pointed to by s, followed by a terminating '\0'.
- Reading stops after an EOF or a newline. If a newline is read, it is stored in the buffer.
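Taken together, fopen(), fgets(), fclose() and strerror() support a read loop along the following lines. This is a sketch: the function name count_records and the 1024-byte line limit are assumptions, and a line longer than the buffer would be seen as several pieces.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE_LEN 1024   /* assumed upper bound on one record */

/* Count the non-empty lines in `path`. Returns the count, or -1 on error. */
long count_records(const char *path)
{
    char line[MAX_LINE_LEN];
    long count = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL) {
        /* errno was set by fopen(); strerror() turns it into text */
        fprintf(stderr, "Error opening %s: %s\n", path, strerror(errno));
        return -1;
    }

    while (fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';   /* drop the stored newline */
        if (line[0] != '\0')
            count++;
    }

    /* fgets() returns NULL on both EOF and error; ferror() tells them apart */
    if (ferror(fp)) {
        fprintf(stderr, "Error reading %s: %s\n", path, strerror(errno));
        fclose(fp);
        return -1;
    }
    if (fclose(fp) == EOF) {
        fprintf(stderr, "Error closing %s: %s\n", path, strerror(errno));
        return -1;
    }
    return count;
}
```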

int fputs(const char *s, FILE *stream)
- Writes the string s to stream, without its trailing '\0'.
- Returns a non-negative number on success, or EOF on error.
void qsort(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *))
- Sorts an array of nmemb elements, each of size size. The base argument points to the start of the array.
- The compar argument points to the comparison function, which is called with pointers to two elements.
- This function returns no value.

int fprintf(FILE *stream, const char *format, ...)
- Writes formatted output to the given output stream.
- Returns the number of characters written, or a negative number if an error occurs.
int fscanf(FILE *stream, const char *format, ...)
- Reads input from the stream pointed to by stream.
- Returns the number of input items assigned.

Testing

Validation
- A set of activities that ensures the program is traceable to the client's requirements.
- As a quality assurance process, it provides a high degree of assurance about the program.
- There are two basic areas where validation can be done:
  - File validation
  - Data validation

File and Arguments Test Cases
Case 1: File not present.
  Explanation: Should produce an error that the file is not present at the location.
Case 2: -f argument is not present.
  Explanation: Should produce an error that the -f argument must be present and print the usage again.
Case 3: File name not given with the -f argument.
  Explanation: Should produce an error that a file name must be given with the -f argument.
Case 4: File does not have read permission.
  Explanation: Should produce an error that the file has no read permission and ask for the input again.

Case 5: Extra argument given (e.g. -t).
  Explanation: Should produce an error that only the -f and -D parameters are allowed.
Case 6: -D given with an invalid option (e.g. -D 3).
  Explanation: Should produce an error that -D is allowed only with 1 or 2.

Sorting Test Cases (Valid Records)
Case 1: File is empty.
  Explanation: Should produce a warning that the file is empty.
Case 2: File has one record.
  Explanation: Output file should also have one record.
Case 3: File has two records that are already sorted.
  Explanation: Output file must contain only those two records.
Case 4: File has ten records.
  Explanation: Output must also have ten records, sorted.

Sorting Test Cases (Some Invalid Records)
Case 1: File has one invalid record.
  Explanation: Output file must be empty.
Case 2: File has one valid and one invalid record.
  Explanation: Output file should contain the one valid record.
Case 3: File has more than one invalid record and more than one valid record.
  Explanation: Output file must have all valid records in sorted form.

Performance Test Cases
Case 1: Check performance if the file contains 1000 records.
  Expected result: Program must work properly for 1000 entries and should not produce any error.
Case 2: Check performance if the file contains 10000 records.
  Expected result: Program must work properly for 10000 records and should not fail.
Case 3: Check performance if the file contains 100000 records or more.
  Expected result: Program must work properly and consistently for any number of file entries with respect to time and space.
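Input files of these sizes are impractical to write by hand, so a small generator helps. The sketch below is not part of the original tutorial: the function name write_test_file, the "Student NNNNNNN" name pattern and the fixed-seed pseudo-random sequence are all assumptions, chosen so that repeated runs produce the same unsorted data.

```c
#include <stdio.h>

/*
 * Write `record_count` lines of "name,percentage" test data to `path`.
 * Returns 0 on success, -1 on any file error.
 */
int write_test_file(const char *path, unsigned long record_count)
{
    unsigned long seed = 12345UL;   /* fixed seed: same file on every run */
    unsigned long i;
    int status = 0;
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;

    for (i = 0; i < record_count; i++) {
        /* simple linear congruential step, reduced to 31 bits */
        seed = (seed * 1103515245UL + 12345UL) % 2147483648UL;
        if (fprintf(fp, "Student %07lu,%.2f\n",
                    seed % 10000000UL,                    /* unsorted name suffix */
                    (double)(seed % 10001UL) / 100.0)     /* 0.00 to 100.00 */
            < 0) {
            status = -1;
            break;
        }
    }
    if (fclose(fp) == EOF)
        status = -1;
    return status;
}
```

Calling it with 1000, 10000 and 100000 (or 1000000) records gives inputs for the three cases above; the generated names may contain duplicates.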