Character Processing
How can characters be treated as small integers?
How are characters stored and manipulated in a machine?
How are certain standard header files used?
Data Type char
Fundamental data type, stored in one byte: 2^8 = 256 different combinations, covering letters, digits, and special characters.
A character constant is written between single quotes: char c = 's';
In C a char has the integer value corresponding to the binary coding scheme in effect (ASCII here). See page 174.
Letters: 'a' 'b' ... 'z' = 97 98 ... 122
         'A' 'B' ... 'Z' = 65 66 ... 90
Digits:  '0' '1' ... '9' = 48 49 ... 57
Others:  '&' = 38, '+' = 43
A "%c" is used to designate the char format. The content of the byte can be thought of as either a character or a small integer.
printf("%c", 'a') ?
printf("%d", 'a') ?
printf("%c", 98) ?
Nonprinting and Hard-to-print Characters
Escape sequences (see pg. 176):
'\n' - newline
'\t' - tab
'\\' - backslash
'\"' - double quote mark
'\'' - single quote mark
printf("\"ABC\""); prints "ABC"
printf("\'ABC\'"); prints 'ABC'
printf("\n\tThis is a test"); prints a newline, a tab, then the text
I/O of char
getchar() reads a character from the keyboard (standard input).
    int getchar(void);
Returns the ordinal value of the character read, or EOF when there is no more input.
putchar() writes a character to the screen (standard output).
    int putchar(int c);
In the function prototype, c is an integer-valued expression representing the code of the character to be output.
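pg. 178
char c;
while (1) {
    c = getchar();
    putchar(c);
}
How to terminate? ctrl+c interrupts the program. To end the input instead, type ctrl+d (Unix) or ctrl+z (DOS) and test for EOF:
int c;
while ((c = getchar()) != EOF) {
    putchar(c);
}
Here c is declared int rather than char so that it can hold EOF as well as every character value.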
EOF
stdio.h contains: #define EOF (-1)
How to interpret EOF? It is a negative integer, stored in 2's complement form (shown here in 8 bits):
1. Take the positive value (+1)   00000001
2. Reverse the bits               11111110
3. Add 1                          11111111
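Capitalize Lowercase Letters (pg. 183)
#include <stdio.h>
#include <ctype.h>
int main(void)
{
    int c;
    while ((c = getchar()) != EOF) {
        if (islower(c))
            putchar(toupper(c));   /* lowercase letter: write its uppercase form */
        else
            putchar(c);            /* anything else passes through unchanged */
    }
    return 0;
}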
ctype.h Header File
#include <ctype.h>
Contains macros and prototypes of functions that are often used when processing characters.
Macros: the C preprocessor recognizes lines of the source text that begin with ________? (#)
The macros in ctype.h are used to test characters; the file also includes a set of function prototypes (FPs) for functions that convert characters.
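Use of Macros
1) Improve I/O:
#define READ(c) ((c) = getchar())
if (READ(c) == 'x')
expands to:
if (((c) = getchar()) == 'x')
The parentheses in the definition matter: without them the test would expand to c = getchar() == 'x', which compares first and then assigns the result of the comparison to c.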
2) Define some often used operations:
#define SWAP(val1, val2, temp) {temp = val1; \
                                val1 = val2; \
                                val2 = temp;}
int i = 4, j = 8, temp;
SWAP(i, j, temp);
printf("%d %d %d\n", i, j, temp);   /* prints 8 4 4 */
Conditional Compilation
Directives:
#define IBMPC 1
#if IBMPC
#include <IBM.h>
#else
#include <generic.h>
#endif
If the target machine is an IBM PC, use the IBM-specific routines; otherwise use the machine-independent routines.
Do what if not using an IBM PC? Change the 1 to 0 in the #define.
Class Functions
They examine a character and tell whether it belongs to a given class.
General prototype: int is...(int testchar);
Each returns nonzero (true) if testchar is in the class and 0 (false) otherwise.
iscntrl - The ASCII control characters are all the values below space (32), plus the delete character (127). True if testchar is one of these, false otherwise.
isprint - Is printable; the complement of iscntrl. True if > 31 and < 127.
isspace - Checks for whitespace: blank (32), tab (9), line feed (10), vertical tab (11), form feed (12), carriage return (13).
isgraph - All ASCII characters > 32 (space) and < 127 (delete) are considered graphic characters.
isalnum - The alphabetic characters and the numeric digits make up the alphanumeric set.
ispunct - The graphic complement of the alphanumeric characters. If testchar is > 32 (space) and < 127 (delete) but NOT alphanumeric, it returns true.
isalpha - The uppercase and lowercase alphabetic characters.
islower - Lowercase letters.
isupper - Uppercase letters.
isdigit - The decimal digits.
isxdigit - Tests for hexadecimal digits (0..9, a..f, A..F).