
1 Numbers in Codes GCNU 1025 Numbers Save the Day

2 Coding Converting information into another form of representation (codes) based on a specific rule Encoding: information to symbols Decoding: symbols back to information

3 Binary Codes Two symbols are used to represent data Example: Morse code, ASCII code

4 https://www.youtube.com/watch?v=u493fX2hYgU

5 Morse Codes On-off tones, lights, clicks, dots and dashes, etc.

6 Binary codes Language of computers: 0 and 1 (binary system) Codeword: a string of 0’s and 1’s representing a character ASCII code: American Standard Code for Information Interchange 128 characters, of which 33 are control characters Enables the use of same codewords in different machines
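As a small illustration of codewords, the sketch below turns text into 7-bit ASCII codewords and back. The function names are made up for this example; 7 digits are enough because ASCII has 128 = 2^7 characters.

```python
def to_codewords(text):
    """Return one 7-digit codeword (a string of 0s and 1s) per ASCII character."""
    return [format(ord(ch), "07b") for ch in text]


def from_codewords(codewords):
    """Turn a list of 7-digit codewords back into text."""
    return "".join(chr(int(cw, 2)) for cw in codewords)


print(to_codewords("Hi"))                          # ['1001000', '1101001']
print(from_codewords(["1001000", "1101001"]))      # Hi
```

Because every machine agrees on the same table, the same codewords decode to the same characters everywhere.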

7 Binary Codes: Play http://www.binaryhexconverter.com/binary-to-ascii-text-converter http://www.binaryhexconverter.com/ascii-text-to-binary-converter

8 Coding of Chinese characters (optional) Example: Chinese telegraph code (non-binary) 4-digit: 0000-9999 Decoding (relatively easy for documentation): 3413 → 法 Encoding (more difficult for documentation): 法 → 3413 Four-corner method: method for documentation for encoding

9 Coding of Chinese characters (optional) Example: Chinese telegraph code (non-binary) 4-digit: 0000-9999 Four-corner method: method for documentation for encoding

11 Announcement In-class Assignment #1 on Sep 19 (Friday) 10% of final score Coverage: up to Section 2.2 Books, notes, other materials and discussions all allowed Help from instructor and teaching assistant Assignments submitted after class subject to penalty

12 Numbers in Codes GCNU 1025 Numbers Save the Day

13 Error-detection for binary codes Rule: every valid codeword has a special property Parity check: a validity check concerning the parity (i.e. being odd or even) of the number of 1’s in a codeword Example: 1001 sent as 1011 Original number of 1’s: 2 (even) Number of 1’s in 1011: 3 (odd) 1 error leads to a change of parity of the number of 1’s Error detected if all valid codewords consist of an even number of 1’s

14 Simple parity check Rule: the last digit of a codeword is a check digit appended to the original message (to be sent) so that the total number of 1’s in the codeword is even! Example: sending a message 1000001 Check digit to be appended: 0 Codeword for the message: 10000010 (total number of 1’s: 2) Example: sending a message 1001001 Check digit to be appended: 1 Codeword for the message: 10010011 (total number of 1’s: 4)
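The simple parity check can be sketched in a few lines. The function names below are illustrative only; messages and codewords are handled as strings of 0s and 1s.

```python
def add_parity(message):
    """Append a check digit so that the total number of 1s is even."""
    check_digit = str(message.count("1") % 2)
    return message + check_digit


def is_valid(codeword):
    """A codeword passes the check when it contains an even number of 1s."""
    return codeword.count("1") % 2 == 0


print(add_parity("1000001"))   # 10000010
print(add_parity("1001001"))   # 10010011
print(is_valid("1011"))        # False: a single error flipped the parity
```

Note that the check only detects an odd number of errors; two flipped digits restore the parity and go unnoticed.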

15 Error-correction in codes Is it possible to detect AND correct an error (without retransmission/data re-entry)? Error-correction by multiple entries Sending all messages 3 times regardless of existence/absence of errors Example: 1100 sent as 1100 1100 1100 Error-correction power: a received message of 1100 1100 1000 can be automatically corrected to 1100 1100 1100 without further data re-entry High resources demand: tripling message length Error-correction by multiple parity check digits
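Correction by triple entry amounts to a majority vote in each digit position. A minimal sketch, assuming the three copies arrive separated by spaces (the function name is hypothetical):

```python
def majority_decode(received):
    """Recover the message from three space-separated copies by majority vote."""
    copies = received.split()
    corrected = []
    for digits in zip(*copies):          # the three digits in the same position
        corrected.append("1" if digits.count("1") >= 2 else "0")
    return "".join(corrected)


print(majority_decode("1100 1100 1000"))   # 1100
```

The vote fails only when the same position is corrupted in two of the three copies.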

16 Error-correction in codes Example: transmit 1001 by multiple parity check digits → 1001101

17 Error-correction in codes 1-error correction: if the message received is 1000101 How can we detect/correct the error? (assume at most 1 error)
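The transcript does not preserve the rule used to compute the three check digits. The sketch below assumes one common choice (each check digit makes a particular group of three message digits plus itself have even parity); this assumption reproduces the codeword 1001101 for the message 1001, but it may not be the exact rule shown on the original slide. All names are illustrative.

```python
def encode(message):
    """Append three parity check digits to a 4-digit message (assumed rule)."""
    a1, a2, a3, a4 = (int(d) for d in message)
    c1 = (a1 + a2 + a3) % 2
    c2 = (a1 + a3 + a4) % 2
    c3 = (a2 + a3 + a4) % 2
    return message + f"{c1}{c2}{c3}"


def is_valid(word):
    """A 7-digit word is valid when its check digits match its message digits."""
    return encode(word[:4]) == word


def correct(received):
    """Assuming at most 1 error, flip each digit in turn until the word is valid."""
    if is_valid(received):
        return received
    for i, digit in enumerate(received):
        flipped = "1" if digit == "0" else "0"
        candidate = received[:i] + flipped + received[i + 1:]
        if is_valid(candidate):
            return candidate
    raise ValueError("no single-digit fix found")


print(encode("1001"))       # 1001101
print(correct("1000101"))   # 1001101 (the 4th digit was wrong)
```

For the received word 1000101, the second and third checks fail while the first passes, which singles out the fourth digit as the only one covered by exactly those two checks.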

18 Lengths of codes Basic question: how many digits do we need? How many digits are needed to encode 2 characters (e.g. A, B)? How many digits are needed to encode 4 characters? How many digits are needed to encode 26 characters (A-Z)?
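With a fixed-length binary code, n digits give 2^n distinct codewords, so the question is to find the smallest n with 2^n at least the number of characters. A short sketch (the function name is made up here):

```python
def digits_needed(n_characters):
    """Smallest number of binary digits giving at least n_characters codewords."""
    digits = 0
    while 2 ** digits < n_characters:
        digits += 1
    return digits


for n in (2, 4, 26):
    print(n, "characters:", digits_needed(n), "digits")
# 2 characters: 1 digits
# 4 characters: 2 digits
# 26 characters: 5 digits
```

Four digits give only 16 codewords, which is why the 26 letters need five.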

19 Numbers in Codes GCNU 1025 Numbers Save the Day

20 Efficiency of data transmission Run-length encoding (RLE): reduce number of characters transmitted (data compression) Example: black-and-white documents

22 Efficiency of data transmission Run-length encoding (RLE): reduce number of characters transmitted (data compression) Example: black-and-white documents Reduce length of duplicated characters Common for faxed documents and files containing runs Size increased if runs are absent
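Run-length encoding can be sketched as follows, with a row of a black-and-white document written as a string of W (white) and B (black). The representation as (symbol, run length) pairs and the sample rows are assumptions for illustration, not the format used by any particular fax standard.

```python
from itertools import groupby


def rle_encode(pixels):
    """Replace each run of identical symbols by a (symbol, run length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(pixels)]


def rle_decode(pairs):
    """Expand (symbol, run length) pairs back into the original string."""
    return "".join(symbol * length for symbol, length in pairs)


print(rle_encode("WWWWWWBBBWWWW"))   # [('W', 6), ('B', 3), ('W', 4)]
print(rle_encode("WBWB"))            # [('W', 1), ('B', 1), ('W', 1), ('B', 1)]
```

The second row has no runs, so its encoding needs four pairs for four symbols and is larger than the original.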

23 Efficiency of data transmission Example: two different ways of encoding Type 1 Type 2

24 Efficiency of data transmission Example: two different ways of encoding Type 1: fixed number of digits used Type 2: different numbers of digits used

25 Efficiency of data transmission Different coding methods Fixed length code: fixed number of digits used Variable length code: different numbers of digits used

26 Efficiency of data transmission Variable length code Shorter code for frequently used characters: efficiency enhanced Is there anything wrong with the following code? Is there anything wrong in encoding BIT? Is there anything wrong in decoding 0000001101?

27 Efficiency of data transmission Variable length code Is there anything wrong with the following code? Is there anything wrong in encoding BIT? No! Is there anything wrong in decoding 0000001101? Yes! Possible multiple interpretations (BIT or FET)! Prefix property: no codeword can be a prefix of another codeword Uniquely decipherable code: code satisfying the prefix property

28 Efficiency of data transmission Variable length code Prefix property: no codeword can be a prefix of another codeword Uniquely decipherable code: code satisfying the prefix property Example: the code is not uniquely decipherable as the codeword of B is a prefix of the codeword of F (this set of code does not satisfy the prefix property)
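Checking the prefix property is mechanical: compare every codeword with every other codeword. The code tables on the original slides are not preserved in the transcript, so the two tables below are hypothetical examples.

```python
def has_prefix_property(code):
    """Return True if no codeword is a prefix of another codeword."""
    words = list(code.values())
    for i, word in enumerate(words):
        for j, other in enumerate(words):
            if i != j and other.startswith(word):
                return False
    return True


# Hypothetical tables, for illustration only.
bad_code = {"A": "0", "B": "01", "C": "11"}    # "0" is a prefix of "01"
good_code = {"A": "0", "B": "10", "C": "11"}

print(has_prefix_property(bad_code))    # False
print(has_prefix_property(good_code))   # True
```

With the first table, the string 011 can be read as AC or as B followed by a leftover 1, which is the kind of ambiguity the prefix property rules out.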

29 Efficiency of data transmission Variable length code Example: do these two codes satisfy the prefix property?

30 Numbers in Codes GCNU 1025 Numbers Save the Day

31 Efficiency of data transmission Variable length code Which of the two uniquely decipherable codes is more efficient?

32 Efficiency of data transmission Variable length code Which of the two uniquely decipherable codes has a shorter average length?

33 Efficiency of data transmission Variable length code Which of the two uniquely decipherable codes has a shorter average length? Scheme 1! Example: DELETE THE FILE Scheme 1: 52 digits in total Scheme 2: 49 digits in total

34 Efficiency of data transmission Variable length code Which of the two uniquely decipherable codes has a shorter average length? Scheme 1! Example: DELETE THE FILE Scheme 1: 52 digits in total Scheme 2: 49 digits in total E is a very common (heavy) character Frequencies (weights) also important! Weighted average should be considered instead

35 Efficiency of data transmission

37 Variable length code Weighted average code length Choice of frequency tables: Choice #1: frequency table from specific message Choice #2: general frequency table for typical English passages
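The weighted average code length is the sum of (frequency × codeword length) over all characters, divided by the total frequency. The sketch below uses Choice #1, a frequency table taken from a specific message; the two code tables are hypothetical, since the tables on the slides are not preserved in the transcript.

```python
from collections import Counter


def weighted_average_length(code, frequencies):
    """Average number of digits per character, weighted by character frequency."""
    total = sum(frequencies.values())
    weighted = sum(frequencies[ch] * len(code[ch]) for ch in frequencies)
    return weighted / total


frequencies = Counter("DELETE")   # D: 1, E: 3, L: 1, T: 1

# Hypothetical code tables, for illustration only.
fixed_code = {"D": "00", "E": "01", "L": "10", "T": "11"}
variable_code = {"E": "0", "D": "11", "L": "100", "T": "101"}

print(weighted_average_length(fixed_code, frequencies))                 # 2.0
print(round(weighted_average_length(variable_code, frequencies), 3))    # 1.833
```

The variable-length table wins here because the heavy character E gets the one-digit codeword.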

38 Efficiency of data transmission Variable length code Weighted average code length (Partial) example: Morse code

39 Classwork: Calculate the weighted average code length for the Morse codes, using the general frequency table Answer: 2.544

40 Numbers in Codes GCNU 1025 Numbers Save the Day

41 Huffman code Aim: produce a code with the smallest weighted average code length for a given frequency table Basic principle: shorter codewords for more frequent characters Tool: a tree built from bottom to top with characters being the “leaves”

42 Huffman code Example: a code for 4 characters Step 1: combine the 2 with lowest probabilities

43 Huffman code Example: a code for 4 characters Step 2: combine the 2 among “D”, “E” and “LT” with lowest probabilities

44 Huffman code Example: a code for 4 characters Step 3: combine the 2 among “E” and “LTD” with lowest probabilities

45 Huffman code Example: a code for 4 characters Step 4: assign “0” to the branch with the bigger probability and “1” to the branch with the smaller probability

47 Huffman code Example: a code for 4 characters Step 5: read out the codewords from the top of the tree

48 Huffman code Example: a code for 4 characters Does the code constructed this way always satisfy the prefix property? If “11” is a codeword for D, is it possible for other codewords to begin with “11”? No (as the branch for D stops at “11”)!
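The steps above can be sketched with a priority queue. The frequency table on the original slide is not preserved, so the weights below are assumed values, chosen so that the merges happen in the same order as in the example (L with T, then D with LT, then E with LTD). As in Step 4, the branch with the bigger weight gets 0 and the smaller gets 1.

```python
import heapq
from itertools import count


def huffman(weights):
    """Build a Huffman code from a {character: weight} table."""
    tiebreak = count()   # unique counter so tuples never compare tree nodes
    heap = [(w, next(tiebreak), ch) for ch, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w_small, _, small = heapq.heappop(heap)   # lowest weight
        w_big, _, big = heapq.heappop(heap)       # second lowest weight
        # A merged node is a pair: (branch labelled 0, branch labelled 1).
        heapq.heappush(heap, (w_small + w_big, next(tiebreak), (big, small)))
    _, _, root = heap[0]

    codes = {}

    def walk(node, prefix):
        """Read codewords from the top of the tree down to the leaves."""
        if isinstance(node, str):
            codes[node] = prefix or "0"
            return
        walk(node[0], prefix + "0")
        walk(node[1], prefix + "1")

    walk(root, "")
    return codes


# Assumed weights (percentages), for illustration only.
print(huffman({"E": 55, "D": 20, "L": 15, "T": 10}))
# {'E': '0', 'L': '100', 'T': '101', 'D': '11'}
```

Every character sits at a leaf, so no codeword can continue into another one; that is why the prefix property holds automatically.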

49 Classwork: Constructing Huffman code

50 Numbers in Codes GCNU 1025 Numbers Save the Day

51 Constructing Huffman code

61 Huffman code: remarks Multiple possible Huffman codes for same frequency table Different number of layers possible Are the weighted average code lengths the same? Different Huffman codes for same frequency table have same weighted average code length Smallest weighted average code length guaranteed (proof out of scope)

62 Huffman codes: comparison

63 Numbers in Codes GCNU 1025 Numbers Save the Day

64 Arithmetic coding No one-to-one correspondence between characters and codewords (unlike Huffman code) Encode whole message into one number Example: “DELETE” encoded as 0.11633 (decimal number)

65 Arithmetic coding Example: “DELETE” encoded as 0.11633 (decimal number) Step 1: Divide the interval (0, 1) into portions

66 Arithmetic coding Example: “DELETE” encoded as 0.11633 (decimal number) Step 2: Choose (zoom into) portion of first character “D” and divide the portion according to the probabilities (as in Step 1)

67 Arithmetic coding Step 2: Choose (zoom into) portion of first character “D” and divide the portion according to the probabilities (as in Step 1)

68 Arithmetic coding Example: “DELETE” encoded as 0.11633 (decimal number) Step 3: Choose (zoom into) portion of second character “E” and divide the portion according to the probabilities

69 Arithmetic coding Step 4: Keep choosing (zooming into) portions in correct order and dividing the chosen portion according to the probabilities

70 Arithmetic coding Step 5: Choose the portion of “END” when the message ends Step 6: Choose any number within the range of “END” as the codeword for the message (e.g. 0.11633)

71 Arithmetic coding Example: decoding 0.11633 with the frequency table Step 1: Divide into portions Step 2: Where is 0.11633? Zoom in! 0.11633 is in Section D: first character of message is “D” Step 3: Repeat Steps 1 and 2. Stop when it hits “END”!
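The zoom-in procedure for encoding and decoding can be sketched with exact fractions. The frequency table used on the slides is not preserved in the transcript, so the probabilities below are assumptions (the letter counts of DELETE plus one END symbol); with this table the resulting number will generally differ from 0.11633. The encoder picks the midpoint of the final END portion, which is one valid choice among many.

```python
from fractions import Fraction

# Assumed probabilities, for illustration only; they sum to 1.
PROBS = {
    "D": Fraction(1, 7),
    "E": Fraction(3, 7),
    "L": Fraction(1, 7),
    "T": Fraction(1, 7),
    "END": Fraction(1, 7),
}


def portions(probs):
    """Divide the interval from 0 to 1 into one portion per symbol."""
    table, start = {}, Fraction(0)
    for symbol, p in probs.items():
        table[symbol] = (start, start + p)
        start += p
    return table


def encode(message, probs):
    """Zoom into the portion of each character in turn, then into END."""
    table = portions(probs)
    low, width = Fraction(0), Fraction(1)
    for symbol in list(message) + ["END"]:
        lo, hi = table[symbol]
        low, width = low + width * lo, width * (hi - lo)
    return low + width / 2   # any number inside the final portion would do


def decode(number, probs):
    """Find the portion containing the number, output it, zoom in, repeat."""
    table = portions(probs)
    message = []
    while True:
        for symbol, (lo, hi) in table.items():
            if lo <= number < hi:
                break
        if symbol == "END":
            return "".join(message)
        message.append(symbol)
        number = (number - lo) / (hi - lo)   # rescale the portion to 0..1


codeword = encode("DELETE", PROBS)
print(float(codeword))
print(decode(codeword, PROBS))   # DELETE
```

Exact fractions are used so that rounding never pushes the number across a portion boundary; practical implementations work with limited precision and need extra care.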

72 Arithmetic coding

73 Numbers in Codes GCNU 1025 Numbers Save the Day

74 Units in daily life Examples of prefixes: Mega-pixel Nano-meter Giga-watt

75 SI prefixes International system of units Examples: km, mm, cm, mL Some common SI prefixes:

76 Units in data transmission

77 SI prefixes commonly used for transmission speed Example: 56kbps kbps: kilo-bit per second kilo (SI prefix): 1000 Bit: binary digit

78 Binary prefixes Different from SI prefixes: same letter, different meaning 1024 used instead of 1000 Comparison:

79 Units in computer systems (file size)

80 Units in telecommunication

81 Example: How long does it take to download a 4 MB song via a 56K modem? 4 MB: 4 x 1024 x 1024 x 8 bits 56K modem: 56 kbps transfer rate 56 kbps: 56 x 1000 bits per second (Minimum) time needed for downloading: ~600 seconds
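The arithmetic behind this estimate can be checked directly. The sketch follows the convention used in the example: file sizes use binary prefixes (1 MB = 1024 x 1024 bytes, 8 bits per byte) while transfer rates use SI prefixes (1 kbps = 1000 bits per second). The function name is made up for illustration.

```python
def download_seconds(size_mb, rate_kbps):
    """Minimum download time: file size in bits divided by bits per second."""
    file_bits = size_mb * 1024 * 1024 * 8   # binary prefix for file size
    bits_per_second = rate_kbps * 1000      # SI prefix for transfer rate
    return file_bits / bits_per_second


print(round(download_seconds(4, 56), 1))   # 599.2
```

This is a lower bound, about 10 minutes; it ignores protocol overhead and any drop in line speed.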

82 Classwork 10: telecommunication

83 Units in hard disk packaging Confusion in units: SI prefixes used in packaging of hard disks/flash drives True capacity of disk/computer memory (e.g. RAM)/file size expressed by binary prefixes
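The gap between the two conventions is easy to quantify. The sketch below converts a capacity advertised with SI prefixes (1 GB = 1000^3 bytes) into the binary unit a computer typically reports (1024^3 bytes, also written GiB). The 500 GB drive is a hypothetical example, not a figure from the slides.

```python
def advertised_gb_to_binary_gb(advertised_gb):
    """Convert SI gigabytes (1000^3 bytes) to binary gigabytes (1024^3 bytes)."""
    total_bytes = advertised_gb * 1000 ** 3
    return total_bytes / 1024 ** 3


print(round(advertised_gb_to_binary_gb(500), 2))   # 465.66
```

So a drive sold as 500 GB shows up as roughly 466 GB in a system that counts with binary prefixes, about 7% less.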

84 Units in hard disk packaging

85 Numbers in Codes -End-

