CSCE 3110 Data Structures & Algorithm Analysis


1 CSCE 3110 Data Structures & Algorithm Analysis
Rada Mihalcea
Trees Applications

2 Trees: A Review (again?)
General tree: one parent, N children
Binary tree: ISA general tree + max 2 children
Binary search tree: ISA binary tree + left subtree < parent < right subtree
AVL tree: ISA binary search tree + | height of left subtree - height of right subtree | ≤ 1
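The "ISA" chain above can be checked mechanically. The following is a minimal sketch, assuming integer keys and a hypothetical `Node` class and `check` helper (neither comes from the slides); it tests the binary search tree ordering and the AVL height condition in one pass.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type for illustration; not from the slides.
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def check(node, lo=None, hi=None):
    """Return (is_bst, is_avl, height) for the subtree rooted at node.

    The height of an empty tree is taken to be -1 (an assumed convention).
    """
    if node is None:
        return True, True, -1
    # BST rule: every key must lie strictly between the bounds inherited from ancestors.
    in_range = (lo is None or lo < node.key) and (hi is None or node.key < hi)
    l_bst, l_avl, l_h = check(node.left, lo, node.key)
    r_bst, r_avl, r_h = check(node.right, node.key, hi)
    is_bst = in_range and l_bst and r_bst
    # AVL rule: a BST whose subtree heights differ by at most 1 at every node.
    is_avl = is_bst and l_avl and r_avl and abs(l_h - r_h) <= 1
    return is_bst, is_avl, 1 + max(l_h, r_h)


balanced = Node(2, Node(1), Node(3))
chain = Node(1, None, Node(2, None, Node(3)))
print(check(balanced))  # (True, True, 1)
print(check(chain))     # (True, False, 2)
```

The right-leaning chain is still a valid binary search tree, but it fails the AVL condition at its root.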

3 Trees: A Review (cont’d)
Multi-way search tree: ISA general tree + each node has K keys and K+1 children + all keys in child K < key K < all keys in child K+1
2-4 tree: ISA multi-way search tree + all nodes have at most 3 keys / 4 children + all leaves are at the same level
B-tree: all nodes have at least T keys, at most 2T(+1) keys

4 Tree Applications
Data Compression: Huffman tree
Automatic Learning: Decision trees

5 Huffman code
Very often used for text compression
Do you know how gzip or winzip works? → Compression methods
ASCII code uses codes of equal length for all letters → how many codes?
Today’s alternative to ASCII?
Idea behind Huffman code: use shorter length codes for letters that are more frequent

6 Huffman Code
Build a list of letters and frequencies: “have a great day today”
Build a Huffman tree bottom up, by repeatedly grouping the two letters (or groups of letters) with the smallest occurrence frequencies
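The bottom-up construction can be sketched in a few lines. This is one possible implementation under stated assumptions, not necessarily the one used in class: the function name `huffman_codes`, the use of a binary heap, and the choice of "0" for left and "1" for right are all illustrative. Ties between equal frequencies are broken by insertion order, so individual codes may differ from a hand-built tree while the total encoded length stays the same.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each character of text to its Huffman code string."""
    if not text:
        return {}
    freq = Counter(text)
    # Heap entries are (weight, tiebreak, tree). A tree is either a single
    # character (leaf) or a (left, right) tuple (internal node). The unique
    # tiebreak keeps Python from ever comparing two trees.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        # Group the two lightest trees under a new parent.
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, tiebreak, (t1, t2)))
        tiebreak += 1

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            # A text with a single distinct character still needs one bit.
            codes[tree] = prefix or "0"

    walk(heap[0][2], "")
    return codes


text = "have a great day today"
codes = huffman_codes(text)
encoded = "".join(codes[c] for c in text)
print(len(encoded))  # 72 bits, versus 22 * 8 = 176 bits at 8 bits per character
```

The most frequent symbols here ("a" with 5 occurrences and the space with 4) end up with the shortest codes.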

7 Huffman Codes
Write the Huffman codes for the strings:
“abracadabra”
“Veni Vidi Vici”

8 Huffman Code
Running time? Suppose N letters in the input string, with L unique letters
What is the most important factor for obtaining the highest compression?
Compare: [assume a text with a total of 1000 characters]
I. Three different characters, each occurring the same number of times
II. 20 different characters, 19 of them occurring only once, and the 20th occurring the rest of the time
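The comparison can be worked numerically. The sketch below assumes a heap-based construction (counting the letters costs O(N), and the merging loop costs O(L log L)) and uses a hypothetical helper `huffman_total_bits`. It relies on the fact that the total encoded length equals the sum of the weights of all merged nodes. Since 1000 is not divisible by 3, case I uses the counts 334, 333, 333 as an approximation.

```python
import heapq


def huffman_total_bits(frequencies):
    """Total length in bits of the Huffman encoding for the given symbol counts."""
    heap = list(frequencies)
    heapq.heapify(heap)
    if len(heap) == 1:
        return heap[0]  # a single symbol still needs one bit per occurrence
    total = 0
    while len(heap) > 1:
        # Each merge pushes every symbol below it one level deeper,
        # which adds exactly the merged weight to the encoded length.
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total


case_one = [334, 333, 333]    # three characters, (almost) equally frequent
case_two = [1] * 19 + [981]   # 19 rare characters, one dominant character

print(huffman_total_bits(case_one))  # 1666 bits, about 1.67 bits per character
print(huffman_total_bits(case_two))  # 1082 bits, about 1.08 bits per character
```

Case II has many more distinct characters yet compresses better, which suggests that the skew of the frequency distribution matters more than the size of the alphabet.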

9 One More Application
Heuristic Search
Decision Trees
Given a set of examples, each with an associated decision (e.g. good/bad, +/-, pass/fail, caseI/caseII/caseIII, etc.)
Attempt to make a decision (automatically) when a new example is presented
Predict the behavior in new cases!

10 Data Records
Name A B C D E F G
1. Jeffrey B. 1 0 1 0 1 0 1 -
2. Paul S
3. Daniel C
4. Gregory P
5. Michael N
6. Corinne N
7. Mariyam M
8. Stephany D
9. Mary D
10. Jamie F

11 Fields in the Record
A: First name ends in a vowel?
B: Neat handwriting?
C: Middle name listed?
D: Senior?
E: Got extra-extra credit?
F: Google brings up home page?
G: Google brings up reference?

12 Build a Classification Tree
Internal nodes: features
Leaves: classification
[Figure: classification tree with internal nodes F, A, D, A and leaves holding records {2,3,7}, {1,4,5,6}, {10}, {8,9}]
Error: 30%

13 Different Search Problem
Given a set of data records with their classifications, pick a decision tree: search problem!
Challenges: Scoring function? Large space of trees.
What’s a good tree? Low error on the given set of records, and small
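One way to make "low error and small" concrete is a score that adds a size penalty to the training error. The sketch below is an assumption-laden illustration, not the scoring function from the course: the tuple-based tree encoding, the helper names, the `size_penalty` weight, and the four toy records are all made up for this example and are not the class records from the slides.

```python
def classify(tree, record):
    """Walk the tree; a node is (feature, subtree_if_0, subtree_if_1), a leaf is a label."""
    while isinstance(tree, tuple):
        feature, if_zero, if_one = tree
        tree = if_one if record[feature] else if_zero
    return tree


def error_rate(tree, examples):
    """Fraction of (record, label) examples the tree misclassifies."""
    wrong = sum(1 for record, label in examples if classify(tree, record) != label)
    return wrong / len(examples)


def size(tree):
    """Number of nodes, counting both internal nodes and leaves."""
    if not isinstance(tree, tuple):
        return 1
    return 1 + size(tree[1]) + size(tree[2])


def score(tree, examples, size_penalty=0.01):
    """Lower is better. The penalty weight is an arbitrary illustrative choice."""
    return error_rate(tree, examples) + size_penalty * size(tree)


# Hypothetical toy records with two binary features, A and F.
examples = [
    ({"A": 1, "F": 0}, "+"),
    ({"A": 0, "F": 0}, "-"),
    ({"A": 1, "F": 1}, "-"),
    ({"A": 0, "F": 1}, "-"),
]
stump = ("A", "-", "+")              # tests A only
full = ("F", ("A", "-", "+"), "-")   # tests F, then A

print(error_rate(stump, examples), size(stump))  # 0.25 3
print(error_rate(full, examples), size(full))    # 0.0 5
```

Searching the space of trees then amounts to finding a tree with a low score; how strongly to penalize size is a design decision that trades training error against simplicity.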

14 “Perfect” Decision Tree
[Figure: decision tree testing middle name?, E (EEC?), F (Google?), and B (Neat?)]
Training set error: 0% (can always do this?)

15 Search For a Classification
Classify new records
New1. Mike M ?
New2. Jerry K ?

16 The very last tree for this class

