1 / 16

CSCE 3110 Data Structures & Algorithm Analysis

CSCE 3110 Data Structures & Algorithm Analysis. Rada Mihalcea http://www.cs.unt.edu/~rada/CSCE3110 Trees Applications. Trees: A Review (again?  ). General trees one parent, N children Binary tree ISA General tree + max 2 children Binary search tree ISA Binary tree

siran
Download Presentation

CSCE 3110 Data Structures & Algorithm Analysis

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CSCE 3110Data Structures & Algorithm Analysis Rada Mihalcea http://www.cs.unt.edu/~rada/CSCE3110 Trees Applications

  2. Trees: A Review (again? ) • General trees • one parent, N children • Binary tree • ISA General tree • + max 2 children • Binary search tree • ISA Binary tree • + left subtree < parent < right subtree • AVL tree • ISA Binary search tree • + | height left subtree – height right subtree |  1

  3. Trees: A Review (cont’d) • Multi-way search tree • ISA General tree • + Each node has K keys and K+1 children • + All keys in child K < key K < all keys in child K+1 • 2-4 Tree • ISA Multi-way search tree • + All nodes have at most 3 keys / 4 children • + All leaves are at the same level • B-Tree • ISA Multi-way search tree • + All nodes have at least T keys, at most 2T(+1) keys • + All leaves are at the same level

  4. Tree Applications • Data Compression • Huffman tree • Automatic Learning • Decision trees

  5. Huffman code • Very often used for text compression • Do you know how gzip or winzip works? •  Compression methods • ASCII code uses codes of equal length for all letters  how many codes? • Today’s alternative to ASCII? • Idea behind Huffman code: use shorter length codes for letters that are more frequent

  6. Huffman Code • Build a list of letters and frequencies “have a great day today” • Build a Huffman Tree bottom up, by grouping letters with smaller occurrence frequencies

  7. Huffman Codes • Write the Huffman codes for the strings • “abracadabra” • “Veni Vidi Vici”

  8. Huffman Code • Running time? • Suppose N letters in input string, with L unique letters • What is the most important factor for obtaining highest compression? • Compare: [assume a text with a total of 1000 characters] • I. Three different characters, each occurring the same number of times • II. 20 different characters, 19 of them occurring only once, and the 20st occurring the rest of the time

  9. One More Application • Heuristic Search • Decision Trees • Given a set of examples, with an associated decision (e.g. good/bad, +/-, pass/fail, caseI/caseII/caseIII, etc.) • Attempt to take (automatically) a decision when a new example is presented • Predict the behavior in new cases!

  10. Data Records Name A B C D E F G 1. Jeffrey B. 1 0 1 0 1 0 1 - 2. Paul S. 0 1 1 0 0 0 1 - 3. Daniel C. 0 0 1 0 0 0 0 - 4. Gregory P. 1 0 1 0 1 0 0 - 5. Michael N. 0 0 1 1 0 0 0 - 6. Corinne N. 1 1 1 0 1 0 1 + 7. Mariyam M. 0 1 0 1 0 0 1 + 8. Stephany D. 1 1 1 1 1 1 1 + 9. Mary D. 1 1 1 1 1 1 1 + 10. Jamie F. 1 1 1 0 0 1 1 +

  11. Fields in the Record A: First name ends in a vowel? B: Neat handwriting? C: Middle name listed? D: Senior? E: Got extra-extra credit? F: Google brings up home page? G: Google brings up reference?

  12. Build a Classification Tree Internal nodes: features Leaves: classification F 0 1 A D A 2,3,7 1,4,5,6 10 Error: 30% 8,9

  13. Different Search Problem Given a set of data records with their classifications, pick a decision tree: search problem! Challenges: • Scoring function? • Large space of trees. What’s a good tree? • Low error on given set of records • Small

  14. “Perfect” Decision Tree C middle name? 0 1 E EEC? 0 1 F B Google? Neat? 0 0 1 1 Training set Error: 0% (can always do this?)

  15. Search For a Classification • Classify new records New1. Mike M. 1 0 1 1 0 0 1 ? New2. Jerry K. 0 1 0 1 0 0 0 ?

  16. The very last tree for this class

More Related