Storage by Hashing

Travis Roe. Topics of Computer Science, Chapter 43. 2-5-2006. Storage by Hashing. Outline: A Problem; A Solution: Hashing; Questions; Q & A. A Problem: Company organizing data using social security numbers, or similar.


Presentation Transcript


  1. Travis Roe Topics of Computer Science Chapter 43 2-5-2006 Storage by Hashing

  2. Outline • A Problem • A Solution: Hashing • Questions • Q & A

  3. A Problem • Company organizing data using social security numbers, or similar. • Need to add and search through collections of identifiers to find objects.

  4. Potential Solution: 1-1 (direct addressing) • One index per possible key value • Adding: O(1) • Searching: O(1) • Pros: Very fast, very easy to implement • Cons: Far too much memory, much of it unused
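A minimal sketch of the one-index-per-key idea, to make the memory cost concrete. It assumes integer keys; the names `KEY_SPACE`, `table`, `add`, and `search` are illustrative and not from the slides, and the key space is shrunk to 4 digits so the demo stays small.

```python
# Direct addressing: one slot for every possible key.
# Assumption: keys are integers in range(KEY_SPACE).
KEY_SPACE = 10_000

table = [None] * KEY_SPACE   # memory grows with the key space, not with the data

def add(key, value):
    table[key] = value       # O(1): the key is the index

def search(key):
    return table[key]        # O(1); None means "not present"

add(1287, "record A")
add(4321, "record B")

# Only 2 of the 10,000 slots hold data; the rest is wasted space.
used = sum(slot is not None for slot in table)

# Full 9-digit identifiers would need 10**9 slots under the same scheme.
full_slots = 10**9
```

With two records stored, `used` is 2 while `len(table)` is 10,000, which is the "much of it unused" drawback from the slide.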

  5. Potential Solution: Unsorted Array • Adding: O(1) • Searching: O(n) • Pros: Easy to implement, fast adding • Cons: Everything else; O(n) searching is far too slow for large collections

  6. Potential Solution: Sorted Array • Adding: O(n) • Searching: O(lg n) • Pros: Fast searching • Cons: Slow adding
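The sorted-array trade-off can be sketched with Python's standard `bisect` module. This is only an illustration: the list `keys` and the helpers `add` and `search` are hypothetical names, and the sample keys are the 4-digit values used later in the deck.

```python
import bisect

keys = []  # kept sorted at all times

def add(key):
    # Locating the position is O(lg n), but inserting into the middle
    # of a list shifts the elements after it, so adding is O(n) overall.
    bisect.insort(keys, key)

def search(key):
    # Binary search: O(lg n).
    i = bisect.bisect_left(keys, key)
    return i < len(keys) and keys[i] == key

for k in (6789, 1287, 7465, 4321):
    add(k)
```

After the loop, `keys` is `[1287, 4321, 6789, 7465]` regardless of insertion order.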

  7. Potential Solution: Balanced BST • Adding: O(lg n) • Searching: O(lg n) • Pros: Fast adding and searching • Cons: Hard to program; still not O(1)

  8. A New Solution: Hashing • Adding: • Use the key to choose an index. • Place the object at that index. • O(1) on average • Searching: • Use the key to find the index. • Get the object from that index. • O(1) on average

  9. Hashing: An Example • Social Security Numbers are the key • The hash-key is based on the last 4 digits of the number • Example table (key → index): 154-38-1287 → 1287 ... 987-65-4321 → 4321 ... 123-45-6789 → 6789 ... 192-83-7465 → 7465
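The last-four-digits example might look like this in code. This is a sketch under assumptions: the table has 10,000 slots (one per possible 4-digit value), keys are strings in `ddd-dd-dddd` form, and collisions are ignored for now (a second key with the same last four digits simply overwrites the first). The names `hash_ssn`, `table`, `add`, and `search` are illustrative.

```python
TABLE_SIZE = 10_000  # assumption: one slot per possible last-4-digit value

def hash_ssn(ssn: str) -> int:
    """Return the last four digits of an SSN-style string as a table index."""
    digits = ssn.replace("-", "")
    return int(digits[-4:])

table = [None] * TABLE_SIZE

def add(ssn, record):
    # No collision handling here: an existing entry at this index is overwritten.
    table[hash_ssn(ssn)] = (ssn, record)

def search(ssn):
    entry = table[hash_ssn(ssn)]
    if entry is not None and entry[0] == ssn:
        return entry[1]
    return None  # empty slot, or the slot holds a different key

add("154-38-1287", "record A")
add("987-65-4321", "record B")
```

Storing the full key next to the record lets `search` tell a real match from a different key that happens to share the same index.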

  10. Collisions • Expected problems: • Two objects with the same key • Two different keys with the same value after hashing • Ways to solve the problems: • Chaining • Probing

  11. Collision Handling: Chaining • Every slot in the table holds a list of some sort. • Whenever there is a collision, put the new item into the list at that slot.
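A small sketch of chaining, assuming the last-four-digits hash from the earlier example and plain Python lists as the per-slot chains. The names `buckets`, `add`, and `search` are hypothetical; a linked list or any other sequence would serve equally well as the chain.

```python
TABLE_SIZE = 10_000

def hash_ssn(ssn):
    # Assumed hash: last four digits of the SSN-style string.
    return int(ssn.replace("-", "")[-4:]) % TABLE_SIZE

# One independent (initially empty) list per slot.
buckets = [[] for _ in range(TABLE_SIZE)]

def add(ssn, record):
    bucket = buckets[hash_ssn(ssn)]
    for i, (key, _) in enumerate(bucket):
        if key == ssn:                 # same key again: replace its record
            bucket[i] = (ssn, record)
            return
    bucket.append((ssn, record))       # new key (possibly a collision): extend the chain

def search(ssn):
    for key, record in buckets[hash_ssn(ssn)]:
        if key == ssn:
            return record
    return None

add("123-45-6789", "first")
add("543-21-6789", "second")   # collides with the first key at index 6789
```

Both records end up in the chain at index 6789, and each is still found by its full key. Search cost grows with chain length, so long chains erode the O(1) behavior.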

  12. Collision Handling: Probing • Whenever there is a collision, go to another location some distance away and attempt to fill that location. • Can cause grouping (clustering). • Probe location: h(k) + a * x, with a = 2 • Example: 123-45-6789 and 543-21-6789 both hash to 6789
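The probing formula can be sketched as follows. Several details are assumptions rather than something the slide states: `x` is read as the attempt number (0, 1, 2, ...), the index wraps around with a modulo, and the table size is the prime 10,007 so that a step of 2 can eventually reach every slot. Deletion is not handled. All function names are illustrative.

```python
TABLE_SIZE = 10_007   # assumption: prime size, so a step of 2 visits every slot
STEP = 2              # the slide's a = 2

def hash_ssn(ssn):
    # Assumed hash: last four digits of the SSN-style string.
    return int(ssn.replace("-", "")[-4:])

slots = [None] * TABLE_SIZE

def probe(ssn):
    """Yield (h(k) + a * x) mod TABLE_SIZE for x = 0, 1, 2, ..."""
    start = hash_ssn(ssn)
    for x in range(TABLE_SIZE):
        yield (start + STEP * x) % TABLE_SIZE

def add(ssn, record):
    for i in probe(ssn):
        if slots[i] is None or slots[i][0] == ssn:
            slots[i] = (ssn, record)
            return i                   # index where the record ended up
    raise RuntimeError("table is full")

def search(ssn):
    for i in probe(ssn):
        if slots[i] is None:
            return None                # reached an empty slot: key was never stored
        if slots[i][0] == ssn:
            return slots[i][1]
    return None

first = add("123-45-6789", "first")     # lands at index 6789
second = add("543-21-6789", "second")   # 6789 is taken, so it moves to 6791
```

Keys that hash to the same index follow the same probe path, which is one way the grouping mentioned on the slide arises.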

  13. Reducing Collisions • Use prime numbers for array sizes • Take more space than you'll need • Choose a better hash function
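One way to see why a prime array size can help is to count how many distinct slots a patterned set of keys reaches. The sketch below assumes a simple `key % size` hash and keys that are all multiples of 10; both are illustrative choices, not taken from the slides.

```python
def slots_used(keys, size):
    """Count the distinct indices produced by key % size."""
    return len({k % size for k in keys})

keys = range(0, 1000, 10)          # 100 keys: 0, 10, 20, ..., 990

used_100 = slots_used(keys, 100)   # 10: keys share a factor with the size and pile up
used_101 = slots_used(keys, 101)   # 100: with a prime size every key gets its own slot
```

The table with 101 slots is barely larger, yet it spreads these 100 keys over 100 slots instead of 10.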

  14. References • Dewdney, A.K. “Storage By Hashing”. The New Turing Omnibus. 1993. Computer Science Press. • “Hash Tables”. Recording My Programming Path. http://qiang-ma.blogspot.com/2007/10/hash-tables.html (last accessed 2-5-2008) • Standish, Thomas. Data Structures, Algorithms & Software Principles in C. 1995. Addison-Wesley Publishing Company, Inc. pp. 450-475 (approx.)

  15. Questions • What are the two methods for handling collisions that were discussed in this lecture? • What is one situation in which hash tables are not useful?
