Data Structures

baird

Presentation Transcript


  1. Data Structures Queues and Stacks

  2. Queues

  3. Queue • Can receive multiple requests from multiple sources • How do we service these requests? • First-come, first-served processing • Priority-based processing • Buffering of requests, as they might arrive faster than they can be processed • You could always use a List structure, with an integer value associated with the item, and then append it to the List using the Add() method • Inefficient

  4. Why List is problematic • nextJobPost keeps track of the “next” job to be processed in the List • Two jobs added to List • Job 1 is processed and slot becomes available • Job 3 grabs first available slot, and Job 4 gets the next available slot

  5. Why List is problematic • List will continue to grow, even if jobs are processed right away • The default is to double the size, when the list requires additional “slots” • No reclaiming of the already used slots is done with Lists • If you do reclaim the “used” slots in the List, then your first-come, first-serve processing scheme will not work • A List represents a linear order

  6. A “circular” List • When adding an item, once the last item is used, the “next” will “wrap around” to the 0th item in the array/list • A “modulus” function is used to “wrap around” • What happens if all items are filled, and you still need another item? • Resize the circular array…! • This is done in the Queue class
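The wrap-around idea can be sketched as follows. This is a minimal illustration, not the actual Queue source: the class name CircularBuffer and its members are hypothetical, it stores plain ints, and it assumes a starting capacity of at least 1 and a doubling resize.

```csharp
using System;

// Hypothetical circular buffer for illustration; not the real Queue implementation.
public class CircularBuffer
{
    private int[] items;
    private int head;   // index of the next item to remove
    private int tail;   // index of the next free slot
    private int count;  // number of items currently stored

    public CircularBuffer(int capacity)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException("capacity");
        items = new int[capacity];
    }

    public int Capacity { get { return items.Length; } }
    public int Count { get { return count; } }

    public void Add(int value)
    {
        if (count == items.Length) Resize();   // every slot is filled
        items[tail] = value;
        tail = (tail + 1) % items.Length;      // modulus wraps back to slot 0
        count++;
    }

    public int Remove()
    {
        if (count == 0) throw new InvalidOperationException("Buffer is empty.");
        int value = items[head];
        head = (head + 1) % items.Length;      // the freed slot can be reused later
        count--;
        return value;
    }

    // Doubles the array and copies the items back in first-come order.
    private void Resize()
    {
        int[] bigger = new int[items.Length * 2];
        for (int i = 0; i < count; i++)
            bigger[i] = items[(head + i) % items.Length];
        items = bigger;
        head = 0;
        tail = count;
    }
}

public static class CircularBufferDemo
{
    public static void Main()
    {
        var buffer = new CircularBuffer(4);
        for (int job = 1; job <= 4; job++) buffer.Add(job);
        Console.WriteLine(buffer.Remove());  // 1
        buffer.Add(5);                       // wraps around into slot 0
        buffer.Add(6);                       // all 4 slots filled, so the array is resized
        Console.WriteLine(buffer.Capacity);  // 8
        Console.WriteLine(buffer.Remove());  // 2
    }
}
```

Because head and tail both wrap, slots freed by Remove() are reused without breaking first-come, first-served order.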

  7. Queue • Add / remove buffer items • First-come, first-serve (FIFO) • Manage space utilization • Uses Generics • Type-safe • Methods • Enqueue() • Adds elements at the “tail” index • If not enough space, default growth factor of 2.0 is used to resize • Class constructor can specify other growth factor • Dequeue() • Returns the current element from the “head” index • Sets the “head” element to null and increments the “head” index • Peek() • Allows you to see the head element, without a dequeue, or increasing the head index counter • Contains() • Determine if a specific item exists in the Queue • ToArray() • Returns an array containing the Queue’s elements
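A short usage sketch of the methods listed above, using the generic Queue from System.Collections.Generic. The job names and the Drain helper are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class QueueDemo
{
    // Hypothetical helper: dequeues every item and returns them in processing order.
    public static List<string> Drain(Queue<string> queue)
    {
        var processed = new List<string>();
        while (queue.Count > 0)
            processed.Add(queue.Dequeue());
        return processed;
    }

    public static void Main()
    {
        var jobs = new Queue<string>();
        jobs.Enqueue("Job 1");
        jobs.Enqueue("Job 2");
        jobs.Enqueue("Job 3");

        Console.WriteLine(jobs.Peek());             // Job 1 (still in the queue)
        Console.WriteLine(jobs.Dequeue());          // Job 1 (now removed)
        Console.WriteLine(jobs.Contains("Job 3"));  // True
        Console.WriteLine(string.Join(", ", jobs.ToArray()));  // Job 2, Job 3
        Console.WriteLine(string.Join(", ", Drain(jobs)));     // Job 2, Job 3
    }
}
```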

  8. Stacks • LIFO structure • Uses a circular array, as does the Queue • Methods • Push() • Adds an item to the stack • Pop() • Removes and returns the item on the “top” of the stack • Size is increased, as required (same as the Queue’s growth factor) • Call Stack as used by the CLR is an example of this structure • When calling a function, Push its information onto the stack • When returning from that routine, Pop it from the stack and expose the routine to which it returns control
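The call-stack behaviour described above can be mimicked with the generic Stack class. The routine names below are invented for the example; the sketch only shows the LIFO order, not how the CLR actually stores frames.

```csharp
using System;
using System.Collections.Generic;

public static class StackDemo
{
    public static void Main()
    {
        var calls = new Stack<string>();
        calls.Push("Main");            // Main starts running
        calls.Push("LoadEmployees");   // Main calls LoadEmployees
        calls.Push("ParseRecord");     // LoadEmployees calls ParseRecord

        Console.WriteLine(calls.Pop());   // ParseRecord returns first
        Console.WriteLine(calls.Peek());  // LoadEmployees is exposed again
        Console.WriteLine(calls.Count);   // 2
    }
}
```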

  9. Hashtables • Problem: We often don’t know the “position” of an element within an array • Potentially we process all elements before finding the one we need • Reduce the O-time to O(1) • Build an array capable of holding all SS#’s • Each element would hold a record based on the SS# as a “key” • Waste • 10^9 possible values, but you only have 1,000 employees • Utilization would be 0.0001% of the array • Hashing allows us to “compress” this ordinal indexing

  10. Hashtables • Use the last 4 digits (or 3, or 5) of the SS# • Mathematical transformation (mapping) of a nine-digit value to a four-digit value • Array ranges from 0000 to 9999 • Constant lookup time (O-time) • Better utilization of space • Hash table • Array which uses hashing to compress the indexers • Hash function • Function which performs the hashing

  11. Hashing • H(x) = last four digits of x • Collisions • When multiple inputs to a hash function result in identical outputs • 10^5 collisions for SS#’s ending in “0000” • Collision of hash value results in attempting to store into a “slot” already occupied by a prior hash result
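As a small sketch of H(x), assuming the SS# is held as a nine-digit integer (the class and method names are hypothetical, and the two sample numbers are made up):

```csharp
using System;

public static class SsnHash
{
    // H(x) = last four digits of x
    public static int LastFourDigits(int ssn)
    {
        return ssn % 10000;
    }

    public static void Main()
    {
        Console.WriteLine(LastFourDigits(123456789));  // 6789
        Console.WriteLine(LastFourDigits(987656789));  // 6789, a collision

        // 10^9 possible SS#'s spread over 10^4 slots: 10^5 inputs per slot.
        Console.WriteLine(1000000000 / 10000);         // 100000
    }
}
```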

  12. Collision avoidance / resolution • Collision frequency is directly correlated to the hash function • SS# assumes that the last four digits are uniformly distributed • If year of birth, or geographical location alters the distribution • Increases collisions • Collision avoidance is the selection of an appropriate hashing algorithm • Collision resolution is locating another slot in the hashtable for entry placement

  13. Collision resolution • Linear probing • If a collision in slot i occurs, proceed to the next available slot (i+1), then i+2 and so on, if required • Alice = 1234, Bob = 1234, Cal = 1237, Danny = 1235, Edward = 1235 • Insert Alice • Insert Bob • Insert Cal • Insert Danny • Insert Edward
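The five inserts can be traced with a minimal linear-probing sketch. The table is assumed to be a plain string array of 10,000 slots (0000 to 9999), and the Insert helper is hypothetical.

```csharp
using System;

public static class LinearProbingDemo
{
    // Stores name in the first empty slot at or after hash; returns that slot.
    public static int Insert(string[] table, int hash, string name)
    {
        for (int probe = 0; probe < table.Length; probe++)
        {
            int slot = (hash + probe) % table.Length;  // i, i+1, i+2, ...
            if (table[slot] == null)
            {
                table[slot] = name;
                return slot;
            }
        }
        throw new InvalidOperationException("Table is full.");
    }

    public static void Main()
    {
        var table = new string[10000];
        Console.WriteLine(Insert(table, 1234, "Alice"));   // 1234
        Console.WriteLine(Insert(table, 1234, "Bob"));     // 1235 (1234 is taken)
        Console.WriteLine(Insert(table, 1237, "Cal"));     // 1237
        Console.WriteLine(Insert(table, 1235, "Danny"));   // 1236 (1235 is taken)
        Console.WriteLine(Insert(table, 1235, "Edward"));  // 1238 (1235-1237 are taken)
    }
}
```

Edward ends up three slots away from his hash location, which is the kind of pile-up that linear probing tends to produce.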

  14. Collision resolution • Searching • Start at the hash location, and then perform a linear search from there until the value is located • When/if you reach an empty slot, your search value is NOT in that hashtable • Linear probing is not a very good resolution strategy • Leads to clustering of values • Ideally you’d like a uniform distribution of values • Quadratic probing • Slot s is taken • Probe s+1^2, then s-1^2, then s+2^2, then s-2^2, and so on… • Can still lead to clustering
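Both ideas on this slide can be sketched briefly. The Find and QuadraticProbes helpers are hypothetical, the table is pre-filled by hand with the layout that linear probing produces for the five names on the previous slide, and wrapping the quadratic probes around the table edges is an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class ProbingSearchDemo
{
    // Linear-probing search: returns the slot holding name, or -1 if absent.
    public static int Find(string[] table, int hash, string name)
    {
        for (int probe = 0; probe < table.Length; probe++)
        {
            int slot = (hash + probe) % table.Length;
            if (table[slot] == null) return -1;   // empty slot: name is not in the table
            if (table[slot] == name) return slot;
        }
        return -1;
    }

    // First `count` slots tried after slot s is taken: s+1^2, s-1^2, s+2^2, s-2^2, ...
    public static List<int> QuadraticProbes(int s, int count, int tableSize)
    {
        var slots = new List<int>();
        for (int k = 1; slots.Count < count; k++)
        {
            slots.Add(Wrap(s + k * k, tableSize));
            if (slots.Count < count) slots.Add(Wrap(s - k * k, tableSize));
        }
        return slots;
    }

    // Keeps an index inside [0, size), even when value is negative.
    private static int Wrap(int value, int size)
    {
        return ((value % size) + size) % size;
    }

    public static void Main()
    {
        var table = new string[10000];
        table[1234] = "Alice";
        table[1235] = "Bob";
        table[1236] = "Danny";
        table[1237] = "Cal";
        table[1238] = "Edward";

        Console.WriteLine(Find(table, 1235, "Edward"));  // 1238
        Console.WriteLine(Find(table, 1235, "Frank"));   // -1 (slot 1239 is empty)
        Console.WriteLine(string.Join(", ", QuadraticProbes(1234, 4, 10000)));
        // 1235, 1233, 1238, 1230
    }
}
```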

  15. Collision resolution • Rehashing • Used by the .NET Framework Hashtable class • Adding an item to the table • Provide item and unique key to access the item • Item and key can be of any type • Retrieving item • Index the Hashtable by key

  16. Hashtable Code Example

      // Note the use of the ContainsKey() method, which returns a Boolean
      using System;
      using System.Collections;

      public class HashtableDemo
      {
          private static Hashtable employees = new Hashtable();

          public static void Main()
          {
              // Add some values to the Hashtable, indexed by a string key
              employees.Add("111-22-3333", "Scott");
              employees.Add("222-33-4444", "Sam");
              employees.Add("333-44-5555", "Jisun");

              // Access a particular key
              if (employees.ContainsKey("111-22-3333"))
              {
                  string empName = (string) employees["111-22-3333"];
                  Console.WriteLine("Employee 111-22-3333's name is: " + empName);
              }
              else
                  Console.WriteLine("Employee 111-22-3333 is not in the hash table...");
          }
      }

  17. Hashtable Code Example

      // Step through all items in the Hashtable
      foreach (string key in employees.Keys)
          Console.WriteLine("Value at employees[\"" + key + "\"] = " + employees[key].ToString());

      • The order of insertion and order of keys are not necessarily the same • Depends on the slot the key was stored in • Depends on the hash value of the key • Depends on the collision resolution used • The output from the above code results in:

      Value at employees["333-44-5555"] = Jisun
      Value at employees["111-22-3333"] = Scott
      Value at employees["222-33-4444"] = Sam

  18. Hashtable Class: Hash Function • Function returns an ordinal value • Slot # for the key • Function can accept a key of any type • GetHashCode() • Any object can be represented as a number (equal objects yield equal numbers, but different objects may share the same number)
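A rough sketch of turning GetHashCode() into a slot number. The SlotFor helper and the slot count of 11 are assumptions for illustration; the sign bit is masked off so the remainder is never negative.

```csharp
using System;

public static class SlotDemo
{
    // Maps any object's hash code to a slot in [0, slotCount).
    public static int SlotFor(object key, int slotCount)
    {
        int nonNegative = key.GetHashCode() & 0x7FFFFFFF;
        return nonNegative % slotCount;
    }

    public static void Main()
    {
        int slotCount = 11;

        // String hash codes can differ between runtimes, so no value is shown here.
        Console.WriteLine(SlotFor("111-22-3333", slotCount));

        // An int's hash code is the int itself.
        Console.WriteLine(SlotFor(42, slotCount));  // 9
        Console.WriteLine(SlotFor(53, slotCount));  // 9, a different key in the same slot
    }
}
```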

  19. Hashtable Class: Collision Resolution • Rehashing (double hashing) • Set of hash functions H1… Hn • H1 is initially used • If collision, then H2 is used, and so on • They differ by multiplicative factors • Each slot in the hash table is visited exactly once when hashsize number of probes are made • For a given key, Hi and Hj cannot hash to the same slot in the table • This works if the result of (1 + (((GetHash(key) >> 5) + 1) % (hashsize - 1))) and hashsize are “relatively prime” • They share no common factors • Guaranteed to be relatively prime if hashsize is a prime number • Better collision avoidance than linear or quadratic probing
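The "every slot visited exactly once" property can be checked with a small sketch. The increment is the expression from the slide; combining it as (hash + k * increment) % hashsize is an assumption about how the k-th probe is formed, and ProbeSequence is a hypothetical helper, not the Hashtable source.

```csharp
using System;
using System.Collections.Generic;

public static class DoubleHashingDemo
{
    // Slots probed for k = 0 .. hashsize-1, assuming H_k = (hash + k * increment) % hashsize.
    public static int[] ProbeSequence(int hash, int hashsize)
    {
        long seed = hash & 0x7FFFFFFF;  // drop the sign bit
        long increment = 1 + (((seed >> 5) + 1) % (hashsize - 1));
        var slots = new int[hashsize];
        for (int k = 0; k < hashsize; k++)
            slots[k] = (int)((seed + k * increment) % hashsize);
        return slots;
    }

    public static void Main()
    {
        // hashsize 11 is prime, so the increment (1..10) shares no factor with it.
        int[] slots = ProbeSequence(1234, 11);
        Console.WriteLine(string.Join(", ", slots));
        // 2, 1, 0, 10, 9, 8, 7, 6, 5, 4, 3

        Console.WriteLine(new HashSet<int>(slots).Count);  // 11 distinct slots
    }
}
```

With a non-prime hashsize the increment could share a factor with it, and the sequence would then revisit some slots while skipping others.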

  20. Hashing: Load Factors • Hashtable class • Property: loadFactor • Max ratio of items in the Hash to the total slots in the table • A value of 0.5 means at most half the slots can be used, and the other half must remain empty • Values range from 0.1 to 1.0 • Microsoft has a default “scaling factor” of 72% • If you pass 1.0 as the loadFactor, it’s still only 0.72 behind the scenes • Performance issue

  21. Hashing • Hashtable class • Add() method • Performs a check against the loadFactor • If exceeded, the Hashtable is expanded • Expansion • Slot count is approximately doubled • From the current prime number to the next largest prime number value • Hash value depends on the number of total slots • All values in the table need to be rehashed when the table expands • Occurs behind the scenes during Add() method

  22. Hashtable • loadFactor • Affects the size of the hash table and number of probes required on a collision • High load factor • Denser hash table, but more collisions • Expected number of probes needed when a collision happens • 1/(1-loadFactor) • Default 0.72 loadFactor results in about 3.6 probes per collision on average (1/0.28 ≈ 3.57) • Does not vary based on number of items in the table • Asymptotic access time is O(1) • Much more desirable than the O(n) search time for an array
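The probe estimate is easy to tabulate. A minimal sketch, with ExpectedProbes as a hypothetical helper that simply evaluates 1/(1-loadFactor):

```csharp
using System;
using System.Globalization;

public static class LoadFactorDemo
{
    // Expected probes per collision for a given load factor (0 <= loadFactor < 1).
    public static double ExpectedProbes(double loadFactor)
    {
        return 1.0 / (1.0 - loadFactor);
    }

    public static void Main()
    {
        foreach (double loadFactor in new[] { 0.5, 0.72, 0.9 })
        {
            string probes = ExpectedProbes(loadFactor).ToString("F2", CultureInfo.InvariantCulture);
            Console.WriteLine(loadFactor.ToString(CultureInfo.InvariantCulture) + " -> " + probes);
        }
        // 0.5 -> 2.00
        // 0.72 -> 3.57
        // 0.9 -> 10.00
    }
}
```

The result depends only on how full the table is, not on how many items it holds, which is why lookups stay O(1).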

  23. Dictionary Class • Hashtable is “loosely-typed” structure • Developer can add keys and values of any type to the table • Generics allow us to have type-safe implementations of a class • Dictionary class is a “type-safe” class • Types the keys and the values • You must specify the types for keys/values when creating the Dictionary instance • Once created, you can add and remove items, just like the Hashtable
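For comparison with the Hashtable example, here is a sketch of the same employee data in a Dictionary. The class name DictionaryDemo and the BuildEmployees helper are hypothetical; the point is that the key and value types are fixed when the instance is created, so no cast is needed on retrieval.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryDemo
{
    // Hypothetical helper that fills a type-safe dictionary: string keys, string values.
    public static Dictionary<string, string> BuildEmployees()
    {
        var employees = new Dictionary<string, string>();
        employees.Add("111-22-3333", "Scott");
        employees.Add("222-33-4444", "Sam");
        employees.Add("333-44-5555", "Jisun");
        return employees;
    }

    public static void Main()
    {
        Dictionary<string, string> employees = BuildEmployees();

        string empName;
        if (employees.TryGetValue("111-22-3333", out empName))
            Console.WriteLine("Employee 111-22-3333's name is: " + empName);
        else
            Console.WriteLine("Employee 111-22-3333 is not in the dictionary...");

        employees.Remove("222-33-4444");
        Console.WriteLine(employees.Count);  // 2

        // employees.Add(12345, "Pat");  // would not compile: the key must be a string
    }
}
```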

  24. Dictionary class • Collision resolution • Different from the Hashtable • Chaining is used • A secondary data structure is used for the collisions • Each slot (bucket) in the Dictionary contains a list of elements • A collision prepends the element to the bucket’s list

  25. Dictionary class • 8 buckets (example) • Employee object is added to the bucket that its key hashes to • If already occupied, item is prepended • Searching and removing items from a chained hashtable • Time proportional to total items and number of buckets • O(n/m) • n = total elements • m = total buckets • The Dictionary class is implemented so that n = m at all times, which makes O(n/m) effectively constant
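Chaining with 8 buckets can be sketched as below. ChainedTable is a hypothetical teaching class, not the Dictionary source: it uses a linked list per bucket, does not check for duplicate keys, and never resizes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical chained hash table for illustration only.
public class ChainedTable
{
    private readonly LinkedList<KeyValuePair<string, string>>[] buckets;

    public ChainedTable(int bucketCount)
    {
        buckets = new LinkedList<KeyValuePair<string, string>>[bucketCount];
        for (int i = 0; i < bucketCount; i++)
            buckets[i] = new LinkedList<KeyValuePair<string, string>>();
    }

    private int BucketIndex(string key)
    {
        return (key.GetHashCode() & 0x7FFFFFFF) % buckets.Length;
    }

    // On a collision the new item is simply prepended to the bucket's list.
    public void Add(string key, string value)
    {
        buckets[BucketIndex(key)].AddFirst(new KeyValuePair<string, string>(key, value));
    }

    // Only the one bucket the key hashes to is searched: about n/m items on average.
    public bool TryGet(string key, out string value)
    {
        foreach (KeyValuePair<string, string> pair in buckets[BucketIndex(key)])
        {
            if (pair.Key == key)
            {
                value = pair.Value;
                return true;
            }
        }
        value = null;
        return false;
    }
}

public static class ChainedTableDemo
{
    public static void Main()
    {
        var employees = new ChainedTable(8);
        employees.Add("111-22-3333", "Scott");
        employees.Add("222-33-4444", "Sam");
        employees.Add("333-44-5555", "Jisun");

        string name;
        Console.WriteLine(employees.TryGet("222-33-4444", out name) ? name : "not found");  // Sam
        Console.WriteLine(employees.TryGet("999-99-9999", out name) ? name : "not found");  // not found
    }
}
```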
