Data Structures Specification and Implementation



CSE 5350/7350

Introduction to Algorithms

Data Structures Specification and Implementation

Textbook readings:

Cormen: Part III, Chapters 10-14

Mihaela Iridon, Ph.D.

Data Structures

  • Understand what dynamic sets are
  • Learn basic techniques for
    • Representing and
    • Manipulating finite dynamic sets
  • Elementary Data Structures
    • Stacks, queues, heaps, linked lists
  • More Complex Data Structures
    • Hash tables, binary search trees
  • Data Structures in C#.NET 2.0

High-Level Structure (1)
  • Arrays
    • System.Collections.ArrayList
    • System.Collections.Generic.List<T>
  • Queue
    • System.Collections.Generic.Queue<T>
  • Stack
    • System.Collections.Generic.Stack<T>

High-Level Structure (2)
  • Hashtable
    • System.Collections.Hashtable
    • System.Collections.Generic.Dictionary<TKey, TValue>
  • Trees
    • Binary Trees, BST, Self-Balancing BST
  • Linked Lists
    • System.Collections.Generic.LinkedList<T>
  • Graphs

Dynamic Data Sets


Why dynamic

General examples

Data structures and the .NET framework

“An Extensive Examination of Data Structures Using C# 2.0” – Scott Mitchell

Data Structure Design

Impact on efficiency/running time

The data structure used by an algorithm can greatly affect the algorithm's performance

It is important to have a rigorous method by which to compare the efficiency of various data structures

Example: file extension search

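// Requires: using System; using System.IO;
public bool DoesExtensionExist(string[] fileNames, string extension)
{
    for (int i = 0; i < fileNames.Length; i++)
    {
        // Compare each file's extension with the target, ignoring case
        if (String.Compare(Path.GetExtension(fileNames[i]), extension, true) == 0)
            return true;
    }

    return false; // If we reach here, we didn't find the extension
}

The search runs in O(n) time, where n is the number of file names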

The Array



Direct Access


Most widely used

The Array (2)

The contents of an array are stored in contiguous memory.

All of the elements of an array must be of the same type or of a derived type; hence arrays are referred to as homogeneous data structures.

Array elements can be directly accessed. With arrays if you know you want to access the ith element, you can simply use one line of code: arrayName[i].

Array Operations
  • Allocation
  • Accessing
    • Declaring an array in C#:

string[] myArray;

(initially myArray reference is null)

    • Creating an array in C#:

myArray = new string[5];

Array Allocation
    • string[] myArray = new string[someIntegerSize];
  • This allocates a contiguous block of memory on the heap (CLR-managed)

Array Accessing
  • Accessing an element at index i: O(1)
  • Searching through an array
    • Unsorted: O(n) (linear search)
    • Sorted: O(log n) (binary search)
  • Array class: static method:
    • Array.BinarySearch(Array input, object val)
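
As a minimal sketch of the sorted-array search listed above, the helper below wraps Array.BinarySearch. The class name SortedSearchDemo, the method name IndexOfSorted and the -1 convention are illustrative assumptions, not part of the slides. The array must already be sorted, and Array.BinarySearch returns a negative number when the value is absent.

```csharp
using System;

public static class SortedSearchDemo
{
    // Returns the index of value in an already sorted array, or -1 if absent.
    public static int IndexOfSorted(string[] sortedNames, string value)
    {
        // O(log n): each comparison halves the remaining search range.
        int index = Array.BinarySearch(sortedNames, value);
        return index >= 0 ? index : -1;
    }
}
```

For example, calling IndexOfSorted on the sorted array { "Ann", "Bob", "Cid", "Dee" } with "Cid" finds index 2.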

Array Resizing
  • When the size needs to change:
    • Must create a new array instance
    • Copy old array into new array:

Array1.CopyTo(Array2, 0)

  • Time consuming
  • Also, inserting into an array is problematic
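
The resizing steps above can be sketched as follows. The class name ArrayResizeDemo and the method Grow are hypothetical. The sketch only illustrates why resizing costs O(n): every existing element is copied into the new array.

```csharp
using System;

public static class ArrayResizeDemo
{
    // Returns a new array of newSize that holds the old contents.
    public static string[] Grow(string[] oldArray, int newSize)
    {
        if (newSize < oldArray.Length)
            throw new ArgumentException("newSize must not be smaller than the current length");

        string[] newArray = new string[newSize]; // step 1: create a new instance
        oldArray.CopyTo(newArray, 0);            // step 2: copy old elements, starting at index 0
        return newArray;
    }
}
```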

Multi-Dimensional Arrays
  • Rectangular
    • n x n
    • n x n x n x …
    • Accessing: O(1)
    • Searching: O(n^k) for a k-dimensional array
  • Jagged/Ragged
    • n1 x n2 x n3 x …

Example: payroll application

System.Collections.ArrayList

Can hold any data type: (hybrid)

Internally: array object

Automatic resizing

Not type safe: items must be cast, so type errors are detected only at run time

Boxing/unboxing: an extra level of indirection that affects performance

Loose homogeneity

Generics

  • Remedy for Typing and Performance
  • Type-safe collections
  • Reusability
  • Example:

public class MyTypeSafeList<T>
{
    T[] innerArray = new T[0];
}


  • Homogeneous
  • Self-Re-dimensioning Array
  • System.Collections.Generic.List

List<string> studentNames = new List<string>();


string name = studentNames[3];

studentNames[2] = "Mike";

List Methods
  • Contains()
  • IndexOf()
  • BinarySearch()
  • Find()
  • FindAll()
  • Sort()
    • Asymptotic Running Time: same as array but with extra overhead

Ordered Requests Processing

First-come, First-serve (FIFO)

Priority-based processing

Inefficient to use List<T>

The list will continue to grow (internally, its capacity is doubled each time it fills up)

Solution: circular list/array

Problem: initial size??
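
A minimal sketch of the circular-array idea, assuming a fixed capacity chosen up front (which is exactly the "initial size" problem noted above). The class CircularQueue<T> and its members are hypothetical and are not the implementation of the .NET Queue<T> class.

```csharp
using System;

// Hypothetical fixed-capacity FIFO queue backed by a circular array.
public class CircularQueue<T>
{
    private readonly T[] items;
    private int head;   // index of the oldest element
    private int count;  // number of stored elements

    public CircularQueue(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException("capacity");
        items = new T[capacity];
    }

    public int Count { get { return count; } }

    public void Enqueue(T item)
    {
        if (count == items.Length)
            throw new InvalidOperationException("Queue is full");
        int tail = (head + count) % items.Length; // wrap around the end of the array
        items[tail] = item;
        count++;
    }

    public T Dequeue()
    {
        if (count == 0)
            throw new InvalidOperationException("Queue is empty");
        T item = items[head];
        items[head] = default(T);          // release the reference
        head = (head + 1) % items.Length;  // advance, wrapping around
        count--;
        return item;
    }
}
```

Slots freed by Dequeue are reused by later Enqueue calls, so the array does not grow while requests are processed in first-come, first-served order.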

Queue

  • System.Collections.Generic.Queue<T>
  • Operations:
    • Enqueue()
    • Dequeue()
    • Contains()
    • ToArray()
    • Peek()
  • Does not allow random access
  • Type-safe; maximizes space utilization

Queue (continued)
  • Applications:
    • Web servers
    • Print queues
  • Rate of growth:
    • Specified in the constructor
    • Default: double initial size

Stack

  • LIFO (last in, first out)
  • System.Collections.Generic.Stack<T>
  • Operations:
    • Push()
    • Pop()
  • Doubles in size when more space is needed
  • Applications:
    • CLR call stack (function invocation)

Limitations of Ordinal Indexing
  • Ideal access time: O(1)
  • If index is unknown
    • O(n) if not sorted
    • O(log n) if sorted
  • Example: SSN: 10^9 possible combinations
  • Solution: compress the ordinal indexing domain with a hash function; e.g. use only 4 digits

Hash Table
  • Hashing:
    • Mathematical transformation of one representation into another
  • Hash table:
    • An array that uses hashing to compress the index space
  • Cryptography (information security)
  • Hash function:
    • Non-injective (not a one-to-one function)
    • “Fingerprint” of initial data


  • Fast access of items in large amounts of data
  • As few collisions as possible
    • collision avoidance
  • Avalanche effect:
    • Minor changes to input → major changes to output

Collision Resolution (1)
  • Probability to map to a given location:

1/k (k = size = number of slots)

  • (1) Linear Probing

    • Is H[i] empty?
      • YES: place the item at location i
      • NO: i = i + 1; repeat
    • Deficiency: clustering
    • Access and Insertion: no longer O(1)
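
The probing loop above can be sketched as a small open-addressing set. The class LinearProbingSet, the modulo hash in Home and the wrap-around at the end of the array are illustrative assumptions. Deletion is left out because it would need tombstone markers to keep probe chains intact.

```csharp
using System;

// Hypothetical open-addressing set of int keys using linear probing.
public class LinearProbingSet
{
    private readonly int?[] slots;   // null marks an empty slot

    public LinearProbingSet(int size)
    {
        slots = new int?[size];
    }

    private int Home(int key)
    {
        return (key & 0x7FFFFFFF) % slots.Length; // non-negative start index
    }

    // Stores key and returns the slot index it ended up in.
    public int Add(int key)
    {
        int i = Home(key);
        for (int probes = 0; probes < slots.Length; probes++)
        {
            if (slots[i] == null || slots[i] == key)
            {
                slots[i] = key;              // empty (or same key): place it here
                return i;
            }
            i = (i + 1) % slots.Length;      // occupied: try the next slot
        }
        throw new InvalidOperationException("Table is full");
    }

    public bool Contains(int key)
    {
        int i = Home(key);
        for (int probes = 0; probes < slots.Length; probes++)
        {
            if (slots[i] == null) return false;  // an empty slot ends the probe chain
            if (slots[i] == key) return true;
            i = (i + 1) % slots.Length;
        }
        return false;
    }
}
```

With 7 slots, the keys 3, 10 and 17 all hash to slot 3 and end up in slots 3, 4 and 5. That run of adjacent occupied slots is the clustering deficiency noted above.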

Collision Resolution (2)
  • (2) Quadratic Probing
    • Check s + 1^2
    • Check s - 1^2
    • Check s + 2^2
    • Check s - 2^2
    • Check s +/- i^2
    • Clustering a problem as well

Collision Resolution (3)
  • (3) Rehashing – used by Hashtable (C#)
  • System.Collections.Hashtable
  • Operations:
    • Add(key, item)
    • ContainsKey(key)
    • Keys (property)
    • ContainsValue(value)
    • Values (property)
  • Key, Value: any type → not type safe

Hashtable Data Type – Example

using System;
using System.Collections;

public class HashtableDemo
{
    private static Hashtable employees = new Hashtable();

    public static void Main()
    {
        // Add some values to the Hashtable, indexed by a string key
        employees.Add("111-22-3333", "Scott");
        employees.Add("222-33-4444", "Sam");
        employees.Add("333-44-5555", "Jisun");

        // Access a particular key
        if (employees.ContainsKey("111-22-3333"))
        {
            // Values are stored as object, so a cast is required
            string empName = (string) employees["111-22-3333"];
            Console.WriteLine("Employee 111-22-3333's name is: " + empName);
        }
        else
        {
            Console.WriteLine("Employee 111-22-3333 is not in the hash table...");
        }
    }
}


  • Key = any type
  • Key is transformed into an index via GetHashCode() function
  • Object class defines GetHashCode()
  • H(key) = [GetHash(key) + 1 + (((GetHash(key) >> 5) + 1) % (hashsize - 1))] % hashsize
  • Values = 0 .. hashsize - 1

Collision Resolution (3 – cont’d)
  • Rehashing = double hashing
  • Set of hash functions: H1, H2, …, Hn
  • Hk(key) = [GetHash(key) + k * (1 + (((GetHash(key) >> 5) + 1) % (hashsize - 1)))] % hashsize

  • Hashsize must be PRIME
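
As a sketch of the Hk formula above, the helper below computes the k-th probe position for a given hash code. The names DoubleHashingDemo and Probe are hypothetical, the hash code is assumed to be non-negative, and treating k = 0 as the initial position is an assumption added for illustration.

```csharp
public static class DoubleHashingDemo
{
    // k-th probe position for a non-negative hash code in a table of hashSize slots.
    public static int Probe(int hash, int k, int hashSize)
    {
        // Step size lies in 1 .. hashSize - 1, so it is never zero.
        long step = 1 + (((hash >> 5) + 1) % (hashSize - 1));
        // long arithmetic avoids overflow for large hash codes.
        return (int)((hash + k * step) % hashSize);
    }
}
```

For hash = 100 and hashSize = 11, the step is 5 and the probes for k = 0, 1, 2 are 1, 6, 0. Because the table size is prime, the step shares no factor with it, so successive probes visit every slot before repeating.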


  • Load factor = maximum allowed ratio (# items / # slots)
  • Optimal: 0.72
  • Expanding the hashtable takes 2 (costly) steps:
    • Roughly double the # of slots (current prime → next prime that is about twice as large)
    • Rehash
  • High load factor → dense hashtable
    • Less space
    • More probes on collision: 1/(1 - LF)
    • If LF = 0.72 → expected # probes = 1/(1 - 0.72) ≈ 3.6 → O(1)


  • Costly to expand
  • Set the size in constructor if size is known
  • Asymptotic running times:
    • Access: O(1)
    • Add, Remove: O(1)
    • Search: O(1)

System.Collections.Generic.Dictionary
  • Type-safe
  • Strongly typed keys and values
  • Operations:
    • Add(key, value)
    • ContainsKey(key)
  • Collision Resolution: CHAINING
    • Uses linked lists from an entry where collision occurs

Dictionary Example

// General form:
// Dictionary<keyType, valueType> variableName = new Dictionary<keyType, valueType>();

Dictionary<int, Employee> employeeData = new Dictionary<int, Employee>();

// Add some employees (the key is the SSN)
employeeData.Add(455110189, new Employee("Scott Mitchell"));
employeeData.Add(455110191, new Employee("Jisun Lee"));

// See if employee with SSN 123-45-6789 works here
if (employeeData.ContainsKey(123456789))
{
    // An employee with this SSN exists
}


Chaining in the Dictionary type
  • Efficiency:
    • Add: O(1)
    • Remove: O(n/m)
    • Search: O(n/m)

n = number of elements stored in the hash table

m = number of buckets/slots

  • Implemented such that n ≤ m at all times
    • The total # of chained elements can never exceed the number of buckets
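
A minimal sketch of chaining, assuming int keys, string values and a simple modulo hash. The class ChainedTable is hypothetical and is not how the .NET Dictionary is implemented internally. It only shows why Add is O(1) while a search walks one chain of average length n/m.

```csharp
using System.Collections.Generic;

// Hypothetical chained hash table mapping int keys to string values.
public class ChainedTable
{
    private readonly LinkedList<KeyValuePair<int, string>>[] buckets;

    public ChainedTable(int bucketCount)
    {
        buckets = new LinkedList<KeyValuePair<int, string>>[bucketCount];
    }

    private int BucketIndex(int key)
    {
        return (key & 0x7FFFFFFF) % buckets.Length; // force a non-negative index
    }

    // O(1): prepend to the bucket's chain (no duplicate-key check in this sketch).
    public void Add(int key, string value)
    {
        int b = BucketIndex(key);
        if (buckets[b] == null)
            buckets[b] = new LinkedList<KeyValuePair<int, string>>();
        buckets[b].AddFirst(new KeyValuePair<int, string>(key, value));
    }

    // O(chain length): walk only the chain of the key's bucket.
    public bool TryGet(int key, out string value)
    {
        LinkedList<KeyValuePair<int, string>> chain = buckets[BucketIndex(key)];
        if (chain != null)
        {
            foreach (KeyValuePair<int, string> pair in chain)
            {
                if (pair.Key == key)
                {
                    value = pair.Value;
                    return true;
                }
            }
        }
        value = null;
        return false;
    }
}
```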


  • Tree = a set of linked nodes in which no cycle exists
  • In graph theory (GT): a connected acyclic graph
  • Nodes:
    • Root
    • Leaf
    • Internal
  • |E| = ?
  • Forest = { trees }

Popular Tree-Type Data Structures
  • BST: Binary Search Tree
  • Heap
  • Self-balancing binary search trees
    • AVL
    • Red-black
  • Radix tree

Binary Trees
  • Code example for defining a tree data object
  • Tree Traversal
    • In-order: Left, Root, Right
    • Pre-order: Root, Left, Right
    • Post-order: Left, Right, Root
    • Running time: Θ(n)
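
The slide's own code example is not in the transcript, so the following is a stand-in sketch of a tree node and the three traversals. The names TreeNode and Traversals are hypothetical. Each traversal visits every node exactly once, which gives the Θ(n) running time.

```csharp
using System.Collections.Generic;

// Hypothetical binary tree node holding an int value.
public class TreeNode
{
    public int Value;
    public TreeNode Left;
    public TreeNode Right;

    public TreeNode(int value, TreeNode left, TreeNode right)
    {
        Value = value;
        Left = left;
        Right = right;
    }
}

public static class Traversals
{
    // Left subtree, then root, then right subtree.
    public static void InOrder(TreeNode node, List<int> output)
    {
        if (node == null) return;
        InOrder(node.Left, output);
        output.Add(node.Value);
        InOrder(node.Right, output);
    }

    // Root, then left subtree, then right subtree.
    public static void PreOrder(TreeNode node, List<int> output)
    {
        if (node == null) return;
        output.Add(node.Value);
        PreOrder(node.Left, output);
        PreOrder(node.Right, output);
    }

    // Left subtree, then right subtree, then root.
    public static void PostOrder(TreeNode node, List<int> output)
    {
        if (node == null) return;
        PostOrder(node.Left, output);
        PostOrder(node.Right, output);
        output.Add(node.Value);
    }
}
```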

Tree Operations
  • Search: Recursive: O(h)
    • h = height of the tree
  • Max & Min Search: search right/left
  • Successor & Predecessor Search
  • Insertion (easy: always add a new leaf) & Deletion (more complicated as it may cause the tree structure to change)
  • Running time:
    • function of the tree topology

Binary Search Tree
  • Improves the search time (and lookup time) over the binary tree in general
  • BST property:
    • for any node n, every descendant node's value in the left subtree of n is less than the value of n, and every descendant node's value in the right subtree is greater than the value of n
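
A minimal sketch of how the BST property drives insertion and search, assuming int values and ignoring duplicates. The names BstNode and BinarySearchTree are hypothetical. Both operations follow a single root-to-leaf path, so they take O(h) time for a tree of height h.

```csharp
// Hypothetical BST node holding an int value.
public class BstNode
{
    public int Value;
    public BstNode Left;
    public BstNode Right;

    public BstNode(int value)
    {
        Value = value;
    }
}

public class BinarySearchTree
{
    private BstNode root;

    // Always adds the new value as a leaf; duplicates are ignored.
    public void Insert(int value)
    {
        if (root == null)
        {
            root = new BstNode(value);
            return;
        }
        BstNode current = root;
        while (true)
        {
            if (value < current.Value)
            {
                if (current.Left == null) { current.Left = new BstNode(value); return; }
                current = current.Left;    // smaller values live in the left subtree
            }
            else if (value > current.Value)
            {
                if (current.Right == null) { current.Right = new BstNode(value); return; }
                current = current.Right;   // larger values live in the right subtree
            }
            else
            {
                return;                    // value already present
            }
        }
    }

    public bool Contains(int value)
    {
        BstNode current = root;
        while (current != null)
        {
            if (value == current.Value) return true;
            current = value < current.Value ? current.Left : current.Right;
        }
        return false;
    }
}
```

Inserting values in sorted order (1, 2, 3, ...) makes this tree degenerate into a chain, which is the linear search time case.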

Non-BST vs BST
  • Non-BST
  • BST

Linear Search Time in BST

The search time for a BST depends upon its topology.

BST (continued)
  • Perfectly balanced BST:
    • Search: O(log n) [height = log n]
      • Sub-linear search running time
  • Balanced Binary Tree:
    • Exhibits a good ratio of breadth to depth
  • Self-balancing trees

The Heap
  • Specialized tree-based data structure that satisfies the heap property: if B is a child node of A, then key(A) ≥ key(B). [max-heap]
  • Operations:
    • delete-max or delete-min: removing the root node of a max- or min-heap, respectively
    • increase-key or decrease-key: updating a key within a max- or min-heap, respectively
    • insert: adding a new key to the heap
    • merge: joining two heaps to form a valid new heap containing all the elements of both
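
As a sketch of the insert and delete-max operations, the class below stores a binary max-heap in a List<int>. The array layout (children of index i at 2i + 1 and 2i + 2) is an assumed implementation choice, and the name MaxHeap is hypothetical. The slides do not prescribe a representation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical array-backed binary max-heap of int keys.
public class MaxHeap
{
    private readonly List<int> keys = new List<int>();

    public int Count { get { return keys.Count; } }

    // Append the key, then sift it up while it is larger than its parent.
    public void Insert(int key)
    {
        keys.Add(key);
        int i = keys.Count - 1;
        while (i > 0)
        {
            int parent = (i - 1) / 2;
            if (keys[parent] >= keys[i]) break;   // heap property holds
            Swap(parent, i);
            i = parent;
        }
    }

    // Remove the root, move the last key to the root, then sift it down.
    public int DeleteMax()
    {
        if (keys.Count == 0)
            throw new InvalidOperationException("Heap is empty");
        int max = keys[0];
        int last = keys.Count - 1;
        keys[0] = keys[last];
        keys.RemoveAt(last);
        int i = 0;
        while (true)
        {
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            int largest = i;
            if (left < keys.Count && keys[left] > keys[largest]) largest = left;
            if (right < keys.Count && keys[right] > keys[largest]) largest = right;
            if (largest == i) break;              // both children are smaller
            Swap(i, largest);
            i = largest;
        }
        return max;
    }

    private void Swap(int a, int b)
    {
        int tmp = keys[a];
        keys[a] = keys[b];
        keys[b] = tmp;
    }
}
```

Both operations move a key along one root-to-leaf path, so each takes O(log n) time in this layout.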

Max Heap Example

Example of max-heap:

Linked Lists
  • No resizing necessary
  • Search: O(n)
  • Insertion
    • O(1) if unsorted
    • O(n) if sorted
  • Access: O(n)
  • System.Collections.Generic.LinkedList<T>
    • Doubly-linked; type safe (values are typed via generics)
    • Element: LinkedListNode
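
A short usage sketch of LinkedList<T> and LinkedListNode<T>. The class LinkedListDemo, the method BuildAndJoin and the sample names are illustrative assumptions.

```csharp
using System.Collections.Generic;

public static class LinkedListDemo
{
    // Builds a small list and returns its items joined by commas.
    public static string BuildAndJoin()
    {
        LinkedList<string> names = new LinkedList<string>();
        names.AddLast("Bob");
        names.AddFirst("Ann");                            // O(1) insertion at the head
        LinkedListNode<string> bob = names.Find("Bob");   // O(n) search for the node
        names.AddAfter(bob, "Cid");                       // O(1) once the node is known
        return string.Join(",", new List<string>(names).ToArray());
    }
}
```

The list never needs resizing: each insertion allocates one node and relinks its neighbors.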

Skip List

Linked list with a self-balancing, BST-like property

The elements are sorted

Height = log n

Problems with insert & delete

Solution: randomized distribution

Overall: O(log n)

Worst case: O(n), but with a very slim chance of reaching the worst case

Skip List Examples

Graph

  • A collection of interconnected nodes
  • A graph or undirected graph G is an ordered pair G := (V, E) that is subject to the following conditions:
        • V is a set, whose elements are called vertices or nodes,
        • E is a set of pairs (unordered) of distinct vertices, called edges or lines.
  • Edges:
    • Directed or undirected
    • Weighted or unweighted

Graph (cont’d)
  • Sparse: |E| << |Emax|, i.e. |E| is much smaller than n^2 (n = number of vertices)
  • Representation:
    • Adjacency List
    • Adjacency Matrix
    • (Packed Edge List)
  • Problems applicable to graphs:
    • Minimum spanning tree (Kruskal, Prim)
    • Shortest Path (Dijkstra)
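
A minimal sketch of the adjacency-list representation for an undirected, unweighted graph with string vertex labels. The class UndirectedGraph and its members are hypothetical. For sparse graphs this layout stores O(|V| + |E|) entries, whereas an adjacency matrix always needs |V|^2 cells.

```csharp
using System.Collections.Generic;

// Hypothetical undirected graph stored as adjacency lists.
public class UndirectedGraph
{
    private readonly Dictionary<string, List<string>> adjacency =
        new Dictionary<string, List<string>>();

    // Records the edge in both directions because the graph is undirected.
    public void AddEdge(string u, string v)
    {
        Neighbors(u).Add(v);
        Neighbors(v).Add(u);
    }

    // Returns the neighbor list of a vertex, creating an empty one if needed.
    public List<string> Neighbors(string vertex)
    {
        List<string> list;
        if (!adjacency.TryGetValue(vertex, out list))
        {
            list = new List<string>();
            adjacency[vertex] = list;
        }
        return list;
    }
}
```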

Distance Graph Example

Graph Representation

Minimum Spanning Tree

Spanning tree of a connected, undirected graph = a subset of the edges that connects all the nodes and does not introduce a cycle

Minimum spanning tree = a spanning tree whose total edge weight is as small as possible

Kruskal’s Algorithm

Prim’s Algorithm
