CS 2110 Final Review


Topics
• Will be covered
• Important Programming Concepts
• Types, Recursion/Induction, Asymptotic Complexity
• The Toolbox (Data Structures)
• Arrays, Linked Lists, Trees (BSTs, Heaps), Hashtables
• Practical Things
• Searching, Sorting, Abstract Data Types
• Graphs
Topics
• Sort of Important
• Don’t have a panic attack over these topics!
• GUIs
• Just do the problems on old prelims/finals
• Software engineering
• Don’t write horrible code on coding questions
• But they are both important in the real world!
Topics
• Will Not Be Covered
• Lectures 24 and above
• Java Virtual Machine
• Distributed Systems and Cloud Computing
• Balancing Trees (e.g.: AVL trees)
• But do know what a balanced and unbalanced tree is
• Recurrences
• Network Flow
Typing
• Primitive Types
• boolean, int, double, etc…
• Test equality with == and !=
• Compare with <, <=, >, and >=
Pass by Value

void f(int x) {
    x--;
}

int x = 10;
f(x);
// x == 10

(Stack diagram: f's copy of x is decremented from 10 to 9; main's x stays 10.)

Typing
• Reference types
• Actual object is stored elsewhere
• Variable contains a reference to the object
• == tests equality for the reference
• equals() tests equality for the object
• x == y implies that x.equals(y)
• x.equals(y) does not imply x == y
• How do we compare objects of type T?
• Implement the Comparable<T> interface
Pass by Reference

void f(ArrayList<Integer> l) {
    l.add(2);
    l = new ArrayList<Integer>();  // rebinds only f's local reference
}

ArrayList<Integer> l = new ArrayList<Integer>();
l.add(1);
f(l);
// l contains 1, 2

(Diagram: the shared list goes {} → {1} → {1,2}. The empty list created inside f is discarded: Java passes the reference itself by value, so the reassignment inside f never affects the caller's l.)

Typing
• We know that type B can implement/extend A
• B is a subtype of A; A is a supertype of B
• The real type of the object is its dynamic type
• This type is known only at run-time
• Object can act like a supertype of its dynamic type
• Cannot act like a subtype of its dynamic type
• Variables/arguments of type A accept any subtype of A
• A is a supertype of the static type
• Static type is a supertype of the dynamic type
Typing
• Static type is compile-time, determined by code
• Dynamic type might be a subtype of the static type
• Casting temporarily changes the static type
• Upcasts are always safe
• Always cast to a supertype of the dynamic type
• Downcasts may not be safe
• Can downcast to a supertype of the dynamic type
• Cannot downcast to a subtype of the dynamic type
Typing
• If B extends A, and B and A both have function foo()
• Which foo() gets called?
• Answer depends on the dynamic type
• If the dynamic type is B, B’s foo() will be called
• Even if foo() is invoked inside a function of A
• Exception: static functions
• Static functions are not associated with any object
• Thus, they have no dynamic type: calls are resolved by the static type
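A minimal sketch of dynamic dispatch (the class names A and B follow the slide; foo(), bar(), describe(), and the returned strings are illustrative):

class A {
    String foo() { return "A.foo"; }
    String describe() { return "I am " + foo(); }  // foo() resolved by dynamic type
    static String bar() { return "A.bar"; }        // static: no dynamic dispatch
}

class B extends A {
    @Override String foo() { return "B.foo"; }
}

A a = new B();   // static type A, dynamic type B
a.foo();         // "B.foo": the dynamic type's version runs
a.describe();    // "I am B.foo": even when foo() is called inside A's code
a.bar();         // "A.bar": static calls resolve against the static type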
Induction and Recursion
• Recursion
• Basic examples
• Factorial : n! = n(n-1)!
• Combinations Pascal’s triangle
• Recursive structure
• Tree (tree t = root with right/left subtree)
• Depth first search
• Don’t forget base case (in proof, in your code)
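As a quick sketch, the factorial example with its base case written out (assuming n ≥ 0):

// n! = n * (n-1)!, with base case 0! = 1
static int factorial(int n) {
    if (n == 0) return 1;        // base case: without it, recursion never stops
    return n * factorial(n - 1); // recursive case on a smaller input
}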
Induction and Recursion
• Induction
• Can do induction on previous recursive problems
• Algorithm correctness proof (DFS)
• Math equation proof
Induction and Recursion
• Step 1
• Base case
• Step 2
• Suppose n is the variable you’re doing induction on; assume the equation holds when n = k
• Strong induction: assume it holds for all n ≤ k
• Step 3
• Prove that the equation still holds when n = k + 1, using the assumptions from Step 2
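As a worked illustration of the three steps (a standard identity, not one from the slides), proving 1 + 2 + … + n = n(n+1)/2:

\text{Step 1 (base case, } n = 1\text{):} \quad 1 = \tfrac{1(1+1)}{2}
\text{Step 2 (hypothesis):} \quad \textstyle\sum_{i=1}^{k} i = \tfrac{k(k+1)}{2}
\text{Step 3:} \quad \textstyle\sum_{i=1}^{k+1} i = \tfrac{k(k+1)}{2} + (k+1) = \tfrac{(k+1)(k+2)}{2}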
Asymptotic Complexity
• f(n) is O(g(n)) if:
• ∃ (c, n0) such that ∀ n ≥ n0, f(n) ≤ c⋅g(n)
• ∃ - there exists; ∀ - for all
• (c, n0) is called the witness pair
• f(n) is O(g(n)) roughly means f(n) ≤ g(n)
• Big-O notation is a model for running time
• Don’t need to know the computer’s specs
• Models usually work in real life, but not always
Asymptotic Complexity
• Meaning of n0
• We can compare two integers
• How can we compare two functions?
• Answer is which function grows faster
• Fast vs. slow car
• 60-mph car with no headstart
• 40-mph car with a headstart
• n0 is when the fast car overtakes the slow one
• Functions have a dominant term
• 2n^3 + 4n^2 + n + 2: 2n^3 is the dominant term
Asymptotic Complexity
• Meaning of c
• Cannot get a precise measurement
• Famous election recounts (Bush/Gore, Coleman/Franken)
• Algorithm’s speed on a 2 GHz versus a 1 GHz processor
• Hard to measure constant for the dominant term
• c is the fudge factor
• Change the speed by a constant factor
• A 2 GHz processor is at most twice as fast as a 1 GHz one (c = 2)
• 2n^3 is O(n^3)
• Fast and slow car have asymptotically equal speed
Asymptotic Complexity
• (Assume c, d are constants)
• Logarithmic vs. Logarithmic: log(n^c), log(n^d) “equal”
• Difference is a constant factor ( log(n^c) = c·log(n) )
• Logarithm’s base also does not matter
• Logarithmic vs. Polynomial: log n is O(n^c)
• Corollary: O(n log n) better than O(n^2)
• Polynomial vs. Polynomial: n^c is O(n^d) for c ≤ d
• Polynomial vs. Exponential: n^c is O(d^n) for d > 1
• Exponential running time almost always too slow!
Arrays
• Arrays have a fixed capacity
• If more space needed…
• Allocate larger array
• Copy smaller array into larger one
• The entire operation takes O(n) time
• Arrays have random access
• Can read any element in O(1) time
Linked Lists
• Each node has three parts
• Value stored in node
• Next node
• Previous node
• Very easy to grow linked list in both directions
• Downside: sequential access
• O(n) time to access something in the middle
Trees
• Recursive data structure
• A tree is…
• A single node
• Zero or more subtrees below it
• Every node (except the root) has one parent
• Properties of trees must hold for all nodes in the tree
• Each node is the root of some tree
• Makes sense for recursive algorithms
Binary Trees
• Each node can have at most two children
• We usually distinguish between left, right child
• Trees we study in CS 2110 are binary trees
Binary Search Trees
• Used to sort data inside a tree
• For every node with value x:
• Every node in the left subtree has a value < x
• Every node in the right subtree has a value > x
• Binary search tree is not guaranteed to be balanced
• If it is balanced, we can find a node in O(log n) time
Binary Search Trees

(Figures: a completely unbalanced tree that is still a BST; a tree that is not a binary search tree because 8 is in the left subtree of 5; a tree that is not a binary search tree because 3 is the left child of 2.)

Adding to a Binary Search Tree
• Adding 4 to the BST
• Start at root (5)
• 4 < 5
• Go left to 2
• 4 > 2
• Go right to 3
• 4 > 3
• Add 4 as right child of 3
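The same walk as code: a minimal sketch assuming a simple Node class (value, left, and right are illustrative names):

class Node {
    int value;
    Node left, right;
    Node(int v) { value = v; }
}

static Node insert(Node root, int x) {
    if (root == null) return new Node(x);    // empty spot found: add here
    if (x < root.value)
        root.left = insert(root.left, x);    // smaller: go left
    else if (x > root.value)
        root.right = insert(root.right, x);  // larger: go right
    return root;                             // equal: duplicate, ignore
}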
Tree Traversals
• Converts the tree into a list
• Works on any binary tree
• Do not need a binary search tree
• Traverse node and its left and right subtrees
• Subtrees are traversed recursively
• Preorder: node, left subtree, right subtree
• Inorder: left subtree, node, right subtree
• Produces a sorted list for binary search trees
• Postorder: left subtree, right subtree, node
Tree Traversals
• Inorder Traversal
• In(5)
• In(2), 5, In(7)
• 1, 2, In(3), 5, In(7)
• 1, 2, 3, 4, 5, In(7)
• 1, 2, 3, 4, 5, 6, 7, 9
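A traversal sketch using the hypothetical Node class from the BST example; for a binary search tree, inorder appends values in sorted order:

static void inorder(Node n, java.util.List<Integer> out) {
    if (n == null) return;   // empty subtree: base case
    inorder(n.left, out);    // left subtree
    out.add(n.value);        // the node itself
    inorder(n.right, out);   // right subtree
}
// Preorder and postorder just move the out.add(n.value) line
// before or after the two recursive calls.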
Binary Heaps
• Weaker condition on each node
• A node is smaller than its immediate children
• Don’t care about entire subtree
• Don’t care if right child is smaller than left child
• Smallest node guaranteed to be at the root
• No guarantees beyond that
• Guarantee also holds for each subtree
• Heaps grow top-to-bottom, left-to-right
• Shrink in opposite direction
Binary Heaps
• Find lowest unfilled level of tree
• Find leftmost empty spot, add node there
• New node could be smaller than its parent
• Swap it up if it’s smaller
• If swapped, could still be smaller than its new parent
• Keep swapping up until the parent is smaller
Binary Heaps
• Removing from a heap
• Take out element from the top of the heap
• Find rightmost element on lowest level
• Make this the new root
• New root could be larger than one/both children
• Swap it with the smallest child
• Keep swapping down…
Binary Heaps
• Can represent a heap as an array
• Root element is at index 0
• For an element at index i
• Parent is at index (i – 1)/2
• Children are at indices 2i + 1, 2i + 2
• n-element heap takes up first n spots in the array
• New elements grow heap by 1
• Removing an element shrinks heap by 1
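The index arithmetic, plus a bubble-up sketch for a min-heap stored in an array (h and i are illustrative names):

static int parent(int i) { return (i - 1) / 2; }

// after adding a new element at index i, swap it up
// while it is smaller than its parent
static void bubbleUp(int[] h, int i) {
    while (i > 0 && h[i] < h[parent(i)]) {
        int p = parent(i);
        int tmp = h[i]; h[i] = h[p]; h[p] = tmp;
        i = p;
    }
}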
Hashtables
• Motivation
• Sort n numbers between 0 and 2n – 1
• Sorting integers within a certain range
• More specific than comparable objects with unlimited range
• General lower bound of O(n log n) may not apply
• Can be done in O(n) time with counting sort
• Create an array of size 2n
• The ith entry counts all the numbers equal to i
• For each number, increment the correct entry
• Can also find a given number in O(1) time
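A counting-sort sketch for exactly this setting, n numbers in the range 0 to 2n – 1:

static int[] countingSort(int[] a) {
    int[] count = new int[2 * a.length];      // one entry per possible value
    for (int x : a) count[x]++;               // tally each number in O(n)
    int[] out = new int[a.length];
    int k = 0;
    for (int v = 0; v < count.length; v++)
        while (count[v]-- > 0) out[k++] = v;  // write each value out in order
    return out;
}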
Hashtables
• Cannot do this with arbitrary data types
• Integer alone can have over 4 billion possible values
• For a hashtable, create an array of size m
• Hash function maps each object to an array index between 0 and m – 1 (in O(1) time)
• Hash function makes sorting impossible
• Quality of hash function is based on how many elements map to same index in the hashtable
• Need to expect O(1) collisions
Hashtables
• Dealing with collisions
• In counting sort, one array entry contains only elements of the same value
• The hash function can map different objects to the same index of the hashtable
• Chaining
• Each entry of the hashtable is a linked list
• Linear Probing
• If h(x) is taken, try h(x) + 1, h(x) + 2, h(x) + 3, …
• Quadratic probing: h(x) + 1, h(x) + 4, h(x) + 9, …
Hashtables
• Table Size
• If too large, we waste space
• If too small, everything collides with each other
• Probing falls apart if number of items (n) is almost the size of the hashtable (m)
• Typically have a load factor 0 < λ ≤ 1
• Resize table when n/m exceeds λ
• Resizing changes m
• Have to reinsert everything into new hashtable
Hashtables
• Table doubling
• Double the size every time we exceed our load factor
• Worst case is when we just doubled the hashtable
• Consider all prior times we doubled the table
• n + n/2 + n/4 + n/8 + … < 2n
• Insert n items in O(n) time
• Average O(1) time to insert one item
• Some operations take O(n) time
• This also works for growing an ArrayList
Hashtables
• Java, hashCode(), and equals()
• hashCode() assigns an object an integer value
• Java maps this integer to a number between 0 and m – 1
• If x.equals(y), x and y should have the same hashCode()
• Insert an object with one hashCode()
• Won’t find it if you look it up with a different hashCode()
• If you override equals(), you must also override hashCode()
• Different objects can have the same hashCode()
• If this happens too often, we have too many collisions
• Only equals() can determine if they are equal
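A sketch of the contract with a hypothetical Point class:

class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    // must be overridden together with equals():
    // equal Points must produce equal hash codes
    @Override public int hashCode() { return 31 * x + y; }
}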
Searching
• Unsorted lists
• Element could be anywhere, O(n) search time
• Sorted lists – binary search
• Try middle element
• Search left half if too large, right half if too small
• Each step cuts search space in half, O(log n) steps
• Binary search requires random access
• Searching a sorted linked list is still O(n)
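A binary search sketch over a sorted int array (returns the index, or -1 if the target is absent):

static int search(int[] a, int target) {
    int lo = 0, hi = a.length - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;            // try the middle element
        if (a[mid] == target) return mid;
        else if (a[mid] < target) lo = mid + 1;  // too small: search right half
        else hi = mid - 1;                       // too large: search left half
    }
    return -1;   // search space is empty: not found
}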
Sorting
• Many ways to sort
• Same high-level goal, same result
• What’s the difference?
• The algorithm and data structures used dictate running time
• They also dictate space usage
• Each algorithm has its own flavor
Sorting
• Swap operation: swap(x, i, j)
• temp = x[i]
• x[i] = x[j]
• x[j] = temp
• Many sorts are a fancy set of swap instructions
• Modifies the array in place, very space-efficient
• Not space efficient to copy a large array
Insertion Sort
• for (i = 0…n-1)
• Take element at index i, swap it back one spot until…
• you hit the beginning of the list
• previous element is smaller than this one
• Ex.: 4, 7, 8, 6, 2, 9, 1, 5 – swap 6 back twice
• After k iterations, first k elements relatively sorted
• 4 iterations: 4, 6, 7, 8, 2, 9, 1, 5
• O(n2) time, in-place sort
• O(n) for sorted lists, good for nearly sorted lists
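A sketch of the loop described above, written with explicit swaps:

static void insertionSort(int[] a) {
    for (int i = 1; i < a.length; i++) {
        // swap a[i] back until the previous element is smaller
        // (or we hit the beginning of the list)
        for (int j = i; j > 0 && a[j - 1] > a[j]; j--) {
            int tmp = a[j]; a[j] = a[j - 1]; a[j - 1] = tmp;
        }
    }
}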
Selection Sort
• for (i = 0…n-1)
• Scan array from index i to the end
• Swap smallest element into index i
• Ex.: 1, 3, 5, 8, 9, 6, 7 – swap 6, 8
• After k iterations, first k elements absolutely sorted
• 4 iterations: 1, 3, 5, 6, 9, 8, 7
• O(n2) time, in-place sort
• Also O(n2) for sorted lists
Merge Sort
• Copy left half, right half into two smaller arrays
• Recursively run merge sort on each half
• Base case: 1 or 2 element array
• Merge two sorted halves back into original array
• Ex. (1, 3, 6, 9), (2, 5, 7, 8) – (1, 2, 3, 5, 6, 7, 8, 9)
• Running Time: O(n log n)
• Merge takes O(n) time
• Split the list in half about log(n) times
• Also uses O(n) extra space!
Quick Sort
• Randomly pick a pivot
• Partition into two (unequal) halves
• Left partition smaller than pivot, right partition larger
• Recursively run quick sort on both partitions
• Expected Running Time: O(n log n)
• Partition takes O(n) time
• Worst case: pivot is the smallest or largest element (n levels of partitions)
• Expected to split list O(log n) times
• In-place sort: partition can be done in-place
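A quick sort sketch with a random pivot and a simple in-place (Lomuto-style) partition; swap() is the helper from the earlier slide:

static void quickSort(int[] a, int lo, int hi) {
    if (lo >= hi) return;                  // 0 or 1 elements: already sorted
    int p = lo + (int) (Math.random() * (hi - lo + 1));  // random pivot
    swap(a, p, hi);                        // park the pivot at the end
    int split = lo;
    for (int i = lo; i < hi; i++)
        if (a[i] < a[hi]) swap(a, i, split++);  // smaller elements go left
    swap(a, split, hi);                    // pivot lands between the partitions
    quickSort(a, lo, split - 1);
    quickSort(a, split + 1, hi);
}

static void swap(int[] x, int i, int j) {
    int tmp = x[i]; x[i] = x[j]; x[j] = tmp;
}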
Heap Sort
• Use a max-heap (represented as an array)
• Can move items up/down the heap with swaps
• for (i = 0…n-1)
• Add element at index i to the heap
• First pass builds the heap
• for (i = n-1…0)
• Remove largest element, put it in spot i
• O(n log n), in-place sort
Abstract Data Types
• Lists
• Stacks
• LIFO
• Queues
• FIFO
• Sets
• Dictionaries (Maps)
• Priority Queues
• Java API
• E.g.: ArrayList is an ADT list backed by an array
Abstract Data Types
• Priority Queue
• Implemented as heap
• peek() – look at heap root: O(1)
• poll() – heap “delete” op: O(log n)
What is a graph?
• A graph has vertices
• A graph has edges between two vertices
• n – number of vertices; m – number of edges
• Directed vs. undirected graph
• Directed edges can only be traversed one way
• Undirected edges can be traversed both ways
• Weighted vs. unweighted graph
• Edges could have weights/costs assigned to them
What is a graph?
• What makes a graph special?
• Cycles!!!
• What is a graph without a cycle?
• Undirected graphs
• Trees
• Directed graphs
• Directed acyclic graph (DAG)
Topological Sort
• Topological sort is for directed graphs
• Indegree: number of edges entering a vertex
• Outdegree: number of edges leaving a vertex
• Topological sort algorithm
• Delete a vertex with an indegree of 0
• Delete its outgoing edges, too
• Repeat until no vertices have an indegree of 0
Topological Sort
• What can’t topological sort delete?
• Cycles!!!
• Every node in a cycle has an indegree of at least 1
• Need to delete another node in the cycle first
• A graph is a DAG iff topological sort deletes every vertex
• iff - if and only if
Graph Searching
• Works on directed and undirected graphs
• Component – a set of vertices connected by edges
• Component starts as a single vertex
• No edges are selected yet
• Travel edges to add vertices to this component
• From a vertex in the component to a new vertex
Graph Searching
• Implementation details
• Why is choosing a random path risky?
• Cycles!!!
• Could traverse a cycle forever
• Need to keep track of vertices in your component
• No cycles if you do not visit a vertex twice
• Graph search algorithms need:
• A set of vertices in the component (visited)
• A collection of vertices to visit (toVisit)
Graph Searching
• Add the start vertex to toVisit
• Pick a vertex from toVisit
• If the vertex is in visited, do nothing
• Could add same vertex to toVisit twice before visiting it
• If the vertex is not in visited:
• Add vertex (and the edge to it) to visited
• Follow its edges to neighboring vertices
• Add each neighboring vertex to toVisit if it is not in visited
• Repeat until there are no more vertices to visit
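A sketch of this loop with a queue (which makes it BFS); the adjacency-list type List<List<Integer>> is an assumed representation:

import java.util.*;

static Set<Integer> search(List<List<Integer>> adj, int start) {
    Set<Integer> visited = new HashSet<>();
    Queue<Integer> toVisit = new ArrayDeque<>();
    toVisit.add(start);
    while (!toVisit.isEmpty()) {
        int v = toVisit.remove();
        if (!visited.add(v)) continue;   // already visited: do nothing
        for (int w : adj.get(v))         // follow edges to neighbors
            if (!visited.contains(w)) toVisit.add(w);
    }
    return visited;   // the component containing start
}

Swapping the queue for a stack turns this into DFS.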
Graph Searching
• Running time analysis
• Can check if vertex is in visited in O(1) time
• Use an array or HashSet
• Each edge added to toVisit once in a directed graph
• Algorithm has at most m iterations
• Running time determined by ADT used for toVisit
• Stack, queue: O(1) to add, delete; O(m) total
• Priority queue: O(log m) to add, delete; O(m log m) total
Graph Searching
• Depth-first search
• Uses a stack
• Goes as deep as it can before taking a new path
• Breadth-first search
• Uses a queue
• Visits all of a vertex’s neighbors before going deeper
Graph Searching: DFS

(Figure: DFS trace on a five-vertex graph A–E; the edges taken are ∅-A, A-B, B-C, B-E, E-D.)

Graph Searching: BFS

(Figure: BFS trace on the same graph; the edges considered are ∅-A, A-B, B-C, B-E, C-D, E-D.)

Minimum Spanning Tree
• MSTs apply to undirected graphs
• Take only some of the edges in the graph
• Spanning – all vertices connected together
• Tree – no cycles
• For all spanning trees, m = n – 1
• All unweighted spanning trees are MSTs
• Need to find MST for a weighted graph
Minimum Spanning Tree
• Initially, we have no edges selected
• We have n 1-vertex components
• Find edge between two components
• Don’t add edge between vertices in same component
• Creates a cycle
• Pick smallest edge between two components
• This is a greedy strategy, but it somehow works
• If edge weights are distinct, only one possible MST
• Doesn’t matter which algorithm you choose
Minimum Spanning Trees
• Kruskal’s algorithm
• Process edges from least to greatest
• Each edge either
• Connects two vertices in two different components (take it)
• Connects two vertices in the same component (throw it out)
• O(m log m) running time
• O(m log m) to sort the edges
• Need union-find to track components and merge them
• Does not alter running time
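The union-find piece, as a minimal sketch with path compression only (the array and method names are illustrative; initialize parent[v] = v for every vertex):

int[] parent;   // parent[v] == v means v is the root of its component

int find(int v) {
    if (parent[v] != v) parent[v] = find(parent[v]);  // path compression
    return parent[v];
}

// returns false when the edge would close a cycle
boolean union(int a, int b) {
    int ra = find(a), rb = find(b);
    if (ra == rb) return false;   // already in the same component
    parent[ra] = rb;              // merge the two components
    return true;
}

Kruskal’s algorithm takes an edge (u, v) exactly when union(u, v) returns true.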
Minimum Spanning Trees

(Figure: a weighted graph on vertices A–G with edge weights 1, 2, 3, 4, 5, 7, 8, 9, 10, and 12, used in the MST examples.)

Minimum Spanning Trees
• Prim’s algorithm
• Graph search algorithm
• Like BFS, but it uses a priority queue
• Priority is the edge weight
• Size of heap is O(m); running time is O(m log m)
Minimum Spanning Trees

(Figure: Prim’s algorithm trace on the weighted graph above; priority-queue entries pair an edge with its weight: ∅-A:0, A-C:2, C-D:3, C-B:4, A-B:5, B-E:7, D-E:8, E-G:9, G-F:1, D-F:10, D-G:12.)

Shortest Path Algorithm
• For directed and undirected graphs
• Find shortest path between two vertices
• Start vertex is the source
• End vertex is the sink
• Use Dijkstra’s algorithm
• Assumes positive edge weights
• Graph search algorithm using a priority queue
• Weight is now entire path length
• Cost of path to node + cost of edge to next vertex
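A Dijkstra sketch; the adjacency list of {neighbor, weight} int pairs and all names are assumptions:

import java.util.*;

// adj.get(v) holds {neighbor, weight} pairs;
// returns shortest path lengths from source to every vertex
static int[] dijkstra(List<List<int[]>> adj, int source) {
    int[] dist = new int[adj.size()];
    Arrays.fill(dist, Integer.MAX_VALUE);
    dist[source] = 0;
    // priority = whole path length so far, not just one edge's weight
    PriorityQueue<int[]> pq =
        new PriorityQueue<>((x, y) -> Integer.compare(x[1], y[1]));
    pq.add(new int[] { source, 0 });
    while (!pq.isEmpty()) {
        int[] cur = pq.remove();
        int v = cur[0];
        if (cur[1] > dist[v]) continue;   // stale entry: skip it
        for (int[] e : adj.get(v)) {
            int len = dist[v] + e[1];     // cost of path + cost of edge
            if (len < dist[e[0]]) {
                dist[e[0]] = len;
                pq.add(new int[] { e[0], len });
            }
        }
    }
    return dist;
}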
Shortest Path Algorithm

(Figure: Dijkstra’s algorithm trace from source A on the same graph; priority-queue entries pair an edge with the total path length: ∅-A:0, A-C:2, A-B:5, C-D:5, C-B:6, B-E:12, D-E:13, D-F:15, F-G:16, D-G:20, E-G:21.)

Threads
• Splitting the Work
• Less Obvious: Not all threads are active
• Multi-task: Study for CS 2110 while you wait!
• Operating system runs another thread while one waits
• Code – Set of instructions, stored as data
• A paper with instructions does not do anything
• Code by itself does not run or execute
• Multiple people can follow the same instructions
• Multiple threads can run the same code
• Making the Code
• Write a class implementing the Runnable interface
• Or use an anonymous inner class on the fly…
• Code to execute goes in the run() method
• Multiple threads could be constructed with the same target
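A sketch of both options (the class name and printed text are illustrative):

class Worker implements Runnable {
    @Override public void run() {   // the code the thread executes
        System.out.println("hi from " + Thread.currentThread().getName());
    }
}

Runnable target = new Worker();
new Thread(target).start();   // two threads running
new Thread(target).start();   // ...the same code

// anonymous version, on the fly:
new Thread(new Runnable() {
    @Override public void run() { System.out.println("hi"); }
}).start();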
Concurrency
• Two or more active threads
• Hard to predict each thread’s “speed”
• Nondeterministic behavior
• Two active threads, one data structure
• Both perform writes, or one reads while another writes
• No problems if both read data
• E.g.: One heap operation (such as add) takes multiple steps
• Wait until update completes to read/write again
• Data in a bad state until this happens
Concurrency
• Avoiding race conditions (write/write, read/write)
• Imagine a room with a key
• Must acquire the key before entering the room
• Enter room, do many things, leave room, drop off key
• Mutual exclusion, or mutexes
• Only one thread can acquire a mutex
• Other threads wait until the mutex is released
Synchronization
• Could use Java’s Semaphore class with a count of 1
• Primitive, many corner cases like exceptions
• Better: synchronized (x) { /* body */ }
• Thread 1 acquires a lock on object
• x is a reference type; lock acquired on its memory location
• Thread 2 must wait if it wants a lock on x
• Also waits if it wants a lock on y if x == y
• Synchronized functions: synchronized (this)
• Only one synchronized function running per object
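A sketch of one shared counter guarded both ways (the Counter class is hypothetical):

class Counter {
    private int count = 0;
    private final Object lock = new Object();

    public void increment() {
        synchronized (lock) {   // one thread at a time past this point
            count++;            // the read-modify-write is now atomic
        }
    }

    // shortcut: a synchronized method locks on `this`
    public synchronized void decrement() { count--; }
}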
Synchronization
• Only synchronize when you have to
• Get key, enter room, chat with BFF on phone for 1 hour…
• People waiting for key are very angry!