LING 408/508: Computational Techniques for Linguists Lecture 10 9/12/2012
Outline • Selection sort • Optimization problems • Greedy algorithms • Uses of None • List comprehensions
Selection sort • Ideas: • Find the minimum value in the list • Swap it with the value in the first position • Repeat the steps above for the remainder of the list (starting at the second position and advancing each time)
Selection sort: example • Pass 1 • Input: 4 2 1 5 3 • Procedure: find the min value in 4 2 1 5 3 (the whole list); min value is 1, swap with value at index 0, which is 4 • Result: 1 2 4 5 3
Selection sort: example • Pass 2 • Input: 1 2 4 5 3 • Procedure: find the min value in 2 4 5 3 (index 1 onward); min value is 2, which is already at index 1, so the swap leaves the list unchanged • Result: 1 2 4 5 3
Selection sort: example • Pass 3 • Input: 1 2 4 5 3 • Procedure: find the min value in 4 5 3 (index 2 onward); min value is 3, swap with value at index 2, which is 4 • Result: 1 2 3 5 4
Selection sort: example • Pass 4 • Input: 1 2 3 5 4 • Procedure: find the min value in 5 4 (index 3 onward); min value is 4, swap with value at index 3, which is 5 • Result: 1 2 3 4 5 • We are done.
Finding the minimum value
# assumes L is not an empty list
def find_min_val(L):
    min_val = L[0]
    for val in L:
        if val < min_val:
            min_val = val
    return min_val
Find the index of the minimum value
# assumes L is not an empty list
def find_min_val(L):
    min_idx = 0
    min_val = L[min_idx]
    for i in range(len(L)):
        val = L[i]
        if val < min_val:
            min_val = val
            min_idx = i
    return min_val, min_idx
Coding selection sort
• Let i be the index of the value to be swapped
• Find the min value in the rest of the list (inclusive), L[i:]
• Let min_val be the current minimum value and min_idx the index of this value, initialized to L[i] and i respectively
• Loop through L[i+1:], revising min_val and min_idx as smaller values are encountered
# find min val and its index
min_idx = i
min_val = L[i]
for j in range(i+1, len(L)):
    if L[j] < min_val:
        min_val = L[j]
        min_idx = j
# swap with L[i]
L[i], L[min_idx] = L[min_idx], L[i]
Outer loop
• In each pass through the list in selection sort, we swap the smallest value in the rest of the list into position i
• Let N be the length of the input list.
• Need N-1 iterations, since the (N-1)th iteration places the last two items of the list in sorted order
for i in range(len(L)-1):
    <inner loop>
Full code for selection sort
def selection_sort(L):
    print('\napply selection sort to', L)
    for i in range(len(L)-1):
        print('at iteration {0}:'.format(i+1), L)
        # find min val and its index
        min_idx = i
        min_val = L[i]
        for j in range(i+1, len(L)):
            if L[j] < min_val:
                min_val = L[j]
                min_idx = j
        # swap with L[i]
        L[i], L[min_idx] = L[min_idx], L[i]
    print(' end result:', L)
Test it on a few different cases
selection_sort([4,2,1,5,3,0])
selection_sort([1,2,3])
selection_sort([])
Output

apply selection sort to [4, 2, 1, 5, 3, 0]
at iteration 1: [4, 2, 1, 5, 3, 0]
at iteration 2: [0, 2, 1, 5, 3, 4]
at iteration 3: [0, 1, 2, 5, 3, 4]
at iteration 4: [0, 1, 2, 5, 3, 4]
at iteration 5: [0, 1, 2, 3, 5, 4]
 end result: [0, 1, 2, 3, 4, 5]

apply selection sort to [1, 2, 3]
at iteration 1: [1, 2, 3]
at iteration 2: [1, 2, 3]
 end result: [1, 2, 3]

apply selection sort to []
 end result: []
Analysis of selection sort
def selection_sort(L):
    print('\napply selection sort to', L)
    for i in range(len(L)-1):
        print('at iteration {0}:'.format(i+1), L)
        # find min val and its index
        min_idx = i
        min_val = L[i]
        for j in range(i+1, len(L)):
            if L[j] < min_val:
                min_val = L[j]
                min_idx = j
        # swap with L[i]
        L[i], L[min_idx] = L[min_idx], L[i]
    print(' end result:', L)
• Two nested loops with constant-time operations inside.
• The inner loop runs N-1 times on the first pass, then N-2, and so on down to 1, for a total of N(N-1)/2 comparisons.
• Running time is O(N^2).
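The N(N-1)/2 comparison count behind the O(N^2) bound can be checked empirically. The sketch below is a stripped-down variant of the selection sort above, with the printing removed and a counter added; the function name `count_comparisons` is hypothetical and not part of the lecture code.

```python
def count_comparisons(L):
    """Selection-sort a copy of L; return (sorted list, number of comparisons)."""
    L = L[:]                      # work on a copy so the caller's list is untouched
    comparisons = 0
    for i in range(len(L) - 1):
        min_idx = i
        for j in range(i + 1, len(L)):
            comparisons += 1      # one comparison per inner-loop step
            if L[j] < L[min_idx]:
                min_idx = j
        L[i], L[min_idx] = L[min_idx], L[i]
    return L, comparisons

result, n_comp = count_comparisons([4, 2, 1, 5, 3, 0])
print(result, n_comp)   # [0, 1, 2, 3, 4, 5] 15
```

For N = 6 the count is 6*5/2 = 15, regardless of the initial order of the list.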
Faster sorting • Selection sort, like bubble sort and insertion sort, is an O(N^2) algorithm. Can we do better? • We’ll see faster sorting in O(N log N) through recursive algorithms
Optimization problems • Some problem to solve • Finite amount of resources • Optimize some quantity • Minimize resources used • Maximize value under constraints of resources • etc.
Example 1: making change
• You have lots of quarters, dimes, nickels, and pennies.
• What is the least number of coins it takes to produce $1.88 in change? (and how many of each coin?)
• Optimal solution:
  • 7 quarters: $1.75
  • 1 dime: $0.10
  • 0 nickels: $0.00
  • 3 pennies: $0.03
  • Total: $1.88, 11 coins
• A suboptimal solution:
  • 4 quarters: $1.00
  • 3 dimes: $0.30
  • 3 nickels: $0.15
  • 43 pennies: $0.43
  • Total: $1.88, 53 coins
Example 2: knapsack • You have a knapsack that holds up to 100 pounds of items. • You have a set of items that you want to put in the knapsack of weights { 3, 7, 38, 1, 2, 5, 6, 43, 6, 8, 19, 53 }. • Optimization problems: • Minimize the quantity of items in the knapsack, such that the knapsack is completely full. • Maximize the weight of items in the knapsack.
Greedy algorithms • http://en.wikipedia.org/wiki/Greedy_algorithm • Solve the problem through a succession of locally optimal choices, hoping that this strategy will lead to a globally optimal result • Solutions take multiple steps. • Locally optimal: a single step that takes you as close as possible to solving the problem • Globally optimal: best of all possible solutions • A greedy strategy only works for some optimization problems
What is the least number of coins it takes to produce $1.88 in change?
• Step 1:
  • least number of some coin to produce as close to $1.88 as possible in change
  • Try most-valuable coin, the quarter
  • 7 quarters produces $1.75, $0.13 remaining, can’t use any more quarters
• Step 2:
  • least number of some coin to produce as close to $0.13 as possible in change
  • Can’t use any more quarters. Use next most-valuable coin, the dime
  • 1 dime produces $0.10, $0.03 remaining, can’t use any more quarters or dimes
• Step 3:
  • least number of some coin to produce as close to $0.03 as possible in change
  • Can’t use any more dimes. Use next most-valuable coin, the nickel
  • A nickel is worth more than $0.03, so use 0 nickels; $0.03 remaining, can’t use any quarters, dimes, or nickels
• Step 4:
  • least number of some coin to produce as close to $0.03 as possible in change
  • Can’t use any nickels. Use next most-valuable coin, the penny
  • 3 pennies produces $0.03, $0.00 remaining
• DONE: 7 quarters, 1 dime, 3 pennies
Turn problem into subproblem
• The greedy strategy makes the assumption that if a locally optimal choice is made at the current step, the resulting subproblem can also be solved by a greedy strategy.
• Original problem: what is the least number of coins it takes to produce $1.88 in change?
• Stated greedily: least number of some coin it takes to produce as close to $1.88 as possible in change
  • 7 quarters = $1.75
• Subproblem: least number of some coin it takes to produce as close to $0.13 as possible in change
  • 1 dime = $0.10
• Subproblem: least number of some coin it takes to produce as close to $0.03 as possible in change
  • 3 pennies = $0.03
• Subproblem: least number of some coin it takes to produce as close to $0.00 as possible in change
  • Done, no subproblem, have found a solution
Why doesn’t this work?
def minimal_change(amount):
    num_quarters = amount // 0.25
    amount = amount % 0.25
    num_dimes = amount // 0.10
    amount = amount % 0.10
    num_nickels = amount // 0.05
    amount = amount % 0.05
    num_pennies = amount // 0.01
    print('num_quarters:', num_quarters)
    print('num_dimes:', num_dimes)
    print('num_nickels:', num_nickels)
    print('num_pennies:', num_pennies)

minimal_change(1.88)
# Output:
# num_quarters: 7.0
# num_dimes: 1.0
# num_nickels: 0.0
# num_pennies: 2.0
Precision of floating-point numbers
• Most decimal fractions (such as 0.13 or 0.01) cannot be represented exactly as binary floating-point numbers, so arithmetic on them accumulates small errors
>>> 1.88
1.88
>>> 1.75
1.75
>>> 1.88-1.75    # should be 0.13
0.1299999999999999
>>> .03//.01     # should be 3.0
2.0
>>> .03%.01      # should be zero
0.009999999999999998
Multiply by 100, convert to integers
def minimal_change2(amount):
    # round rather than truncate: 100*amount can land just below a whole number
    # (e.g. 100*0.29 is 28.999999999999996, and int() alone would give 28)
    amt = int(round(100*amount))
    num_quarters = amt // 25
    amt = amt % 25
    num_dimes = amt // 10
    amt = amt % 10
    num_nickels = amt // 5
    amt = amt % 5
    num_pennies = amt // 1
    print('num_quarters:', num_quarters)
    print('num_dimes:', num_dimes)
    print('num_nickels:', num_nickels)
    print('num_pennies:', num_pennies)

minimal_change2(1.88)
# num_quarters: 7
# num_dimes: 1
# num_nickels: 0
# num_pennies: 3
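The same greedy idea can be written as a loop over coin values instead of one block of code per coin. This is a minimal sketch, not the lecture's code: the function name `greedy_change` and its `denominations` parameter are hypothetical, and amounts are assumed to be given as whole cents.

```python
def greedy_change(cents, denominations=(25, 10, 5, 1)):
    """Greedy change-making: return {coin value: number of coins used}."""
    counts = {}
    # Try the most valuable coin first, then hand the remainder to the next coin.
    for coin in sorted(denominations, reverse=True):
        counts[coin], cents = divmod(cents, coin)
    return counts

print(greedy_change(188))
# {25: 7, 10: 1, 5: 0, 1: 3}

# With no nickel available, greedy is no longer optimal for 30 cents:
# it picks 1 quarter + 5 pennies (6 coins), while 3 dimes would do.
print(greedy_change(30, denominations=(25, 10, 1)))
# {25: 1, 10: 0, 1: 5}
```

The second call illustrates that a greedy strategy only works for some problems: whether it finds the minimum depends on the set of coin values.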
Apply greedy approach to knapsack • You have a knapsack that holds up to 100 pounds of items. • You have a set of items that you want to put in the knapsack of weights { 3, 7, 38, 1, 2, 5, 6, 43, 6, 8, 19, 53 }. • Optimization problems: • Minimize quantity of items in the knapsack, such that the knapsack is completely full • Maximize the weight of items in the knapsack.
Minimize quantity of items in knapsack • Weights { 3, 7, 38, 1, 2, 5, 6, 43, 6, 8, 19, 53 }. • Sorted: [53, 43, 38, 19, 8, 7, 6, 6, 5, 3, 2, 1] • Greedy approach: pick one at a time, always choosing the heaviest item that still fits • Empty knapsack holds 100 pounds. • Choose 53; 47 pounds remaining. • Choose 43; 4 pounds remaining. • Choose 3; 1 pound remaining. • Choose 1; knapsack is full. • Not the optimal solution: • 53 + 43 + 3 + 1 = 100 uses 4 items, but • 43 + 38 + 19 = 100 uses only 3 items • Greedy approach doesn’t work for this knapsack problem!
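The heaviest-first procedure traced above can be sketched in a few lines. The function name `greedy_heaviest_first` is hypothetical; it assumes all weights are positive numbers and each item can be used at most once.

```python
def greedy_heaviest_first(weights, capacity):
    """Repeatedly take the heaviest item that still fits.

    Returns (list of chosen weights, unused capacity)."""
    chosen = []
    remaining = capacity
    for w in sorted(weights, reverse=True):
        if w <= remaining:        # item fits: take it
            chosen.append(w)
            remaining -= w
    return chosen, remaining

weights = [3, 7, 38, 1, 2, 5, 6, 43, 6, 8, 19, 53]
print(greedy_heaviest_first(weights, 100))
# ([53, 43, 3, 1], 0)
```

The knapsack ends up full, but with four items rather than the three of the optimal solution.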
Maximize the weight of items in the knapsack • New set of object weights: • Weights { 7, 38, 5, 6, 43, 6, 8, 19, 53 } • Sorted: [53, 43, 38, 19, 8, 7, 6, 6, 5] • Greedy strategy: lightest first • 5 + 6 + 6 + 7 + 8 + 19 + 38 = 89 • Greedy strategy: heaviest first • 53 + 43 = 96 • Optimal solution: not found by either greedy strategy • 43 + 38 + 19 = 100 • How can we find a solution? What are all the possible solutions?
Brute-force approach: find all subsets (represented as lists, because elements may repeat, e.g. the two 6s) • [53, 43, 38, 19, 8, 7, 6, 6, 5] • [] • [53] • [43] • [38] • [53, 43] • [53, 38] • [43, 38] • [53, 43, 38] • etc. • Then sum each subset, see if it equals 100 • We’ll see how to do this in the next class
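As a preview, one possible way to enumerate the subsets is the standard library's `itertools.combinations`. This is only a sketch under that assumption; the approach developed in the next class may build the subsets differently, and the function name `best_subset` is hypothetical.

```python
from itertools import combinations

def best_subset(weights, capacity):
    """Brute force: try every subset, keep the heaviest one that fits."""
    best = ()
    for size in range(len(weights) + 1):
        # combinations works by position, so the two 6s count as separate items
        for subset in combinations(weights, size):
            if sum(best) < sum(subset) <= capacity:
                best = subset
    return list(best)

weights = [53, 43, 38, 19, 8, 7, 6, 6, 5]
best = best_subset(weights, 100)
print(best, sum(best))
# [43, 38, 19] 100
```

With 9 items there are only 2^9 = 512 subsets, so this is fast here, but the number of subsets doubles with every additional item.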
Return value of None
• Functions that don’t have explicit return values return None
• The shell displays nothing for a bare None; use print to see it
>>> def f():
        print('hello')

>>> a = f()
hello
>>> a
>>> print(a)
None
Use None as a “non-useful” value • Try alternative representation of data [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15] • Represent as: [None,None,2,3,None,5,None,7,None,None,None,11,None,13,None,None] • Position i holds i if i is prime, and None otherwise, so everything that isn’t None is a prime • Wouldn’t quite make sense to have a different value in place of None, such as -1, or False
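One way such a list could be produced is a simple sieve that overwrites every non-prime position with None. This is an illustrative sketch; the function name `prime_table` is hypothetical, and it assumes n is at least 1.

```python
def prime_table(n):
    """Return a list t of length n+1 where t[i] == i if i is prime, else None.

    Assumes n >= 1."""
    table = [None, None] + list(range(2, n + 1))   # 0 and 1 are not prime
    for i in range(2, n + 1):
        if table[i] is not None:                   # i is prime
            for multiple in range(2 * i, n + 1, i):
                table[multiple] = None             # cross out its multiples
    return table

table = prime_table(15)
print(table)
# [None, None, 2, 3, None, 5, None, 7, None, None, None, 11, None, 13, None, None]

# Collect everything that isn't None
primes = []
for value in table:
    if value is not None:
        primes.append(value)
print(primes)
# [2, 3, 5, 7, 11, 13]
```

Testing with `is not None` rather than plain truthiness is a deliberate choice: it keeps working even in tables where 0 or another falsy value is a legitimate entry.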
None as an initial value
• Use None when you want a variable to exist before it has a meaningful value
• Subsequent computation (usually with a loop) will assign a value to that variable
>>> first_list_with_zero = None
>>> L = [[1,2], [2,3,4], [], [0,6,7,8]]
>>> for mylist in L:
        if 0 in mylist:
            first_list_with_zero = mylist[:]
            break    # stop at the first match

>>> first_list_with_zero
[0, 6, 7, 8]
Works in this case even if the variable is not initialized
• Since there is a list with zero in it, the variable first_list_with_zero is created
• It is assigned at the top level (global scope), and a for loop does not create its own scope, so it’s available after the loop
>>> # first_list_with_zero = None
>>> L = [[1,2], [2,3,4], [], [0,6,7,8]]
>>> for mylist in L:
        if 0 in mylist:
            first_list_with_zero = mylist[:]
            break

>>> first_list_with_zero
[0, 6, 7, 8]
Should initialize the variable, to be safe
• In this case, there is no list with a zero in it
• Therefore the variable first_list_with_zero is never created
>>> L = [[1,2], [2,3,4], [], [6,7,8]]
>>> for mylist in L:
        if 0 in mylist:
            first_list_with_zero = mylist[:]
            break

>>> first_list_with_zero
Traceback (most recent call last):
  File "<pyshell#16>", line 1, in <module>
    first_list_with_zero
NameError: name 'first_list_with_zero' is not defined
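Once the variable is initialized to None, the "nothing found" case can be tested explicitly instead of crashing. A minimal sketch, wrapping the loop in a function with the hypothetical name `find_first_list_with_zero`:

```python
def find_first_list_with_zero(lists):
    """Return a copy of the first sublist containing 0, or None if there is none."""
    result = None                  # initial value: nothing found yet
    for mylist in lists:
        if 0 in mylist:
            result = mylist[:]
            break                  # stop at the first match
    return result

found = find_first_list_with_zero([[1, 2], [2, 3, 4], [], [6, 7, 8]])
if found is None:
    print('no list contains 0')
else:
    print(found)
# no list contains 0
```

The check uses `is None` rather than `not found` so that a match that happens to be falsy would still be distinguished from no match at all.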
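List comprehensions • A characteristic Python idiom (a few other languages, such as Haskell, have them too), used very often • Syntax: [<f(x)> for <x> in <mylist> (if <bool>)], where the if clause is optional • Creates a new list from an existing list (or any other iterable) • Equivalent to using explicit loops, but more compact and typically somewhat faster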
List comprehension: operation on list elements
• Create a new list containing elements of L, squared
L = [1, 2, 3, 4, 5]
• Using for loop:
L_sq = []
for x in L:
    L_sq.append(x**2)
• Using list comprehension:
L_sq = [x**2 for x in L]
List comprehension: condition
• Create a list containing odd elements of L
L = [1, 2, 3, 4, 5]
• Using for loop:
L_odd = []
for x in L:
    if x%2==1:
        L_odd.append(x)
• Using list comprehension:
L_odd = [x for x in L if x%2==1]
List comprehension: condition and operation on list elements
• Create a list containing odd elements of L, squared
L = [1, 2, 3, 4, 5]
• Using for loop:
L_odd_sq = []
for x in L:
    if x%2==1:
        L_odd_sq.append(x**2)
• Using list comprehension:
L_odd_sq = [x**2 for x in L if x%2==1]
List comprehension where element is a list
>>> L = [[1,2,3], [4,5], [6,7,8,9]]
>>> L2 = [x[::-1] for x in L]
>>> L2
[[3, 2, 1], [5, 4], [9, 8, 7, 6]]
List comprehension where element is a list, and each inner list is filtered with its own list comprehension
>>> L = [[1,2,3], [4,5], [6,7,8,9]]
>>> L2 = [[y for y in x if y%2==1] for x in L]
>>> L2
[[1, 3], [5], [7, 9]]
• Explanation: the outer comprehension visits each sublist x; the inner comprehension builds a new list from the odd elements of x
L2 = []
for x in L:
    odds = []
    for y in x:
        if y%2==1:
            odds.append(y)
    L2.append(odds)
• Each sublist maps to its filtered version: [1,2,3] → [1, 3]; [4,5] → [5]; [6,7,8,9] → [7, 9]
List comprehensions over multiple, separate lists
• The left-to-right order of the for clauses in a list comprehension corresponds to the top-to-bottom nesting of for loops
• Hard to read, not recommended!!!
L1 = [x+y for x in [10, 20, 30] for y in [1, 2, 3]]

L2 = []
for x in [10, 20, 30]:
    for y in [1, 2, 3]:
        L2.append(x+y)

# both produce:
# [11, 12, 13, 21, 22, 23, 31, 32, 33]