LING 408/508: Computational Techniques for Linguists

LING 408/508: Computational Techniques for Linguists Lecture 10 9/12/2012

Outline
• Selection sort
• Optimization problems
• Greedy algorithms
• Uses of None
• List comprehensions

Selection sort
• Ideas:
  • Find the minimum value in the list
  • Swap it with the value in the first position
  • Repeat the steps above for the remainder of the list (starting at the second position and advancing each time)

Selection sort: example
• Pass 1
  • Input: 4 2 1 5 3
  • Procedure: find the min value in 4 2 1 5 3; the min value is 1; swap it with the value at index 0, which is 4
  • Result: 1 2 4 5 3

Selection sort: example
• Pass 2
  • Input: 1 2 4 5 3
  • Procedure: find the min value in 2 4 5 3; the min value is 2; swap it with the value at index 1, which is 2 (no change)
  • Result: 1 2 4 5 3

Selection sort: example
• Pass 3
  • Input: 1 2 4 5 3
  • Procedure: find the min value in 4 5 3; the min value is 3; swap it with the value at index 2, which is 4
  • Result: 1 2 3 5 4

Selection sort: example
• Pass 4
  • Input: 1 2 3 5 4
  • Procedure: find the min value in 5 4; the min value is 4; swap it with the value at index 3, which is 5
  • Result: 1 2 3 4 5
• We are done.
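The four passes in the example can be reproduced with a short sketch. The helper name selection_sort_passes is hypothetical (it does not come from the lecture); it works on a copy of the input and records the list after each pass.

```python
def selection_sort_passes(values):
    """Return the state of the list after each pass of selection sort.

    Hypothetical helper for illustration; it sorts a copy, not the input.
    """
    L = list(values)
    states = []
    for i in range(len(L) - 1):
        # find the index of the smallest value in L[i:]
        min_idx = i
        for j in range(i + 1, len(L)):
            if L[j] < L[min_idx]:
                min_idx = j
        # swap it into position i and record the result of this pass
        L[i], L[min_idx] = L[min_idx], L[i]
        states.append(list(L))
    return states


for n, state in enumerate(selection_sort_passes([4, 2, 1, 5, 3]), start=1):
    print('after pass', n, state)
```

Each printed state should match the "Result" line of the corresponding pass above.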

Finding the minimum value
    # assumes L is not an empty list
    def find_min_val(L):
        min_val = L[0]
        for val in L:
            if val < min_val:
                min_val = val
        return min_val

Find the index of the minimum value
    # assumes L is not an empty list
    # returns both the minimum value and its index
    def find_min_val(L):
        min_idx = 0
        min_val = L[min_idx]
        for i in range(len(L)):
            val = L[i]
            if val < min_val:
                min_val = val
                min_idx = i
        return min_val, min_idx
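For comparison, Python's built-ins can do the same job as the hand-written loops. This is only a sketch of an alternative, not the approach the lecture builds on; the variable names are chosen for illustration.

```python
L = [4, 2, 1, 5, 3]

min_val = min(L)             # smallest value; raises ValueError on an empty list
min_idx = L.index(min_val)   # index of the first occurrence of that value

# One-pass alternative: pick the index whose stored value is smallest
min_idx_one_pass = min(range(len(L)), key=lambda i: L[i])

print(min_val, min_idx, min_idx_one_pass)
```

Writing the loop by hand is still useful here, because selection sort needs to search only part of the list, L[i:], on each pass.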

Coding selection sort
• Let i be the index of the value to be swapped
• Find the min value in the rest of the list (inclusive), L[i:]
  • Let min_val be the current minimum value, initialized to L[i], and min_idx the index of this value, initialized to i
  • Loop through L[i+1:], revising min_val and min_idx as smaller values are encountered

    # find min val and its index
    min_idx = i
    min_val = L[i]
    for j in range(i+1, len(L)):
        if L[j] < min_val:
            min_val = L[j]
            min_idx = j
    # swap with L[i]
    L[i], L[min_idx] = L[min_idx], L[i]

Outer loop
• In each pass through the list in selection sort, we place the smallest value in the rest of the list at position i
• Let N be the length of the input list.
• Need N-1 iterations, since the (N-1)th iteration places the last two items of the list in sorted order

    for i in range(len(L)-1):
        <inner loop>

Full code for selection sort
    def selection_sort(L):
        print('\napply selection sort to', L)
        for i in range(len(L)-1):
            print('at iteration {0}:'.format(i+1), L)
            # find min val and its index
            min_idx = i
            min_val = L[i]
            for j in range(i+1, len(L)):
                if L[j] < min_val:
                    min_val = L[j]
                    min_idx = j
            # swap with L[i]
            L[i], L[min_idx] = L[min_idx], L[i]
        print(' end result:', L)

Test it on a few different cases
    selection_sort([4,2,1,5,3,0])
    selection_sort([1,2,3])
    selection_sort([])

Output
    apply selection sort to [4, 2, 1, 5, 3, 0]
    at iteration 1: [4, 2, 1, 5, 3, 0]
    at iteration 2: [0, 2, 1, 5, 3, 4]
    at iteration 3: [0, 1, 2, 5, 3, 4]
    at iteration 4: [0, 1, 2, 5, 3, 4]
    at iteration 5: [0, 1, 2, 3, 5, 4]
     end result: [0, 1, 2, 3, 4, 5]

    apply selection sort to [1, 2, 3]
    at iteration 1: [1, 2, 3]
    at iteration 2: [1, 2, 3]
     end result: [1, 2, 3]

    apply selection sort to []
     end result: []

Analysis of selection sort
    def selection_sort(L):
        print('\napply selection sort to', L)
        for i in range(len(L)-1):
            print('at iteration {0}:'.format(i+1), L)
            # find min val and its index
            min_idx = i
            min_val = L[i]
            for j in range(i+1, len(L)):
                if L[j] < min_val:
                    min_val = L[j]
                    min_idx = j
            # swap with L[i]
            L[i], L[min_idx] = L[min_idx], L[i]
        print(' end result:', L)
• Two nested loops with constant-time operations inside.
• The inner loop runs N-1, N-2, ..., 1 times, for a total of N(N-1)/2 comparisons.
• Running time is O(N^2).
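One way to check the quadratic growth is to count the comparisons made by the inner loop. The sketch below uses a hypothetical helper, count_comparisons, which sorts a copy of the list and returns the count; it compares the count with N(N-1)/2.

```python
def count_comparisons(L):
    """Selection-sort a copy of L and return the number of element comparisons."""
    L = list(L)
    comparisons = 0
    for i in range(len(L) - 1):
        min_idx = i
        for j in range(i + 1, len(L)):
            comparisons += 1              # one comparison per inner-loop step
            if L[j] < L[min_idx]:
                min_idx = j
        L[i], L[min_idx] = L[min_idx], L[i]
    return comparisons


for n in (10, 100, 1000):
    data = list(range(n, 0, -1))          # reversed input; the count does not depend on order
    print(n, count_comparisons(data), n * (n - 1) // 2)
```

Multiplying N by 10 multiplies the comparison count by roughly 100, which is the behavior expected of an O(N^2) algorithm.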

Faster sorting
• Like selection sort, bubble sort and insertion sort are both O(N^2) algorithms. Can we do better?
• We’ll see faster sorting, in O(N log N), through recursive algorithms

Optimization problems
• Some problem to solve
• Finite amount of resources
• Optimize some quantity
  • Minimize resources used
  • Maximize value under constraints of resources
  • etc.

Example 1: making change
• You have lots of quarters, dimes, nickels, and pennies.
• What is the least number of coins it takes to produce $1.88 in change? (and how many of each coin?)
• Optimal solution
  • 7 quarters: $1.75
  • 1 dime: $0.10
  • 0 nickels: $0.00
  • 3 pennies: $0.03
  • Total: $1.88, 11 coins
• A suboptimal solution:
  • 4 quarters: $1.00
  • 3 dimes: $0.30
  • 3 nickels: $0.15
  • 43 pennies: $0.43
  • Total: $1.88, 53 coins

Example 2: knapsack
• You have a knapsack that holds up to 100 pounds of items.
• You have a set of items that you want to put in the knapsack, of weights { 3, 7, 38, 1, 2, 5, 6, 43, 6, 8, 19, 53 }.
• Optimization problems:
  • Minimize the quantity of items in the knapsack, such that the knapsack is completely full.
  • Maximize the weight of items in the knapsack.

Greedy algorithms
• http://en.wikipedia.org/wiki/Greedy_algorithm
• Solve the problem through a succession of locally optimal choices, hoping that this strategy will lead to a globally optimal result
  • Solutions take multiple steps.
  • Locally optimal: a single step that takes you as close as possible to solving the problem
  • Globally optimal: best of all possible solutions
• A greedy strategy only works for some optimization problems

What is the least number of coins it takes to produce $1.88 in change?
• Step 1:
  • least number of some coin to produce as close to $1.88 as possible in change
  • Try most-valuable coin, the quarter
  • 7 quarters produces $1.75, $0.13 remaining, can’t use any more quarters
• Step 2:
  • least number of some coin to produce as close to $0.13 as possible in change
  • Can’t use any more quarters. Use next most-valuable coin, the dime
  • 1 dime produces $0.10, $0.03 remaining, can’t use any more quarters or dimes
• Step 3:
  • least number of some coin to produce as close to $0.03 as possible in change
  • Can’t use any more dimes. Use next most-valuable coin, the nickel
  • A nickel is worth more than $0.03, so use 0 nickels; $0.03 remaining, can’t use any quarters, dimes, or nickels
• Step 4:
  • least number of some coin to produce as close to $0.03 as possible in change
  • Can’t use any nickels. Use next most-valuable coin, the penny
  • 3 pennies produces $0.03, $0.00 remaining
• DONE: 7 quarters, 1 dime, 3 pennies

Turn problem into subproblem
• The greedy strategy makes the assumption that if a locally optimal choice is made at the current step, the resulting subproblem can also be solved by a greedy strategy.
• Original problem: what is the least number of coins it takes to produce $1.88 in change?
• Stated greedily: least number of some coin it takes to produce as close to $1.88 as possible in change
  • 7 quarters = $1.75
• Subproblem: least number of some coin it takes to produce as close to $0.13 as possible in change
  • 1 dime = $0.10
• Subproblem: least number of some coin it takes to produce as close to $0.03 as possible in change
  • 3 pennies = $0.03
• Subproblem: least number of some coin it takes to produce as close to $0.00 as possible in change
  • Done, no subproblem, have found a solution
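The chain of subproblems can be written as one loop over the coin values. The following is a minimal sketch that works in integer cents; the function name greedy_change and the denominations (4, 3, 1) in the second call are hypothetical, chosen only to illustrate a coin system where the greedy choice is not optimal.

```python
def greedy_change(cents, denominations=(25, 10, 5, 1)):
    """Greedy change-making in integer cents; returns {coin value: count}."""
    counts = {}
    for coin in sorted(denominations, reverse=True):   # most valuable coin first
        # use as many of this coin as fit, then continue with the remainder
        counts[coin], cents = divmod(cents, coin)
    return counts


us_coins = greedy_change(188)
print(us_coins, 'total coins:', sum(us_coins.values()))

# Hypothetical coin system: greedy gives 4 + 1 + 1 (3 coins),
# although 3 + 3 (2 coins) would be better.
odd_coins = greedy_change(6, denominations=(4, 3, 1))
print(odd_coins, 'total coins:', sum(odd_coins.values()))
```

With quarters, dimes, nickels, and pennies the greedy result matches the optimal solution of 11 coins; with the made-up denominations it does not, which illustrates that a greedy strategy only works for some problems.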

Why doesn’t this work?
    def minimal_change(amount):
        num_quarters = amount // 0.25
        amount = amount % 0.25
        num_dimes = amount // 0.10
        amount = amount % 0.10
        num_nickels = amount // 0.05
        amount = amount % 0.05
        num_pennies = amount // 0.01
        print('num_quarters:', num_quarters)
        print('num_dimes:', num_dimes)
        print('num_nickels:', num_nickels)
        print('num_pennies:', num_pennies)

    minimal_change(1.88)

    # Output:
    # num_quarters: 7.0
    # num_dimes: 1.0
    # num_nickels: 0.0
    # num_pennies: 2.0

Precision of floating-point numbers
• Computers store floating-point numbers in binary, so most decimal fractions cannot be represented exactly
    >>> 1.88
    1.88
    >>> 1.75
    1.75
    >>> 1.88-1.75    # should be 0.13
    0.1299999999999999
    >>> .03//.01     # should be 3.0
    2.0
    >>> .03%.01      # should be zero
    0.009999999999999998
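When exact decimal arithmetic is needed, Python's standard decimal module is one option, and math.isclose is a way to compare floats with a tolerance. This sketch is an aside rather than the approach taken in the lecture, which converts to integer cents instead.

```python
from decimal import Decimal
import math

# Decimal values built from strings are stored exactly as written
exact_diff = Decimal('1.88') - Decimal('1.75')
exact_count = Decimal('0.03') // Decimal('0.01')
print(exact_diff, exact_count)

# Floats should be compared with a tolerance rather than with ==
float_diff = 1.88 - 1.75
print(float_diff == 0.13, math.isclose(float_diff, 0.13))
```

Note that the Decimal values are created from strings; Decimal(1.88) would inherit the rounding error already present in the float 1.88.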

Multiply by 100, convert to integers
    def minimal_change2(amount):
        # round before converting: 100*amount may come out slightly below
        # a whole number (e.g. 100*0.29), and int() alone would truncate it
        amt = int(round(100*amount))
        num_quarters = amt // 25
        amt = amt % 25
        num_dimes = amt // 10
        amt = amt % 10
        num_nickels = amt // 5
        amt = amt % 5
        num_pennies = amt // 1
        print('num_quarters:', num_quarters)
        print('num_dimes:', num_dimes)
        print('num_nickels:', num_nickels)
        print('num_pennies:', num_pennies)

    minimal_change2(1.88)

    # Output:
    # num_quarters: 7
    # num_dimes: 1
    # num_nickels: 0
    # num_pennies: 3

Apply greedy approach to knapsack
• You have a knapsack that holds up to 100 pounds of items.
• You have a set of items that you want to put in the knapsack, of weights { 3, 7, 38, 1, 2, 5, 6, 43, 6, 8, 19, 53 }.
• Optimization problems:
  • Minimize quantity of items in the knapsack, such that the knapsack is completely full
  • Maximize the weight of items in the knapsack.

Minimize quantity of items in knapsack
• Weights { 3, 7, 38, 1, 2, 5, 6, 43, 6, 8, 19, 53 }.
• Sorted: [53, 43, 38, 19, 8, 7, 6, 6, 5, 3, 2, 1]
• Greedy approach: pick one at a time, choose the heaviest item that still fits
  • Empty knapsack holds 100 pounds.
  • Choose 53; 47 pounds remaining.
  • Choose 43; 4 pounds remaining.
  • Choose 3; 1 pound remaining.
  • Choose 1; the knapsack is full.
• Not the optimal solution:
  • 53 + 43 + 3 + 1 = 100 uses 4 items, but
  • 43 + 38 + 19 = 100 uses only 3 items
• Greedy approach doesn’t work for this knapsack problem!
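The heaviest-first procedure above can be sketched as a single loop over the sorted weights. The function name greedy_heaviest_first is hypothetical; the weights and capacity are the ones from the slide.

```python
def greedy_heaviest_first(weights, capacity):
    """Repeatedly take the heaviest item that still fits; return the chosen weights."""
    chosen = []
    remaining = capacity
    for w in sorted(weights, reverse=True):
        if w <= remaining:        # skip items that no longer fit
            chosen.append(w)
            remaining -= w
    return chosen


weights = [3, 7, 38, 1, 2, 5, 6, 43, 6, 8, 19, 53]
picked = greedy_heaviest_first(weights, 100)
print(picked, 'sum:', sum(picked), 'items:', len(picked))
```

The greedy pick fills the knapsack exactly but uses four items, one more than the three-item solution 43 + 38 + 19.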

Maximize the weight of items in the knapsack
• New set of object weights:
  • Weights { 7, 38, 5, 6, 43, 6, 8, 19, 53 }
  • Sorted: [53, 43, 38, 19, 8, 7, 6, 6, 5]
• Greedy strategy: lightest first
  • 5 + 6 + 6 + 7 + 8 + 19 + 38 = 89
• Greedy strategy: heaviest first
  • 53 + 43 = 96
• Optimal solution: not found by either greedy strategy
  • 43 + 38 + 19 = 100
• How can we find a solution? What are all the possible solutions?
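Both greedy strategies differ only in the order in which items are considered, so they can share one sketch. The function name greedy_fill and its heaviest_first parameter are hypothetical names introduced for illustration.

```python
def greedy_fill(weights, capacity, heaviest_first):
    """Add items in sorted order whenever they still fit; return the chosen weights."""
    chosen = []
    remaining = capacity
    for w in sorted(weights, reverse=heaviest_first):
        if w <= remaining:
            chosen.append(w)
            remaining -= w
    return chosen


weights = [7, 38, 5, 6, 43, 6, 8, 19, 53]
lightest = greedy_fill(weights, 100, heaviest_first=False)
heaviest = greedy_fill(weights, 100, heaviest_first=True)
print('lightest first:', lightest, sum(lightest))
print('heaviest first:', heaviest, sum(heaviest))
```

Neither ordering reaches the full 100 pounds, even though 43 + 38 + 19 does.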

Brute-force approach: find all subsets (represented as lists, because we want to allow repeated elements)
• Items: [53, 43, 38, 19, 8, 7, 6, 6, 5]
• Subsets: [], [53], [43], [38], [53, 43], [53, 38], [43, 38], [53, 43, 38], etc.
• Then sum each subset, see if it equals 100
• We’ll see how to do this in the next class
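As a preview, one possible way to enumerate the subsets is itertools.combinations from the standard library. This is only a sketch under that assumption; the method presented in the next class may well be different (for example, a recursive one).

```python
from itertools import combinations

weights = [53, 43, 38, 19, 8, 7, 6, 6, 5]
capacity = 100

# combinations() treats items by position, so the two 6s count as distinct items
exact = [combo
         for size in range(len(weights) + 1)
         for combo in combinations(weights, size)
         if sum(combo) == capacity]

fewest = min(exact, key=len)     # a solution with the smallest number of items
print(len(exact), 'sub-lists sum to', capacity)
print('fewest items:', fewest)
```

With 9 items there are only 2^9 = 512 subsets to check, but the number doubles with every additional item, so brute force is practical only for small inputs.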

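Return value of None
• Functions that don’t have explicit return values return None
    >>> def f():
            print('hello')

    >>> a = f()
    hello
    >>> a
    >>> print(a)
    None
• The shell displays nothing when the value of an expression is None, so use print to see it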

Use None as a “non-useful” value
• Try alternative representation of data [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
• Represent as: [None,None,2,3,None,5,None,7,None,None,None,11,None,13,None,None]
  • The value at index i is i if i is prime, and None otherwise
  • Everything that isn’t None is a prime
• Wouldn’t quite make sense to have a different value in place of None, such as -1, or False
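The None-marked list can be built and then filtered as in the following sketch. The helper is_prime is a simple trial-division test written for this illustration; it is an assumption, not code from the lecture.

```python
def is_prime(n):
    """Trial-division primality test (illustrative helper)."""
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return True


# index i holds i if i is prime, otherwise None
marked = [n if is_prime(n) else None for n in range(16)]
print(marked)

# recover the primes by keeping everything that is not None
primes = [n for n in marked if n is not None]
print(primes)
```

The filter tests "is not None" rather than truthiness, so a legitimate value such as 0 would not be dropped by accident in similar code.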

None as an initial value
• Use None when you want a variable to exist before it has a meaningful value
• Subsequent computation (usually with a loop) will assign a value to that variable
    >>> first_list_with_zero = None
    >>> L = [[1,2], [2,3,4], [], [0,6,7,8]]
    >>> for mylist in L:
            if 0 in mylist:
                first_list_with_zero = mylist[:]
                break    # stop at the first match

    >>> first_list_with_zero
    [0, 6, 7, 8]

Works in this case even if the variable is not initialized
• Since there is a list with zero in it, the variable first_list_with_zero is created
• A for loop does not create a new scope; the variable has global scope, so it’s available outside the loop
    >>> # first_list_with_zero = None
    >>> L = [[1,2], [2,3,4], [], [0,6,7,8]]
    >>> for mylist in L:
            if 0 in mylist:
                first_list_with_zero = mylist[:]
                break    # stop at the first match

    >>> first_list_with_zero
    [0, 6, 7, 8]

Should initialize the variable, to be safe
• In this case, there is no list with a zero in it
• Therefore the variable first_list_with_zero is never created
    >>> L = [[1,2], [2,3,4], [], [6,7,8]]
    >>> for mylist in L:
            if 0 in mylist:
                first_list_with_zero = mylist[:]
                break    # stop at the first match

    >>> first_list_with_zero
    Traceback (most recent call last):
      File "<pyshell#16>", line 1, in <module>
        first_list_with_zero
    NameError: name 'first_list_with_zero' is not defined
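Putting the pieces together, the search can be wrapped in a function that starts from None and lets the caller test for it. This is a minimal sketch; the function name find_first_list_with_zero is hypothetical.

```python
def find_first_list_with_zero(lists):
    """Return a copy of the first list containing 0, or None if there is none."""
    result = None                  # safe default when nothing is found
    for mylist in lists:
        if 0 in mylist:
            result = mylist[:]
            break                  # stop at the first match
    return result


found = find_first_list_with_zero([[1, 2], [2, 3, 4], [], [0, 6, 7, 8]])
missing = find_first_list_with_zero([[1, 2], [2, 3, 4], [], [6, 7, 8]])

print(found)
if missing is None:
    print('no list contains 0')
```

Because the variable is initialized, the "nothing found" case produces None instead of a NameError, and it can be checked with "is None".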

List comprehensions
• A concise Python construct, used very often (similar constructs exist in some other languages, such as Haskell)
• Syntax: [<f(x)> for <x> in <mylist> (if <bool>)], where the if clause is optional
• Creates a new list from an existing list
• Equivalent to using explicit loops, and often somewhat faster

List comprehension: operation on list elements
• Create a new list containing elements of L, squared
    L = [1, 2, 3, 4, 5]
• Using for loop:
    L_sq = []
    for x in L:
        L_sq.append(x**2)
• Using list comprehension:
    L_sq = [x**2 for x in L]

List comprehension: condition
• Create a list containing odd elements of L
    L = [1, 2, 3, 4, 5]
• Using for loop:
    L_odd = []
    for x in L:
        if x%2==1:
            L_odd.append(x)
• Using list comprehension:
    L_odd = [x for x in L if x%2==1]

List comprehension: condition and operation on list elements
• Create a list containing odd elements of L, squared
    L = [1, 2, 3, 4, 5]
• Using for loop:
    L_odd_sq = []
    for x in L:
        if x%2==1:
            L_odd_sq.append(x**2)
• Using list comprehension:
    L_odd_sq = [x**2 for x in L if x%2==1]

List comprehension where element is a list
    >>> L = [[1,2,3], [4,5], [6,7,8,9]]
    >>> L2 = [x[::-1] for x in L]
    >>> L2
    [[3, 2, 1], [5, 4], [9, 8, 7, 6]]

List comprehension where element is a list, and list is modified with list comprehension
    >>> L = [[1,2,3], [4,5], [6,7,8,9]]
    >>> L2 = [[y for y in x if y%2==1] for x in L]
    >>> L2
    [[1, 3], [5], [7, 9]]
• Explanation: the outer comprehension visits each sub-list x; the inner comprehension builds a new list of the odd elements of x
    L2 = []
    for x in L:
        odds = []
        for y in x:
            if y%2==1:
                odds.append(y)
        L2.append(odds)
• [1,2,3] becomes [1, 3]; [4,5] becomes [5]; [6,7,8,9] becomes [7, 9]
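It is easy to confuse a comprehension nested inside another with a single comprehension that has two for clauses. The sketch below contrasts the two on the same input; the variable names nested and flat are chosen for illustration.

```python
L = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

# A comprehension inside a comprehension keeps one result list per sub-list
nested = [[y for y in x if y % 2 == 1] for x in L]

# A single comprehension with two for clauses flattens everything into one list
flat = [y for x in L for y in x if y % 2 == 1]

print(nested)
print(flat)
```

The first form preserves the grouping of the original sub-lists, while the second discards it.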

List comprehensions over multiple, separate lists
• Left-to-right order of the for clauses in a list comprehension is like top-to-bottom nesting in for loops
• Hard to read, not recommended!!!
    L1 = [x+y for x in [10, 20, 30] for y in [1, 2, 3]]

    L2 = []
    for x in [10, 20, 30]:
        for y in [1, 2, 3]:
            L2.append(x+y)

    # both produce:
    # [11, 12, 13, 21, 22, 23, 31, 32, 33]
