# Parameter Learning

##### Presentation Transcript

1. Parameter Learning

2. Announcements
• Midterm: 24th, 7-9pm, NVIDIA
• Midterm review in class next Tuesday
• Extra study material for the midterm (after class)
• Homework back
• Regrade process
• Looking into the reflex agent on Pac-Man
• Some changes to the schedule
• Want to hear your song before class?

3. Happy with how you did

4. Pac-Man Grades (CS221 Grade Book: 16%; a lot of class left)
[chart: grade buckets > 23/20, > 20/20, ≥ 17/20, ≥ 15/20, ≥ 12/20, ≥ 9/20, ≥ 2/20]

5.-8. How we see it (same chart; CS221 Grade Book: 16%; a lot of class left): buckets labeled Yay / Good / Good / Good / Ok? / Talk / Talk, with per-slide messages "Good job!", "Alright", and "Rethink".

10. Common Error: Formalize a problem
[pipeline: Real-World Problem → model the problem → Formal Problem → apply an algorithm → evaluate the solution]

11. Modeling Discrete Search
• States: what makes a state
• Actions(s): possible actions from state s
• Succ(s, a): states that could result from taking action a from state s
• Reward(s, a): reward for taking action a from state s
• s_start: starting state
• IsEnd(s): whether to stop
• Utility(s): the value of reaching a given stopping point

12. Modeling Markov Decision Process
• States: what makes a state
• Actions(s): possible actions from state s
• T(s, a, s′): probability distribution of states that could result from taking action a from state s
• Reward(s, a): reward for taking action a from state s
• s_start: starting state
• IsEnd(s): whether to stop
• Utility(s): the value of reaching a given stopping point
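A minimal sketch of the MDP ingredients listed above, written as a Python class. The class name, the tiny "line world" domain, and all numbers are illustrative assumptions, not from the course codebase.

```python
class LineWorldMDP:
    """Toy MDP: states 0..4 on a line; start in the middle; ends at either edge."""

    def start_state(self):
        return 2

    def actions(self, s):
        # Possible actions from state s.
        return ["left", "right"]

    def transitions(self, s, a):
        # T(s, a, s'): distribution over successor states --
        # the move succeeds with prob 0.8 and slips the other way with 0.2.
        intended = s - 1 if a == "left" else s + 1
        slipped = s + 1 if a == "left" else s - 1
        return {intended: 0.8, slipped: 0.2}

    def reward(self, s, a, s2):
        # Reward for taking action a from state s (here: reaching the right edge).
        return 10 if s2 == 4 else 0

    def is_end(self, s):
        # Whether to stop.
        return s in (0, 4)
```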

13. Modeling Bayes Net
Definition: Bayes Net = DAG + CPDs
• DAG: directed acyclic graph (the BN's structure)
  • Nodes: random variables (typically discrete, but methods also exist to handle continuous variables)
  • Arcs: indicate probabilistic dependencies between nodes; they go from cause to effect
• CPDs: conditional probability distributions (the BN's parameters): conditional probabilities at each node, usually stored as a table (conditional probability table, or CPT)
• Root nodes are a special case: no parents, so the CPD is just a prior
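A sketch of how CPDs can be stored as tables, for a two-node network Rain → WetGrass; the variable names and probabilities here are made up for illustration.

```python
# Root node: no parents, so its CPD is just a prior.
p_rain = {True: 0.2, False: 0.8}

# CPT for WetGrass: one row per parent assignment, P(Wet | Rain).
p_wet_given_rain = {
    True:  {True: 0.9, False: 0.1},
    False: {True: 0.1, False: 0.9},
}

def joint(rain, wet):
    # Chain rule over the DAG: P(Rain, Wet) = P(Rain) * P(Wet | Rain)
    return p_rain[rain] * p_wet_given_rain[rain][wet]
```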

14. Modeling Hidden Markov Model
[diagram: hidden chain X1 → X2 → X3 → X4 → X5 with emissions E1 … E5]
Formally:
(1) State variables and their domains
(2) Evidence variables and their domains
(3) Probability of states at time 0
(4) Transition probability
(5) Emission probability
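The five ingredients above can be written down as plain Python tables. The umbrella-world domain and the numbers are a standard textbook illustration, not from the lecture.

```python
states = ["sunny", "rainy"]                       # (1) state variable domain
evidence = ["walk", "umbrella"]                   # (2) evidence variable domain
p0 = {"sunny": 0.5, "rainy": 0.5}                 # (3) P(X_0)
trans = {"sunny": {"sunny": 0.8, "rainy": 0.2},   # (4) P(X_t | X_{t-1})
         "rainy": {"sunny": 0.4, "rainy": 0.6}}
emit = {"sunny": {"walk": 0.9, "umbrella": 0.1},  # (5) P(E_t | X_t)
        "rainy": {"walk": 0.2, "umbrella": 0.8}}
```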

15. Formally, we want to get our model into Python

16. Scary?

18. Previously on CS221 In Class Research

20. Previously on CS221

21. Hidden Markov Model
[diagram: hidden chain X1 → X2 → X3 → X4 → X5 with emissions E1 … E5]
Formally:
(1) State variables and their domains
(2) Evidence variables and their domains
(3) Probability of states at time 0
(4) Transition probability
(5) Emission probability

22. Filtering [diagram: condition X1 on evidence E1, then elapse time to X2]
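A sketch of one round of filtering on a toy two-state HMM (elapse time, then condition on the evidence). The transition and emission tables are made-up examples, not the lecture's.

```python
trans = {"sunny": {"sunny": 0.8, "rainy": 0.2},
         "rainy": {"sunny": 0.4, "rainy": 0.6}}
emit = {"sunny": {"walk": 0.9, "umbrella": 0.1},
        "rainy": {"walk": 0.2, "umbrella": 0.8}}

def filter_step(belief, observation):
    # Elapse time: B'(x') = sum_x B(x) * P(x' | x)
    predicted = {x2: sum(belief[x] * trans[x][x2] for x in belief)
                 for x2 in trans}
    # Observe: weight by the emission probability, then renormalize.
    weighted = {x: predicted[x] * emit[x][observation] for x in predicted}
    z = sum(weighted.values())
    return {x: w / z for x, w in weighted.items()}

belief1 = filter_step({"sunny": 0.5, "rainy": 0.5}, "umbrella")
# Seeing an umbrella shifts the belief toward "rainy".
```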

23. Tracking Other Cars

24. Track a Car! [diagram: hidden positions Pos1 → Pos2 with observed distances Dist1, Dist2]

25. Track a Robot! [plot: probability density of Dist1 against the value of d, given Pos1]

26. Track a Robot! [same plot, annotated: μ = true distance from x to your car]

27. Track a Robot! [same plot, annotated: μ = true distance from x to your car, σ = Const.SONAR_STD]
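The emission density on the plot above can be sketched as a Gaussian with mean μ = true distance and standard deviation σ. The `SONAR_STD` value below is a stand-in assumption; the real constant lives in the assignment's `Const.SONAR_STD`.

```python
import math

SONAR_STD = 1.0  # stand-in for Const.SONAR_STD (illustrative value)

def emission_density(d, true_dist, sigma=SONAR_STD):
    """Gaussian density of sensing distance d when the true distance is true_dist."""
    z = (d - true_dist) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))
```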

28. Track a Robot! [diagram: transition from Pos1 to Pos2]

29. Particle Filters
A particle is a hypothetical instantiation of a variable. Store a large number of particles. Elapse time by moving each particle according to the transition probabilities. When we get new evidence, weight each particle and create a new generation. The density of particles at any given value approximates the probability that our variable equals that value.
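The last point can be sketched in a few lines: the empirical density of particles approximates P(X = x). The particle values here are made up for illustration.

```python
from collections import Counter

particles = ["a", "a", "a", "b"]  # hypothetical instantiations of X
counts = Counter(particles)
# Fraction of particles with each value approximates P(X = x).
belief = {x: c / len(particles) for x, c in counts.items()}
```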

30. Particle Filtering
Sometimes |X| is too big to use exact inference:
• |X| may be too big to even store B(X)
• e.g., X is continuous
• e.g., X is a real-world map
Solution: approximate inference
• Track samples of X, not all values
• Samples are called particles
• Time per step is linear in the number of samples
• But the number needed may be large
• In memory: a list of particles, not states
This is how robot localization works in practice.

31. Elapse Time
Each particle is moved by sampling its next position from the transition model:
• This reflects the transition probabilities
• Here, most samples move clockwise, but some move in another direction or stay in place
This captures the passage of time: if we have enough samples, the result is close to the exact values before and after (consistent).
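A sketch of the elapse-time step: each particle samples its successor from the transition model. The two-state transition table is an illustrative assumption.

```python
import random

random.seed(0)  # deterministic for this sketch

trans = {"A": {"A": 0.1, "B": 0.9},   # toy transition model (made up)
         "B": {"A": 0.9, "B": 0.1}}

def elapse_time(particles):
    # Move each particle by sampling its next state from P(x' | x).
    out = []
    for p in particles:
        succs = list(trans[p])
        probs = [trans[p][s] for s in succs]
        out.append(random.choices(succs, weights=probs)[0])
    return out

moved = elapse_time(["A"] * 100)  # most particles should land in "B"
```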

32. Observe Step
Slightly trickier:
• We downweight our samples based on the evidence
• Note that, as before, the probabilities don't sum to one, since most have been downweighted (in fact, they sum to an approximation of P(e))
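A sketch of the observe step: each particle keeps its value but is downweighted by the likelihood of the evidence. The emission table and particle values are made up.

```python
emit = {"near": {"ping": 0.9, "silence": 0.1},
        "far":  {"ping": 0.2, "silence": 0.8}}

def weight_particles(particles, observation):
    # Downweight each particle by P(e | x); weights no longer sum to one.
    return [(p, emit[p][observation]) for p in particles]

weighted = weight_particles(["near", "near", "far"], "ping")
total = sum(w for _, w in weighted)
# total / len(weighted) approximates P(e), as the slide notes.
```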

33. Resample
Rather than tracking weighted samples, we resample: N times, we choose from our weighted sample distribution (i.e., draw with replacement). This is analogous to renormalizing the distribution. Now the update is complete for this time step; continue with the next one.

| Old particles | New particles |
| --- | --- |
| (3,3) w=0.1 | (2,1) w=1 |
| (2,1) w=0.9 | (2,1) w=1 |
| (2,1) w=0.9 | (2,1) w=1 |
| (3,1) w=0.4 | (3,2) w=1 |
| (3,2) w=0.3 | (2,2) w=1 |
| (2,2) w=0.4 | (2,1) w=1 |
| (1,1) w=0.4 | (1,1) w=1 |
| (3,1) w=0.4 | (3,1) w=1 |
| (2,1) w=0.9 | (2,1) w=1 |
| (3,2) w=0.3 | (1,1) w=1 |
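The resampling step can be sketched as a weighted draw with replacement; the particle positions and weights below are the slide's example values.

```python
import random

random.seed(0)  # deterministic for this sketch

def resample(weighted):
    """Draw len(weighted) new particles with replacement, proportional to
    weight; the new generation carries uniform weight (w=1), which is
    analogous to renormalizing."""
    particles = [p for p, _ in weighted]
    weights = [w for _, w in weighted]
    return random.choices(particles, weights=weights, k=len(particles))

old = [((3, 3), 0.1), ((2, 1), 0.9), ((2, 1), 0.9), ((3, 1), 0.4),
       ((3, 2), 0.3), ((2, 2), 0.4), ((1, 1), 0.4), ((3, 1), 0.4),
       ((2, 1), 0.9), ((3, 2), 0.3)]
new = resample(old)
```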

34. Track a Robot! [diagram: hidden positions Pos1 → Pos2 with wall readings Walls1, Walls2]
Sometimes sensors are wrong. Sometimes motors don't work.

35. Transition Prob Start

36. Emission Prob Laser sensor Sense walls

37. Original Particles

38. Observation

39. Reweight…

40. Resample + Pass Time

41. Observation

42. Reweight…

43. Resample

44. Pass Time

45. Observation

46. Reweight…

47. Resample

48. Pass Time

49. Observation

50. Reweight…