Machine Learning and Robotics Lisa Lyons 10/22/08
Outline • Machine Learning Basics and Terminology • An Example: DARPA Grand/Urban Challenge • Multi-Agent Systems • Netflix Challenge (if time permits)
Introduction • Machine learning is commonly associated with robotics • When some people think of robots, they picture machines like WALL-E: human-like, with feelings, capable of complex tasks • Goals for machine learning in robotics usually aren't this advanced, but some researchers think we're getting there • The next three slides outline goals that motivate continued work in this area
Household Robot to Assist Handicapped • Could come preprogrammed with general procedures and behaviors • Needs to be able to learn to recognize objects and obstacles and maybe even its owner (face recognition?) • Also needs to be able to manipulate objects without breaking them • May not always have all information about its environment (poor lighting, obscured objects)
Flexible Manufacturing Robot • Configurable robot that could manufacture multiple items • Must learn to manipulate new types of parts without damaging them
Learning Spoken Dialog System for Repairs • Given some initial information about a system, a robot could converse with a human and help to repair it • Speech understanding is a very hard problem in itself
Machine Learning Basics and Terminology With applications and examples in robotics
Learning Associations • Association Rule – probability that an event will happen given another event already has (P(Y|X))
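As a toy illustration (the transaction data below is invented), the conditional probability behind an association rule can be estimated by simple counting:

```python
# Estimate the association rule P(soda | chips) from transaction data.
# The transactions here are made up for illustration.
transactions = [
    {"chips", "soda"},
    {"chips", "salsa", "soda"},
    {"chips", "salsa"},
    {"soda"},
    {"chips", "soda"},
]

def confidence(x, y, data):
    """P(y | x): fraction of transactions containing x that also contain y."""
    with_x = [t for t in data if x in t]
    if not with_x:
        return 0.0
    return sum(1 for t in with_x if y in t) / len(with_x)

print(confidence("chips", "soda", transactions))  # 3 of the 4 chips baskets contain soda -> 0.75
```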
Classification • Classification – a model that assigns an input to a class based on data • Prediction – assuming a future scenario resembles a past one, using past data to decide what the new scenario will look like • Pattern Recognition – a method used to make predictions • Face Recognition • Speech Recognition • Knowledge Extraction – learning a rule from data • Outlier Detection – finding exceptions to the rules
Regression • Linear regression is an example • Both Classification and Regression are “Supervised Learning” strategies where the goal is to find a mapping from input to output • Example: Navigation of autonomous car • Training Data: actions of human drivers in various situations • Input: data from sensors (like GPS or video) • Output: angle to turn steering wheel
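A minimal sketch of that supervised setup, with invented training pairs mapping a single sensor reading (say, lateral offset from the lane center) to a steering-wheel angle:

```python
# Supervised learning sketch: fit a linear map from one sensor reading to a
# steering angle, using training pairs "recorded from a human driver".
# All values are invented for illustration.
train = [(-2.0, 20.0), (-1.0, 10.0), (0.0, 0.0), (1.0, -10.0), (2.0, -20.0)]

def fit_line(data):
    """Ordinary least squares for y = w*x + b in one dimension."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    w = sum((x - mx) * (y - my) for x, y in data) / sum((x - mx) ** 2 for x, _ in data)
    return w, my - w * mx

w, b = fit_line(train)
print(w * 0.5 + b)  # predicted steering angle for an unseen offset of 0.5
```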
Unsupervised Learning • Only have input • Want to find regularities in the input • Density Estimation: finding patterns in the input space • Clustering: find groupings in the input
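Clustering can be sketched with plain k-means, alternating assignment and mean updates (the 2-D points below are invented to form two obvious groups):

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on 2-D points: alternate assignment and mean update."""
    random.seed(seed)
    centers = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: (p[0] - centers[c][0]) ** 2 + (p[1] - centers[c][1]) ** 2)
            clusters[j].append(p)
        centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers

# Two well-separated groupings in the input; k-means recovers their centers.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
print(sorted(kmeans(data, 2)))
```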
Reinforcement Learning • Policy: generating correct actions to reach the goal • Learn from past good policies • Example: robot navigating unknown environment in search of a goal • Some data may be missing • May be multiple agents in the system
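A tabular Q-learning sketch of the policy idea, on a toy corridor world (the states, rewards, and hyperparameters are all invented for illustration):

```python
import random

# Tabular Q-learning on a tiny 1-D corridor: states 0..4, goal at 4.
# Actions: 0 = left, 1 = right. Reward 1 on reaching the goal, else 0.
N, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N)]
alpha, gamma, eps = 0.5, 0.9, 0.2
random.seed(1)

def step(s, a):
    s2 = max(0, min(N - 1, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == GOAL else 0.0)

for _ in range(500):  # episodes
    s = 0
    while s != GOAL:
        # epsilon-greedy: mostly exploit the current Q-values, sometimes explore
        a = random.randrange(2) if random.random() < eps else max((0, 1), key=lambda x: Q[s][x])
        s2, r = step(s, a)
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

policy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(N - 1)]
print(policy)  # the learned policy should be "go right" in every state
```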
Possible Applications • Exploring a world • Learning object properties • Learning to interact with the world and with objects • Optimizing actions • Recognizing states in world model • Monitoring actions to ensure correctness • Recognizing and repairing errors • Planning • Learning action rules • Deciding actions based on tasks
What We Expect Robots to Do • Be able to react promptly and correctly to changes in environment or internal state • Work in situations where information about the environment is imperfect or incomplete • Learn through their experience and human guidance • Respond quickly to human interaction • Unfortunately, these are very high expectations which don’t always correlate very well with machine learning techniques
Differences Between Other Types of Machine Learning and Robotics • Other ML applications: planning can frequently be done offline; Robotics: often requires simultaneous planning and execution (online) • Other ML applications: actions are usually deterministic; Robotics: actions may be nondeterministic, depending on data (or lack thereof) • Other ML applications: no major time constraints; Robotics: real-time operation often required
The Challenge • Defense Advanced Research Projects Agency (DARPA) • Goal: build a vehicle capable of traversing unrehearsed off-road terrain • Started in 2003 • 142-mile course through the Mojave Desert • No one made it through more than 5% of the course in the 2004 race • In 2005, 195 teams registered, 23 teams raced, and 5 teams finished
The Rules • Must traverse a desert course up to 175 miles long in under 10 hours • Course kept secret until 2 hours before the race • Must follow speed limits for specific areas of the course to protect infrastructure and ecology • If a faster vehicle needs to overtake a slower one, the slower one is paused so that vehicles don't have to handle dynamic passing • Teams were given data on the course 2 hours before the race, so no global path planning was required
A DARPA Grand Challenge Vehicle that Did Not Crash • …namely Stanley, the winner of the 2005 challenge
Terrain Mapping and Obstacle Detection • Data from 5 laser scanners mounted on top of the car is used to generate a point cloud of what's in front of the car • This is a classification problem with three labels: drivable, occupied, and unknown • The area in front of the vehicle is represented as a grid • Stanley's system finds the probability that ∆h > δ, where ∆h is the observed height of the terrain in a given cell • If this probability exceeds a threshold α, the system labels the cell as occupied
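One way to sketch that test (the Gaussian noise model and all numeric thresholds here are assumptions for illustration, not Stanley's actual probabilistic model):

```python
from math import erf, sqrt

# A cell is "occupied" when P(true height difference > delta) exceeds alpha.
# We model the measured height difference as Gaussian with known sensor noise.
def p_exceeds(dh_measured, sigma, delta):
    """P(true height difference > delta) under a Gaussian noise model."""
    z = (delta - dh_measured) / sigma
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))

def classify(dh_measured, sigma, delta=0.15, alpha=0.05, seen=True):
    # delta (meters) and alpha are illustrative thresholds, not the tuned values
    if not seen:
        return "unknown"
    return "occupied" if p_exceeds(dh_measured, sigma, delta) > alpha else "drivable"

print(classify(0.05, 0.03))   # small bump, well below delta -> drivable
print(classify(0.30, 0.03))   # clear obstacle -> occupied
```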
(cont.) • A discriminative learning algorithm is used to tune the parameters • Data is taken as a human driver drives through a mapped terrain avoiding obstacles (supervised learning) • Algorithm uses coordinate ascent to determine δ and α
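Coordinate ascent itself is easy to sketch: optimize one parameter at a time while holding the other fixed. The score function below is a made-up stand-in for "agreement with the human driver's labels"; the real objective comes from the logged driving data:

```python
# Toy coordinate ascent over (delta, alpha). The quadratic score peaks at
# (0.15, 0.05); these numbers are invented, not Stanley's tuned parameters.
def score(delta, alpha):
    return -((delta - 0.15) ** 2 + (alpha - 0.05) ** 2)

def coordinate_ascent(delta, alpha, candidates, rounds=10):
    for _ in range(rounds):
        delta = max(candidates, key=lambda d: score(d, alpha))  # hold alpha fixed
        alpha = max(candidates, key=lambda a: score(delta, a))  # hold delta fixed
    return delta, alpha

print(coordinate_ascent(0.5, 0.5, [i / 100 for i in range(51)]))
```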
Computer Vision Aspect • The lasers only make it safe for the car to drive at less than 25 mph • The car needs to go faster to satisfy the time constraint • A color camera is used for long-range obstacle detection • It is still the same classification problem, but now there are more factors to consider – lighting, material, dust on the lens • Stanley takes an adaptive approach
Vision Algorithm • Take out the sky • Map a quadrilateral on the camera video corresponding with the laser sensor boundaries • As long as this region is deemed drivable, use the pixels in the quad as a training set for the concept of drivable surface • Maintain Gaussians that model the color of drivable terrain • Adapt by adjusting previous Gaussians and/or throwing them out and adding new ones • Adjustment allows for slow adaptation to lighting conditions • Replacement allows for rapid change in the color of the road • Label regions as drivable if their pixel values are near one or more of the Gaussians and they are connected to the laser quadrilateral
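A simplified sketch of that adaptive color model, reducing each Gaussian to a mean color with a fixed match distance (the real system tracks full Gaussians; every threshold and pixel value here is invented):

```python
import math

class ColorModel:
    """Keep a small set of 'drivable' color models; nudge a matching model
    toward new samples (slow lighting drift), or replace the oldest model
    when nothing matches (rapid road-color change)."""
    def __init__(self, max_models=3, match_dist=30.0, lr=0.1):
        self.means = []  # list of (r, g, b) means, oldest first
        self.k, self.match_dist, self.lr = max_models, match_dist, lr

    def update(self, pixel):
        for i, m in enumerate(self.means):
            if math.dist(m, pixel) < self.match_dist:
                # adjust: move the matched mean toward the new sample
                self.means[i] = tuple(mu + self.lr * (p - mu) for mu, p in zip(m, pixel))
                return
        if len(self.means) >= self.k:   # replace: drop the oldest model
            self.means.pop(0)
        self.means.append(tuple(map(float, pixel)))

    def drivable(self, pixel):
        return any(math.dist(m, pixel) < self.match_dist for m in self.means)

cm = ColorModel()
for px in [(120, 110, 100), (122, 112, 101), (125, 108, 99)]:  # road samples
    cm.update(px)
print(cm.drivable((121, 111, 100)), cm.drivable((30, 200, 40)))  # road vs. grass
```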
Road Boundaries • Best way to avoid obstacles on a desert road is to find road boundaries and drive down the middle • Uses low-pass one-dimensional Kalman Filters to determine road boundary on both sides of vehicle • Small obstacles don’t really affect the boundary found • Large obstacles over time have a stronger effect
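A one-dimensional Kalman filter behaves exactly this way: a single outlier barely moves the estimate, while a sustained change pulls it over time. The noise parameters below are illustrative:

```python
# 1-D Kalman filter tracking a road-boundary offset (illustrative noise values).
def kalman_1d(measurements, q=0.01, r=1.0, x0=0.0, p0=1.0):
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p += q                # predict: constant model, uncertainty grows
        k = p / (p + r)       # Kalman gain
        x += k * (z - x)      # update: move toward the measurement
        p *= (1 - k)
        estimates.append(x)
    return estimates

# One outlier (a small obstacle) vs. a sustained shift (the real boundary moving):
est = kalman_1d([2.0] * 5 + [8.0] + [2.0] * 5 + [5.0] * 10)
print(est[5], est[-1])  # the outlier is damped; the sustained shift is tracked
```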
Slope and Ruggedness • If terrain becomes too rugged or steep, vehicle must slow down to maintain control • Slope is found from vehicle’s pitch estimate • Ruggedness is determined by taking data from vehicle’s z accelerometer with gravity and vehicle vibration filtered out
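A rough sketch of such a signal: subtract a moving average (the gravity/low-frequency part) from the z-accelerometer samples and score the residual. The window size and sample values are made up:

```python
# Ruggedness score: mean absolute deviation of z-acceleration from a trailing
# moving average, which stands in for the gravity / low-frequency component.
def ruggedness(z_samples, window=5):
    scores = []
    for i in range(len(z_samples)):
        lo = max(0, i - window + 1)
        baseline = sum(z_samples[lo:i + 1]) / (i + 1 - lo)  # low-frequency part
        scores.append(abs(z_samples[i] - baseline))         # vibration residual
    return sum(scores) / len(scores)

smooth = [9.81, 9.80, 9.82, 9.81, 9.79, 9.81]   # gentle terrain
rough = [9.81, 11.2, 8.3, 12.0, 7.9, 10.5]      # rugged terrain
print(ruggedness(smooth) < ruggedness(rough))   # True
```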
Path Planning • No global planning necessary • Coordinate system used is base trajectory + lateral offset • Base trajectory is smoothed version of driving corridor on the map given to contestants before the race
Path Smoothing • Base trajectory computed in 4 steps: • Points are added to the map in proportion to local curvature • Least-squares optimization is used to adjust trajectories for smoothing • Cubic spline interpolation is used to find a path that can be resampled efficiently • Calculate the speed limit
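The least-squares smoothing step can be sketched as an iterative relaxation that pulls each interior waypoint toward both its original position (data term) and the midpoint of its neighbors (smoothness term). The weights and the toy path are invented:

```python
# Iterative least-squares-style path smoothing; endpoints are held fixed.
def smooth_path(path, w_data=0.5, w_smooth=0.3, iters=200):
    new = [list(p) for p in path]
    for _ in range(iters):
        for i in range(1, len(path) - 1):
            for d in range(2):
                new[i][d] += w_data * (path[i][d] - new[i][d])                  # stay near original
                new[i][d] += w_smooth * (new[i-1][d] + new[i+1][d] - 2 * new[i][d])  # reduce curvature
    return [tuple(p) for p in new]

zigzag = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0)]
print(smooth_path(zigzag))  # interior points pulled toward a gentler curve
```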
Online Path Planning • Determines the actual trajectory of vehicle during race • Search algorithm that minimizes a linear combination of continuous cost functions • Subject to dynamic and kinematic constraints • Max lateral acceleration • Max steering angle • Max steering rate • Max acceleration • Penalize hitting obstacles, leaving corridor, leaving center of road
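The trajectory search can be sketched as choosing the lateral offset that minimizes a weighted sum of cost terms (the weights and cost shapes below are made up, not Stanley's actual values):

```python
# Score candidate lateral offsets with a linear combination of penalties for
# hitting obstacles, leaving the corridor, and leaving the road center.
def total_cost(offset, obstacle_at, corridor_halfwidth=2.0,
               w_obs=100.0, w_corridor=10.0, w_center=1.0):
    obs = 1.0 if abs(offset - obstacle_at) < 0.5 else 0.0   # hitting an obstacle
    out = max(0.0, abs(offset) - corridor_halfwidth)        # leaving the corridor
    center = abs(offset)                                    # leaving road center
    return w_obs * obs + w_corridor * out + w_center * center

candidates = [i / 10 for i in range(-30, 31)]
best = min(candidates, key=lambda o: total_cost(o, obstacle_at=0.0))
print(best)  # a candidate that just clears the obstacle while staying central
```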
Recursive Modeling Method (RMM) • Agents model the belief states of other agents • Bayesian methods are used • Useful in homogeneous non-communicating Multi-Agent Systems (MAS) • The recursion has to be cut off at some depth (we don't want a situation where agent A thinks that agent B thinks that agent A thinks that…) • Agents can influence other agents by altering the environment to produce a desired reaction
Heterogeneous Non-Communicating MAS • Both competitive and cooperative learning are possible • Competitive learning is more difficult because agents may end up in an “arms race” • Credit-assignment problem: it's hard to tell whether an agent benefited because its actions were good or because its opponent's actions were bad • Experts and observers have proven useful • Different agents may be given different roles to reach the goal • Supervised learning can “teach” each agent how to do its part
Communication • Allowing agents to communicate can lead to deeper levels of planning, since agents know (or think they know) the beliefs of others • One agent could “train” another to follow its actions using reinforcement learning • Negotiations • Commitment • Autonomous robots could estimate their position in an environment by querying other robots for their believed positions and making a guess based on those (Markov localization, SLAM)
Netflix Challenge (if time permits)
References • Alpaydin, E. Introduction to Machine Learning. Cambridge, Mass.: MIT Press, 2004. • Kreuziger, J. “Application of Machine Learning to Robotics – An Analysis.” In Proceedings of the Second International Conference on Automation, Robotics, and Computer Vision (ICARCV '92). 1992. • Mitchell et al. “Machine Learning.” Annu. Rev. Comput. Sci. 1990. 4:417–33. • Stone, P. and Veloso, M. “Multiagent Systems: A Survey from a Machine Learning Perspective.” Autonomous Robots 8, 345–383, 2000. • Thrun et al. “Stanley: The Robot that Won the DARPA Grand Challenge.” Journal of Field Robotics 23(9), 661–692, 2006.