
CAP6938 Neuroevolution and Developmental Encoding Neural Network Weight Optimization

Dr. Kenneth Stanley, September 6, 2006


Presentation Transcript


  1. CAP6938 Neuroevolution and Developmental Encoding: Neural Network Weight Optimization
     Dr. Kenneth Stanley, September 6, 2006

  2. Review
     • Remember, the values of the weights and the topology determine the functionality
     • Given a topology, how are weights optimized?
     • Weights are just parameters on a structure
     [Figure: network diagram with each weight marked "?"]

  3. Two Cases
     • Output targets are known
     • Output targets are not known
     [Figure: network with inputs X1, X2; hidden nodes H1, H2; outputs out1, out2; weights w11, w21, w12]

  4. Decision Boundaries
     • OR is linearly separable
     • Linearly separable problems do not require hidden nodes (nonlinearities)
     OR function:
       Input     Output
        1   1      1
        1  -1      1
       -1   1      1
       -1  -1     -1
     [Figure: decision boundary separating the "+" points from the "-" point; network with a bias node]
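The claim that OR needs no hidden nodes can be checked with a single threshold unit. The following is a minimal sketch, assuming a bipolar sign activation and hand-picked weights (both weights and the bias set to 1); these values are illustrative and are not taken from the slides.

```python
def sign(a):
    """Bipolar threshold: +1 for positive net input, otherwise -1."""
    return 1 if a > 0 else -1

def or_unit(x1, x2, w1=1.0, w2=1.0, bias=1.0):
    """Single unit with no hidden layer; the weights are hand-picked assumptions."""
    return sign(w1 * x1 + w2 * x2 + bias)

# Evaluate the unit on all four bipolar input patterns.
or_table = {(x1, x2): or_unit(x1, x2) for x1 in (1, -1) for x2 in (1, -1)}
for inputs, output in or_table.items():
    print(inputs, "->", output)
```

The line w1*x1 + w2*x2 + bias = 0 is the decision boundary: only the input (-1, -1) falls on its negative side.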

  5. Decision Boundaries
     • XOR is not linearly separable
     • Requires at least one hidden node
     XOR function:
       Input     Output
        1   1     -1
        1  -1      1
       -1   1      1
       -1  -1     -1
     [Figure: "+" and "-" points that no single line can separate; network with a bias node]
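One way to see that a single hidden node is enough for XOR is to let the hidden node detect the "both inputs on" case and have it veto the output. This is a sketch under stated assumptions: bipolar sign units, hand-picked weights, and direct input-to-output connections alongside the hidden node. None of these specifics come from the slides.

```python
def sign(a):
    """Bipolar threshold: +1 for positive net input, otherwise -1."""
    return 1 if a > 0 else -1

def xor_net(x1, x2):
    # Hidden node: fires (+1) only when both inputs are +1 (an AND detector).
    h = sign(x1 + x2 - 1)
    # Output node: sees both inputs directly plus the hidden node,
    # which has a strong inhibitory weight of -2 (assumed value).
    return sign(x1 + x2 - 2 * h - 1)

xor_table = {(x1, x2): xor_net(x1, x2) for x1 in (1, -1) for x2 in (1, -1)}
for inputs, output in xor_table.items():
    print(inputs, "->", output)
```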

  6. Hebbian Learning
     • Change weights based on correlation of connected neurons
     • Learning rules are local
     • Simple Hebb Rule:
     • Works best when relevance of inputs to outputs is independent
     • Simple Hebb Rule grows weights unbounded
     • Can be made incremental:
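The unbounded growth mentioned above can be illustrated with a small sketch. The transcript does not preserve the rule's formula, so this assumes one common textbook form, Δw_i = η · x_i · y, with a linear postsynaptic unit y = w · x; the learning rate and input values are arbitrary.

```python
def hebb_step(w, x, eta=0.1):
    """One Hebbian update: each weight grows with the product of
    presynaptic activity x_i and postsynaptic activity y (assumed form)."""
    y = sum(wi * xi for wi, xi in zip(w, x))  # linear postsynaptic activity
    return [wi + eta * xi * y for wi, xi in zip(w, x)]

w = [0.1, 0.1]
x = [1.0, 0.5]
norms = []
for _ in range(20):
    w = hebb_step(w, x)
    norms.append(sum(wi * wi for wi in w) ** 0.5)

# With the same input presented repeatedly, the weight norm keeps growing.
print([round(n, 3) for n in norms])
```

Each update uses only the activity of the two neurons that the weight connects, which is what makes the rule local.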

  7. More Complex Local Learning Rules
     • Hebbian Learning with a maximum magnitude:
       • Excitatory:
       • Inhibitory:
     • Second terms are decay terms: forgetting
       • Happens when presynaptic node does not affect postsynaptic node
     • Other rules are possible
     • Videos: watch the connections change

  8. Perceptron Learning
     • Will converge on correct weights
     • Single layer learning rule:
     • Rule is applied until boundary is learned
     [Figure: single-layer perceptron with a bias node]
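As an illustration of applying a single-layer rule until the boundary is learned, here is a minimal sketch trained on the bipolar OR function. The slide's formula is not preserved in the transcript, so the update Δw_i = α (t − y) x_i used here is an assumed, commonly taught form; the learning rate and epoch limit are arbitrary.

```python
def sign(a):
    """Bipolar threshold: +1 for positive net input, otherwise -1."""
    return 1 if a > 0 else -1

def train_perceptron(samples, alpha=0.1, max_epochs=100):
    """Repeat the error-driven update until every sample is classified correctly."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (x1, x2), t in samples:
            y = sign(w[0] * x1 + w[1] * x2 + b)
            if y != t:
                mistakes += 1
                w[0] += alpha * (t - y) * x1
                w[1] += alpha * (t - y) * x2
                b += alpha * (t - y)  # bias acts as a weight on a constant input of 1
        if mistakes == 0:
            break  # the boundary has been learned
    return w, b

or_samples = [((1, 1), 1), ((1, -1), 1), ((-1, 1), 1), ((-1, -1), -1)]
w, b = train_perceptron(or_samples)
predictions = [sign(w[0] * x1 + w[1] * x2 + b) for (x1, x2), _ in or_samples]
print(w, b, predictions)
```

Because OR is linearly separable, the loop stops after a finite number of updates; on XOR the same loop would run until `max_epochs` without reaching zero mistakes.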

  9. Backpropagation
     • Designed for at least one hidden layer
     • First, activation propagates to outputs
     • Then, errors are computed and assigned
     • Finally, weights are updated
     • Sigmoid is a common activation function
     [Figure: two-layer network with inputs X1, X2; hidden units z1, z2; outputs y1, y2; targets t1, t2; layer 1 weights v11, v12, v21, v22; layer 2 weights w11, w12, w21, w22]
     Legend: x's are inputs, z's are hidden units, y's are outputs, t's are targets, v's are layer 1 weights, w's are layer 2 weights

  10. Backpropagation Algorithm
     • Initialize weights
     • While stopping condition is false, for each training pair:
       • Compute outputs by forward activation
       • Backpropagate error:
         • For each output unit, error (target minus output times slope)
         • Weight correction (learning rate times error times hidden output)
         • Send error back to hidden units
         • Calculate error contribution for each hidden unit:
         • Weight correction
       • Adjust weights by adding weight corrections
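The steps above can be sketched for one training pair as follows. This is a minimal illustration, assuming sigmoid units (so the slope is y(1 − y)), a bias implemented as an extra constant input of 1, and arbitrary starting weights; the function names are hypothetical and the v/w naming follows the slide's legend.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, v, w):
    """Forward activation. v[j] holds hidden unit j's weights, w[k] output unit k's;
    the last entry of each weight list is the bias weight."""
    xb = x + [1.0]
    z = [sigmoid(sum(vj[i] * xb[i] for i in range(len(xb)))) for vj in v]
    zb = z + [1.0]
    y = [sigmoid(sum(wk[j] * zb[j] for j in range(len(zb)))) for wk in w]
    return xb, zb, y

def loss(x, t, v, w):
    _, _, y = forward(x, v, w)
    return 0.5 * sum((tk - yk) ** 2 for tk, yk in zip(t, y))

def backprop_step(x, t, v, w, alpha):
    xb, zb, y = forward(x, v, w)
    # Output error: (target minus output) times slope.
    d_out = [(t[k] - y[k]) * y[k] * (1 - y[k]) for k in range(len(y))]
    # Hidden error: error sent back through layer 2 weights, times hidden slope.
    d_hid = [sum(d_out[k] * w[k][j] for k in range(len(w))) * zb[j] * (1 - zb[j])
             for j in range(len(v))]
    # Weight corrections: learning rate times error times the unit's input.
    new_w = [[w[k][j] + alpha * d_out[k] * zb[j] for j in range(len(zb))]
             for k in range(len(w))]
    new_v = [[v[j][i] + alpha * d_hid[j] * xb[i] for i in range(len(xb))]
             for j in range(len(v))]
    return new_v, new_w

x, t = [1.0, 0.0], [1.0]
v = [[0.5, -0.4, 0.1], [0.3, 0.8, -0.2]]   # 2 hidden units (2 inputs + bias)
w = [[0.7, -0.6, 0.05]]                    # 1 output unit (2 hidden + bias)
before = loss(x, t, v, w)
new_v, new_w = backprop_step(x, t, v, w, alpha=0.1)
after = loss(x, t, new_v, new_w)
print(before, after)
```

A full training run would wrap `backprop_step` in the "while stopping condition is false, for each training pair" loop; only a single step is shown here.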

  11. Example Applications
     • Anything with a set of examples and known targets
       • XOR
       • Character recognition
       • NETtalk: reading English aloud
       • Failure prediction
     • Disadvantage: can become trapped in local optima

  12. Output Targets Often Not Available (Stone, Sutton, and Kuhlmann 2005)

  13. One Approach: Value Function Reinforcement Learning
     • Divide the world into states and actions
     • Assign values to states
     • Gradually learn the most promising states and actions
     [Figure: grid world with a Start cell and a Goal cell; the Goal cell has value 1 and all other cells have value 0]

  14. Learning to Navigate
     [Figure: four snapshots of the grid-world state values at T=1, T=56, T=350, and T=703, each with Start and Goal cells; nonzero values (0.5, 0.9, 1) spread outward from the Goal as learning proceeds]

  15. How to Update State/Action Values
     • Q learning rule:
     • Exploration increases Q-values' accuracy
     • The best actions to take in different states become known
     • Works only in Markovian domains
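A tabular sketch can make the update concrete. The rule's formula is not preserved in the transcript, so this assumes the standard form Q(s, a) ← Q(s, a) + α [r + γ max Q(s', ·) − Q(s, a)]. The five-state corridor, the reward of 1 at the goal, the purely random exploration policy, and the values of α and γ are all illustrative assumptions.

```python
import random

def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """Move Q[s][a] toward the reward plus the discounted best next value."""
    best_next = max(Q[s_next])
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

N_STATES, GOAL = 5, 4      # hypothetical corridor: states 0..4, goal at the right end
LEFT, RIGHT = 0, 1

def step(s, a):
    """Deterministic move; reaching the goal yields reward 1."""
    s_next = max(0, s - 1) if a == LEFT else s + 1
    reward = 1.0 if s_next == GOAL else 0.0
    return s_next, reward

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]
for _ in range(500):                       # episodes
    s = 0
    while s != GOAL:
        a = random.choice((LEFT, RIGHT))   # pure exploration
        s_next, r = step(s, a)
        q_update(Q, s, a, r, s_next)
        s = s_next

# Greedy action in each non-goal state after learning.
greedy = [max((LEFT, RIGHT), key=lambda a: Q[s][a]) for s in range(GOAL)]
print(greedy)
```

The next state and reward here depend only on the current state and action, which is the Markov property the last bullet refers to.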

  16. Backprop In RL
     • The state/action table can be estimated by a neural network
     • The target learned by the network is the Q-value:
     [Figure: neural network (NN) taking State_description and Action as inputs and producing Value as output]

  17. Next Week: Evolutionary Computation
     • EC does not require targets
     • EC can be a kind of RL
     • EC is policy search
     • EC is more than RL
     Reading for 9/11: Mitchell ch. 1 (pp. 1-31) and ch. 2 (pp. 35-80). Note that Section 2.3 is "Evolving Neural Networks".
