
Fuzzy Reinforcement Learning Agents

By Ritesh Kanetkar, Systems and Industrial Engineering, Lab Presentation, May 23, 2003.


Presentation Transcript


  1. Fuzzy Reinforcement Learning Agents • By Ritesh Kanetkar • Systems and Industrial Engineering • Lab Presentation • May 23, 2003

  2. What is an agent? • An agent is a computer system situated in some environment that is capable of autonomous action in this environment in order to meet its design objectives. • An autonomous agent should be able to act without the direct intervention of humans or other agents, and should have control over its own actions and internal state. COMPUTER INTEGRATED MANUFACTURING LAB Department of Systems and Industrial Engineering

  3. Why Agents? • Ability to act autonomously • Flexibility, scalability and modularity characteristics • Real-time performance • Suitability for distributed applications • Ability to work co-operatively in teams

  4. Learning in Agents • Supervised Learning • Neural Network • Unsupervised Learning • Reinforcement Learning

  5. Supervised vs. Unsupervised • Supervised Learning: learning under a skilled teacher; learning through presentation of input-output pairs; given a set of inputs, attempts to predict the output values • Unsupervised Learning: no supervisor present; the only data available is through feedback; learning through evaluation of actions

  6. Reinforcement Learning • Maps states to actions • Input is the current state S1 • Output is the selected action • The action changes the state to S2 • After the mapping is evaluated, a reinforcement signal is given to the agent

  7. Reinforcement Learning • Advantages: less environment-oriented programming; works in a changing environment • Problems: large number of possible states; considers only discrete events (real-world problems are continuous)

  8. How RL works? [Figure: a part moves from state S1 to S2 to S3. At each of the two stages, machines M1, M2 and M3 (actions a1, a2, a3 in the first stage and b1, b2, b3 in the second) have processing times T=30, T=20 and T=10 respectively, and each starts with utility R=0.5.]

  9. Continued • Aim 1: To find the shortest processing time. • Ideal actions: a3, then b3. • Assumptions: • The action with the highest utility is chosen. • Each machine bids for the part as per its utility value (initially all 0.5). • The winning machine gives a part of its utility to the previous winning agent for successfully creating the state for it.

  10. Continued (Rule for reward) • Rule for giving reward r to the previous winning agent, by time t (min): t = 10, r = 0.3; t = 20, r = 0.2; t = 30, r = 0.1 • Reward from state S0 and S1, say 0.25 for our model.
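
One possible reading of the reward rule on slides 9 and 10, as a minimal Python sketch. The slides do not say whose processing time t is looked up, how ties between equal bids are broken, or how the 0.25 reward enters, so the choices below (t is supplied by the caller, ties go to the first machine listed, the 0.25 reward is left out) and the function names `pick_winner` and `pay_previous_winner` are assumptions made for illustration only.

```python
# Reward table from slide 10: time t (min) -> reward r.
REWARD_BY_TIME = {10: 0.3, 20: 0.2, 30: 0.1}


def pick_winner(utilities):
    """Return the bidder with the highest utility.

    Ties go to the first bidder listed (an assumption, not from the slides).
    """
    return max(utilities, key=utilities.get)


def pay_previous_winner(utilities, payer, payee, t):
    """Move r(t) of utility from the current winner to the previous winner.

    Which time t to use is not specified on the slides; the caller decides.
    """
    r = REWARD_BY_TIME[t]
    updated = dict(utilities)  # leave the input untouched
    updated[payer] -= r
    updated[payee] += r
    return updated


# Example: b3 wins the second stage and pays a3, the previous winner,
# using t = 10 min (here taken to be a3's processing time).
utilities = {"a3": 0.5, "b3": 0.5}
after = pay_previous_winner(utilities, payer="b3", payee="a3", t=10)
print(after)
```

The transfer conserves the total utility, so repeated rounds only redistribute it among the bidders.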

  11. Continued (Calculations of utility value)

  12. Continued (Changes in utility value)

  13. Use of Fuzzy Logic • Fuzzy logic maps states (environment) to actions; the problem tackled is removing the restriction to discrete events. • Fuzzy logic integrates the multiple rewards into a single feedback signal. • Because the action space is large, traditional lookup tables cannot be used, so the mapping must be generalized. • Incorporation of human language.

  14. Problems • Agents as dynamical systems interacting with the environment • Network of agents (multi-agent system) • Multiple-reward systems • Multiple-criteria systems • Continuous-event systems • Large state space in real-world problems • Bargaining problems

  15. Fuzzy Inference System (FIS)

  16. FIS • The FIS rule base is made of N rules: $R_i$: if $s_1$ is $L_1^i$ and … and $s_{N_I}$ is $L_{N_I}^i$ then $y_1$ is $O_1^i$ and … and $y_{N_O}$ is $O_{N_O}^i$, where $s$ = input vector, $R_i$ = $i$-th rule, $L_j^i$ = fuzzy label of input $j$ in rule $i$, $y$ = output vector.

  17. Fuzzy Inference System (FIS) • Layer 1: Input layer. Defines the input variables needed to describe the states completely. • Layer 2: Linguistic labels. This layer does the fuzzification. • Layer 3: Rules. This layer defines the if-then rules and gives the rule truth values. • Layer 4: Output layer. Gives the FIS output.

  18. Assumptions • The number of input variables and fuzzy labels is selected depending on the problem. • The number of rules is determined by the number of elements in the first two layers (the product of the number of labels for each input variable). • Each rule has a predefined number of outputs. • So the only difficult part left is determining the conclusion for every possible combination (rule conclusions).
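
The rule count on slide 18 can be checked with a few lines of Python. The input variables and label names below are hypothetical; only the counting rule (one rule per combination of labels) comes from the slide.

```python
from itertools import product
from math import prod

# Hypothetical input variables and their fuzzy labels.
labels = {
    "slack": ["low", "medium", "high"],
    "queue_length": ["short", "long"],
}

# Number of rules = product of the number of labels per input variable.
n_rules = prod(len(names) for names in labels.values())

# One rule antecedent per combination of labels.
antecedents = list(product(*labels.values()))

print(n_rules)
for combo in antecedents:
    print(dict(zip(labels, combo)))
```

Adding a third input with four labels would multiply the rule count by four, which is why the rule conclusions, not the antecedents, are the part left to learn.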

  19. What it does? • Maps states to actions. • Rules can be formulated in human language. • Each rule contains: a value Vi to approximate the optimal evaluation function; an action set Ui; a parameter vector wi giving the weight of the different actions in the rule, to approximate the policy. • The final output is the weighted average of all the actions.
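
A minimal sketch of the weighted-average output described on slides 17 and 19, under simplifying assumptions: a single input normalised to [0, 1], triangular membership functions, and one fixed action per rule instead of a weighted action set. The labels, breakpoints and action values are hypothetical.

```python
def triangle(x, left, peak, right):
    """Triangular membership degree of x for a label (left, peak, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Layer 2: hypothetical linguistic labels for one input in [0, 1].
LABELS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}

# Layer 3 conclusions: one hypothetical action value per rule.
ACTIONS = {"low": 0.0, "medium": 5.0, "high": 10.0}


def fis_output(x):
    """Layer 4: average of the rule actions, weighted by rule truth values."""
    truth = {name: triangle(x, *abc) for name, abc in LABELS.items()}
    total = sum(truth.values())
    if total == 0.0:
        raise ValueError("input is outside the range covered by the labels")
    return sum(truth[name] * ACTIONS[name] for name in truth) / total


print(fis_output(0.25))
```

Because neighbouring labels overlap, the output changes smoothly as the input moves between labels, which is what lets the FIS handle continuous states without a lookup table.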

  20. FIS Output [Figure; surviving labels: primary reinforcement, internal reinforcement through critic]

  21. Procedure • Estimate the evaluation function corresponding to the current state: $V_t(S_{t+1}) = v_t \cdot \Phi_{t+1}$. • Compute the TD error $\epsilon_{t+1}$. • Tune the parameters $v$ and $w$. • Estimate the new evaluation function with the new conclusion vector $v_{t+1}$. • Update the learning rate. • Compute and trigger the global action $U_{t+1}$.
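
The first four steps of the procedure can be sketched with a plain TD(0) update on a linear evaluation function. This is the textbook update, swapped in for illustration; it is not necessarily the exact rule used in the presentation, and it omits the tuning of w, the learning-rate update and the action selection. The discount factor `gamma`, step size `beta` and the sample vectors are assumptions.

```python
def evaluate(v, phi):
    """Evaluation function V(S) = v . phi(S), with phi the rule truth values."""
    return sum(vi * pi for vi, pi in zip(v, phi))


def td_step(v, phi_t, phi_next, reward, gamma=0.9, beta=0.1):
    """One TD(0) update of the conclusion vector v.

    gamma (discount) and beta (step size) are assumed values.
    """
    # TD error: reward plus discounted next value minus current value.
    td_error = reward + gamma * evaluate(v, phi_next) - evaluate(v, phi_t)
    # Move each conclusion in proportion to how strongly its rule fired.
    new_v = [vi + beta * td_error * pi for vi, pi in zip(v, phi_t)]
    return new_v, td_error


v = [0.0, 0.0, 0.0]           # one conclusion value per rule
phi_t = [0.5, 0.5, 0.0]       # hypothetical truth values in the current state
phi_next = [0.0, 0.5, 0.5]    # hypothetical truth values in the next state
v, td_error = td_step(v, phi_t, phi_next, reward=1.0)
print(td_error, v)
```

Only the rules that fired in the current state have their conclusions changed, so experience in one state generalises to neighbouring states that share those rules.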

  22. Problem • Single-machine scheduling problem • 3 parts • Each part has its own earliness-tardiness penalties, due date and processing time • 19 time slots on the machine • Minimize the deviation from due dates, reducing the penalties
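
To make the objective concrete, the sketch below scores schedules for three parts on a 19-slot machine and finds the cheapest one by brute force. The part data are invented for illustration, and the linear per-slot penalty and the allowance of idle time between parts are assumptions; the slides give only the problem outline.

```python
from itertools import permutations

HORIZON = 19  # time slots on the machine (slide 22)

# Hypothetical part data: processing time, due date, penalty per slot early/late.
PARTS = {
    "P1": {"p": 4, "due": 6, "early": 1.0, "tardy": 3.0},
    "P2": {"p": 5, "due": 12, "early": 2.0, "tardy": 2.0},
    "P3": {"p": 3, "due": 17, "early": 1.0, "tardy": 4.0},
}


def penalty(part, finish):
    """Earliness-tardiness penalty of a part finishing at slot `finish`."""
    d = PARTS[part]
    early = max(0, d["due"] - finish)
    tardy = max(0, finish - d["due"])
    return d["early"] * early + d["tardy"] * tardy


def schedules(order, start=0):
    """Yield every feasible [(part, finish), ...] for a fixed order, idle time allowed."""
    if not order:
        yield []
        return
    head, rest = order[0], order[1:]
    p = PARTS[head]["p"]
    remaining = sum(PARTS[x]["p"] for x in rest)  # slots still needed afterwards
    for s in range(start, HORIZON - p - remaining + 1):
        for tail in schedules(rest, s + p):
            yield [(head, s + p)] + tail


def best_schedule():
    """Return (cost, schedule) with the lowest total penalty."""
    best = None
    for order in permutations(PARTS):
        for sched in schedules(order):
            cost = sum(penalty(part, finish) for part, finish in sched)
            if best is None or cost < best[0]:
                best = (cost, sched)
    return best


cost, sched = best_schedule()
print(cost, sched)
```

Exhaustive search is only practical at this toy size; it serves here as a reference against which a learned policy could be compared.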

  23. Work in progress • Currently working on the single-machine scheduling problem with earliness/tardiness penalties and due dates. • Identifying the various parameters. • Understanding the mathematics behind the FIS. • Incorporating a bargaining model in the FIS.

  24. References • Fuzzy Inference System Learning by Reinforcement Methods, Lionel Jouffe (IEEE) • Dynamic single machine scheduling under distributed decision making, Pooja Dewan, Sanjay Joshi (IJPR) • Evolutionary Learning agents for shop floor control, Bruno Maione, David Naso (IEEE) • A fuzzy logic based methodology to rank shop floor dispatching rules, Albert Petroni (IJPE) • Multi Agent Reinforcement Learning with bidding for automatic segmentation of action sequence, Ron Sun (IEEE) • AI depot, http://ai-depot.com/ (for RL) • Reinforcement Learning: An Introduction (Sutton and Barto) • Matlab fuzzy logic toolbox tutorials

  25. Thank You
