Learning Bayesian Network using Genetic Algorithms

    Presentation Transcript
    1. Learning Bayesian Network using Genetic Algorithms Dhirubhai Ambani Institute of Information & Communication Technology Mentor Prof Suman Mitra DA-IICT, Gujarat 200701195 Ashish Kalya

    2. Introduction to Bayesian Network • A DAG represents the Bayesian structure • Conditional probability distributions form the Bayesian parameters Image source: 1 May 2011, "bayesiangraph.png", http://www.ra.cs.uni-tuebingen.de/software/JCell/images/docbook/bayesianGraph.png

    3. Two Approaches for Learning Bayesian Structure • Constraint-based methods: find a Bayesian network structure whose conditional independence constraints match those found in the data. • Heuristic search methods: traverse the search space heuristically, looking for high-scoring structures, to find the DAG that best explains the data (i.e. could have generated the data). Example: K2 algorithm

    4. The Need for Heuristic Search Algorithms Ideally we would search the space of all DAGs exhaustively and find the DAG that maximizes the Bayesian scoring criterion. However, the number of DAGs grows so quickly that this becomes infeasible even for a modest number of nodes [6]:
       Number of nodes   Number of DAGs
       0                 1
       1                 1
       2                 3
       3                 25
       4                 543
       5                 29281
       6                 3781503
       7                 1138779265
       8                 783702329343
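
The counts in the table can be reproduced with the standard recurrence for labelled DAGs (usually attributed to Robinson), a(n) = sum over k = 1..n of (-1)^(k+1) * C(n, k) * 2^(k(n-k)) * a(n-k), with a(0) = 1. The sketch below is only an illustration; the function name `count_dags` is ours and does not come from the slides.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labelled DAGs on n nodes (inclusion-exclusion recurrence)."""
    if n == 0:
        return 1
    total = 0
    for k in range(1, n + 1):
        # k = number of nodes chosen to have no incoming edges
        sign = 1 if k % 2 == 1 else -1
        total += sign * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
    return total


if __name__ == "__main__":
    for n in range(9):
        print(n, count_dags(n))
```

Each added node multiplies the count by several orders of magnitude, which is why exhaustive search stops being practical so early.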

    5. Genetic Algorithms (GAs): Overview • Inspired by the biological evolution process • Encoding: each individual is coded as a string of finite length called a chromosome, generally a binary string • Fitness function: assigns a fitness value to each individual, i.e. it maps the space of strings to the set of rational numbers

    6. Components of GAs • Selection: the probability of selecting an individual as a parent is inversely proportional to its fitness value • Crossover: the strings (chromosomes) of two parents are randomly mixed to form new offspring. Example: parents 00000000 and 11111111 give offspring 11100000 and 00011111 • Mutation: randomly changes a string (chromosome). Example: parent 00000000 gives offspring 00100000
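
The crossover and mutation operators described on this slide can be sketched as follows. This is a minimal illustration on binary strings, not the author's implementation; the function names, the explicit `point` argument and the per-bit mutation `rate` are assumptions made for the example.

```python
import random


def one_point_crossover(p1: str, p2: str, point: int) -> tuple[str, str]:
    """Swap the tails of two equal-length chromosomes after position `point`."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]


def mutate(chrom: str, rate: float, rng: random.Random) -> str:
    """Flip each bit independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if rng.random() < rate else b for b in chrom)


if __name__ == "__main__":
    # Cutting the slide's two parents after the third bit gives the slide's offspring.
    print(one_point_crossover("00000000", "11111111", 3))
    print(mutate("00000000", 0.1, random.Random(0)))
```

In a full GA the crossover point would normally be drawn at random; it is passed in here so the slide's example can be reproduced exactly.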

    7. The Evolutionary Cycle [Figure: initiate & evaluate population -> selection -> parents -> crossover & mutation -> modified offspring -> evaluate -> evaluated offspring -> population. Figure label: Introduction to Genetic Algorithms]

    8. Related Work • A genetic algorithm based on the score-based greedy approach was proposed in [2]. • Semantic crossover and mutation operators were introduced in [3]. • Encoding of individuals using a dual chromosome was introduced in [7]; we used that approach to generate our initial population.

    9. Scoring Metric == Fitness Function • Maximum Likelihood Estimate (MLE) • Bayesian Information Criterion (BIC) • BIC penalizes network complexity, since simpler networks are desirable [1]
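
For reference, one common form of the BIC score for a structure G given data D is shown below. The slides do not state the exact formula used, so the notation is an assumption: \hat{\theta}_G are the maximum likelihood parameters for G, d_G is the number of free parameters in G, and N is the number of cases. Sign and scaling conventions vary between texts.

```latex
\mathrm{BIC}(G \mid D) \;=\; \log P\bigl(D \mid \hat{\theta}_G, G\bigr) \;-\; \frac{d_G}{2}\,\log N
```

The first term is the maximized log-likelihood (the MLE score); the second term is the complexity penalty, which grows with the number of parameters and with the logarithm of the sample size.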

    10. Simulation Details • Algorithm: GA with elitist model • Encoding: a DAG is represented by the string of its adjacency-matrix entries A11 A12 ... ANN, where A is the N x N adjacency matrix • Fitness function: -1 * Bayesian Information Criterion metric • One-point crossover with crossover rate = 0.9 • Mutation rate = 0.01, 0.1 and a variable rate (see fig.)
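
The adjacency-matrix encoding can be illustrated with a small sketch. The transcript does not make clear in which order the matrix entries are concatenated, so row-by-row order is assumed here; `encode` and `decode` are hypothetical helper names.

```python
def encode(adj: list[list[int]]) -> str:
    """Flatten a 0/1 adjacency matrix row by row into a chromosome string."""
    return "".join(str(bit) for row in adj for bit in row)


def decode(chrom: str, n: int) -> list[list[int]]:
    """Rebuild the n x n adjacency matrix from a row-by-row chromosome."""
    if len(chrom) != n * n:
        raise ValueError("chromosome length must be n * n")
    return [[int(chrom[i * n + j]) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Edges 0->1, 0->2 and 1->2 on three nodes.
    adj = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
    chrom = encode(adj)
    print(chrom)
    print(decode(chrom, 3) == adj)
```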

    11. Simulation Details • Initial population: a set of random DAGs • How to generate random DAGs: • A graph whose adjacency matrix is strictly upper triangular is always acyclic • Permute the order of the nodes and then rearrange the matrix correspondingly • Example: permute (1 2 3) = (1 3 2), so the rows and columns labelled 1 2 3 are reordered as 1 3 2
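
The two-step recipe above (fill a strictly upper-triangular matrix, then permute the node order) can be sketched as follows. The edge probability `edge_prob` and the function names are assumptions for illustration; the slides do not say how edge density was chosen. `is_acyclic` is only a sanity check based on repeatedly removing nodes with no incoming edges.

```python
import random


def random_dag(n: int, edge_prob: float, rng: random.Random) -> list[list[int]]:
    """Random DAG: strictly upper-triangular matrix with shuffled node labels."""
    upper = [[1 if j > i and rng.random() < edge_prob else 0 for j in range(n)]
             for i in range(n)]
    perm = list(range(n))
    rng.shuffle(perm)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Relabel node i as perm[i]; edges are kept, only the names change.
            adj[perm[i]][perm[j]] = upper[i][j]
    return adj


def is_acyclic(adj: list[list[int]]) -> bool:
    """True if every node can be removed in topological order."""
    n = len(adj)
    indegree = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    ready = [v for v in range(n) if indegree[v] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in range(n):
            if adj[u][v]:
                indegree[v] -= 1
                if indegree[v] == 0:
                    ready.append(v)
    return removed == n


if __name__ == "__main__":
    dag = random_dag(8, 0.3, random.Random(0))
    print(is_acyclic(dag))
```

Relabelling nodes cannot create a cycle, so every matrix produced this way is a valid DAG even though it is generally no longer upper triangular.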

    12. ASIA Network Structure “A very small belief network for a fictitious medical example about whether a patient has tuberculosis, lung cancer or bronchitis, related to their X-ray, dyspnea, visit-to-Asia and smoking status.” [8] Image source: 1 May 2011, "asia.png", http://www.stanford.edu/class/cs221/project2_files/asia.png

    13. Simulation Details • Algorithms simulated for 500, 1000, 2000 and 5000 cases • Number of generations considered: 50, 100, 150 and 250 • Population sizes considered: 10, 20, 50 and 100

    14. Issues faced • Both crossover and mutation operators generate individuals which are not DAGs. • Need to find cyclic directed graphs and remove cycles

    15. Modified GA with elitist model • A simple directed graph G has a directed cycle if and only if there is a back edge in DFS(G) [5] • Once all new individuals have been generated, we check each one for back edges and remove any that are found. [Figure: the evolutionary cycle with an added "remove cycle" step: initiate & evaluate population -> selection -> parents -> crossover & mutation -> remove cycle -> evaluate -> evaluated offspring -> population]
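
The back-edge repair step can be sketched with a three-colour depth-first search: an edge that points to a node still on the current DFS path is a back edge and is deleted. This is an illustrative sketch rather than the author's code; `remove_back_edges` is a hypothetical name, and which edges get removed depends on the order in which nodes are visited.

```python
def remove_back_edges(adj: list[list[int]]) -> list[list[int]]:
    """Return a copy of adj with every DFS back edge deleted (result is a DAG)."""
    n = len(adj)
    result = [row[:] for row in adj]
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current DFS path, finished
    colour = [WHITE] * n

    def visit(u: int) -> None:
        colour[u] = GREY
        for v in range(n):
            if result[u][v]:
                if colour[v] == GREY:
                    result[u][v] = 0  # back edge (including self-loops): drop it
                elif colour[v] == WHITE:
                    visit(v)
        colour[u] = BLACK

    for start in range(n):
        if colour[start] == WHITE:
            visit(start)
    return result


if __name__ == "__main__":
    cyclic = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]  # 0 -> 1 -> 2 -> 0
    print(remove_back_edges(cyclic))
```

Because the deleted edge is simply whichever one closes the cycle during the traversal, the repair is uninformed.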

    16. Observations and Analysis (1) Observed values: • (AF) average, over 10 runs, of the fitness value of the best individual of each run of the GA • (AH) average, over 10 runs, of the Hamming distance between the string representing the best individual of each run of the GA and the string representing the original structure • Example: the Hamming distance between 0000011 and 0010010 is 2
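
The Hamming distance used for AH counts the positions at which two equal-length strings differ. A minimal sketch, with `hamming` as an illustrative name:

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    print(hamming("0000011", "0010010"))  # the slide's example pair
```

Applied to encoded DAGs, each differing position corresponds to one entry of the adjacency matrix that is present in one structure and absent in the other.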

    17. Observations and Analysis (2) • For 500 and 1000 cases, AF values lower than that of the original network were reached quite frequently. • For larger numbers of cases, a larger population size gives better results, but for 500 cases no such difference was observed between population sizes of 50 and 100. • A smaller data size reduces the impact of population size and number of generations.

    18. Observations and Analysis (3) • Runs with very close or identical values of AF can have very different values of AH. • For the normal GA, low and variable mutation rates gave comparatively better results. For the modified GA, the high mutation rate clearly gave better results. • The modified GA performed better.

    19. Conclusions • A fraction of less than 0.00000032 of the search space was explored. • Results for AH using the modified GA are comparable with those obtained from the K2 algorithm for the ASIA network [9]. • GAs make sense because approximate answers are acceptable, especially when the number of cases is not large.
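
The slides do not show how the fraction 0.00000032 was obtained. One reading that matches it, offered purely as an assumption, is that the largest setting (population 100, 250 generations, 10 runs) is compared against the number of DAGs on the 8 nodes of the ASIA network:

```python
# Assumed reading: every individual evaluated is a distinct DAG (an upper bound).
population_size = 100
generations = 250
runs = 10
dags_on_8_nodes = 783702329343  # from the table on slide 4

evaluated = population_size * generations * runs
fraction = evaluated / dags_on_8_nodes

if __name__ == "__main__":
    print(evaluated, fraction)
```

Under that assumption the ratio comes out just under 0.00000032.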

    20. Future Work • Remove cycles by making an informed choice about which edge to remove • Carry out these simulations on larger and more complex networks

    21. Acknowledgements • I would like to thank Prof. Suman Mitra for the initial conceptualization of the idea. His regular inputs and the study material he provided were of great help.

    22. References
       • [1] R. E. Neapolitan, Learning Bayesian Networks, Prentice Hall Series in Artificial Intelligence, Prentice Hall, December 2000.
       • [2] P. Larrañaga, M. Poza, Y. Yurramendi, R. H. Murga, and C. M. H. Kuijpers, “Structure learning of Bayesian networks by genetic algorithms: a performance analysis of control parameters,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 18, no. 9, 1996.
       • [3] S. Shetty and M. Song, “Structure learning of Bayesian networks using a semantic genetic algorithm-based approach,” in Third International Conference on Information Technology: Research and Education (ITRE 2005), 2005, pp. 454–458.
       • [4] R. Etxeberria, P. Larrañaga, and J. M. Pikaza, “Analysis of the behaviour of the genetic algorithms when searching Bayesian networks from data,” Pattern Recognition Letters, vol. 18, no. 11-13, pp. 1269-1273, 1997.
       • [5] J. Bang-Jensen and G. Z. Gutin, Digraphs: Theory, Algorithms and Applications, 2nd ed., Springer, 2010.
       • [6] B. D. McKay, G. F. Royle, I. M. Wanless, F. E. Oggier, N. J. A. Sloane, and H. Wilf, “Acyclic digraphs and eigenvalues of (0,1)-matrices,” Journal of Integer Sequences 7, Article 04.3.3, 2004, http://www.cs.uwaterloo.ca/journals/JIS/VOL7/Sloane/sloane15.html
       • [7] J. Lee, W. Chung, and E. Kim, “Structure Learning of Bayesian Networks Using Dual Genetic Algorithm,” IEICE Trans. Inf. & Syst., 2007.
       • [8] “Norsys Bayes Net Library” [online], accessed 22 April 2011, http://www.norsys.com/networklibrary.html#
       • [9] K. P. Murphy, Bayes Net Toolbox, Technical Report, MIT Artificial Intelligence Laboratory, 2002.

    23. Thank You