**Learning Bayesian Networks using Genetic Algorithms**
Ashish Kalya (200701195)
Mentor: Prof. Suman Mitra
Dhirubhai Ambani Institute of Information & Communication Technology (DA-IICT), Gujarat

**Introduction to Bayesian Networks** • A directed acyclic graph (DAG) represents the Bayesian network structure • Conditional probability distributions form the Bayesian network parameters Image source: "bayesianGraph.png", retrieved 1 May 2011, http://www.ra.cs.uni-tuebingen.de/software/JCell/images/docbook/bayesianGraph.png

**Two Approaches for Learning Bayesian Structure** • Constraint-based: finds a Bayesian network structure whose conditional independence constraints match those found in the data. • Heuristic search: traverses the search space heuristically, looking for high-scoring structures, to find the DAG that best explains the data (i.e. could have generated the data). • Example: the K2 algorithm

**The Need for Heuristic Search Algorithms** Ideally we would search the space of all DAGs exhaustively and find the DAG which maximizes the Bayesian scoring criterion. However, for a large (not that large!) number of nodes this becomes infeasible [6]:

| Number of nodes | Number of DAGs |
|---|---|
| 0 | 1 |
| 1 | 1 |
| 2 | 3 |
| 3 | 25 |
| 4 | 543 |
| 5 | 29281 |
| 6 | 3781503 |
| 7 | 1138779265 |
| 8 | 783702329343 |
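The counts in the table follow Robinson's recurrence for labelled DAGs, a(n) = Σ_{k=1}^{n} (−1)^{k+1} C(n,k) 2^{k(n−k)} a(n−k) with a(0) = 1 (the recurrence behind the sequence cited in [6]). A short script, illustrative and not from the original slides, reproduces the table:

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def num_dags(n):
    """Count labelled DAGs on n nodes via Robinson's recurrence."""
    if n == 0:
        return 1
    # Inclusion-exclusion over the k nodes with no incoming edges.
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * num_dags(n - k)
               for k in range(1, n + 1))


for n in range(9):
    print(n, num_dags(n))  # matches the table: 1, 1, 3, 25, 543, ...
```

The super-exponential growth is visible immediately, which is why exhaustive search is abandoned in favour of heuristics.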

**Genetic Algorithms (GAs): Overview** • Inspired by the biological evolution process • Encoding: each individual is coded as a string of a certain finite length, called a chromosome, generally a binary string • Fitness function: maps the space of strings to the rational numbers, assigning each individual a fitness value

**Components of GAs** • Selection: an individual is chosen as a parent with probability proportional to its fitness value • Crossover: two parent strings (chromosomes) are randomly mixed to form new offspring, e.g. parents 00000000 and 11111111 produce offspring 11100000 and 00011111 • Mutation: randomly flips bits of a string (chromosome), e.g. 00000000 becomes 00100000
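The crossover and mutation examples above can be sketched on bit strings (a minimal illustration; the function names are my own, not from the slides):

```python
import random


def one_point_crossover(p1, p2, rng):
    """Swap the tails of two equal-length bit strings at a random cut point."""
    point = rng.randrange(1, len(p1))
    return p1[:point] + p2[point:], p2[:point] + p1[point:]


def mutate(chrom, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return ''.join(b if rng.random() > rate else str(1 - int(b)) for b in chrom)


rng = random.Random(0)
c1, c2 = one_point_crossover('00000000', '11111111', rng)
child = mutate('00000000', 0.01, rng)
```

With all-zero and all-one parents, every offspring of one-point crossover is a block of one symbol followed by a block of the other, exactly as in the slide's example.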

**The Evolutionary Cycle** • Initiate and evaluate the population • Selection of parents • Crossover & mutation produce modified offspring • Evaluate the offspring and form the next population; repeat
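The cycle above can be sketched in code. This is an assumed minimal generational GA with elitism, using truncation selection and the OneMax toy fitness (count of 1-bits) rather than the BIC metric used later in the slides:

```python
import random


def run_ga(fitness, chrom_len, pop_size=20, generations=50,
           crossover_rate=0.9, mutation_rate=0.01, seed=0):
    """Minimal generational GA with elitism over binary chromosomes."""
    rng = random.Random(seed)
    pop = [''.join(rng.choice('01') for _ in range(chrom_len))
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        next_pop = [scored[0]]                 # elitist model: keep the best
        while len(next_pop) < pop_size:
            p1, p2 = rng.sample(scored[:pop_size // 2], 2)  # truncation selection
            if rng.random() < crossover_rate:  # one-point crossover
                cut = rng.randrange(1, chrom_len)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1
            child = ''.join(b if rng.random() > mutation_rate else str(1 - int(b))
                            for b in child)    # bit-flip mutation
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)


best = run_ga(lambda c: c.count('1'), chrom_len=16)
```

Swapping the toy fitness for a network-scoring metric, and the raw bit string for an adjacency-matrix encoding, yields the structure-learning GA described in the following slides.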

**Related Work** • A genetic algorithm built on the score-based greedy search approach was proposed in [2]. • Semantic crossover and mutation operators were introduced in [3]. • Encoding of individuals using a dual chromosome was proposed in [7]; we have used that approach for generating our initial population.

**Scoring Metric == Fitness Function** • Maximum Likelihood Estimate (MLE) • Bayesian Information Criterion (BIC) • BIC penalizes network complexity, since simpler networks are desirable [1]
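The complexity penalty can be illustrated directly. Assuming the common form BIC = log-likelihood − (d/2)·log N, with d free parameters and N cases (higher is better; the numbers below are made up for illustration), a sparser network can beat a denser one that fits the data slightly better:

```python
import math


def bic_score(log_likelihood, num_params, num_cases):
    """BIC = log-likelihood minus the complexity penalty (d/2) * log N.

    Higher is better; networks with more parameters pay a larger penalty.
    """
    return log_likelihood - 0.5 * num_params * math.log(num_cases)


# Hypothetical numbers: the dense network fits better (-1040 > -1050)
# but carries four times the parameters.
sparse = bic_score(log_likelihood=-1050.0, num_params=10, num_cases=500)
dense = bic_score(log_likelihood=-1040.0, num_params=40, num_cases=500)
```

Here `sparse > dense`: the 10 extra log-likelihood points do not pay for 30 extra parameters at N = 500, which is exactly the preference for simple networks the slide describes.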

**Simulation Details** • Algorithm: GA with elitist model • Encoding: a DAG is represented by the string A11 A12 … A1N A21 A22 … A2N … ANN, where A is its adjacency matrix • Fitness function: -1 × Bayesian Information Criterion metric • One-point crossover with crossover rate = 0.9 • Mutation rate = 0.01, 0.1 and a variable rate (see fig.)
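The encoding amounts to flattening the adjacency matrix row by row into a chromosome, and reading it back for evaluation (a sketch; the helper names are my own):

```python
def encode(adj):
    """Flatten an n x n adjacency matrix row by row into a bit-string chromosome."""
    n = len(adj)
    return ''.join(str(adj[i][j]) for i in range(n) for j in range(n))


def decode(chrom, n):
    """Rebuild the n x n adjacency matrix from a chromosome of length n*n."""
    return [[int(chrom[i * n + j]) for j in range(n)] for i in range(n)]


# A 3-node chain 1 -> 2 -> 3 plus the edge 1 -> 3:
adj = [[0, 1, 1],
       [0, 0, 1],
       [0, 0, 0]]
chrom = encode(adj)  # '011001000'
```

Standard string crossover and bit-flip mutation then operate directly on `chrom`.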

**Simulation Details** • Initial population: a set of random DAGs • How do we generate random DAGs? • A strictly upper-triangular adjacency matrix is always acyclic • Permute the order of the nodes and rearrange the rows and columns of the matrix correspondingly • Example: the permutation (1 2 3) → (1 3 2) relabels the chain 1 → 2 → 3 as 1 → 3 → 2
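The construction above, as a sketch (helper names are my own; the acyclicity check uses Kahn's peeling of zero-in-degree nodes rather than DFS, purely for verification):

```python
import random


def random_dag(n, edge_prob=0.5, seed=None):
    """Random strictly upper-triangular matrix (always acyclic),
    then relabel the nodes with a random permutation."""
    rng = random.Random(seed)
    upper = [[1 if j > i and rng.random() < edge_prob else 0
              for j in range(n)] for i in range(n)]
    perm = list(range(n))
    rng.shuffle(perm)
    # Rearranging rows and columns by the same permutation is a relabelling,
    # so the result stays acyclic.
    return [[upper[perm[i]][perm[j]] for j in range(n)] for i in range(n)]


def is_acyclic(adj):
    """Check acyclicity by repeatedly removing zero-in-degree nodes."""
    n = len(adj)
    indeg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    queue = [v for v in range(n) if indeg[v] == 0]
    seen = 0
    while queue:
        v = queue.pop()
        seen += 1
        for w in range(n):
            if adj[v][w]:
                indeg[w] -= 1
                if indeg[w] == 0:
                    queue.append(w)
    return seen == n  # all nodes removable iff no directed cycle
```

Without the permutation step, every sampled DAG would respect the same node ordering, biasing the initial population; the shuffle spreads the population over orderings.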

**ASIA Network Structure** “A very small belief network for a fictitious medical example about whether a patient has tuberculosis, lung cancer or bronchitis, related to their X-ray, dyspnea, visit-to-Asia and smoking status.” [8] Image source: "asia.png", retrieved 1 May 2011, http://www.stanford.edu/class/cs221/project2_files/asia.png

**Simulation Details** • Algorithms simulated for 500, 1000, 2000 and 5000 cases • Number of generations considered: 50, 100, 150 and 250 • Population sizes considered: 10, 20, 50 and 100

**Issues Faced** • Both the crossover and mutation operators can generate individuals that are not DAGs. • We need to detect cyclic directed graphs and remove their cycles.

**Modified GA with Elitist Model** • A simple directed graph G has a directed cycle if and only if there is a back edge in DFS(G) [5] • Once all new individuals have been generated, we check whether any back edge is present and, if found, remove it. • Cycle: initiate & evaluate population → select parents → crossover & mutation → remove cycles → evaluate offspring; repeat
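A sketch of the repair step (assumed recursive DFS with the usual white/gray/black colouring; an edge into a gray vertex is a back edge, i.e. it closes a cycle, so it is deleted on the spot):

```python
def remove_back_edges(adj):
    """DFS over a directed graph given as an adjacency matrix; delete every
    back edge (an edge to a vertex still on the recursion stack).

    Removing exactly the back edges of one DFS leaves the graph acyclic.
    """
    n = len(adj)
    WHITE, GRAY, BLACK = 0, 1, 2
    colour = [WHITE] * n

    def dfs(u):
        colour[u] = GRAY
        for v in range(n):
            if adj[u][v]:
                if colour[v] == GRAY:   # back edge: v is an ancestor of u
                    adj[u][v] = 0       # deleting it breaks the cycle
                elif colour[v] == WHITE:
                    dfs(v)
        colour[u] = BLACK

    for u in range(n):
        if colour[u] == WHITE:
            dfs(u)
    return adj


# The 3-cycle 0 -> 1 -> 2 -> 0 loses its closing edge 2 -> 0:
repaired = remove_back_edges([[0, 1, 0],
                              [0, 0, 1],
                              [1, 0, 0]])
```

Note that which edge gets removed depends on the DFS start order; the Future Work slide proposes making this choice in an informed way instead.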

**Observations and Analysis (1)** Observed values: • (AF) average, over 10 runs of the GA, of the fitness value of the best individual of each run • (AH) average, over 10 runs, of the Hamming distance between the string representing the best individual of each run and the string representing the original structure • Example: the Hamming distance between 0000011 and 0010010 is 2
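Hamming distance as in the example (a one-line sketch):

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


d = hamming('0000011', '0010010')  # the two strings differ at positions 3 and 7
```

On the chromosome encoding, AH counts edge insertions, deletions, and one-sided reversals between the learned and the true adjacency matrices.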

**Observations and Analysis (2)** • For 500 and 1000 cases, AF values lower than that of the original network were reached quite frequently. • For larger numbers of cases, a larger population size gives better results, but for 500 cases no such difference was observed between population sizes of 50 and 100. • A smaller data size reduces the impact of population size and number of generations.

**Observations and Analysis (3)** • For very close or similar values of AF, we observed very different values of AH. • For the normal GA, low and variable mutation rates gave comparatively better results; for the modified GA, a high mutation rate gave clearly better results. • The modified GA performed better overall.

**Conclusions** • Less than a 0.00000032 fraction of the search space was explored. • Results for AH using the modified GA are comparable with those obtained from the K2 algorithm for the ASIA network [9]. • GAs make sense because approximate answers are acceptable, especially when the number of cases is not large.

**Future Work** • Remove cycles by making an informed choice about which edge to remove • Carry out these simulations with larger and more complex networks

**Acknowledgements** • I would like to thank Prof. Suman Mitra for the initial conceptualization of the idea. His regular inputs and the study material he provided were of great help.

**References** • [1] R. E. Neapolitan, Learning Bayesian Networks, Prentice Hall Series in Artificial Intelligence, Prentice Hall, December 2000. • [2] P. Larrañaga, M. Poza, Y. Yurramendi, R. H. Murga, and C. M. H. Kuijpers, “Structure learning of Bayesian networks by genetic algorithms: a performance analysis of control parameters,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 18, no. 9, 1996. • [3] S. Shetty and M. Song, “Structure learning of Bayesian networks using a semantic genetic algorithm-based approach,” in Third International Conference on Information Technology: Research and Education (ITRE 2005), 2005, pp. 454–458. • [4] R. Etxeberria, P. Larrañaga, and J. M. Pikaza, “Analysis of the behaviour of the genetic algorithms when searching Bayesian networks from data,” Pattern Recognition Letters, vol. 18, no. 11–13, pp. 1269–1273, 1997. • [5] J. Bang-Jensen and G. Z. Gutin, Digraphs: Theory, Algorithms and Applications, 2nd ed., Springer, 2010. • [6] B. D. McKay, G. F. Royle, I. M. Wanless, F. E. Oggier, N. J. A. Sloane, and H. Wilf, “Acyclic digraphs and eigenvalues of (0,1)-matrices,” Journal of Integer Sequences, vol. 7, Article 04.3.3, 2004, http://www.cs.uwaterloo.ca/journals/JIS/VOL7/Sloane/sloane15.html • [7] J. Lee, W. Chung, and E. Kim, “Structure Learning of Bayesian Networks Using Dual Genetic Algorithm,” IEICE Trans. Inf. & Syst., 2007. • [8] “Norsys Bayes Net Library” [online], retrieved 22 April 2011, http://www.norsys.com/networklibrary.html# • [9] K. P. Murphy, Bayes Net Toolbox, Technical Report, MIT Artificial Intelligence Laboratory, 2002.

**Thank You**