## Two Approaches to Bayesian Network Structure Learning

Yael Kinderman & Tali Goren

**Goal:** Compare an algorithm that learns a tree-structured BN (TAN) with an algorithm that learns a constraint-free structure (Build-BN).

**Problem Definition:** Finding an exact BN structure for complete discrete data.

• Known to be NP-hard.
• A maximization problem over a defined score.

**Build-BN Algorithm**

Algorithm’s attributes:

• No structural constraints.
• Straightforward approach: it does not skip any computation.
• Feasible only for small networks (<30 variables).

Crucial facts at the core of the algorithm:

• Some scoring functions decompose into local scores (we used BIC for the algorithm).
• Every DAG has at least one node with no outgoing arcs (a sink).
• Implementation note: Build-BN requires a lot of memory, so the implementation relies heavily on the file system.

The local BIC score of a variable x with parent set Pa(x) = vs, in its standard form:

$$\mathrm{BIC}(x, vs) = \sum_{j}\sum_{k} N_{jk}\,\log\frac{N_{jk}}{N_{j}} \;-\; \frac{\log N}{2}\, q\,(r-1)$$

Where:

• k iterates over all possible values of x (r values in total).
• j iterates over all possible values of Pa(x) (q configurations in total).
• N = number of samples.
• N_j = number of samples where Pa(x) = j.
• N_jk = number of samples where Pa(x) = j and x = k.

**Algorithm’s Flow**

Step I: Find Local Scores

For every $x \in V$ and every $vs \subseteq V \setminus \{x\}$ (V = set of all variables), calculate the ‘local BIC’. All in all, $n\,2^{n-1}$ scores are calculated in this step.

Step II: Find Best Parents

For every $x \in V$ and every $vs \subseteq V \setminus \{x\}$, find the best parents of x in the var-set. Traversing var-sets in lexicographic order (smaller to larger) results in a time complexity of $O((n-1)\,2^{n-1})$.

**Algorithm’s Flow (cont.)**

Step III: Find Best Sinks

For each of the $2^{n}$ var-sets we find a best sink. Let Sink*(W) be the best sink of a var-set W. Then Sink*(W) can be found by:

$$\mathrm{Sink}^*(W) = \arg\max_{s \in W} \mathrm{skore}(W, s), \qquad \mathrm{skore}(W, s) = \mathrm{BIC}\big(s,\, g^*_s(W \setminus \{s\})\big) + \mathrm{score}\big(G^*(W \setminus \{s\})\big)$$

• Where: g*_s(var-set) = the best set of parents for s in the var-set.
• G*(var-set) = the highest scoring network for a var-set.
• We traverse var-sets in lexicographic order and reuse scores that were calculated in previous iterations.

**Algorithm’s Flow (cont.)**

Step IV: Find Best Order

Best sinks immediately yield the best ordering (in reverse order):

$$\mathrm{ord}^*_i(V) = \mathrm{sink}^*\Big(V \setminus \bigcup_{j=i+1}^{|V|} \{\mathrm{ord}^*_j(V)\}\Big)$$

Step V: Find Best Network

Having the best order ($\mathrm{ord}^*_i(V)$) and the best parents ($g^*(W)$) for each $W \subseteq V$, we can find the network as follows: the i-th variable in the optimal ordering picks its best parents from the var-set that contains all the variables that precede it in the ordering.

**Using the BN for Prediction**

• 5-fold cross validation: 80% of the data is used for building the structure and the CPDs, and 20% is used for label prediction.
• The label ‘C’ of a given sample is predicted using the learned BN.

**Test over the Famous ‘Student’ Model**

Testing our implementation over synthetic data:

• We simulated 300 samples according to the BN and the CPDs as they were presented in class.
• Prediction was performed using TAN and Build-BN.

[Figure: Build-BN result and TAN result. Prediction success rates shown: 0.836 and 0.85.]

Note: In Build-BN, 4 out of the 5 cross-validation folds gave the above net.

**Experimental Results**

Data taken from: UCI machine learning DB

• Possible explanation for the last 2 results:
  • Zoo: only 101 instances…
  • Vehicle: what’s wrong with this data?!
• Note the low in-degrees (models induced by data sets are by nature close to trees).

**Example I: Corral**

Build-BN does not force ‘irrelevant’ variables to be linked into the BN.

[Figure: TAN result and Build-BN result. Prediction success rates shown: 0.969 and 0.937.]

**Example II: Tic-Tac-Toe**

Having no constraints on the structure enables better prediction.

[Figure: Build-BN result and TAN result. Prediction success rates shown: 0.844 and 0.653.]

**References:** Tomi Silander, Petri Myllymäki, HIIT. A Simple Approach for Finding the Globally Optimal Bayesian Network Structure.
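
The five steps of the Build-BN flow described in these slides can be illustrated with a minimal in-memory sketch. It is not the authors' implementation: the names `local_bic` and `build_bn` are hypothetical, the data is assumed to be a list of equal-length tuples of discrete values, the BIC penalty is assumed to be the standard (log N / 2)·q·(r − 1) term with cardinalities taken from the observed data, and everything is held in dictionaries rather than on the file system.

```python
import math
from collections import Counter
from itertools import combinations


def local_bic(data, x, parents):
    """Local BIC of column x given a tuple of parent columns (assumed standard form)."""
    n_jk = Counter((tuple(row[p] for p in parents), row[x]) for row in data)
    n_j = Counter(tuple(row[p] for p in parents) for row in data)
    loglik = sum(c * math.log(c / n_j[j]) for (j, _k), c in n_jk.items())
    r = len({row[x] for row in data})          # observed values of x
    q = 1                                      # observed parent configurations (upper bound)
    for p in parents:
        q *= len({row[p] for row in data})
    return loglik - 0.5 * math.log(len(data)) * q * (r - 1)


def build_bn(data):
    """Exhaustive structure search over all var-sets; only sensible for a handful of variables."""
    n = len(data[0])
    variables = range(n)

    # Step I: local score of every variable for every candidate parent set.
    local = {}
    for x in variables:
        others = [v for v in variables if v != x]
        for size in range(n):
            for vs in combinations(others, size):
                local[x, frozenset(vs)] = local_bic(data, x, vs)

    # Step II: best parents of x inside each var-set, visiting smaller sets first.
    best = {}  # (x, var-set) -> (score, chosen parent set)
    for x in variables:
        others = [v for v in variables if v != x]
        for size in range(n):
            for vs in map(frozenset, combinations(others, size)):
                cand = (local[x, vs], vs)
                for v in vs:                   # reuse results for the one-smaller subsets
                    sub = best[x, vs - {v}]
                    if sub[0] > cand[0]:
                        cand = sub
                best[x, vs] = cand

    # Step III: best sink and best network score of each var-set.
    net_score = {frozenset(): 0.0}
    sink = {}
    for size in range(1, n + 1):
        for w in map(frozenset, combinations(variables, size)):
            best_s, best_val = None, -math.inf
            for s in sorted(w):
                rest = w - {s}
                val = best[s, rest][0] + net_score[rest]
                if val > best_val:
                    best_s, best_val = s, val
            sink[w], net_score[w] = best_s, best_val

    # Steps IV and V: peel off sinks to get the reverse ordering; each sink
    # takes its best parents from the variables that remain (its predecessors).
    order, parents = [], {}
    remaining = frozenset(variables)
    while remaining:
        s = sink[remaining]
        remaining = remaining - {s}
        parents[s] = best[s, remaining][1]
        order.append(s)
    order.reverse()
    return order, parents, net_score[frozenset(variables)]
```

Calling `build_bn(rows)` returns the ordering, a dictionary mapping each variable index to its chosen parent set, and the total score. The dictionaries grow as n·2^(n−1) entries, which is consistent with the slides' remark that the approach is memory-hungry and feasible only for small networks.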