Knowledge Engineering for Bayesian Networks


### Knowledge Engineering for Bayesian Networks

Ann Nicholson

School of Computer Science and Software Engineering, Monash University

Overview
• Representing uncertainty
• Introduction to Bayesian Networks
• Syntax, semantics, examples
• The knowledge engineering process
• Case Studies
• Seabreeze prediction
• Intelligent Tutoring
• Open research questions
Sources of Uncertainty
• Ignorance
• Inexact observations
• Non-determinism
• AI representations
• Probability theory
• Dempster-Shafer
• Fuzzy logic
Probability theory for representing uncertainty
• Assigns a numerical degree of belief between 0 and 1 to facts
• e.g. “it will rain today” is T/F.
• P(“it will rain today”) = 0.2 prior probability (unconditional)
• Posterior probability (conditional)
• P(“it will rain today” | “rain is forecast”) = 0.8
• Bayes’ Rule: $P(H \mid E) = \dfrac{P(E \mid H) \times P(H)}{P(E)}$
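
As a minimal sketch of Bayes’ rule for the rain example, the denominator P(E) can be expanded with the law of total probability. The prior 0.2 comes from the slide; the two likelihoods, P(forecast | rain) = 0.8 and P(forecast | no rain) = 0.05, are assumptions chosen only for illustration (they happen to give a posterior of about 0.8).

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(H|E) via Bayes' rule, expanding P(E) by total probability."""
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1.0 - prior_h)
    return p_e_given_h * prior_h / p_e


# Prior from the slide: P("it will rain today") = 0.2.
# Both likelihoods are assumed values, not taken from the presentation.
p_rain_given_forecast = posterior(
    prior_h=0.2, p_e_given_h=0.8, p_e_given_not_h=0.05
)
print(round(p_rain_given_forecast, 3))
```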

Bayesian networks
• Directed acyclic graphs
• Nodes: random variables,
• R: “it is raining”, discrete values T/F
• T: temperature, continuous or discrete variable
• C: colour, discrete values {red,blue,green}
• Arcs indicate dependencies (can have causal interpretation)

[Figure: example network diagrams with nodes X, Y, Q and Flu, Te, Th]

Bayesian networks
• Conditional Probability Distribution (CPD)
• Associated with each variable
• probability of each state given parent states

• Flu: “Jane has the flu”
• P(Flu=T) = 0.05
• Arc Flu → Te: models causal relationship
• Te: “Jane has a high temp”
• P(Te=High|Flu=T) = 0.4
• P(Te=High|Flu=F) = 0.01
• Arc Te → Th: models possible sensor error
• Th: “Thermometer …”
• P(Th=High|Te=H) = 0.95
• P(Th=High|Te=L) = 0.1
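
The CPDs above define a full joint distribution through the chain rule, P(Flu, Te, Th) = P(Flu) × P(Te|Flu) × P(Th|Te). The sketch below encodes the slide's numbers as plain Python dictionaries (the variable and function names are illustrative, not from any BN package), then checks that the joint sums to 1 and computes the marginal P(Te=High).

```python
from itertools import product

# CPD values copied from the slide; "Low" is treated as the complement of "High".
P_FLU = {True: 0.05, False: 0.95}
P_TE_HIGH = {True: 0.4, False: 0.01}      # P(Te=High | Flu)
P_TH_HIGH = {"High": 0.95, "Low": 0.1}    # P(Th=High | Te)


def joint(flu, te, th):
    """P(Flu, Te, Th) = P(Flu) * P(Te|Flu) * P(Th|Te)."""
    p_te = P_TE_HIGH[flu] if te == "High" else 1.0 - P_TE_HIGH[flu]
    p_th = P_TH_HIGH[te] if th == "High" else 1.0 - P_TH_HIGH[te]
    return P_FLU[flu] * p_te * p_th


states = list(product([True, False], ["High", "Low"], ["High", "Low"]))
total = sum(joint(*s) for s in states)
p_te_high = sum(joint(f, te, th) for f, te, th in states if te == "High")
print(f"sum of joint = {total:.4f}, P(Te=High) = {p_te_high:.4f}")
```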

[Figure: four panels showing diagnostic, causal, mixed, and intercausal inference on networks with nodes Flu, TB, Te, Th, Y]

BN inference
• Evidence: observation of specific state
• Task: compute the posterior probabilities for query node(s) given evidence.
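
A minimal sketch of this task, assuming the Flu → Te → Th network and CPD values quoted earlier in the slides: posteriors are computed by brute-force enumeration of the joint distribution. This is only practical for tiny networks; the packages listed on the next slide use more efficient algorithms. Function names are illustrative.

```python
from itertools import product

# Flu -> Te -> Th, using the CPD values quoted earlier in the slides.
# For Te and Th, True stands for the "High" state.
P_FLU_T = 0.05
P_TE_HIGH = {True: 0.4, False: 0.01}    # keyed by Flu
P_TH_HIGH = {True: 0.95, False: 0.1}    # keyed by Te


def bern(p_true, value):
    """Probability of a boolean value given P(value=True)."""
    return p_true if value else 1.0 - p_true


def joint(a):
    """Joint probability of one full assignment a = {Flu, Te, Th}."""
    return (bern(P_FLU_T, a["Flu"])
            * bern(P_TE_HIGH[a["Flu"]], a["Te"])
            * bern(P_TH_HIGH[a["Te"]], a["Th"]))


def query(var, evidence):
    """P(var=True | evidence), summing the joint over consistent assignments."""
    names = ["Flu", "Te", "Th"]
    num = den = 0.0
    for values in product([True, False], repeat=3):
        a = dict(zip(names, values))
        if any(a[k] != v for k, v in evidence.items()):
            continue
        p = joint(a)
        den += p
        if a[var]:
            num += p
    return num / den


diagnostic = query("Flu", {"Th": True})   # from effect back to cause
causal = query("Th", {"Flu": True})       # from cause forward to effect
print(f"P(Flu=T|Th=High) = {diagnostic:.4f}, P(Th=High|Flu=T) = {causal:.4f}")
```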


BN software
• Commercial packages: Netica, Hugin, Analytica (all with demo versions)
• Free software: Smile, Genie, JavaBayes, …

http://HTTP.CS.Berkeley.EDU/~murphyk/Bayes/bnsoft.html

• Examples
Decision networks
• Extension to basic BN for decision making
• Decision nodes
• Utility nodes
• $EU(\mathit{Action}) = \sum_{o} p(o \mid \mathit{Action}, E)\, U(o)$

• choose action with highest expected utility
• Example
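
The expected-utility rule above can be sketched in a few lines. The umbrella decision, the outcome names, and the utility values are all hypothetical and chosen only to make the arithmetic easy to follow; only the 0.2 probability of rain echoes the earlier slide.

```python
def expected_utility(outcome_probs, utility):
    """EU(action) = sum over outcomes o of p(o | action, E) * U(o)."""
    return sum(p * utility[o] for o, p in outcome_probs.items())


# Hypothetical decision with P(rain | E) = 0.2.
# Outcome distributions and utilities are invented for illustration.
p_outcome = {
    "take_umbrella": {"dry_carrying": 1.0},
    "leave_umbrella": {"wet": 0.2, "dry_free": 0.8},
}
utility = {"dry_carrying": 70, "wet": 0, "dry_free": 100}

eu = {a: expected_utility(dist, utility) for a, dist in p_outcome.items()}
best_action = max(eu, key=eu.get)   # action with the highest expected utility
print(eu, best_action)
```

With these assumed numbers, leaving the umbrella has the higher expected utility (80 versus 70); raising the probability of rain or lowering U(wet) would reverse the choice.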
Elicitation from experts
• Variables
• important variables? values/states?
• Structure
• causal relationships?
• dependencies/independencies?
• Parameters (probabilities)
• quantify relationships and interactions?
• Preferences (utilities)

[Figure: elicitation diagram with labels BN expert, domain expert, BN tools]

Expert Elicitation Process
• These stages are done iteratively
• Stops when further expert input is no longer cost effective
• Process is difficult and time consuming.
• Current BN tools
• inference engine
• GUI
• Next generation of BN tools?
Knowledge discovery
• There is much interest in automated methods for learning BNs from data
• parameters, structure (causal discovery)
• Computationally complex problem, so current methods have practical limitations
• e.g. limit number of states, require variable ordering constraints, do not specify all arc directions
• Evaluation methods
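
For the parameter-learning part, the simplest case is estimating a conditional probability table from complete data by counting. The sketch below uses relative frequencies with add-alpha (Laplace) smoothing, a generic textbook approach rather than any specific method from this talk; the data set and variable names are made up. Structure learning is a much harder problem and is not shown.

```python
from collections import Counter


def learn_cpt(records, child, parent, alpha=1.0):
    """Estimate P(child=True | parent) from boolean records, add-alpha smoothed."""
    counts = Counter((r[parent], r[child]) for r in records)
    cpt = {}
    for pv in (True, False):
        n_true = counts[(pv, True)]
        n = n_true + counts[(pv, False)]
        cpt[pv] = (n_true + alpha) / (n + 2 * alpha)
    return cpt


# Tiny made-up data set: each record is one fully observed case.
data = (
    [{"Flu": True, "HighTemp": True}] * 3
    + [{"Flu": True, "HighTemp": False}] * 1
    + [{"Flu": False, "HighTemp": True}] * 1
    + [{"Flu": False, "HighTemp": False}] * 5
)
cpt = learn_cpt(data, child="HighTemp", parent="Flu")
print(cpt)
```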
The knowledge engineering process

1. Building the BN

• variables, structure, parameters, preferences
• combination of expert elicitation and knowledge discovery

2. Validation/Evaluation

• case-based, sensitivity analysis, accuracy testing

3. Field Testing

• alpha/beta testing, acceptance testing

4. Industrial Use

• collection of statistics

5. Refinement

• Updating procedures, regression testing
Case Study: Intelligent tutoring
• Tutoring domain: primary and secondary school students’ misconceptions about decimals
• Based on Decimal Comparison Test (DCT)
• student asked to choose the larger of pairs of decimals
• different types of pairs reveal different misconceptions
• The ITS uses computer games involving decimals
• This research also looks at a combination of expert elicitation and automated methods
The ITS architecture

[Figure: ITS architecture diagram. Recoverable labels:]
• Inputs
• Decimal comparison test (optional)
• Information about student, e.g. age (optional)
• Classroom diagnostic test results (optional)
• Bayesian network: generic BN model of student
• Diagnose misconception
• Predict outcomes
• Identify most useful information
• Computer games: Hidden number, Flying photographer, Decimaliens, Number between, ….
• System controller module (sequencing tactics)
• Select next item type
• Decide to present help
• Decide change to new game
• Identify when expertise gained
• Other labels: Student, Teacher, Item, Item type, Feedback, New game, Help, Report on student, Classroom Teaching Activities

Expert Elicitation
• Variables
• two classification nodes: fine and coarse (mutually exclusive)
• item types: (i) H/M/L (ii) 0-N
• Structure
• arcs from classification to item type
• item types independent given classification
• Parameters
• careless mistake (3 different values)
• expert ignorance: “-” entries in the table (treated as a uniform distribution)
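
The structure described above (arcs from classification to item types, with item types independent given the classification) has the shape of a naive Bayes model. The sketch below simplifies heavily: the class names, item types, prior, and careless-mistake probability are hypothetical, and each item type is reduced to correct/incorrect rather than the H/M/L or 0-N values used in the actual network.

```python
def classify(prior, p_correct, answers):
    """Posterior over classes, assuming item types are independent given the class."""
    scores = {}
    for c, p_c in prior.items():
        likelihood = 1.0
        for item_type, correct in answers.items():
            p = p_correct[c][item_type]
            likelihood *= p if correct else 1.0 - p
        scores[c] = p_c * likelihood
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}


CARELESS = 0.1  # hypothetical probability of a careless mistake

# Hypothetical classes: which item types each class is expected to get right.
expected_correct = {
    "class_A": {"type_1": True, "type_2": True},
    "class_B": {"type_1": True, "type_2": False},
}
# A student deviates from the expected behaviour with probability CARELESS.
p_correct = {
    c: {t: (1 - CARELESS if ok else CARELESS) for t, ok in items.items()}
    for c, items in expected_correct.items()
}
prior = {"class_A": 0.5, "class_B": 0.5}

post = classify(prior, p_correct, {"type_1": True, "type_2": False})
print(post)
```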
Evaluation process
• Case-based evaluation
• experts checked individual cases
• sometimes, if the prior was low, the ‘true’ classification did not have the highest posterior (but usually had the biggest change in ratio)
• priors change after each set of evidence
• Comparison evaluation
• Differences in classification between BN and expert rule
• Differences in predictions between different BNs
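
The case-based observation about low priors can be reproduced with a two-class toy example. All numbers here are hypothetical: the ‘true’ class starts with a low prior, so after one set of evidence it still does not have the highest posterior, yet its posterior-to-prior ratio is the largest. Using the posterior as the prior for a second set of evidence then moves it to the top.

```python
def update(prior, likelihood):
    """One Bayesian update: posterior is proportional to prior * likelihood."""
    unnorm = {c: prior[c] * likelihood[c] for c in prior}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


# Hypothetical numbers: the 'true' class starts with a low prior.
prior = {"true_class": 0.02, "other_class": 0.98}
likelihood = {"true_class": 0.9, "other_class": 0.1}  # assumed P(evidence | class)

posterior = update(prior, likelihood)
ratio = {c: posterior[c] / prior[c] for c in prior}
top = max(posterior, key=posterior.get)        # class with highest posterior
biggest_gain = max(ratio, key=ratio.get)       # class with biggest ratio change

# The posterior becomes the prior for the next set of evidence.
posterior2 = update(posterior, likelihood)
top2 = max(posterior2, key=posterior2.get)
print(top, biggest_gain, top2)
```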
Comparison evaluation
• Development of measure: same classification, desirable and undesirable re-classification
• Use item type predictions
• Investigation of effect of item type granularity and probability of careless mistake
Investigation by Automated methods
• Classification (using SNOB program, based on MML)
• Parameters
• Structure (using CaMML)
Case Study: Seabreeze prediction
• 2000 Honours project, joint with Bureau of Meteorology (PAKDD’2001 paper, TR)
• BN built based on an existing simple expert rule
• Several years of data available for Sydney seabreezes
• CaMML and Tetrad-II programs used to learn BNs from data
• Comparative analysis showed automated methods gave improved predictions.
Open Research Questions
• Tools needed to support expert elicitation
• Combining expert elicitation and automated methods
• Evaluation measures and methods
• Industry adoption of BN technology