Conditional Random Fields


# Conditional Random Fields - PowerPoint PPT Presentation



## PowerPoint Slideshow about 'Conditional Random Fields' - barto

Presentation Transcript
Conditional Random Fields
• A form of discriminative modelling
• Has been used successfully in various domains such as part of speech tagging and other Natural Language Processing tasks
• Processes evidence bottom-up
• Combines multiple features of the data
• Builds the probability P( sequence | data)

Transition functions capture the associations between transitions from one label to another

State functions help determine the identity of the state

Conditional Random Fields

• CRFs are based on the idea of Markov Random Fields
• Modelled as an undirected graph connecting labels with observations
• Observations in a CRF are not modelled as random variables

[Figure: chain of labels /k/, /k/, /iy/, /iy/, /iy/, each connected to an observation X]

State Feature Function: f([x is stop], /t/). One possible state feature function for our attributes and labels.

State Feature Weight: λ = 10. One possible weight value for this state feature (strong).

Transition Feature Function: g(x, /iy/, /k/). One possible transition feature function; indicates /k/ followed by /iy/.

Transition Feature Weight: μ = 4. One possible weight value for this transition feature.

Conditional Random Fields
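• The Hammersley-Clifford Theorem states that a strictly positive random field is an MRF iff its distribution can be written as a normalized exponential of a sum of clique potentials (a Gibbs distribution)
• The exponent is the sum of the (log) clique potentials of the undirected graph, here the weighted state and transition feature functions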
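
For reference, the standard linear-chain CRF distribution can be written with the state features f and transition features g used in these slides. The slide's own equation did not survive in this transcript, so the indices i, j, t and the normalizer Z(x) below are notation assumed here, not taken from the original:

$$
P(y \mid x) = \frac{1}{Z(x)} \exp\left( \sum_{t} \left[ \sum_{i} \lambda_i \, f_i(x, y_t) + \sum_{j} \mu_j \, g_j(x, y_t, y_{t-1}) \right] \right),
\qquad
Z(x) = \sum_{y'} \exp\left( \sum_{t} \left[ \sum_{i} \lambda_i \, f_i(x, y'_t) + \sum_{j} \mu_j \, g_j(x, y'_t, y'_{t-1}) \right] \right)
$$

Each state feature touches one label and the observations, and each transition feature touches two adjacent labels, so every term in the exponent belongs to one clique of the chain.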
Conditional Random Fields
• Conceptual Overview
• Each attribute of the data we are trying to model fits into a feature function that associates the attribute and a possible label
• A positive value if the attribute appears in the data
• A zero value if the attribute is not in the data
• Each feature function carries a weight that gives the strength of that feature function for the proposed label
• High positive weights indicate a good association between the feature and the proposed label
• High negative weights indicate a negative association between the feature and the proposed label
• Weights close to zero indicate the feature has little or no impact on the identity of the label
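
The overview above can be sketched in a few lines of Python. This is a toy illustration, not the system described in the slides: the label set, attribute names and weight values are all hypothetical, and the normalizer is computed by brute-force enumeration, which is only feasible for very short sequences.

```python
import itertools
import math

LABELS = ["/k/", "/iy/"]  # hypothetical two-label inventory

# Hypothetical weights, chosen only for illustration.
# (attribute, label) -> state feature weight
STATE_WEIGHTS = {
    ("stop", "/k/"): 10.0,   # strong positive association
    ("stop", "/iy/"): -5.0,  # negative association
    ("vowel", "/iy/"): 8.0,
    ("vowel", "/k/"): -4.0,
}
# (previous label, current label) -> transition feature weight
TRANSITION_WEIGHTS = {("/k/", "/iy/"): 4.0}  # /k/ followed by /iy/


def score(labels, observations):
    """Sum of weight * feature value over all state and transition features."""
    total = 0.0
    for t, (label, attrs) in enumerate(zip(labels, observations)):
        # State features: zero when the attribute is absent from the frame.
        for attr, value in attrs.items():
            total += STATE_WEIGHTS.get((attr, label), 0.0) * value
        # Transition features: fire on adjacent label pairs.
        if t > 0:
            total += TRANSITION_WEIGHTS.get((labels[t - 1], label), 0.0)
    return total


def sequence_probability(labels, observations):
    """P(labels | observations), normalizing over every possible labelling."""
    candidates = itertools.product(LABELS, repeat=len(observations))
    scores = {cand: score(cand, observations) for cand in candidates}
    max_score = max(scores.values())  # subtract the max for numerical stability
    z = sum(math.exp(s - max_score) for s in scores.values())
    return math.exp(scores[tuple(labels)] - max_score) / z


# Two frames: the first looks like a stop, the second like a vowel.
observations = [{"stop": 1.0}, {"vowel": 1.0}]
p = sequence_probability(["/k/", "/iy/"], observations)
```

With these made-up weights the labelling /k/ then /iy/ collects two positive state features and one positive transition feature, so it receives almost all of the probability mass.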
Experimental Setup
• Attribute Detectors
• ICSI QuickNet Neural Networks
• Two different types of attributes
• Phonological feature detectors
• Place, Manner, Voicing, Vowel Height, Backness, etc.
• Features are grouped into eight classes, with each class having a variable number of possible values based on the IPA phonetic chart
• Phone detectors
• Neural networks output based on the phone labels – one output per label
• Classifiers were applied to 2960 utterances from the TIMIT training set
Experimental Setup
• Outputs from the neural nets are themselves treated as feature functions for the observed sequence – each attribute/label combination gives us a value for one feature function
• Note that this makes the feature functions real-valued rather than binary.
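
A minimal sketch of this idea, assuming each frame arrives as a dictionary of detector outputs. The attribute names, the label and the weights are hypothetical, and the function names are invented for this example:

```python
def state_feature_values(detector_outputs, label):
    """Build one real-valued state feature per (attribute, label) pair.

    The feature value is the detector output itself rather than a 0/1 flag.
    """
    return {(attr, label): value for attr, value in detector_outputs.items()}


def state_score(detector_outputs, label, weights):
    """Weighted sum of the real-valued state features for one proposed label."""
    features = state_feature_values(detector_outputs, label)
    return sum(weights.get(key, 0.0) * value for key, value in features.items())


# Hypothetical detector outputs for one frame and hypothetical weights.
frame = {"manner=stop": 0.9, "voicing=voiced": 0.2}
weights = {("manner=stop", "/t/"): 2.0, ("voicing=voiced", "/t/"): -1.0}
t_score = state_score(frame, "/t/", weights)
```

A confident detector output moves the score more than an uncertain one, which a binary feature could not express.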
Experiment 1
• Goal: Implement a Conditional Random Field Model on ASAT-style phonological feature data
• Perform phone recognition
• Compare results to those obtained via a Tandem HMM system
Experiment 1 - Results
• CRF system trained on monophones with these features achieves accuracy superior to HMM on monophones
• CRF comes close to achieving HMM triphone accuracy
Experiment 2
• Goals:
• Apply CRF model to phone classifier data
• Apply CRF model to combined phonological feature classifier data and phone classifier data
• Perform phone recognition
• Compare results to those obtained via a Tandem HMM system
Experiment 2 - Results

Note that the Tandem HMM result shown is the best result, obtained with only the top 39 features following a principal components analysis

Experiment 3
• Goal:
• Previous CRF experiments used phone posteriors for the CRF, and linear outputs transformed via a Karhunen-Loeve (KL) transform for the HMM system
• This transformation is needed to improve the HMM performance through decorrelation of inputs
• Using the same linear outputs as the HMM system, do our results change?
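
To make the decorrelation step concrete, here is a toy KL transform restricted to two dimensions so that it needs only the standard library. The real system would apply the transform to the full vector of network outputs; the data and function names below are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def kl_transform_2d(xs, ys):
    """Centre the data and rotate it onto the eigenvectors of its covariance.

    For a 2x2 covariance matrix the rotation angle has a closed form, so no
    eigen-solver is needed. The first output carries the larger variance.
    """
    mx, my = mean(xs), mean(ys)
    theta = 0.5 * math.atan2(
        2.0 * covariance(xs, ys), covariance(xs, xs) - covariance(ys, ys)
    )
    c, s = math.cos(theta), math.sin(theta)
    us = [c * (x - mx) + s * (y - my) for x, y in zip(xs, ys)]
    vs = [-s * (x - mx) + c * (y - my) for x, y in zip(xs, ys)]
    return us, vs


# Two strongly correlated hypothetical inputs.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
us, vs = kl_transform_2d(xs, ys)
```

After the rotation the two outputs are uncorrelated, which suits an HMM with diagonal-covariance Gaussians. Keeping only the highest-variance outputs gives the kind of dimensionality reduction a principal components analysis provides.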
Experiment 3 - Results

Also shown: adding both feature sets together, which gives the system supposedly redundant information, leads to a gain in accuracy

Experiment 4
• Goal:
• Previous CRF experiments did not allow for realignment of the training labels
• Boundaries for labels provided by TIMIT hand transcribers used throughout training
• HMM systems allowed to shift boundaries during EM learning
• If we allow for realignment in our training process, can we improve the CRF results?
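
The slides do not say how realignment was carried out. One common approach is a Viterbi-style forced alignment: keep the transcribed label order fixed and let the boundaries move to wherever the current model scores best. The sketch below assumes that approach, uses only per-frame scores (transition weights are ignored for brevity), and all names and numbers are hypothetical.

```python
def realign(frame_scores, transcript):
    """Assign one label per frame, in transcript order, maximizing total score.

    frame_scores: list of dicts mapping label -> log-score for that frame.
    transcript:   ordered labels; each must cover at least one frame.
    """
    n_frames, n_labels = len(frame_scores), len(transcript)
    if n_labels == 0 or n_frames < n_labels:
        raise ValueError("need at least one frame per transcript label")

    neg_inf = float("-inf")
    best = [[neg_inf] * n_labels for _ in range(n_frames)]
    advanced = [[False] * n_labels for _ in range(n_frames)]
    best[0][0] = frame_scores[0][transcript[0]]

    for t in range(1, n_frames):
        for k in range(min(t + 1, n_labels)):
            stay = best[t - 1][k]                         # same label continues
            move = best[t - 1][k - 1] if k > 0 else neg_inf  # boundary here
            advanced[t][k] = move > stay
            best[t][k] = max(stay, move) + frame_scores[t][transcript[k]]

    # Trace back from the last label at the last frame.
    k = n_labels - 1
    path = []
    for t in range(n_frames - 1, -1, -1):
        path.append(transcript[k])
        if advanced[t][k]:
            k -= 1
    path.reverse()
    return path


# Hypothetical log-scores for four frames and a two-phone transcript.
frame_scores = [
    {"/k/": -0.1, "/iy/": -3.0},
    {"/k/": -0.2, "/iy/": -2.5},
    {"/k/": -2.0, "/iy/": -0.3},
    {"/k/": -3.0, "/iy/": -0.1},
]
alignment = realign(frame_scores, ["/k/", "/iy/"])
```

In training, the realigned labels would replace the hand-transcribed boundaries before the next round of weight updates, and the two steps would alternate.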
Experiment 4 - Results

Allowing realignment gives accuracy results for a monophone-trained CRF that are superior to a triphone-trained HMM, with fewer parameters