
# Conditional Random Fields

Conditional Random Fields: a form of discriminative modelling that has been used successfully in various domains, such as part-of-speech tagging and other natural language processing tasks. A CRF processes evidence bottom-up and combines multiple features of the data.


## PowerPoint Slideshow about 'Conditional Random Fields' - barto

Presentation Transcript

• A form of discriminative modelling

• Has been used successfully in various domains such as part of speech tagging and other Natural Language Processing tasks

• Processes evidence bottom-up

• Combines multiple features of the data

• Builds the probability P(sequence | data)

• Transition functions capture associations between transitions from one label to another

• State functions help determine the identity of the state

[Figure: label sequence /k/ /k/ /iy/ /iy/ /iy/]

• CRFs are based on the idea of Markov Random Fields

• Modelled as an undirected graph connecting labels with observations

• Observations in a CRF are not modelled as random variables

[Figure: undirected graph connecting the labels with observations X]

• State feature weight: λ = 10. One possible (strong) weight value for this state feature

• Transition feature weight: μ = 4. One possible weight value for this transition feature

• State feature function: f([x is stop], /t/). One possible state feature function for our attributes and labels

• Transition feature function: g(x, /iy/, /k/). One possible transition feature function; it indicates /k/ followed by /iy/
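The example feature functions and weights above can be combined into a score for a whole label sequence. The following is a minimal sketch, assuming only the two example features from the slide (real systems use many more); the function names, the dictionary representation of an observation, and the brute-force normalization are illustrative assumptions, not the presenters' implementation.

```python
import math
from itertools import product

LABELS = ["/k/", "/iy/", "/t/"]

# Example weight values quoted on the slide (not trained values).
LAMBDA_STOP_T = 10.0  # weight of the state feature f([x is stop], /t/)
MU_K_IY = 4.0         # weight of the transition feature g(x, /iy/, /k/)


def f_stop_t(x, label):
    """State feature: fires when the observation is a stop and the label is /t/."""
    return 1.0 if x.get("stop") and label == "/t/" else 0.0


def g_k_iy(x, label, prev_label):
    """Transition feature: fires when /k/ is followed by /iy/."""
    return 1.0 if prev_label == "/k/" and label == "/iy/" else 0.0


def score(labels, observations):
    """Weighted sum of all feature functions over the sequence."""
    total = 0.0
    for i, (label, x) in enumerate(zip(labels, observations)):
        total += LAMBDA_STOP_T * f_stop_t(x, label)
        if i > 0:
            total += MU_K_IY * g_k_iy(x, label, labels[i - 1])
    return total


def sequence_probability(labels, observations):
    """P(sequence | data), normalized by brute force over all label sequences."""
    z = sum(
        math.exp(score(list(candidate), observations))
        for candidate in product(LABELS, repeat=len(observations))
    )
    return math.exp(score(labels, observations)) / z


# Hypothetical two-frame input: the first frame looks like a stop.
observations = [{"stop": True}, {"stop": False}]
print(score(["/t/", "/iy/"], observations))  # 10.0 (state feature fires)
print(score(["/k/", "/iy/"], observations))  # 4.0 (transition feature fires)
```

Brute-force normalization is only practical for toy inputs; it is used here to keep the sketch short.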


• The Hammersley-Clifford theorem states that a random field is an MRF if and only if its distribution can be written in the exponential form shown on the slide

• The exponent is the sum of the clique potentials of the undirected graph
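The formula itself is not reproduced in this transcript. As an assumption, a linear-chain CRF written with the slides' own symbols (state feature functions f with weights λ, transition feature functions g with weights μ) would have roughly the following shape, where Z(x) normalizes over all candidate label sequences y':

```latex
P(\mathbf{y} \mid \mathbf{x})
  = \frac{1}{Z(\mathbf{x})}
    \exp\!\left( \sum_{i} \left[ \sum_{j} \lambda_j \, f_j(\mathbf{x}, y_i)
      + \sum_{k} \mu_k \, g_k(\mathbf{x}, y_i, y_{i-1}) \right] \right),
\qquad
Z(\mathbf{x})
  = \sum_{\mathbf{y}'}
    \exp\!\left( \sum_{i} \left[ \sum_{j} \lambda_j \, f_j(\mathbf{x}, y'_i)
      + \sum_{k} \mu_k \, g_k(\mathbf{x}, y'_i, y'_{i-1}) \right] \right)
```

The argument order of g follows the slide's example g(x, /iy/, /k/), which places the current label before the previous one.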

• Conceptual Overview

• Each attribute of the data we are trying to model fits into a feature function that associates the attribute and a possible label

• A positive value if the attribute appears in the data

• A zero value if the attribute is not in the data

• Each feature function carries a weight that gives the strength of that feature function for the proposed label

• High positive weights indicate a good association between the feature and the proposed label

• High negative weights indicate a negative association between the feature and the proposed label

• Weights close to zero indicate the feature has little or no impact on the identity of the label
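The effect of positive, negative, and near-zero weights can be seen at a single position in the sequence. The sketch below is illustrative only: the attribute names, the weight values, and the helper names are hypothetical, and transition features are left out for brevity.

```python
import math

# Hypothetical weights for (attribute, label) feature functions.
WEIGHTS = {
    ("stop", "/t/"): 3.0,     # high positive: good association with /t/
    ("stop", "/iy/"): -3.0,   # high negative: evidence against /iy/
    ("voiced", "/t/"): 0.01,  # near zero: little impact on the label
}


def label_scores(attributes, labels):
    """Sum weight * feature value for each proposed label.

    `attributes` maps an attribute name to a positive value if it
    appears in the data and to zero if it does not.
    """
    return {
        label: sum(
            WEIGHTS.get((attr, label), 0.0) * value
            for attr, value in attributes.items()
        )
        for label in labels
    }


def label_distribution(attributes, labels):
    """Normalize the exponentiated scores into a distribution over labels."""
    scores = label_scores(attributes, labels)
    z = sum(math.exp(s) for s in scores.values())
    return {label: math.exp(s) / z for label, s in scores.items()}


dist = label_distribution({"stop": 1.0, "voiced": 0.0}, ["/t/", "/iy/", "/k/"])
for label, p in sorted(dist.items(), key=lambda item: -item[1]):
    print(label, round(p, 3))
```

With these made-up weights, /t/ ranks first, /k/ (no associated weights) second, and /iy/ last.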

• Attribute Detectors

• ICSI QuickNet Neural Networks

• Two different types of attributes

• Phonological feature detectors

• Place, Manner, Voicing, Vowel Height, Backness, etc.

• Features are grouped into eight classes, with each class having a variable number of possible values based on the IPA phonetic chart

• Phone detectors

• Neural network outputs based on the phone labels, one output per label

• Classifiers were applied to 2960 utterances from the TIMIT training set

• Outputs from the neural nets are themselves treated as feature functions for the observed sequence: each attribute/label combination gives us a value for one feature function

• Note that this makes the feature functions non-binary features.
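One way to picture the attribute/label pairing is as a table of feature values for a single frame, where the detector output itself (rather than a 0/1 indicator) is the value. This is a rough sketch under that reading; the attribute names, the numeric outputs, and the function name are hypothetical.

```python
def state_feature_values(detector_outputs, proposed_label, labels):
    """Return {(attribute, label): value} for one frame.

    The feature for (attribute, label) takes the detector's real-valued
    output when `label` is the proposed label, and zero otherwise.
    """
    return {
        (attr, label): (value if label == proposed_label else 0.0)
        for attr, value in detector_outputs.items()
        for label in labels
    }


# Hypothetical detector outputs for one frame.
outputs = {"manner=stop": 0.9, "voicing=voiced": 0.2}
features = state_feature_values(outputs, "/t/", ["/t/", "/d/"])
for key in sorted(features):
    print(key, features[key])
```

Each attribute/label combination yields one entry, so two attributes and two labels give four feature values per frame.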

• Goal: Implement a Conditional Random Field Model on ASAT-style phonological feature data

• Perform phone recognition

• Compare results to those obtained via a Tandem HMM system

• CRF system trained on monophones with these features achieves accuracy superior to HMM on monophones

• CRF comes close to achieving HMM triphone accuracy

• Goals:

• Apply CRF model to phone classifier data

• Apply CRF model to combined phonological feature classifier data and phone classifier data

• Perform phone recognition

• Compare results to those obtained via a Tandem HMM system

Note that the Tandem HMM result is the best result obtained using only the top 39 features following a principal components analysis

• Goal:

• Previous CRF experiments used phone posteriors for the CRF, and linear outputs transformed via a Karhunen-Loeve (KL) transform for the HMM system

• This transformation is needed to improve the HMM performance through decorrelation of inputs

• Using the same linear outputs as the HMM system, do our results change?

Also shown: adding both feature sets together, and so giving the system supposedly redundant information, leads to a gain in accuracy
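The decorrelation performed by a KL transform can be illustrated in two dimensions, where the rotation onto the principal axes has a closed form and no linear-algebra library is needed. This is a toy sketch with made-up data and hypothetical function names; the actual systems work with far higher-dimensional network outputs.

```python
import math


def covariance_2d(points):
    """Return the mean and the (xx, xy, yy) covariance entries of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)


def kl_transform_2d(points):
    """Center the data and rotate it onto the covariance eigenvectors."""
    (mx, my), (sxx, sxy, syy) = covariance_2d(points)
    # Rotation angle of the principal axis of a 2x2 symmetric matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    return [
        ((x - mx) * c + (y - my) * s, -(x - mx) * s + (y - my) * c)
        for x, y in points
    ]


# Made-up, strongly correlated inputs.
points = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.2), (3.0, 2.8)]
transformed = kl_transform_2d(points)
_, (var_u, cov_uv, var_v) = covariance_2d(transformed)
print(var_u, cov_uv, var_v)
```

After the rotation the cross-covariance is zero up to rounding error, and the first output dimension carries the larger variance.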

• Goal:

• Previous CRF experiments did not allow for realignment of the training labels

• Boundaries for labels provided by TIMIT hand transcribers were used throughout training

• HMM systems are allowed to shift boundaries during EM learning

• If we allow for realignment in our training process, can we improve the CRF results?

Allowing realignment gives accuracy results for a monophone-trained CRF that are superior to those of a triphone-trained HMM, with fewer parameters
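The slides do not describe how realignment was carried out. One common way to shift label boundaries, shown here purely as an assumption, is a Viterbi-style forced alignment: keep the order of the labels fixed and pick the boundaries that maximize the model's per-frame scores. The frame scores and the function name below are hypothetical.

```python
def realign(frame_scores, label_order):
    """Assign one label per frame, keeping the labels in `label_order`.

    frame_scores: list of {label: score} dicts, one per frame.
    label_order: labels in the order they occur in the utterance.
    """
    n, m = len(frame_scores), len(label_order)
    if m == 0 or n < m:
        raise ValueError("need at least one frame per label")
    neg_inf = float("-inf")
    best = [[neg_inf] * m for _ in range(n)]
    back = [[0] * m for _ in range(n)]
    best[0][0] = frame_scores[0][label_order[0]]
    for i in range(1, n):
        for j in range(min(i, m - 1) + 1):
            stay = best[i - 1][j]
            advance = best[i - 1][j - 1] if j > 0 else neg_inf
            if advance > stay:
                best[i][j], back[i][j] = advance, j - 1
            else:
                best[i][j], back[i][j] = stay, j
            best[i][j] += frame_scores[i][label_order[j]]
    # Trace back from the last frame, which must carry the last label.
    path, j = [], m - 1
    for i in range(n - 1, -1, -1):
        path.append(label_order[j])
        j = back[i][j]
    path.reverse()
    return path


# Hypothetical scores: the model prefers /iy/ from the second frame onward,
# although the original boundaries were /k/ /k/ /iy/ /iy/ /iy/.
frame_scores = [
    {"/k/": 2.0, "/iy/": 0.1},
    {"/k/": 0.3, "/iy/": 1.5},
    {"/k/": 0.1, "/iy/": 2.0},
    {"/k/": 0.2, "/iy/": 1.8},
    {"/k/": 0.1, "/iy/": 1.9},
]
print(realign(frame_scores, ["/k/", "/iy/"]))
# ['/k/', '/iy/', '/iy/', '/iy/', '/iy/']
```

In a training loop, the realigned labels would replace the hand-marked boundaries before the next round of weight updates; that loop is omitted here.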