Conditional Random Fields



PowerPoint Slideshow about 'Conditional Random Fields' - barto


Presentation Transcript
Conditional Random Fields
  • A form of discriminative modelling
    • Has been used successfully in various domains such as part of speech tagging and other Natural Language Processing tasks
  • Processes evidence bottom-up
    • Combines multiple features of the data
    • Builds the probability P( sequence | data)
Conditional Random Fields
  • Transition functions add associations between transitions from one label to another
  • State functions help determine the identity of the state
  • [Figure: chain of labels /k/, /k/, /iy/, /iy/, /iy/, each linked to an observation X]
  • CRFs are based on the idea of Markov Random Fields
    • Modelled as an undirected graph connecting labels with observations
    • Observations in a CRF are not modelled as random variables

Conditional Random Fields
  • [Equation shown on the slide; its annotated terms are listed here]
  • State feature function: f([x is stop], /t/)
    • One possible state feature function for our attributes and labels
  • State feature weight: λ = 10
    • One possible weight value for this state feature (strong)
  • Transition feature function: g(x, /iy/, /k/)
    • One possible transition feature function; indicates /k/ followed by /iy/
  • Transition feature weight: μ = 4
    • One possible weight value for this transition feature
  • Hammersley-Clifford Theorem states that a random field is an MRF iff it can be described in the above form
    • The argument of the exponential is the sum of the clique potentials of the undirected graph
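The form referred to by the Hammersley-Clifford bullet is not present in the transcript. A likely reconstruction, assuming the standard linear-chain CRF and reusing the slide's symbols (state features f with weights λ, transition features g with weights μ); the indices i, j, t and the normalizer Z(x) are assumed notation, not taken from the slide:

```latex
P(\mathbf{y} \mid \mathbf{x}) =
  \frac{1}{Z(\mathbf{x})}
  \exp\!\Big( \sum_{t} \Big[ \sum_{i} \lambda_i \, f_i(\mathbf{x}, y_t)
      + \sum_{j} \mu_j \, g_j(\mathbf{x}, y_t, y_{t-1}) \Big] \Big),
\qquad
Z(\mathbf{x}) =
  \sum_{\mathbf{y}'}
  \exp\!\Big( \sum_{t} \Big[ \sum_{i} \lambda_i \, f_i(\mathbf{x}, y'_t)
      + \sum_{j} \mu_j \, g_j(\mathbf{x}, y'_t, y'_{t-1}) \Big] \Big)
```

Under this reading, each state term and each transition term corresponds to one clique of the chain-structured graph.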
Conditional Random Fields
  • Conceptual Overview
    • Each attribute of the data we are trying to model fits into a feature function that associates the attribute and a possible label
      • A positive value if the attribute appears in the data
      • A zero value if the attribute is not in the data
    • Each feature function carries a weight that gives the strength of that feature function for the proposed label
      • High positive weights indicate a good association between the feature and the proposed label
      • High negative weights indicate a negative association between the feature and the proposed label
      • Weights close to zero indicate the feature has little or no impact on the identity of the label
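The weighting scheme in this overview can be sketched for a single frame as follows. This is a minimal illustration, not the presenters' implementation: the attribute names, labels, and every weight value are hypothetical, and transition features are left out.

```python
import math

# Hypothetical weights for (attribute, label) state features.
# Values are illustrative only and do not come from the presentation.
WEIGHTS = {
    ("stop", "/t/"): 10.0,    # high positive: good association
    ("stop", "/iy/"): -6.0,   # high negative: negative association
    ("voiced", "/t/"): -2.0,
    ("voiced", "/iy/"): 3.0,
    ("high", "/t/"): 0.1,     # close to zero: little or no impact
    ("high", "/iy/"): 4.0,
}
LABELS = ["/t/", "/iy/"]


def label_scores(attributes):
    """Sum weight * feature value for each candidate label.

    `attributes` maps an attribute name to its feature value
    (positive if the attribute appears in the data, zero otherwise).
    """
    return {
        label: sum(WEIGHTS.get((attr, label), 0.0) * value
                   for attr, value in attributes.items())
        for label in LABELS
    }


def label_posteriors(attributes):
    """Normalize the exponentiated scores over the labels (one frame only)."""
    scores = label_scores(attributes)
    z = sum(math.exp(s) for s in scores.values())
    return {label: math.exp(s) / z for label, s in scores.items()}
```

With only the hypothetical "stop" attribute active, the /t/ label receives the full weight of 10 and dominates the normalized distribution, while a near-zero weight such as the one on ("high", "/t/") barely moves the score.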
Experimental Setup
  • Attribute Detectors
    • ICSI QuickNet Neural Networks
  • Two different types of attributes
    • Phonological feature detectors
      • Place, Manner, Voicing, Vowel Height, Backness, etc.
      • Features are grouped into eight classes, with each class having a variable number of possible values based on the IPA phonetic chart
    • Phone detectors
      • Neural network outputs based on the phone labels, one output per label
    • Classifiers were applied to 2960 utterances from the TIMIT training set
Experimental Setup
  • Outputs from the neural nets are themselves treated as feature functions for the observed sequence: each attribute/label combination gives us a value for one feature function
    • Note that this makes the feature functions non-binary features.
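One way to read this setup is sketched here: each detector output is paired with each candidate label, and the feature takes the real-valued detector output when the proposed label matches. The function name, attribute names, and values are assumptions made for illustration.

```python
def state_features(nn_outputs, proposed_label, labels):
    """Build non-binary state features for one frame.

    nn_outputs:     {attribute: detector output} (hypothetical values)
    proposed_label: the label being scored for this frame
    labels:         all candidate labels

    Returns {(attribute, label): value}; the value is the detector output
    when `label` is the proposed label and 0.0 otherwise.
    """
    return {
        (attr, label): (value if label == proposed_label else 0.0)
        for attr, value in nn_outputs.items()
        for label in labels
    }


# Hypothetical detector outputs for a single frame.
frame = {"stop": 0.92, "voiced": 0.07}
features = state_features(frame, "/t/", ["/t/", "/iy/"])
```

Here `features[("stop", "/t/")]` carries the detector's output 0.92 rather than a 0/1 indicator, which is what makes the features non-binary.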
Experiment 1
  • Goal: Implement a Conditional Random Field Model on ASAT-style phonological feature data
    • Perform phone recognition
    • Compare results to those obtained via a Tandem HMM system
Experiment 1 - Results
  • CRF system trained on monophones with these features achieves accuracy superior to HMM on monophones
    • CRF comes close to achieving HMM triphone accuracy
Experiment 2
  • Goals:
    • Apply CRF model to phone classifier data
    • Apply CRF model to combined phonological feature classifier data and phone classifier data
      • Perform phone recognition
      • Compare results to those obtained via a Tandem HMM system
Experiment 2 - Results

Note that the Tandem HMM result is the best result obtained using only the top 39 features following a principal components analysis

Experiment 3
  • Goal:
    • Previous CRF experiments used phone posteriors for the CRF, and linear outputs transformed via a Karhunen-Loeve (KL) transform for the HMM system
      • This transformation is needed to improve the HMM performance through decorrelation of inputs
    • Using the same linear outputs as the HMM system, do our results change?
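The decorrelating effect of a KL transform can be illustrated on toy data. The sketch below assumes two-dimensional inputs so that the eigenvectors of the covariance matrix can be found in closed form; a real system would work in many more dimensions with a linear algebra library, and the function names here are hypothetical.

```python
import math


def covariance_2d(points):
    """Return the mean and the (cxx, cxy, cyy) covariance terms of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (cxx, cxy, cyy)


def kl_transform_2d(points):
    """Project centered 2-D points onto the eigenvectors of their covariance.

    The principal-axis angle has the closed form
    theta = 0.5 * atan2(2 * cxy, cxx - cyy); rotating the centered data
    by -theta yields components whose covariance is (numerically) zero.
    """
    (mx, my), (cxx, cxy, cyy) = covariance_2d(points)
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    c, s = math.cos(theta), math.sin(theta)
    return [((x - mx) * c + (y - my) * s,
             -(x - mx) * s + (y - my) * c) for x, y in points]


# Strongly correlated toy inputs (made-up numbers).
data = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]
decorrelated = kl_transform_2d(data)
```

After the transform the cross-covariance term of `decorrelated` is zero up to rounding, and the first component holds the larger share of the variance.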
Experiment 3 - Results

Also shown: adding both feature sets together, and thereby giving the system supposedly redundant information, leads to a gain in accuracy

Experiment 4
  • Goal:
    • Previous CRF experiments did not allow for realignment of the training labels
      • Boundaries for labels provided by TIMIT hand transcribers were used throughout training
      • HMM systems were allowed to shift boundaries during EM learning
    • If we allow for realignment in our training process, can we improve the CRF results?
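The slides do not say how realignment was carried out. One common approach is a Viterbi-style forced alignment, sketched below under that assumption: the phone sequence is kept fixed and only the frame boundaries move to maximize the total per-frame score. The function name and the scores are hypothetical, and transition scores are omitted for brevity.

```python
def force_align(frame_scores, phone_sequence):
    """Assign each frame to a phone, keeping the phone order fixed.

    frame_scores:   list of {phone: score}, one dict per frame
    phone_sequence: phones in the order they must occur
    Returns the highest-scoring per-frame labelling.
    """
    n_frames, n_phones = len(frame_scores), len(phone_sequence)
    if n_phones == 0 or n_phones > n_frames:
        raise ValueError("need at least one frame per phone")
    neg_inf = float("-inf")
    best = [[neg_inf] * n_phones for _ in range(n_frames)]
    back = [[0] * n_phones for _ in range(n_frames)]
    best[0][0] = frame_scores[0][phone_sequence[0]]
    for t in range(1, n_frames):
        for k in range(n_phones):
            stay = best[t - 1][k]                       # remain in phone k
            advance = best[t - 1][k - 1] if k > 0 else neg_inf
            if advance > stay:
                prev, back[t][k] = advance, k - 1       # boundary falls here
            else:
                prev, back[t][k] = stay, k
            if prev > neg_inf:
                best[t][k] = prev + frame_scores[t][phone_sequence[k]]
    # Trace back from the last phone at the last frame.
    k = n_phones - 1
    path = [k]
    for t in range(n_frames - 1, 0, -1):
        k = back[t][k]
        path.append(k)
    path.reverse()
    return [phone_sequence[i] for i in path]


# Hypothetical per-frame scores for the /k/ /iy/ example.
frames = [
    {"/k/": 2.0, "/iy/": 0.1},
    {"/k/": 1.5, "/iy/": 0.5},
    {"/k/": 0.2, "/iy/": 1.8},
    {"/k/": 0.1, "/iy/": 2.2},
    {"/k/": 0.0, "/iy/": 2.0},
]
aligned = force_align(frames, ["/k/", "/iy/"])
```

In a training loop, the realigned labels would replace the hand-transcribed boundaries before the next round of parameter estimation.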
Experiment 4 - Results

Allowing realignment gives accuracy results for a monophone-trained CRF that are superior to those of a triphone-trained HMM, with fewer parameters
