Part of Speech Tagging & Hidden Markov Models
Mitch Marcus, CSE 391

NLP Task I – Determining Part of Speech Tags

The Problem:
For a string of words
W = w1w2w3…wn
find the string of POS tags
T = t1 t2 t3 …tn
which maximizes P(T|W)
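A one-line derivation of the quantity actually maximized (Bayes' rule; P(W) is constant for a fixed word string, so it can be dropped):

    \hat{T} = \arg\max_{T} P(T \mid W)
            = \arg\max_{T} \frac{P(W \mid T)\, P(T)}{P(W)}
            = \arg\max_{T} P(W \mid T)\, P(T)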
A Simple, Impossible Approach to Compute P(T|W):
Count up instances of the string "heat oil in a large pot" in the training corpus, and pick the most common tag assignment to the string.
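A minimal sketch of this impossible approach, assuming a hypothetical tagged_corpus of (words, tags) sentence pairs (the name is illustrative, not from the slides); for almost any new sentence the matching count is zero, which is exactly why it fails:

    from collections import Counter

    def most_common_tagging(sentence, tagged_corpus):
        """Naive whole-string tagger: find every training sentence whose
        words match the input exactly, and return the most frequent tag
        string among them. Returns None when the sentence never occurs
        verbatim in training data, which is the usual case."""
        counts = Counter(tuple(tags)
                         for words, tags in tagged_corpus
                         if words == sentence)
        return counts.most_common(1)[0][0] if counts else None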
But we can't accurately estimate more than tag bigrams or so…
Again, we change to a model that we CAN estimate:
So, for a given string W = w1w2w3…wn, the tagger needs to find the string of tags T which maximizes P(W|T)P(T).
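Written out, the bigram HMM model factors this product into per-word emission and tag-transition probabilities, both estimable from a tagged corpus (with t0 a start-of-sentence pseudo-tag):

    P(W \mid T)\, P(T) \approx \prod_{i=1}^{n} P(w_i \mid t_i)\, P(t_i \mid t_{i-1})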
Given an HMM and an observation sequence O = o1o2…oT, how do we find the state sequence that best explains the observations? The standard answer is the Viterbi algorithm, sketched below.
(This and following slides follow classic formulation by Rabiner and Juang, as adapted by Manning and Schutze. Slides adapted from Dorr.)
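A minimal Viterbi sketch, assuming transition probabilities trans[prev][cur] = P(cur|prev), emission probabilities emit[tag][word] = P(word|tag), and a 'START' pseudo-tag; these names and the 1e-12 floor for unseen events are illustrative assumptions, not from the slides:

    import math

    def viterbi(words, tags, trans, emit):
        """Find the tag sequence T maximizing prod P(w_i|t_i) P(t_i|t_{i-1}).
        Works in log space to avoid underflow on long sentences."""
        # delta[t] = best log-probability of any tag path ending in tag t
        delta = {t: math.log(trans['START'].get(t, 1e-12))
                    + math.log(emit[t].get(words[0], 1e-12))
                 for t in tags}
        back = []  # back[i][t] = best predecessor of tag t at position i+1
        for w in words[1:]:
            ptr, new_delta = {}, {}
            for t in tags:
                # best previous tag for current tag t
                prev = max(tags, key=lambda p: delta[p]
                           + math.log(trans[p].get(t, 1e-12)))
                ptr[t] = prev
                new_delta[t] = (delta[prev]
                                + math.log(trans[prev].get(t, 1e-12))
                                + math.log(emit[t].get(w, 1e-12)))
            back.append(ptr)
            delta = new_delta
        # follow back-pointers from the best final tag
        best = max(tags, key=delta.get)
        path = [best]
        for ptr in reversed(back):
            path.append(ptr[path[-1]])
        return list(reversed(path))

With a tag set and probabilities estimated from a tagged corpus, viterbi("heat oil in a large pot".split(), tags, trans, emit) would return the highest-probability tag string for the sentence.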
We define γt(i) as the probability of being in state si at time t, given the complete observation sequence O.
The model parameters are re-estimated by the following update rules:
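A standard statement of these quantities and update rules, following the Rabiner formulation cited above (α and β denote the usual forward and backward probabilities; a_{ij} the transitions, b_j(k) the emissions):

    \gamma_t(i) = P(q_t = s_i \mid O)
                = \frac{\alpha_t(i)\,\beta_t(i)}{\sum_{j=1}^{N}\alpha_t(j)\,\beta_t(j)}

    \xi_t(i,j) = P(q_t = s_i,\; q_{t+1} = s_j \mid O)
               = \frac{\alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}
                      {\sum_{k=1}^{N}\sum_{l=1}^{N}\alpha_t(k)\, a_{kl}\, b_l(o_{t+1})\, \beta_{t+1}(l)}

    \bar{\pi}_i = \gamma_1(i) \qquad
    \bar{a}_{ij} = \frac{\sum_{t=1}^{T-1}\xi_t(i,j)}{\sum_{t=1}^{T-1}\gamma_t(i)} \qquad
    \bar{b}_j(k) = \frac{\sum_{t:\, o_t = v_k}\gamma_t(j)}{\sum_{t=1}^{T}\gamma_t(j)}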