Persian Part Of Speech Tagging
Mostafa Keikha
Database Research Group (DBRG)
ECE Department, University of Tehran
Decision Trees • Decision Tree (DT): • Tree where the root and each internal node are labeled with a question. • The arcs represent each possible answer to the associated question. • Each leaf node represents a prediction of a solution to the problem. • A popular technique for classification; the leaf node indicates the class to which the corresponding tuple belongs.
Decision Trees • A Decision Tree Model is a computational model consisting of three parts: • A decision tree • An algorithm to create the tree • An algorithm that applies the tree to data • Creation of the tree is the most difficult part. • Processing is basically a search similar to that in a binary search tree (although a DT need not be binary), as sketched below.
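As a rough illustration of the "apply the tree to data" part, here is a minimal sketch (the class and function names are hypothetical, not from the presentation): each internal node asks a question, the answer selects which arc to follow, and the walk ends at a leaf whose class label is returned.

```python
class Node:
    """One node of a decision tree: either an internal question node or a leaf."""
    def __init__(self, question=None, branches=None, prediction=None):
        self.question = question        # callable mapping a tuple to an answer; None on leaves
        self.branches = branches or {}  # answer -> child Node (one arc per possible answer)
        self.prediction = prediction    # class label, set only on leaf nodes

def classify(root, item):
    """Walk from the root to a leaf, following the arc that matches each answer."""
    node = root
    while node.question is not None:
        node = node.branches[node.question(item)]
    return node.prediction
```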
Using DT in POS Tagging • Compute ambiguity classes • Each term may take several different tags • The ambiguity class of a term is the set of all its possible tags • Compute the number of occurrences of each tag in each ambiguity class (a sketch follows below)
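A minimal sketch of how the ambiguity classes and their tag counts could be computed from a tagged training corpus; the data format and names are assumptions for illustration only.

```python
from collections import Counter, defaultdict

def ambiguity_class_counts(tagged_corpus):
    """tagged_corpus: list of (word, tag) pairs from the training data."""
    tags_per_word = defaultdict(set)
    for word, tag in tagged_corpus:
        tags_per_word[word].add(tag)

    # Ambiguity class of a word = the set of all tags it can take.
    # For each class, count how often each member tag actually occurs.
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[frozenset(tags_per_word[word])][tag] += 1
    return counts
```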
Using DT in POS Tagging • Create the decision tree over the ambiguity classes • At each level, delete the tag with the minimum number of occurrences (see the sketch below). For example, starting from the class {a, b, c, d} with counts 10, 20, 25, 40, tag a is removed first; at the next level {b, c, d} has counts 40, 39, 50 and c is removed; then {b, d} has counts 60, 55 and d is removed, leaving b at the leaf.
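A sketch of the pruning rule described above. For brevity it keeps one fixed set of counts, whereas the slide's figure recomputes the counts at each level; all names are illustrative.

```python
def elimination_order(tag_counts):
    """Repeatedly drop the tag with the fewest occurrences until one tag remains."""
    remaining = dict(tag_counts)
    dropped = []
    while len(remaining) > 1:
        weakest = min(remaining, key=remaining.get)
        dropped.append(weakest)
        del remaining[weakest]
    return dropped, next(iter(remaining))   # (order of removal, surviving tag)

# elimination_order({"a": 10, "b": 20, "c": 25, "d": 40}) -> (["a", "b", "c"], "d")
```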
Using DT in POS Tagging • Advantage • Easy to understand • Easy to implement • Disadvantage • Context independent
Using DT in POS Tagging • Known Tokens Results
POS tagging using HMMs Let W be a sequence of words, W = w1, w2, …, wn, and let T be the corresponding tag sequence, T = t1, t2, …, tn. Task: find the T which maximizes P(T | W), i.e. T’ = argmaxT P(T | W).
POS tagging using HMMs By Bayes' rule, P(T | W) = P(W | T) * P(T) / P(W), so T’ = argmaxT P(W | T) * P(T).
Transition probability: P(T) = P(t1) * P(t2 | t1) * P(t3 | t1 t2) * … * P(tn | t1 … tn-1).
Applying the trigram approximation: P(T) = P(t1) * P(t2 | t1) * P(t3 | t1 t2) * … * P(tn | tn-2 tn-1).
Introducing a dummy tag, $, to represent the beginning of a sentence: P(T) = P(t1 | $) * P(t2 | $ t1) * P(t3 | t1 t2) * … * P(tn | tn-2 tn-1).
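A minimal sketch of estimating the trigram transition probabilities from tagged training sentences. The slide introduces a single dummy tag $; padding each sentence with two $ symbols, as done here, is an equivalent convenience so that every real tag has a two-tag history. All names are illustrative, not the author's code.

```python
from collections import Counter

DUMMY = "$"   # dummy tag marking the beginning of a sentence

def transition_counts(tag_sequences):
    """tag_sequences: list of tag sequences, one per training sentence."""
    trigram, history = Counter(), Counter()
    for tags in tag_sequences:
        padded = [DUMMY, DUMMY] + list(tags)
        for i in range(2, len(padded)):
            trigram[(padded[i - 2], padded[i - 1], padded[i])] += 1
            history[(padded[i - 2], padded[i - 1])] += 1
    return trigram, history

def p_transition(trigram, history, t_prev2, t_prev1, t):
    """Maximum-likelihood estimate of P(t | t_prev2 t_prev1)."""
    denom = history[(t_prev2, t_prev1)]
    return trigram[(t_prev2, t_prev1, t)] / denom if denom else 0.0
```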
POS tagging using HMMs Smoothing transition probabilities: the sparse-data problem is handled with the linear interpolation method, P'(ti | ti-2, ti-1) = λ1 P(ti) + λ2 P(ti | ti-1) + λ3 P(ti | ti-2, ti-1), such that the λs sum to 1.
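The interpolation itself is a one-liner once the component estimates are available; a sketch, with the λs as defined above:

```python
def p_interpolated(p_uni, p_bi, p_tri, lambdas):
    """P'(t_i | t_{i-2}, t_{i-1}) = λ1·P(t_i) + λ2·P(t_i | t_{i-1}) + λ3·P(t_i | t_{i-2}, t_{i-1}),
    where lambdas = (λ1, λ2, λ3) sum to 1."""
    l1, l2, l3 = lambdas
    return l1 * p_uni + l2 * p_bi + l3 * p_tri
```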
POS tagging using HMMs • Calculation of the λs
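The slide's own formula for the λs is not preserved in this transcript. Purely as an assumption, a common way to set them is the deleted-interpolation procedure from Brants' TnT tagger: for every observed trigram, check which history length predicts its last tag best once that trigram is removed from the counts, and credit the corresponding λ.

```python
def estimate_lambdas(trigram, bigram, unigram, total_tags):
    """Deleted-interpolation estimate of (λ1, λ2, λ3); `trigram`, `bigram`, `unigram`
    are Counters over tag n-grams and `total_tags` is the number of tag tokens."""
    l1 = l2 = l3 = 0.0
    for (t1, t2, t3), c in trigram.items():
        # Score each history length with the current trigram removed from the counts.
        tri = (c - 1) / (bigram[(t1, t2)] - 1) if bigram[(t1, t2)] > 1 else 0.0
        bi = (bigram[(t2, t3)] - 1) / (unigram[t2] - 1) if unigram[t2] > 1 else 0.0
        uni = (unigram[t3] - 1) / (total_tags - 1) if total_tags > 1 else 0.0
        best = max(tri, bi, uni)
        if best == tri:
            l3 += c
        elif best == bi:
            l2 += c
        else:
            l1 += c
    s = l1 + l2 + l3
    return (l1 / s, l2 / s, l3 / s) if s else (1 / 3, 1 / 3, 1 / 3)
```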
POS tagging using HMMs Emission probability: P(W | T) ≈ P(w1 | t1) * P(w2 | t2) * … * P(wn | tn).
Context dependency: to make the model more dependent on the context, the emission probability is instead calculated as P(W | T) ≈ P(w1 | $ t1) * P(w2 | t1 t2) * … * P(wn | tn-1 tn).
POS tagging using HMMs • The same smoothing technique is applied: P'(wi | ti-1 ti) = θ1 P(wi | ti) + θ2 P(wi | ti-1 ti), where the θs sum to 1. • The θs are different for different words.
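A sketch of the smoothed, context-dependent emission probability from the last two slides. The count tables (Counters filled from the training corpus) and the per-word θ values are assumptions for illustration.

```python
def p_emission(word, t_prev, t, word_tag, tag_count, word_tagpair, tagpair, theta1, theta2):
    """P'(w_i | t_{i-1} t_i) = θ1·P(w_i | t_i) + θ2·P(w_i | t_{i-1} t_i), with θ1 + θ2 = 1."""
    p_tag = word_tag[(word, t)] / tag_count[t] if tag_count[t] else 0.0
    p_ctx = (word_tagpair[(word, t_prev, t)] / tagpair[(t_prev, t)]
             if tagpair[(t_prev, t)] else 0.0)
    return theta1 * p_tag + theta2 * p_ctx
```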
POS tagging using HMMs • Lexicon generation probability
POS tagging using HMMs P(N V ART N | files like a flower) = 4.37*10-6
POS tagging using HMMs • Known Tokens Results