Natural Language Processing



Name: Christoffel Daniel Yesaya Tambunan
Student ID: 9169420231
Course: Natural Language Processing

a) Prove that the loss $J$ (Equation 2) is the same as the cross-entropy loss between $y$ and $\hat{y}$ (note that $y$ and $\hat{y}$ are vectors and $\hat{y}_o$ is a scalar).

Answer: $y$ is a one-hot encoded vector.

• For $w \neq o$: $y_w = 0$, so $y_w \log(\hat{y}_w) = 0$.
• For $w = o$: $y_w = 1$, so $y_w \log(\hat{y}_w) = \log(\hat{y}_o)$.

Substituting, only the $w = o$ term of the cross-entropy loss survives:

$$-\sum_{w \in \mathrm{Vocab}} y_w \log(\hat{y}_w) = -\log(\hat{y}_o) = J_{\text{naive-softmax}}(v_c, o, U),$$

so the loss $J$ is the same as the cross-entropy loss.

b) Compute the partial derivative of $J(v_c, o, U)$ with respect to $v_c$. Please write your answer in terms of $y$, $\hat{y}$, and $U$. Show your work (the whole procedure) to receive full credit.

Answer:

$$\begin{aligned}
\frac{\partial J(v_c, o, U)}{\partial v_c}
&= -\frac{\partial}{\partial v_c} \log \frac{\exp(u_o^\top v_c)}{\sum_{w \in \mathrm{Vocab}} \exp(u_w^\top v_c)} \\
&= -\frac{\partial}{\partial v_c} \Big( \log \exp(u_o^\top v_c) - \log \sum_{w \in \mathrm{Vocab}} \exp(u_w^\top v_c) \Big) \\
&= -\frac{\partial}{\partial v_c} \Big( u_o^\top v_c - \log \sum_{w \in \mathrm{Vocab}} \exp(u_w^\top v_c) \Big) \\
&= -u_o + \sum_{w \in \mathrm{Vocab}} \frac{\exp(u_w^\top v_c)}{\sum_{x \in \mathrm{Vocab}} \exp(u_x^\top v_c)}\, u_w \\
&= -u_o + \sum_{w \in \mathrm{Vocab}} P(O = w \mid C = c)\, u_w \\
&= -u_o + \sum_{w \in \mathrm{Vocab}} \hat{y}_w u_w \\
&= U(\hat{y} - y).
\end{aligned}$$
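A quick numerical check makes parts (a) and (b) concrete. The sketch below is a minimal NumPy illustration, assuming $U$ is a $d \times |\mathrm{Vocab}|$ matrix whose columns are the outside vectors $u_w$; the toy dimensions and variable names are mine, not from the assignment. It verifies that $-\log \hat{y}_o$ equals the cross-entropy loss and that $U(\hat{y} - y)$ matches a finite-difference gradient.

```python
import numpy as np

# Toy shapes: U is d x |Vocab| with the outside vectors u_w as columns.
rng = np.random.default_rng(0)
d, vocab_size, o = 5, 10, 3              # o = index of the true outside word
U = rng.normal(size=(d, vocab_size))
v_c = rng.normal(size=d)                 # center word vector

def loss(v):
    """Naive-softmax loss (Equation 2) as a function of the center vector."""
    scores = U.T @ v                     # u_w^T v for every w
    y_hat = np.exp(scores) / np.exp(scores).sum()
    return -np.log(y_hat[o]), y_hat

J, y_hat = loss(v_c)
y = np.zeros(vocab_size)
y[o] = 1.0                               # one-hot label

# (a) Equation 2 equals the cross-entropy loss -sum_w y_w log(y_hat_w).
assert np.isclose(J, -np.sum(y * np.log(y_hat)))

# (b) The closed-form gradient U(y_hat - y) matches finite differences.
grad_vc = U @ (y_hat - y)
eps = 1e-6
num_grad = np.array([(loss(v_c + eps * e)[0] - loss(v_c - eps * e)[0]) / (2 * eps)
                     for e in np.eye(d)])
assert np.allclose(grad_vc, num_grad, atol=1e-5)
```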

c) Compute the partial derivatives of $J(v_c, o, U)$ with respect to each of the 'outside' word vectors, the $u_w$'s. There will be two cases: $w = o$, the true 'outside' word vector, and $w \neq o$, for all other words. Please write your answer in terms of $y$, $\hat{y}$, and $v_c$. In this part, you may use specific elements within these terms as well (such as $y_1, y_2, \ldots$). Note that $u_w$ is a vector while $y_1, y_2, \ldots$ are scalars. Show your work (the whole procedure) to receive full credit.

Answer:

$$\begin{aligned}
\frac{\partial J(v_c, o, U)}{\partial u_w}
&= -\frac{\partial}{\partial u_w} \log \frac{\exp(u_o^\top v_c)}{\sum_{x \in \mathrm{Vocab}} \exp(u_x^\top v_c)} \\
&= -\frac{\partial}{\partial u_w} \Big( u_o^\top v_c - \log \sum_{x \in \mathrm{Vocab}} \exp(u_x^\top v_c) \Big) \\
&= -\frac{\partial (u_o^\top v_c)}{\partial u_w} + \frac{\partial}{\partial u_w} \log \sum_{x \in \mathrm{Vocab}} \exp(u_x^\top v_c)
\end{aligned}$$

• When $w = o$:
$$\frac{\partial J}{\partial u_o} = -v_c + P(O = o \mid C = c)\, v_c = \hat{y}_o v_c - v_c = (\hat{y}_o - 1)\, v_c$$
• When $w \neq o$:
$$\frac{\partial J}{\partial u_w} = 0 + P(O = w \mid C = c)\, v_c = \hat{y}_w v_c$$

d) Write down the partial derivative of $J(v_c, o, U)$ with respect to $U$. Please break down your answer in terms of the columns $\frac{\partial J(v_c, o, U)}{\partial u_1}, \frac{\partial J(v_c, o, U)}{\partial u_2}, \ldots, \frac{\partial J(v_c, o, U)}{\partial u_{|\mathrm{Vocab}|}}$. No derivations are necessary, just an answer in the form of a matrix.

Answer: The partial derivative of $J$ with respect to $U$ is the matrix whose columns are the partial derivatives of $J$ with respect to each outside word vector:

$$\frac{\partial J}{\partial U} = \left[\, \frac{\partial J(v_c, o, U)}{\partial u_1},\ \frac{\partial J(v_c, o, U)}{\partial u_2},\ \ldots,\ \frac{\partial J(v_c, o, U)}{\partial u_{|\mathrm{Vocab}|}} \,\right].$$
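Both cases in (c) collapse to the single expression $(\hat{y}_w - y_w)\, v_c$, and stacking those columns gives the matrix in (d). A minimal sketch under the same assumed column-major layout for $U$ as above; the compact outer-product form $v_c (\hat{y} - y)^\top$ is my own restatement of the column-stacked answer, not taken from the slides.

```python
import numpy as np

# Same assumed layout as above: columns of U are the outside vectors u_w.
rng = np.random.default_rng(1)
d, vocab_size, o = 5, 10, 3
U = rng.normal(size=(d, vocab_size))
v_c = rng.normal(size=d)

scores = U.T @ v_c
y_hat = np.exp(scores) / np.exp(scores).sum()
y = np.zeros(vocab_size)
y[o] = 1.0

# (c) Both cases collapse to (y_hat_w - y_w) v_c:
#     w == o gives (y_hat_o - 1) v_c, and w != o gives y_hat_w v_c.
grad_cols = [(y_hat[w] - y[w]) * v_c for w in range(vocab_size)]

# (d) Stacking the columns gives dJ/dU, i.e. the outer product v_c (y_hat - y)^T.
grad_U = np.outer(v_c, y_hat - y)
assert np.allclose(grad_U, np.column_stack(grad_cols))
```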

e) Suppose the center word is $c = w_t$ and the context window is $[w_{t-m}, \ldots, w_{t-1}, w_t, w_{t+1}, \ldots, w_{t+m}]$, where $m$ is the context window size. Recall that for the skip-gram version of word2vec, the total loss for the context window is:

$$J_{\text{skip-gram}}(v_c, w_{t-m}, \ldots, w_{t+m}, U) = \sum_{-m \le j \le m,\, j \neq 0} J(v_c, w_{t+j}, U)$$

Here, $J(v_c, w_{t+j}, U)$ represents an arbitrary loss term for the center word $c = w_t$ and outside word $w_{t+j}$; it is equal to Equation 2. Write down three partial derivatives:

I. $\partial J_{\text{skip-gram}}(v_c, w_{t-m}, \ldots, w_{t+m}, U) / \partial U$
II. $\partial J_{\text{skip-gram}}(v_c, w_{t-m}, \ldots, w_{t+m}, U) / \partial v_c$
III. $\partial J_{\text{skip-gram}}(v_c, w_{t-m}, \ldots, w_{t+m}, U) / \partial v_w$ when $w \neq c$

Write your answers in terms of $\partial J(v_c, w_{t+j}, U) / \partial U$ and $\partial J(v_c, w_{t+j}, U) / \partial v_c$. This is very simple; each solution should be one line.

Answer:

I. $\displaystyle \frac{\partial J_{\text{skip-gram}}(v_c, w_{t-m}, \ldots, w_{t+m}, U)}{\partial U} = \sum_{-m \le j \le m,\, j \neq 0} \frac{\partial J(v_c, w_{t+j}, U)}{\partial U}$

II. $\displaystyle \frac{\partial J_{\text{skip-gram}}(v_c, w_{t-m}, \ldots, w_{t+m}, U)}{\partial v_c} = \sum_{-m \le j \le m,\, j \neq 0} \frac{\partial J(v_c, w_{t+j}, U)}{\partial v_c}$

III. $\displaystyle \frac{\partial J_{\text{skip-gram}}(v_c, w_{t-m}, \ldots, w_{t+m}, U)}{\partial v_w} = 0$ when $w \neq c$
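Because the skip-gram loss is a plain sum over window positions, its gradients are just sums of the per-pair gradients from (b) and (d). A minimal sketch assuming the same shapes as above; naive_softmax_grads and the toy window indices are hypothetical names for illustration.

```python
import numpy as np

def naive_softmax_grads(v_c, o, U):
    """Gradients (dJ/dv_c, dJ/dU) of Equation 2 for one (center, outside) pair."""
    scores = U.T @ v_c
    y_hat = np.exp(scores) / np.exp(scores).sum()
    y = np.zeros(U.shape[1])
    y[o] = 1.0
    return U @ (y_hat - y), np.outer(v_c, y_hat - y)

rng = np.random.default_rng(2)
d, vocab_size = 5, 10
U = rng.normal(size=(d, vocab_size))
v_c = rng.normal(size=d)

# Outside word indices w_{t+j} for j in {-m, ..., m}, j != 0 (toy window, m = 2).
outside_indices = [7, 2, 2, 9]

# I and II: the window gradients are sums of the per-pair gradients.
grad_U  = sum(naive_softmax_grads(v_c, o, U)[1] for o in outside_indices)
grad_vc = sum(naive_softmax_grads(v_c, o, U)[0] for o in outside_indices)

# III: dJ/dv_w = 0 for every w != c, because no other center vector v_w
# appears anywhere in the skip-gram loss.
```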
