Semantic Addressable Encoding. Cheng-Yuan Liou, Jau-Chi Huang, and Wen-Chie Yang Department of Computer Science and Information Engineering National Taiwan University TC402, Oct. 5, ICONIP 2006, Hong Kong. Web red.csie.ntu.edu.tw. Sentence generating function


Semantic Addressable Encoding

Cheng-Yuan Liou, Jau-Chi Huang, and Wen-Chie Yang

Department of Computer Science and Information Engineering

National Taiwan University

TC402, Oct. 5, ICONIP 2006, Hong Kong

Web red.csie.ntu.edu.tw
  • Sentence generating function
  • The semantic world of Mark Twain
  • Semantic Search under Shakespeare
Outline
  • Introduction
  • Encoding Method
    • Elman network
    • The word corpus – Elman’s idea
    • Review semantic search
    • Multidimensional Scaling (MDS) space
    • Representative vector of a document
    • Iterative re-encoding
  • Example
  • Summary
Introduction
  • A central problem in semantic analysis is to effectively encode and extract the contents of word sequences.
  • The traditional way of creating a prime semantic space is extremely expensive and complex because experienced linguists are required to analyze a huge number of words.
  • This paper presents an automatic encoding process.
Elman Network
  • Uoh: Lh x Lo weight matrix
  • Uhi: Li x Lh weight matrix
  • Uhc: Lc x Lh weight matrix
  • Lo = # neurons in output layer
  • Lh = # neurons in hidden layer
  • Li = # neurons in input layer
  • Lc = # neurons in context layer
  • The context layer carries memory.
  • The hidden layer activates the output layer and refreshes the context layer.
  • [Figure: desired behavior after the training process]
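The layer description above can be sketched as a single forward step. This is a minimal illustration, not the authors' implementation: the sigmoid activation, the small random initial weights, the zero initial context, and the names `ElmanNetwork`, `step`, and `matvec` are all assumptions. The only details taken from the slide are the matrix shapes and the rule that the hidden layer drives the output and refreshes the context (which implies Lc = Lh).

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(vec, mat):
    """Multiply a row vector (length = rows) by a rows x cols matrix."""
    cols = len(mat[0])
    return [sum(vec[i] * mat[i][j] for i in range(len(vec))) for j in range(cols)]


def random_matrix(rows, cols, rng):
    # Small random weights; the initialization scheme is an assumption.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class ElmanNetwork:
    def __init__(self, Li, Lh, Lo, seed=0):
        rng = random.Random(seed)
        self.Lc = Lh                                  # context mirrors the hidden layer
        self.Uhi = random_matrix(Li, Lh, rng)         # input   -> hidden, Li x Lh
        self.Uhc = random_matrix(self.Lc, Lh, rng)    # context -> hidden, Lc x Lh
        self.Uoh = random_matrix(Lh, Lo, rng)         # hidden  -> output, Lh x Lo
        self.context = [0.0] * self.Lc                # memory starts empty (assumption)

    def step(self, x):
        """Feed one word code x; return (hidden, output) and refresh the context."""
        from_input = matvec(x, self.Uhi)
        from_context = matvec(self.context, self.Uhc)
        hidden = [sigmoid(a + b) for a, b in zip(from_input, from_context)]
        output = [sigmoid(v) for v in matvec(hidden, self.Uoh)]
        self.context = hidden[:]                      # hidden layer refreshes the context
        return hidden, output
```

Calling `step` once per word of a sequence makes each hidden vector depend on the preceding words through the context layer. Training (backpropagation toward the next word's code) is omitted here.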
The word corpus – Elman’s idea
  • All words are coded with certain given lexical codes and all word sequences in corpus D follow the syntax (Noun + Verb + Noun).
  • After training, input all sequences again and record all hidden outputs for each individual input.
  • Obtain a new code for the nth word by averaging all hidden output vectors recorded for that word.
  • Construct a word tree based on the new codes to explore the relationship between words.
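The averaging step in the list above can be sketched as follows. The function name `average_codes` and the `(word, hidden_vector)` record format are hypothetical. The sketch assumes the hidden outputs were already recorded while replaying the corpus through the trained network.

```python
from collections import defaultdict


def average_codes(records):
    """Average the recorded hidden vectors of each word.

    records: iterable of (word, hidden_vector) pairs, one per word occurrence.
    Returns a dict mapping each word to its new code (the mean hidden vector).
    """
    sums = {}
    counts = defaultdict(int)
    for word, vec in records:
        if word not in sums:
            sums[word] = [0.0] * len(vec)
        sums[word] = [s + v for s, v in zip(sums[word], vec)]
        counts[word] += 1
    return {w: [s / counts[w] for s in sums[w]] for w in sums}
```

Words that occur in similar contexts produce similar hidden vectors, so their averaged codes end up close together, which is what the word tree is then built on.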
Review semantic search
  • The conventional semantic search constructs a semantic model and a semantic measure.
  • A manually designed semantic code set by experts is used in the model. (main focus)
  • One can build a raw semantic matrix W for all N different words
  • A code of a word is a column vector of R features
  • One may use the orthogonal space configured by the characteristic decomposition (eigendecomposition) of the matrix WW^T.
The semantic search
  • Since WW^T is symmetric and positive semidefinite, all its eigenvalues are real and nonnegative.
  • Each eigenvalue λi equals the variance of the N projections of the codes onto the ith eigenvector fi.
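The eigenvalue statement can be written out explicitly. The derivation below is a sketch under stated assumptions: w_n denotes the nth column of W (the code of the nth word), f_i has unit length, and "variance" is read as the unnormalized sum of squared projections of zero-mean codes. The symbol w_n does not appear in the transcript.

```latex
W W^{T} f_i = \lambda_i f_i, \qquad f_i^{T} f_i = 1
\;\;\Longrightarrow\;\;
\lambda_i = f_i^{T} W W^{T} f_i
          = \left\lVert W^{T} f_i \right\rVert^{2}
          = \sum_{n=1}^{N} \left( f_i^{T} w_n \right)^{2} \;\ge\; 0 .
```

The last expression is a sum of squares, which also shows why no eigenvalue can be negative.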
Multidimensional Scaling (MDS) space
  • Select a set of Rs eigenvectors {fr, r=1~Rs} from all R eigenvectors to build a reduced feature space
  • The MDS space is MDS = span{Fs}
  • These selected features are independent and significant. The new code of each word in this space is its projection onto the selected eigenvectors.
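A plausible form of the reduced code, given the definitions above, is the projection of each word code onto the selected eigenvectors. The slide's own formula is not in the transcript, so the notation below (w_n for the raw code, w_n^s for the reduced code, and F_s as the matrix whose columns are the selected eigenvectors) is an assumption.

```latex
F_s = \left[\, f_1 \;\; f_2 \;\; \cdots \;\; f_{R_s} \,\right] \in \mathbb{R}^{R \times R_s},
\qquad
w_n^{s} = F_s^{T} w_n \in \mathbb{R}^{R_s}, \quad n = 1, \dots, N .
```

Because the eigenvectors of a symmetric matrix can be chosen orthonormal, the R_s components of w_n^s are uncorrelated across words. Choosing the eigenvectors with the largest eigenvalues would keep the directions of greatest variance.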
Representative vector of a document
  • A representative vector for a document D should contain the semantic meaning of the whole document.
  • Two measures are defined
    • Peak preferred measure
    • Average preferred measure
  • Magnitude is normalized
Representative vector of a document
  • The normalized measure vD is used to represent a whole document. A representative vector vQ for a whole query can be obtained in the same way.
  • The relation score is defined as
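The formulas for the two measures and for the relation score are not present in the transcript, so the sketch below only illustrates the idea under explicit assumptions. The average preferred measure is taken as the component-wise mean of the word codes, the peak preferred measure as the component of largest magnitude, and the relation score as the inner product of the two unit-length vectors (cosine similarity). All function names are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def average_preferred(codes):
    """Assumed form: component-wise mean of the word codes, then normalized."""
    dim = len(codes[0])
    mean = [sum(c[k] for c in codes) / len(codes) for k in range(dim)]
    return normalize(mean)


def peak_preferred(codes):
    """Assumed form: per component, keep the value of largest magnitude."""
    dim = len(codes[0])
    peak = [max((c[k] for c in codes), key=abs) for k in range(dim)]
    return normalize(peak)


def relation_score(v_d, v_q):
    """Assumed form: inner product of the two normalized vectors."""
    return sum(a * b for a, b in zip(v_d, v_q))
```

With these assumptions, a document scored against itself gets a relation score of 1, and the score decreases as the representative vectors of the document and the query point in more different directions.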
Iterative re-encoding
  • Elman’s method for sentence generation with the fixed syntax Noun+Verb+Noun cannot be applied to more complex sentences.
  • We modify his method. Each word initially has a random lexical code.
  • After the jth training epoch, a new raw code is calculated
Iterative re-encoding
  • The set sn contains all predictions for the word wn based on its preceding words.
  • After each epoch, all the codes are normalized by the following two equations. The normalization prevents a diminished solution derived by the backpropagation algorithm.
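The two normalization equations are not in the transcript. The sketch below shows one common way to keep re-encoded codes from shrinking toward a trivial solution, assuming the two steps are (1) subtracting the mean code over all words and (2) rescaling every code to unit length. Both steps and the name `normalize_codes` are assumptions, not the authors' stated equations.

```python
import math


def normalize_codes(codes):
    """Center and rescale word codes after a training epoch.

    codes: dict mapping each word to its raw code (list of floats).
    Step 1 (assumed): subtract the mean code taken over all words.
    Step 2 (assumed): scale each centered code to unit length.
    """
    words = list(codes)
    dim = len(codes[words[0]])
    mean = [sum(codes[w][k] for w in words) / len(words) for k in range(dim)]
    result = {}
    for w in words:
        centered = [codes[w][k] - mean[k] for k in range(dim)]
        norm = math.sqrt(sum(v * v for v in centered))
        result[w] = [v / norm for v in centered] if norm > 0 else centered
    return result
```

Under these assumptions the output does not depend on the overall scale of the raw codes, so codes that backpropagation has driven toward very small values are restored to the same unit-length configuration.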
Example
  • We test the ability to classify 36 of Shakespeare’s plays. Each play is taken as the query input, and the relation score between it and another play is calculated. The figure below shows the relation tree.

Legend: c = comedy, r = romance, h = history, t = tragedy. The number denotes the publication year.

Model parameters: Di=1…36, Qi=1…36, N=10000, Lh=Lc=200, Lo=Li=Rs=R=64

Example
  • We provide a semantic search tool that uses a corpus of Shakespeare’s comedies and tragedies at http://red.csie.ntu.edu.tw/demo/literal/SAS.htm
  • Example search result with parameters Di=1…7777, N=10000, Lo=Li=R=100, Lh=Lc=200, Rs=64
Summary
  • We have explored the concept of semantic addressable encoding and completed a design for it that includes automatic encoding methods.
  • We have presented the result of applying this method in studying literary works.
  • The trained semantic codes can facilitate other research such as linguistic analysis, authorship identification, categorization, etc.
  • The method can be modified to accommodate polysemous words.