
Gözde Özbal, Carlo Strapparava (FBK-irst, Trento, Italy), Daniele Pighin (Google Inc.)

BRAINSUP: Brainstorming Support for Creative Sentence Generation. Gözde Özbal, Carlo Strapparava (FBK-irst, Trento, Italy), Daniele Pighin (Google Inc., Zürich, Switzerland). ACL 2013. Introduction. In the real world, creative writing is very time-consuming and laborious. Advertising slogans: punchy, catchy, memorable. Earlier work has addressed similar problems, but none has proposed a unified format.


Presentation Transcript


  1. BRAINSUP: Brainstorming Support for Creative Sentence Generation • Gözde Özbal, Carlo Strapparava (FBK-irst, Trento, Italy) • Daniele Pighin (Google Inc., Zürich, Switzerland) • ACL 2013

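  2. Introduction • In the real world, creative writing is very time-consuming and laborious • Advertising slogans: punchy, catchy, memorable • Earlier work has addressed similar problems, but none has proposed a unified format • The authors propose BRAINSUP, an extensible framework in which the user can control all the parameters involved in the creative process, so that the output better fits the user's needs.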

  3. Architecture of BRAINSUP • First, the user chooses the target words that must appear in the sentence, and can optionally specify: • a semantic domain: sports, blankets… • an emotion domain: joy, anger, or negative emotions • a color: red, blue… • phonetic properties of the words: rhymes, alliterations and plosives • The user input is U = <t, d, c, e, p, w> • For target and domain words, the user can specify which parts of speech to consider, e.g. "drink/verb" or "drink/verb,noun".
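The input tuple U can be pictured as a small record. The sketch below is only an illustration: the field names are hypothetical, and reading w as a set of per-feature weights is an assumption, since the slide does not define it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# A word plus the parts of speech it may take, e.g. ("drink", ("verb", "noun")).
TaggedWord = Tuple[str, Tuple[str, ...]]


def parse_tagged(spec: str) -> TaggedWord:
    """Parse a spec such as 'drink/verb,noun' into ('drink', ('verb', 'noun'))."""
    word, _, tags = spec.partition("/")
    return word, tuple(tag for tag in tags.split(",") if tag)


@dataclass
class UserInput:
    """Hypothetical container for U = <t, d, c, e, p, w>."""
    target_words: List[TaggedWord]                                  # t
    domain_words: List[TaggedWord] = field(default_factory=list)    # d
    color: Optional[str] = None                                     # c
    emotion: Optional[str] = None                                   # e
    phonetic: List[str] = field(default_factory=list)               # p
    # w: ASSUMED here to be per-feature weights; not defined on the slide.
    weights: Dict[str, float] = field(default_factory=dict)


u = UserInput(
    target_words=[parse_tagged("drink/verb,noun")],
    color="red",
    emotion="joy",
    phonetic=["rhyme"],
)
```

Only the target words are mandatory in this sketch, mirroring the slide's distinction between required target words and optional constraints.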

  4. Architecture of BRAINSUP • Pattern selection • Searching the solution space • Filler selection and solution scoring

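  5. Architecture of BRAINSUP • Θ: the set of meta-parameters (minimum/maximum number of sentences to generate, maximum number of patterns to consider, maximum sentence length…) • User input: U = <t, d, c, e, p, w> • Pattern selection: pick from corpus P the patterns that are frequent and meet the user's requirements • Solution search: given the user input U, pick from treebank L the best solutions that fit pattern p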

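  6. Architecture of BRAINSUP: Pattern selection • Morpho-syntactic patterns are selected from corpus P • First: choose the corpus; different corpora yield sentences in different styles • Second: parse the corpus sentences with the Stanford parser, remove the content words to obtain patterns, and record how many times each pattern occurs in the corpus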
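The pattern-extraction step can be sketched on sentences that are already POS-tagged. This is a simplification under stated assumptions: it skips the Stanford parser, keeps no dependency relations, and treats any token whose tag starts with NN, VB, JJ or RB as a content word. The function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Token = Tuple[str, str]  # (word, POS tag)

# ASSUMPTION: nouns, verbs, adjectives and adverbs count as content words.
CONTENT_PREFIXES = ("NN", "VB", "JJ", "RB")


def to_pattern(sentence: List[Token]) -> Tuple[str, ...]:
    """Replace each content word with an empty slot that keeps only its POS tag."""
    return tuple(
        f"<{pos}>" if pos.startswith(CONTENT_PREFIXES) else word.lower()
        for word, pos in sentence
    )


def count_patterns(corpus: Iterable[List[Token]]) -> Counter:
    """Count how many times each pattern occurs in the corpus."""
    return Counter(to_pattern(sentence) for sentence in corpus)


corpus = [
    [("A", "DT"), ("fool", "NN"), ("and", "CC"), ("his", "PRP$"), ("money", "NN")],
    [("A", "DT"), ("man", "NN"), ("and", "CC"), ("his", "PRP$"), ("dog", "NN")],
]
patterns = count_patterns(corpus)
```

Both toy sentences collapse to the same pattern, which is why the count is kept alongside the pattern itself.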

  7. Architecture of BRAINSUP: Pattern selection • Can the target words chosen by the user be placed in the empty slots? • t = [heading/VBG, edge/NN]: no • t = [heading/NN, edge/NN]: yes

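  8. Architecture of BRAINSUP: Pattern selection • The number of empty slots must be greater than the number of target words: CompatiblePatterns(.) requires slots > |t| • The minimum/maximum number of slots is controlled in Θ • To keep identical inputs from always producing identical results, the sort algorithm includes a random component (also controlled in Θ) • CompatiblePatterns(.) finally returns the patterns ordered by how many times they occur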
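A minimal sketch of the filtering and ranking described for CompatiblePatterns(.) follows. The pattern representation (tuples with "<POS>" slots), the parameter names, and the way randomness is mixed into the sort are all assumptions made for illustration; the slides only state that a random component exists.

```python
import random
from collections import Counter
from typing import Dict, List, Optional, Tuple

Pattern = Tuple[str, ...]


def slot_tags(pattern: Pattern) -> List[str]:
    """Return the POS tags of the empty slots, written as '<TAG>' in a pattern."""
    return [tok[1:-1] for tok in pattern if tok.startswith("<") and tok.endswith(">")]


def compatible_patterns(
    counts: Dict[Pattern, int],
    targets: List[Tuple[str, str]],   # (word, POS) pairs
    min_slots: int,
    max_slots: int,
    jitter: float = 0.0,              # hypothetical knob for the random component
    rng: Optional[random.Random] = None,
) -> List[Pattern]:
    rng = rng or random.Random()
    needed = Counter(pos for _, pos in targets)
    scored = []
    for pattern, count in counts.items():
        available = Counter(slot_tags(pattern))
        n_slots = sum(available.values())
        # The slide requires more slots than target words, within the bounds in Θ.
        if not (min_slots <= n_slots <= max_slots and n_slots > len(targets)):
            continue
        # Every target word needs a slot with a matching POS tag.
        if any(available[pos] < k for pos, k in needed.items()):
            continue
        # ASSUMPTION: randomness enters as a multiplicative perturbation of the count.
        scored.append((count * (1.0 + jitter * rng.random()), pattern))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pattern for _, pattern in scored]


p1 = ("a", "<NN>", "in", "the", "<NN>", "is", "<JJ>")
p2 = ("<VBG>", "the", "<NN>")
p3 = ("<NN>", "of", "<NN>", "and", "<JJ>")
counts = {p1: 5, p2: 9, p3: 7}
ranked = compatible_patterns(counts, [("heading", "NN"), ("edge", "NN")], 1, 5)
```

With `jitter=0.0` the ranking is purely by frequency; a positive value lets repeated runs on the same input explore different patterns.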

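  9. Architecture of BRAINSUP: Searching the solution space • Once the patterns are selected, the next step is to choose which words to fill into each empty slot, starting with the slot that has the most dependencies • A pattern contains only stop words, syntactic relations and morphologic constraints (POS tags)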

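  10. Architecture of BRAINSUP: Searching the solution space • A large corpus L of parsed sentences is analyzed, and the number of occurrences of each head-relation-modifier (<h, r, m>) dependency relation is recorded (operator τ_r(h)) • Examples, with h the head and m the modifier: τ^-1_nsubj(fires), τ_amod(smoke)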

  11. Architecture of BRAINSUP: Searching the solution space • Examples of dependency operators: τ^-1_nsubj(fires), τ^-1_dobj(smoke), τ^-1_prep(in)
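The dependency operators can be sketched as lookups into a table of <h, r, m> counts. The interpretation used here is an assumption: τ_r(h) is read as "modifiers seen with head h under relation r", and τ^-1_r(m) as the inverse lookup from a modifier to its heads. The class and method names are hypothetical, and the triples are toy data rather than counts from a real treebank.

```python
from collections import Counter, defaultdict
from typing import Dict


class DependencyIndex:
    """Counts <head, relation, modifier> triples and answers tau / tau^-1 queries."""

    def __init__(self) -> None:
        self._mods = defaultdict(Counter)   # (relation, head) -> modifier counts
        self._heads = defaultdict(Counter)  # (relation, modifier) -> head counts

    def add(self, head: str, relation: str, modifier: str) -> None:
        self._mods[(relation, head)][modifier] += 1
        self._heads[(relation, modifier)][head] += 1

    def tau(self, relation: str, head: str) -> Dict[str, int]:
        """Candidate modifiers for a slot governed by `head` through `relation`."""
        return dict(self._mods.get((relation, head), {}))

    def tau_inv(self, relation: str, modifier: str) -> Dict[str, int]:
        """Candidate heads for a slot that governs `modifier` through `relation`."""
        return dict(self._heads.get((relation, modifier), {}))


index = DependencyIndex()
for h, r, m in [
    ("smoke", "amod", "thick"),
    ("smoke", "amod", "thick"),
    ("smoke", "amod", "black"),
    ("burn", "nsubj", "fires"),
]:
    index.add(h, r, m)

adjectives_for_smoke = index.tau("amod", "smoke")
verbs_for_fires = index.tau_inv("nsubj", "fires")
```

Keeping both directions indexed means a slot can be filled from whichever neighbor is already known, its head or its modifier.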

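  12. Architecture of BRAINSUP: Filler selection and solution scoring • Once the lists of candidate words are available, the next step is to choose the fillers that score highest and meet the user's requirements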

  13. Architecture of BRAINSUP: Filler selection and solution scoring • 12 feature functions: • Chromatic and emotional connotation • c is the color chosen by the user, s_i is the i-th word of the sentence • Domain relatedness • d is the domain chosen by the user, s_i is the i-th word of the sentence

  14. Architecture of BRAINSUP: Filler selection and solution scoring • Semantic cohesion • Same as domain relatedness, with the domain d replaced by the target words t • Target-words scorer • Forces the target words t to appear in the sentence

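  15. Architecture of BRAINSUP: Filler selection and solution scoring • Phonetic features (plosives, alliteration and rhyme) • plosives: the proportion of plosives in a sentence • alliteration: recorded with a trie, where c_i is the number of times node i has been traversed • rhyme: same as alliteration, but each word is reversed before being added to the trie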
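The trie idea behind the phonetic features can be sketched as follows. Two loud assumptions: plosives are approximated by the letters p, b, t, d, k, g rather than by a phonetic transcription, and the trie only exposes the per-node visit counts c_i, because the slide does not show how those counts are turned into a score. All names are hypothetical.

```python
from typing import Dict, Iterable, List

# ASSUMPTION: an orthographic stand-in for plosive sounds.
PLOSIVES = set("pbtdkg")


def plosive_ratio(words: List[str]) -> float:
    """Share of letters in the sentence that are (approximate) plosives."""
    letters = [ch for word in words for ch in word.lower() if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch in PLOSIVES for ch in letters) / len(letters)


class CountingTrie:
    """Trie whose nodes remember how many inserted words passed through them."""

    def __init__(self) -> None:
        self.count = 0
        self.children: Dict[str, "CountingTrie"] = {}

    def insert(self, word: str) -> None:
        node = self
        for ch in word:
            node = node.children.setdefault(ch, CountingTrie())
            node.count += 1

    def visits(self, prefix: str) -> int:
        """Visit count c_i of the node reached by `prefix` (0 if absent)."""
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return 0
        return node.count


def build_trie(words: Iterable[str], reverse: bool = False) -> CountingTrie:
    """Forward trie for alliteration; reversed words give a trie for rhyme."""
    trie = CountingTrie()
    for word in words:
        word = word.lower()
        trie.insert(word[::-1] if reverse else word)
    return trie


alliteration = build_trie(["the", "dark", "drink"])
rhyme = build_trie(["kiss", "miss", "lips"], reverse=True)
```

Reversing each word before insertion turns shared suffixes into shared prefixes, so the same structure serves both alliteration and rhyme.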

  16. Architecture of BRAINSUP: Filler selection and solution scoring • Variety scorer • calculated as the number of distinct words in the sentence over the size of the sentence • Unusual-words scorer • c_i is the number of times each word s_i ∈ s is observed in another corpus V
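The variety scorer follows directly from its definition. For the unusual-words scorer the slide gives only the counts c_i, not the formula, so the averaging of 1 / (1 + c_i) used below is an assumed stand-in that merely rewards rare words; it should not be read as the authors' formula.

```python
from typing import Dict, List


def variety_score(words: List[str]) -> float:
    """Number of distinct words divided by the sentence length."""
    if not words:
        return 0.0
    return len({word.lower() for word in words}) / len(words)


def unusual_words_score(words: List[str], corpus_counts: Dict[str, int]) -> float:
    """ASSUMED form: mean of 1 / (1 + c_i), with c_i the count of word i in corpus V."""
    if not words:
        return 0.0
    total = sum(1.0 / (1 + corpus_counts.get(word.lower(), 0)) for word in words)
    return total / len(words)


sentence = "it is wiser to believe in science than in everlasting love".split()
variety = variety_score(sentence)  # "in" occurs twice: 10 distinct words out of 11
rarity = unusual_words_score(["the", "kiss"], {"the": 99, "kiss": 1})
```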

  17. Architecture of BRAINSUP: Filler selection and solution scoring • N-gram likelihood • Dependency likelihood

  18. Evaluation • Five experienced annotators were asked to rate 432 creative sentences • 1) Catchiness: is the sentence attractive, catchy or memorable? [Yes/No] • 2) Humor: is the sentence witty or humorous? [Yes/No] • 3) Relatedness: is the sentence semantically related to the target domain? [Yes/No] • 4) Correctness: is the sentence grammatically correct? [Ungrammatical/Slightly disfluent/Fluent] • 5) Success: could the sentence be a good slogan for the target domain? [As it is/With minor editing/No]

  19. Evaluation • Randomly selected a subset of these slogans and for each of them generated an input specification U

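  20. Evaluation • t: 2 to 3 words chosen at random from the sentence • d: commercial domain • e: positive • c: the color strongly associated with the domain, if there is one; otherwise a randomly chosen color • 10 tuples <t, d, c, e, p> are generated and paired with 5 different feature combinations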

  21. Evaluation • Base: Target-word scorer + N-gram likelihood + Dependency likelihood + Variety scorer + Unusual-words scorer + Semantic cohesion • Base + D: Base + Domain relatedness • Base + D + C: Base + D + Chromatic connotation • Base + D + E: Base + D + Emotional connotation • Base + D + P: Base + D + Phonetic features • Up to 10 sentences are generated for each of the 50 inputs, 432 sentences in total

  22. Evaluation • Weights: set heuristically • Target-word scorer: 1.0 • Variety and Unusual-words scorers: 0.99 • Phonetic features, Chromatic/Emotional connotation and Semantic cohesion scorers: 0.98 • Domain, N-gram and Dependency likelihood scorers: 0.97 • Patterns: corpus of 16,000 proverbs • Dependency operators: British National Corpus • Only sentences of at most 20 words, all of which can be found in WordNet, are considered
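The five evaluation configurations and the heuristic weights can be written down as plain data. How the weighted features are combined is not shown on the slides; the weighted sum below is an assumption for illustration, and the feature keys are hypothetical names.

```python
from typing import Dict, List

# Heuristic weights as listed on the slide (keys are hypothetical names).
WEIGHTS: Dict[str, float] = {
    "target_words": 1.0,
    "variety": 0.99,
    "unusual_words": 0.99,
    "phonetic": 0.98,
    "chromatic": 0.98,
    "emotional": 0.98,
    "semantic_cohesion": 0.98,
    "domain": 0.97,
    "ngram": 0.97,
    "dependency": 0.97,
}

BASE: List[str] = [
    "target_words", "ngram", "dependency",
    "variety", "unusual_words", "semantic_cohesion",
]

CONFIGS: Dict[str, List[str]] = {
    "base": BASE,
    "base+D": BASE + ["domain"],
    "base+D+C": BASE + ["domain", "chromatic"],
    "base+D+E": BASE + ["domain", "emotional"],
    "base+D+P": BASE + ["domain", "phonetic"],
}


def combined_score(feature_values: Dict[str, float], active: List[str]) -> float:
    """ASSUMED combination: weighted sum over the active feature functions."""
    return sum(WEIGHTS[name] * feature_values.get(name, 0.0) for name in active)


values = {"target_words": 1.0, "domain": 0.5, "chromatic": 1.0}
score_base = combined_score(values, CONFIGS["base"])
score_domain = combined_score(values, CONFIGS["base+D"])
```

Features that are not active in a configuration are simply ignored, so the same feature values can be rescored under each of the five setups.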

  23. Evaluation - result

  24. Evaluation - result

  25. Evaluation - result • In 63 cases every dimension was labelled YES; the examples in Table 1 are drawn from these. Beyond correctness, several rhetorical devices can be observed • Metaphor: a summer sunshine • Pun: lash your drama • Personification: lips and eyes want • Use of phonetic properties • plosives: passionate kiss, perfect lips • alliteration: the dark drink • rhyme: lips and eyes want the kiss

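  26. Conclusion • The paper proposes BRAINSUP, an extensible framework in which users can define the parameters according to their own needs. • The system makes heavy use of dependency-parsed data to keep the generated sentences grammatical. • Although the generated sentences may not fully meet the user's needs, they can at least serve as inspiration.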

  27. Conclusion • It is wiser to believe in science than in everlasting love.
