Gözde Özbal, Carlo Strapparava, FBK-irst, Trento, Italy; Daniele Pighin, Google Inc.

BRAINSUP: Brainstorming Support for Creative Sentence Generation • Gözde Özbal, Carlo Strapparava, FBK-irst, Trento, Italy • Daniele Pighin, Google Inc., Zürich, Switzerland • ACL 2013

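Introduction • In the real world, creative writing is very time-consuming and labor-intensive • Advertising slogans must be punchy, catchy and memorable • Earlier work studied similar tasks, but none of it proposed a unified format • The authors propose BRAINSUP, an extensible framework in which the user controls all the parameters involved in the creative process, so that the output better fits the user's needs.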

Architecture of BRAINSUP • First, the user chooses the target words that must appear in the sentence, and can optionally also specify: • a semantic domain: sports, blankets… • an emotion: joy, anger, or negative emotions in general • a color: red, blue… • phonetic properties of the words: rhymes, alliterations and plosives • The user input is U = <t, d, c, e, p, w> • For target and domain words, the user can choose which parts of speech to consider, e.g. "drink/verb" or "drink/verb,noun".
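The user input described above could be held in a small container like the one sketched here. The class name `UserInput`, the helper `parse_tagged` and the field types are hypothetical, and reading `w` as a set of feature weights is an assumption, since the slide does not define it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# A word plus the parts of speech it may take, e.g. ("drink", ("verb", "noun")).
TaggedWord = Tuple[str, Tuple[str, ...]]


def parse_tagged(spec: str) -> TaggedWord:
    """Parse 'drink/verb,noun' into ('drink', ('verb', 'noun'))."""
    word, _, pos = spec.partition("/")
    tags = tuple(p.strip() for p in pos.split(",") if p.strip())
    return word.strip(), tags


@dataclass
class UserInput:
    """Hypothetical container for U = <t, d, c, e, p, w>."""
    t: List[TaggedWord]                                  # target words (must appear)
    d: List[TaggedWord] = field(default_factory=list)    # domain words
    c: Optional[str] = None                              # color, e.g. "red"
    e: Optional[str] = None                              # emotion, e.g. "joy"
    p: List[str] = field(default_factory=list)           # phonetic properties
    w: Dict[str, float] = field(default_factory=dict)    # assumed: feature weights


u = UserInput(
    t=[parse_tagged("drink/verb,noun")],
    d=[parse_tagged("sport/noun")],
    c="red",
    e="joy",
    p=["rhyme", "alliteration"],
)
```

Only `t` is mandatory in this sketch, mirroring the slide's statement that everything other than the target words is optional.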

Architecture of BRAINSUP • Pattern selection • Searching the solution space • Filler selection and solution scoring

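Architecture of BRAINSUP • Θ: the set of meta-parameters (minimum/maximum number of sentences to generate, maximum number of patterns to consider, maximum sentence length…) • U: the user input <t, d, c, e, p, w> • From corpus P, select the patterns that are frequent and compatible with the user's requirements • Given the user input U, use treebank L to find the best solutions that fit each pattern p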

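Architecture of BRAINSUP - Pattern selection • Morpho-syntactic patterns are extracted from a corpus P • First: choose the corpus; different corpora produce sentences in different styles • Second: parse the corpus sentences with the Stanford parser, then remove the content words to obtain the patterns, recording how many times each pattern occurs in the corpus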
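The pattern extraction step can be illustrated with a simplified sketch. It assumes the sentences are already POS-tagged (the Stanford parser step is not reproduced), it ignores dependency relations, and it treats nouns, verbs, adjectives and adverbs as content words; the function names and the toy corpus are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Token = Tuple[str, str]  # (word, Penn Treebank POS tag)

# Assumption: content words are nouns, verbs, adjectives and adverbs.
CONTENT_PREFIXES = ("NN", "VB", "JJ", "RB")


def to_pattern(sentence: List[Token]) -> Tuple[str, ...]:
    """Replace each content word by an empty slot that keeps only its POS tag."""
    pattern = []
    for word, tag in sentence:
        if tag.startswith(CONTENT_PREFIXES):
            pattern.append(f"_/{tag}")       # empty slot with a POS constraint
        else:
            pattern.append(word.lower())     # stop words stay in the pattern
    return tuple(pattern)


def count_patterns(corpus: List[List[Token]]) -> Counter:
    """Count how many corpus sentences share each pattern."""
    return Counter(to_pattern(sentence) for sentence in corpus)


# Toy corpus, invented for illustration only.
corpus = [
    [("A", "DT"), ("fire", "NN"), ("in", "IN"), ("the", "DT"), ("smoke", "NN")],
    [("A", "DT"), ("bird", "NN"), ("in", "IN"), ("the", "DT"), ("hand", "NN")],
    [("Time", "NN"), ("flies", "VBZ")],
]
patterns = count_patterns(corpus)
```

The first two toy sentences collapse into the same pattern, which is why the frequency count is meaningful for ranking patterns later.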

Architecture of BRAINSUP - Pattern selection • Can the user's target words be placed in the empty slots? • target words t = [heading/VBG, edge/NN]: no • target words t = [heading/NN, edge/NN]: yes

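Architecture of BRAINSUP - Pattern selection • The number of empty slots must be greater than the number of target words: CompatiblePatterns(·) requires slots > t. The minimum/maximum number of slots is controlled in Θ. To avoid producing the same result for identical inputs, a random component (also controlled in Θ) is added to the sorting algorithm • CompatiblePatterns(·) finally returns the patterns ordered by their number of occurrences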
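A minimal sketch of this filtering and ranking step follows. The pattern encoding (slots written as `_/TAG`), the parameter names `min_slots`, `max_slots` and `jitter`, and the multiplicative form of the random component are all assumptions; the slide only states that such controls live in Θ.

```python
import random
from typing import Dict, List, Optional, Tuple

Pattern = Tuple[str, ...]  # stop words plus slots written as "_/TAG"


def slot_tags(pattern: Pattern) -> List[str]:
    """POS tags of the empty slots in a pattern."""
    return [tok.split("/", 1)[1] for tok in pattern if tok.startswith("_/")]


def can_host(pattern: Pattern, target_tags: List[str]) -> bool:
    """True if every target word gets its own slot with a matching POS tag."""
    free = slot_tags(pattern)
    for tag in target_tags:
        if tag not in free:
            return False
        free.remove(tag)
    return True


def compatible_patterns(counts: Dict[Pattern, int], target_tags: List[str],
                        min_slots: int, max_slots: int, jitter: float = 0.0,
                        rng: Optional[random.Random] = None) -> List[Pattern]:
    """Keep compatible patterns and rank them by (optionally perturbed) frequency."""
    rng = rng or random.Random()
    scored = []
    for pattern, freq in counts.items():
        n_slots = len(slot_tags(pattern))
        if not (min_slots <= n_slots <= max_slots):
            continue
        if n_slots <= len(target_tags):      # slots must outnumber target words
            continue
        if not can_host(pattern, target_tags):
            continue
        noise = 1.0 + jitter * rng.uniform(-1.0, 1.0)   # assumed random component
        scored.append((freq * noise, pattern))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pattern for _, pattern in scored]


# Toy pattern counts, invented for illustration only.
P1 = ("the", "_/NN", "of", "the", "_/NN", "is", "_/JJ")
P2 = ("_/NN", "_/VBZ", "_/NN", "_/RB")
P3 = ("a", "_/NN", "in", "the", "_/NN")
counts = {P1: 5, P2: 3, P3: 9}
```

With `jitter=0.0` the ranking is deterministic; a positive `jitter` lets repeated runs on the same input explore different patterns.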

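Architecture of BRAINSUP - Searching the solution space • Once the patterns are selected, the next step is to choose which words can fill each empty slot, starting from the slot with the largest number of dependencies • A pattern contains only stop words, syntactic relations and morphological constraints (POS tags)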

Architecture of BRAINSUP - Searching the solution space • Analyze a large corpus L of parsed sentences and record how many times each head-relation-modifier (<h, r, m>) dependency relation occurs (operator τ_r(h)) • Examples: τ^-1_nsubj(fires), τ_amod(smoke)
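The head-relation-modifier counts can be kept in a simple index, sketched below. The sketch assumes that τ_r(h) looks up the modifiers observed under head h with relation r, and that the inverse operator looks up the heads observed over a given modifier; the class name `DependencyIndex` and the toy triples are hypothetical.

```python
from collections import Counter, defaultdict


class DependencyIndex:
    """Counts of <head, relation, modifier> triples, queryable in both directions."""

    def __init__(self):
        self._mods = defaultdict(Counter)    # (relation, head) -> modifier counts
        self._heads = defaultdict(Counter)   # (relation, modifier) -> head counts

    def add(self, head, relation, modifier):
        """Record one occurrence of the triple <head, relation, modifier>."""
        self._mods[(relation, head)][modifier] += 1
        self._heads[(relation, modifier)][head] += 1

    def tau(self, relation, head):
        """Assumed reading of tau_r(h): modifiers seen under head h with relation r."""
        return self._mods.get((relation, head), Counter())

    def tau_inv(self, relation, modifier):
        """Assumed reading of the inverse: heads seen over modifier m with relation r."""
        return self._heads.get((relation, modifier), Counter())


# Toy triples, invented for illustration only.
idx = DependencyIndex()
for h, r, m in [
    ("burn", "nsubj", "fires"),
    ("burn", "nsubj", "fires"),
    ("spread", "nsubj", "fires"),
    ("smoke", "amod", "thick"),
    ("smoke", "amod", "black"),
]:
    idx.add(h, r, m)
```

Candidate fillers for a slot can then be read off as the words returned by the lookups for every dependency that touches the slot.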

Architecture of BRAINSUP - Searching the solution space • Example lookups: τ^-1_nsubj(fires), τ^-1_dobj(smoke), τ^-1_prep(in)

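Architecture of BRAINSUP - Filler selection and solution scoring • Once the lists of candidate words are available, the next step is to choose the fillers that score highest and satisfy the user's requirements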

Architecture of BRAINSUP - Filler selection and solution scoring • 12 feature functions: • Chromatic and emotional connotation • c is the color selected by the user, s_i is the i-th word of the sentence • Domain relatedness • d is the domain selected by the user, s_i is the i-th word of the sentence

Architecture of BRAINSUP - Filler selection and solution scoring • Semantic cohesion • Same as domain relatedness, with the domain d replaced by the target words t • Target-words scorer • Forces the target words t to appear in the sentence

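Architecture of BRAINSUP - Filler selection and solution scoring • Phonetic features (plosives, alliteration and rhyme) • plosives: the proportion of plosives in a sentence • alliteration: recorded with a trie, where c_i is the number of times node i is traversed • rhyme: same as alliteration, but each word is reversed before being inserted into the trie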
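The three phonetic features can be illustrated as follows. This is a rough sketch that works on spelling rather than on a phonetic transcription (an assumption), represents each trie node by the prefix that leads to it, and only exposes the counts c_i; how the counts are turned into a final score is not stated on the slide and is not reproduced here. All function names are hypothetical.

```python
from collections import Counter

PLOSIVES = set("pbtdkg")  # assumption: approximated on letters, not phonemes


def plosive_ratio(words):
    """Share of plosive letters among all letters of the sentence."""
    letters = [ch for w in words for ch in w.lower() if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch in PLOSIVES for ch in letters) / len(letters)


def prefix_counts(words, reverse=False):
    """Counts c_i for every trie node; a node is identified by its prefix."""
    counts = Counter()
    for w in words:
        w = w.lower()
        if reverse:
            w = w[::-1]              # reversed words make suffixes shareable
        for i in range(1, len(w) + 1):
            counts[w[:i]] += 1       # one traversal of the node for this prefix
    return counts


def shared_nodes(words, reverse=False):
    """Trie nodes traversed by more than one word."""
    return {p: c for p, c in prefix_counts(words, reverse).items() if c > 1}


alliteration = shared_nodes(["dark", "drink"])              # shared word beginnings
rhyme = shared_nodes(["light", "night"], reverse=True)      # shared word endings
```

Reversing the words before insertion means the same trie code serves both alliteration (shared beginnings) and rhyme (shared endings), as the slide describes.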

Architecture of BRAINSUP - Filler selection and solution scoring • Variety scorer • calculated as the number of distinct words in the sentence divided by the length of the sentence • Unusual-words scorer • c_i is the number of times each word s_i ∈ s is observed in a separate corpus V
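The two scorers can be sketched as below. The variety scorer follows the definition on the slide. For the unusual-words scorer the slide only defines c_i, so averaging 1 / (1 + c_i) over the sentence is an assumption made for illustration, as are the function names.

```python
from collections import Counter


def variety(sentence):
    """Number of distinct words divided by the sentence length."""
    words = [w.lower() for w in sentence]
    return len(set(words)) / len(words) if words else 0.0


def unusualness(sentence, reference_counts):
    """Assumed form: mean of 1 / (1 + c_i), where c_i is the count of word i in V.

    Words that are rare in the reference corpus V push the score towards 1.
    """
    words = [w.lower() for w in sentence]
    if not words:
        return 0.0
    return sum(1.0 / (1 + reference_counts[w]) for w in words) / len(words)


# Toy reference counts standing in for corpus V, invented for illustration.
V = Counter({"the": 9})
```

Because `Counter` returns 0 for unseen words, a word absent from V contributes the maximum value of 1 to the average.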

Architecture of BRAINSUP - Filler selection and solution scoring • N-gram likelihood • Dependency likelihood

Evaluation • Five experienced annotators were asked to rate 432 creative sentences on five dimensions • 1) Catchiness: is the sentence attractive, catchy or memorable? [Yes/No] • 2) Humor: is the sentence witty or humorous? [Yes/No] • 3) Relatedness: is the sentence semantically related to the target domain? [Yes/No] • 4) Correctness: is the sentence grammatically correct? [Ungrammatical/Slightly disfluent/Fluent] • 5) Success: could the sentence be a good slogan for the target domain? [As it is/With minor editing/No]

Evaluation • Randomly selected a subset of these slogans and for each of them generated an input specification U

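Evaluation • t: 2 to 3 words chosen at random from the slogan • d: the commercial domain • e: positive • c: the color strongly associated with the domain, if there is one; otherwise a randomly chosen color • 10 tuples <t, d, c, e, p> are produced and combined with 5 different feature configurations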

Evaluation • Base: Target-words scorer + N-gram likelihood + Dependency likelihood + Variety scorer + Unusual-words scorer + Semantic cohesion • Base + D: Base + Domain relatedness • Base + D + C: Base + D + Chromatic connotation • Base + D + E: Base + D + Emotional connotation • Base + D + P: Base + D + Phonetic features • Each of the 50 inputs generates up to 10 sentences, for a total of 432 sentences

Evaluation • Weights: set heuristically • Target-words scorer: 1.0 • Variety and Unusual-words scorers: 0.99 • Phonetic features, Chromatic/Emotional connotation and Semantic cohesion scorers: 0.98 • Domain relatedness, N-gram likelihood and Dependency likelihood scorers: 0.97 • Patterns: a corpus of 16,000 proverbs • Dependency operators: British National Corpus • Only sentences of at most 20 words, all of which can be found in WordNet, are considered
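The weights listed above can be applied as in the sketch below. Combining the feature scores as a weighted sum is an assumption (the slide only says the weights were set heuristically), and the feature names, the `score` function and the candidate values are hypothetical.

```python
# Weights as listed on the slide; the key names are hypothetical.
WEIGHTS = {
    "target_words": 1.0,
    "variety": 0.99,
    "unusual_words": 0.99,
    "plosives": 0.98,
    "alliteration": 0.98,
    "rhyme": 0.98,
    "chromatic": 0.98,
    "emotional": 0.98,
    "semantic_cohesion": 0.98,
    "domain": 0.97,
    "ngram": 0.97,
    "dependency": 0.97,
}


def score(features, weights=WEIGHTS):
    """Assumed combination: weighted sum over the features that are provided."""
    return sum(weights[name] * value for name, value in features.items())


# Two made-up candidates with made-up feature values.
candidates = {
    "candidate_a": {"target_words": 1.0, "variety": 0.5, "ngram": 0.2},
    "candidate_b": {"target_words": 1.0, "variety": 1.0, "ngram": 0.1},
}
best = max(candidates, key=lambda name: score(candidates[name]))
```

A configuration such as Base + D can be emulated in this sketch by passing only the features that belong to that configuration.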

Evaluation-result

Evaluation - result • In 63 cases every dimension was rated YES; the examples in Table 1 are drawn from these. Besides grammatical correctness, several rhetorical devices can be observed • Metaphor: a summer sunshine • Pun: lash your drama • Personification: lips and eyes want • Use of phonetic properties • plosives: passionate kiss, perfect lips • alliteration: the dark drink • rhyme: lips and eyes want the kiss

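Conclusion • BRAINSUP is an extensible framework in which users define the parameters according to their own needs. • The system makes heavy use of dependency-parsed data to ensure that the generated sentences are grammatical. • The generated sentences may not fully match what the user wants, but they can at least serve as inspiration.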

Conclusion • It is wiser to believe in science than in everlasting love.
