Salience-Driven Text Planning

Salience-Driven Text Planning Christian Chiarcos & Manfred Stede {chiarcos|stede}@ling.uni-potsdam.de

Salience-Driven Text Planning • Scenario • Proposal • Previous Research • Salience & Iconicity • Salience-Driven Text Planning • Content Selection • Local Salience • Global Salience • Text Structuring • Concluding Remarks 1/26

I Scenario • PoLiBoX: PotsdamLinguisticBookExpert • Adaptive hypertext generator • User-tailored • explicit user model: preferred features and feature values • Generates rhetorically structured paragraphs, each presenting one book • Database: 25 books on Computational Linguistics • Presents books one after another • describe a book • compare a book with the previous one in the sequence • give recommendations (strongly, mildly, neutral) 2/26

I Scenario: An Example • Query: user preferences • language: German • expert level: expert audience • topics: modal logics • etc. • User can browse an interactive series of book descriptions/recommendations, e.g. "The book BOOK0 was written by AUTHOR0 and published in YEAR0. Like the previous one, it is in German and intended for an expert audience, therefore it seems to be a reasonable choice, too." (somewhat shortened) 3/26

II Proposal: Previous Research • O'Donnell et al. (2001): • adaptive hypertext, rhetorically structured, but not user-tailored • Carenini/Moore (2000): • user-tailored generation, but without rhetorical structure • Marcu (1997): • bottom-up text planning • precompiled rhetorical relations and ordering preferences
G. Carenini and J. Moore (2000). A Strategy for Generating Evaluative Arguments. Proc. of INLG-00, Mitzpe Ramon, Israel, 47-54.
M. O'Donnell, Ch. Mellish, J. Oberlander, and A. Knott (2001). ILEX: An architecture for a dynamic hypertext generation system. Natural Language Engineering 7.
D. Marcu (1997). From local to global coherence: A bottom-up approach to text planning. Proc. of AAAI-97, Providence/RI, 629-635. 4/26

II Proposal: General Overview • Bottom-up algorithm • Dynamically determined rhetorical relations • Independent of sequential order • Motivated by linguistic theory • iconicity principles • salience („importance“, „relatedness“) • User-adaptive & user-tailored • salience is sensitive to previous paragraph/book description • salience is sensitive to user model 5/26

II Proposal: Iconicity Principles • Exploit universal functional strategies of communication • Adjacency: Pieces of information that are closely related are expected to be presented together. • Importance: More important information will be presented more prominently (with „emphasis“) than less important information. • Motivated by iconicity, with empirical evidence from different structural levels of language (Givón 2001)
T. Givón (2001). Syntax, vol. II. Benjamins, Amsterdam. 6/26

II Proposal: Salience • Salience: „The salience of an entity, in intuitive terms, refers to its prominence, and is interpreted as a measure of how well an entity stands out from other entities ...“ (Pattabhiraman 1993) • Propositional salience: degree of prominence a proposition is assigned by the speaker in a given context • Two granularities of context: • preceding utterance → local salience (bipropositional) • preceding paragraph → global salience (monopropositional)
T. Pattabhiraman (1993). Aspects of Salience in Natural Language Generation. PhD thesis, Simon Fraser University. 7/26

II Proposal: Salience and Iconicity • Reconstruction of iconicity principles: • Local propositional salience ~ „relatedness“ of two discourse units • Global propositional salience ~ „importance“ of a discourse unit in the given context • Adjacency: Local salience → discourse segmentation • Importance: Global salience → nuclearity patterns 8/26

III Salience-Driven Text Planning • Scenario • Proposal • Salience-Driven Text Planning • Content Selection • Communicative Goal • Local Salience • Global Salience • Text Structuring • Concluding Remarks

III Salience-Driven Text Planning: Content Selection • Select a book depending on user preferences • Attributes to be uttered • identificational (author, title) • unusual and 'important' features (very recent, very long, ...) → a priori globally salient • attributes in which the book matches or mismatches the user model • attributes in which the book matches or mismatches the book presented before • Each 'natural' subset of these attributes (e.g. page number, length, ...) is mapped to a proposition 9/26

III Salience-Driven Text Planning: Content Selection • Three kinds of propositions: • ID • ARG • EVAL: an optional comment like „this book perfectly fits your needs“, extrapolated from the communicative goal • ARG propositions are further distinguished • UM-match/mismatch/neutral („user model match“) • PB-match/mismatch/neutral („match with previous book“) 10/26

III Salience-Driven Text Planning: Communicative Goal • communicative goal: 6 possibilities (three degrees of recommendation, each with or without comparison) • degree of recommendation (overlap of attributes with user preferences) • recommend-mildly, recommend-strongly, describe • is it the first book to be presented? • compare (or just present) • Furthermore, an additional evaluative proposition can be computed that summarizes the overall „value“ of the book (EVAL). 11/26

III Salience-Driven Text Planning: Local Salience • Formalized via thematical relations expressing the semantic type of relation between two propositions • extrapolated from argumentative types • different strengths reflecting the „prototypicality“ of a thematical relation • unrelated: no corresponding or deviating information on PB- or UM-matches available • contrastive (1,2,3): deviation in terms of PB- or UM-matching: one is PB/UM-matching, the other one is PB/UM-mismatching • similar (1,2): else 12/26

III Salience-Driven Text Planning: Local Salience • Ranking of thematical relations • expect adjacency for similar propositions rather than for contrastive propositions and unrelated ones • information-theoretic interpretation: number of shared/differing features • Represented by local salience scores σl(p1,p2) for any possible pair of propositions (p1,p2) • Slide annotations: one proposition has UM-match, the other one UM-mismatch; one proposition is UM-neutral; both propositions have UM-match or UM-mismatch 13/26
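
The three-way distinction between unrelated, contrastive and similar pairs can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the `Proposition` class, the function name, and the reading of „unrelated“ as „no dimension on which both propositions are non-neutral“ are assumptions. Strength levels (1, 2, 3) and numeric σl scores are left out because the slides do not define how they are derived. The three sample propositions use the PB/UM values of t1, t2 and t3 from the deck's example.

```python
from dataclasses import dataclass

# Hypothetical representation of an ARG proposition: each dimension
# (previous book, user model) is "match", "mismatch" or "neutral".
@dataclass(frozen=True)
class Proposition:
    pb: str  # PB status
    um: str  # UM status

def thematic_relation(p1, p2):
    """Return 'unrelated', 'contrastive' or 'similar' for a pair of propositions."""
    pairs = [(p1.pb, p2.pb), (p1.um, p2.um)]
    # Keep only the dimensions on which both propositions carry information
    # (assumed reading of "corresponding or deviating information").
    informative = [(a, b) for a, b in pairs if a != "neutral" and b != "neutral"]
    if not informative:
        return "unrelated"
    if any(a != b for a, b in informative):
        return "contrastive"  # one matches where the other mismatches
    return "similar"          # the "else" case

t1 = Proposition(pb="match", um="match")
t2 = Proposition(pb="match", um="mismatch")
t3 = Proposition(pb="neutral", um="match")
```

Under these assumptions, (t1, t2) and (t2, t3) come out as contrastive and (t1, t3) as similar, in line with the relation types shown in the example.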

III Salience-Driven Text Planning: Local Salience: An Example • t1: PB-match, UM-match • t2: PB-match, UM-mismatch • t3: PB-neutral, UM-match • Relations shown in the figure: contrastive 1 (σl: 0.6), contrastive 2 (σl: 0.4), similar 2 (σl: 0.8) 14/26

III Salience-Driven Text Planning: Global Salience • If global salience is to define nuclearity preferences, it has to be sensitive to the discourse goal and the argumentative type of a proposition • The more compatible a proposition is with the discourse goal (recommendation), the more likely it is to appear as nucleus • The higher the degree of recommendation (discourse goal), the more significant nuclearity preferences will be • Additionally, factors such as unusual attribute values or the importance of attributes should play a role 15/26

III Salience-Driven Text Planning: Global Salience • α: discourse goals distinguish three levels of recommendation • Prefer asymmetric rhetorical relations if the communicative goal is to recommend rather than to describe • δ(p): contribution of the argumentative type • Propositions can be salient only if they suggest that the book is preferred by the user (UM-match) or if they represent interesting features (PB-mismatch) • i(p): external knowledge on the importance a proposition has in the actual paragraph • The global salience score σg(p) is then computed by multiplying these factors 16/26
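
A minimal sketch of how such a score could be computed, assuming that σg(p) is simply the product α · δ(p) · i(p); the slide names the three factors and says they are multiplied, but the exact formula did not survive. The function names and all numeric values are hypothetical. The only constraint taken from the slide is that δ(p) is non-zero only for UM-match or PB-mismatch propositions.

```python
def delta(pb, um, base=1.0):
    """Contribution of the argumentative type (the value of `base` is assumed).

    Non-zero only if the proposition is a UM-match or a PB-mismatch.
    """
    return base if um == "match" or pb == "mismatch" else 0.0

def global_salience(alpha, pb, um, importance):
    """Assumed combination: sigma_g(p) = alpha * delta(p) * i(p)."""
    return alpha * delta(pb, um) * importance

# A UM-mismatching, PB-matching proposition gets sigma_g = 0,
# whatever its importance (0.9 is a made-up value).
print(global_salience(1.0, "match", "mismatch", 0.9))
```

With this reading, a proposition such as t2 in the example (PB-match, UM-mismatch) receives a score of 0 regardless of α and i(p).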

III Salience-Driven Text Planning: Global Salience: An Example • communicative goal: recommend-strongly-and-compare, α = 1 • t1 σg: 0.25 • t2 σg: 0 • t3 σg: 0.375 • local salience as before: contrastive 1 (σl: 0.6), contrastive 2 (σl: 0.4), similar 2 (σl: 0.8) 17/26

IV Text Structuring • Scenario • Proposal • Salience-Driven Text Planning • Text Structuring • A Bottom-up approach • Conjoining elementary trees • Beam search • Concluding Remarks

IV Text Structuring: A Bottom-up Approach • Building up elementary trees in a bottom-up fashion • local salience → rating of a merge • global salience → nuclearity preferences • Schema application (optional) • Rhetorical strategies: ordering preferences • Different schemata with different optimality (depending on discourse goal/specificity) • Modified beam search • Rating → local salience scores + schema rating 18/26

IV Text Structuring: Conjoining Elementary Trees • Given a pair of discourse segments t1, t2 to be conjoined into an elementary tree t1o2
• rhetorical relation holding between t1 and t2:
  t1o2 = (SUBORD t1 t2) iff σg(t1) >> σg(t2) → BACKGROUND, ANTITHESIS, ...
  t1o2 = (COORD t1 t2) iff σg(t1) ≈ σg(t2) → JOINT, CONTRAST, ...
• global salience of t1o2:
  σg(t1o2) = σg(t1) iff t1o2 = (SUBORD t1 t2)
  σg(t1o2) = 0.5 (σg(t1) + σg(t2)) iff t1o2 = (COORD t1 t2)
• argumentative type of t1o2 (→ local salience score):
  if t1o2 = (SUBORD t1 t2): inherit the argumentative type of the nucleus t1
  if t1o2 = (COORD t1 t2): intersection of the argumentative types of t1 and t2
19/26
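
The conjoining step might look roughly like this in code. It is a sketch under stated assumptions: the `Segment` class and the `margin` threshold are hypothetical (the slides do not quantify „>>“), and the more salient segment is simply treated as the nucleus. The σg values are those of the deck's example; the argumentative types are the PB/UM labels of t1, t2 and t3.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    label: str            # bracketed structure, for display only
    sigma_g: float        # global salience
    arg_types: frozenset  # argumentative types, e.g. {"UM-match"}

def conjoin(t1, t2, margin=0.2):
    """Conjoin two segments; `margin` is an assumed cut-off for '>>'."""
    hi, lo = (t1, t2) if t1.sigma_g >= t2.sigma_g else (t2, t1)
    if hi.sigma_g - lo.sigma_g > margin:
        # SUBORD: the nucleus passes on its salience and argumentative type.
        return Segment(f"(SUBORD {hi.label} {lo.label})", hi.sigma_g, hi.arg_types)
    # COORD: average the salience, intersect the argumentative types.
    return Segment(
        f"(COORD {t1.label} {t2.label})",
        0.5 * (t1.sigma_g + t2.sigma_g),
        t1.arg_types & t2.arg_types,
    )

t1 = Segment("t1", 0.25, frozenset({"PB-match", "UM-match"}))
t2 = Segment("t2", 0.0, frozenset({"PB-match", "UM-mismatch"}))
t3 = Segment("t3", 0.375, frozenset({"PB-neutral", "UM-match"}))

t13 = conjoin(t1, t3)    # similar salience, so coordinated
root = conjoin(t13, t2)  # t2 is far less salient, so subordinated
print(root.label)
```

With the assumed margin of 0.2, t1 and t3 are coordinated and t2 ends up as the satellite of the combined segment.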

IV Text Structuring: Modified Beam Search • Initial configuration is the set of propositions (elementary discourse segments) • Offspring configurations are calculated by conjoining promising (locally salient) pairs of discourse segments into more complex elementary trees • Rating of configurations depends on local salience scores of a discourse segment's constituents and the „optimality“ score of the highest-ranked compatible schema • nonmonotone rating function → keep best previous configurations 20/26

IV Text Structuring: Modified Beam Search • A configuration is a set of elementary trees • given a constant k, a list of configurations C (initially containing only the initial configuration) and the list B of best previous results (initially empty) • for every configuration c ∈ C and every promising² pair of elementary trees t1, t2 ∈ c • build t1o2 as described above • generate a successor configuration c' = (c \ {t1,t2}) ∪ {t1o2} • calculate some rating over all (promising) successor configurations and keep the k best results as candidates for the next iteration in C' • add the k best configurations from B and C' to an ordered list B' • replace C by C', B by B' • iterate until every configuration in C consists of a single tree • the optimal result is the first item in B
² A „promising“ pair is characterized by a high degree of local salience. 21/26
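
The loop above can be sketched as follows. This is a simplified illustration rather than the authors' algorithm: trees are plain nested pairs without rhetorical relations, schema ratings are left out, the local salience of two complex trees is assumed to be the weakest σl between their leaves, and „promising“ is assumed to mean the `width` most salient pairs. The rating sums, over all merged segments, the local salience of each merge. The three σl values are those of the example; the pairing of the two contrastive scores with (t1, t2) and (t2, t3) is an assumption.

```python
from itertools import combinations

def leaves(tree):
    """A leaf is a string; a complex tree is a pair (left, right)."""
    return [tree] if isinstance(tree, str) else leaves(tree[0]) + leaves(tree[1])

def local(a, b, sigma_l):
    """Assumed local salience of two trees: weakest link between their leaves."""
    return min(sigma_l[frozenset((x, y))] for x in leaves(a) for y in leaves(b))

def rate_tree(tree, sigma_l):
    """Sum of the merge ratings inside a tree; a leaf contributes nothing."""
    if isinstance(tree, str):
        return 0.0
    a, b = tree
    return local(a, b, sigma_l) + rate_tree(a, sigma_l) + rate_tree(b, sigma_l)

def rate(config, sigma_l):
    return sum(rate_tree(t, sigma_l) for t in config)

def beam_search(props, sigma_l, k=2, width=2):
    C = [frozenset(props)]  # the single initial configuration
    B = []                  # best configurations seen so far
    while any(len(c) > 1 for c in C):
        successors = set()
        for c in C:
            pairs = sorted(combinations(c, 2),
                           key=lambda p: local(p[0], p[1], sigma_l), reverse=True)
            for a, b in pairs[:width]:  # the "promising" pairs
                # Canonical order for hashing only; this is not a linearization.
                merged = tuple(sorted((a, b), key=repr))
                successors.add((c - {a, b}) | {merged})
        C = sorted(successors, key=lambda c: rate(c, sigma_l), reverse=True)[:k]
        B = sorted(set(B) | set(C), key=lambda c: rate(c, sigma_l), reverse=True)[:k]
    return B[0] if B else C[0]

sigma_l = {frozenset(("t1", "t2")): 0.6,   # assumed pairing for contrastive 1
           frozenset(("t2", "t3")): 0.4,   # assumed pairing for contrastive 2
           frozenset(("t1", "t3")): 0.8}   # similar 2
best = beam_search(["t1", "t2", "t3"], sigma_l)
```

Because this simplified rating only grows as trees are merged, B always ends up holding complete trees; with a nonmonotone rating, as on the slides, B could also retain an earlier, partially merged configuration.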

IV Text Structuring: Rating • For each discourse segment, the rating r is the minimal local salience between any two constituent trees • For a complete rhetorical tree, the rating r is the sum of its discourse segments' ratings • For a set of elementary trees • first a rhetorical tree is derived (by schema application) • the final rating is then multiplied by the rating of the best-fitting schema used to derive the rhetorical tree 22/26

IV Text Structuring: Schema Application • Schemata represent rhetorical strategies, e.g. (ID EVAL UM-match UM-mismatch) • Depend on the discourse goal • ranked with respect to their „prototypicality“ • partial order reflected by rating scores („weights“) • A schema s is applicable to a configuration c • if s provides exactly one slot for every elementary tree in c, and • every slot in s receives at least one filler from c • The highest-ranked applicable schema is the one to be chosen 23/26
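
The applicability test and the choice of the highest-ranked schema could be sketched as below. The representation is an assumption made for illustration: a schema is a tuple of slot names, each elementary tree is reduced to a single type label that names the slot it would fill, and the weights are made-up numbers.

```python
def applicable(schema, tree_types):
    """Check the two conditions from the slide under the assumed representation."""
    slots = set(schema)
    every_tree_has_a_slot = all(t in slots for t in tree_types)
    every_slot_is_filled = slots <= set(tree_types)
    return every_tree_has_a_slot and every_slot_is_filled

def best_schema(weighted_schemata, tree_types):
    """Return the applicable schema with the highest weight, or None."""
    candidates = [(w, s) for s, w in weighted_schemata.items()
                  if applicable(s, tree_types)]
    return max(candidates)[1] if candidates else None

# Hypothetical schemata and weights; only the first schema appears on the slide.
schemata = {
    ("ID", "EVAL", "UM-match", "UM-mismatch"): 1.0,
    ("ID", "UM-match", "UM-mismatch"): 0.8,
    ("ID", "UM-match"): 0.6,
}
print(best_schema(schemata, ["ID", "UM-match", "UM-match"]))
```

Here two trees share the UM-match slot, so only the shortest schema is applicable: the others would leave a slot without a filler.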

IV Text Structuring: An Example • t1 σg: 0.25 • t2 σg: 0 • t3 σg: 0.375 • contrastive 1 (σl: 0.6), contrastive 2 (σl: 0.4), similar 2 (σl: 0.8) • most promising offspring configuration is {t1o3, t2}, i.e. "Though it is intended for a novice audience (t2), BOOK1 is in German (t1) and deals with modal logics (t3), just as requested." 24/26

V Concluding Remarks • User-tailored, user-adaptive generation of rhetorically structured paragraphs describing books • Bottom-up algorithm • flexible: capable of expressing multiple/conflicting goals (not shown here due to the rather simplistic scenario) • Theoretically motivated text planning algorithm • compatible with functional accounts • of syntactic foregrounding/backgrounding (global salience) • of clause combining (local salience) • Dynamically determined rhetorical relations 25/26

V Concluding Remarks: Discussion & Outlook • Advantages: • highly flexible and user-adaptive • dynamically determined rhetorical relations • natural generation order • hierarchical structuring preceding sequential structuring • reduced complexity for linearization • Disadvantages: • domain-specific salience scores • discourse structure cannot be evaluated independently from linearization • Outlook: • salience-driven linearization • empirically motivated heuristics for interaction of local salience, thematical relations and rhetorical relations 26/26

V Concluding Remarks: A Note on Evaluation • For the book description domain, no ultimate strategy for direct evaluation has been found • methodological problems • user judgements, retellings and reading time experiments are methodologically valid only if standardized user models are used, but then it cannot be guaranteed that users do not confuse their own preferences with the user model • this problem is expected to be solved by extending the approach to other domains without explicit user models • complexity problems • to exclude artifacts of a specific sequential order, we have to test each permutation of a rhetorical tree (binary tree with n nodes: 2^n) • this problem is expected to be solved by salience-driven linearization (in preparation) 27/26