This paper explores two advances in statistical natural language generation (NLG): BAGEL, a phrase-based generation system trained entirely from domain-specific data, and the use of active learning to reduce annotation effort. BAGEL maps an input dialogue act to a set of semantic concept stacks and uses dynamic Bayesian networks to find the most likely surface realization. By applying active learning to decide which semantic inputs to annotate, the approach improves data efficiency and output quality, showing that statistical NLG is feasible without manual syntactic annotation.
Phrase-based Statistical Language Generation using Graphical Models & Active Learning
Two directions of research in statistical NLG:
• Statistics introduced within the generation process itself
• Statistics introduced at the generation-decision level
• Traditional methods rely on a handcrafted generator to define the generation decision space
• BAGEL: an NLG system that can be trained from aligned data
• BAGEL shifts the effort from model design and implementation to data annotation
• BAGEL uses a stack-based semantic representation to reduce the number of semantic concepts to be searched
Phrase-based generation from semantic stacks
• BAGEL first maps the input dialogue act to a set of stacks of semantic concepts, then aligns each stack with a word sequence (see the sketch after this list)
• Concepts used in a stack:
  • Dialogue act type, i.e., either inform (an utterance providing information about the object under discussion) or reject (the user's request cannot be met)
  • Attributes of the object
  • Values for those attributes
  • Special symbols for denoting irrelevance
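A minimal sketch of this stack representation in Python, assuming an illustrative restaurant-information domain; the concept names and values below are invented for the example, not drawn from the paper's dataset.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SemanticStack:
    """One semantic stack: dialogue act type at the bottom, then an
    attribute of the object, then (optionally) the attribute's value."""
    act: str                     # e.g. "inform" or "reject"
    attribute: Optional[str]     # e.g. "area"; None for act-only stacks
    value: Optional[str] = None  # e.g. "centre"

# A dialogue act such as inform(name=X, area=centre) would be mapped to
# an unordered set of mandatory stacks like this (values are hypothetical):
mandatory_stacks = {
    SemanticStack("inform", "name", "Seven Days"),
    SemanticStack("inform", "area", "centre"),
}
```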
Generator goal: find the most likely realization given the unordered set of mandatory semantic stacks derived from the input dialogue act
• BAGEL models contiguous words belonging to the same semantic stack as a phrase
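Stated as an equation (a paraphrase of this objective, not the paper's exact factorization; the notation is assumed here: $S_m$ for the mandatory stack set, $\mathbf{s}$ for a sequence of semantic stacks, $\mathbf{w}$ for the word sequence):

```latex
(\mathbf{s}^{*}, \mathbf{w}^{*}) \;=\; \operatorname*{arg\,max}_{\mathbf{s},\,\mathbf{w}} \; P(\mathbf{w}, \mathbf{s} \mid S_{m})
```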
Dynamic Bayesian networks for NLG
• Provide a principled framework for predicting elements in a large structured space
• Well suited to modelling linguistic variation
• BAGEL must derive the optimal sequence of semantic stacks appearing in the utterance, given the set of mandatory stacks
• Mandatory stacks already visited in the current partial sequence are marked
• Any sequence that does not include all mandatory input stacks by the final frame is discarded (see the sketch below)
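A minimal sketch of this search constraint, assuming a toy bigram transition model over stacks; the placeholder `stack_transition_logprob` stands in for BAGEL's learned DBN factors, which are richer than a bigram.

```python
from itertools import product
import math

def stack_transition_logprob(prev: str, curr: str) -> float:
    """Placeholder for a learned transition model log P(curr | prev)."""
    return math.log(0.5)  # uniform stand-in; a trained DBN factor goes here

def best_stack_sequence(mandatory: set, optional: set, length: int):
    """Exhaustively search stack sequences of a fixed length. Sequences may
    interleave optional stacks with mandatory ones, but any sequence that
    has not visited every mandatory stack by its final frame is discarded."""
    vocab = list(mandatory | optional)
    best_seq, best_logp = None, -math.inf
    for seq in product(vocab, repeat=length):
        if mandatory - set(seq):  # a mandatory stack was never visited
            continue
        logp, prev = 0.0, "<start>"
        for stack in seq:
            logp += stack_transition_logprob(prev, stack)
            prev = stack
        if logp > best_logp:
            best_seq, best_logp = seq, logp
    return best_seq, best_logp

# Example: two mandatory stacks plus an optional act-only filler stack.
seq, logp = best_stack_sequence(
    mandatory={"inform.area.centre", "inform.pricerange.cheap"},
    optional={"inform"},
    length=3,
)
```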
Certainty-based active learning
• Domain-specific data is not readily available
• Active learning makes annotation more efficient: the next semantic input to annotate is selected by the current model
• BAGEL's active learning training process (sketched below):
  • Generate an utterance for each candidate semantic input using the current model
  • Annotate the k semantic inputs yielding the lowest realization probability
  • Retrain the model with the additional k data points
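A sketch of this loop in Python; `model.train`, `model.realization_prob` (the probability of the model's best utterance for an input), and the `annotate` callback are assumed interfaces for illustration, not BAGEL's actual API.

```python
def active_learning_loop(model, labeled, unlabeled, annotate, k, rounds):
    """Certainty-based active learning: repeatedly query the k semantic
    inputs whose generated utterance has the lowest realization probability
    under the current model, then retrain on the enlarged dataset."""
    for _ in range(rounds):
        model.train(labeled)
        # Least certain inputs first: lowest realization probability.
        scored = sorted(unlabeled, key=model.realization_prob)
        queries, unlabeled = scored[:k], scored[k:]
        labeled = labeled + [annotate(x) for x in queries]
    model.train(labeled)  # final retraining on all annotated data
    return model
```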
Score evaluation
• Results are averaged over a 10-fold cross-validation on unseen data
• Performance improves with:
  • Adding a dependency on future stacks
  • Backing off to partial stacks (with sparse data only)
Score evaluation
• Growing the training set one utterance at a time via active learning significantly outperforms random sampling (k = 1)
• Increasing the number of utterances queried at a time to 10 results in a smaller performance gain
Subjective evaluation
• Subjects are asked to rate the informativeness and naturalness of the generated utterances
• The utterances are the same as those used for cross-validation
Discussion
• BAGEL uses domain-specific data in the generation process without any syntactic annotation
• No manual work is involved beyond the semantic annotation
• Future work:
  • Committee-based active learning
  • Can syntactic information improve performance in more complex domains?