This paper explores two advances in statistical natural language generation (NLG): BAGEL, a phrase-based generation system trained entirely from domain-specific data, and the use of active learning to reduce annotation effort. BAGEL maps an input dialogue act to a set of semantic concept stacks and uses dynamic Bayesian networks to find the most likely surface realization. By applying active learning to decide which semantic inputs to annotate, the approach improves data efficiency and output quality, showing that statistical NLG is feasible without manual syntactic annotation.
Phrase-based Statistical Language Generation using Graphical Models & Active Learning
Two directions of research in statistical NLG:
• Statistics introduced within the generation process itself
• Statistics introduced at the generation-decision level
• Traditional methods rely on a handcrafted generator to define the generation decision space
• BAGEL: an NLG system that can be trained from aligned data
• BAGEL shifts the effort from model design and implementation to data annotation
• BAGEL uses a stack-based semantic representation to reduce the number of semantic concepts to be searched
Phrase-based generation from semantic stacks
• BAGEL first maps the input dialogue act to a set of stacks of semantic concepts, then aligns each stack with a word sequence (see the sketch after this list)
• Concepts used in a stack:
  • Dialogue act type, i.e., either inform (an utterance providing information about the object under discussion) or reject (the user's request cannot be met)
  • Attributes of the object
  • Values for those attributes
  • Special symbols for denoting irrelevance
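A minimal sketch of this stack representation in Python, assuming an illustrative restaurant-information domain; the concept names and values below are invented for the example, not drawn from the paper's dataset.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SemanticStack:
    """One semantic stack: dialogue act type at the bottom, then an
    attribute of the object, then (optionally) the attribute's value."""
    act: str                     # e.g. "inform" or "reject"
    attribute: Optional[str]     # e.g. "area"; None for act-only stacks
    value: Optional[str] = None  # e.g. "centre"

# A dialogue act such as inform(name=X, area=centre) would be mapped to
# an unordered set of mandatory stacks like this (values are hypothetical):
mandatory_stacks = {
    SemanticStack("inform", "name", "Seven Days"),
    SemanticStack("inform", "area", "centre"),
}
```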
Generator goal: find the most likely realization given the unordered set of mandatory semantic stacks derived from the input dialogue act
• BAGEL models contiguous words belonging to the same semantic stack as a phrase
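Stated as an equation (a paraphrase of this objective, not the paper's exact factorization; the notation is assumed here: $S_m$ for the mandatory stack set, $\mathbf{s}$ for a sequence of semantic stacks, $\mathbf{w}$ for the word sequence):

```latex
(\mathbf{s}^{*}, \mathbf{w}^{*}) \;=\; \operatorname*{arg\,max}_{\mathbf{s},\,\mathbf{w}} \; P(\mathbf{w}, \mathbf{s} \mid S_{m})
```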
Dynamic Bayesian networks for NLG
• Provide a principled framework for predicting elements in a large structured space
• Well suited to modelling linguistic variation
• BAGEL must derive the optimal sequence of semantic stacks appearing in the utterance, given the set of mandatory stacks
• Mandatory stacks already visited in the current partial sequence are marked
• Any sequence that does not include all mandatory input stacks by the final frame is discarded (see the sketch below)
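A minimal sketch of this search constraint, assuming a toy bigram transition model over stacks; the placeholder `stack_transition_logprob` stands in for BAGEL's learned DBN factors, which are richer than a bigram.

```python
from itertools import product
import math

def stack_transition_logprob(prev: str, curr: str) -> float:
    """Placeholder for a learned transition model log P(curr | prev)."""
    return math.log(0.5)  # uniform stand-in; a trained DBN factor goes here

def best_stack_sequence(mandatory: set, optional: set, length: int):
    """Exhaustively search stack sequences of a fixed length. Sequences may
    interleave optional stacks with mandatory ones, but any sequence that
    has not visited every mandatory stack by its final frame is discarded."""
    vocab = list(mandatory | optional)
    best_seq, best_logp = None, -math.inf
    for seq in product(vocab, repeat=length):
        if mandatory - set(seq):  # a mandatory stack was never visited
            continue
        logp, prev = 0.0, "<start>"
        for stack in seq:
            logp += stack_transition_logprob(prev, stack)
            prev = stack
        if logp > best_logp:
            best_seq, best_logp = seq, logp
    return best_seq, best_logp

# Example: two mandatory stacks plus an optional act-only filler stack.
seq, logp = best_stack_sequence(
    mandatory={"inform.area.centre", "inform.pricerange.cheap"},
    optional={"inform"},
    length=3,
)
```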
Certainty-based active learning
• Domain-specific data is not readily available
• Active learning makes annotation more efficient: the next semantic input to annotate is selected by the current model
• BAGEL's active learning training process (sketched below):
  • Generate an utterance for each candidate semantic input using the current model
  • Annotate the k semantic inputs yielding the lowest realization probability
  • Retrain the model with the additional k data points
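A sketch of this loop in Python; `model.train`, `model.realization_prob` (the probability of the model's best utterance for an input), and the `annotate` callback are assumed interfaces for illustration, not BAGEL's actual API.

```python
def active_learning_loop(model, labeled, unlabeled, annotate, k, rounds):
    """Certainty-based active learning: repeatedly query the k semantic
    inputs whose generated utterance has the lowest realization probability
    under the current model, then retrain on the enlarged dataset."""
    for _ in range(rounds):
        model.train(labeled)
        # Least certain inputs first: lowest realization probability.
        scored = sorted(unlabeled, key=model.realization_prob)
        queries, unlabeled = scored[:k], scored[k:]
        labeled = labeled + [annotate(x) for x in queries]
    model.train(labeled)  # final retraining on all annotated data
    return model
```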
Score evaluation
• Results are averaged over a 10-fold cross-validation on unseen data
• Performance improves with:
  • Adding a dependency on future stacks
  • Backing off to partial stacks (with sparse data only)
Score evaluation
• Growing the training set one utterance at a time via active learning significantly outperforms random sampling (k = 1)
• Increasing the number of utterances queried at a time to 10 results in a smaller performance gain
Subjective evaluation
• Subjects are asked to rate the informativeness and naturalness of the generated utterances
• The utterances are the same as those used for cross-validation
Discussion
• BAGEL uses domain-specific data in the generation process without any syntactic annotation
• No manual work is involved beyond the semantic annotation
• Future work:
  • Committee-based active learning
  • Can syntactic information improve performance in more complex domains?