ISCA Archive ICSLP 1992

Language modelling for recognition and understanding using layered bigrams

Stephanie Seneff, Helen Meng, Victor Zue

This paper describes a new implementation of the TINA parser, which allows it both to run more efficiently and to provide greater constraint for the recognizer than the original implementation. It also supports an integrated search that produces both full-parse and robust-parse theories using a best-first search order. In TINA, probability assignments are applied to sibling-sibling transitions, conditioned on the parent category. However, the left sibling is simply the context-insensitive [start] category whenever the right sibling is the first child. The new implementation restructures the nodes so that they occur in layers, thus allowing these first children to condition their probabilities on a left sibling, even when that left sibling has a different parent. The result is about a 10% improvement in perplexity. The parser was also redesigned to support a great deal more structure sharing. The main change is to defer unifications until a complete theory is obtained, so that nodes do not have to be cloned for branching theories. We trained the parser on some 10,000 sentences from the ATIS domain, and achieved a test-set perplexity comparable to that obtained from a word trigram language model. The parse tree also provides a meaning representation.
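The layered conditioning described above can be illustrated with a small sketch. The code below is not the authors' implementation; it is a minimal, hypothetical illustration (the tree representation, function name, and the `[start]` fallback handling are all assumptions) of how sibling-sibling transition counts might be gathered layer by layer, so that the first child of a parent conditions on the node to its left in the same layer rather than on a generic start symbol.

```python
from collections import Counter, defaultdict

def count_layered_transitions(parse_trees):
    """Count sibling-sibling transitions, i.e. P(right | left, parent), layer by layer.

    `parse_trees` is a hypothetical input format: each tree is a nested tuple
    (category, children), where children is a list of subtrees or word strings.
    Within a layer, a node's left context is the node that precedes it in that
    layer, even when that node has a different parent; only the first node of
    a layer falls back to the generic [start] category.
    """
    counts = defaultdict(Counter)

    for tree in parse_trees:
        layer = [(tree, None)]        # list of (node, parent category); root has no parent
        while layer:
            left = "[start]"          # only the layer's first node uses the generic start symbol
            next_layer = []
            for node, parent in layer:
                cat, children = node
                counts[(left, parent)][cat] += 1
                left = cat            # the left context carries across parent boundaries
                next_layer.extend((child, cat) for child in children
                                  if isinstance(child, tuple))
            layer = next_layer

    # Normalize counts into conditional probability estimates.
    return {
        context: {cat: n / sum(dist.values()) for cat, n in dist.items()}
        for context, dist in counts.items()
    }
```

Under this sketch, the original (non-layered) scheme would correspond to resetting `left` to `[start]` at every parent boundary; keeping the running left context across parents is what supplies the extra conditioning information credited with the perplexity reduction.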