ISCA Archive Eurospeech 1999

Nparse - a shallow n-gram-based grammatical-phrase parser

Alice Carlberger

Nparse is a shallow probabilistic unification-based parser for resorting N-best lists and finding simple grammatical phrases. It is data-driven and robust, allowing both domain-specific and unrestricted-language training. We believe it can be an interesting alternative for use in a synthesis or recognition front end. The parser has been trained for Swedish on a fine-grained set of grammatical-phrase nodes and grammatical features and evaluated on three language domains. A tree bank database has been built and a detailed linguistic assessment performed. Later, these results will be compared with an evaluation on a simplified node-and-feature system. Our aim is to find the optimal system complexity for accurately establishing phrase boundaries and phrase types in newspaper text and, ultimately, in unrestricted language. For this, a combination of iterative manual training and unsupervised training will be used.