Prosody in speech is manifested by variations of loudness, exaggeration
of pitch, and specific phonetic variations of prosodic segments. For
example, stressed and unstressed syllables differ in place or manner
of articulation; vowels in unstressed syllables may have a more central
articulation, and vowel reduction may occur when a vowel moves from a
stressed to an unstressed position.
In this paper, we characterize the sound patterns using phonological
posteriors to capture the phonetic variations in a concise manner.
The phonological posteriors quantify the posterior probabilities of
the phonological classes given the input speech acoustics, and they
are estimated with a deep neural network (DNN).
Built on the assumption that there are unique sound patterns in different
prosodic segments, we devise a sound pattern matching (SPM) method
based on a 1-nearest-neighbour classifier. In this work, we focus on
the automatic detection of words that carry prosodic stress, also called
emphasized words. We evaluate the SPM method on English and French
data with emphasized words. Word emphasis detection also works very
well in cross-lingual tests, that is, when a French classifier is applied
to English data and vice versa.
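
As an illustration of the matching step only, the following is a minimal sketch of 1-nearest-neighbour classification over phonological posteriors. It assumes that each word segment is available as a (frames x classes) matrix of posteriors, that a segment is summarised by its per-class mean, and that patterns are compared by Euclidean distance. These choices, the toy values, and the names `segment_pattern` and `spm_predict` are hypothetical and are not taken from the method described above.

```python
import numpy as np


def segment_pattern(posteriors):
    """Summarise a (frames x classes) posterior matrix as one vector.

    Assumption: the per-class mean over frames is used as the sound pattern.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    return posteriors.mean(axis=0)


def spm_predict(train_patterns, train_labels, query_pattern):
    """Return the label of the training pattern nearest to the query (1-NN).

    Assumption: Euclidean distance is used to compare patterns.
    """
    train_patterns = np.asarray(train_patterns, dtype=float)
    distances = np.linalg.norm(train_patterns - query_pattern, axis=1)
    return train_labels[int(np.argmin(distances))]


# Toy posteriors for three hypothetical phonological classes, two frames each.
emphasized = segment_pattern([[0.9, 0.1, 0.8], [0.8, 0.2, 0.9]])
neutral = segment_pattern([[0.2, 0.7, 0.1], [0.3, 0.6, 0.2]])

train_patterns = [emphasized, neutral]
train_labels = ["emphasized", "neutral"]

# An unseen word segment is assigned the label of its closest stored pattern.
query = segment_pattern([[0.85, 0.15, 0.7], [0.75, 0.25, 0.8]])
predicted = spm_predict(train_patterns, train_labels, query)
print(predicted)
```

In this sketch a cross-lingual test would amount to building `train_patterns` from one language and drawing `query` from the other, provided both use the same set of phonological classes.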