doi: 10.21437/Eurospeech.1997
ISSN: 1018-4074
Is syntactic structure prosodically recoverable?
Mario Rossi
Conversational interfaces: advances and challenges
Victor W. Zue
Prosodic modelling in text-to-speech synthesis
Jan P. H. van Santen
Impact of the unknown communication channel on automatic speech recognition: a review
Jean-Claude Junqua
Statistical techniques for robust ASR: review and perspectives
Jerome Bellegarda
Using missing feature theory to actively select features for robust speech recognition with interruptions, filtering and noise
Richard Lippmann, Beth A. Carlson
Using multiple time scales in a multi-stream speech recognition system
Stéphane Dupont, Hervé Bourlard
Speech recognition using HMM-state confusion characteristics
Yumi Wakita, Harald Singer, Yoshinori Sagisaka
Bottom-up and top-down state clustering for robust acoustic modeling
Cristina Chesta, Pietro Laface, Franco Ravera
Comparison of optimization methods for discriminative training criteria
Ralf Schlüter, W. Macherey, S. Kanthak, Hermann Ney, Lutz Welling
Clustering beyond phoneme contexts for speech recognition
Clark Z. Lee, Douglas O'Shaughnessy
Influence of outliers in training the parametric trajectory models for speech recognition
Rathinavelu Chengalvarayan
Incorporating linguistic knowledge and automatic baseform generation in acoustic subword unit based speech recognition
Trym Holter, Torbjorn Svendsen
Modelling and decoding of crossword context dependent phones in the Philips large vocabulary continuous speech recognition system
Peter Beyerlein, Meinhard Ullrich, Patricia Wilcox
Modelling inter-frame dependence with preceding and succeeding frames
Philip Hanna, Ji Ming, Peter O'Boyle, F. Jack Smith
Continuous speech recognition using syllables
Rhys James Jones, Simon Downey, John S. Mason
A new approach to generalized mixture tying for continuous HMM-based speech recognition
Daniel Willett, Gerhard Rigoll
State tying for context dependent phoneme models
Klaus Beulen, Elmar Bransch, Hermann Ney
A novel node splitting criterion in decision tree construction for semi-continuous HMMs
Jacques Duchateau, Kris Demuynck, Dirk Van Compernolle
Creating unseen triphones by phone concatenation in the spectral, cepstral and formant domains
Mats Blomberg
Creating large subword units for speech recognition
Thilo Pfau, Manfred Beham, W. Reichl, Günther Ruske
Segmental modeling using a continuous mixture of non-parametric models
Jacob Goldberger, David Burshtein, Horacio Franco
Segmentation and modeling in segment-based recognition
Jane W. Chang, James R. Glass
Using syllables in a hybrid HMM-ANN recognition system
Alfred Hauenstein
Noise robust segment-based word recognition using vector quantisation
Ramalingam Hariharan, Juha Hakkinen, Kari Laurila, Janne Suontausta
Viterbi based splitting of phoneme HMMs
Luis Javier Rodriguez, Ines M. Torres
The demiphone: an efficient subword unit for continuous speech recognition
José B. Marino, Albin Nogueiras, Antonio Bonafonte
Organizing phone models based on piecewise linear segment lattices of speech samples
Hiroaki Kojima, Kazuyo Tanaka
Automatic architecture design by likelihood-based context clustering with crossvalidation
Ivica Rogina
Towards articulatory speech recognition: learning smooth maps to recover articulator information
Sam Roweis, Abeer Alwan
Selection of the most effective set of subword units for an HMM-based speech recognition system
Anastasios Tsopanoglou, Nikos Fakotakis
Multi-band continuous speech recognition
Christophe Cerisara, Jean-Paul Haton, Jean-Francois Mari, Dominique Fohr
The design of acoustic parameters for speaker-independent speech recognition
Nabil N. Bitar, Carol Y. Espy-Wilson
Adaptation of natural articulatory movements to the control of the command parameters of a production model
Laurence Candille, Henri Méloni
Three-dimensional coarticulatory strategies of tongue movement
Maureen Stone, Andrew Lundberg, Edward Davis, Rao Gullapalli, Moriel NessAiver
From laryngographic and acoustic signals to voicing gestures
Nathalie Parlangeau, Régine André-Obrecht
Ultrasonographic measurement of cricothyroid space in speech
Erkki Vilkman, Raija Takalo, Taisto Maatta, Anne-Maria Laukkanen, Jaana Nummenranta, Tero Lipponen
Coarticulation and articulatory compensations studied by dynamic MRI
Didier Demolin, M. George, V. Lecuit, T. Metens, A. Soquet, H. Raeymaekers
Determining tongue articulation: from discrete fleshpoints to continuous shadow
Pierre Badin, Enrico Baricchi, Anne Vilain
Predicting, diagnosing and improving automatic language identification performance
Marc A. Zissman
Language identification with language-independent acoustic models
Cristobal Corredor-Ardoy, Jean Luc Gauvain, Martine Adda-Decker, Lori Lamel
Bayesian methods for language verification
Eluned S. Parris, Harvey Lloyd-Thomas, Michael J. Carey, Jerry H. Wright
Use of recurrent network for unknown language rejection in language identification system
HingKeung Kwan, Keikichi Hirose
Language-identification based on cross-language acoustic models and optimised information combination
Ove Andersen, Paul Dalsgaard
Phonetic-context mapping in language identification
Jiri Navratil, Werner Zühlke
Discriminative feature and model design for automatic speech recognition
Mazin Rahim, Yoshua Bengio, Yann LeCun
Large vocabulary speech recognition with context dependent MMI-connectionist / HMM systems using the WSJ database
Jörg Rottland, Christoph Neukirchen, Daniel Willett, Gerhard Rigoll
Automatic selection of segmental acoustic parameters by means of neural-fuzzy networks for reordering the n-best HMM hypotheses
Thierry Moudenc, Guy Mercier
Comparison results for segmental training algorithms for mixture density HMMs
Mikko Kurimo
A connectionist approach to machine translation
Asuncion Castano, Francisco Casacuberta
Continuous speech recognition using a context sensitive ANN and HMM2s
Nicolas Pican, Jean-Francois Mari, Dominique Fohr
Acoustic modeling based on the MDL principle for speech recognition
Koichi Shinoda, Takao Watanabe
Discriminative utterance verification using multiple confidence measures
Piyush Modi, Mazin Rahim
Subspace distribution clustering for continuous observation density hidden Markov models
Enrico Bocchieri, Brian Mak
A comparative study of methods for phonetic decision-tree state clustering
H. J. Nock, M. J. F. Gales, Steve J. Young
Comparing Gaussian and polynomial classification in SCHMM-based recognition systems
Alfred Kaltenmeier, Jürgen Franke
Maximum likelihood successive state splitting algorithm for tied-mixture HMNET
Alexandre Girardi, Harald Singer, Kiyohiro Shikano, Satoshi Nakamura
String-level MCE for continuous phoneme recognition
Erik McDermott, Shigeru Katagiri
HMM state clustering across allophone class boundaries
Ze'ev Rivlin, Ananth Sankar, Harry Bratt
Weighted determinization and minimization for large vocabulary speech recognition
Mehryar Mohri, Michael Riley
Parallel speech recognition
Steven Phillips, Anne Rogers
Fast likelihood computation methods for continuous mixture densities in large vocabulary speech recognition
Stefan Ortmanns, Thorsten Firzlaff, Hermann Ney
A static lexicon network representation for cross-word context dependent phones
Kris Demuynck, Jacques Duchateau, Dirk Van Compernolle
Decision-tree based quantization of the feature space of a speech recognizer
Mukund Padmanabhan, L. R. Bahl, D. Nahamoo, Pieter de Souza
Sub-vector clustering to improve memory and speed performance of acoustic likelihood computation
Mosur Ravishankar, R. Bisiani, E. Thayer
The incorporation of path merging in a dynamic network recogniser
Simon Hovell
Improvement on connected digits recognition using duration constraints in the asynchronous decoding scheme
Miroslav Novak
Explicit word error minimization in n-best list rescoring
Andreas Stolcke, Yochai Konig, Mitchel Weintraub
Efficient 2-pass n-best decoder
Long Nguyen, Richard Schwartz
A memory management method for a large word network
Tomohiro Iwasaki, Yoshiharu Abe
Persistence of prosodic features between dialectal and standard Italian utterances in six sub-varieties of a region of southern Italy (Salento): first assessments of the results of a recognition test and an instrumental analysis
Antonio Romano
Improving the phonetic annotation by means of prosodic phrasing
Halewijn Vereecken, Annemie Vorstermans, Jean-Pierre Martens, Bert van Coile
A descriptive study of prosodic phenomena in Mpur (West Papuan Phylum)
Cecilia Ode
Automated quantitative analysis of F0 contours of utterances from a German ToBI-labeled speech database
Hansjörg Mixdorff, Hiroya Fujisaki
Identification and automatic generation of prosodic contours for a text-to-speech synthesis system in French
Stéphanie de Tournemire
Quantitative analysis and formulation of tone concatenation in Chinese F0 contours
Jin-Fu Ni, Ren-Hua Wang, Keikichi Hirose
An environment for the labelling and testing of melodic aspects of speech
Christel Brindöpke, Arno Pahde, Franz Kummert, Gerhard Sagerer
PROPAUSE: a syntactico-prosodic system designed to assign pauses
David Casacuberta, Lourdes Aguilar, Rafael Marin
Integrated dialog act segmentation and classification using prosodic features and language models
Volker Warnke, Ralf Kompe, Heinrich Niemann, Elmar Nöth
Evaluation of prosodic characteristics in retold stories in Dutch by means of semantic scales
Monique E. van Donzel, Florien J. Koopmans-van Beinum
Text-to-intonation in spontaneous Swedish
Gosta Bruce, Marcus Filipsson, Johan Frid, Björn Granström, Kjell Gustafson, Merle Horne, David House
Synthesising attitudes with global rhythmic and intonation contours
Yann Morlec, Gérard Bailly, Véronique Aubergé
Prosody-particle pairs as discourse control signs
Dafydd Gibbon, Claudia Sassen
Focus detection with additional information of phrase boundaries and sentence mode
Anja Elsner
The role of prosody in infants' native-language discrimination abilities: the case of two phonologically close languages
Laura Bosch, Nuria Sebastian-Galles
Prosodic cycles and interpersonal synchrony in American English and Swedish
Eugene H. Buder, Anders Eriksson
Relating prosody to syntax: boundary signalling in Swedish
Eva Strangert
On representation of fundamental frequency of speech for prosody analysis using reliability function
Mitsuru Nakai, Hiroshi Shimodaira
Efficient method of establishing words tone dictionary for Korean TTS system
Seong-Hwan Kim, Jin-Young Kim
Perception of questions and statements in Neapolitan Italian
Mariapaola D'Imperio, David House
Key-phrase spotting using an integrated language model of n-grams and finite-state grammar
Qiguang Lin, Dave Lubensky, Michael Picheny, P. Srinivasa Rao
Efficient methods for detecting keywords in continuous speech
Jochen Junkawitsch, Günther Ruske, Harald Höge
Providing sublexical constraints for word spotting within the ANGIE framework
Raymond Lau, Stephanie Seneff
Usefulness of phonetic parameters in a rejection procedure of an HMM-based speech recognition system
Katarina Bartkova, Denis Jouvet
Keyword spotting using F0 contour matching
Yoichi Yamashita, Riichiro Mizoguchi
A frame and segment based approach for topic spotting
Elmar Nöth, Stefan Harbeck, Heinrich Niemann, Volker Warnke
Cyclic autocorrelation-based linear prediction analysis of speech
Kuldip K. Paliwal, Yoshinori Sagisaka
Novel filler acoustic models for connected digit recognition
Ilija Zeljkovic, Shrikanth Narayanan
A non-iterative model-adaptive e-CMN/PMC approach for speech recognition in car environments
Makoto Shozakai, Satoshi Nakamura, Kiyohiro Shikano
Discriminative feature extraction for speech recognition in noise
Angel de la Torre, Antonio M. Peinado, Antonio J. Rubio, Pedro Garcia
Noise robust recognition using feature selective modeling
Michael K. Brendborg, Borge Lindberg
Mixture input transformations for adaptation of hybrid connectionist speech recognizers
Victor Abrash
Adaptation of time differentiated cepstrum for noisy speech recognition
Tai-Hwei Hwang, Lee-Min Lee, Hsiao-Chuan Wang
On the importance of various modulation frequencies for speech recognition
Noboru Kanedera, Takayuki Arai, Hynek Hermansky, Misha Pavel
A robust RNN-based pre-classification for noisy Mandarin speech recognition
Wei-Tyng Hong, Sin-Horng Chen
A parallel environment model (PEM) for speech recognition and adaptation
Mazin Rahim
Adaptive model combination for robust speech recognition in car environments
Volker Schless, Fritz Class
A comparative study of speech detection methods
Stefaan Van Gerven, Fei Xie
Voice activity detection using source separation techniques
Nikos Doukas, Patrick Naylor, Tania Stathaki
Voice activity detection using source separation techniques
Tomohiko Taniguchi, Shoji Kajita, Kazuya Takeda, Fumitada Itakura
Multiresolution channel normalization for ASR in reverberant environments
Carlos Avendano, Sangita Tibrewala, Hynek Hermansky
A speech pre-processing technique for end-point detection in highly non-stationary environments
Rafael Martinez, Agustin Alvarez, Pedro Gomez Vilda, Mercedes Perez, Victor Nieto, Victoria Rodellar
Application of several channel and noise compensation techniques for robust speaker recognition
Laura Docio-Fernandez, Carmen Garcia-Mateo
Knowing the wheat from the weeds in noisy speech
Hany Agaiby, Thomas J. Moir
Model-based approach for robust speech recognition in noisy environments with multiple noise sources
Do Yeong Kim, Nam Soo Kim, Chong Kwan Un
Normalization of speaker variability by spectrum warping for robust speech recognition
Y.C. Chu, Charlie Jie, Vincent Tung, Ben Lin, Richard Lee
LPC poles tracker for music/speech/noise segmentation and music cancellation
Stephane H. Maes
Comparative evaluations of several front-ends for robust speech recognition
Doh-Suk Kim, Jae-Hoon Jeong, Soo-Young Lee, Rhee M. Kil
Speaker normalization through formant-based warping of the frequency scale
Evandro B. Gouvea, Richard M. Stern
The use of cepstral means in conversational speech recognition
Martin Westphal
Compensation for environmental and speaker variability by normalization of pole locations
Juan M. Huerta, Richard M. Stern
Cellular phone speech recognition: noise compensation vs. robust architectures
Jean-Baptiste Puel, Régine André-Obrecht
Speech recognition in noise using on-line HMM adaptation
Tung-Hui Chiang
Metrical representations of demarcation and constituency in noun phrases
Christos Malliopoulos, George Mikros
A system of stylized intonation contours in German
Hannes Pirker, Kai Alter, Erhard Rank, John Matiasek, Harald Trost, Gernot Kubin
A method of representing fundamental frequency contours of Japanese using statistical models of moraic transition
Keikichi Hirose, Kouji Iwano
Modeling arbitrarily long sentence-spanning F0 contours by parametric concatenation of word-spanning patterns
Evita F. Fotinea, Michael A. Vlahakis, George V. Carayannis
Strong interaction between factors influencing consonant duration
Rob J. J. H. van Son, Jan P. H. van Santen
Speech timing in Slovenian TTS
Jerneja Gros, Nikola Pavesic, France Mihelic
Small microphone arrays with optimized directivity for speech enhancement
Matthias Dorbecker
Microphone array design measures for hands-free speech recognition
Masaaki Inoue, Satoshi Nakamura, Takeshi Yamada, Kiyohiro Shikano
Noise reduction by paired microphones
Masato Akagi, Mitsunori Mizumachi
A microphone array for speech enhancement using multiresolution wavelet transform
Djamila Mahmoudi
A two-channel adaptive microphone array with target tracking
Yoshifumi Nagata, Hiroyuki Tsuboi
Use of different microphone array configurations for hands-free speech recognition in noisy and reverberant environment
Diego Giuliani, Marco Matassoni, Maurizio Omologo, Piergiorgio Svaizer
YINHE: a Mandarin Chinese version of the GALAXY system
Chao Wang, James Glass, Helen Meng, Joe Polifroni, Stephanie Seneff, Victor W. Zue
Multilingual speech recognition for flexible vocabularies
Patrizia Bonaventura, Filippo Gallocchio, Giorgio Micca
A study of multilingual speech recognition
Fuliang Weng, Harry Bratt, Leonardo Neumeyer, Andreas Stolcke
Multilingual speech recognition: the 1996 BYBLOS CallHome system
Jayadev Billa, Kristine Ma, John W. McDonough, George Zavaliagkos, David R. Miller, Kenneth N. Ross, Amro El-Jaroudi
Japanese LVCSR on the spontaneous scheduling task with JANUS-3
Tanja Schultz, Detlef Koll, Alex Waibel
Fast bootstrapping of LVCSR systems with multilingual phoneme sets
Tanja Schultz, Alex Waibel
Factors of variation in the production of the German dorsal fricative
Bernd Pompino-Marschall, Christine Mooshammer
EPG and aerodynamic evidence for the coproduction and coarticulation of clicks in Isizulu
Kimberly Thomas
Formant trajectory dynamics in Swabian diphthongs
Anja Geumann
The gestural organization of vowels and consonants: a cinefluorographic study of articulator gestures in Greenlandic
Sidney A. J. Wood
The perception of coronals in Western Arrernte
Victoria B. Anderson
Acoustic modelling of American English /r/
Carol Y. Espy-Wilson, Shrikanth Narayanan, Suzanne E. Boyce, Abeer Alwan
Acoustic parameters optimised for recognition of phonetic features
Anya Varnich Hansen
Heterogeneous acoustic measurements for phonetic classification
Andrew K. Halberstadt, James R. Glass
Cepstral-time matrices and LDA for improved connected digit and sub-word recognition accuracy
Ben Milner
Data-driven design of RASTA-like filters
Sarel van Vuuren, Hynek Hermansky
Evaluating feature set performance using the f-ratio and j-measures
Simon Nicholson, Ben Milner, Stephen Cox
Robust speech parameters located in the frequency domain
Javier Hernando, Climent Nadeu
A modified zero-crossing method for pitch detection in presence of interfering sources
Francois Gaillard, Frederic Berthommier, Gang Feng, Jean-Luc Schwartz
Using simulated annealing expectation maximization algorithm for hidden Markov model parameters estimation
Jacques Simonin, Chafic Mokbel
Covariation of subglottal pressure, F0 and glottal parameters
Gunnar Fant, Stellan Hertegard, Anita Kruckenberg, Johan Liljencrants
The fractal behaviour of unvoiced plosives: a means for classification
Anastasios Delopoulos, Maria Rangoussi
A method for analysis of the local speech rate using an inventory of reference units
Sumio Ohno, Hiroya Fujisaki, Hideyuki Taguchi
Analysis and modeling of fundamental frequency contours of Greek utterances
Hiroya Fujisaki, Sumio Ohno, Takashi Yagi
Characteristics of slow, average and fast speech and their effects in large vocabulary continuous speech recognition
Fernando Martinez, Daniel Tapias, Jorge Alvarez, Paloma Leon
Analysis of children's speech: duration, pitch and formants
Sungbok Lee, Alexandros Potamianos, Shrikanth Narayanan
A method of measuring formant frequencies at high fundamental frequencies
Hartmut Traunmüller, Anders Eriksson
Analysis of speaking rate variations in stress-timed languages
Tom Brondsted, Jens Printz Madsen
Automatic identification of phoneme boundaries using a mixed parameter model
Paul Micallef, Ted Chilton
Pitch detection reliability assessment for forensic applications
Serguei Koval, Veronika Bekasova, Michael Khitrov, Andrey Raev
Efficient estimation of perceptual features for speech recognition
Zhihong Hu, Etienne Barnard
Towards decomposing the sources of variability in speech
Narendranath Malayath, Hynek Hermansky, Alexander Kain
Use of vector-valued dynamic weighting coefficients for speech recognition: maximum likelihood approach
Rathinavelu Chengalvarayan
Automatic segmentation: data-driven units of speech
S. W. Beet, L. Baghai-Ravary
On robust time-varying AR speech analysis based on t-distribution
Dejan Bajic
A simple phoneme energy model for the Greek language and its application to speech recognition
Dimitris Tambakas, Iliana Tzima, Nikos Fakotakis, George Kokkinakis
A macroscopic analysis of an emotional speech corpus
James E. H. Noad, Sandra P. Whiteside, Phil Green
Restoration of pitch pattern of speech based on a pitch generation model
Hiroshi Shimodaira, Mitsuru Nakai, Akihiro Kumata
The research of correlation between pitch and skin galvanic reaction at change of human emotional state
A. V. Agranovski, O. Y. Berg, D. A. Lednov
K-NN versus Gaussian in HMM-based recognition system
Claude Montacié, Marie-José Caraty, Fabrice Lefèvre
Spectral methods for voice source parameters estimation
Boris Doval, Christophe d'Alessandro, Benoit Diard
A simple and efficient algorithm for the compression of MBROLA segment databases
Olivier van der Vrecken, Nicolas Pierret, Thierry Dutoit, Vincent Pagel, Fabrice Malfrere
A segmental formant vocoder based on linearly varying mixture of Gaussians
Parham Zolfaghari, Tony Robinson
Voice mimic system using an articulatory codebook for estimation of vocal tract shape
Samir Chennoukh, Daniel Sinder, Gael Richard, James L. Flanagan
Adaptive transform coding for linear predictive residual
Damith J. Mudugamuwa, Alan B. Bradley
Performance evaluation of objective quality measures for coded speech
Akira Takahashi, Nobuhiko Kitawaki, Paolino Usai, David Atkinson
Between recognition and synthesis - 300 bits/second speech coding
Mohamed Ismail, Keith Ponting
High quality split-band LPC vocoder and its fixed point real time implementation
Stephane Villette, Milos Stefanovic, Ian Atkinson, Ahmet Kondoz
Missing packet recovery techniques for DM coded speech
Wen-Whei Chang, Hwai-Tsu Chang, Wan-Yu Meng
Spectral sensitivity of LSP parameters and their transformed coefficients
Hai Le Vu, Laszlo Lois
Reducing the complexity of the LPC vector quantizer using the k-d tree search algorithm
V. Ramasubramanian, Kuldip K. Paliwal
Quantization using wavelet based temporal decomposition of the LSF
Aweke N. Lemma, W. Bastiaan Kleijn, Ed F. Deprettere
A novel 1.7/2.4 kb/s DCT based prototype interpolation speech coding system
Costas S. Xydeas, Gokhan H. Ilk
Improved regular pulse VSELP coding of speech at low bit-rates
Yong-Soo Choi, Hong-Goo Kang, Sang-Wook Park, Jae-Ha Yoo, Dae-Hee Youn
Joint estimation of pitch, band magnitudes, and V/UV decisions for MBE vocoder
Yong Duk Cho, Hong Kook Kim, Moo Young Kim, Sang Ryong Kim
A new distance measure in LPC coding: application for real time situations
Balazs Kovesi, Samir Saoudi, Jean Marc Boucher, Gábor Horvath
Consideration of processing strategies for very-low-rate compression of wideband speech signals with known text transcription
Peter Veprek, Alan B. Bradley
Zero-redundancy error protection for CELP speech codecs
Norbert Görtz
Low bit rate speech coding using an improved HSX model
Ridha Matmti, Milan Jelinek, Jean-Pierre Adoul
Phonetic vocoding with speaker adaptation
Carlos M. Ribeiro, Isabel Trancoso
Quantization of spectral sequences using variable length spectral segments for speech coding at very low bit rate
Geneviève Baudoin, Jan Cernocky, Gérard Chollet
On modeling event functions in temporal decomposition based speech coding
Shahrokh Ghaemmaghami, Mohamed Deriche, Boualem Boashash
Phase quantization by pitch-cycle waveform coding in low bit rate sinusoidal coders
Soledad Torres, F. Javier Casajús-Quirós
A perceptual study of the Greek vowel space using synthetic stimuli
Antonis Botinis, Marios Fourakis, John W. Hawks
Mixed multi-band excitation coder using frequency domain mixture function (FDMF) for a low-bit rate speech coding
Woo-Jin Han, Sung-Joo Kim, Yung-Hwan Oh
Robust GSM speech decoding using the channel decoder's soft output
Tim Fingscheidt, Olaf Scheufen
A low-bit-rate speech coder using adaptive line spectral frequency prediction
Carl W. Seymour, Tony A. Robinson
Optimising unit selection with voice source and formants in the CHATR speech synthesis system
Wen Ding, Nick Campbell
A new framework to provide high-controllability speech signal and the development of a workbench for it
Masanobu Abe, Hideyuki Mizuno, Satoshi Takahashi, Shin'ya Nakajima
Shape-invariant prosodic modification algorithm for concatenative text-to-speech synthesis
Eduardo R. Banga, Carmen Garcia-Mateo, Xavier Fernandez-Salgado
An RNN-based spectral information generation for Mandarin text-to-speech
Shaw-Hwa Hwang, Sin-Horng Chen, Saga Chang
Methods for optimal text selection
Jan P. H. van Santen, Adam L. Buchsbaum
High resolution prosody modification for speech synthesis
Francisco M. Gimenez de los Galanes, David Talkin
Text-to-speech conversion with neural networks: a recurrent TDNN approach
Orhan Karaali, Gerald Corrigan, Ira Gerson, Noel Massey
Data driven formant synthesis
Jesper Högberg
Speech synthesis using non-uniform units in the Verbmobil project
Simon King, Thomas Portele, Florian Höfer
On the pronunciation mode of acronyms in several European languages
Isabel Trancoso, M. Ceu Vianna
Evaluation of speech synthesis systems for Dutch in tele-communication applications in GSM and PSTN networks
Toni Rietveld, Joop Kerkhoff, M. J. W. M. Emons, E.J. Meijer, Angelien A. Sanderman, Agaath M. C. Sluijter
Automatic diphone extraction for an Italian text-to-speech synthesis system
Bianca Angelini, Claudia Barolo, Daniele Falavigna, Maurizio Omologo, Stefano Sandri
Simplification of TTS architecture vs. operational quality
Eric Keller
Felix - a TTS system with improved pre-processing and source signal generation
Georg Fries, Antje Wirth
Investigating the limitations of concatenative synthesis
Mike Edgington
Speech coding and synthesis using parametric curves
Luis Miguel Teixeira de Jesus, Gavin C. Cawley
Automatically clustering similar units for unit selection in speech synthesis
Alan W. Black, Paul Taylor
Improvements on a trainable letter-to-sound converter
Li Jiang, Hsiao-Wuen Hon, Xuedong Huang
On a cepstral pitch alteration technique for prosody control in the speech synthesis system with high quality
Myungjin Bae, Kyuhong Kim, Woncheol Lee
Diphone concatenation using a harmonic plus noise model of speech
Yannis Stylianou, Thierry Dutoit, Juergen Schroeter
The "sketchboard": a dynamic interpretative memory and its use for spoken language understanding
Gérard Sabah
Speech technology integration and research platform: a system study
Qiru Zhou, Chin-Hui Lee, Wu Chou, Andrew Pargellis
Speech recognition on SPHERIC - an IC for command and control applications
Dieter Geller, Markus Lieb, Wolfgang Budde, Oliver Muelhens, Manfred Zinke
MUSE: a scripting language for the development of interactive speech analysis and recognition tools
Michael K. McCandless, James R. Glass
Language learning based on non-native speech recognition
Silke Witt, Steve J. Young
Task modelling by sentence templates
Ute Kilian, Klaus Bader
Extraction and representation of rhythmic components of spontaneous speech
Shigeyoshi Kitazawa, Hideya Ichikawa, Satoshi Kobayashi, Yukihiro Nishinuma
Automatic pronunciation scoring of specific phone segments for language instruction
Yoon Kim, Horacio Franco, Leonardo Neumeyer
Automatic detection of mispronunciation for language instruction
Orith Ronen, Leonardo Neumeyer, Horacio Franco
Continuous formant-tracking applied to visual representations of the speech and speech recognition
Agustin Alvarez, Rafael Martinez, Victor Nieto, Victoria Rodellar, Pedro Gomez
A CALL system using speech recognition to train the pronunciation of Japanese long vowels, the mora nasal and mora obstruents
Goh Kawai, Keikichi Hirose
An educational and experimental workbench for visual processing of speech data
Jan Nouza, Miroslav Holada, Daniel Hajek
A 3 channel digital CVSD bit-rate conversion system using a general purpose DSP
Yong-Soo Choi, Hong-Goo Kang, Sung-Youn Kim, Young-Cheol Park, Dae-Hee Youn
SLIM prosodic module for learning activities in a foreign language
Rodolfo Delmonte, Mirela Petrea, Ciprian Bacalu
Barge-in revised
Bernhard Kaspar, Karlheinz Schuhmacher, Stefan Feldes
Waveedit, an interactive speech processing environment for Microsoft Windows platform
Mohammad Akbar
Subarashii: Japanese interactive spoken language education
Farzad Ehsani, Jared Bernstein, Amir Najmi, Ognjen Todic
Deploying speech applications over the web
David Goddeau, William Goldenthal, Chris Weikart
CSLUsh: an extendible research environment
Johan Schalkwyk, Jacques de Villiers, Sarel van Vuuren, Pieter Vermeulen
A flexible client-server model for multilingual CTS/TTS development
Tibor Ferenczi, Geza Nemeth, Gabor Olaszy, Zoltan Gaspar
Critically sampled PR filterbanks of nonuniform resolution based on block recursive FAMlet transform
Unto K. Laine
Automatic detection of accent in English words spoken by Japanese students
Nobuaki Minematsu, Nariaki Ohashi, Seiichi Nakagawa
An English conversation and pronunciation CAI system using speech recognition technology
Yasuhiro Taniguchi, Allan A. Reyes, Hideyuki Suzuki, Seiichi Nakagawa
Bringing spoken language systems to the classroom
Stephen Sutton, Ed Kaiser, A. Cronk, Ron Cole
Automatic assessment of foreign speakers' pronunciation of Dutch
Catia Cucchiarini, Lou Boves
Use of low power EM radar sensors for speech articulator measurements
John F. Holzrichter, Greg C. Burnett
Real time measurements of the vocal tract resonances during speech
Julien Epps, Annette Dowd, John Smith, Joe Wolfe
Linguistic criteria for building and recording units for concatenative speech synthesis in Brazilian Portuguese
Eleonora Cavalcante Albano, Patricia Aparecida Aquino
Four-and-twenty, twenty-four. What's in a number?
Knut Kvale, Arne Kjell Foldvik
Vowel nasalization in Brazilian Portuguese: an articulatory investigation
Joao Antonio de Moraes
Rhythmic organization peculiarities of the spoken text
Elena Steriopolo
Obtaining confidence measures from sentence probabilities
Bernhard Rueber
Sentence design for speech synthesis and speech recognition database by phonetic rules
Yiqing Zu
Identification of regional variants of High German from digit sequences in German telephone speech
Christoph Draxler, Susanne Burger
Aerodynamic constraints on the production of palatalized trills: the case of the Slavic trilled [r]
Darya Kavitskaya
An experimental phonetic study of the interrelationship between prosodic phrase and syntactic structure
Cheol-jae Seong, Sanghun Kim
Individual differences between vowel systems of German speakers
Sebastian J. G. G. Heid
Tempo and its change in spontaneous speech
Anton Batliner, Andreas Kießling, Ralf Kompe, Heinrich Niemann, Elmar Nöth
A corpus-based approach to diphthong analysis of standard Slovenian
Bojan Petek, Rastislav Sustarsic
Catalan vowel duration
Lourdes Aguilar, Julia A. Gimenez, Maria Machuca, Rafael Marin, Montse Riera
The intonation of vocatives in spoken Neapolitan Italian
Maria Rosaria Caputo
A comparative acoustic study of spontaneous and read Italian speech
Emanuela Magno Caldognetto, Claudio Zmarich, Franco Ferrero
A contribution to the estimation of naturalness in the intonation of Italian spontaneous speech
Mario Refice, Michelina Savino, Martine Grice
Diphthongs and the process of monophthongization in Austrian German: a first approach
Sylvia Moosmüller
The prosody of broad and narrow focus in English: two experiments
Steve Hoskins
The domain of accentual lengthening in Scottish English
Alice Turk, Laurence White
Spontaneous dialogue: some results about the F0 predictions of a pragmatic model of information processing
Mariette Bessac, Geneviève Caelen-Haumont
Phonetic characteristics of double articulations in some Mangbutu-Efe languages
Didier Demolin, Bernard Teston
Intonation modeling for the southern dialects of the Basque language
Inmaculada Hernaez, Inaki Gaminde, Borja Etxebarria, Pilartxo Etxebarria
From phone identification to phone clustering using mutual information
Peter O'Boyle, Ji Ming, Marie Owens, F. Jack Smith
Phonetic code emergence in a society of speech robots: explaining vowel systems and the MUAF principle
Ahmed-Reda Berrah, Rafael Laboissiere
Effects of voicing on /t,d/ tongue/palate contact in English and Norwegian
Inger Moen, Hanne Gram Simonsen
Fieldwork techniques for relating formant frequency, amplitude and bandwidth
Peter Ladefoged, Gunnar Fant
Word juncture modelling based on the TIMIT database
Xue Wang, Louis C.W. Pols
The phonology and phonetics of second language intonation: the case of "Japanese English"
Motoko Ueyama
A low-cost phonetic transcription method
Pablo Fetter, Udo Haiber, Peter Regel-Brietzmann
Word and acoustic confidence annotation for large vocabulary speech recognition
Lin Chase
A senone based confidence measure for speech recognition
Zachary Bergen, Wayne Ward
OOV utterance detection based on the recognizer response function
Erica Bernstein, Ward R. Evans
Estimating confidence using word lattices
Thomas Kemp, Thomas Schaaf
Improved estimation, evaluation and applications of confidence measures for speech recognition
Man-hung Siu, Herbert Gish, Fred Richardson
Improved speaker verification system with limited training data on telephone quality speech
Salleh Hussain, Fergus R. McInnes, Mervyn A. Jack
Verbal information verification
Qi Li, Biing-Hwang Juang, Qiru Zhou, Chin-Hui Lee
A segment-based speaker verification system using SUMMIT
Sridevi V. Sarma, Victor W. Zue
Speaker verification on the world wide web
Michael Sokolov
Text-prompted versus sound-prompted passwords in speaker verification systems
Johan Lindberg, Håkan Melin
GMM sample statistic log-likelihoods for text-independent speaker recognition
Michael Schmidt, John Golden, Herbert Gish
The influence of phrase boundaries on perceived prominence in two-peak intonation contours
Toni Rietveld, Carlos Gussenhoven
Testing the meaning of four Dutch pitch accent types
Johanneke Caspers
A perceptual study for modelling speaker-dependent intonation in TTS and dialog systems
Joachim J. Mersdorf, Thomas Domhover
Can we perceive attitudes before the end of sentences? the gating paradigm for prosodic contours
Veronique Aubergé, Tuulikki Grepillat, A. Rilliard
To what extent is perceived focus determined by F0-cues?
Mattias Heldner, Eva Strangert
Temporal-alignment categories of accent-lending rises and falls
David House, Dik Hermes, Frédéric Beaugendre
Webgalaxy - integrating spoken language and hypertext navigation
Raymond Lau, Giovanni Flammia, Christine Pao, Victor W. Zue
Pitch estimation of singing for re-synthesis and musical transcription
Michael J. Carey, Eluned S. Parris, Graham D. Tattersall
Automated lip synchronisation for human-computer interaction and special effect animation
Christian Martyn Jones, Satnam Singh Dlay
Developing web-based speech applications
Charles T. Hemphill, Yeshwant K. Muthusamy
Automatic post-synchronization of speech utterances
Werner Verhelst
Automatic generation of hyperlinks between audio and transcript
Jordi Robert-Ribes, Rami G. Mukhtar
Analysis of infant cries for the early detection of hearing impairment
Sebastian Möller, Rainer Schonweiler
Optical logo-therapy (OLT): a computer-based real time visual feedback application for speech training
A. Hatzis, P.D. Green, S.J. Howard
Intelligent retrieval of very large Chinese dictionaries with speech queries
Sung-Chien Lin, Lee-Feng Chien, Ming-Chiuan Chen, Lin-Shan Lee, Ker-Jiann Chen
Preliminary results of a multilingual interactive voice activated telephone service for people-on-the-move
Fulvio Leonardi, Giorgio Micca, Sheyla Militello, Mario Nigra
Assessment of an operational dialogue system used by a blind telephone switchboard operator
Jean-Christophe Dubois, Yolande Anglade, Dominique Fohr
STACC: an automatic service for information access using continuous speech recognition through telephone line
Antonio J. Rubio, Pedro Garcia, Angel de la Torre, Jose C. Segura, Jesus Diaz-Verdejo, Maria C. Benitez, Victoria Sanchez, Antonio M. Peinado, Juan M. Lopez-Soler, Jose L. Perez-Cordoba
A voice activated dialogue system for fast-food restaurant applications
Ramon Lopez-Cozar, Pedro Garcia, Jesus Diaz-Verdejo, Antonio J. Rubio
Multi-microphone sub-band adaptive signal processing for improvement of hearing aid performance
Paul W. Shields, Douglas R. Campbell
Tactile transmission of intonation and stress
Hans Georg Piroth, Thomas Arnhold
Hearing impairment simulation: an interactive multimedia programme on the internet for students of speech therapy
Kerttu Huttunen, Pentti Korkko, Martti Sorri
Analysis of dysarthric speech by means of formant-to-area mapping
Sorin Ciocea, Jean Schoentgen, Lisa Crevier-Buchman
An intelligent telephone answering system using speech recognition
Boris M. Lobanov, Simon V. Brickle, Andrey V. Kubashin, Tatiana V. Levkovskaja
Speedata: a prototype for multilingual spoken data-entry
Ulla Ackermann, Bianca Angelini, Fabio Brugnara, Marcello Federico, Diego Giuliani, Roberto Gretter, Heinrich Niemann
Applications for the hearing-impaired: evaluation of Finnish phoneme recognition methods
Matti Karjalainen, Peter Boda, Panu Somervuo, Toomas Altosaar
Applications for the hearing-impaired: comprehension of Finnish text with phoneme errors
Nina Alarotu, Mietta Lennes, Toomas Altosaar, Anja Malm, Matti Karjalainen
Access - automated call center through speech understanding system
Ute Ehrlich, Gerhard Hanrieder, Ludwig Hitzenberger, Paul Heisterkamp, Klaus Mecklenburg, Peter Regel-Brietzmann
Integrating a radio model with a spoken language interface for military simulations
E. Richard Anthony, Charles Bowen, Margot T. Peet, Susan Tammaro
On field experiments of continuous digit recognition over the telephone network
Daniele Falavigna, Roberto Gretter
An HMM-based phoneme recognizer applied to assessment of dysarthric speech
Xavier Menendez-Pidal, James B. Polikoff, H.Timothy Bunnell
Multiapplication platform based on technology for mobile telephone network services
Celinda de la Torre, Gonzalo Alonso
Field test of a calling card service based on speaker verification and automatic speech recognition
Els den Os, Lou Boves, David James, Richard Winski, Kurt Fridh
Speech: a privileged modality
Luc E. Julia, Adam J. Cheyer
Transcription of broadcast news
Jean-Luc Gauvain, Lori Lamel, Gilles Adda, Martine Adda-Decker
Can continuous speech recognizers handle isolated speech?
Fil Alleva, Xuedong Huang, Mei-Yuh Hwang, Li Jiang
Toward automatic transcription of Japanese broadcast news
Tatsuo Matsuoka, Yuichi Taguchi, Katsutoshi Ohtsuki, Sadaoki Furui, Katsuhiko Shirai
Automatic detection of semantic boundaries
Mauro Cettolo, Anna Corazza
Connected digit recognition in spontaneous speech
Etienne Bauche, Bojana Gajic, Yasuhiro Minami, Tatsuo Matsuoka, Sadaoki Furui
Advances in transcription of broadcast news
Francis Kubala, Hubert Jin, Spyros Matsoukas, Long Nguyen, Richard Schwartz, John Makhoul
The domain of final lengthening in production and perception in Dutch
Tina Cambier-Langeveld, Marina Nespor, Vincent J. van Heuven
Voicing assimilation as a cue for cluster identification
Christine Meunier
On the perceptual relevance of degemination in Dutch
Saskia M.M. te Riele, Manon Loef, O. van Herwijnen
Does deletion of French SCHWA lead to neutralization of lexical distinctions?
Cecile Fougeron, Donca Steriade
An approach of the Catalan palatals discrimination based on durational patterns of spectral evolution
Marielle Bruyninckx, Bernard Harmegnies
Syllable and segment duration at different speaking rates in the Slovenian language
Jerneja Gros, Nikola Pavesic, France Mihelic
Hybrid networks based on RBFN and GMM for speaker recognition
Wei-Ying Li, Douglas O'Shaughnessy
A discriminative training algorithm for Gaussian mixture speaker models
Jialong He, Li Liu, Günther Palm
Comparison of background normalization methods for text-independent speaker verification
Douglas A. Reynolds
Speaker verification with limited enrollment data
Owen Kimball, Michael Schmidt, Herbert Gish, Jason Waterman
Speaker verification in the telephone network: research activities in the CAVE project
Frédéric Bimbot, Hans-Peter Hutter, Cedric Jaboulet, Johan W. Koolwaaij, Johan Lindberg, Jean-Benoit Pierrot
Speaker verification with GSM coded telephone speech
Mark Kuitert, Lou Boves
Speaker identification with user-selected password phrases
Aaron E. Rosenberg, S. Parthasarathy
Speaker verification based on phonetic decision making
Jesper O. Olsen
Analysis and comparison of score normalisation methods for text-dependent speaker verification
A. M. Ariyaeeinia, P. Sivakumaran
Automatic speaker recognition on a vocoder link
Frederic Jauquet, Patrick Verlinde, Claude Vloeberghs
Likelihood ratio adjustment for the compensation of model mismatch in speaker verification
Frédéric Bimbot, Dominique Genoud
A lognormal tied mixture model of pitch for prosody based speaker recognition
M. Kemal Sönmez, Larry Heck, Mitchel Weintraub, Elizabeth Shriberg
Parsers, prominence, and pauses
Nick Campbell, Tony Hebert, Ezra Black
Automatic assignment of part-of-speech to out-of-vocabulary words for text-to-speech processing
Frédéric Béchet, Marc El-Bèze
Text-to-prosody parsing in an Italian speech synthesizer. recent improvements
Barbara Gili Fivela, Silvia Quazza
Tagging syllables
Brigitte Krenn
Assigning phrase breaks from part-of-speech sequences
Alan W. Black, Paul Taylor
Prediction of word prominence
Christina Widera, Thomas Portele, Maria Wolters
Acoustic and perceptual properties of phonemes in continuous speech as a function of speaking rate
Hisao Kuwabara
New results in vowel production: MRI, EPG, and acoustic data
Shrikanth Narayanan, Abeer Alwan, Yong Song
The temporal properties of spoken Japanese are similar to those of English
Takayuki Arai, Steven Greenberg
The amplitudes of the peaks in the spectrum: data from /a/ context
Anna Esposito
Acoustical characteristics of speech and voice in speech pathology
Natalija Bolfan-Stosic, Mladen Hedjever
Pronunciation modeling applied to automatic segmentation of spontaneous speech
Andreas Kipp, Maria-Barbara Wesenick, Florian Schiel
Dynamic and static improvements to lexical baseforms
Simon Downey, Richard Wiseman
Signal driven generation of word baseforms from few examples
Andreas Hauenstein
Modeling the acoustic differences between L1 and L2 speech: the short vowels of Afrikaans and South African English
Elizabeth C. Botha, Louis C. W. Pols
Laryngeal movements and speech rate: an x-ray investigation
Béatrice Vaxelaire, Rudolph Sock
How flexible is the human voice? - a case study of mimicry
Anders Eriksson, Pär Wretling
The effect of low-pass filtering on estimated voice source parameters
Helmer Strik
Vowel development of /i/ and /u/ in 15-36 month old children at risk and not at risk to stutter
Susan M. Fosnot
Optopalatograph: development of a device for measuring tongue movement in 3D
Alan Wrench, Alan McIntosh, William Hardcastle
Speech synthesis and prosody modification using segmentation and modelling of the excitation signal
Juana M. Gutierrez-Arriola, Francisco M. Gimenez de los Galanes, Mohammed H. Savoji, José M. Pardo
How can the control of the vocal tract limit the speaker's capability to produce the ultimate perceptive objectives of speech?
Christophe Savariaux, Louis-Jean Boë, Pascal Perrier
A step toward general model for symbolic description of the speech signal
Goran S. Jovanovic
Referring in long term speech by using orientation patterns obtained from vector field of spectrum pattern
Kiyoshi Furukawa, Masayuki Nakazawa, Takashi Endo, Ryuichi Oka
Experiments in spoken queries for document retrieval
J. Barnett, S. Anderson, J. Broglio, M. Singh, R. Hudson, S. W. Kuo
Towards an automated directory information system
Frank Seide, Andreas Kellner
A strategy for mixed-initiative dialogue control
Lars Bo Larsen
On the design of effective speech-based interfaces for desktop applications
Jim Hugunin, Victor W. Zue
Dialogue strategies guiding users to their communicative goals
Matthias Denecke, Alex Waibel
A speech interface for forms on WWW
Sunil Issar
Learning the structure of mixed initiative dialogues using a corpus of annotated conversations
Giovanni Flammia, Victor W. Zue
AMICA: the AT&T mixed initiative conversational architecture
Roberto Pieraccini, Esther Levin, Wieland Eckert
Generating semantically consistent inputs to a dialog manager
Alicia Abella, Allen L. Gorin
A stochastic model of computer-human interaction for learning dialogue strategies
Esther Levin, Roberto Pieraccini
Semantic processing of out-of-vocabulary words in a spoken dialogue system
Manuela Boros, Maria Aretoulaki, Florian Gallwitz, Elmar Nöth, Heinrich Niemann
Clarification dialogues in VERBMOBIL
Elisabeth Maier
Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum
Levent M. Arslan, David Talkin
Optimal state dependent spectral representation for HMM modeling: a new theoretical framework
C. Mokbel, G. Gravier, Gérard Chollet
Speech analysis and synthesis using an AM-FM modulation model
Alexandros Potamianos, Petros Maragos
Synthesis of fricative consonants by audiovisual-to-articulatory inversion
Khaled Mawass, Pierre Badin, Gérard Bailly
New transformations of cepstral parameters for automatic vocal tract length normalization in speech recognition
Tom Claes, Ioannis Dologlou, Louis ten Bosch, Dirk Van Compernolle
A multiresolutionally oriented approach for determination of cepstral features in speech recognition
S. Dobrisek, F. Mihelic, N. Pavesic
Residual noise suppression using psychoacoustic criteria
Tim Haulick, Klaus Linhard, Peter Schrogmeier
Processing linear prediction residual for speech enhancement
B. Yegnanarayana, Carlos Avendano, Hynek Hermansky, P. Satyanarayana Murthy
Combined acoustic echo control and noise reduction for mobile communications
Stefan Gustafsson, Rainer Martin
A nonstationary autoregressive HMM and its application to speech enhancement
Ki Yong Lee, Jae Yeol Rheem
Spectral subtraction and mean normalization in the context of weighted matching algorithms
Nestor Becerra Yoma, Fergus R. McInnes, Mervyn A. Jack
Improving the intelligibility of noisy speech using an audible noise suppression technique
D. E. Tsoukalas, J. Mourjopoulos, George Kokkinakis
Noisy speech enhancement by fusion of auditory and visual information: a study of vowel transitions
Laurent Girin, Gang Feng, Jean-Luc Schwartz
Spectral subtraction using a non-critically decimated discrete wavelet transform
Andreas Engelsberg, Thomas Gulzow
Bayesian affine transformation of HMM parameters for instantaneous and supervised adaptation in telephone speech recognition
Jen-Tzung Chien, Hsiao-Chuan Wang, Chin-Hui Lee
Integrated bias removal techniques for robust speech recognition
Craig Lawrence, Mazin Rahim
Acoustic front ends for speaker-independent digit recognition in car environments
Detlev Langmann, Alexander Fischer, Friedhelm Wuppermann, Reinhold Haeb-Umbach, Thomas Eisele
Signal bias removal using the multi-path stochastic equalization technique
Lionel Delphin-Poulat, Chafic Mokbel
Subband echo cancellation in automatic speech dialog systems
Andrej Miksic, Bogomir Horvat
Speech enhancement via energy separation
Hesham Tolba, Douglas O'Shaughnessy
A method of signal extraction from noisy signal
Masashi Unoki, Masato Akagi
Multi-channel noise reduction using wavelet filter bank
Jiri Sika, Vratislav Davidek
Speech signal detection in noisy environment using a local entropic criterion
Imad Abdallah, Silvio Montresor, Marc Baudry
A new algorithm for robust speech recognition: the delta vector Taylor series approach
Pedro J. Moreno, Brian Eberman
Robust enhancement of reverberant speech using iterative noise removal
David Cole, Miles Moody, Sridha Sridharan
A network speech echo canceller with comfort noise
D. J. Jones, Scott D. Watson, K. G. Evans, B. M. G. Cheetham, R. A. Reeve
A new metric for selecting sub-band processing in adaptive speech enhancement systems
Amir Hussain, Douglas R. Campbell, Thomas J. Moir
Estimation of LPC cepstrum vector of speech contaminated by additive noise and its application to speech enhancement
Hidefumi Kobatake, Hideta Suzuki
Multi-band and adaptation approaches to robust speech recognition
Sangita Tibrewala, Hynek Hermansky
Non-quadratic criterion algorithms for speech enhancement
Enrique Masgrau, Eduardo Lleida, Luis Vicente
Automatic acquisition of salient grammar fragments for call-type classification
J. H. Wright, Allen L. Gorin, Giuseppe Riccardi
Stochastically-based natural language understanding across tasks and languages
Wolfgang Minker
Transducer composition for context-dependent network expansion
Michael Riley, Fernando Pereira, Mehryar Mohri
Giving prosody a meaning
Christian Lieske, Johan Bos, Martin Emele, Björn Gambäck, C.J. Rupp
Feature-based language understanding
Kishore A. Papineni, Salim Roukos, Todd R. Ward
Speech translation based on automatically trainable finite-state models
Juan Carlos Amengual, Jose Miguel Benedi, Klaus Beulen, Francisco Casacuberta, Asuncion Castano, Antonio Castellanos, Victor M. Jimenez, David Llorens, Andres Marzal, Hermann Ney, Federico Prat, Enrique Vidal, Juan Miguel Vilar
Document space models using latent semantic analysis
Yoshihiko Gotoh, Steve Renals
Adaptive topic-dependent language modelling using word-based varigrams
Sven C. Martin, Jörg Liermann, Hermann Ney
A latent semantic analysis framework for large-span language modeling
Jerome R. Bellegarda
A maximum likelihood model for topic classification of broadcast news
Richard Schwartz, Toru Imai, Francis Kubala, Long Nguyen, John Makhoul
Language modelling for task-oriented domains
Cosmin Popovici, Paolo Baggia
Chinese language model adaptation based on document classification and multiple domain-specific language models
Sung-Chien Lin, Chi-Lung Tsai, Lee-Feng Chien, Ker-Jiann Chen, Lin-Shan Lee
Estimating prosodic weights in a syntactic-rhythmical prediction system
Philippe Langlais
Syntactic information contained in prosodic features of Japanese utterances
Kazuhiko Ozeki, Kazuyuki Kousaka, Yujie Zhang
Hierarchical duration modelling for speech recognition using the ANGIE framework
Grace Chung, Stephanie Seneff
On the use of prosody in a speech-to-speech translator
Volker Strom, Anja Elsner, Wolfgang Hess, Walter Kasper, Alexandra Klein, Hans Ulrich Krieger, Jörg Spilker, Hans Weber, Gunther Gorz
Automatic recognition of sentence type from prosody in Dutch
Vincent J. van Heuven, Judith Haan, Jos J.A. Pacilly
Automatic word demarcation based on prosody
Paul Munteanu, Bertrand Caillaud, Jean-Francois Serignat, Geneviève Caelen-Haumont
A 16-kbit/s wideband speech codec scalable with G.729
A. Kataoka, S. Kurihara, S. Sasaki, S. Hayashi
Comparison of auditory masking models for speech coding
M. Lynch, E. Ambikairajah, A. Davis
Wideband speech coding based on the MBE structure
A. Amodio, G. Feng
Perceptual filter comparisons for wideband and FM bandwidth audio coders
Marcos Perreau Guimaraes, Nicolas Moreau, Madeleine Bonnet
Wideband coding of speech using neural network gain adaptation
Cheung-Fat Chan, Man-Tak Chu
Wideband-speech APVQ coding from 16 to 32 kbps
Josep M. Salavedra
A comparative analysis of blind channel equalization methods for telephone speech recognition
Wei-Wen Hung, Hsiao-Chuan Wang
HMM retraining based on state duration alignment for noisy speech recognition
Wei-Wen Hung, Hsiao-Chuan Wang
Fast parallel model combination noise adaptation processing
Yasuhiro Komori, Tetsuo Kosaka, Hiroki Yamamoto, Masayuki Yamada
Speech recognition module for CSCW using a microphone array
Takashi Endo, Shigeki Nagaya, Masayuki Nakazawa, Kiyoshi Furukawa, Ryuichi Oka
Relative mel-frequency cepstral coefficients compensation for robust telephone speech recognition
Jiqing Han, Munsung Han, Gyu-Bong Park, Jeongue Park, Wen Gao
Robust speech detection method for speech recognition system for telecommunication networks and its field trial
Seiichi Yamamoto, Masaki Naito, Shingo Kuroiwa
The tuning of speech detection in the context of a global evaluation of a voice response system
Laurent Mauuary, Lamia Karray
New methods in continuous Mandarin speech recognition
C. Julian Chen, Ramesh A. Gopinath, Michael D. Monkowski, Michael A. Picheny, Katherine Shen
Automatic transcription of general audio data: effect of environment segmentation on phonetic recognition
Michelle S. Spina, Victor W. Zue
Automatic recognition of continuous Cantonese speech with very large vocabulary
Alfred Ying Pang Ng, L. W. Chan, P. C. Ching
Source normalization training for HMM applied to noisy telephone speech recognition
Yifan Gong
The development of a speaker independent continuous speech recognizer for Portuguese
Joao P. Neto, Ciro A. Martins, Luis B. Almeida
Blame assignment for errors made by large vocabulary speech recognizers
Lin Chase
Predicting speech recognition performance
Atsushi Nakamura
A voice activity detector for the ITU-T 8kbit/s speech coding standard G.729
Scott D. Watson, Barry M.G. Cheetham, P.A. Barrett, W.T.K. Wong, A.V. Lewi
Vocabulary-independent recognition of American Spanish phrases and digit strings
Yeshwant K. Muthusamy, John J. Godfrey
Recognition of spoken and spelled proper names
Michael Meyer, Hermann Hild
HMM compensation for noisy speech recognition based on cepstral parameter generation
Takao Kobayashi, Takashi Masuko, Keiichi Tokuda
On the robustness of the critical-band adaptive filtering method for multi-source noisy speech recognition
George Nokas, Evangelos Dermatas, George Kokkinakis
A space transformation approach for robust speech recognition in noisy environments
Cun-tai Guan, Shu-hung Leung, Wing-hong Lau
Robust isolated word recognition using WSP-PMC combination
Tzur Vaich, Arnon Cohen
Fuzzy logic for rule-based formant speech synthesis
Spyros Raptis, George V. Carayannis
Integrating acoustic and labial information for speaker identification and verification
Pierre Jourlin, Juergen Luettin, Dominique Genoud, Hubert Wassner
Subword unit representations for spoken document retrieval
Kenney Ng, Victor W. Zue
Non-linear representations, sensor reliability estimation and context-dependent fusion in the audiovisual recognition of speech in noise
Pascal Teissier, Jean-Luc Schwartz, Anne Guerin-Dugue
Securized flexible vocabulary voice messaging system on unix workstation with ISDN connection
Philippe Renevey, Andrzej Drygajlo
Automatic derivation of multiple variants of phonetic transcriptions from acoustic signals
Houda Mokbel, Denis Jouvet
Improved bimodal speech recognition using tied-mixture HMMs and 5000 word audio-visual synchronous database
Satoshi Nakamura, Ron Nagai, Kiyohiro Shikano
On the use of phone duration and segmental processing to label speech signal
Philippe Depambour, Regine Andre-Obrecht, Bernard Delyon
Automatic detection of disturbing robot voice- and ping pong-effects in GSM transmitted speech
Martin Paping, Thomas Fahnle
Speech synthesis using phase vocoder techniques
Joseph Di Martino
Integration of eye fixation information with speech recognition systems
Ramesh R. Sarukkai, Craig Hunter
Generation of broadband speech from narrowband speech using piecewise linear mapping
Yoshihisa Nakatoh, M. Tsushima, T. Norimatsu
An assessment of the benefits active noise reduction systems provide to speech intelligibility in aircraft noise environments
Ian E.C. Rogers
OLGA - a dialogue system with an animated talking agent
Jonas Beskow, Kjell Elenius, Scott McGlashan
Towards usable multimodal command languages: definition and ergonomic assessment of constraints on users' spontaneous speech and gestures
Sandrine Robbe, Noelle Carbonell, Claude Valot
Exploiting repair context in interactive error recovery
Bernhard Suhm, Alex Waibel
An hybrid image processing approach to liptracking independent of head orientation
Lionel Reveret, Frederique Garcia, Christian Benoit, Eric Vatikiotis-Bateson
Automatic modeling of coarticulation in text-to-visual speech synthesis
Bertrand Le Goff
A multimedia platform for audio-visual speech processing
Ali Adjoudani, Thierry Guiard-Marigny, Bertrand Le Goff, Lionel Reveret, Christian Benoit
An intelligent system for information retrieval over the internet through spoken dialogue
Hiroya Fujisaki, Hiroyuki Kameda, Sumio Ohno, Takuya Ito, Ken Tajima, Kenji Abe
Data hiding in speech using phase coding
Yasemin Yardimci, A. Enis Cetin, Rashid Ansari
CAVE: an on-line procedure for creating and running auditory-visual speech perception experiments-hardware, software, and advantages
Denis Burnham, John Fowler, Michelle Nicol
The Bavarian archive for speech signals: resources for the speech community
Florian Schiel, Christoph Draxler, Hans G. Tillmann
WWWTranscribe - a modular transcription system based on the world wide web
Christoph Draxler
Design, recording and verification of a Danish emotional speech database
Inger S. Engberg, Anya Varnich Hansen, Ove Andersen, Paul Dalsgaard
Issues in database creation: recording new populations, faster and better labelling
Maxine Eskenazi, C. Hogan, J. Allen, R. Frederking
Design and analysis of a German telephone speech database for phoneme based training
Stefan Feldes, Bernhard Kaspar, Denis Jouvet
The design of a large vocabulary speech corpus for Portuguese
Joao P. Neto, Ciro A. Martins, Hugo Meinedo, Luis B. Almeida
Continued investigations of laryngectomee speech in noise - measurements and intelligibility tests
Lennart Nord, Britta Hammarberg, Elisabet Lundstrom
An appreciation study of an ASR inquiry system
L.J.M. Rothkrantz, W.A.Th. Manintveld, M.M.M. Rats, R.J. van Vark, J.P.M. de Vreught, H. Koppelaar
Object-oriented modeling of articulatory data for speech research information systems
Kamel Bensaber, Paul Munteanu, Jean-Francois Serignat, Pascal Perrier
A Korean speech corpus for train ticket reservation aid system based on speech recognition
Woosung Kim, Myoung-Wan Koo
Recall memory for earcons
Dawn Dutton, Candace Kamm, Susan Boyce
Semi-automatic phonetic labelling of large corpora
O. Mella, D. Fohr
CORPORA - speech database for Polish diphones
Stefan Grocholewski
Multilingual speech interfaces (MSI) and dialogue design environments for computer telephony services
Christel Müller, Thomas Ziem
Getting started with SUSAS: a speech under simulated and actual stress database
John H. L. Hansen, Sahar E. Bou-Ghazale
A markup language for text-to-speech synthesis
Richard Sproat, Paul Taylor, Michael Tanenblatt, Amy Isard
Several measures for selecting suitable speech corpora
Shuichi Itahashi, Naoko Ueda, Mikio Yamamoto
Greek speech database for creation of voice driven teleservices
Irene Chatzi, Nikos Fakotakis, George Kokkinakis
Combined on-line model adaptation and Bayesian predictive classification for robust speech recognition
Qiang Huo, Chin-Hui Lee
Speaker adaptive training applied to continuous mixture density modeling
Xavier Aubert, Eric Thelen
Speaker normalization training for mixture stochastic trajectory model
Irina Illina, Yifan Gong
On-line adaptation of hidden Markov models using incremental estimation algorithms
V. Digalakis
Modeling dependency in adaptation of acoustic models using multiscale tree processes
Ashvin Kannan, Mari Ostendorf
Acoustic clustering and adaptation for robust speech recognition
Larry Heck, Ananth Sankar
The DET curve in assessment of detection task performance
Alvin Martin, George Doddington, Terri Kamm, Mark Ordowski, Mark Przybocki
Speech quality evaluation of hands-free terminals
Harald Klaus, Ekkehard Diedrich, Astrid Dehnel, Jens Berger
Use of broadcast news materials for speech recognition benchmark tests
David S. Pallett, Jonathan G. Fiscus, William M. Fisher, John S. Garofolo
Spoken dialogue system evaluation: a first framework for reporting results
Norman M. Fraser
Generality and transferability. two issues in putting a dialogue evaluation tool into practical use
Niels Ole Bernsen, Hans Dybkjaer, Laila Dybkjaer, Vytautas Zinkevicius
Within-speaker variability of the word error rate for a continuous speech recognition system
David A. van Leeuwen, Herman J. M. Steeneken
Opportunities for computer-aided instruction in phonetics and speech communication provided by the internet
Mark Huckvale, Christian Benoit, C. Bowerman, Anders Eriksson, M. Rosner, M. Tatham, Briony Williams
The landscape of future education in speech communication sciences
Gerrit Bloothooft
An integrated system for teaching spoken dialogue systems technology
Kare Sjölander, Joakim Gustafson
Communication science within education for logopedics/speech and language therapy in Europe: the state of the art
Janet Beck, Bernard Camilleri, Hilde Chantrain, Anu Klippi, Marianne Leterme, Matti Lehtihalmes, Peter Schneider, Wilhelm Vieregge, Eva Wigforss
Education in spoken language engineering in Europe
Phil Green, Carlos Espain
A survey of phonetics education in Europe
Valerie Hazan, Wim van Dommelen
Matching training and testing criteria in hybrid speech recognition systems
Xin Tu, Yonghong Yan, Ron Cole
Context independent and context dependent hybrid HMM/ANN systems for vocabulary independent tasks
Stephane Dupont, Christophe Ris, Olivier Deroo, Vincent Fontaine, Jean-Marc Boite, L. Zanoni
Estimation of global posteriors and forward-backward training of hybrid HMM/ANN systems
J. Hennebert, Christophe Ris, Hervé Bourlard, Steve Renals, Nelson Morgan
Confidence measures for hybrid HMM/ANN speech recognition
Gethin Williams, Steve Renals
Ensemble methods for connectionist acoustic modelling
Gary D. Cook, Steve R. Waterhouse, A.J. Robinson
Improving performance on switchboard by combining hybrid HME/HMM and mixture of Gaussians acoustic models
Jürgen Fritsch, Michael Finke
Experiments in adaptation of language models for commercial applications
Petra Witschel, Harald Höge
Language model adaptation using dynamic marginals
Reinhard Kneser, Jochen Peters, Dietrich Klakow
Transforming out-of-domain estimates to improve in-domain language models
Rukmini Iyer, Mari Ostendorf
MDI adaptation of language models across corpora
P. Srinivasa Rao, Satya Dharanipragada, Salim Roukos
A class based approach to domain adaptation and constraint integration for empirical m-gram models
Klaus Ries
Using story topics for language model adaptation
Kristie Seymore, Ronald Rosenfeld
Towards speaker independent continuous speechreading
Juergen Luettin
Driving synthetic mouth gestures: phonetic recognition for faceme!
William Goldenthal, Keith Waters, Jean-Manuel Van Thong, Oren Glickman
Continuous visual speech recognition using geometric lip-shape models and neural networks
Alexandrina Rogozan, Paul Deleglise
The Teleface project: multi-modal speech-communication for the hearing impaired
Jonas Beskow, Martin Dahlquist, Björn Granström, Magnus Lundeberg, Karl-Erik Spens, Tobias Öhman
Real-time lip-tracking for lipreading
Rainer Stiefelhagen, Uwe Meier, Jie Yang
From raw images of the lips to articulatory parameters: a viseme-based prediction
Lionel Reveret
Adaptation of Maeda's model for acoustic to articulatory inversion
Bruno Mathieu, Yves Laprie
Why should speech control studies based on kinematics be considered with caution? insights from a 2D biomechanical model of the tongue
Yohan Payan, Pascal Perrier
An integrated model of the biomechanics and neural control of the tongue, jaw, hyoid and larynx system
Vittorio Sanguineti, Rafael Laboissiere, David J. Ostry
Using MRI to image the moving vocal tract during speech
M. Mohammad, E. Moore, J.N. Carter, C.H. Shadle, S.J. Gunn
Unified physiological model of audible-visible speech production
Eric Vatikiotis-Bateson, Hani Yehia
Motor control information recovering from the dynamics with the EP hypothesis
Hélène Loevenbruck, Pascal Perrier
Speaker adaptation for context-dependent HMM using spatial relation of both phoneme context hierarchy and speakers
Yasuhiro Komori, Tetsuo Kosaka, Masayuki Yamada, Hiroki Yamamoto
Fast algorithm for speech recognition using speaker cluster HMM
Masayuki Yamada, Yasuhiro Komori, Tetsuo Kosaka, Hiroki Yamamoto
A comparison of novel techniques for instantaneous speaker adaptation
Timothy J. Hazen, James R. Glass
Fast adaptation of acoustic models to environmental noise using Jacobian adaptation algorithm
Yoshikazu Yamaguchi, Satoshi Takahashi, Shigeki Sagayama
Unsupervised HMM adaptation based on speech-silence discrimination
Ilija Zeljkovic, Shrikanth Narayanan, Alexandros Potamianos
Correlation based predictive adaptation of hidden Markov models
Mohamed Afify, Yifan Gong, Jean-Paul Haton
Adaptation of hidden Markov models using multiple stochastic transformations
Vassilios Diakoloukas, Vassilios Digalakis
Transformation smoothing for speaker and environmental adaptation
M. J. F. Gales
Nonlinear discriminant analysis for improved speech recognition
Vincent Fontaine, Christophe Ris, Jean-Marc Boite
On the interplay between auditory-based features and locally recurrent neural networks for robust speech recognition in noise
Jurgen Tchorz, Klaus Kasper, Herbert Reininger, Bilger Kollmeier
Speech recognition using on-line estimation of speaking rate
Nelson Morgan, Eric Fosler, Nikki Mirghafori
Using formant frequencies in speech recognition
John N. Holmes, Wendy J. Holmes, Philip N. Garner
Speaker normalization and speaker adaptation - a combination for conversational speech recognition
Puming Zhan, Martin Westphal, Michael Finke, Alex Waibel
Speaker adaptation based on pre-clustering training speakers
Yuqing Gao, Mukund Padmanabhan, Michael Picheny
A fast method of speaker normalisation using formant estimation
Mike Lincoln, Stephen Cox, Simon Ringland
Acoustic front-end optimization for large vocabulary speech recognition
Lutz Welling, N. Haberland, Hermann Ney
Improving autoregressive hidden Markov model recognition accuracy using a non-linear frequency scale with application to speech enhancement
B. T. Logan, A. J. Robinson
Designing a reduced feature-vector set for speech recognition by using KL/GPD competitive training
Tsuneo Nitta, Akinori Kawamura
Speaker adaptation by correlation (ABC)
Scott Shaobing Chen, Peter DeSouza
Preliminary experiments on the perception of double semivowels
William A. Ainsworth, Georg F. Meyer
Does syllable frequency affect production time in a delayed naming task?
Niels O. Schiller
Human and machine identification of consonantal place of articulation from vocalic transition segments
Andrew C. Morris, Gerrit Bloothooft, William J. Barry, Bistra Andreeva, Jacques Koreman
Modelling the recognition of spectrally reduced speech
Jon Barker, Martin Cooke
Prosodic structure and phonetic processing: a cross-linguistic study
Christophe Pallier, Anne Cutler, Nuria Sebastian-Galles
The correlation between consonant identification and the amount of acoustic consonant reduction
Rob J. J. H. van Son, Louis C. W. Pols
Relevant spectral information for the identification of vowel features from bursts
Anne Bonneau
Perceptual study of intersyllabic formant transitions in synthesized V1-V2 in standard Chinese
Aijun Li
Role of perception of rhythmically organized speech in consolidation process of long-term memory traces (LTM-traces) and in speech production controlling
Oleg P. Skljarov
Sequential probabilities as a cue for segmentation
Arie H. van der Lugt
Perception and acoustics of emotions in singing
Susan Jansens, Gerrit Bloothooft, Guus de Krom
Phonemes and syllables in speech perception: size of attentional focus in French
Christophe Pallier
Quality of a vowel with formant undershoot: a preliminary perceptual study
Shinichi Tokuma
Segmental and suprasegmental contributions to spoken-word recognition in Dutch
Mariette Koster, Anne Cutler
Perception of vowel duration and spectral characteristics in Swedish
Dawn M. Behne, Peter E. Czigler, Kirk P. H. Sullivan
Relative contributions of noise burst and vocalic transitions to the perceptual identification of stop consonants
Adrien Neagu, Gerard Bailly
Effect of speaker familiarity and background noise on acoustic features used in speaker identification
Satoshi Kitagawa, Makoto Hashimoto, Norio Higuchi
Dynamic versus static specification for the perceptual identity of a coarticulated vowel
Michel Pitermann
Asymmetries in consonant confusion
Madelaine Plauche, Cristina Delogu, John J. Ohala
Rime and syllabic effects in phonological priming between French spoken words
Nicolas Dumay, Monique Radeau
Roles of static and dynamic features of formant trajectories in the perception of talker individuality
Weizhong Zhu, Hideki Kasuya
Database management and analysis for spoken dialog systems: methodology and tools
Chih-mei Lin, Shrikanth Narayanan, Russell Ritenour
Evaluating spoken dialog systems for telecommunication services
Candace Kamm, Shrikanth Narayanan, Dawn Dutton, Russell Ritenour
Robust spoken dialogue management for driver information systems
Xavier Pouteau, Emiel Krahmer, Jan Landsbergen
Using acoustic and prosodic cues to correct Chinese speech repairs
Yue-Shi Lee, Hsin-Hsi Chen
Integrating domain specific focusing in dialogue models
Nils Dahlbäck, Arne Jönsson
Evaluating competing agent strategies for a voice email agent
Marilyn Walker, Donald Hindle, Jeanne Fromer, Giuseppe Di Fabbrizio, Craig Mestel
Discourse marker use in task-oriented spoken dialog
Donna K. Byron, Peter A. Heeman
From interface to content: translingual access and delivery of on-line information
Victor W. Zue, Stephanie Seneff, James Glass, Lee Hetherington, Edward Hurley, Helen Meng, Christine Pao, Joseph Polifroni, Rafael Schloming, Philipp Schmid
Learning dialogue structures from a corpus
Jan Alexandersson, Norbert Reithinger
Dialogue act classification using language models
Norbert Reithinger, Martin Klesen
User's multiple goals in spoken dialogue
Didier Pernel
Chatting with interactive agent
Noriko Suzuki, Seiji Inokuchi, K. Ishii, Michio Okada
Generic template for the evaluation of dialogue management systems
Gavin E. Churcher, Eric S. Atwell, Clive Souter
Analysis of interactive strategy to recover from misrecognition of utterances including multiple information items
Yasuhisa Niimi, Takuya Nishimoto, Yutaka Kobayashi
A referential approach to reduce perplexity in the vocal command system COMPPA
Francois-Arnould Mathieu, Bertrand Gaiffe, Jean-Marie Pierrel
Linguistic processor for a spoken dialogue system based on island parsing techniques
Aristomenis Thanopoulos, Nikos Fakotakis, George Kokkinakis
Modelling of speech-based user interfaces
Brian Mellor, Chris Baber
Can you predict responses to yes/no questions? yes, no, and stuff
Beth Ann Hockey, Deborah Rossen-Knill, Beverly Spejewski, Matthew Stone, Stephen Isard
Dia-MoLE: an unsupervised learning approach to adaptive dialogue models for spoken dialogue systems
Jens-Uwe Möller
How do system questions influence lexical choices in user answers?
Joakim Gustafson, Anette Larsson, Rolf Carlson, K. Hellman
Gaussian mixture models with common principal axes and their application in text-independent speaker identification
Kuo-Hwei Yuo, Hsiao-Chuan Wang
Speaker models designed from complete data sets: a new approach to text-independent speaker verification
Dominik R. Dersch, Robin W. King
A double Gaussian mixture modeling approach to speaker recognition
Rivarol Vergin, Douglas O'Shaughnessy
An acoustic subword unit approach to non-linguistic speech feature identification
Mohamed Afify, Yifan Gong, Jean-Paul Haton
N-best GMM's for speaker identification
Chakib Tadj, Pierre Dumouchel, Yu Fang
Model dependent spectral representations for speaker recognition
Guillaume Gravier, Chafic Mokbel, Gerard Chollet
Equalizing sub-band error rates in speaker recognition
Roland Auckenthaler, John S. Mason
Automatic gender identification under adverse conditions
Stefan Slomka, Sridha Sridharan
Acoustic features and perceptive processes in the identification of familiar voices
Yizhar Lavner, Isak Gath, Judith Rosenhouse
On the use of acoustic segmentation in speaker identification
Leandro Rodriguez-Linares, Carmen Garcia-Mateo
Speaker recognition by humans and machines
Herman J. M. Steeneken, David A. van Leeuwen
Foreign speaker accent classification using phoneme-dependent accent discrimination models and comparisons with human perception benchmarks
Karsten Kumpf, Robin W. King
A comparison of human and machine in speaker recognition
Li Liu, Jialong He, Günther Palm
Evaluation of second language learners' pronunciation using hidden Markov models
Simo M. A. Goddijn, Guus de Krom
Delta vector Taylor series environment compensation for speaker recognition
Brian Eberman, Pedro J. Moreno
Wavelet-like regression features in the cepstral domain for speaker recognition
Jonathan Hume
Minimum classification error linear regression (MCELR) for speaker adaptation using HMM with trend functions
Rathinavelu Chengalvarayan
A continuous HMM text-independent speaker recognition system based on vowel spotting
Nikos Fakotakis, Kallirroi Georgila, Anastasios Tsopanoglou
On the independence of digits in connected digit strings
Johan W. Koolwaaij, Lou Boves
A new procedure for classifying speakers in speaker verification systems
Johan W. Koolwaaij, Lou Boves
Sound channel video indexing
Claude Montacié, Marie-José Caraty
CDHMM speaker recognition by means of frequency filtering of filter-bank energies
Javier Hernando, Climent Nadeu
Using accent-specific pronunciation modelling for improved large vocabulary continuous speech recognition
J. J. Humphries, P. C. Woodland
Automatic speech recognition for children
Alexandros Potamianos, Shrikanth Narayanan, Sungbok Lee
Recognition of non-native accents
Carlos Teixeira, Isabel Trancoso, Antonio Serralheiro
Speaking mode dependent pronunciation modeling in large vocabulary conversational speech recognition
Michael Finke, Alex Waibel
A prosody only decision-tree model for disfluency detection
Elizabeth Shriberg, Rebecca Bates, Andreas Stolcke
A novel training approach for improving speech recognition under adverse stressful conditions
Sahar E. Bou-Ghazale, John H. L. Hansen
Methods for microphone equalization in speech recognition
L. Fissore, Giorgio Micca, C. Vair
Room acoustics and reverberation: impact on hands-free recognition
Satoshi Nakamura, Kiyohiro Shikano
Echo and noise reduction for hands-free terminals - state of the art -
Gerard Faucon, Regine Le Bouquin-Jeannes
Robust speech recognition for wireless networks and mobile telephony
Reinhold Haeb-Umbach
Speech recognition in the car from phone dialing to car navigation
Dirk Van Compernolle
A keyvowel approach to the synthesis of regional accents of English
Briony Williams, Stephen Isard
Experimental implementation of pitch-synchronous synthesis methods for the ROMVOX text-to-speech system
Attila Ferencz, Radu Arsinte, Istvan Nagy, Teodora Ratiu, Maria Ferencz, Gavril Toderean, Diana Zaiu, Tunde-Csilla Kovacs, Lajos Simon
The Bell Labs German text-to-speech system: an overview
Bernd Möbius, Richard Sproat, Jan P. H. van Santen, Joseph P. Olive
The generation of regional pronunciations of English for speech synthesis
Susan Fitt
Bell Laboratories Russian text-to-speech system
Elena Pavlova, Yuri Pavlov, Richard Sproat, Chilin Shih, Jan P. H. van Santen
A bilingual text-to-speech system in Spanish and Catalan
Antonio Bonafonte, Ignasi Esquerra, Albert Febrer, Francesc Vallverdu
Automatic rule-based generation of word pronunciation networks
Nick Cremelie, Jean-Pierre Martens
Creating user defined new vocabularies for voice dialing
Jose Maria Elvira, Juan Carlos Torrecilla, Javier Caminero
Automatic generation of context-dependent pronunciations
Mosur Ravishankar, Maxine Eskenazi
Automatic generation of a pronunciation dictionary based on a pronunciation network
Toshiaki Fukada, Yoshinori Sagisaka
What is wrong with the lexicon - an attempt to model pronunciations probabilistically
Uwe Jost, Henrik Heine, Gunnar Evermann
Lexical tuning based on triphone confidence estimation
Kevin L. Markey, Wayne Ward
Improving of amplitude modulation maps for F0-dependent segregation of harmonic sounds
Frédéric Berthommier, Georg Meyer
Psychophysical evaluation of PSOLA: natural versus synthetic speech
Reinier Kortekaas, Armin Kohlrausch
Perception of noised words by normal children and children with speech and language impairments
Valentina V. Lublinskaja, Inna V. Koroleva, A.N. Kornev, Elena V. Iagounova
Modelling the perception of simultaneous semi-vowels
Georg F. Meyer, William A. Ainsworth
Properties of auditory model representations
Fernando S. Perdigao, Luis V. Sa
Impact of "ascending sequence" AI (auditory primary cortex) cells on stop consonant perception
Eduardo Sa Marta, Luis Vieira de Sa
Combinatorial issues in text-to-speech synthesis
Jan P. H. van Santen
Application-dependent prosodic models for text-to-speech synthesis and automatic design of learning database corpus using genetic algorithm
Olivier Boeffard, F. Emerard
Automatic corpus-based training of rules for prosodic generation in text-to-speech
Eduardo Lopez-Gonzalo, Jose M. Rodriguez-Garcia, Luis Hernandez-Gomez, Juan M. Villar
Hidden Markov model based voice conversion using dynamic characteristics of speaker
Eun-Kyoung Kim, Sangho Lee, Yung-Hwan Oh
Speaker interpolation in HMM-based speech synthesis system
Takayoshi Yoshimura, Takashi Masuko, Keiichi Tokuda, Takao Kobayashi, Tadashi Kitamura
Designing a speaker adaptable formant-based text-to-speech system
Vassilios Darsinos, Dimitrios Galanis, George Kokkinakis
On using fractal features of speech sounds in automatic speech recognition
Petros Maragos, Alexandros Potamianos
Dynamic constraint weighting in the context of articulatory parameter estimation
Hywel B. Richards, John S. Bridle, Melvyn J. Hunt, John S. Mason
Estimation of vocal tract front cavity resonance in unvoiced fricative speech
Minkyu Lee, Donald G. Childers
A software tool to study Portuguese vowels
Antonio Teixeira, Francisco Vaz, Jose Carlos Principe
Post-synchronization via formant-to-area mapping of asynchronously recorded speech signals and area functions
Jean Schoentgen, Sorin Ciocea
Geometrically and acoustically optimized codebook for unique mapping from formants to vocal-tract shape
Zhenli L. Yu, P.C. Ching
Modeling segmental duration with multivariate adaptive regression splines
Marcel Riedi
High-quality speech synthesis for phonetic speech segmentation
Fabrice Malfrere, Thierry Dutoit
Factors affecting perceived quality and intelligibility in the CHATR concatenative speech synthesiser
Nick Campbell, Yoshiharu Itoh, Wen Ding, Norio Higuchi
Reduced lexicon trees for decoding in a MMI-connectionist/HMM speech recognition system
Christoph Neukirchen, Daniel Willett, Gerhard Rigoll
A stochastic model of intonation for French text-to-speech synthesis
Jean Veronis, Philippe Di Cristo, Fabienne Courtois, Benoit Lagrue
Phonetic rules for a phonetic-to-speech system
Angelien A. Sanderman, René Collier
Multi-lingual duration modeling
Jan van Santen, Chilin Shih, Bernd Möbius, Evelyne Tzoukermann, Michael Tanenblatt
A model of segment (and pause) duration generation for Brazilian Portuguese text-to-speech synthesis
Plinio A. Barbosa
Parsing strategy for spoken language interfaces with a lexicalized tree grammar
Ariane Halber, David Roussel
What's in a word graph evaluation and enhancement of word lattices?
Jan W. Amtrup, Henrik Heine, Uwe Jost
Accelerated DP based search for statistical translation
C. Tillmann, S. Vogel, Hermann Ney, A. Zubiaga, H. Sawaf
Use of pitch pattern improvement in the CHATR speech synthesis system
Ken Fujisawa, Toshio Hirai, Norio Higuchi
Generating segment durations in a text-to-speech system: a hybrid rule-based/neural network approach
G. Corrigan, N. Massey, O. Karaali
On the global F0 shape model using a transition network for Japanese text-to-speech systems
Yasushi Ishikawa, Takashi Ebihara
An alternative and flexible approach in robust information retrieval systems
José Colás, Juan M. Montero, Javier Ferreiros, José M. Pardo
A probabilistic approach to analogical speech translation
Keiko Horiguchi, Alexander Franz
Dynamic lexicon for a very large vocabulary vocal dictation
Marie-José Caraty, Claude Montacié, Fabrice Lefèvre
Construction of language models using the morphic generator grammatical inference (MGGI) methodology
E. Segarra, L. Hurtado
An integrated language modeling with n-gram model and WA model for speech recognition
Shuwu Zhang, Taiyi Huang
Statistical analysis of dialogue structure
Ye-Yi Wang, Alex Waibel
Statistical language modeling using the CMU-Cambridge toolkit
Philip Clarkson, Ronald Rosenfeld
Text normalization and speech recognition in French
Gilles Adda, Martine Adda-Decker, Jean-Luc Gauvain, Lori Lamel
A novel tree-based clustering algorithm for statistical language modeling
G. Damnati, J. Simonin
Variable-length language modeling integrating global constraints
Shoichi Matsunaga, Shigeki Sagayama
An hybrid language model for a continuous dictation prototype
K. Smaili, I. Zitouni, F. Charpillet, Jean-Paul Haton
Dealing with pronunciation variants at the language model level for the continuous automatic speech recognition of French
Guy Pérennou, L. Pousse
Rational interpolation of maximum likelihood predictors in stochastic language modeling
Ernst Günter Schukat-Talamazzini, Florian Gallwit, Stefan Harbeck, Volker Warnke
N-gram language model adaptation using small corpus for spoken dialog recognition
Akinori Ito, Hideyuki Saitoh, Masaharu Katoh, Masaki Kohda
Variable n-gram language modeling and extensions for conversational speech
Manhung Siu, Mari Ostendorf
Fuzzy class rescoring: a part-of-speech language model
Petra Geutner
Speech understanding based on integrating concepts by conceptual dependency
Akito Nagai, Yasushi Ishikawa
Dynamic language models for interactive speech applications
Fabio Brugnara, Marcello Federico
Large-scale lexical semantics for speech recognition support
George Demetriou, Eric Atwell, Clive Souter
Integration of grammar and statistical language constraints for partial word-sequence recognition
Hajime Tsukada, Hirofumi Yamamoto, Yoshinori Sagisaka
Using intonation to constrain language models in speech recognition
Paul Taylor, Simon King, Stephen Isard, Helen Wright, Jacqueline Kowtko
Incorporating POS tagging into language modeling
Peter A. Heeman, James F. Allen
Confidence metrics based on n-gram language model backoff behaviors
C. Uhrik, W. Ward
Structure and performance of a dependency language model
Ciprian Chelba, David Engle, Frederick Jelinek, Victor Jimenez, Sanjeev Khudanpur, Lidia Mangu, Harry Printz, Eric Ristad, Ronald Rosenfeld, Andreas Stolcke, Dekai Wu
Modeling linguistic segment and turn boundaries for n-best rescoring of spontaneous speech
Andreas Stolcke
Hybrid language models: is simpler better?
P. E. Kenne, Mary O'Kane
Internal and external tagsets in part-of-speech tagging
Thorsten Brants
A probabilistic model of double-vowel segregation
Laurent Varin, Frédéric Berthommier
Stimulus signal estimation from auditory-neural transduction inverse processing
Habibzadeh V. Houshang, Kitazawa Shigeyoshi
FDVQ based keyword spotter which incorporates a semi-supervised learning for primary processing
Chakib Tadj, Pierre Dumouchel, Franck Poirier
The initial time span of auditory processing used for speaker attribution of the speech signal
V. V. Lublinskaja, Christian Sappok
Sparse connection and pruning in large dynamic artificial neural networks
Nikko Ström
A modular initialization scheme for better speech recognition performance using hybrid systems of MLPs/HMMs
Roxana Teodorescu, Dirk Van Compernolle, Ioannis Dologlou
Lateralization for auditory perception of foreign words
Tatiana V. Chernigovskaya
The structural weighted sets method for continuous speech and text recognition
Yuri Kosarev, Pavel Jarov, Alexander Osipov
Lateral inhibitory networks for auditory processing
C. J. Sumner, D. F. Gillies
Missing fundamentals: a problem of auditory or mental processing?
Henning Reetz
Predictive neural networks applied to phoneme recognition
F. Freitag, E. Monte, J. Salavedra
Empirical comparison of two multilayer perceptron-based keyword speech recognition algorithms
Suhardi, Klaus Fellbaum
Segment boundary estimation using recurrent neural networks
Toshiaki Fukada, Sophie Aveline, Mike Schuster, Yoshinori Sagisaka
Incorporation of HMM output constraints in hybrid NN/HMM systems during training
Mike Schuster
Principles of the hearing periphery functioning in new methods of pitch detection and speech enhancement
Ludmila Babkina, Sergey Koval, Alexander Molchanov
The locus of the syllable effect: prelexical or lexical?
Christine Meunier, Alain Content, Uli H. Frauenfelder, Ruth Kearns
On not remembering disfluencies
Robin J. Lickley, Ellen G. Bard
Using an auditory model and leaky autocorrelators to tune in to speech
T. Andringa