doi: 10.21437/Eurospeech.1999
Putting language into language modeling
Frederick Jelinek, Ciprian Chelba
The controversial connection between speech production and perception: theories vs. facts
Mária Gósy
Multimedia interaction for the new millennium
Mark T. Maybury
How speech works - questions and preliminary answers
Björn Lindblom
Maximum a posterior linear regression with elliptically symmetric matrix variate priors
Wu Chou
A MAP-like weighting scheme for MLLR speaker adaptation
Silke Goronzy, Ralf Kompe
HMM adaptation for telephone applications
Hans-Günter Hirsch
A study of adaptation techniques on a voicemail transcription task
Jing Huang, Mukund Padmanabhan
Maximum likelihood sequential adaptation
Beth Logan
The relationship between utterance type and F0 contour in German
Caren Brinckmann, Ralf Benzmüller
A contrastive investigation of discourse intonational characteristic features of Sofia Bulgarian and Hamburg German in Map Task dialogues
Evelina Grigorova, Vladimir Filipov, Bistra Andreeva
Prosodic correlates of information structure in Swedish human-human dialogues
Merle Horne, Petra Hansson, Gösta Bruce, Johan Frid
Paralinguistic features as suprasegmental acoustics observed in natural Japanese dialogue
S. Kitazawa, S. Kobayashi
Integrating prosodic features in dialogue understanding
Masafumi Tamoto, Masahito Kawamori, Takeshi Kawabata
A high-level approach to confidence estimation in speech recognition
Stephen Cox, Srinandan Dasmahapatra
Utterance verification using modified segmental probability model
Bin Jia, Xiaoyan Zhu, Yupin Luo, Dongcheng Hu
OOV-detection in large vocabulary system using automatically defined word-fragments as fillers
Dietrich Klakow, Georg Rose, Xavier Aubert
Use of recursive mumble models for confidence measuring
Qiguang Lin, David Lubensky, Salim Roukos
Utterance verification for the numeric language in a natural spoken dialogue
Mazin Rahim
Robust energy normalization using speech/nonspeech discriminator for German connected digit recognition
Rathinavelu Chengalvarayan
Acoustic pre-processing for optimal effectivity of missing feature theory
Johan de Veth, Bert Cranen, Febe de Wet, Louis Boves
Simultaneous recognition of multiple sound sources based on 3-d n-best search using microphone array
Panikos Heracleous, Takeshi Yamada, Satoshi Nakamura, Kiyohiro Shikano
Down-sampling speech representation in ASR
Hynek Hermansky, Pratibha Jain
Comparison of time & frequency filtering and cepstral-time matrix approaches in ASR
Dusan Macho, Climent Nadeu, Peter Jancovic, Gregor Rozinaj, Javier Hernando
Syllable onset detection applied to the Portuguese language
Hugo Meinedo, Joao P. Neto, Luis B. Almeida
Decorrelated and liftered filter-bank energies for robust speech recognition
Kuldip K. Paliwal
Optimization algorithms for estimating modulation spectrum domain filters
Pau Paches-Leal, Richard C. Rose, Climent Nadeu
Efficient vector quantization using an n-path binary tree search algorithm
R. San-Segundo, R. Córdoba, J. Ferreiros, A. Gallardo, J. Colás, J. Pastor, Y. López
Neural network based optimal feature extraction for ASR
Narada D. Warakagoda, Magne H. Johnsen
A study of speech recognition for the elderly
Fumihiro Yato, Naomi Inoue, Kazuo Hashimoto
The analysis and application of a new endpoint detection method based on distance of autocorrelated similarity
Jie Zhu, Fei-li Chen
Hyper-articulated speech: auditory and visual intelligibility
Denis Beautemps, Pascal Borel, Sébastien Manolios
Modeling of the vocal tract in three dimensions
Olov Engwall
Articulatory reduction in emotional speech
Miriam Kienast, Astrid Paeschke, Walter Sendlmeier
A trajectory formation model of articulatory movements using a multidimensional phonemic task
Tokihiko Kaburagi, Masaaki Honda, Takeshi Okadome
LPC-based inversion of the DRM articulatory model
Sacha Krstulovic
A vocal tract model using multi-line equivalent circuits
Nobuhiro Miki, Thoru Yokoyama, Takeshi Ohtani, Shinobu Masaki, Ikuhiro Shimada, Ichiro Fujimoto, Yuji Nakamura
Acoustic nature of the whisper
Masahiro Matsuda, Hideki Kasuya
Relations between utterance speed and articulatory movements
Takeshi Okadome, Tokihiko Kaburagi, Masaaki Honda
Design of hypercube codebooks for the acoustic-to-articulatory inversion respecting the non-linearities of the articulatory-to-acoustic mapping
Slim Ouni, Yves Laprie
A missing-word test comparison of human and statistical language model performance
Marie Owens, Anja Krüger, Paul Donnelly, F J Smith, Ji Ming
Estimating velum height from acoustics during continuous speech
Korin Richmond
On improving the decision algorithm for articulatory codebook search
C. Silva, S. Chennoukh, Isabel Trancoso
Extraction of articulators in x-ray image sequences
G. Thimm, J. Luettin
Effects of source-tract interaction in perception of nasality
António Teixeira, Francisco Vaz, José Carlos Príncipe
Perceiving anticipatory phonetic gestures in French
Béatrice Vaxelaire, Rudolph Sock, Véronique Hecker
Motor equivalence evidenced by articulatory modelling
Anne Vilain, Christian Abry, Pierre Badin
Using likelihood ratios to perform utterance verification in automatic pronunciation assessment
Febe de Wet, Catia Cucchiarini, Helmer Strik, Lou Boves
A system for learning the pronunciation of Japanese pitch accent
Goh Kawai, Carlos Toshinori Ishi
Computer-aided spoken-language training with enhanced visual and auditory feedback
Jan Nouza
An effective scoring method for speaking skill evaluation system
Zhanjiang Song, Fang Zheng, Mingxing Xu, Wenhu Wu
Vocal synthesis in a computerized dictation exercise
Conception Santiago-Oriola
On deriving rules for nativised pronunciation in navigation queries
Isabel Trancoso, Céu Viana, Isabel Mascarenhas, Carlos Teixeira
Pronouncing unknown words using multi-dimensional analogies
François Yvon
Characteristics features of planning of speech and production of secondary schoolchildren's spontaneous speech
Maria Laczko
Discounted likelihood linear regression for rapid adaptation
William Byrne, Asela Gunawardana
Extraction of reliable transformation parameters for unsupervised speaker adaptation
Jen-Tzung Chien, Jean-Claude Junqua, Philippe Gelin
Maximum a posteriori linear regression for hidden Markov model adaptation
Cristina Chesta, Olivier Siohan, Chin-Hui Lee
On combining vocal tract length normalisation and speaker adaptation for noise robust speech recognition
Ramalingam Hariharan, Olli Viikki
Speaker adaptation using regularization and network adaptation for hybrid MMI-NN/HMM speech recognition
Jörg Rottland, Christoph Neukirchen, Daniel Willett, Gerhard Rigoll
Prosodic phrasing and accentuation in speech production of patients with right hemisphere lesions
Kai Alter, Annett Schirmer, Sonja A. Kotz, Angela D. Friederici
A real-time filled pause detection system for spontaneous speech recognition
Masataka Goto, Katunobu Itou, Satoru Hayamizu
Prosodic word boundary detection using mora transition modeling of fundamental frequency contours -speaker independent experiments-
Koji Iwano
Integrating multiple knowledge sources for word hypotheses graph interpretation
Volker Warnke, Florian Gallwitz, Anton Batliner, Jan Buckow, R. Huber, Elmar Nöth, A. Höthker
Prosodic correlates of interruptions in spoken dialogue
Li-chiung Yang
Dialog analysis in the Carnegie Mellon Communicator
Paul C. Constantinides, Alexander I. Rudnicky
NIST's 1998 topic detection and tracking evaluation (TDT2)
Jon Fiscus, George Doddington, John Garofolo, Alvin Martin
Comparative evaluation of six German TTS systems
Gerit P. Sonntag, Thomas Portele, Felicitas Haas, Joachim Köhler
Standardisation of ergonomic assessment of speech communication
Herman J. M. Steeneken
Evaluation of affiliation in interaction with autonomous creatures
Noriko Suzuki, Yugo Takeuchi, Kazuo Ishii, Michio Okada
Accurate recognition of city names with spelling as a fall back strategy
Josef G. Bauer, Jochen Junkawitsch
Selective prosodic post-processing for improving recognition of French telephone numbers
Katarina Bartkova, Denis Jouvet
Improving rejection with semantic slot-based confidence scores
Eric I. Chang
The IBM conversational telephony system for financial applications
K. Davies, R. Donovan, M. Epstein, Martin Franz, Abraham Ittycheriah, E. E. Jan, J. M. LeRoux, David Lubensky, Chalapathy Neti, Mukund Padmanabhan, K. Papineni, Salim Roukos, A. Sakrajda, Jeffrey S. Sorensen, B. Tydlitat, T. Ward
Error spotting using syllabic fillers in spontaneous conversational speech recognition
Rachida El Méliani, Douglas O’Shaughnessy
Recognition of spelled names over the telephone and rejection of data out of the spelling lexicon
Denis Jouvet, Jean Monné
An utterance verification system based on subword modeling for a vocabulary independent speech recognition system
Myoung-Wan Koo, Sun-Jeong Lee
Use of a confidence measure based on frame level likelihood ratios for the rejection of incorrect data
Nicolas Moreau, Denis Jouvet
Variable preselection list length estimation using neural networks in a telephone speech hypothesis-verification system
J. Macías-Guarasa, J. Ferreiros, A. Gallardo, R. San-Segundo, Juan Manuel Pardo, L. Villarrubia
Speaker normalization and pronunciation variant modeling: helpful methods for improving recognition of fast speech
Thilo Pfau, Robert Faltlhauser, Günther Ruske
Automatic speech recognition using acoustic confidence conditioned language models
Richard C. Rose, Giuseppe Riccardi
Utilizing prosody for unconstrained morpheme recognition
Volker Strom, Henrik Heine
Modeling the prosody of hidden events for improved word recognition
Andreas Stolcke, Elizabeth Shriberg, Dilek Hakkani-Tür, Gökhan Tür
A comparison of word graph and n-best list based confidence measures
Frank Wessel, Klaus Macherey, Hermann Ney
C++ software environment for speech signal processing
Marcus M. Prätzas, Ulrich Balss, Herbert Reininger, Harald Wüst
Improvement of electrolaryngeal speech by introducing normal excitation information
Kun Ma, Pelin Demirel, Carol Espy-Wilson, Joel MacAuslan
Detecting user speech in barge-in over prompts using speaker identification methods
Abraham Ittycheriah, Richard J. Mammone
Speaker and channel-normalized set of formant parameters for telephone speech recognition
Boris Lobanov, T. Levkovskaya, Igor E. Kheidorov
Fuzzy segmentation of lip image using cluster analysis
Alan W.C. Liew, K. L. Sum, S. H. Leung, Wai H. Lau
Software to support research and development of spoken dialogue systems
Michael F. McTear
Analysis of sources of variability in speech
Sachin Kajarekar, Narendranath Malayath, Hynek Hermansky
Adaptive nonlinear prediction based on order statistics for speech signals
Tetsuya Shimamura, Haruko Hayakawa
Developing a voiced information retrieval system for the Portuguese language capable to handle both Brazilian and Portuguese spoken versions
M. N. Souza, E. J. Caprini, C. G. Machado, M. V. Ludolf, L. P. Calôba, J. M. Seixas, F. G. Resende, S. L. Netto, Diamantino R. Freitas, Joao Paulo Teixeira, C. Espain, V. Pera, F. Moreira
Real-time speech modeling using computationally efficient locally recurrent neural networks (CERNs)
John J. Soraghan, Amir Hussain, Ivy Shim
Effectiveness of KL-transformation in spectral delta expansion
M. Tokuhira, Y. Ariki
Evaluation of confidence measures for language identification
Kay Berkling, Douglas A. Reynolds, Marc Zissman
Chinese dialect identification using an acoustic-phonotactic model
Wuei-He Tsai, Wen-Whei Chang
Language identification from prosody without explicit features
Fred Cummins, Felix Gers, Jürgen Schmidhuber
Multigrams for language identification
Stefan Harbeck, Uwe Ohler
The use of 'rare' segments for language identification
Jean-Marie Hombert, Ian Maddieson
Spoken language identification utilizing fundamental frequency and cepstra
Shuichi Itahashi, Toshikazu Kiuchi, Mikio Yamamoto
Comparing different model configurations for language identification using a phonotactic approach
D. Matrouf, Martine Adda-Decker, Jean-Luc Gauvain, Lori Lamel
Human language identification with reduced spectral information
K. Mori, N. Toba, T. Harada, T. Arai, M. Komatsu, M. Aoyagi, Y. Murahara
Prosody as a distinctive feature for the discrimination of Arabic dialects
Melissa Barkat, John Ohala, François Pellegrino
Comparison of two phonetic approaches to language identification
François Pellegrino, Jérôme Farinas, Régine André-Obrecht
The effects of speaker training on ASR accuracy
Stephen Anderson, Natalie Liberman, Larry Gillick, Stephen Foster, Sahoko Hama
Creating hidden Markov models for fast speech by optimized clustering
Robert Faltlhauser, Thilo Pfau, Günther Ruske
Improvements on speech recognition for fast talkers
M. Richardson, M. Hwang, Alex Acero, Xuedong Huang
Modeling the rate of speech by Markov processes on curves
Lawrence Saul, Mazin Rahim
Modelling speaking rate using a between frame distance metric
Andreas Tuerk, Steve Young
Vocal registers revisited
Gerrit Bloothooft, Peter Pabon
Pseudo affine projection algorithm: new solution for adaptive identification
Franck Bouteille, Pascal Scalart, Michel Corazza
Acoustic analysis of a speech corpus of European Portuguese fricative consonants
Luis M. T. Jesus, Christine H. Shadle
Acoustic characteristics of plosives in consonant-consonant sequences at word boundaries
Natalia Petlyuchenko
Effects of stress and lexical structure on speech efficiency
Rob J. J. H. van Son, Louis C. W. Pols
A two-stage speech recognition method with an error correction model
Yoshiharu Abe, Hiroyasu Itsui, Yuzo Maruta, Kunio Nakajima
Speech recognition with automatic punctuation
C. Julian Chen
Automatic modeling of pronunciation variations
Ellen Eide
Reducing search complexity in low perplexity tasks
Martin Franz, Miroslav Novak
A two-stage speech recognition method for information retrieval applications
Paolo Coletti, Marcello Federico
Multi-level decision trees for static and dynamic pronunciation models
Eric Fosler-Lussier
Modeling and efficient decoding of large vocabulary conversational speech
Michael Finke, Jürgen Fritsch, Detlef Koll, Alex Waibel
Evaluation of a segmentation system based on multi-level lattices
Jean-Luc Husson
The application of an improved DP match for automatic lexicon generation
Philip Hanna, Darryl Stewart, Ji Ming
Modeling trajectories in the HMM framework
Rukmini Iyer, Owen Kimball, Herbert Gish
Korean large vocabulary continuous speech recognition using pseudomorpheme units
Oh-Wook Kwon, Kyuwoong Hwang, Jun Park
Navigating German cities by spontaneous French queries
Harouna Kabré, Alexander Waibel
Generating alternative pronunciations from a dictionary
Filipp Korkmazskiy, Chin-Hui Lee
Finding consensus among words: lattice-based word error minimization
Lidia Mangu, Eric Brill, Andreas Stolcke
An efficient decoding method for real time speech recognition
Stefan Ortmanns, Wolfgang Reichl, Wu Chou
Recent improvements in voicemail transcription
Mukund Padmanabhan, G. Saon, S. Basu, Jing Huang, Geoffrey Zweig
Acoustics-based baseform generation with pronunciation and/or phonotactic models
Bhuvana Ramabhadran, Sabine Deligne, Abraham Ittycheriah
Improving recognition correct rate of important words in large vocabulary speech recognition
Yasuo Shirosaki, Hideaki Kikuchi, Katsuhiko Shirai
Pronunciation modeling by sharing Gaussian densities across phonetic models
Murat Saraclar, Harriet Nock, Sanjeev Khudanpur
One pass cross word decoding for large vocabularies based on a lexical tree search organization
Xavier L. Aubert
Automatic annotation and classification of phrase accents in spontaneous speech
Anton Batliner, M. Nutt, Volker Warnke, Elmar Nöth, Jan Buckow, R. Huber, Heinrich Niemann
Prosody recognition from speech utterances using acoustic and linguistic based models of prosodic events
Alistair Conkie, Giuseppe Riccardi, Richard C. Rose
A comparison between syntactic and prosodic phrasing
Marcus L. Fach
The prosody of left-dislocated topic constituents in Italian read speech
Barbara Gili Fivela
Semantic boundaries in multiple languages
Jürgen Haas, Volker Warnke, Heinrich Niemann, M. Cettolo, A. Corazza, D. Falavigna, G. Lazzari
Prosodic phrasing in Korean, determine governor, and then split or not
Yeon-Jun Kim, Heo-Jin Byeon, Yung-Hwan Oh
Linear prediction coding of individual pitch accent shapes
Joachim J. Mersdorf, Kai U. Schmidt, Stefanie Köster
Prominence variation beyond given/new
Christine H. Nakatani
Acoustical features as predictors for prominence in read aloud Dutch sentences used in ANN's
Barbertje M. Streefkerk, Louis C. W. Pols, Louis F. M. ten Bosch
Parallelism, coherence, and contrastive accent
Mariet Theune
A phonetically-guided diagnosis of auditory deficiency based on synthetic speech stimuli
Anne Bonneau, Parham Mokhtari
On the selection of meaningful speech parameters used by a pathologic/non pathologic voice register classifier
Juan I. Godino-Llorente, Santiago Aguilera-Navarro, Carlos Hernández-Espinosa, Mercedes Fernández-Redondo, Pedro Gómez-Vilda
On-line captioning of TV-programs for the hearing impaired
Erik Harborg, Trym Holter, Magne Hallstein Johnsen, Torbjørn Svendsen
Classification of pathological voice into normal/benign/malignant state
Cheol-Woo Jo, Dae-Hyun Kim
Cognitive experiments on timing lag for superimposing closed captions
Ichiro Maruyama, Yoshiharu Abe, Eiji Sawamura, Tetsuo Mitsuhashi, Terumasa Ehara, Katsuhiko Shirai
Speaker normalization for audio-visual articulation training
Marcel Ogner, Zdravko Kacic
Vowel production in aphasia
Tatjana Prizl-Jakovac
Towards a global optimization scheme for multi-band speech recognition
Christophe Cerisara, Jean-Paul Haton, Dominique Fohr
Multi-stream speech recognition: ready for prime time?
Adam Janin, Dan Ellis, Nelson Morgan
Sooner or later: exploring asynchrony in multi-band speech recognition
Nikki Mirghafori, Nelson Morgan
The full combination sub-bands approach to noise robust HMM/ANN based ASR
Andrew Morris, Astrid Hagen, Hervé Bourlard
A recombination strategy for multi-band speech recognition based on mutual information criterion
Shigeki Okawa, Takehiro Nakajima, Katsuhiko Shirai
Rapid unit selection from a large speech corpus for concatenative speech synthesis
Mark Beutnagel, Mehryar Mohri, Michael Riley
Objective distance measures for assessing concatenative speech synthesis
Jing-Dong Chen, Nick Campbell
Word and syllable concatenation in text-to-speech synthesis
Eric Lewis, Mark Tatham
Synthesis by word concatenation
Karlheinz Stöber, Thomas Portele, Petra Wagner, Wolfgang Hess
Speech synthesis by phonological structure matching
Paul Taylor, Alan W. Black
The implementation of a European masters in language and speech
Gerrit Bloothooft
The interactive auditory demonstrations project
Martin Cooke, Helen Parker, Guy J. Brown, Stuart N. Wrigley
Curricula and courseware in spoken language engineering in Europe: a critical appraisal
Michael F. McTear
An interactive tutorial on text-to-speech synthesis from diphones in time domain
Rüdiger Hoffmann, Bettina Ketzmerick, Ulrich Kordon, Steffen Kürbis
Evaluating the dialogue component in the GULAN educational system
Pernilla Qvarfordt, Arne Jönsson
The Philips/RWTH system for transcription of broadcast news
Peter Beyerlein, Xavier Aubert, Reinhold Haeb-Umbach, Matthew Harris, Dietrich Klakow, A. Wendemuth, Sirko Molau, Michael Pitz, A. Sixtus
Toward realtime transcription of broadcast news
Jason Davenport, Long Nguyen, Spyros Matsoukas, Richard Schwartz, John Makhoul
Recent advances in transcribing television and radio broadcasts
Jean-Luc Gauvain, Lori Lamel, Gilles Adda, Michèle Jardino
Selection for acoustic coverage from unlimited speech extracted from closed-captioned TV
Photina Jaeyun Jang, Alexander G. Hauptmann
Laughter extracted from television closed captions as speech recognizer training data
Paul E. Kennedy, Alexander G. Hauptmann
Further advances in transcription of broadcast news
Long Nguyen, Spyros Matsoukas, Jason Davenport, Daben Liu, Jay Billa, Francis Kubala, John Makhoul
Recent advances in Japanese broadcast news transcription
Katsutoshi Ohtsuki, Sadaoki Furui, Naoyuki Sakurai, Atsushi Iwasaki, Zhi-Peng Zhang
Automatic verification of broadcast news transcriptions
Michael Pitz, Sirko Molau
Improved speaker segmentation and segments clustering using the Bayesian information criterion
Alain Tritschler, Ramesh A. Gopinath
Development of the 1998 OGI-FONIX broadcast news transcription system
Xintian Wu, Yonghong Yan
Speech/music discrimination based on posterior probability features
Gethin Williams, Daniel P.W. Ellis
Dragon systems' 1998 broadcast news transcription system
Steven Wegmann, Puming Zhan, Ira Carp, Michael Newman, Jon Yamron, Larry Gillick
Progress in automatic meeting transcription
Hua Yu, Michael Finke, Alex Waibel
A study of broadcast news audio stream segmentation and segment clustering
Matthew Harris, Xavier Aubert, Reinhold Haeb-Umbach, Peter Beyerlein
Fast speaker change detection for broadcast news transcription and indexing
Daben Liu, Francis Kubala
Robust information extraction from spoken language data
David D. Palmer, Mari Ostendorf, John D. Burger
Integrated transcription and identification of named entities in broadcast speech
Steve Renals, Yoshihiko Gotoh
Improvements in accuracy and speed in the HTK broadcast news transcription system
P. C. Woodland, J. J. Odell, T. Hain, G. L. Moore, T. R. Niesler, Andreas Tuerk, E. W. D. Whittaker
A comparative study of HMM-based approaches for the automatic recognition of perceptually relevant aspects of spontaneous German speech melody
Christel Brindöpke, Gernot A. Fink, Franz Kummert
Modelling intonational phrase structure with artificial neural networks
Grazyna Demenko, Wiktor Jassem
Effects of articulation rate on duration in read French speech
Danielle Duez
A semi automatic method for the characterization of Spanish intonation contours
Jorge A. Gurlekian, Marcela Leticia Riccillo, Alejandro Renato, Jose Alvarez
Towards recognizing "non-lexical" words in spontaneous conversational speech
Hesham Tolba, Douglas O'Shaughnessy
A new F0 contour control method based on vector representation of F0 contour
Mitsuaki Isogai, Hideyuki Mizuno
Developing the database of the spontaneous speech prosody characteristics
Jana Kleckova
A method for the analysis of prosodic registers
Gregor Möhler, Jörg Mayer
Whole tunes, nuclear and pre-nuclear patterns and prosodic features in the perception of interrogativity and non-finality in Dutch
Natalia Smirnova
Prosodic modeling of Mandarin speech and its application to lexical decoding
Wern-Jun Wang, Yuan-Fu Liao, Sin-Horng Chen
Modeling carryover and anticipation effects for Chinese tone recognition
Jin-Song Zhang, Hiromichi Kawanami
Experimental evaluation of text-independent speaker verification on laboratory and field test databases in the M2VTS project
Laurent Besacier, J. Luettin, G. Maitre, E. Meurville
Channel estimation and normalization by coherent spectral averaging for robust speaker verification
Rajesh Balchandran, Vidhya Ramanujam, Richard J. Mammone
Time-frequency principal components of speech: application to speaker identification
Ivan Magrin-Chagnolleau, Geoffrey Durou
Speaker recognition by means of a combination of linear and nonlinear predictive models
Marcos Faúndez-Zanuy
Feature vector transformation using independent component analysis and its application to speaker identification
Gil-Jin Jang, Seong-Jin Yun, Yung-Hwan Oh
The prototype model in speaker identification
Yizhar Lavner, Judith Rosenhouse, Isak Gath
A new cepstrum-based channel compensation method for speaker verification
T. F. Lo, M. W. Mak, K. K. Yiu
Speaker recognition based on discriminative feature extraction - optimization of mel-cepstral features using second-order all-pass warping function
Chiyomi Miyajima, Hideyuki Watanabe, Tadashi Kitamura, Shigeru Katagiri
Facing severe channel variability in forensic speaker verification conditions
Javier Ortega-Garcia, Santiago Cruz-Llanas, Joaquin Gonzalez-Rodriguez
Speaker and language recognition using speech codec parameters
Thomas F. Quatieri, E. Singer, R. B. Dunn, Douglas A. Reynolds, J. P. Campbell
Robust speaker verification in noisy conditions by modification of spectral time trajectories
Vidhya Ramanujam, Rajesh Balchandran, Richard J. Mammone
Toward parametric representation of speech for speaker recognition systems
Rivarol Vergin, Douglas O'Shaughnessy, Pierre Dumouchel
Text independent speaker identification using LSP codebook speaker models and linear discriminant functions
R. D. Zilca, Y. Bistritz
Development of the Philips 1999 Taiwan Mandarin benchmark system
Chiwei Che, Nick Wang, Max Huang, Hank Huang, Frank Seide
The AT&T large vocabulary conversational speech recognition system
Andrej Ljolje, Michael D. Riley, Donald M. Hindle
Integrated context-dependent networks in very large vocabulary speech recognition
Mehryar Mohri, Michael Riley
Mandarin large vocabulary speech recognition using the GlobalPhone database
J. Reichert, Tanja Schultz, Alex Waibel
Easytalk: a large-vocabulary speaker-independent Chinese dictation machine
Fang Zheng, Zhanjiang Song, Mingxing Xu, Jian Wu, Yinfei Huang, Wenhu Wu, Cheng Bi
Synthesis of regional English using a keyword lexicon
Susan Fitt, Stephen Isard
Speaker conversion through non-linear frequency warping of STRAIGHT spectrum
Noriyasu Maeda, Banno Hideki, Shoji Kajita, Kazuya Takeda, Fumitada Itakura
User attitudes to concatenated natural speech and text-to-speech synthesis in an automated information service
F. R. McInnes, D. J. Attwater, Michael D. Edgington, Mark S. Schmidt, Mervyn A. Jack
From multilingual to polyglot speech synthesis
Christof Traber, Karl Huber, Karim Nedir, Beat Pfister, Eric Keller, Brigitte Zellner
A Japanese text-to-speech system based on multi-form units with consideration of frequency distribution in Japanese
Kimihito Tanaka, Hideyuki Mizuno, Masanobu Abe, Shin'ya Nakajima
Automatic detection and correction of pronunciation errors for foreign language learners: the demosthenes application
G. Deville, O. Deroo, H. Leich, S. Gielen, J. Vanparys
User adaptation in the fluency pronunciation trainer
Maxine Eskenazi, Scott Hansma, John Corwin, Jordi Albornoz
Automatic detection of phone-level mispronunciation for language learning
Horacio Franco, Leonardo Neumeyer, María Ramos, Harry Bratt
Automatic localization and diagnosis of pronunciation errors for second-language learners of English
Daniel Herron, Wolfgang Menzel, Eric Atwell, Roberto Bisiani, Fabio Daneluzzi, Rachel Morton, Juergen A. Schmidt
SPECO - a multimedia multilingual teaching and training system for speech handicapped children
Klára Vicsi, Peter Roach, Annemarie Öster, Zdravko Kacic, P. Barczikay, I. Sinka
Recognition of continuous Persian speech using a medium-sized vocabulary speech corpus
S. M. Ahadi
Multi-lingual speech recognition based on demi-syllable subword units
Tibor Fegyó, Péter Tatai
MAP-based cross-language adaptation augmented by linguistic knowledge: from English to Chinese
Pascale Fung, Chi Yuen Ma, Wai Kat Liu
Analysis of HMM models in alphabet letters recognition
Stefan Grocholewski
Tone recognition of Chinese continuous speech using tone critical segments
Keikichi Hirose, Jin-song Zhang
Phonetic state tied-mixture tone modeling for large vocabulary continuous Mandarin speech recognition
Tai-Hsuan Ho, Chin-Jung Liu, Herman Sun, Ming-Yi Tsai, Lin-Shan Lee
The clustering algorithm for the definition of multilingual set of context dependent speech models
Bojan Imperl, Bogomir Horvat
Study on tone classification of Chinese continuous speech in speech recognition system
Jian Liu, Xiaodong He, Fuyuan Mo, Tiecheng Yu
Decision tree-based triphones are robust and practical for Mandarin speech recognition
Yi Liu, Pascale Fung
Decision trees for inter-word context dependencies in Spanish continuous speech recognition tasks
K. López de Ipiña, A. Varona, I. Torres, L. J. Rodríguez
End points detection for noisy speech using a wavelet based algorithm
Amin M. Nassar, Nemat S. Abdel Kader, Amr M. Refat
Adaptation of acoustic models for multilingual recognition
C. Nieuwoudt, E. C. Botha
Recognition of non-native German speech with multilingual recognizers
Ulla Uebler, Manuela Boros
Relational vs. object-oriented models for representing speech: a comparison using ANDOSL data
Toomas Altosaar, Bruce Millar, Martti Vainio
First experiences of the German SpeechDat-Car database collection in mobile environments
Christoph Draxler, Robert Grudszus, Stephan Euler, Klaus Bengler
OASIS - a framework for spoken language call steering
Mike Edgington, David Attwater, Peter Durston
VOCAPI - small standard API for command & control
Eike Gegenmantel
Standardised speech interfaces - key for objective evaluation of recognition accuracy
Christel Müller, Karsten Schröder
A medical rehabilitation diagnoses transcription method that integrates continuous and isolated word recognition
Shoichi Matsunaga, Yoshiaki Noda, Katsutoshi Ohtsuki, Eiji Doi, Tomio Itoh
Problems of creating a flexible e-mail reader for Hungarian
Géza Németh, Csaba Zainkó, Gábor Olaszy, Gábor Prószéky
Interactive, TTS supported speech message composer for large, limited vocabulary, but open information systems
Gábor Olaszy, Géza Németh, Péter Olaszi, Géza Gordos
ALE for speech: a translation prototype
Gerald Penn, Bob Carpenter
An integrated system for Spanish CSR tasks
L. J. Rodríguez, M. I. Torres, J. M. Alcaide, A. Varona, K. López de Ipiña, M. Penagarikano, G. Bordel
Use of speech synthesis in an application
Angelien Sanderman, Ellen Bosgoed, Hans de Graaff, Peter van Splunder
Text-to-audio-visual speech synthesis based on parameter generation from HMM
Masatsune Tamura, Shigekazu Kondo, Takashi Masuko, Takao Kobayashi
Authoring tools for speech synthesis using the SABLE markup standard
Johan Wouters, Brian Rundle, Michael W. Macon
Dynamic weighting of the distortion sequence in text-dependent speaker verification
A. M. Ariyaeeinia, P. Sivakumaran, M. Pawlewski, M. J. Loomes
On the use of supra model information from multiple classifiers for robust speaker identification
Hakan Altincay, Mübeccel Demirekler
Missing features detection and handling for robust speaker verification
Mounir El-Maliki, Andrzej Drygajlo
High performance text-independent speaker recognition system based on voiced/unvoiced segmentation and multiple neural nets
Nikos Fakotakis, John Sirigos, George Kokkinakis
Similarity normalization method based on world model and a posteriori probability for speaker verification
Corinne Fredouille, Jean-François Bonastre, Teva Merlin
Text-independent speaker verification using virtual speaker based cohort normalization
Toshihiro Isobe, Jun-ichi Takahashi
Robust person verification based on speech and facial images
J. Luettin, S. Ben-Yacoub
A neural network-based text-dependent speaker verification system using suprasegmental features
M. Mathew, B. Yegnanarayana, R. Sundar
Modelling output probability distributions for enhancing speaker recognition
Jason Pelecanos, Sridha Sridharan
On the use of neural networks to combine utterance and speaker verification systems in a text-dependent speaker verification task
L. Rodríguez-Linares, C. García-Mateo, J. L. Alba-Castro
Genesys: a neural network model for speaker identification
B. Ruiz-Mezcua, R. Rodríguez-Galán, Luis A. Hernández-Gómez, Paloma Domingo-García, Enrique Bailly-Baillicre Gutiérrez
Speaker verification with growing cell structures
Bogdan Sabac, Inge Gavat
Environment adaptation and long term parameters in speaker identification
Chakib Tadj, Pierre Dumouchel, Mohamed Mihoubi, Pierre Ouellet
Speaker identification using subband HMMs
K. Yoshida, K. Takagi, K. Ozeki
A priori threshold determination for phrase-prompted speaker verification
W. D. Zhang, K. K. Yiu, M. W. Mak, C. K. Li, M. X. He
Formant analysis and synthesis using hidden Markov models
Alex Acero
Accurate estimation of sinusoidal parameters in an harmonic+noise model for speech synthesis
Gérard Bailly
Modal synthesis and modeling of vowels
Unto K. Laine
Shape invariant pitch modification of speech using a harmonic model
Darragh O'Brien, Alex I. C. Monaghan
Interaction of units in a unit selection database
Mark Beutnagel, Alistair Conkie
Speech training for deaf and hearing-impaired people
Ramón García Gómez, Ricardo López Barquilla, José Ignacio Puertas Tera, José Parera Bermudez, Marie-Christine Haton, Jean-Paul Haton, Pierre Alinat, Sofia Moreno, Wolfgang Hess, Ma Araceli Sanchez Raya, Eduardo Alberto Martínez Gual, Juan Luis Navas-Chaveli Daza, Christophe Antoine, Marie-Madeleine Durel, Genevieve Maugin, Silke Hohmann
A post-processing of speech for hearing impaired integrate into standard digital audio decoders
Shinichi Hoshino, Itaru Kaneko, Hideaki Kikuchi, Katsuhiko Shirai
Effects of hoarseness on hypernasality ratings
Setsuko Imatomi, Takayuki Arai, Yuko Mimura, Masako Kato
Cross-language analysis of voice onset time in stuttered speech
N. Rezaei-Aghbash, S. P. Whiteside, P. A. Cudd
Experiments in constrained maximum likelihood extraction of temporal features for speech recognition
Gilles Boulianne, Julie Brousseau, Nathalie Talbot, Pierre Dumouchel
Model selection in acoustic modeling
S. S. Chen, Ramesh A. Gopinath
Acoustic modeling and language modeling for Cantonese LVCSR
Y. W. Wong, K. F. Chow, Wai H. Lau, W. K. Lo, Tan Lee, P. C. Ching
Context dependent hybrid HMM/ANN systems for large vocabulary continuous speech recognition system
O. Deroo, C. Ris, S. Dupont
Reduced Gaussian mixture models in a large vocabulary continuous speech recognizer
V. Fischer, T. Ross
Mixture trees - hierarchically tied mixture densities for modeling HMM emission probabilities
J. Fritsch
Reinforcement learning for phoneme recognition
Akira Ichikawa, Tomoyuki Shimizu, Yasuo Horiuchi
Combined temporal and spectral multi-resolution phonetic modelling
Paul McCourt, Naomi Harte, Saeed Vaseghi
Speed improvement of the time-asynchronous acoustic fast match
Miroslav Novak, Michael Picheny
A hybrid ANN/HMM syllable recognition module based on vowel spotting
John Sirigos, Nikos Fakotakis, George Kokkinakis
Data-driven modulation filter design under adverse acoustic conditions and using phonetic and syllabic units
Michael L. Shire
Accuracy versus complexity in context dependent phone modeling
Wei Xu, Jacques Duchateau, Kris Demuynck, Ioannis Dologlou, Patrick Wambacq, Dirk van Compernolle, Hugo van Hamme
A new hybrid structure of speech recognizer based on HMM and neural network
Jianlai Zhou, Xiaodong He, Tiecheng Yu, Fuyuan Mo
Dependency modeling with Bayesian networks in a voicemail transcription system
Geoffrey Zweig, Mukund Padmanabhan
A statistical coarticulatory model for the hidden vocal-tract-resonance dynamics
Li Deng, Jeff Ma
A study on the effect of adding new dimensions to trajectories in the acoustic space
D. Albesano, R. De Mori, R. Gemello, F. Mana
Tail distribution modelling using the Richter and power exponential distributions
M. J. F. Gales, P. A. Olsen
A study of duration in continuous speech recognition based on DDBHMM
Qingwei Zhao, Zuoying Wang, Dajin Lu
Comparison of continuous-density and semi-continuous HMM in isolated words recognition systems
T. Vaich, A. Cohen
A spoken dialog system for a mobile office robot
Hideki Asoh, Toshihiro Matsui, John Fry, Futoshi Asano, Satoru Hayamizu
Interaction with an animated agent in a spoken dialogue system
Linda Bell, Joakim Gustafson
Current practice in the development and evaluation of spoken language dialogue systems
Niels Ole Bernsen, Laila Dybkjaer, Ulrich Heid
The August spoken dialogue system
Joakim Gustafson, Nikolaj Lindberg, Magnus Lundeberg
An event-based dialogue model and its implementation in multidial2
Olivier Grisvard, Bertrand Gaiffe
LODESTAR: a Mandarin spoken dialogue system for travel information retrieval
Chao Huang, Peng Xu, Xin Zhang, Shubin Zhao, Taiyi Huang, Bo Xu
EUROPA: a generic framework for developing spoken dialogue systems
Munehiko Sasajima, Takehide Yano, Yasuyuki Kono
Handling rich turn-taking in spoken dialogue systems
Mikio Nakano, Kohji Dohsaka, Noboru Miyazaki, Jun-ichi Hirasawa, Masafumi Tamoto, Masahito Kawamori, Akira Sugiyama, Takeshi Kawabata
Thus spoke the user to the wizard
Hannes Pirker, Georg Loderer, Harald Trost
Automatic dialogue generator creates user defined applications
Andrew Pargellis, Hon-Kwang Jeff Kuo, Chin-Hui Lee
Flexible mixed-initiative dialogue for telephone services
José Relaño Gil, Daniel Tapias, Juan Manuel Villar-Navarro, Maria C. Gancedo, Luis A. Hernández-Gómez
User modelling in adaptive dialogue management
Gert Veldhuijzen van Zanten
Characterization of speech during imitation
Gal Ashour, Isak Gath
The analysis of speaker individual features based on autoregressive hidden Markov models
Evgeny I. Bovbel, Polina P. Tkachova, Igor E. Kheidorov
Detection of speaker changes in an audio document
Perrine Delacourt, David Kryze, Christian J. Wellekens
Dynamic test durations for text-independent speaker verification systems
Axel Glaeser
Combination of vector quantization and Gaussian mixture models for speaker verification with sparse training data
Guido Kolano, Peter Regel-Brietzmann
A language-independent personal voice controller with embedded speaker verification
Qi Li, Augustine Tsai, Weon-Goo Kim
Vulnerability in speaker verification - a study of technical impostor techniques
Johan Lindberg, Mats Blomberg
A study of computation speed-ups of the GMM-UBM speaker recognition system
Jack McLaughlin, Douglas A. Reynolds, Terry Gleason
Conversational biometrics
Stéphane H. Maes
On the security of HMM-based speaker verification systems against imposture using synthetic speech
Takashi Masuko, Takafumi Hitotsumatsu, Keiichi Tokuda, Takao Kobayashi
Speech signal parametrization for speaker recognition under voice disguise conditions
Wojciech Majewski, Grazyna Mazur-Majewska
On the relevance of language in speaker recognition
Antonio Satué-Villar, Marcos Faúndez-Zanuy
Prediction of keyword spotting accuracy based on simulation
Yoichi Yamashita
A fast version of the ATROS system
M. J. Castro, D. Llorens, Joan-Andreu Sánchez, F. Casacuberta, P. Aibar, E. Segarra
Task dependent loss functions in speech recognition: A* search over recognition lattices
Vaibhava Goel, William Byrne
Theory of structured cogitation in speech recognition
Václav Hanzl
Efficient general lattice generation and rescoring
Andrej Ljolje, Fernando Pereira, Michael Riley
A fast and effective state decoding algorithm
Mingxing Xu, Fang Zheng, Wenhu Wu
A multimodal, multilingual telephone application: the Wildfire electronic assistant
Philippe Jeanrenaud, Greg Cockroft, Allard VanderHeidjen
Speaker verification as a user-friendly access for the visually impaired
Els den Os, Hans Jongebloed, Alice Stijsiger, Lou Boves
Recognition, indexing and retrieval of British broadcast news with the THISL system
Tony Robinson, Dave Abberley, David Kirby, Steve Renals
Organization, communication, and control in the GALAXY-II conversational system
Stephanie Seneff, Raymond Lau, Joseph Polifroni
A mixed-initiative natural dialogue system for conference room reservation
Claudia Pateras, Nicolas Chapados, Remi Kwan, Dominic Lavoie, Réal Tremblay
Audio-visual synthesis of talking faces from speech production correlates
Takaaki Kuratate, Kevin G. Munhall, Philip E. Rubin, Eric Vatikiotis-Bateson, Hani Yehia
Hearing by eye: visual spatial degradation and the McGurk effect
John MacDonald, Soren Andersen, Talis Bachmann
Intensity- and location-normalized training for HMM-based visual speech recognition
Yoshihiko Nankaku, Keiichi Tokuda, Tadashi Kitamura
Speaker adaptation for audio-visual speech recognition
Gerasimos Potamianos, Alexandros Potamianos
The role of spatial separation on ventriloquism and McGurk illusions
M. Radeau, C. Colin
Hybrid connectionist-structural acoustical modeling in the ATROS system
M. J. Castro, F. Casacuberta
Path-dependent Kalman estimation of a cepstral bias
Lionel Delphin-Poulat, Jérôme Idier
Acoustical modelling of phone transitions: biphones and diphones - what are the differences?
S. Dobrisek, F. Mihelic, N. Pavesic
Optimal feature sub-space selection based on discriminant analysis
Kris Demuynck, Jacques Duchateau, Dirk Van Compernolle
Phoneme recognition system based on HMM with distributed VQ codebook
Mohamed Debyeche, Mohamed Afify, Jean-Paul Haton
Research on speech units modeling in continuous speech recognition
Xiaodong He, Jian Liu, Jianlai Zhou, Tiecheng Yu
An investigation of cepstral parameterisations for large vocabulary speech recognition
Reinhold Haeb-Umbach, Marco Loog
Dynamic HMM selection for continuous speech recognition
T. Hain, P. C. Woodland
Unified decoding and feature representation for improved speech recognition
Li Jiang, Xuedong Huang
High accuracy acoustic modeling based on multi-stage decision tree
DongHwa Kim, Chaojun Liu, Xintian Wu, Yonghong Yan
Optimization of dynamic regimes in a statistical hidden dynamic model for conversational speech recognition
Jeff Ma, Li Deng
Top-down bottom-up hybrid clustering algorithm for acoustic-phonetic modeling of speech
José B. Marino, Albino Nogueiras-Rodríguez
Data collection in real acoustical environments for sound scene understanding and hands-free speech recognition
Satoshi Nakamura, Kazuo Hiyane, Futoshi Asano, Takeshi Yamada, Takashi Endo
Diphone subspace models for phone-based HMM complementation
Klaus Reinhard, Mahesan Niranjan
Unified framework for acoustic topology modelling: ML-SSS and question-based decision trees
Harald Singer, Atsushi Nakamura
Linear transformations in sub-band groups for speech recognition
B. Doherty, Saeed Vaseghi, Paul McCourt
A prototype of Mandarin speech telephone number inquiry system
Peng-Ren Lu, Wei-Tyng Hong, Sheng-Lun Chiang, Yih-Ru Wang, Sin-Horng Chen
Off-line acoustic modelling of non-native accents
Silke Witt, Steve Young
Semi-supervised adaptation of acoustic models for large-volume dictation
Colin W. Wightman, Ted A. Harder
Enhanced likelihood computation using regression
Peter de Souza, Bhuvana Ramabhadran, Yuqing Gao, Michael Picheny
High accuracy acoustic modeling using two-level decision-tree based state-tying
Chaojun Liu, Xintian Wu, Yonghong Yan
Domain adduced state tying for cross-domain acoustic modelling
R. Singh, B. Raj, Richard M. Stern
Parameter tying and Gaussian clustering for faster, better, and smaller speech recognition
Ananth Sankar, Venkata Ramana Rao Gadde
A combined maximum mutual information and maximum likelihood approach for mixture density splitting
Ralf Schlüter, Wolfgang Macherey, Boris Müller, Hermann Ney
Knowledge collection for natural language spoken dialog systems
Egbert Ammicht, Allen Gorin, Tirso Alonso
Improving discourse management in TRIPS-98
Donna K. Byron
Speech act modeling in a spoken dialogue system using fuzzy hidden Markov model and Bayes' decision criterion
Chung-Hsien Wu, Gwo-Lang Yan, Chien-Liang Lin
Task hierarchies representing sub-dialogs in speech dialog systems
Ute Ehrlich
Effects of system barge-in responses on user impressions
Jun-Ichi Hirasawa, Mikio Nakano, Takeshi Kawabata, Kiyoaki Aikawa
A new word-confidence threshold technique to enhance the performance of spoken dialogue systems
R. López-Cózar, Antonio J. Rubio, P. García, J. C. Segura
Confirmation strategies to improve correction rates in a telephonic inquiry dialogue system
C. Alexia Lavelle, Martine de Calmés, Guy Pérennou
Mathematical analysis of dialogue control strategies
Yasuhisa Niimi, Takuya Nishimoto
Processing of anaphoric and elliptic sentences in a spoken dialog system
Jana Ocelikova, Vaclav Matousek
Free-flow dialog management using forms
K. A. Papineni, Salim Roukos, T. Ward
Towards the detection and description of textual meaning indicators in spontaneous conversations
Klaus Ries
Dialogue management in the Dutch ARISE train timetable information system
Janienke Sturm, Els den Os, Lou Boves
Problem spotting in human-machine interaction
Emiel Krahmer, Marc Swerts, Mariet Theune, Mieke Weegels
Consistent dialogue across concurrent topics based on an expert system model
Bor-shen Lin, Hsin-min Wang, Lin-shan Lee
Secondary codebook storage quantisation
Thomas M. Chapman, C. S. Xydeas
Pseudo-articulatory representations: promise, progress and problems
W. H. Edmondson, D. J. Iskra, P. Kienzle
A 1.7 kbps waveform interpolation speech coder using decomposition of pitch cycle waveform
Ge Gao, P. C. Ching
Enhanced analysis-by-synthesis waveform interpolative coding at 4 kbps
Oded Gottesman, Allen Gersho
Joint source-channel decoding by channel-coded optimal estimation (CCOE) for a CELP speech codec
Norbert Görtz
Analysis-by-synthesis low-rate multimode harmonic speech coding
Chunyan Li, Allen Gersho, Vladimir Cuperman
Variable length coding of transformed LSF coefficients
László Lois
Low bit-rate speech coding using quantization of variable length segments
R. Mayrench, D. Malah
Low delay analysis/synthesis schemes for joint speech enhancement and low bit rate speech coding
Rainer Martin, Hong-Goo Kang, Richard V. Cox
A comparative study of several ADPCM schemes with linear and nonlinear prediction
Oscar Oliva, Marcos Faúndez-Zanuy
Segmental feature extraction and coding for speech synthesis
H. Ohmura, K. Tanaka
Backward adaptive RBF-based hybrid predictors for CELP-type coders at medium bit-rates
C. Peláez-Moreno, F. Díaz-de-María
An improved speech model with allowance for time-varying pitch harmonic amplitudes and frequencies in low bit-rate MBE coders
Valentin V. Sercov, Alexander A. Petrovsky
Sparse vector linear prediction matrices with multidiagonal structure
Davor Petrinovic, Davorka Petrinovic
Source-dependent variable rate speech coding below 3 kbps
M. Stefanovic, A. Kondoz
A novel speech coding approach based on half-wave vector quantization
Xiaoping Chen, Yantao Song, Tiecheng Yu
Speech coding using mixture of Gaussians polynomial model
Parham Zolfaghari, Tony Robinson
Form-based reasoning for mixed-initiative dialogue management in information-query systems
Jennifer Chu-Carroll
Knowledge sources in spoken dialogue systems
Nils Dahlbäck, Arne Jönsson
Overview of the ARISE project
Els den Os, Lou Boves, Lori Lamel, Paolo Baggia
Creating natural dialogs in the Carnegie Mellon Communicator system
Alexander I. Rudnicky, E. Thayer, Paul Constantinides, C. Tchou, R. Shern, Kevin Lenzo, W. Xu, A. Oh
Design strategies for spoken language dialog systems
Sophie Rosset, Samir Bennacef, Lori Lamel
A wideband speech coder based on harmonic coding at 16KBS
A. Amodio, Gang Feng
Perceptually based and embedded wideband CELP coding of speech
Alexis Bernard, Abeer Alwan
Very low bit rate voice coder based on a nonlinear hearing model
Rudolf Földvári, László Gyimesi
Low complexity bit allocation algorithm with psychoacoustical optimisation
Marcos Perreau-Guimaraes, Madeleine Bonnet, Nicolas Moreau
A novel approach of low bit-rate speech coding based on sinusoidal representation and auditory model
Wanggen Wan, Oscar C. Au, Cyan L. Keung, Chi H. Yim
Language modeling based on automatic word concatenations
Christel Beaujard, Michèle Jardino
Recognition performance of a structured language model
Ciprian Chelba, Frederick Jelinek
On the use of right context in sense-disambiguating language models
Vincent Chow, Dekai Wu
Language modelling with hierarchical domains
Paul G. Donnelly, F. J. Smith, E. Sicilia, Ji Ming
Integration of several information sources for robust class-based statistical language modelling
Géraldine Damnati
Efficient language model adaptation through MDI estimation
Marcello Federico
Syntax-based speech recognition: how a syntactic parser can help a recognition system
Arnaud Gaudinat, Jean-Philippe Goldman, Eric Wehrli
A new metric for stochastic language model evaluation
Akinori Ito, Masaki Kohda, Mari Ostendorf
Phrase-based language models for speech recognition
Hong-Kwang Jeff Kuo, Wolfgang Reichl
Class-combined word n-gram for robust language modeling
Norihiko Kobayashi, Tetsunori Kobayashi
Using partial morphological analysis in language modeling estimation for large vocabulary Portuguese speech recognition
Ciro Martins, Joao P. Neto, Luís B. Almeida
Discriminative training of language model classifiers
Uwe Ohler, Stefan Harbeck, Heinrich Niemann
Improving n-gram modeling using distance-related unit association maximum entropy language modeling
Shuwu Zhang, Harald Singer, Dekai Wu, Yoshinori Sagisaka
Language modeling for broadcast news transcription
Gilles Adda, Michèle Jardino, Jean-Luc Gauvain
Large span statistical language models: application to homophone disambiguation for large vocabulary speech recognition in French
Frédéric Béchet, Alexis Nasr, Thierry Spriet, Renato de Mori
Language modelling and spoken dialogue systems - the ARISE experience
P. Baggia, A. Kellner, Guy Pérennou, C. Popovici, Janienke Sturm, Frank Wessel
Language model level vs. lexical level for modeling pronunciation variation in a French CSR
Laure Brieussel-Pousse, Guy Pérennou
Characteristics of Chinese language models for large vocabulary telephone speech
Roger H.Y. Leung, Chi-Yan Choy, Hong C. Leung
A new based distance language model for a dictation machine: application to MAUD
D. Langlois, K. Smaïli
Using various language model smoothing techniques for the transcription of a weather forecast broadcasted by the Czech radio
Ludek Müller, Josef Psutka
Studies in acoustic training and language modeling using simulated speech data
Don McAllaster, Larry Gillick
Language model adaptation using minimum discrimination information
Wolfgang Reichl
Automatic and manual clustering for large vocabulary speech recognition: a comparative study
K. Smaïli, A. Brun, I. Zitouni, Jean-Paul Haton
Learning of stochastic context-free grammars by means of estimation algorithms
Joan-Andreu Sánchez, José-Miguel Benedí
Part-of-speech n-gram and word n-gram fused language model
Hirofumi Yamamoto, Yoshinori Sagisaka
Linguistic features for whole sentence maximum entropy language models
Xiaojin Zhu, Stanley F. Chen, Ronald Rosenfeld
Variable-length sequence language model for large vocabulary continuous dictation machine
I. Zitouni, J. F. Mari, K. Smaïli, Jean-Paul Haton
Using detailed linguistic structure in language modelling
Ruiqiang Zhang, Ezra Black, Andrew Finch
Decision tree micro-prosody structures for text to speech synthesis
Aimin Chen, Shu Lian Wong, Saeed Vaseghi, Charles Ho
Automatic modeling of duration in a Spanish text-to-speech system using neural networks
R. Córdoba, J. A. Vallejo, J. M. Montero, J. Gutiérrez-Arriola, M. A. López, Juan Manuel Pardo
Objective methods for evaluating synthetic intonation
Robert A.J. Clark, Kurt E. Dusterhoff
Using decision trees within the tilt intonation model to predict F0 contours
Kurt E. Dusterhoff, Alan W. Black, Paul Taylor
Levels of prosodic representation in spoken discourse: an empirical approach
Richard Esposito, Li-chiung Yang
Segmental duration modelling in a text-to-speech system for the Galician language
Xavier Fernández-Salgado, Eduardo R. Banga
The symbolic coding of segmental duration and tonal alignment: an extension to the INTSINT system
Daniel Hirst
Training an application-dependent prosodic model: corpus, model and evaluation
Yann Morlec, Gérard Bailly, Véronique Aubergé
Farsi language prosodic structure, research and implementation using a speech synthesizer
H. Sheikhzadeh, A. Eshkevari, M. Khayatian, R. Sadigh, S. M. Ahadi
Acoustical characterisation of the accented syllable in Portuguese, a contribution to the naturalness of speech synthesis
Joao Paulo Teixeira, Elisabete Rosa Paulo, Diamantino Freitas, Maria da Graca Pinto
Analysis and synthesis of the four tones in connected speech of the standard Chinese based on a command-response model
Changfu Wang, Hiroya Fujisaki, Sumio Ohno, Tomohiro Kodama
A profile of the discourse and intonational structures of route descriptions
Sandra Williams, Catherine I. Watson
Neighborhood effects on spoken word recognition in Japanese
Shigeaki Amano, Tadahisa Kondo
Interference between surface form and abstract representation in spoken word perception
C. Chéreau, P. A. Hallé, J. Segui
Are the McGurk illusions affected by left or right presentation of the speaker face?
C. Colin, M. Radeau
Prelexical locus of an illusory vowel effect in Japanese
Emmanuel Dupoux, Takao Fushimi, Kazuhiko Kakehi, Jacques Mehler
Acoustic and perceptual characteristics of the Spanish fricatives
Sergio Feijóo, Santiago Fernández, Nieves Barros, Ramón Balsa
Techniques for robust speech recognition in the car environment
Philippe Gelin, Jean-Claude Junqua
Difference limen for formant frequency discrimination at high fundamental frequencies
Fredrik Karlsson, Anders Eriksson
Auditory features for human communication of stop consonants under full-band and low-pass conditions
Eduardo Sá Marta, Luis Vieira de Sá
Levels of reduction for German tense vowels
Christina Widera, Thomas Portele
Is talking to virtual more realistic?
Luc Julia, Adam Cheyer
Multi-person conversation via multi-modal interface - a robot who communicate with multi-user -
Yosuke Matsusaka, Tsuyoshi Tojo, Sentaro Kubota, Kenji Furukawa, Daisuke Tamiya, Keisuke Hayata, Yuichiro Nakano, Tetsunori Kobayashi
Multimodal systems for children: building a prototype
Shrikanth Narayanan, Alexandros Potamianos, Haohong Wang
Social bonding in talking with social autonomous creatures
Michio Okada, Noriko Suzuki, Masaaki Date
The relationships between voice and gesture: eyebrow movements and questioning
A. Purson, S. Santi, R. Bertrand, Isabelle Guaitella, J. Boyer, C. Cavé
Robust vector quantization for channels with memory
Wen-Whei Chang, Heng-Iang Hsu, De-Yu Wang
A multi-rate codec family based on GSM EFR and ITU-T G.729
Balázs Kövesi, Claude Lamblin, Catherine Quinquis, Philippe Thiérion, William Navarro
A novel channel distortion measure for vector quantization and a fuzzy model for codebook index assignment
J. S. Pan, C. S. Shieh, T. F. Chiang
A full-rate GSM-AMR candidate
C. Sriratanaban, A. Kondoz
A multi-rate speech and channel codec: a GSM AMR half-rate candidate
S. Villette, M. Stefanovic, A. Kondoz
Predicting gradient F0 variation: pitch range and accent prominence
Ivan Bulyko, Mari Ostendorf
CART-based duration modeling using a novel method of extracting prosodic features
Paul Deans, Andrew Breen, Peter Jackson
A primary study on the randomness control of the prosodic boundary index for natural synthetic speech
Ki-Wan Eom, Jin-Young Kim, Sun-Mi Kim
On a hybrid time domain-LPC technique for prosody superimposing used for speech synthesis
Attila Ferencz, István Nagy, Tünde-Csilla Kovács, Teodora Ratiu, Maria Ferencz
Multilingual prosody modelling using cascades of regression trees and neural networks
J. W. A. Fackrell, H. Vereecken, J.-P. Martens, Bert Van Coile
An efficient speaker adaptation method for TTS duration model
Wentao Gu, Chilin Shih, Jan P.H. van Santen
Child-directed speech synthesis: evaluation of prosodic variation for an educational computer program
David House, Linda Bell, Kjell Gustafson, Linn Johansson
Representation and processing of linguistic structures for an all-prosodic synthesis system using XML
Mark Huckvale
A study on a pitch alteration by using the formant and phase compensation technique
Won Park, Hyung-Bin Park, Myung-Jin Bae
Micro-prosodic control in Cantonese text-to-speech synthesis
Tan Lee, Helen M. Meng, Wai H. Lau, W. K. Lo, P. C. Ching
Exploring the naturalness of several German high-quality-text-to-speech systems
Hansjörg Mixdorff, Dieter Mehnert
Detecting accent sandhi in Japanese using a superpositional F0 model
A. Sakurai, Hiromichi Kawanami, Keikichi Hirose
Focus detection by comparison of speech waveforms
Satoshi Kitagawa, Nick Campbell
An advanced intonation model for synthesis
Mark Tatham, Eric Lewis, Katherine Morton
A new F0 modification algorithm by manipulating harmonics of magnitude spectrum
Satoshi Takano, Masanobu Abe
A mixed strategy approach to Spanish prosody
Juan Manuel Villar Navarro, Eduardo López Gonzalo, José Relaño Gil
Perception of overlapping syllables
William A. Ainsworth
Are transcriptions of speech material recorded by means of bugs reliable?
Loredana Cerrato, Andrea Paoloni
Influence of morphology on phoneme identification in spoken Croatian
Vlasta Erdeljac, Damir Horga
Modeling the masking of formant transitions in noise
James J. Hant, Abeer Alwan
Stabilised wavelet Mellin transform: an auditory strategy for normalising sound-source size
Toshio Irino, Roy D. Patterson
Unintended preferences in the perceptive evaluation of rhythmical units in Czech
Zdena Palková, Jitka Janíková
Phonological representations and repetition priming
Christophe Pallier, Nuria Sebastián Gallés, Angels Colomé
Distance score evaluation of the visualised speech spectra at audio-visual articulation training
Klára Vicsi, F. Csatari, Zs. Bakcsi, A. Tantos
Objective and subjective evaluation of the acoustic models of a continuous speech recognition system
David A. van Leeuwen, Michael de Louwere
Verbo-motor priming in the phonetic encoding of real and non-words
S. P. Whiteside, R. A. Varley
An improved MAP method for language model adaptation
Langzhou Chen, Taiyi Huang
Towards improved language model evaluation measures
Philip Clarkson, Tony Robinson
A novel language model based on self-organized learning
Taiyi Huang, Langzhou Chen
Combining syntactical and statistical language constraints in context-dependent language models for interactive speech applications
Ute Kilian, Fritz Class
Assessment of smoothing methods and complex stochastic language modeling
Sven Martin, Christoph Hamacher, Jorg Liermann, Frank Wessel, Hermann Ney
Domain adaptation for robust automatic speech recognition in car environments
Rolf Bippus, Alexander Fischer, Volker Stahl
A DCT-based fast enhancement technique for robust speech recognition in automobile usage
Jun Huang, Yunxin Zhao, Stephen Levinson
Fully adaptive SVD-based noise removal for robust speech recognition
Kris Hermus, Ioannis Dologlou, Patrick Wambacq, Dirk Van Compernolle
Towards spontaneous speech recognition for on-board car navigation and information systems
Martin Westphal, Alex Waibel
Towards robust speech recognition in the telephony network environment - cellular and landline conditions
Subrata Das, David Lubensky, Cheng Wu
An overview of the PICASSO project research activities in speaker verification for telephone applications
Frédéric Bimbot, Mats Blomberg, Louis Boves, Gérard Chollet, Cédric Jaboulet, Bruno Jacob, Jamal Kharroubi, Johan Koolwaaij, Johan Lindberg, Johnny Mariethoz, Chafic Mokbel, Houda Mokbel
Integrating time-alignment information into the decision making for text-dependent HMM-based speaker verification
D. Charlet
Deliberate imposture: a challenge for automatic speaker verification systems
Dominique Genoud, Gérard Chollet
Variance flooring, scaling and tying for text-dependent speaker verification
H. Melin, Johan Lindberg
Client / world model synchronous alignment for speaker verification
Johnny Mariethoz, Dominique Genoud, Frédéric Bimbot, Chafic Mokbel
Linguistic phrase spotting in a simple application spoken dialogue system
Manuela Boros, Paul Heisterkamp
Learning of domain dependent knowledge in semantic networks
F. Deinzer, J. Fischer, U. Ahlrichs, Elmar Nöth
Combining words and prosody for information extraction from speech
Dilek Hakkani-Tür, Gökhan Tür, Andreas Stolcke, Elizabeth Shriberg
Error correction translation using text corpora
Kai Ishikawa, Eiichiro Sumita
Efficient sentence disambiguation by preferred constituent order
S. Kronenberg, K. Skuplik
Identifying linguistic segmentations in Chinese spoken dialogue
Yue-Shi Lee, Hsin-Hsi Chen
Error recovery for robust language understanding in spoken dialogue systems
Tung-Hui Chiang, Yi-Chung Lin
A monolingual semantic decoder based on word sense disambiguation for mixed language understanding
Xiaohu Liu, Pascale Fung, Chi Shun Cheung
To believe is to understand
Helen M. Meng, Wai Lam, Carmen Wai
A hybrid approach to spoken dialogue understanding: prosody, statistics and partial parsing
Elmar Nöth, Jürgen Haas, Volker Warnke, Florian Gallwitz, Manuela Boros
Portable speech interpreter which has voice input and sophisticated correction functions
Yasunari Obuchi, Atsuko Koizumi, Yoshinori Kitahara, Jun'ichi Matsuda, Toshihisa Tsukada
Categorical understanding using statistical ngram models
Alexandros Potamianos, Giuseppe Riccardi, Shrikanth Narayanan
Detection and correction of speech repairs in word lattices
Jörg Spilker, Hans Weber, Günther Görz
Connectionist language models for speech understanding: the problem of word order variation
Igor Schadle, Jean-Yves Antoine, Daniel Memmi
Semi-automatic acquisition of domain-specific semantic structures
Kai-Chung Siu, Helen M. Meng
Transformation into language processing units by dividing and connecting utterance units
Toshiyuki Takezawa
Learning a lightweight robust deterministic parser
Aboy Wong, Dekai Wu
An information-based method for selecting feature types for word prediction
Dekai Wu, Zhifang Sui, Jun Zhao
A robust parser for spoken language understanding
Ye-Yi Wang
Aiuruete: a high-quality concatenative text-to-speech system for Brazilian Portuguese with demisyllabic analysis-based units and a hierarchical model of rhythm production
Plínio A. Barbosa, Fábio Violaro, Eleonora C. Albano, Flávio Simoes, Patrícia Aquino, Sandra Madureira, Edson Francozo
A parser-based text preprocessor for Romanian language TTS synthesis
Dragos Burileanu, Claudius Dan, Mihai Sima, Corneliu Burileanu
Nparse - a shallow n-gram-based grammatical-phrase parser
Alice Carlberger
A language-independent probabilistic model for automatic conversion between graphemic and phonemic transcription of words
Evangelos Dermatas, George Kokkinakis
Acquisition of an extensive rule set for Slovene grapheme-to-allophone transcription
Jerneja Gros, F. Mihelic
Voice conversion between UK and US accented English
Ching-Hsiang Ho, Saeed Vaseghi, Aimin Chen
Development of speech design tool "SESIGN99" to enhance synthesized speech
Hideyuki Mizuno, Masanobu Abe, Shin'ya Nakajima
Automation of the training procedures for neural networks performing multi-lingual grapheme to phoneme conversion
Horst-Udo Hain
Parsing Hungarian sentences in order to determine their prosodic structures in a multilingual TTS system
Ilona Koutny
Text-to-speech synthesis of Estonian
Meelis Mihkla, Arvo Eek, Einar Meister
Development of an emotional speech synthesiser in Spanish
J. M. Montero, J. Gutiérrez-Arriola, J. Colás, J. Macías-Guarasa, E. Enríquez, Juan Manuel Pardo
S5: the SQEL Slovene speech synthesis system
N. Pavesic, Jerneja Gros
A multilingual text processing engine for the PAPAGENO text-to-speech synthesis system
Matej Rojc, Janez Stergar, Ralph Wilhelm, Horst-Udo Hain, Martin Holzapfel, Bogomir Horvat
Toshiba English text-to-speech synthesizer (TESS)
Chang K. Suh, Takehiko Kagoshima, Masahiro Morita, Shigenobu Seto, Masami Akamine
Towards the generation of French phonetic inflected forms
Frédérique Sannier, Véronique Aubergé
Canadian French text-to-speech synthesis: modeling an optimal set of realizations for dialect markers
Evelyne Tzoukermann, Lucie Ménard, Marise Ouellet
Machine learning of word pronunciation: the case against abstraction
Bertjan Busser, Walter Daelemans, Antal van den Bosch
A public domain speech-to-text system
M. Ordowski, N. Deshmukh, A. Ganapathiraju, J. Hamaker, Joseph Picone
Human speech production - an internet-based interactive multimodal tutorial
Klaus Fellbaum, Joerg Richter
Terminology principles and support for spoken language system development
Dafydd Gibbon, Silke Kölsch, Inge Mertins, Michaela Schulte, Thorsten Trippel
New WWW browser for visually impaired people using interactive voice technology
Yasuo Horiuchi, Fujiwara Atsushi, Akira Ichikawa
Text to speech control protocol
Jirí Hanika, Petr Horák
Multilinguality and human language technology courseware
Bojan Petek
Multimodal information seeking dialogues on the world wide web
José Rouillard, Jean Caelen
Compression of acoustic features - are perceptual quality and recognition performance incompatible goals?
Roger Tucker, Tony Robinson, James Christie
A network architecture for building applications that use speech recognition and/or synthesis
Dominique Vaufreydaz, José Rouillard, Mohammad Akbar
Criteria for evaluating internet tutorials in speech communication sciences
Chris Bowerman, Anders Eriksson, Mark Huckvale, Mike Rosner, Mark Tatham, Maria Wolters
Javaspeechlab - interactive speech analysis laboratory on the world-wide web
Andrzej Drygajlo, Guy Delafontaine
Reviving discrete HMMs: the myth about the superiority of continuous HMMs
Vassilis Digalakis, Stavros Tsakalidis, Leonardo Neumeyer
Principles and design of an intelligent system for information retrieval over the internet with a multimodal dialogue interface
Hiroya Fujisaki, Hiroyuki Kameda, Sumio Ohno, Kenji Abe, Michio Iijima, Masayoshi Suzuki, Kazunari Taketa
An asynchronous virtual meeting system for bi-directional speech dialog
Takuya Nishimoto, Hidehiro Yuki, Takehiko Kawahara, Yasuhisa Niimi
Context scope selection in multi-span statistical language modeling
Jerome R. Bellegarda
Topic-based language models using EM
Daniel Gildea, Thomas Hofmann
Augmenting words with linguistic information for n-gram language models
Lucian Galescu, Eric K. Ringger
A language model combining n-grams and stochastic finite state automata
Alexis Nasr, Yannick Estève, Frédéric Béchet, Thierry Spriet, Renato de Mori
Combining nonlocal, syntactic and n-gram dependencies in language modeling
Jun Wu, Sanjeev Khudanpur
Robust feature vector compression algorithm for distributed speech recognition
Imre Kiss, Pekka Kapanen
Separation of speech signals using iterative multi-pitch analysis and prediction
Matti Karjalainen, Tero Tolonen
Feature fusion for music detection
Eluned S. Parris, Michael J. Carey, Harvey Lloyd-Thomas
Speech variability in the modulation spectral domain - SANOVA technique -
Sarel van Vuuren, Hynek Hermansky
Improving harmonic selection for speech intelligibility enhancement by the reassignment method
Dekun Yang, Georg F. Meyer, William A. Ainsworth
A hierarchical approach to large-scale speaker recognition
Homayoon S. M. Beigi, Stéphane H. Maes, Upendra V. Chaudhari, Jeffrey S. Sorensen
A segmental approach to text-independent speaker verification
J. Cernocky, D. Petrovska-Delacrétaz, S. Pigeon, P. Verlinde, Gérard Chollet
Who spoke when? - automatic segmentation and clustering for determining speaker turns
S. E. Johnson
The 1999 NIST speaker recognition evaluation, using summed two-channel telephone data for speaker detection and speaker tracking
Mark A. Przybocki, Alvin F. Martin
Speaker tracking and detection with multiple speakers
Kemal Sönmez, Larry Heck, Mitchel Weintraub
The acquisition of a speech corpus for limited domain translation
Demetrio Aiello, Loredana Cerrato, Cristina Delogu, Andrea Di Carlo
Tagging spoken corpus
Yue-Shi Lee, Hsin-Hsi Chen
A Hungarian child database for speech processing applications
F. Csatári, Zs. Bakcsi, Klára Vicsi
A generic lexicon tool for word model definition in multimodal applications
Julie Carson-Berndsen
Compiling multi-tiered speech databases into the relational model: experiments with the emu system
Steve Cassidy
Two Swedish Speechdat databases - some experiences and results
Kjell Elenius
A multimodal database of gestures and speech
Satoru Hayamizu, Shigeki Nagaya, Keiko Watanuki, Masayuki Nakazawa, Shuichi Nobe, Takashi Yoshimura
Japanese spontaneous speech database with wide regional and age distribution
Tomoko Matsui, Masaki Naito, Harald Singer, Atsushi Nakamura, Yoshinori Sagisaka
Data collection in real acoustical environments for sound scene understanding and hands-free speech recognition
Satoshi Nakamura, Kazuo Hiyane, Futoshi Asano, Takeshi Yamada, Takashi Endo
Automatic labeling of Japanese prosody using J-ToBI style description
Hiroaki Noguchi, Kazuhisa Kiriyama, Hiroshi Matsuda, Miki Taniguchi, Yasuharu Den, Yasuhiro Katagiri
Czech language database of car speech and environmental noise
Petr Pollák, Josef Vopička, Pavel Sovka
Language model selection based on the analysis of Japanese spontaneous speech on travel arrangement task
Akira Kurematsu, Atsushi Sukenori
New resources at BAS: acoustic, multimodal, linguistic
Florian Schiel, Christoph Draxler, Phil Hoole, Hans G. Tillmann
Building speech databases for cellular networks
Eric Sanders, Henk van den Heuvel, Khalid Choukri
The speechdat-car multilingual speech databases for in-car applications: some first validation results
Henk van den Heuvel, Jérôme Boudy, Robrecht Comeyne, Stephan Euler, Asuncion Moreno, Gael Richard
A Welsh speech database: preliminary results
Briony Williams
New developments within the European Language Resources Association (ELRA)
Khalid Choukri, Valérie Mapelli, Jeff Allen
Data collection and processing in the Carnegie Mellon Communicator
Maxine Eskenazi, Alexander I. Rudnicky, Karin Gregory, Paul Constantinides, Robert Brennan, Christina Bennett, Jwan Allen
Speechdat multilingual speech databases for teleservices: across the finish line
Harald Höge, Christoph Draxler, Henk van den Heuvel, Finn Tore Johansen, Eric Sanders, Herbert S. Tropf
Enhancing reusability of speech corpora by hyperlinked query output
Andreas Mengel, Ulrich Heid
Design and collection of a corpus of polyphones and prosodic contexts for speech synthesis research and development
Kim Silverman, Victoria Anderson, Jerome Bellegarda, Kevin Lenzo, Devang Naik
Sinusoidal representation and auditory model-based parametric matching and smoothing and its application in speech analysis/synthesis
Oscar C. Au, Wanggen Wan, Cyan L. Keung, Chi H. Yim
Choose the best to modify the least: a new generation concatenative synthesis system
Marcello Balestri, Alberto Pacchiotti, Silvia Quazza, Pier Luigi Salza, Stefano Sandri
Selection of waveform units for corpus-based Mandarin speech synthesis based on decision trees and prosodic modification costs
Fu-chiang Chou, Chiu-yu Tseng, Lin-shan Lee
Improving quality in a speech synthesizer based on the MBROLA algorithm
B. Etxebarria, I. Hernáez, I. Madariaga, E. Navas, J. C. Rodríguez, R. Gándara
A novel model TD-PSPTP for speech synthesis
Yan Huang, Bo Xu
Detection of non-stationarity in speech signals and its application to time-scaling
David Kapilow, Yannis Stylianou, Juergen Schroeter
A v-CV waveform based speech synthesis using global minimization of pitch conversion and concatenation distortion in v-CV unit sequence
Takao Koyama, Jun-ichi Takahashi
Stable speech synthesis using recurrent radial basis functions
Iain Mann, Steve McLaughlin
Efficient weight training for selection based synthesis
Yoram Meron, Keikichi Hirose
Speech synthesis using HMM-based acoustic unit inventory
Jindrich Matousek
An enhanced ABS/OLA sinusoidal model for waveform synthesis in TTS
Michael W. Macon, Mark A. Clements
High vowel /i y u/ in Canadian and continental French: an analysis for a TTS system
Marise Ouellet, Evelyne Tzoukermann, Lucie Ménard
Speech production based on the mel-frequency cepstral coefficients
Zbyněk Tychtl, Josef Psutka
Exploiting improved parameter smoothing within a hybrid concatenative/LPC speech synthesizer
Erhard Rank
Synchronization of speech frames based on phase data with application to concatenative speech synthesis
Yannis Stylianou
Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
Takayoshi Yoshimura, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura
A CASA-labelling model using the localisation cue for robust cocktail-party speech recognition
Hervé Glotin, Frédéric Berthommier, Emmanuel Tessier
Noise-invariant representation for speech signals
Aruna Bayya, B. Yegnanarayana
Natural-quality background noise coding using residual substitution
Khaled El-Maleh, Peter Kabal
Microphone array design for robust speech acquisition and recognition
Julian Fernández, Eduardo Lleida, Enrique Masgrau
Study of the influence of noise pre-processing on the performance of a low bit rate parametric speech coder
Gwénaél Guilmin, Régine Le Bouquin-Jeannès, Philippe Gournay
MLP network for enhancement of noisy MFCC vectors
Hemmo Haverinen, Petri Salmela, Juha Häkkinen, Mikko Lehtokangas, Jukka Saarinen
Hands-free voice activation in noisy car environment
J. Iso-Sipilä, K. Laurila, Ramalingam Hariharan, Olli Viikki
A wavelet denoising technique to improve endpoint detection in adverse conditions
Lamia Karray, Emmanuel Polard
Speech enhancement for linear-predictive-analysis-by-synthesis coders
Marcin Kuropatwinski, Dieter Leckschat, Kristian Kroschel, Andrzej Czyzewski, Chaz Hales
Robust HMM to variation of noisy environments based on variance extension of noise models
Hiroshi Matsumoto, Hiroaki Ubukata
The fourth-order cumulant of speech signals with application to voice activity detection
Elias Nemer, Rafik Goubran, Samy Mahmoud
The dependence of feature vectors under adverse noise
Woei-Chyang Shieh, Sen-Chia Chang
Speech detection and SNR prediction basing on amplitude modulation pattern recognition
Jürgen Tchorz, Birger Kollmeier
Fast active noise control for robust speech acquisition
Luis Vicente, Stephen J. Elliott, Enrique Masgrau
Missing data theory, spectral subtraction and signal-to-noise estimation for robust ASR: an integrated study
Ascension Vizinho, Phil Green, M. Cooke, Ljubomir Josifovski
Single channel speech enhancement using principal component analysis and MDL subspace selection
Rolf Vetter, Nathalie Virag, Philippe Renevey, Jean-Marc Vesin
Automatically deriving categories for translation
Sergio Barrachina, Juan Miguel Vilar
An inter-domain portable approach to interchange format construction
A. Corazza
Distributed representation of vocabularies in the RECONTRA neural translator
G. A. Casañ, M. A. Castaño
Robust information extraction in a speech translation system
Norbert Reithinger
End-to-end evaluation in ATR-MATRIX: speech translation system between English and Japanese
Fumiaki Sugaya, Toshiyuki Takezawa, Akio Yokoo, Seiichi Yamamoto
Story segmentation and topic detection for recognized speech
S. Dharanipragada, Martin Franz, J. S. McCarley, Salim Roukos, T. Ward
Topic tracking for radio, TV broadcast, and newswire
Hubert Jin, Richard Schwartz, Sreenivasa Sista, Frederick Walls
The beta-binomial mixture model for word frequencies in documents with applications to information retrieval
Stephen A. Lowe
Topic spotting and its description of summary from spontaneous speech
Masayuki Nakazawa, Jianxin Zhang, Ryuichi Oka
Topic detection in broadcast news
Frederick Walls, Hubert Jin, Sreenivasa Sista, Richard Schwartz
Prosodic effects on segmental durations in Greek
Antonis Botinis, Marios Fourakis, Irini Prinou
Within-utterance correlation for speech recognition
Mats Blomberg
Techniques for robust speech recognition in the car environment
Philippe Gelin, Jean-Claude Junqua
An on-line acoustic compensation technique for robust speech recognition
Diego Giuliani
Using adaptive signal limiter together with noise-robust techniques for noisy speech recognition
Wei-Wen Hung, Hsiao-Chuan Wang
A robust environment-effects suppression training algorithm for adverse Mandarin speech recognition
Wei-Tyng Hong, Sin-Horng Chen
Robust speaker adaptation of continuous density HMMs using multilayer perceptron network
Mikko Harju, Petri Salmela, Olli Viikki, Mikko Lehtokangas, Jukka Saarinen
Regression class selection and speaker adaptation with MLLR in Mandarin continuous speech recognition
Chengrong Li, Jingdong Chen, Bo Xu
Regression transformation of prior means for speaker adaptation
Guoqiang Li, Limin Du, Ziqiang Hou
Linguistic tree based maximum likelihood model interpolation
Liu Feng, Chi-wei Che, Peng Yu, Zuoying Wang
Model-based speaker normalization methods for speech recognition
Masaki Naito, Li Deng, Yoshinori Sagisaka
Maximum likelihood eigenspace and MLLR for speech recognition in noisy environments
Patrick Nguyen, Christian Wellekens, Jean-Claude Junqua
A study of speaker adaptation for speaker independent speech recognition method using phoneme similarity vector
Yoshio Ono, Maki Yamada, Masakatsu Hoshimi
An investigation into vocal tract length normalisation
L. F. Uebel, P. C. Woodland
Adaptation to environment and speaker using maximum likelihood neural networks
Zong Suk Yuk, James Flanagan, Mahesh Krishnamoorthy, Krishna Dayanidhi
Corrective training for speaker adaptation
Xiuyang Yu, Wayne Ward
A robust speaker-independent CPU-based ASR system
R. Obradovic, D. Pekar, S. Krco, V. Delic, V. Senk
Delay estimation for transform domain acoustical echo cancellation
Rabih Abouchakra, Peter Kabal
Noise reduction using perceptual spectral change
C. Beaugeant, Pascal Scalart
Intelligibility improvements using diverse sub-band processing applied to noisy speech
Amir Hussain, Douglas R. Campbell
Recognizing simultaneous speech: a genetic algorithm approach
Athanasios Koutras, Evangelos Dermatas, George Kokkinakis
Speech enhancement system for hands-free telephone based on the psychoacoustically motivated filter bank with allpass frequency transformation
Krzysztof Bielawski, Alexander A. Petrovsky
Speech enhancement using a multi-microphone sub-band adaptive Griffiths-Jim noise canceller
P. W. Shields, Douglas R. Campbell
Qualiphone-a: a perceptual speech quality evaluation system for analog mobile networks
M. Szarvas, T. Fegyó, P. Tatai, Géza Gordos
Speech enhancement using nonlinear microphone array under nonstationary noise conditions
Hiroshi Saruwatari, Shoji Kajita, Kazuya Takeda, Fumitada Itakura
Auditory masking threshold estimation for broadband noise sources with application to speech enhancement
Ruhi Sarikaya, John H. L. Hansen
Segregation of vowel in background noise using the model of segregating two acoustic sources based on auditory scene analysis
Masashi Unoki, Masato Akagi
Analysis and on-line detection of audible distortions in GSM telephony
Christophe Veaux, Pascal Scalart, André Gilloire
A parameter-based 2-talker detection apparatus for echo cancellation
Wen Rong Ru, Shih-Chen Lin, Po-Cheng Chen, Chun-Hung Kuo
Co-channel speech separation in the presence of correlated and uncorrelated noises
Kuan-Chieh Yen, Jun Huang, Yunxin Zhao
Speech enhancement using a mixture-maximum model
David Burshtein, Sharon Gannot
Concurrent speakers separation through binaural processing of stereo recordings
Joaquin Gonzalez-Rodriguez, Santiago Cruz-Llanas, Javier Ortega-Garcia
Spectral subtraction with adaptive averaging of the gain function
Harald Gustafsson, Sven Nordholm, Ingvar Claesson
A reliability criterion for time-frequency labeling based on periodicity in an auditory scene
François Gaillard, Frédéric Berthommier, Gang Feng, Jean-Luc Schwartz
Broadband noise cancellation systems: new approach to working performance optimization
Serguei Koval, Mikhail Stolbov, Mikhail Khitrov
Noise subtraction with parametric recursive gain curves
Klaus Linhard, Tim Haulick
Performance comparison of several adaptive schemes for microphone array beamforming
Enrique Masgrau, Luis Aguilar, Eduardo Lleida
An objective distortion estimator for hearing aids and its application to noise reduction
Mitsunori Mizumachi, Masato Akagi
Speech enhancement using fourth-order cumulants and time-domain optimal filters
Elias Nemer, Rafik Goubran, Samy Mahmoud
Missing feature theory and probabilistic estimation of clean speech components for robust speech recognition
Philippe Renevey, Andrzej Drygajlo
Distortion effects of several cumulant-based Wiener filtering algorithms
Josep M. Salavedra, Xavier Bou
Combined noise suppression system for monaural cochlear implants
Milan Svoboda, Pavel Sovka, Petr Pollák
Objective prediction of speech intelligibility at high ambient noise levels using the speech transmission index
Sander J. van Wijngaarden, Herman J. M. Steeneken
Noise-regularized adaptive filtering for speech enhancement
Eric A. Wan, Rudolph van der Merwe
Speech enhancement using Karhunen-Loève transformation and Wiener filtering in critical bands
F. Zarubin, A. Kovtonyuk, K. Zadiraka
The CPK NLP suite for spoken language understanding
Tom Brondsted
Towards multi-domain speech understanding using a two-stage recognizer
Grace Chung, Stephanie Seneff, Lee Hetherington
A Slovenian spoken dialog system for air flight inquiries
I. Ipsic, F. Mihelic, S. Dobrisek, Jerneja Gros, N. Pavesic
A pervasive conversational interface for information interaction
Ganesh Ramaswamy, Jan Kleindienst, Daniel Coffman, Ponani Gopalakrishnan, Chalapathy Neti
Extending the SUSI system with negative knowledge
B. Vromans, R.J. van Vark, B. Rueber, A. Kellner
Phonological constraints in speech segmentation processes: investigating levels of implementation
Olivier Crouzet, Nicole Bacri
Learning phonetic distinctions from speech signals
Robert I. Damper, Steve R. Gunn
Phonotactics in the perception of Japanese vowel length: evidence for long-distance dependencies
Elliott Moreton, Shigeaki Amano
Perception of stress by French, Spanish, and bilingual subjects
Sharon Peperkamp, Emmanuel Dupoux, Núria Sebastián-Gallés
Temporal constraints on speech intelligibility as deduced from exceedingly sparse spectral representations
Rosaria Silipo, Steven Greenberg, Takayuki Arai
Mandarin telephone speech recognition using MCE/GPD-based speaker cluster HMM
Sen-Chia Chang, Shih-Chieh Chien, Woei-Chyang Shieh
Combining length restrictions and n-best techniques in multiple-pass search strategies
A. R. Fonollosa, Eloi Batlle
The deterministic annealing approach for discriminative continuous HMM design
Cecile Gelin-Huet, Kenneth Rose, Ajit Rao
On-line adaptive learning of CDHMM parameters based on multiple-stream prior evolution and posterior pooling
Qiang Huo, Bin Ma
Unsupervised training of a speech recognizer: recent experiments
Thomas Kemp, Alex Waibel
Piecewise HMM discriminative training
C. Chesta, Pietro Laface, M. Nigra
A MLE algorithm for the k-NN HMM system
Fabrice Lefèvre, Claude Montacié, Marie-José Caraty
Single-pass adapted training with all-pass transforms
John McDonough, William Byrne
Minimum confusibility training of context dependent demiphones
Albino Nogueiras-Rodríguez, José B. Marino
Phoneme recognition in fixed context using regularized discriminant analysis
A. Rudzionis, V. Rudzionis
Hidden Markov models using fuzzy estimation
Dat Tran, Michael Wagner
Incremental training of CDHMMs using Bayesian learning
Claudio Vair, Massimiliano Mercogliano, Luciano Fissore
A discriminative training procedure based on language model and dictionary for LVCSR
Daniel Willett, Stefan Müller, Gerhard Rigoll
A novel discriminative method for HMM in automatic speech recognition
Jian Wu, Qing Guo
EARLYZER: perceptually motivated robust TFR of speech
J. V. Avadhanulu, M. Mathew, T. V. Sreenivas
Frequency lowering using a discrete exponential transform
C. M. Aguilera, A. Navas, R. Urquiza, A. Gago
An efficient F0 determination algorithm based on the implicit calculation of the autocorrelation of the temporal excitation signal
Joseph Di Martino, Yves Laprie
Vowel landmark detection
Andrew Wilson Howitt
Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity
Hideki Kawahara, Haruhiro Katayose, Alain de Cheveigné, Roy D. Patterson
A novel high quality efficient algorithm for time-scale modification of speech
B. Lawlor, A. D. Fagan
Formant tracking using segmental phonemic information
Minkyu Lee, Jan van Santen, Bernd Möbius, Joseph Olive
Tailoring Kalman filtering towards speaker characterisation
John McKenna, Stephen Isard
Automatic detection of manner events based on temporal parameters
Ariel Salomon, Carol Espy-Wilson
Two-class signal segmentation for speech/music detection in audio tracks
Mouhamadou Seck, Frédéric Bimbot, Didier Zugaj, Bernard Delyon
Robust glottal closure detection using the wavelet transform
Vu Ngoc Tuan, Christophe d'Alessandro
High-accuracy automatic segmentation
Jan P. H. van Santen, Richard W. Sproat
Single complex sinusoid and ARHE model based pitch extractors
Ilija Zeljkovic, Yannis Stylianou
A robust isolated word recognizer for highly non-stationary environments. recognition results
A. Álvarez, R. Martínez, P. Gómez, V. Nieto, M. M. Pérez
Sequential bias compensation for robust speech recognition
Mohamed Afify
Use of simulated data for robust telephone speech recognition
Tarcisio Coianiz, Daniele Falavigna, Roberto Gretter, Marco Orlandi
On the use of time alignments for noisy speech recognition
Y. Hauptman, Y. Bistritz
Improved feature vector normalization for noise robust connected speech recognition
Juha Häkkinen, J. Suontausta, Ramalingam Hariharan, M. Vasilache, K. Laurila
State based imputation of missing data for robust speech recognition and speech enhancement
Ljubomir Josifovski, Martin Cooke, Phil Green, Ascension Vizinho
A comparison of two strategies for ASR in additive noise: missing data and spectral subtraction
Christopher Kermorvant, Andrew Morris
A comparison of techniques for tone compensation in payphone-based speech recognition
Ben Milner, Mark Farrell
Front-end improvements to reduce stationary & variable channel and noise distortions in continuous speech recognition tasks
Xavier Menéndez-Pidal, Ruxin Chen, Duanpei Wu, Mick Tanaka
Speech recognition in noisy reverberant rooms using a frequency domain blind deconvolution method
G. Nokas, E. Dermatas
Optimization of a speech recognizer for aircraft environments
Volker Schless, Fritz Class, Peter Sandl
Temporal constraints in Viterbi alignment for speech recognition in noise
Nestor Becerra Yoma, Lee Luan Ling, Sandra Dotto Stump
HMM composition of segmental unit input HMM for noisy speech recognition
Kazumasa Yamamoto, Seiichi Nakagawa
Robust connected word speech recognition using weighted Viterbi algorithm and context-dependent temporal constraints
Nestor Becerra Yoma, Lee Luan Ling, Sandra Dotto Stump
Liftered forward masking procedure for robust digits recognition
Kaisheng Yao, Bertram Shi, Pascale Fung, Zhigang Cao
Channel identification and spectrum estimation for robust automatic speech recognition
Yunxin Zhao