doi: 10.21437/Eurospeech.2003
ISSN: 1018-4074
A speech processing front-end with eigenspace normalization for robust speech recognition in noisy automobile environments
Kaisheng Yao, Erik Visser, Oh-Wook Kwon, Te-Won Lee
Maximum likelihood normalization for robust speech recognition
Yiu-Pong Lai, Man-Hung Siu
Robust speech recognition using model-based feature enhancement
Veronique Stouten, Hugo van Hamme, Kris Demuynck, Patrick Wambacq
Several HKU approaches for robust speech recognition and their evaluation on Aurora connected digit recognition tasks
Jian Wu, Qiang Huo
Average instantaneous frequency (AIF) and average log-envelopes (ALE) for ASR with the Aurora 2 database
Yadong Wang, Jesse Hansen, Gopi Krishna Allu, Ramdas Kumaresan
Adaptation of acoustic model using the gain-adapted HMM decomposition method
Akira Sasou, Futoshi Asano, Kazuyo Tanaka, Satoshi Nakamura
Person authentication by voice: a need for caution
Jean-Francois Bonastre, Frédéric Bimbot, Louis-Jean Boe, Joseph P. Campbell, Douglas A. Reynolds, Ivan Magrin-Chagnolleau
ISCA special session: hot topics in speech synthesis
Gérard Bailly, Nick Campbell, Bernd Möbius
Perceiving emotions by ear and by eye
Beatrice de Gelder
Strategies for automatic multi-tier annotation of spoken language corpora
Steven Greenberg
Why is the special structure of the language important for Chinese spoken language processing? - examples on spoken document retrieval, segmentation and summarization
Lin-shan Lee, Yuan Ho, Jia-fu Chen, Shun-Chuan Chen
Speech analysis with the short-time chirp transform
Luis Weruaga, Marian Kepesi
Glottal spectrum based inverse filtering
Ixone Arroabarren, Alfonso Carlosena
A novel method of analysing and comparing responses of hearing aid algorithms using auditory time-frequency representation
G.V. Kiran, T.V. Sreenivas
Frequency-related representation of speech
Kuldip K. Paliwal, Bishnu S. Atal
Tracking a moving speaker using excitation source information
Vikas C. Raykar, Ramani Duraiswami, B. Yegnanarayana, S.R. Mahadeva Prasanna
Tracking vocal tract resonances using an analytical nonlinear predictor and a target-guided temporal constraint
Li Deng, Issam Bazzi, Alex Acero
Optimization of the CELP model in the LSP domain
Khosrow Lashkari, Toshio Miki
Transforming voice quality
Ben Gillett, Simon King
DOA estimation of speech signal using equilateral-triangular microphone array
Yusuke Hioka, Nozomu Hamada
Multi-array fusion for beamforming and localization of moving speakers
Ilyas Potamitis, George Tremoulis, Nikos Fakotakis, George Kokkinakis
Integrated pitch and MFCC extraction for speech reconstruction and speech recognition applications
Xu Shao, Ben P. Milner, Stephen J. Cox
Exploiting time warping in AMR-NB and AMR-WB speech coders
Lasse Laaksonen, Sakari Himanen, Ari Heikkinen, Jani Nurminen
A new approach to voice activity detection based on self-organizing maps
Stephan Grashey
Estimating the spectral envelope of voiced speech using multi-frame analysis
Yoshinori Shiga, Simon King
Adaptive noise estimation using second generation and perceptual wavelet transforms
Essa Jafer, Abdulhussain E. Mahdi
A clustering approach to on-line audio source separation
Julien Bourgeois
Estimation of voice source and vocal tract characteristics based on multi-frame analysis
Yoshinori Shiga, Simon King
A new method for pitch prediction from spectral envelope and its application in voice conversion
Taoufik En-Najjary, Olivier Rosec, Thierry Chonavel
Maximum likelihood endpoint detection with time-domain features
Marco Orlandi, Alfiero Santarelli, Daniele Falavigna
Unified analysis of glottal source spectrum
Ixone Arroabarren, Alfonso Carlosena
Local regularity analysis at glottal opening and closure instants in electroglottogram signal using wavelet transform modulus maxima
Aicha Bouzid, Noureddine Ellouze
Improved robustness of automatic speech recognition using a new class definition in linear discriminant analysis
M. Schaffoner, M. Katz, S.E. Kruger, A. Wendemuth
Voice conversion methods for vocal tract and pitch contour modification
Oytun Turk, Levent M. Arslan
Modulation spectrum for pitch and speech pause detection
Olaf Schreiner
Robust energy demodulation based on continuous models with application to speech recognition
Dimitrios Dimitriadis, Petros Maragos
A robust and sensitive word boundary decision algorithm
Jong Uk Kim, SangGyun Kim, Chang D. Yoo
A novel transcoding algorithm for SMV and G.723.1 speech coders via direct parameter transformation
Seongho Seo, Dalwon Jang, Sunil Lee, Chang D. Yoo
A novel rate selection algorithm for transcoding CELP-type codec and SMV
Dalwon Jang, Seongho Seo, Sunil Lee, Chang D. Yoo
Subband-based acoustic shock limiting algorithm on a low-resource DSP system
G. Choy, D. Hermann, R.L. Brennan, T. Schneider, H. Sheikhzadeh, E. Cornu
Pitch estimation using phase locked loops
Patricia A. Pelle, Matias L. Capeletto
Performance evaluation of IFAS-based fundamental frequency estimator in noisy environment
Dhany Arifianto, Takao Kobayashi
Estimation of the parameters of the quantitative intonation model with continuous wavelet analysis
Hans Kruschke, Michael Lenz
Morphological filtering of speech spectrograms in the context of additive noise
Francisco Romero Rodriguez, Wei M. Liu, Nicholas W.D. Evans, John S.D. Mason
Segmenting multiple concurrent speakers using microphone arrays
Guillaume Lathoud, Iain A. McCowan, Darren C. Moore
Segmentation of speech into syllable-like units
T. Nagarajan, Hema A. Murthy, Rajesh M. Hegde
A syllable segmentation algorithm for English and Italian
Massimo Petrillo, Francesco Cutugno
Modeling speaking rate for voice fonts
Ashish Verma, Arun Kumar
A new HMM-based approach to broad phonetic classification of speech
Jouni Pohjalainen
Acoustic change detection and segment clustering of two-way telephone conversations
Xin Zhong, Mark A. Clements, Sung Lim
Blind normalization of speech from different channels
David N. Levin
Speech watermarking by parametric embedding with an ℓ∞ fidelity criterion
A.R. Gurijala, J.R. Deller Jr.
Features of contracted syllables of spontaneous Mandarin
Shu-Chuan Tseng
Durational characteristics of Hindi stop consonants
K. Samudravijaya
Quantity comparison of Japanese and Finnish in various word structures
Toshiko Isei-Jaakkola
Broad focus across sentence types in Greek
Mary Baltazani
Analysis and modeling of syllable duration for Thai speech synthesis
Chatchawarn Hansakunbuntheung, Virongrong Tesprasit, Rungkarn Siricharoenchai, Yoshinori Sagisaka
Reaction time as an indicator of discrete intonational contrasts in English
Aoju Chen
Corpus-based syntax-prosody tree matching
Dafydd Gibbon
A new approach to segment and detect syllables from high-speed speech
D.W. Ying, W. Gao, W.Q. Wang
Information structure and efficiency in speech production
R.J.J.H. van Son, Louis C.W. Pols
Learning rule ranking by dynamic construction of context-free grammars using AND/OR graphs
Anna Corazza, Louis ten Bosch
The effect of surrounding phrase lengths on pause duration
Elena Zvonik, Fred Cummins
Statistical estimation of phoneme's most stable point based on universal constraint
Shigeki Okawa, Katsuhiko Shirai
Independent automatic segmentation by self-learning categorial pronunciation rules
N. Beringer
Prosodic correlates of contrastive and non-contrastive themes in German
Bettina Braun, D. Robert Ladd
Accentual lengthening in standard Chinese: evidence from four-syllable constituents
Yiya Chen
Syllable structure based phonetic units for context-dependent continuous Thai speech recognition
Supphanat Kanokphara
An acoustic phonetic analysis of diphthongs in Ningbo Chinese
Fang Hu
Latent ability to manipulate phonemes by Japanese preliterates in Roman alphabet
Takashi Otake, Yoko Sakamoto
The /i/-/a/-/u/-ness of spoken vowels
Hartmut R. Pfitzinger
Transforming F0 contours
Ben Gillett, Simon King
Evaluation of the affect of speech intonation using a model of the perception of interval dissonance and harmonic tension
Norman D. Cook, Takeshi Fujisawa, Kazuaki Takami
A new pitch modeling approach for Mandarin speech
Wen-Hsing Lai, Yih-Ru Wang, Sin-Horng Chen
Bayesian induction of intonational phrase breaks
P. Zervas, M. Maragoudakis, Nikos Fakotakis, George Kokkinakis
Predicting the perceptive judgment of voices in a telecom context: selection of acoustic parameters
T. Ehrette, N. Chateau, Christophe d'Alessandro, V. Maffiolo
Stress-based speech segmentation revisited
Sven L. Mattys
Emotion recognition by speech signals
Oh-Wook Kwon, Kwokleung Chan, Jiucang Hao, Te-Won Lee
Automatic prosodic prominence detection in speech using acoustic features: an unsupervised system
Fabio Tamburini
Improved emotion recognition with large set of statistical features
Vladimir Hozjan, Zdravko Kacic
Recognition of intonation patterns in Thai utterance
Patavee Charnvivit, Nuttakorn Thubthong, Ekkarit Maneenoi, Sudaporn Luksaneeyanawin, Somchai Jitapunkul
Use of linguistic information for automatic extraction of F0 contour generation process model parameters
Keikichi Hirose, Yusuke Furuyama, Shuichi Narusawa, Nobuaki Minematsu, Hiroya Fujisaki
Potential audiovisual correlates of contrastive focus in French
Marion Dohen, Hélène Loevenbruck, Marie-Agnes Cathiard, Jean-Luc Schwartz
How does human segment the speech by prosody?
Toshie Hatano, Yasuo Horiuchi, Akira Ichikawa
Language-reconfigurable universal phone recognition
B.D. Walker, B.C. Lackey, J.S. Muller, P.J. Schone
Emotion recognition using a data-driven fuzzy inference system
Chul Min Lee, Shrikanth Narayanan
Effects of voice prosody by computers on human behaviors
Noriko Suzuki, Yohei Yabuta, Yugo Takeuchi, Yasuhiro Katagiri
An investigation of intensity patterns for German
Oliver Jokisch, Marco Kuhne
Segmental durations predicted with a neural network
Joao Paulo Teixeira, Diamantino Freitas
Generation and perception of F0 markedness in conversational speech with adverbs expressing degrees
Takumi Yamashita, Yoshinori Sagisaka
Quantitative analysis and synthesis of syllabic tones in Vietnamese
Hansjorg Mixdorff, Nguyen Hung Bach, Hiroya Fujisaki, Mai Chi Luong
Japanese prosodic labeling support system utilizing linguistic information
Shinya Kiriyama, Yoshifumi Mitsuta, Yuta Hosokawa, Yoshikazu Hashimoto, Toshihiko Ito, Shigeyoshi Kitazawa
Why and how to control the authentic emotional speech corpora
Veronique Auberge, Nicolas Audibert, Albert Rilliard
Prosodic cues for emotion characterization in real-life spoken dialogs
Laurence Devillers, Ioana Vasilescu
Towards the automatic generation of mixed-initiative dialogue systems from web content
Joseph Polifroni, Grace Chung, Stephanie Seneff
A context resolution server for the Galaxy conversational systems
Edward Filisko, Stephanie Seneff
Semantic and dialogic annotation for automated multilingual customer service
Hilda Hardy, Kirk Baker, Hélène Bonneau-Maynard, Laurence Devillers, Sophie Rosset, Tomek Strzalkowski
Disfluency under feedback and time-pressure
H.B.M. Nicholson, E.G. Bard, A.H. Anderson, M.L. Flecha-Garcia, D. Kenicer, L. Smallwood, J. Mullin, R.J. Lickley, Y. Chen
Control in task-oriented dialogues
Peter A. Heeman, Fan Yang, Susan E. Strayer
The 300k LIMSI German broadcast news transcription system
Kevin McTait, Martine Adda-Decker
Weighted entropy training for the decision tree based text-to-phoneme mapping
Jilei Tian, Janne Suontausta, Juha Hakkinen
Word class modeling for speech recognition with out-of-task words using a hierarchical language model
Yoshihiko Ogawa, Hirofumi Yamamoto, Yoshinori Sagisaka, Genichiro Kikui
Compound decomposition in Dutch large vocabulary speech recognition
Roeland Ordelman, Arjan van Hessen, Franciska de Jong
Designing for errors: similarities and differences of disfluency rates and prosodic characteristics across domains
Guergana Savova, Joan Bachenko
Syllable classification using articulatory-acoustic features
Mirjam Wester
Hierarchical class n-gram language models: towards better estimation of unseen events in speech recognition
Imed Zitouni, Olivier Siohan, Chin-Hui Lee
Incremental and iterative monolingual clustering algorithms
Sergio Barrachina, Juan Miguel Vilar
Techniques for effective vocabulary selection
Anand Venkataraman, Wen Wang
Recognition of out-of-vocabulary words with sub-lexical language models
Lucian Galescu
A semantic representation for spoken dialogs
Hélène Bonneau-Maynard, Sophie Rosset
A corpus-based decompounding algorithm for German lexical modeling in LVCSR
Martine Adda-Decker
Modeling cross-morpheme pronunciation variations for Korean large vocabulary continuous speech recognition
Kyong-Nim Lee, Minhwa Chung
Unit selection based on voice recognition
Yi Zhou, Yiqing Zu
On unit analysis for Cantonese corpus-based TTS
Jun Xu, Thomas Choy, Minghui Dong, Cuntai Guan, Haizhou Li
Unit selection in concatenative TTS synthesis systems based on mel filter bank amplitudes and phonetic context
T. Lambert, Andrew P. Breen, Barry Eggleton, Stephen J. Cox, Ben P. Milner
Text design for TTS speech corpus building using a modified greedy selection
Baris Bozkurt, Ozlem Ozturk, Thierry Dutoit
Discriminative weight training for unit-selection based speech synthesis
Seung Seop Park, Chong Kyu Kim, Nam Soo Kim
The application of interactive speech unit selection in TTS systems
Peter Rutten, Justin Fackrell
On the design of cost functions for unit-selection speech synthesis
Francisco Campillo Diaz, Eduardo R. Banga
Kalman-filter based join cost for unit-selection speech synthesis
Jithendra Vepa, Simon King
Optimizing integrated cost function for segment selection in concatenative speech synthesis based on perceptual evaluations
Tomoki Toda, Hisashi Kawai, Minoru Tsuzaki
Automatic segmentation for Czech concatenative speech synthesis using statistical approach with boundary-specific correction
Jindrich Matousek, Daniel Tihelka, Josef Psutka
Automatic speech segmentation and verification for concatenative synthesis
Chih-Chung Kuo, Chi-Shiang Kuo, Jau-Hung Chen, Sen-Chia Chang
DTW-based phonetic alignment using multiple acoustic features
Sergio Paulo, Luis C. Oliveira
Evaluating and correcting phoneme segmentation for unit selection synthesis
John Kominek, Christina L. Bennett, Alan W. Black
Control and prediction of the impact of pitch modification on synthetic speech quality
Esther Klabbers, Jan P.H. van Santen
My voice, your prosody: sharing a speaker specific prosody model across speakers in unit selection TTS
Matthew Aylett, Justin Fackrell, Peter Rutten
Learning phrase break detection in Thai text-to-speech
Virongrong Tesprasit, Paisarn Charoenpornsawat, Virach Sornlertlamvanich
A speech model of acoustic inventories based on asynchronous interpolation
Alexander B. Kain, Jan P.H. van Santen
Corpus-based synthesis of fundamental frequency contours of Japanese using automatically-generated prosodic corpus and generation process model
Keikichi Hirose, Takayuki Ono, Nobuaki Minematsu
Unit size in unit selection speech synthesis
S.P. Kishore, Alan W. Black
Restricted unlimited domain synthesis
Antje Schweitzer, Norbert Braunschweiler, Tanja Klankert, Bernd Möbius, Bettina Sauberlich
Evaluation of units selection criteria in corpus-based speech synthesis
Hélène Francois, Olivier Boeffard
Combining non-uniform unit selection with diphone based synthesis
Michael Pucher, Friedrich Neubarth, Erhard Rank, Georg Niklfeld, Qi Guan
Evolutionary weight tuning based on diphone pairs for unit selection speech synthesis
Francesc Alias, Xavier Llora
Keeping rare events rare
Ove Andersen, Charles Hoequist
Analysis of the Aurora large vocabulary evaluations
N. Parihar, Joseph Picone
Evaluation of quantile based histogram equalization with filter combination on the Aurora 3 and 4 databases
Florian Hilger, Hermann Ney
Large vocabulary noise robustness on Aurora4
Luca Rigazio, Patrick Nguyen, David Kryze, Jean-Claude Junqua
Evaluation of model-based feature enhancement on the AURORA-4 task
Veronique Stouten, Hugo van Hamme, Jacques Duchateau, Patrick Wambacq
Improved feature extraction based on spectral noise reduction and nonlinear feature normalization
Jose C. Segura, Javier Ramirez, Carmen Benitez, Angel de la Torre, Antonio J. Rubio
Feature compensation technique for robust speech recognition in noisy environments
Young Joon Kim, Hyun Woo Kim, Woohyung Lim, Nam Soo Kim
The statistical approach to machine translation and a roadmap for speech translation
Hermann Ney
Coupling vs. unifying: modeling techniques for speech-to-speech translation
Yuqing Gao
Speechalator: two-way speech-to-speech translation on a consumer PDA
Alex Waibel, Ahmed Badran, Alan W. Black, Robert Frederking, Donna Gates, Alon Lavie, Lori Levin, Kevin A. Lenzo, Laura Mayfield Tomokiyo, Jurgen Reichert, Tanja Schultz, Dorcas Wallace, Monika Woszczyna, Jing Zhang
Development of phrase translation systems for handheld computers: from concept to field
Horacio Franco, Jing Zheng, Kristin Precoda, Federico Cesari, Victor Abrash, Dimitra Vergyri, Anand Venkataraman, Harry Bratt, Colleen Richey, Ace Sarich
Evaluation frameworks for speech translation technologies
Marcello Federico
Creating corpora for speech-to-speech translation
Genichiro Kikui, Eiichiro Sumita, Toshiyuki Takezawa, Seiichi Yamamoto
Prosodic analysis and modeling of the NAGAUTA singing to synthesize its prosodic patterns from the standard notation
Nobuaki Minematsu, Bungo Matsuoka, Keikichi Hirose
Statistical evaluation of the influence of stress on pitch frequency and phoneme durations in Farsi language
D. Gharavian, S.M. Ahadi
Prosody dependent speech recognition with explicit duration modelling at intonational phrase boundaries
K. Chen, S. Borys, Mark Hasegawa-Johnson, J. Cole
Prediction of Fujisaki model's phrase commands
Joao Paulo Teixeira, Diamantino Freitas, Hiroya Fujisaki
Corpus-based modeling of naturalness estimation in timing control for non-native speech
Makiko Muto, Yoshinori Sagisaka, Takuro Naito, Daiju Maeki, Aki Kondo, Katsuhiko Shirai
Perceptually-related acoustic-prosodic features of phrase finals in spontaneous speech
Carlos Toshinori Ishi, Parham Mokhtari, Nick Campbell
Efficient linear combination for distant n-gram models
David Langlois, Kamel Smaili, Jean-Paul Haton
Improving a connectionist based syntactical language model
Ahmad Emami
Using untranscribed user utterances for improving language models based on confidence scoring
Mikio Nakano, Timothy J. Hazen
Improved Chinese broadcast news transcription by language modeling with temporally consistent training corpora and iterative phrase extraction
Pi-Chuan Chang, Shuo-Peng Liao, Lin-shan Lee
Language model adaptation using word clustering
Shinsuke Mori, Masafumi Nishimura, Nobuyasu Itoh
Hierarchical topic classification for dialog speech recognition based on language model switching
Ian R. Lane, Tatsuya Kawahara, Tomoko Matsui, Satoshi Nakamura
Linear predictive method with low-frequency emphasis
Paavo Alku, Tom Backstrom
Beyond a single critical-band in TRAP based ASR
Pratibha Jain, Hynek Hermansky
Variational Bayesian GMM for speech recognition
Fabio Valente, Christian Wellekens
Time alignment for scenario and sounds with voice, music and BGM
Yamato Wada, Masahide Sugiyama
Efficient quantization of speech excitation parameters using temporal decomposition
Phu Chien Nguyen, Masato Akagi
Distributed genetic algorithm to discover a wavelet packet best basis for speech recognition
Robert van Kommer, Beat Hirsbrunner
New model-based HMM distances with applications to run-time ASR error estimation and model tuning
Chao-Shih Huang, Chin-Hui Lee, Hsiao-Chuan Wang
Analysis of voice source characteristics using a constrained polynomial model
Tokihiko Kaburagi, Koji Kawai
Tone pattern discrimination combining parametric modeling and maximum likelihood estimation
Jinfu Ni, Hisashi Kawai
Feature selection for the classification of crosstalk in multi-channel audio
Stuart N. Wrigley, Guy J. Brown, Vincent Wan, Steve Renals
A DTW-based DAG technique for speech and speaker feature analysis
Jingwei Liu
Feature transformations and combinations for improving ASR performance
Panu Somervuo, Barry Chen, Qifeng Zhu
On the role of intonation in the organization of Mandarin Chinese speech prosody
Chiu-yu Tseng
An optimized multi-duration HMM for spontaneous speech recognition
Yuichi Ohkawa, Akihiro Yoshida, Motoyuki Suzuki, Akinori Ito, Shozo Makino
Speaker recognition using MPEG-7 descriptors
Hyoung-Gook Kim, Edgar Berdahl, Nicolas Moreau, Thomas Sikora
A comparative study on maximum entropy and discriminative training for acoustic modeling in automatic speech recognition
Wolfgang Macherey, Hermann Ney
Extraction methods of voicing feature for robust speech recognition
Andras Zolnay, Ralf Schluter, Hermann Ney
Use of a CSP-based voice activity detector for distant-talking ASR
Luca Armani, Marco Matassoni, Maurizio Omologo, Piergiorgio Svaizer
Maximum conditional mutual information projection for speech recognition
Mohamed Kamal Omar, Mark Hasegawa-Johnson
A computational model of arm gestures in conversation
Dafydd Gibbon, Ulrike Gut, Benjamin Hell, Karin Looks, Alexandra Thies, Thorsten Trippel
Nonlinear analysis of speech signals: generalized dimensions and Lyapunov exponents
Vassilis Pitsikalis, Iasonas Kokkinos, Petros Maragos
Time-domain based temporal processing with application of orthogonal transformations
Petr Motlicek, Jan Cernocký
Recognition of phoneme strings using TRAP technique
Petr Schwarz, Pavel Matejka, Jan Cernocký
Comparative study on Hungarian acoustic model sets and training methods
Tibor Fegyo, Peter Mihajlik, Peter Tatai
F0 estimation of one or several voices
Alain de Cheveigne, Alexis Baskind
In search of target class definition in tandem feature extraction
Sunil Sivadas, Hynek Hermansky
Segmentation of speech for speaker and language recognition
Andre G. Adami, Hynek Hermansky
Feature generation based on maximum classification probability for improved speech recognition
Xiang Li, Richard M. Stern
Speech recognition with a generative factor analyzed hidden Markov model
Kaisheng Yao, Kuldip K. Paliwal, Te-Won Lee
Learning discriminative temporal patterns in speech: development of novel TRAPS-like classifiers
Barry Chen, Shuangyu Chang, Sunil Sivadas
Using mutual information to design class-specific phone recognizers
Patricia Scanlon, Daniel P.W. Ellis, Richard Reilly
Estimation of GMM in voice conversion including unaligned data
Helenca Duxans, Antonio Bonafonte
Trajectory modeling based on HMMs with the explicit relationship between static and dynamic features
Keiichi Tokuda, Heiga Zen, Tadashi Kitamura
On the advantage of frequency-filtering features for speech recognition with variable sampling frequencies. Experiments with SpeechDat-Car databases
Hermann Bauerecker, Climent Nadeu, Jaume Padrell
Towards the automatic extraction of Fujisaki model parameters for Mandarin
Hansjorg Mixdorff, Hiroya Fujisaki, Gao Peng Chen, Yu Hu
Product of Gaussians as a distributed representation for speech recognition
S.S. Airey, M.J.F. Gales
Harmonic weighting for all-pole modeling of the voiced speech
Davor Petrinovic
Estimation of resonant characteristics based on AR-HMM modeling and spectral envelope conversion of vowel sounds
Nobuyuki Nishizawa, Keikichi Hirose, Nobuaki Minematsu
Band-independent speech-event categories for TRAP based ASR
Hynek Hermansky, Pratibha Jain
Local averaging and differentiating of spectral plane for TRAP-based ASR
Frantisek Grezl, Hynek Hermansky
Minimum variance distortionless response on a warped frequency scale
Matthias Wolfel, John McDonough, Alex Waibel
Improving the efficiency of automatic speech recognition by feature transformation and dimensionality reduction
Xuechuan Wang, Douglas O'Shaughnessy
Distributed speech recognition on the WSJ task
Jan Stadermann, Gerhard Rigoll
Integrating multilingual articulatory features into speech recognition
Sebastian Stuker, Florian Metze, Tanja Schultz, Alex Waibel
Locus equations determination using the SpeechDat(II)
Bojan Petek
A memory-based approach to Cantonese tone recognition
Michael Emonts, Deryle Lonsdale
Experimental evaluation of the relevance of prosodic features in Spanish using machine learning techniques
David Escudero, Valentin Cardenoso, Antonio Bonafonte
Dominance spectrum based V/UV classification and F0 estimation
Tomohiro Nakatani, Toshio Irino, Parham Zolfaghari
Analysis and modeling of F0 contours of Portuguese utterances based on the command-response model
Hiroya Fujisaki, Shuichi Narusawa, Sumio Ohno, Diamantino Freitas
Covariation and weighting of harmonically decomposed streams for ASR
Philip J.B. Jackson, David M. Moreno, Martin J. Russell, Javier Hernando
A semi-blind source separation method for hands-free speech recognition of multiple talkers
Panikos Heracleous, Satoshi Nakamura, Kiyohiro Shikano
Influence of the waveguide propagation on the antenna performance in a car cabin
Leonid Krasny, Ali Khayrallah
Multi-speaker DOA tracking using interactive multiple models and probabilistic data association
Ilyas Potamitis, George Tremoulis, Nikos Fakotakis
Speech enhancement using weighting function based on the variance of wavelet coefficients
Ching-Ta Lu, Hsiao-Chuan Wang
Microphone array voice activity detection and noise suppression using wideband generalized likelihood ratio
Ilyas Potamitis, Eran Fishler
Adaptive beamforming in room with reverberation
Zoran Saric, Slobodan Jovicic
Perceptually-constrained generalized singular value decomposition-based approach for enhancing speech corrupted by colored noise
Gwo-hwa Ju, Lin-shan Lee
Blind separation and deconvolution for convolutive mixture of speech using SIMO-model-based ICA and multichannel inverse filtering
Hiroaki Yamajo, Hiroshi Saruwatari, Tomoya Takatani, Tsuyoki Nishikawa, Kiyohiro Shikano
Quality enhancement of CELP coded speech by using an MFCC based Gaussian mixture model
D.G. Raza, C.F. Chan
Enhancement of noisy speech for noise robust front-end and speech reconstruction at back-end of DSR system
Hyoung-Gook Kim, Markus Schwab, Nicolas Moreau, Thomas Sikora
Improved Kalman filter-based speech enhancement
Jianqiang Wei, Limin Du, Zhaoli Yan, Hui Zeng
Speech segregation based on fundamental event information using an auditory vocoder
Toshio Irino, Roy D. Patterson, Hideki Kawahara
Time delay estimation based on hearing characteristic
Zhaoli Yan, Limin Du, Jianqiang Wei, Hui Zeng
Parametric multi-band automatic gain control for noisy speech enhancement
M. Stolbov, S. Koval, M. Khitrov
Neural networks versus codebooks in an application for bandwidth extension of speech signals
Bernd Iser, Gerhard Schmidt
Wavelet-based perceptual speech enhancement using adaptive threshold estimation
Essa Jafer, Abdulhussain E. Mahdi
A trainable speech enhancement technique based on mixture models for speech and noise
Ilyas Potamitis, Nikos Fakotakis, George Kokkinakis
Perceptual wavelet adaptive denoising of speech
Qiang Fu, Eric A. Wan
Enhancement of speech in multispeaker environment
B. Yegnanarayana, S.R. Mahadeva Prasanna, Mathew Magimai Doss
Noise reduction using paired-microphones on non-equally-spaced microphone arrangement
Mitsunori Mizumachi, Satoshi Nakamura
Improving speech intelligibility by steady-state suppression as pre-processing in small to medium sized halls
Nao Hodoshima, Takayuki Arai, Tsuyoshi Inoue, Keisuke Kinoshita, Akiko Kusumoto
Enhancement of hearing-impaired Mandarin speech
Chen-Long Lee, Ya-Ru Yang, Wen-Whei Chang, Yuan-Chuan Chiang
Speech enhancement for a car environment using LP residual signal and spectral subtraction
A. Alvarez, V. Nieto, P. Gomez, R. Martinez
Speech enhancement and improved recognition accuracy by integrating wavelet transform and spectral subtraction algorithm
Gwo-hwa Ju, Lin-shan Lee
Multi-referenced correction of the voice timbre distortions in telephone networks
Gael Mahe, Andre Gilloire
Efficient speech enhancement based on left-right HMM with state sequence detection using LRT
J.J. Lee, J.H. Lee, K.Y. Lee
Introduction of the CELP structure of the GSM coder in the acoustic echo canceller for the GSM network
H. Gnaba, M. Turki-Hadj Alouane, M. Jaidane-Saidane, P. Scalart
Extracting an AV speech source from a mixture of signals
David Sodoyer, Laurent Girin, Christian Jutten, Jean-Luc Schwartz
Speech enhancement for hands-free car phones by adaptive compensation of harmonic engine noise components
Henning Puder
Enhance low-frequency suppression of GSC beamforming
Zhaorong Hou, Ying Jia
Speech enhancement using a-priori information
Sriram Srinivasan, Jonas Samuelsson, W. Bastiaan Kleijn
Blind inversion of multidimensional functions for speech enhancement
John Hogden, Patrick Valdez, Shigeru Katagiri, Erik McDermott
Convergence improvement for oversampled subband adaptive noise and echo cancellation
H.R. Abutalebi, H. Sheikhzadeh, R.L. Brennan, G.H. Freeman
A speech dereverberation method based on the MTF concept
Masashi Unoki, Keigo Sakata, Masato Akagi
Accuracy improved double-talk detector based on state transition diagram
SangGyun Kim, Jong Uk Kim, Chang D. Yoo
Perceptual based speech enhancement for normal-hearing and hearing-impaired individuals
Ajay Natarajan, John H.L. Hansen, Kathryn Arehart, Jessica A. Rossi-Katz
Residual echo power estimation for speech reinforcement systems in vehicles
Alfonso Ortega, Eduardo Lleida, Enrique Masgrau
Dual-mode wideband speech recovery from narrowband speech
Yasheng Qian, Peter Kabal
A robust noise and echo canceller
Khaldoon Al-Naimi, Christian Sturt, Ahmet Kondoz
Computational auditory scene analysis by using statistics of high-dimensional speech dynamics and sound source direction
Johannes Nix, Michael Kleinschmidt, Volker Hohmann
Two studies of open vs. directed dialog strategies in spoken dialog systems
Silke M. Witt, Jason D. Williams
The Queen's Communicator: an object-oriented dialogue manager
Ian O'Neill, Philip Hanna, Xingkun Liu, Michael McTear
RavenClaw: dialog management using hierarchical task decomposition and an expectation agenda
Dan Bohus, Alexander I. Rudnicky
Features for tree based dialogue course management
Klaus Macherey, Hermann Ney
Development of a stochastic dialog manager driven by semantics
Francisco Torres, Emilio Sanchis, Encarna Segarra
Generation of natural response timing using decision tree based on prosodic and linguistic information
Masashi Takeuchi, Norihide Kitaoka, Seiichi Nakagawa
Child and adult speaker adaptation during error resolution in a publicly available spoken dialogue system
Linda Bell, Joakim Gustafson
Conceptual decoding for spoken dialog systems
Yannick Esteve, Christian Raymond, Frédéric Bechet, Renato De Mori
Sentence verification in spoken dialogue system
Huei-Ming Wang, Yi-Chung Lin
Detection and recognition of correction utterance in spontaneously spoken dialog
Norihide Kitaoka, Naoko Kakutani, Seiichi Nakagawa
Topic-specific parser design in an air travel natural language understanding application
Chaitanya J.K. Ekanadham, Juan M. Huerta
The use of confidence measures in vector based call-routing
Stephen J. Cox, Gavin Cawley
Multi-channel sentence classification for spoken dialogue language modeling
Frédéric Bechet, Giuseppe Riccardi, Dilek Z. Hakkani-Tur
Automatic induction of n-gram language models from a natural language grammar
Stephanie Seneff, Chao Wang, Timothy J. Hazen
Connectionist classification and specific stochastic models in the understanding process of a dialogue system
David Vilar, Maria Jose Castro, Emilio Sanchis
Robust parsing of utterances in negotiative dialogue
Johan Boye, Mats Wiren
Flexible speech act identification of spontaneous speech with disfluency
Chung-Hsien Wu, Gwo-Lang Yan
Efficient spoken dialogue control depending on the speech recognition rate and system's database
Kohji Dohsaka, Norihito Yasuda, Kiyoaki Aikawa
Robust speech understanding based on expected discourse plan
Shin-ya Takahashi, Tsuyoshi Morimoto, Sakashi Maeda, Naoyuki Tsuruta
A study on domain recognition of spoken dialogue systems
T. Isobe, S. Hayakawa, H. Murao, T. Mizutani, Kazuya Takeda, Fumitada Itakura
Domain adaptation augmented by state-dependence in spoken dialog systems
Wei He, Honglian Li, Baozong Yuan
SmartKom-Home - an advanced multi-modal interface to home entertainment
Thomas Portele, Silke Goronzy, Martin Emele, Andreas Kellner, Sunna Torge, Jurgen te Vrugt
Methods to improve its portability of a spoken dialog system both on task domains and languages
Yunbiao Xu, Fengying Di, Masahiro Araki, Yasuhisa Niimi
Voxenter™ - intelligent voice enabled call center for Hungarian
Tibor Fegyo, Peter Mihajlik, Mate Szarvas, Peter Tatai, Gabor Tatai
Automatic call-routing without transcriptions
Qiang Huang, Stephen J. Cox
Jaspis^2 - an architecture for supporting distributed spoken dialogues
Markku Turunen, Jaakko Hakulinen
Development of a bilingual spoken dialog system for weather information retrieval
Janez Zibert, Sanda Martincic-Ipsic, Melita Hajdinjak, Ivo Ipsic, France Mihelic
Improving "How may I help you?" systems using the output of recognition lattices
James Allen, David Attwater, Peter Durston, Mark Farrell
Incremental learning of new user formulations in automatic directory assistance
M. Andorno, L. Fissore, P. Laface, M. Nigra, C. Popovici, F. Ravera, C. Vair
Dialog systems for automotive environments
Julie A. Baca, Feng Zheng, Hualin Gao, Joseph Picone
The development of a multi-purpose spoken dialogue system
Joao P. Neto, Nuno J. Mamede, Renato Cassaca, Luis C. Oliveira
The dynamic, multi-lingual lexicon in SmartKom
Silke Goronzy, Zica Valsan, Martin Emele, Juergen Schimanowski
Evaluating discourse understanding in spoken dialogue systems
Ryuichiro Higashinaka, Noboru Miyazaki, Mikio Nakano, Kiyoaki Aikawa
Assessment of spoken dialogue system usability - what are we really measuring?
Lars Bo Larsen
Evaluation of a speech-driven telephone information service using the PARADISE framework: a closer look at subjective measures
Paula M.T. Smeele, Juliette A.J.S. Waals
Quantifying the impact of system characteristics on perceived quality dimensions of a spoken dialogue service
Sebastian Moller, Janto Skowronek
A programmable policy manager for conversational biometrics
Ganesh N. Ramaswamy, Ran D. Zilca, Oleg Alecksandrovich
Integration of speaker recognition into conversational spoken dialogue systems
Timothy J. Hazen, Douglas A. Jones, Alex Park, Linda C. Kukolich, Douglas A. Reynolds
Normalization of time-derivative parameters using histogram equalization
Yasunari Obuchi, Richard M. Stern
Tree-structured noise-adapted HMM modeling for piecewise linear-transformation-based adaptation
Zhipeng Zhang, Kiyotaka Otsuji, Sadaoki Furui
Maximum likelihood sub-band weighting for robust speech recognition
Donglai Zhu, Satoshi Nakamura, Kuldip K. Paliwal, Renhua Wang
Feature compensation scheme based on parallel combined mixture model
Wooil Kim, Sungjoo Ahn, Hanseok Ko
A comparison of three non-linear observation models for noisy speech features
Jasha Droppo, Li Deng, Alex Acero
A new supervised-predictive compensation scheme for noisy speech recognition
Khalid Daoudi, Murat Deviren
Statistical methods and Bayesian interpretation of evidence in forensic automatic speaker recognition
Andrzej Drygajlo, Didier Meuwly, Anil Alexander
Robust likelihood ratio estimation in Bayesian forensic speaker recognition
J. Gonzalez-Rodriguez, D. Garcia-Romero, M. Garcia-Gomar, D. Ramos-Castro, J. Ortega-Garcia
Automated speaker recognition in real world conditions: controlling the uncontrollable
Hirotaka Nakasone
Estimating the weight of evidence in forensic speaker verification
Beat Pfister, Rene Beutler
Auditory-instrumental forensic speaker recognition
Stefan Gfroerer
Earwitness line-ups: effects of speech duration, retention interval and acoustic environment on identification accuracy
J.H. Kerstholt, E.J.M. Jansen, A.G. van Amelsvoort, A.P.A. Broeders
Characteristics of authentic anger in Hebrew speech
Noam Amir, Shirley Ziv, Rachel Cohen
Prosody-based classification of emotions in spoken Finnish
Tapio Seppanen, Eero Vayrynen, Juhani Toivanen
Frequency distribution based weighted sub-band approach for classification of emotional/stressful content in speech
Mandar A. Rahurkar, John H.L. Hansen
Classifying subject ratings of emotional speech using acoustic features
Jackson Liscombe, Jennifer Venditti, Julia Hirschberg
Recognition of emotions in interactive voice response systems
Sherif Yacoub, Steve Simske, Xiaofan Lin, John Burns
We are not amused - but how do you know? user states in a multi-modal dialogue system
Anton Batliner, Viktor Zeissler, Carmen Frank, Johann Adelhardt, Rui P. Shi, Elmar Nöth
On-line user modelling in a mobile spoken dialogue system
Niels Ole Bernsen
Towards dynamic multi-domain dialogue processing
Botond Pakucs
User modeling in spoken dialogue systems for flexible guidance generation
Kazunori Komatani, Shinichi Ueno, Tatsuya Kawahara, Hiroshi G. Okuno
Empowering end users to personalize dialogue systems through spoken interaction
Stephanie Seneff, Grace Chung, Chao Wang
Let's Go: improving spoken dialog systems for the elderly and non-natives
Antoine Raux, Brian Langner, Alan W. Black, Maxine Eskenazi
Agents for integrated tutoring in spoken dialogue systems
Jaakko Hakulinen, Markku Turunen, Esa-Pekka Salonen
Utterance verification under distributed detection and fusion framework
Taeyoon Kim, Hanseok Ko
Joint estimation of thresholds in a bi-threshold verification problem
Simon Ho, Brian Mak
Confidence measures for phonetic segmentation of continuous speech
Samir Nefti, Olivier Boeffard, Thierry Moudenc
Using confidence measures and domain knowledge to improve speech recognition
Pascal Wiggers, Leon J.M. Rothkrantz
Isolated word verification using cohort word-level verification
K. Thambiratnam, Sridha Sridharan
A new approach to minimize utterance verification error rate for a specific operating point
Wing-Hei Au, Man-Hung Siu
Continuous speech recognition and verification based on a combination score
Binfeng Yan, Rui Guo, Xiaoyan Zhu
Impact of word graph density on the quality of posterior probability based confidence measures
Tibor Fabian, Robert Lieb, Gunther Ruske, Matthias Thomae
An efficient keyword spotting technique using a complementary language for filler models training
Panikos Heracleous, Tohru Shimizu
Context-sensitive evaluation and correction of phone recognition output
Michael Levit, Hiyan Alshawi, Allen Gorin, Elmar Nöth
Estimating speech recognition error rate without acoustic test data
Yonggang Deng, Milind Mahajan, Alex Acero
Multigram-based grapheme-to-phoneme conversion for LVCSR
M. Bisani, Hermann Ney
Integrating statistical and rule-based knowledge for continuous German speech recognition
Rene Beutler, Beat Pfister
A fast, accurate and stream-based speaker segmentation and clustering algorithm
An Vandecatseye, Jean-Pierre Martens
A sequential metric-based audio segmentation method via the Bayesian information criterion
Shi-sian Cheng, Hsin-Min Wang
Sentence boundary detection in Arabic speech
Amit Srivastava, Francis Kubala
Automated transcription and topic segmentation of large spoken archives
Martin Franz, Bhuvana Ramabhadran, Todd Ward, Michael Picheny
Automatic disfluency identification in conversational speech using multiple knowledge sources
Yang Liu, Elizabeth Shriberg, Andreas Stolcke
Topic segmentation and retrieval system for lecture videos based on spontaneous speech recognition
Natsuo Yamamoto, Jun Ogata, Yasuo Ariki
Hybrid HMM/BN ASR system integrating spectrum and articulatory features
Konstantin Markov, Jianwu Dang, Yosuke Iizuka, Satoshi Nakamura
Context-dependent output densities for hidden Markov models in speech recognition
Georg Stemmer, Viktor Zeissler, Christian Hacker, Elmar Nöth, Heinrich Niemann
Time adjustable mixture weights for speaking rate fluctuation
Takahiro Shinozaki, Sadaoki Furui
A switching linear Gaussian hidden Markov model and its application to nonstationary noise compensation for robust speech recognition
Jian Wu, Qiang Huo
On factorizing spectral dynamics for robust speech recognition
Vivek Tyagi, Iain A. McCowan, Hervé Bourlard, Hemant Misra
Joint model and feature based compensation for robust speech recognition under non-stationary noise environments
Chuan Jia, Peng Ding, Bo Xu
Weighted automata kernels - general framework and algorithms
Corinna Cortes, Patrick Haffner, Mehryar Mohri
Large margin methods for label sequence learning
Yasemin Altun, Thomas Hofmann
Robust multi-class boosting
Gunnar Ratsch
Statistical signal processing with nonnegativity constraints
Lawrence K. Saul, Fei Sha, Daniel D. Lee
Inline updates for HMMs
Ashutosh Garg, Manfred K. Warmuth
Factorial models and refiltering for speech separation and denoising
Sam T. Roweis
Using corpus-based methods for spoken access to news texts on the web
Alexandra Klein, Harald Trost
Cross-modal informational masking due to mismatched audio cues in a speechreading task
Douglas S. Brungart, Brian D. Simpson, Alex Kordik
Audiovisual speech enhancement based on the association between speech envelope and video features
Frédéric Berthommier
Robust speech interaction in a mobile environment through the use of multiple and different media input types
Rainer Wasinger, Christoph Stahl, Antonio Krueger
Speech-based, manual-visual, and multi-modal interaction with an in-car computer - evaluation of a pilot study
Rogier Woltjer, Wah Jin Tan, Fang Chen
Bayesian networks for spoken dialogue management in multimodal systems of tour-guide robots
Plamen Prodanov, Andrzej Drygajlo
Optimization of window and LSF interpolation factor for the ITU-T G.729 speech coding standard
Wai C. Chu, Toshio Miki
Likelihood ratio test with complex Laplacian model for voice activity detection
Joon-Hyuk Chang, Jong-Won Shin, Nam Soo Kim
Multi-mode quantization of adjacent speech parameters using a low-complexity prediction scheme
Jani Nurminen
Multi-mode matrix quantizer for low bit rate LSF quantization
Ulpu Sinervo, Jani Nurminen, Ari Heikkinen, Jukka Saarinen
Voicing controlled frame loss concealment for adaptive multi-rate (AMR) speech frames in voice-over-IP
Frank Mertz, Herve Taddei, Imre Varga, Peter Vary
Perceptual irrelevancy removal in narrowband speech coding
Marja Lahdekorpi, Jani Nurminen, Ari Heikkinen, Jukka Saarinen
Very-low-rate speech compression by indexation of polyphones
Charles du Jeu, Maurice Charbit, Gérard Chollet
Entropy-optimized channel error mitigation with application to speech recognition over wireless
Victoria Sanchez, Antonio M. Peinado, Angel M. Gomez, Jose L. Perez-Cordoba
Robust jointly optimized multistage vector quantization for speech coding
Venkatesh Krishnan, David V. Anderson
Polar quantization of sinusoids from speech signal blocks
Harald Pobloth, Renat Vafin, W. Bastiaan Kleijn
Transcoding algorithm for G.723.1 and AMR speech coders: for interoperability between VoIP and mobile networks
Sung-Wan Yoon, Jin-Kyu Choi, Hong-Goo Kang, Dae-Hee Youn
Quality-complexity trade-off in predictive LSF quantization
Davorka Petrinovic, Davor Petrinovic
Variable bit rate control with trellis diagram approximation
Kei Kikuiri, Nobuhiko Naka, Tomoyuki Ohya
Towards optimal encoding for classification with applications to distributed speech recognition
Naveen Srinivasamurthy, Antonio Ortega, Shrikanth Narayanan
Multi-rate extension of the scalable to lossless PSPIHT audio coder
Mohammed Raad, Ian Burnett, Alfred Mertins
Entropy constrained quantization of LSP parameters
Turaj Zakizadeh Shabestary, Per Hedelin, Fredrik Norden
Named entity extraction from Japanese broadcast news
Akio Kobayashi, Franz J. Och, Hermann Ney
Morpheme-based lexical modeling for Korean broadcast news transcription
Young-Hee Park, Dong-Hoon Ahn, Minhwa Chung
Data driven example based continuous speech recognition
Mathias De Wachter, Kris Demuynck, Dirk van Compernolle, Patrick Wambacq
Large vocabulary speaker independent isolated word recognition for embedded systems
Sergey Astrov, Bernt Andrassy
Low-latency incremental speech transcription in the SYNFACE project
Alexander Seward
Multilingual acoustic modeling using graphemes
S. Kanthak, Hermann Ney
A cross-media retrieval system for lecture videos
Atsushi Fujii, Katunobu Itou, Tomoyosi Akiba, Tetsuya Ishikawa
Building a test collection for speech-driven web retrieval
Atsushi Fujii, Katunobu Itou
Confidence measure driven scalable two-pass recognition strategy for large list grammars
Miroslav Novak, Diego Ruiz
An efficient, fast matching approach using posterior probability estimates in speech recognition
Sherif Abdou, Michael S. Scordilis
On lexicon creation for Turkish LVCSR
Kadri Hacioglu, Bryan Pellom, Tolga Ciloglu, Ozlem Ozturk, Mikko Kurimo, Mathias Creutz
Compiling large-context phonetic decision trees into finite-state transducers
Stanley F. Chen
Automatic summarization of broadcast news using structural features
Sameer Raj Maskey, Julia Hirschberg
A dynamic cross-reference pruning strategy for multiple feature fusion at decoder run time
Yonghong Yan, Chengyi Zheng, Jianping Zhang, Jielin Pan, Jiang Han, Jian Liu
Design of the CMU Sphinx-4 decoder
Paul Lamere, Philip Kwok, William Walker, Evandro Gouvea, Rita Singh, Bhiksha Raj, Peter Wolf
A new decoder design for large vocabulary Turkish speech recognition
Onur Cilingir, Mubeccel Demirekler
Automatic speech recognition with sparse training data for dysarthric speakers
Phil Green, James Carmichael, Athanassios Hatzis, Pam Enderby, Mark Hawley, Mark Parker
Prediction of sentence importance for speech summarization using prosodic parameters
Akira Inoue, Takayoshi Mikami, Yoichi Yamashita
An automatic singing transcription system with multilingual singing lyric recognizer and robust melody tracker
Chong-kai Wang, Ren-Yuan Lyu, Yuang-Chin Chiang
Speech shift: direct speech-input-mode switching through intentional control of voice pitch
Masataka Goto, Yukihiro Omoto, Katunobu Itou, Tetsunori Kobayashi
Evaluating multiple LVCSR model combination in NTCIR-3 speech-driven web retrieval task
Masahiko Matsushita, Hiromitsu Nishizaki, Takehito Utsuro, Yasuhiro Kodama, Seiichi Nakagawa
Semantic object synchronous understanding in SALT for highly interactive user interface
Kuansan Wang
Information retrieval based call classification
Jan Kneissler, Anne K. Kienappel, Dietrich Klakow
Using syllable-based indexing features and language models to improve German spoken document retrieval
Martha Larson, Stefan Eickeler
An empirical text transformation method for spontaneous speech synthesizers
Shiva Sundaram, Shrikanth Narayanan
A new approach to reducing alarm noise in speech
Yilmaz Gul, Aladdin M. Ariyaeeinia, Oliver Dewhirst
Improved name recognition with user modeling
Dong Yu, Kuansan Wang, Milind Mahajan, Peter Mau, Alex Acero
Speech recognition over Bluetooth wireless channels
Ziad Al Bawab, Ivo Locher, Jianxia Xue, Abeer Alwan
Speech starter: noise-robust endpoint detection by using filled pauses
Koji Kitayama, Masataka Goto, Katunobu Itou, Tetsunori Kobayashi
Automatic segmentation of film dialogues into phonemes and graphemes
Gilles Boulianne, Jean-Francois Beaumont, Patrick Cardinal, Michel Comeau, Pierre Ouellet, Pierre Dumouchel
Automated closed-captioning of live TV broadcast news in French
Julie Brousseau, Jean-Francois Beaumont, Gilles Boulianne, Patrick Cardinal, Claude Chapdelaine, Michel Comeau, Frédéric Osterrath, Pierre Ouellet
Automatic construction of unique signatures and confusable sets for natural language directory assistance applications
E.E. Jan, Benoit Maison, Lidia Mangu, Geoffrey Zweig
Recent enhancements in CU VOCAL for Chinese TTS-enabled applications
Helen M. Meng, Yuk-Chi Li, Tien-Ying Fung, Man-Cheuk Ho, Chi-Kin Keung, Tin-Hang Lo, Wai-Kit Lo, P.C. Ching
Evaluation of an alert system for selective dissemination of broadcast news
Isabel Trancoso, Joao P. Neto, Hugo Meinedo, Rui Amaral
Low complexity joint optimization of excitation parameters in analysis-by-synthesis speech coding
U. Mittal, J.P. Ashley, E.M. Cruz-Zeno
Named entity extraction from word lattices
James Horlock, Simon King
A topic classification system based on parametric trajectory mixture models
William Belfield, Herbert Gish
Model based noisy speech recognition with environment parameters estimated by noise adaptive speech recognition with prior
Kaisheng Yao, Kuldip K. Paliwal, Satoshi Nakamura
A harmonic-model-based front end for robust speech recognition
Michael L. Seltzer, Jasha Droppo, Alex Acero
A new perspective on feature extraction for robust in-vehicle speech recognition
Umit H. Yapanel, John H.L. Hansen
Speech recognition of double talk using SAFIA-based audio segregation
Toshiyuki Sekiya, Tetsuji Ogawa, Tetsunori Kobayashi
CFA-BF: a novel combined fixed/adaptive beamforming for robust speech recognition in real car environments
Xianxian Zhang, John H.L. Hansen
Audio-visual speech recognition in challenging environments
Gerasimos Potamianos, Chalapathy Neti
SYNFACE - a talking face telephone
Inger Karlsson, Andrew Faulkner, Giampiero Salvi
A voice-driven web browser for blind people
Bostjan Vesnicer, Janez Zibert, Simon Dobrisek, Nikola Pavesic, France Mihelic
Exploiting speech for recognizing elderly users to respond to their special needs
Christian Muller, Frank Wittig, Jorg Baus
Spoken language and e-inclusion
Alan F. Newell
Acoustic normalization of children's speech
Georg Stemmer, Christian Hacker, Stefan Steidl, Elmar Nöth
NIST 2003 language recognition evaluation
Alvin F. Martin, Mark A. Przybocki
Acoustic, phonetic, and discriminative approaches to automatic language identification
E. Singer, P.A. Torres-Carrasquillo, T.P. Gleason, W.M. Campbell, Douglas A. Reynolds
Using place name data to train language identification models
Stanley F. Chen, Benoit Maison
Use of trajectory models for automatic accent classification
Pongtep Angkititrakul, John H.L. Hansen
Language identification using parallel sub-word recognition - an ergodic HMM equivalence
V. Ramasubramanian, A.K.V. Sai Jayram, T.V. Sreenivas
On the combination of speech and speaker recognition
Mohamed Faouzi BenZeghiba, Hervé Bourlard
Vocal tract normalization as linear transformation of MFCC
Michael Pitz, Hermann Ney
Non-native spontaneous speech recognition through polyphone decision tree specialization
Zhirong Wang, Tanja Schultz
Live speech recognition in sports games by adaptation of acoustic model and language model
Yasuo Ariki, Takeru Shigemori, Tsuyoshi Kaneko, Jun Ogata, Masakiyo Fujimoto
Speaker adaptation using regression classes generated by phonetic decision tree-based successive state splitting
Se-Jin Oh, Kwang-Dong Kim, Duk-Gyoo Roh, Woo-Chang Sung, Hyun-Yeol Chung
Reduction of dimension of HMM parameters using ICA and PCA in MLLR framework for speaker adaptation
Jiun Kim, Jaeho Chung
Geometric constrained maximum likelihood linear regression on Mandarin dialect adaptation
Huayun Zhang, Bo Xu
Adapting language models for frequent fixed phrases by emphasizing n-gram subsets
Tomoyosi Akiba, Katunobu Itou, Atsushi Fujii
Learning intra-speaker model parameter correlations from many short speaker segments
Anne K. Kienappel
Modeling Cantonese pronunciation variation by acoustic model refinement
Patgi Kam, Tan Lee, Frank K. Soong
Performance improvement of rapid speaker adaptation based on eigenvoice and bias compensation
Jong Se Park, Hwa Jeon Song, Hyung Soon Kim
Training data optimization for language model adaptation
Xiaoshan Fang, Jianfeng Gao, Jianfeng Li, Huanye Sheng
Approaches to foreign-accented speaker-independent speech recognition
Stefanie Aalburg, Harald Hoege
Unsupervised speaker adaptation based on HMM sufficient statistics in various noisy environments
Shingo Yamade, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano
Using genetic algorithms for rapid speaker adaptation
Fabrice Lauri, Irina Illina, Dominique Fohr, Filipp Korkmazsky
Structural state-based frame synchronous compensation
Vincent Barreaud, Irina Illina, Dominique Fohr, Filipp Korkmazsky
Effect of foreign accent on speech recognition in the NATO N-4 corpus
Aaron D. Lawson, David M. Harris, John J. Grieco
Duration normalization and hypothesis combination for improved spontaneous speech recognition
Jon P. Nedel, Richard M. Stern
Maximum a posteriori linear regression (MAPLR) variance adaptation for continuous density HMMs
Wu Chou, Xiaodong He
On divergence based clustering of normal distributions and its application to HMM adaptation
Tor Andre Myrvoll, Frank K. Soong
Fast incremental adaptation using maximum likelihood regression and stochastic gradient descent
Sreeram V. Balakrishnan
Large vocabulary conversational speech recognition with a subspace constraint on inverse covariance matrices
Scott Axelrod, Vaibhava Goel, Brian Kingsbury, Karthik Visweswariah, Ramesh Gopinath
Speaker adaptation based on confidence-weighted training
Gyucheol Jang, Minho Jin, Chang D. Yoo
Jacobian adaptation based on the frequency-filtered spectral energies
Alberto Abad, Climent Nadeu, Javier Hernando, Jaume Padrell
Structural linear model-space transformations for speaker adaptation
Driss Matrouf, Olivier Bellot, Pascal Nocera, Georges Linares, Jean-Francois Bonastre
Minimum classification error (MCE) model adaptation of continuous density HMMs
Xiaodong He, Wu Chou
Adapting acoustic models to new domains and conditions using untranscribed data
Asela Gunawardana, Alex Acero
Tfarsdat - the telephone Farsi speech database
Mahmood Bijankhan, Javad Sheykhzadegan, Mahmood R. Roohani, Rahman Zarrintare, Seyyed Z. Ghasemi, Mohammad E. Ghasedi
Large lexica for speech-to-speech translation: from specification to creation
Elviira Hartikainen, Giulio Maltese, Asunción Moreno, Shaunie Shammass, Ute Ziegenhain
A pronunciation lexicon for Turkish based on two-level morphology
Kemal Oflazer, Sharon Inkelas
Using both global and local hidden Markov models for automatic speech unit segmentation
Hong Zheng, Yiqing Lu
Quality control of language resources at ELRA
Henk van den Heuvel, Khalid Choukri, Harald Hoge, Bente Maegaard, Jan Odijk, Valerie Mapelli
Validation of phonetic transcriptions based on recognition performance
Christophe van Bael, Diana Binnenpoorte, Helmer Strik, Henk van den Heuvel
The Basque speech_dat (II) database: a description and first test recognition results
I. Hernaez, I. Luengo, E. Navas, M. Zubizarreta, I. Gaminde, J. Sanchez
Towards an evaluation standard for speech control concepts in real-world scenarios
Jens Maase, Diane Hirschfeld, Uwe Koloska, Timo Westfeld, Jorg Helbig
OrienTel: recording telephone speech of Turkish speakers in Germany
Chr. Draxler
Spanish broadcast news transcription
Gerhard Backfried, Roser Jaquemot Caldes
Large vocabulary continuous speech recognition in Greek: corpus and an automatic dictation system
Vassilios Digalakis, Dimitrios Oikonomidis, D. Pratsolis, N. Tsourakis, C. Vosnidis, N. Chatzichrisafis, V. Diakoloukas
The LIUM-AVS database: a corpus to test lip segmentation and speechreading systems in natural conditions
Philippe Daubias, Paul Deleglise
Implementation and evaluation of a text-to-speech synthesis system for Turkish
Ozgul Salor, Bryan Pellom, Mubeccel Demirekler
The Czech speech and prosody database both for ASR and TTS purposes
Jachym Kolar, Jan Romportl, Josef Psutka
Construction of an advanced in-car spoken dialogue corpus and its characteristic analysis
Itsuki Kishida, Yuki Irie, Yukiko Yamaguchi, Shigeki Matsubara, Nobuo Kawaguchi, Yasuyoshi Inagaki
Measuring the readability of automatic speech-to-text transcripts
Douglas A. Jones, Florian Wolf, Edward Gibson, Elliott Williams, Evelina Fedorenko, Douglas A. Reynolds, Marc Zissman
The NESPOLE! VoIP multilingual corpora in tourism and medical domains
Nadia Mana, Susanne Burger, Roldano Cattoni, Laurent Besacier, Victoria MacLaren, John McDonough, Florian Metze
Lexica and corpora for speech-to-speech translation: a trilingual approach
David Conejero, Jesus Gimenez, Victoria Arranz, Antonio Bonafonte, Neus Pascual, Nuria Castell, Asunción Moreno
From Switchboard to Fisher: telephone collection protocols, their uses and yields
Christopher Cieri, David Miller, Kevin Walker
Development of the Estonian SpeechDat-like database
Einar Meister, Jurgen Lasn, Lya Meister
Towards a repository of digital talking books
Antonio Serralheiro, Isabel Trancoso, Diamantino Caseiro, Teresa Chambel, Luis Carrico, Nuno Guimaraes
Shared resources for robust speech-to-text technology
Stephanie Strassel, David Miller, Kevin Walker, Christopher Cieri
Towards synthesising expressive speech; designing and collecting expressive speech data
Nick Campbell
Is there an emotion signature in intonational patterns? and can it be used in synthesis?
Tanja Banziger, Michel Morel, Klaus R. Scherer
Multilayered extensions to the speech synthesis markup language for describing expressiveness
E. Eide, R. Bakis, W. Hamza, J. Pitrelli
Unit selection and emotional speech
Alan W. Black
Voice quality modification for emotional speech synthesis
Christophe d'Alessandro, Boris Doval
Applications of computer generated expressive speech for communication disorders
Jan P.H. van Santen, Lois Black, Gilead Cohen, Alexander B. Kain, Esther Klabbers, Taniya Mishra, Jacques de Villiers, Xiaochuan Niu
Speaker verification systems and security considerations
David A. van Leeuwen
Phonetic class-based speaker verification
Matthieu Hebert, Larry P. Heck
An evaluation of VTS and IMM for speaker verification in noise
Suhadi Suhadi, Sorel Stan, Tim Fingscheidt, Christophe Beaugeant
Locally recurrent probabilistic neural network for text-independent speaker verification
Todor Ganchev, Dimitris K. Tasoulis, Michael N. Vrahatis, Nikos Fakotakis
Learning to boost GMM based speaker verification
Stan Z. Li, Dong Zhang, Chengyuan Ma, Heung-Yeung Shum, Eric Chang
Speaker verification based on G.729 and G.723.1 coder parameters and handset mismatch compensation
Eric W.M. Yu, Man-Wai Mak, Chin-Hung Sit, Sun-Yuan Kung
Should I tell all?: an experiment on conciseness in spoken dialogue
Stephen Whittaker, Marilyn Walker, Preetam Maloor
Natural language response generation in mixed-initiative dialogs using task goals and dialog acts
Helen M. Meng, Wing Lin Yip, Oi Yan Mok, Shuk Fong Chan
Speech generation from concept for realizing conversation with an agent in a virtual room
Keikichi Hirose, Junji Tago, Nobuaki Minematsu
A trainable generator for recommendations in multimodal dialog
Marilyn Walker, Rashmi Prasad, Amanda Stent
Spoken dialogue system for queries on appliance manuals using hierarchical confirmation strategy
Tatsuya Kawahara, Ryosuke Ito, Kazunori Komatani
SAG: a procedural tactical generator for dialog systems
Dalina Kallulli
A hidden Markov model-based missing data imputation approach
Yu Luo, Limin Du
Integration of noise reduction algorithms for Aurora2 task
Takeshi Yamada, Jiro Okada, Kazuya Takeda, Norihide Kitaoka, Masakiyo Fujimoto, Shingo Kuroiwa, Kazumasa Yamamoto, Takanobu Nishiura, Mitsunori Mizumachi, Satoshi Nakamura
Classification with free energy at raised temperatures
Rita Singh, Manfred K. Warmuth, Bhiksha Raj, Paul Lamere
Flooring the observation probability for robust ASR in impulsive noise
Pei Ding, Bertram E. Shi, Pascale Fung, Zhigang Cao
Combination of temporal domain SVD based speech enhancement and GMM based speech estimation for ASR in noise - evaluation on the AURORA2 task -
Masakiyo Fujimoto, Yasuo Ariki
Additive noise and channel distortion-robust parametrization tool - performance evaluation on Aurora 2 & 3
Petr Fousek, Petr Pollak
Robust feature extraction and acoustic modeling at Multitel: experiments on the Aurora databases
Stephane Dupont, Christophe Ris
Noise robust speech parameterization based on joint wavelet packet decomposition and autoregressive modeling
Bojan Kotnik, Zdravko Kacic, Bogomir Horvat
Database adaptation for ASR in cross-environmental conditions in the SPEECON project
Christophe Couvreur, Oren Gedge, Klaus Linhard, Shaunie Shammass, Johan Vantieghem
Autoregressive modeling based feature extraction for Aurora3 DSR task
Petr Motlicek, Jan Cernocký
Evaluation on the Aurora 2 database of acoustic models that are less noise-sensitive
Edmondo Trentin, Marco Matassoni, Marco Gori
Revisiting scenarios and methods for variable frame rate analysis in automatic speech recognition
J. Macias-Guarasa, J. Ordonez, J.M. Montero, J. Ferreiros, R. Cordoba, L.F. D'Haro
Multitask learning in connectionist robust ASR using recurrent neural networks
Shahla Parveen, Phil Green
Confusion matrix based entropy correction in multi-stream combination
Hemant Misra, Andrew Morris
Dynamic channel compensation based on maximum a posteriori estimation
Huayun Zhang, Zhaobing Han, Bo Xu
Far-field ASR on inexpensive microphones
Laura Docio-Fernandez, David Gelbart, Nelson Morgan
Evaluation of ETSI advanced DSR front-end and bias removal method on the Japanese newspaper article sentences speech corpus
Satoru Tsuge, Shingo Kuroiwa, Kenji Kita
Environment adaptive control of noise reduction parameters for improved robustness of ASR
Chng Chin Soon, Bernt Andrassy, Josef Bauer, Gunther Ruske
Speech enhancement with microphone array and Fourier/wavelet spectral subtraction in real noisy environments
Yuki Denda, Takanobu Nishiura, Hideki Kawahara
Environmental sound source identification based on hidden Markov model for robust speech recognition
Takanobu Nishiura, Satoshi Nakamura, Kazuhiro Miki, Kiyohiro Shikano
High-likelihood model based on reliability statistics for robust combination of features: application to noisy speech recognition
Peter Jancovic, Munevver Kokuer, Fionn Murtagh
Noise robust digit recognition with missing frames
Cenk Demiroglu, David V. Anderson
A noise-robust ASR back-end technique based on weighted Viterbi recognition
Xiaodong Cui, Alexis Bernard, Abeer Alwan
Voice quality normalization in an utterance for robust ASR
Muhammad Ghulam, Takashi Fukuda, Tsuneo Nitta
Environmental sniffing: robust digit recognition for an in-vehicle environment
Murat Akbacak, John H.L. Hansen
Energy contour extraction for in-car speech recognition
Tai-Hwei Hwang
Noise-robust ASR by using distinctive phonetic features approximated with logarithmic normal distribution of HMM
Takashi Fukuda, Tsuneo Nitta
Noise-robust automatic speech recognition using orthogonalized distinctive phonetic feature vectors
Takashi Fukuda, Tsuneo Nitta
Language model accuracy and uncertainty in noise cancelling in the stochastic weighted Viterbi algorithm
Nestor Becerra Yoma, Ivan Brito, Jorge Silva
Assessment of dereverberation algorithms for large vocabulary speech recognition systems
Koen Eneman, Jacques Duchateau, Marc Moonen, Dirk van Compernolle, Hugo van Hamme
Analysis and compensation of packet loss in distributed speech recognition using interleaving
Ben P. Milner, A.B. James
Non-linear compression of feature vectors using transform coding and non-uniform bit allocation
Ben P. Milner
Predictive hidden Markov model selection for decision tree state tying
Jen-Tzung Chien, Sadaoki Furui
Three simultaneous speech recognition by integration of active audition and face recognition for humanoid
Kazuhiro Nakadai, Daisuke Matsuura, Hiroshi G. Okuno, Hiroshi Tsujino
Mis-recognized utterance detection using multiple language models generated by clustered sentences
Katsuhisa Fujinaga, Hiroaki Kokubo, Hirofumi Yamamoto, Genichiro Kikui, Hiroshi Shimodaira
Using word confidence measure for OOV words detection in a spontaneous spoken dialog system
Hui Sun, Guoliang Zhang, Fang Zheng, Mingxing Xu
Speech recognition using EMG; mime speech recognition
Hiroyuki Manabe, Akira Hiraiwa, Toshiaki Sugimura
Automatic generation of non-uniform context-dependent HMM topologies based on the MDL criterion
Takatoshi Jitsuhiro, Tomoko Matsui, Satoshi Nakamura
Comparison of effects of acoustic and language knowledge on spontaneous speech perception/recognition between human and automatic speech recognizer
Norihide Kitaoka, Masahisa Shingu, Seiichi Nakagawa
Using statistical language modelling to identify new vocabulary in a grammar-based speech recognition system
Genevieve Gorrell
A source model mitigation technique for distributed speech recognition over lossy packet channels
Angel M. Gomez, Antonio M. Peinado, Victoria Sanchez, Antonio J. Rubio
The effect of an intermediate articulatory layer on the performance of a segmental HMM
Martin J. Russell, Philip J.B. Jackson
Automatic phone set extension with confidence measure for spontaneous speech
Yi Liu, Pascale Fung
Utterance verification using an optimized k-nearest neighbour classifier
R. Paredes, A. Sanchis, E. Vidal, A. Juan
A segment-based algorithm of speech enhancement for robust speech recognition
Guokang Fu, Ta-Hsin Li
Robust multiple resolution analysis for automatic speech recognition
Roberto Gemello, Franco Mana, Dario Albesano, Renato De Mori
An accurate noise compensation algorithm in the log-spectral domain for robust speech recognition
Mohamed Afify
A new adaptive long-term spectral estimation voice activity detector
Javier Ramirez, Jose C. Segura, Carmen Benitez, Angel de la Torre, Antonio J. Rubio
Robust speech recognition using non-linear spectral smoothing
Michael J. Carey
A novel use of residual noise model for modified PMC
Cailian Miao, Yangsheng Wang
Robust speech recognition to non-stationary noise based on model-driven approaches
Christophe Cerisara, Irina Illina
Towards missing data recognition with cepstral features
Christophe Cerisara
On-line parametric histogram equalization techniques for noise robust embedded speech recognition
Hemmo Haverinen, Imre Kiss
Compensation of channel distortion in line spectrum frequency domain
An-Tze Yu, Hsiao-Chuan Wang
Voicing parameter and energy based speech/non-speech detection for speech recognition in adverse conditions
Arnaud Martin, Laurent Mauuary
Two correction models for likelihoods in robust speech recognition using missing feature theory
Hugo van Hamme
Spectral maxima representation for robust automatic speech recognition
J. Sujatha, K.R. Prasanna Kumar, K.R. Ramakrishnan, N. Balakrishnan
Missing feature theory applied to robust speech recognition over IP network
Toshiki Endo, Shingo Kuroiwa, Satoshi Nakamura
Comparative experiments to evaluate the use of auditory-based acoustic distinctive features and formant cues for robust automatic speech recognition in low-SNR car environments
Hesham Tolba, Sid-Ahmed Selouani, Douglas O'Shaughnessy
Robust speech recognition using missing feature theory in the cepstral or LDA domain
Hugo van Hamme
Bandwidth mismatch compensation for robust speech recognition
Yuan-Fu Liao, Jeng-Shien Lin, Wei-Ho Tsai
Markov chain Monte Carlo methods for noise robust feature extraction using the autoregressive model
Robert W. Morris, Jon A. Arrowood, Mark A. Clements
A comparative study of some discriminative feature reduction algorithms on the AURORA 2000 and the DaimlerChrysler in-car ASR tasks
Joan Mari Hilario, Fritz Class
Large vocabulary ASR for spontaneous Czech in the MALACH project
Josef Psutka, Pavel Ircing, J.V. Psutka, Vlasta Radova, William J. Byrne, Jan Hajic, Jiri Mirovsky, Samuel Gustman
Active and unsupervised learning for automatic speech recognition
Giuseppe Riccardi, Dilek Z. Hakkani-Tur
Perceptual MVDR-based cepstral coefficients (PMCCs) for high accuracy speech recognition
Umit H. Yapanel, Satya Dharanipragada, John H.L. Hansen
A discriminative decision tree learning approach to acoustic modeling
Sheng Gao, Chin-Hui Lee
Large corpus experiments for broadcast news recognition
Patrick Nguyen, Luca Rigazio, Jean-Claude Junqua
Performance evaluation of phonotactic and contextual onset-rhyme models for speech recognition of Thai language
Somchai Jitapunkul, Ekkarit Maneenoi, Visarut Ahkuputra, Sudaporn Luksaneeyanawin
Overlapped di-tone modeling for tone recognition in continuous Cantonese speech
Yao Qian, Tan Lee, Yujia Li
Speaker model selection using Bayesian information criterion for speaker indexing and speaker adaptation
Masafumi Nishida, Tatsuya Kawahara
Automatic transcription of football commentaries in the MUMIS project
Janienke Sturm, Judith M. Kessens, Mirjam Wester, Febe de Wet, Eric Sanders, Helmer Strik
On the limits of cluster-based acoustic modeling
S. Douglas Peters
Large vocabulary Taiwanese (Min-nan) speech recognition using tone features and statistical pronunciation modeling
Dau-Cheng Lyu, Min-Siong Liang, Yuang-Chin Chiang, Chun-Nan Hsu, Ren-Yuan Lyu
A new spectral transformation for speaker normalization
Pierre L. Dognin, Amro El-Jaroudi
Enhanced tree clustering with single pronunciation dictionary for conversational speech recognition
Hua Yu, Tanja Schultz
Fitting class-based language models into weighted finite-state transducer framework
Pavel Ircing, Josef Psutka
Multi-source training and adaptation for generic speech recognition
Fabrice Lefevre, Jean-Luc Gauvain, Lori Lamel
Toward domain-independent conversational speech recognition
Brian Kingsbury, Lidia Mangu, George Saon, Geoffrey Zweig, Scott Axelrod, Vaibhava Goel, Karthik Visweswariah, Michael Picheny
Comparative study of boosting and non-boosting training for constructing ensembles of acoustic models
Rong Zhang, Alexander I. Rudnicky
Discriminative optimization of large vocabulary Mandarin conversational speech recognition system
Peng Ding, Zhenbiao Chen, Sheng Hu, Shuwu Zhang, Bo Xu
Speech recognition with dynamic grammars using finite-state transducers
Johan Schalkwyk, Lee Hetherington, Ezra Story
FLavor: a flexible architecture for LVCSR
Kris Demuynck, Tom Laureys, Dirk van Compernolle, Hugo van Hamme
An architecture for rapid decoding of large vocabulary conversational speech
George Saon, Geoffrey Zweig, Brian Kingsbury, Lidia Mangu, Upendra Chaudhari
MMI-MAP and MPE-MAP for acoustic model adaptation
D. Povey, M.J.F. Gales, D.Y. Kim, P.C. Woodland
Lattice segmentation and minimum Bayes risk discriminative training
Vlasios Doumpiotis, Stavros Tsakalidis, William J. Byrne
Spoken language condensation in the 21st century
Klaus Zechner
Robust methods in automatic speech recognition and understanding
Sadaoki Furui
Parsing spontaneous speech
Rodolfo Delmonte
Model compression for GMM based speaker recognition systems
Douglas A. Reynolds
The awe and mystery of t-norm
Jiri Navratil, Ganesh N. Ramaswamy
Gaussian dynamic warping (GDW) method applied to text-dependent speaker detection and verification
Jean-Francois Bonastre, Philippe Morin, Jean-Claude Junqua
Modeling duration patterns for speaker recognition
Luciana Ferrer, Harry Bratt, Venkata R.R. Gadde, Sachin S. Kajarekar, Elizabeth Shriberg, Kemal Sonmez, Andreas Stolcke, Anand Venkataraman
Improved speaker verification through probabilistic subspace adaptation
Simon Lucey, Tsuhan Chen
An improved model-based speaker segmentation system
Peng Yu, Frank Seide, Chengyuan Ma, Eric Chang
A latent analogy framework for grapheme-to-phoneme conversion
Jerome R. Bellegarda
Conditional and joint models for grapheme-to-phoneme conversion
Stanley F. Chen
Mixed-lingual text analysis for polyglot TTS synthesis
Beat Pfister, Harald Romsdorfer
Identifying speakers in children's stories for speech synthesis
Jason Y. Zhang, Alan W. Black, Richard Sproat
Experimental tools to evaluate intelligibility of text-to-speech (TTS) synthesis: effects of voice gender and signal quality
Catherine Stevens, Nicole Lees, Julie Vonwiller
Arabic in my hand: small-footprint synthesis of Egyptian Arabic
Laura Mayfield Tomokiyo, Alan W. Black, Kevin A. Lenzo
Using acoustic models to choose pronunciation variations for synthetic voices
Christina L. Bennett, Alan W. Black
Comparative analysis and synthesis of formant trajectories of British and broad Australian accents
Qin Yan, Saeed Vaseghi, Ching-Hsiang Ho, Dimitrios Rentzos, Emir Turajlic
Cycle extraction for perfect reconstruction and rate scalability
Miguel Arjona Ramirez
Adding fricatives to the Portuguese articulatory synthesiser
Antonio Teixeira, Luis M.T. Jesus, Roberto Martinez
A hybrid method oriented to concatenative text-to-speech synthesis
Ignasi Iriondo, Francesc Alias, Javier Sanchis, Javier Melenchon
Custom-tailoring TTS voice font - keeping the naturalness when reducing database size
Yong Zhao, Min Chu, Hu Peng, Eric Chang
Schema-based modeling of phonemic restoration
Soundararajan Srinivasan, DeLiang Wang
Perception of voice-individuality for distortions of resonance/source characteristics and waveforms
Hisao Kuwabara
The perceptual cues of a high level pitch-accent pattern in Japanese: pitch-accent patterns and duration
Tsutomu Sato
Illusory continuity of intermittent pure tone in binaural listening and its dependency on interaural time difference
Mamoru Iwaki, Norio Nakamura
CART-based factor analysis of intelligibility reduction in Japanese English
Nobuaki Minematsu, Changchen Guo, Keikichi Hirose
Harmonic alternatives to sine-wave speech
Laszlo Toth, Andras Kocsor
Non-intrusive assessment of perceptual speech quality using a self-organising map
Dorel Picovici, Abdulhussain E. Mahdi
Inhibitory priming effect in auditory word recognition: the role of the phonological mismatch length between primes and targets
Sophie Dufour, Ronald Peereman
Recognising 'real-life' speech with SpeM: a speech-based computational model of human speech recognition
Odette Scharenborg, Louis ten Bosch, Lou Boves
The effect of speech rate and noise on bilinguals' speech perception: the case of native speakers of Arabic in Israel
Judith Rosenhouse, Liat Kishon-Rabin
Subjective evaluations for perception of speaker identity through acoustic feature transplantations
Oytun Turk, Levent M. Arslan
Modelling human speech recognition using automatic speech recognition paradigms in SpeM
Odette Scharenborg, James M. McQueen, Louis ten Bosch, Dennis Norris
The effect of amplitude compression on wide band telephone speech for hearing-impaired elderly people
Mutsumi Saito, Kimio Shiraishi, Kimitoshi Fukudome
Word activation model by Japanese school children without knowledge of roman alphabet
Takashi Otake, Miki Komatsu
Multi-resolution auditory scene analysis: robust speech recognition using pattern-matching from a noisy signal
Sue Harding, Georg Meyer
Investigation of emotionally morphed speech perception and its structure using a high quality speech manipulation system
Hisami Matsui, Hideki Kawahara
Usefulness of phase spectrum in human speech perception
Kuldip K. Paliwal, Leigh Alsteris
Perception of English lexical stress by English and Japanese speakers: effect of duration and "realistic" intensity change
Shinichi Tokuma
French intonational rises and their role in speech segmentation
Pauline Welby
Physical and perceptual configurations of Japanese fricatives from multidimensional scaling analyses
Won Tokuma
An acquisition model of speech perception with considerations of temporal information
Ching-Pong Au
An integrated system for smart-home control of appliances based on remote speech interaction
Ilyas Potamitis, K. Georgila, Nikos Fakotakis, George Kokkinakis
A spoken language interface to an electronic programme guide
Jianhong Jin, Martin J. Russell, Michael J. Carey, James Chapman, Harvey Lloyd-Thomas, Graham Tattersall
Towards a personal robot with language interface
L. Seabra Lopes, Antonio Teixeira, M. Rodrigues, D. Gomes, C. Teixeira, L. Ferreira, P. Soares, J. Girao, N. Senica
Preference, perception, and task completion of open, menu-based, and directed prompts for call routing: a case study
Jason D. Williams, Andrew T. Shaw, Lawrence Piano, Michael Abt
An integrated toolkit deploying speech technology for computer based speech training with application to dysarthric speakers
Athanassios Hatzis, Phil Green, James Carmichael, Stuart Cunningham, Rebecca Palmer, Mark Parker, Peter O'Neill
Towards best practices for speech user interface design
Bernhard Suhm
Design and evaluation of a limited two-way speech translator
David Stallard, John Makhoul, Frederick Choi, Ehry Macrostie, Premkumar Natarajan, Richard Schwartz, Bushra Zawaydeh
Multimodal interaction on PDA's integrating speech and pen inputs
Sorin Dusan, Gregory J. Gadbois, James Flanagan
Towards multimodal interaction with an intelligent room
Petra Gieselmann, Matthias Denecke
A multimodal conversational interface for a concept vehicle
Roberto Pieraccini, Krishna Dayanidhi, Jonathan Bloom, Jean-Gui Dahan, Michael Phillips, Bryan R. Goodman, K. Venkatesh Prasad
Context awareness using environmental noise classification
L. Ma, D.J. Smith, Ben P. Milner
Simple designing methods of corpus-based visual speech synthesis
Tatsuya Shiraishi, Tomoki Toda, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano
Comparing the usability of a user driven and a mixed initiative multimodal dialogue system for train timetable information
Janienke Sturm, Ilse Bakx, Bert Cranen, Jacques Terken
Read my tongue movements: bimodal learning to perceive and produce non-native speech /r/ and /l/
Dominic W. Massaro, Joanna Light
Low resource lip finding and tracking algorithm for embedded devices
Jesus F. Guitarte Perez, Klaus Lukas, Alejandro F. Frangi
Detection and separation of speech segment using audio and video information fusion
Futoshi Asano, Yoichi Motomura, Hideki Asoh, Takashi Yoshimura, Naoyuki Ichimura, Kiyoshi Yamamoto, Nobuhiko Kitawaki, Satoshi Nakamura
Resynthesis of 3d tongue movements from facial data
Olov Engwall, Jonas Beskow
Acquiring lexical information from multilevel temporal annotations
Thorsten Trippel, Felix Sasaki, Benjamin Hell, Dafydd Gibbon
LUCIA a new Italian talking-head based on a modified Cohen-Massaro's labial coarticulation model
Piero Cosi, Andrea Fusaro, Graziano Tisato
A visual context-aware multimodal system for spoken language processing
Niloy Mukherjee, Deb Roy
Maximum entropy Good-Turing estimator for language modeling
Juan P. Piantanida, Claudio F. Estienne
Exploiting order-preserving perfect hashing to speedup n-gram language model lookahead
Xiaolong Li, Yunxin Zhao
Stem-based maximum entropy language models for inflectional languages
Dimitrios Oikonomidis, Vassilios Digalakis
Combination of a hidden tag model and a traditional n-gram model: a case study in Czech speech recognition
Pavel Krbec, Petr Podvesky, Jan Hajic
Unlimited vocabulary speech recognition based on morphs discovered in an unsupervised manner
Vesa Siivola, Teemu Hirsimaki, Mathias Creutz, Mikko Kurimo
Evaluation of the stochastic morphosyntactic language model on a one million word Hungarian dictation task
Mate Szarvas, Sadaoki Furui
Automatic title generation for Chinese spoken documents considering the special structure of the language
Lin-shan Lee, Shun-Chuan Chen
Statistical speech-to-speech translation with multilingual speech recognition and bilingual-chunk parsing
Bo Xu, Shuwu Zhang, Chengqing Zong
Automatic extraction of bilingual chunk lexicon for spoken language translation
Limin Du, Boxing Chen
Multi-scale document expansion in English-Mandarin cross-language spoken document retrieval
Wai-Kit Lo, Yuk-Chi Li, Gina Levow, Hsin-Min Wang, Helen M. Meng
Mandarin speech prosody: issues, pitfalls and directions
Chiu-yu Tseng
A contrastive investigation of standard Mandarin and accented Mandarin
Aijun Li, Xia Wang
Emotion control of Chinese speech synthesis in natural environment
Jianhua Tao
Optimality criteria in inverse problems for tongue-jaw interaction
A.S. Leonov, V.N. Sorokin
FEM analysis based on 3-d time-varying vocal tract shape
Koji Sasaki, Nobuhiro Miki, Yoshikazu Miyanaga
Consideration of muscle co-contraction in a physiological articulatory model
Jianwu Dang, Kiyoshi Honda
Robust techniques for pre- and post-surgical voice analysis
Claudia Manfredi, Giorgio Peretti
Analysis of lossy vocal tract models for speech production
K. Schnell, A. Lacroix
Temporal properties of the nasals and nasalization in Cantonese
Beatrice Fung-Wah Khioe
Estimation of vocal noise in running speech by means of bi-directional double linear prediction
F. Bettens, F. Grenez, J. Schoentgen
Visualisation of the vocal tract based on estimation of vocal area functions and formant frequencies
Abdulhussain E. Mahdi
Reproducing laryngeal mechanisms with a two-mass model
Denisse Sciamarella, Christophe d'Alessandro
Methods for estimation of glottal pulses waveforms exciting voiced speech
Milan Bostik, Milan Sigmund
Acoustic modeling of American English lateral approximants
Zhaoyan Zhang, Carol Espy-Wilson, Mark Tiede
Translation and rotation of the cricothyroid joint revealed by phonation-synchronized high-resolution MRI
Sayoko Takano, Kiyoshi Honda, Shinobu Masaki, Yasuhiro Shimada, Ichiro Fujimoto
GMM-based voice conversion applied to emotional speech synthesis
Hiromichi Kawanami, Yohei Iwami, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano
Probability models of formant parameters for voice conversion
Dimitrios Rentzos, Saeed Vaseghi, Qin Yan, Ching-Hsiang Ho, Emir Turajlic
Perceptually weighted linear transformations for voice conversion
Hui Ye, Steve Young
Voice conversion with smoothed GMM and MAP adaptation
Yining Chen, Min Chu, Eric Chang, Jia Liu, Runsheng Liu
A system for voice conversion based on adaptive filtering and line spectral frequency distance optimization for text-to-speech synthesis
Ozgul Salor, Mubeccel Demirekler, Bryan Pellom
Speaker conversion in ARX-based source-formant type speech synthesis
Hiroki Mori, Hideki Kasuya
Implementing an SSML compliant concatenative TTS system
Andrew P. Breen, Steve Minnis, Barry Eggleton
Acoustic variations of focused disyllabic words in Mandarin Chinese: analysis, synthesis and perception
Zhenglai Gu, Hiroki Mori, Hideki Kasuya
An approach to common acoustical pole and zero modeling of consecutive periods of voiced speech
Pedro Quintana-Morales, Juan L. Navarro-Mesa
Estimating the vocal-tract area function and the derivative of the glottal wave from a speech signal
Huiqun Deng, Michael Beddoes, Rabab Ward, Murray Hodgson
Glottal closure instant synchronous sinusoidal model for high quality speech analysis/synthesis
Parham Zolfaghari, Tomohiro Nakatani, Toshio Irino, Hideki Kawahara, Fumitada Itakura
Mixed physical modeling techniques applied to speech production
Matti Karjalainen
An expandable web-based audiovisual text-to-speech synthesis system
Sascha Fagel, Walter F. Sendlmeier
A reconstruction of Farkas Kempelen's speaking machine
P. Nikleczy, G. Olaszy
Acoustic model selection and voice quality assessment for HMM-based Mandarin speech synthesis
Wentao Gu, Keikichi Hirose
Modeling of various speaking styles and emotions for HMM-based speech synthesis
Junichi Yamagishi, Koji Onishi, Takashi Masuko, Takao Kobayashi
Towards the development of a Brazilian Portuguese text-to-speech system based on HMM
R. da S. Maia, Heiga Zen, Keiichi Tokuda, Tadashi Kitamura, F.G.V. Resende Jr.
Grapheme to phoneme conversion and dictionary verification using graphonemes
Paul Vozila, Jeff Adams, Yuliya Lobacheva, Ryan Thomas
Improving the accuracy of pronunciation prediction for unit selection TTS
Justin Fackrell, Wojciech Skut, Kathrine Hammervold
Detection of list-type sentences
Taniya Mishra, Esther Klabbers, Jan P.H. van Santen
A new pitch synchronous time domain phoneme recognizer using component analysis and pitch clustering
Ramon Prieto, Jing Jiang, Chi-Ho Choi
Mixed-lingual spoken word recognition by using VQ codebook sequences of variable length segments
Hiroaki Kojima, Kazuyo Tanaka
Low memory acoustic models for HMM based speech recognition
Tommi Lahti, Olli Viikki, Marcel Vasilache
Nearest-neighbor search algorithms based on subcodebook selection and its application to speech recognition
Jose A.R. Fonollosa
Non-linear maximum likelihood feature transformation for speech recognition
Mohamed Kamal Omar, Mark Hasegawa-Johnson
Automatic generation of context-independent variable parameter models using successive state and mixture splitting
Soo-Young Suk, Ho-Youl Jung, Hyun-Yeol Chung
Data driven generation of broad classes for decision tree construction in acoustic modeling
Andrej Zgank, Zdravko Kacic, Bogomir Horvat
An efficient integrated gender detection scheme and time mediated averaging of gender dependent acoustic models
Peder A. Olsen, Satya Dharanipragada
Syllable-based acoustic modeling for Japanese spontaneous speech recognition
Jun Ogata, Yasuo Ariki
Cross-stream observation dependencies for multi-stream speech recognition
Ozgur Cetin, Mari Ostendorf
Pruning transitions in a hidden Markov model with optimal brain surgeon
Brian Mak, Kin-Wah Chan
Using pitch frequency information in speech recognition
Mathew Magimai-Doss, Todd A. Stephenson, Hervé Bourlard
Hidden feature models for speech recognition using dynamic Bayesian networks
Karen Livescu, James Glass, Jeff Bilmes
An efficient Viterbi algorithm on DBNs
Wei Hu, Yimin Zhang, Qian Diao, Shan Huang
Speech recognition based on syllable recovery
Li Zhang, William Edmondson
HARTFEX: a multi-dimensional system of HMM based recognisers for articulatory features extraction
Tarek Abu-Amer, Julie Carson-Berndsen
Automatic baseform generation from acoustic data
Benoit Maison
Data-driven pronunciation modeling for ASR using acoustic subword units
Thurid Spiess, Britta Wrede, Gernot A. Fink, Franz Kummert
Variable length mixtures of inverse covariances
Vincent Vanhoucke, Ananth Sankar
Semi-tied full deviation matrices for Laplacian density models
Christoph Neukirchen
Acoustic modeling with mixtures of subspace constrained exponential models
Karthik Visweswariah, Scott Axelrod, Ramesh Gopinath
Discriminative estimation of subspace precision and mean (SPAM) models
Vaibhava Goel, Scott Axelrod, Ramesh Gopinath, Peder A. Olsen, Karthik Visweswariah
Model-integration rapid training based on maximum likelihood for speech recognition
Shinichi Yoshizawa, Kiyohiro Shikano
On the use of kernel PCA for feature extraction in speech recognition
Amaro Lima, Heiga Zen, Yoshihiko Nankaku, Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura
Time is of the essence - dynamic approaches to spoken language
Steven Greenberg
Spectro-temporal interactions in auditory and auditory-visual speech processing
Ken W. Grant, Steven Greenberg
Brain imaging correlates of temporal quantization in spoken language
David Poeppel
Temporal aspects of articulatory control
Elliot Saltzman
The temporal organisation of speech as gauged by speech synthesis
Brigitte Zellner Keller
Localized spectro-temporal features for automatic speech recognition
Michael Kleinschmidt
Modulation spectral filtering of speech
Les Atlas
A comparison of the data requirements of automatic speech recognition systems and human listeners
Roger K. Moore
Modeling linguistic features in speech recognition
Min Tang, Stephanie Seneff, Victor W. Zue
Impact of audio segmentation and segment clustering on automated transcription accuracy of large spoken archives
Bhuvana Ramabhadran, Jing Huang, Upendra Chaudhari, Giridharan Iyengar, Harriet J. Nock
Learning linguistically valid pronunciations from acoustic data
Francoise Beaufays, Ananth Sankar, Shaun Williams, Mitch Weintraub
Improvement of non-native speech recognition by effectively modeling frequently observed pronunciation habits
Nobuaki Minematsu, Koichi Osaki, Keikichi Hirose
Non-audible murmur recognition
Yoshitaka Nakajima, Hideki Kashioka, Kiyohiro Shikano, Nick Campbell
Speaker modeling from selected neighbors applied to speaker recognition
Yassine Mami, Delphine Charlet
Who knows Carl Bildt? - and what if you don't?
Elisabeth Zetterholm, Kirk P.H. Sullivan, James Green, Erik Eriksson, Jan van Doorn, Peter E. Czigler
Improving the competitiveness of discriminant neural networks in speaker verification
C. Vivaracho-Pascual, J. Ortega-Garcia, L. Alonso-Romero, Q. Moro-Sancho
On the fusion of dissimilarity-based classifiers for speaker identification
Tomi Kinnunen, Ville Hautamaki, Pasi Franti
Robust speaker identification using posterior union models
Ji Ming, Darryl Stewart, Philip Hanna, Pat Corr, Jack Smith, Saeed Vaseghi
syncpitch: a pseudo pitch synchronous algorithm for speaker recognition
Ran D. Zilca, Jiri Navratil, Ganesh N. Ramaswamy
A method for on-line speaker indexing using generic reference models
Soonil Kwon, Shrikanth Narayanan
Discriminative training and maximum likelihood detector for speaker identification
M. Mihoubi, Gilles Boulianne, Pierre Dumouchel
Novel approaches for one- and two-speaker detection
Sachin S. Kajarekar, Andre G. Adami, Hynek Hermansky
Fusing high- and low-level features for speaker recognition
Joseph P. Campbell, Douglas A. Reynolds, Robert B. Dunn
Score normalisation applied to open-set, text-independent speaker identification
P. Sivakumaran, J. Fortuna, Aladdin M. Ariyaeeinia
On the number of Gaussian components in a mixture: an application to speaker verification tasks
Mijail Arcienega, Andrzej Drygajlo
Using accent information in ASR models for Swedish
Giampiero Salvi
Estimating Japanese word accent from syllable sequence using support vector machine
Hideharu Nakajima, Masaaki Nagata, Hisako Asano, Masanobu Abe
PPRLM optimization for language identification in air traffic control tasks
R. Cordoba, G. Prime, J. Macias-Guarasa, J.M. Montero, J. Ferreiros, J.M. Pardo
Spoken cross-language access to image collection via captions
Hsin-Hsi Chen
Understanding process for speech recognition
Salma Jamoussi, Kamel Smaili, Jean-Paul Haton
Collecting machine-translation-aided bilingual dialogues for corpus-based speech translation
Toshiyuki Takezawa, Genichiro Kikui
Combination of finite state automata and neural network for spoken language understanding
Chai Wutiwiwatchai, Sadaoki Furui
Discriminative methods for improving named entity extraction on speech data
James Horlock, Simon King
Improving statistical natural concept generation in interlingua-based speech-to-speech translation
Liang Gu, Yuqing Gao, Michael Picheny
How NLP techniques can improve speech understanding: ROMUS - a robust chunk based message understanding system using link grammars
Jerome Goulian, Jean-Yves Antoine, Franck Poirier
Discriminative training of n-gram classifiers for speech and text routing
Ciprian Chelba, Alex Acero
Correction of disfluencies in spontaneous speech using a noisy-channel approach
Matthias Honal, Tanja Schultz
Multi-class extractive voicemail summarization
Konstantinos Koumpis, Steve Renals
Active labeling for spoken language understanding
Gokhan Tur, Mazin Rahim, Dilek Z. Hakkani-Tur
Exploiting unlabeled utterances for spoken language understanding
Gokhan Tur, Dilek Z. Hakkani-Tur
Noise robustness in speech to speech translation
Fu-Hua Liu, Yuqing Gao, Liang Gu, Michael Picheny
Example-based bi-directional Chinese-English machine translation with semi-automatically induced grammars
K.C. Siu, Helen M. Meng, C.C. Wong
Spotting "hot spots" in meetings: human judgments and prosodic cues
Britta Wrede, Elizabeth Shriberg
Combination of CFG and n-gram modeling in semantic grammar learning
Ye-Yi Wang, Alex Acero
Automatic title generation for Chinese spoken documents using an adaptive k nearest-neighbor approach
Shun-Chuan Chen, Lin-shan Lee
Speech summarization using weighted finite-state transducers
Takaaki Hori, Chiori Hori, Yasuhiro Minami
Cross domain Chinese speech understanding and answering based on named-entity extraction
Yun-Tien Lee, Shun-Chuan Chen, Lin-shan Lee
Evaluation method for automatic speech summarization
Chiori Hori, Takaaki Hori, Sadaoki Furui
An information theoretic approach for using word cluster information in natural language call routing
Li Li, Feng Liu, Wu Chou
Unsupervised topic discovery applied to segmentation of news transcriptions
Sreenivasa Sista, Amit Srivastava, Francis Kubala, Richard Schwartz
Do not attempt to light with match!: some thoughts on progress and research goals in spoken dialog systems
Paul Heisterkamp
Multimodality and speech technology: verbal and non-verbal communication in talking agents
Bjorn Granstrom, David House
Roadmaps, journeys and destinations: speculations on the future of speech technology research
Ronald A. Cole
Spoken language output: realising the vision
Roger K. Moore
New MAP estimators for speaker recognition
P. Kenny, M. Mihoubi, Pierre Dumouchel
A new SVM approach to speaker identification and verification using probabilistic distance kernels
Pedro J. Moreno, Purdy P. Ho
Adaptive decision fusion for multi-sample speaker verification over GSM networks
Ming-Cheung Cheung, Man-Wai Mak, Sun-Yuan Kung
Environment adaptation for robust speaker verification
Kwok-Kwong Yiu, Man-Wai Mak, Sun-Yuan Kung
On cohort selection for speaker verification
Yaniv Zigel, Arnon Cohen
Speaker characterization using principal component analysis and wavelet transform for speaker verification
C. Tadj, A. Benlahouar
Unsupervised speaker indexing using anchor models and automatic transcription of discussions
Yuya Akita, Tatsuya Kawahara
A statistical approach to assessing speech and voice variability in speaker verification
Klaus R. Scherer, D. Grandjean, T. Johnstone, G. Klasmeyer, Tanja Banziger
Automatic singer identification of popular music recordings via estimation and modeling of solo vocal signal
Wei-Ho Tsai, Hsin-Min Wang, Dwight Rodgers
A DP algorithm for speaker change detection
Michele Vescovi, Mauro Cettolo, Romeo Rizzi
SOM as likelihood estimator for speaker clustering
Itshak Lapidot
Automatic estimation of perceptual age using speaker modeling techniques
Nobuaki Minematsu, Keita Yamauchi, Keikichi Hirose
Speaker recognition using local models
Ryan Rifkin
Dependence of GMM adaptation on feature post-processing for speaker recognition
Robbie Vogt, Jason Pelecanos, Sridha Sridharan
Text-independent speaker recognition by speaker-specific GMM and speaker adapted syllable-based HMM
Seiichi Nakagawa, Wei Zhang
On the amount of speech data necessary for successful speaker identification
Ales Padrta, Vlasta Radova
Speaker verification based on the German VeriDat database
Ulrich Turk, Florian Schiel
Recent progress in the decoding of non-native speech with multilingual acoustic models
V. Fischer, E. Janke, S. Kunzmann
An NN-based approach to prosodic information generation for synthesizing English words embedded in Chinese text
Wei-Chih Kuo, Li-Feng Lin, Yih-Ru Wang, Sin-Horng Chen
Speaker adaptation for non-native speakers using bilingual English lexicon and acoustic models
S. Matsunaga, A. Ogawa, Yoshikazu Yamaguchi, A. Imamura
Using the web for fast language model construction in minority languages
Viet Bac Le, Brigitte Bigi, Laurent Besacier, Eric Castelli
An approach to multilingual acoustic modeling for portable devices
Yan Ming Cheng, Chen Liu, Yuan-Jun Wei, Lynette Melnar, Changxue Ma
Cross-lingual pronunciation modelling for Indonesian speech recognition
Terrence Martin, Torbjorn Svendsen, Sridha Sridharan
Language model adaptation using cross-lingual information
Woosung Kim, Sanjeev Khudanpur
Multilingual phone clustering for recognition of spontaneous Indonesian speech utilising pronunciation modelling techniques
Eddie Wong, Terrence Martin, Torbjorn Svendsen, Sridha Sridharan
Language-adaptive Persian speech recognition
Naveen Srinivasamurthy, Shrikanth Narayanan
Grapheme based speech recognition
Mirjam Killer, Sebastian Stuker, Tanja Schultz
Learning Chinese tones
Valery A. Petrushin
A pronunciation training system for Japanese lexical accents with corrective feedback in learner's voice
Keikichi Hirose, Frédéric Gendrin, Nobuaki Minematsu
Considerations on vowel durations for Japanese CALL system
Taro Mouri, Keikichi Hirose, Nobuaki Minematsu
Influence of recording equipment on the identification of second language phoneme contrasts
Hiroaki Kato, Masumi Nukina, Hideki Kawahara, Reiko Akahane-Yamada
Training a confidence measure for a reading tutor that listens
Yik-Cheung Tam, Jack Mostow, Joseph E. Beck, Satanjeev Banerjee
Evaluating the effect of predicting oral reading miscues
Satanjeev Banerjee, Joseph E. Beck, Jack Mostow
VISPER II - enhanced version of the educational software for speech processing courses
Miroslav Holada, Jan Nouza
The use of multiple pause information in dependency structure analysis of spoken Japanese sentences
Meirong Lu, Kazuyuki Takagi, Kazuhiko Ozeki
A neural network approach to dependency analysis of Japanese sentences using prosodic information
Kazuyuki Takagi, Mamiko Okimoto, Yoshio Ogawa, Kazuhiko Ozeki
Say-as classification for alphabetic words in Japanese texts
Hisako Asano, Masaaki Nagata, Masanobu Abe
Automatic transformation of environmental sounds into sound-imitation words based on Japanese syllable structure
Kazushi Ishihara, Yasushi Tsubota, Hiroshi G. Okuno
Decision tree-based simultaneous clustering of phonetic contexts, dimensions, and state positions for acoustic modeling
Heiga Zen, Keiichi Tokuda, Tadashi Kitamura
A statistical method of evaluating pronunciation proficiency for English words spoken by Japanese
Seiichi Nakagawa, Kazumasa Mori, Naoki Nakamura