doi: 10.21437/Eurospeech.2001
ISSN: 1018-4074
Whither speech technology? - a twenty-first century perspective
Steven Greenberg
3G mobile networks and mobile internet as a promotor for new applications - challenges to industry and universities
Stefan Dobler, Hans Hermansson, Tor-Björn Minde
Universities and industry: marriage or co-operation between independent partners?
Ilkka Niiniluoto
Considerations on what industry expects from universities
Yrjö Neuvo
A perspective on industry/university relationships in the US
Gary Strong
ELRA contribution to bridge the gap between industry and academia
Khalid Choukri
Combining word- and class-based language models: a comparative study in several languages using automatic and manual word-clustering techniques
G. Maltese, P. Bravetti, H. Crépy, B. J. Grainger, M. Herzog, F. Palou
Multi-class composite n-gram language model using multiple word clusters and word successions
Shuntaro Isogai, Katsuhiko Shirai, Hirofumi Yamamoto, Yoshinori Sagisaka
Statistical language model based on a hierarchical approach: MCnv
Imed Zitouni, Kamel Smaili, Jean-Paul Haton
Quantization-based language model compression
E. W. D. Whittaker, Bhiksha Raj
Relations between vocal registers in voice breaks
Gerrit Bloothooft, Mieke van Wijck, Peter Pabon
A quasi-one-dimensional model of aerodynamic and acoustic flow in the time-varying vocal tract: source and excitation mechanisms
Gordon Ramsay
Spectral correlates of voice open quotient and glottal flow asymmetry: theory, limits and experimental data
Nathalie Henrich, Christophe d'Alessandro, Boris Doval
One-delayed-mass model for efficient synthesis of glottal flow
Federico Avanzini, Paavo Alku, Matti Karjalainen
Modeling pronunciation variation using context-dependent weighting and b/s refined acoustic modeling
Fang Zheng, Zhanjiang Song, Pascale Fung, William Byrne
Learning units for domain-independent out-of-vocabulary word modelling
Issam Bazzi, James Glass
Pronunciation variant analysis using speaking style parallel corpus
Hideharu Nakajima, Izumi Hirano, Yoshinori Sagisaka, Katsuhiko Shirai
Speech recognition for huge vocabularies by using optimized sub-word units
Jan Kneissler, Dietrich Klakow
Dynamic lexicon using phonetic features
Kyung-Tak Lee, Christian J. Wellekens
Triphone tying techniques combining a-priori rules and data driven methods
Ute Ziegenhain, Josef G. Bauer
Pronunciation modeling and lexical adaptation in midsize vocabulary ASR
Louis F. M. ten Bosch, Nick Cremelie
Estimating pronunciation variations from acoustic likelihood score for HMM reconstruction
Liu Yi, Pascale Fung
Breadth-first search for finding the optimal phonetic transcription from multiple utterances
M. Bisani, Hermann Ney
Improved data-driven generation of pronunciation dictionaries using an adapted word list
Matthias Wolff, Matthias Eichner, Rüdiger Hoffmann
Segment-based recognition on the PhoneBook task: initial results and observations on duration modeling
Karen Livescu, James Glass
Multilingual text-to-phoneme mapping
Søren Kamaric Riis, Morten With Pedersen, Kare Jean Jensen
Pronunciation variation analysis with respect to various linguistic levels and contextual conditions for Mandarin Chinese
Ming-yi Tsai, Fu-chiang Chou, Lin-shan Lee
Hypothesis-driven accent discrimination
Laura Mayfield Tomokiyo
An approach to automatic phonetic baseform generation based on Bayesian networks
Changxue Ma, Mark A. Randolph
Towards discriminative lexicon optimization
Hauke Schramm, Peter Beyerlein
Model complexity optimization for nonnative English speakers
Xiaodong He, Yunxin Zhao
Pronunciation modeling in Hungarian number recognition
Tibor Fegyó, Péter Mihajlik, Péter Tatai, Géza Gordos
Factors affecting schwa-insertion in final consonant clusters in standard Dutch
Marc Swerts, Hanne Kloots, Steven Gillis, Georges De Schutter
Vowel height is intimately associated with stress accent in spontaneous American English discourse
Leah Hitchcock, Steven Greenberg
Finite state prosodic analysis of African corpus resources
Dafydd Gibbon
Acoustic correlates of emotion dimensions in view of speech synthesis
Marc Schröder, Roddy Cowie, Ellen Douglas-Cowie, Machiel Westerdijk, Stan Gielen
Measuring pitch range
Hanny den Ouden, Jacques Terken
Measuring speech rhythm
Dafydd Gibbon, Ulrike Gut
Tonal alignment, scaling and slope in Italian question and statement tunes
Mariapaola D'Imperio
Pragmatic temporal voice range profile as a tool in the research of speech styles
Antti Iivonen
Model based stress decision method
Wooil Kim, Taeyun Kim, Sungjoo Ahn, Hanseok Ko
Reduction of alternative pronunciations in the Norwegian computational lexicon NorKompLeks
Torbjørn Nordgård, Arne Kjell Foldvik
The role of duration as a correlate of accent in Lekeitio Basque
Gorka Elordieta, José Ignacio Hualde
Word final aspiration as a phrase boundary cue: data from spontaneous Swedish discourse
Victoria Johansson, Merle Horne, Sven Strömqvist
Study and auto-detection of stress based on tonal pitch range in Mandarin
Xipeng Shen, Bo Xu
Classifying emotions in speech: a comparison of methods
Noam Amir, Ori Kerret, Dimitry Karlinski
Development of vowel quantity perception in late childhood
Dawn M. Behne, Peter E. Czigler, Kirk P.H. Sullivan
A study on the production-perception link of English vowels produced by native and non-native speakers
Byunggon Yang
Japanese can be aware of syllables and morae: evidence from Japanese-English bilingual children
Takashi Otake, Yuka Yamaguchi
Neural processes underlying perceptual learning of a difficult second language phonetic contrast
Daniel Callan, Keiichi Tajima, Akiko Callan, Reiko Akahane-Yamada, Shinobu Masaki
Human language identification with reduced segmental information: comparison between monolinguals and bilinguals
Masahiko Komatsu, Kazuya Mori, Takayuki Arai, Yuji Murahara
Coarticulatory effects in perception
Santiago Fernández, Sergio Feijóo
A case for multi-resolution auditory scene analysis
Sue Harding, Georg Meyer
Perceptual identification and normalization of synthesized French vowels from birth to adulthood
Lucie Ménard, Jean-Luc Schwartz, Louis-Jean Boë, Sonia Kandel, Nathalie Vallée
Perceptual categorization of maximal vowel spaces from birth to adulthood simulated by an articulatory model
Lucie Ménard, Louis-Jean Boë
A study on speech over the telephone and aging
Maxine Eskenazi, Alan W. Black
On the perception of voicing for plosives in noise
Marcia Chen, Abeer Alwan
Predicting visual consonant perception from physical measures
Jintao Jiang, Abeer Alwan, Edward T. Auer, Lynne E. Bernstein
Effects of noise adaptation on the perception of voiced plosives in isolated syllables
William A. Ainsworth, T. Cervera
On differential limen of word-based local speech rate variation in Japanese expressed by duration ratio
Makoto Hiroshige, Kenji Araki, Koji Tochinai
A multidimensional scaling study of fricatives; a comparison of perceptual and physical dimensions
Won Tokuma
Reconstructing dialogue history
Marc Swerts, Emiel Krahmer
Timing and interaction of visual cues for prominence in audiovisual speech perception
David House, Jonas Beskow, Björn Granström
Modelling the perceptual identification of Japanese consonants from LPC cepstral distances
Masahiko Komatsu, Shinichi Tokuma, Won Tokuma, Takayuki Arai
Auditory-visual perception of lexical tone
Denis Burnham, Valter Ciocca, Stephanie Stokes
Syllable prominence: a matter of vocal effort, phonetic distinctness and top-down processing
Anders Eriksson, Gunilla C. Thunberg, Hartmut Traunmüller
Perceived prominence in terms of a linguistically motivated quantitative intonation model
Hansjörg Mixdorff, Christina Widera
Perception of coda voicing from properties of the onset and nucleus of 'led' and 'let'
Sarah Hawkins, Noël Nguyen
Auditory filter bank design using masking curves
L. Lin, E. Ambikairajah, W. H. Holmes
A new feature driven cochlear implant speech processing strategy
Dashtseren Erdenebat, Kitazawa Shigeyoshi, Kitamura Tatsuya
Noise robust feature extraction for ASR using the Aurora 2 database
Qifeng Zhu, Markus Iseli, Xiaodong Cui, Abeer Alwan
Investigations into tandem acoustic modeling for the Aurora task
Daniel P.W. Ellis, Manuel J. Reyes Gomez
Recognition performance of the Siemens front-end with and without frame dropping on the Aurora 2 database
Bernt Andrassy, Damjan Vlaj, Christophe Beaugeant
A multiconditional robust front-end feature extraction with a noise reduction procedure based on improved spectral subtraction algorithm
Bojan Kotnik, Zdravko Kacic, Bogomir Horvat
Feature vector selection to improve ASR robustness in noisy conditions
Johan de Veth, Laurent Mauuary, Bernhard Noe, Febe de Wet, Jürgen Sienel, Louis Boves, Denis Jouvet
Comparison of spectral derivative parameters for robust speech recognition
Dusan Macho, Climent Nadeu
Robust digit recognition in noise: an evaluation using the AURORA corpus
Umit Yapanel, John H. L. Hansen, Ruhi Sarikaya, Bryan Pellom
Robust ASR based on clean speech models: an evaluation of missing data techniques for connected digit recognition in noise
Jon Barker, Martin Cooke, Phil Green
Evaluation of the SPLICE algorithm on the Aurora2 database
Jasha Droppo, Li Deng, Alex Acero
Model-based compensation of the additive noise for continuous speech recognition: experiments using the Aurora II database and tasks
José C. Segura, Angel de la Torre, M. Carmen Benitez, Antonio M. Peinado
MAP combination of multi-stream HMM or HMM/ANN experts
Andrew Morris, Astrid Hagen, Hervé Bourlard
Second order statistics spectrum estimation method for robust speech recognition
Bojan Jarc, Rudolf Babic
Feature extraction and model-based noise compensation for noisy speech recognition evaluated on AURORA 2 task
Kaisheng Yao, Jingdong Chen, Kuldip K. Paliwal, Satoshi Nakamura
Broadcast news LM adaptation using contemporary texts
Marcello Federico, Nicola Bertoldi
Topic detection for language model adaptation of highly-inflected languages by using a fuzzy comparison function
Mirjam Sepesy Maucec, Zdravko Kacic
Efficient stochastic finite-state networks for language modelling in spoken dialogue systems
Kallirroi Georgila, Nikos Fakotakis, George Kokkinakis
Language models conditioned on dialog state
Karthik Visweswariah, Harry Printz
Using information retrieval methods for language model adaptation
Langzhou Chen, Jean-Luc Gauvain, Lori Lamel, Gilles Adda, Martine Adda
Making the tongue model talk: merging MRI & EMA measurements
Olov Engwall
The relationship between intraoral air pressure and tongue/palate contact during the articulation of Norwegian /t/ and /d/
Inger Moen, Hanne Gram Simonsen, Morten Huseby, John Grue
Mechanical versus perceptual constraints as determinants of articulatory strategy
Ahmed M. Elgendy, Louis C. W. Pols
Pre-liquid excrescent schwa: what happens when vocalic targets conflict
Bryan Gick, Ian Wilson
Exploring the null space of the acoustic-to-articulatory inversion using a hypercube codebook
Slim Ouni, Yves Laprie
Phoneme-based topic spotting on the Switchboard corpus
M. W. Theunissen, K. Scheffler, J. A. du Preez
Topic styles in IR and TDT: effect on system behavior
Martin Franz, J. Scott McCarley, Todd Ward, Wei-Jing Zhu
Extracting caller information from voicemail
Geoffrey Zweig, Jing Huang, Mukund Padmanabhan
A portability study on natural language call steering
Hong-Kwang Jeff Kuo, Chin-Hui Lee
Improved spoken document retrieval by exploring extra acoustic and linguistic cues
Berlin Chen, Hsin-min Wang, Lin-shan Lee
Native vs non-native production of English vowels in spontaneous speech: an acoustic phonetic study
Kimiko Tsukada
Is non-native pronunciation modelling necessary?
Silke Goronzy, Marina Sahakyan, Wolfgang Wokurek
Burst segmentation and evaluation of acoustic cues
Yves Laprie, Anne Bonneau
The schwa in Albanian
Theodor Granser, Sylvia Moosmüller
A testbed for developing multilingual phonotactic descriptions
Simone Ashby, Julie Carson-Berndsen, Gina Joue
A physiological analysis of nasals and nasalization in Chinese
Wing-Nga Fung, Sze-Lok Lau
A component by component listening test analysis of the IBM trainable speech synthesis system
Robert E. Donovan
Semantic abnormality and its realization in spoken language
Shimei Pan, Kathleen McKeown, Julia Hirschberg
TALKING FOREIGN - concatenative speech synthesis and the language barrier
Nick Campbell
Schwa-assimilation in Danish synthetic speech
Christian Jensen
Text-to-speech synthesis with arbitrary speaker's voice from average voice
Masatsune Tamura, Takashi Masuko, Keiichi Tokuda, Takao Kobayashi
High quality voice conversion based on Gaussian mixture model with dynamic frequency warping
Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano
Voice transformations: from speech synthesis to mammalian vocalizations
Min Tang, Chao Wang, Stephanie Seneff
A new multi-speaker formant synthesizer that applies voice conversion techniques
J. M. Gutiérrez-Arriola, J. M. Montero, J. A. Vallejo, R. Córdoba, R. San-Segundo, Juan M. Pardo
Evaluation of cross-language voice conversion based on GMM and STRAIGHT
Mikiko Mashimo, Tomoki Toda, Kiyohiro Shikano, Nick Campbell
Ejective reduction in Chaha is conditioned by more than prosodic position
Rachel Coulston
Acoustic feature compensation based on decomposition of speech and noise for ASR in noisy environments
Hong Kook Kim, Richard C. Rose, Hong-Goo Kang
A robust front-end algorithm for distributed speech recognition
Yan Ming Cheng, Dusan Macho, Yuanjun Wei, Douglas Ealey, Holly Kelleher, David Pearce, William Kushner, Tenkasi Ramabadran
Robust ASR front-end using spectral-based and discriminant features: experiments on the Aurora tasks
M. Carmen Benitez, Lukas Burget, Barry Chen, Stephane Dupont, Hari Garudadri, Hynek Hermansky, Pratibha Jain, Sachin Kajarekar, Nelson Morgan, Sunil Sivadas
Noise reduction for noise robust feature extraction for distributed speech recognition
Bernhard Noé, Jürgen Sienel, Denis Jouvet, Laurent Mauuary, Johan de Veth, Louis Boves, Febe de Wet
Harmonic tunnelling: tracking non-stationary noises during speech
Douglas Ealey, Holly Kelleher, David Pearce
Resource-limited sentence boundary detection
David Carter, Ian Gransden
Metrics for measuring domain independence of semantic classes
Andrew Pargellis, Eric Fosler-Lussier, Alexandros Potamianos, Chin-Hui Lee
Context-dependent probabilistic hierarchical sublexical modelling using finite state transducers
Xiaolong Mou, Stephanie Seneff, Victor Zue
Data-driven semantic inference for unconstrained desktop command and control
Jerome R. Bellegarda, Kim E. A. Silverman
Information extraction via heuristics for a movie showtime query system
Martin Jansche
Recognition of (almost) spoken words: evidence from word play in Japanese
Takashi Otake, Anne Cutler
Perceptual experiments on enhanced and slowed down speech sentences for second language acquisition
Vincent Colotte, Yves Laprie, Anne Bonneau
The relation between speech intelligibility and the complex modulation spectrum
Steven Greenberg, Takayuki Arai
Envelope information in speech processing: acoustic-phonetic analysis vs. auditory figure-ground segregation
Olivier Crouzet, William A. Ainsworth
A comparison between human vowel normalization strategies and acoustic vowel transformation techniques
Patti Adank, Roeland van Hout, Roel Smits
On large vocabulary continuous speech recognition of highly inflectional language - Czech
P. Ircing, P. Krbec, J. Hajic, J. Psutka, S. Khudanpur, Frederick Jelinek, William Byrne
Towards automatic transcription of spontaneous presentations
Takahiro Shinozaki, Chiori Hori, Sadaoki Furui
A real-time Japanese broadcast news closed-captioning system
Olivier Siohan, Akio Ando, Mohamed Afify, Hui Jiang, Chin-Hui Lee, Qi Li, Feng Liu, Kazuo Onoe, Frank K. Soong, Qiru Zhou
Investigations on conversational speech recognition
Peter Beyerlein, X. Aubert, M. Harris, C. Meyer, Hauke Schramm
Recent advances in speech recognition system for IBM DARPA Communicator
Yuqing Gao, Hakan Erdogan, Yongxin Li, Vaibhava Goel, Michael Picheny
Time and memory efficient Viterbi decoding for LVCSR using a precompiled search network
Daniel Willett, Erik McDermott, Yasuhiro Minami, Shigeru Katagiri
A new verification-based fast match approach to large vocabulary speech recognition
Feng Liu, Mohamed Afify, Hui Jiang, Olivier Siohan
A fast calculation method in LVCSRS by time-skipping and clustering of probability density distributions
Seiichi Nakagawa, Yukihisa Horibe
Speech recognition of Japanese news commentary
Shinichi Homma, Akio Kobayashi, Shoei Sato, Toru Imai, Akio Ando
Festival speaks Italian!
Piero Cosi, Fabio Tesser, Roberto Gretter, Cinzia Avesani, Mike Macon
Multilingual TTS for computer telephony: the Aculab approach
Alex Monaghan, Mahmoud Kassaei, Mark Luckin, Mariscela Amador-Hernandez, Andrew Lowry, Dan Faulkner, Fred Sannier
A flexible multilingual TTS development and speech research tool
Géza Kiss, Géza Németh, Gábor Olaszy, Géza Gordos
Speech synthesis development made easy: the Bonn Open Synthesis System
Esther Klabbers, Karlheinz Stöber, Raymond Veldhuis, Petra Wagner, Stefan Breuer
Automatic prosody generation - a model for Hungarian
Gábor Olaszy, Géza Németh, Péter Olaszi
Evaluation of PROS-3 for the assignment of prosodic structure, compared to assignment by human experts
Olga van Herwijnen, Jacques Terken
Stochastic F0 contour model based on the clustering of F0 shapes of a syntactic unit
Yoichi Yamashita, Tomoyoshi Ishida
Intonational phrase break prediction using decision tree and n-gram model
Xuejing Sun, Ted H. Applebaum
Synthesizing intonation of standard Arabic language
A. Zaki, A. Rajouani, M. Najim
Invariance of relative F0 change field of Chinese disyllabic words
Dawei Xu, Hiroki Mori, Hideki Kasuya
Accent label prediction by time delay neural networks using gating clusters
Achim F. Müller, Rüdiger Hoffmann
Transformation-based learning of Danish stress assignment
Peter Juel Henrichsen
On the prosody of German telephone numbers
Stefan Baumann, Jürgen Trouvain
Emotional speech synthesis: a review
Marc Schröder
Fun or boring? a web-based evaluation of expressive synthesis for children
Kjell Gustafson, David House
Sub-band based additive noise removal for robust speech recognition
Jingdong Chen, Kuldip K. Paliwal, Satoshi Nakamura
Development of an asynchronous multi-band system for continuous speech recognition
Yik-Cheung Tam, Brian Mak
A multi-band approach based on the probabilistic union model and frequency-filtering features for robust speech recognition
Peter Jancovic, Ji Ming
Split-band perceptual harmonic cepstral coefficients as acoustic features for speech recognition
Liang Gu, Kenneth Rose
Error correcting posterior combination for robust multi-band speech recognition
Astrid Hagen, Hervé Bourlard
Robust parameters for speech recognition based on subband spectral centroid histograms
Bojana Gajic, Kuldip K. Paliwal
Pseudo-articulatory representations and the recognition of syllable patterns in speech
William H. Edmondson, Li Zhang
ASR - articulatory speech recognition
Joe Frankel, Simon King
Efficient decoding strategy for conversational speech recognition using state-space models for vocal-tract-resonance dynamics
Jeff Z. Ma, Li Deng
HMM2 - extraction of formant structures and their use for robust ASR
Katrin Weber, Samy Bengio, Hervé Bourlard
Auditory model based speech recognition in noisy environment
Xiaoqing Yu, Wanggen Wan, Daniel P. K. Lun
Forward masking for increased robustness in automatic speech recognition
Sascha Wendt, Gernot A. Fink, Franz Kummert
An auditory system-based feature for robust speech recognition
Qi Li, Frank K. Soong, Olivier Siohan
Experiments with the Philips continuous ASR system on the AURORA noisy digits database
Markus Lieb, Alexander Fischer
Robust digit recognition in noisy environments: the IBM Aurora 2 system
George Saon, Juan M. Huerta, Ea-Ee Jan
Evaluating the Aurora connected digit recognition task - a Bell Labs approach
Mohamed Afify, Hui Jiang, F. Korkmazskiy, Chin-Hui Lee, Qi Li, Olivier Siohan, Frank K. Soong, Arun C. Surendran
Liaison and schwa deletion in French: an effect of lexical frequency and competition?
Cécile Fougeron, J. P. Goldman, U. H. Frauenfelder
An acoustical analysis of the vowels in Beijing Mandarin
Eric Zee, Wai-Sum Lee
Discriminant analysis of nasal vs. oral vowels in French: comparison between different parametric representations
Veronique Delvaux, Alain Soquet
Whispery voiced nasal stops in Rwanda
Didier Demolin, Véronique Delvaux
Prominence correlates: a study of Swedish
Gunnar Fant, Anita Kruckenberg, Johan Liljencrants, Antonis Botinis
Quantitative analysis of the effects of emphasis upon prosodic features of speech
Sumio Ohno, Hiroya Fujisaki
Towards a model of target oriented production of prosody
Grzegorz Dogil, Bernd Möbius
Prosody control for speaking and singing styles
Chilin Shih, Greg Kochanski
Automated modeling of Chinese intonation in continuous speech
Greg Kochanski, Chilin Shih
Prediction of intonation patterns of accented words in a corpus of read Swedish news through pitch contour stylization
Johan Frid
The use of fundamental frequency raising as a strategy for increasing vocal intensity in soft, normal, and loud phonation
Paavo Alku, Juha Vintturi, Erkki Vilkman
Prosodic interactions on segmental durations in Greek
Antonis Botinis, Marios Fourakis, Robert Bannert
Study on factors influencing durations of syllables in Mandarin
Min Chu, Yongqiang Feng
A comparative study of pauses in dialogues and read speech
Sofia Gustafson-Capkova, Beata Megyesi
Detecting Japanese local speech rate deceleration in spontaneous conversational speech using a variable threshold
Keiichi Takamaru, Makoto Hiroshige, Kenji Araki, Koji Tochinai
Modelling fundamental frequency in first post-tonic syllables in Danish sentences
Niels Reinholt Petersen
Non-finality and pre-finality in Bari Italian intonation: a preliminary account
Michelina Savino
Building an integrated prosodic model of German
Hansjörg Mixdorff, Oliver Jokisch
A model of F0 contour for Arabic affirmative and interrogative sentences
Omar A. G. Ibrahim, S.H. El-Ramly, N.S. Abdel-Kader
Variation in final lengthening as a function of topic structure
Caroline L. Smith, Lisa A. Hogan
Do speakers realize the prosodic structure they say they do?
Olga van Herwijnen, Jacques Terken
Coarticulatory effects at prosodic boundaries: some acoustic results
Marija Tabain, Guillaume Rolland, Christophe Savariaux
Generating duration from a cognitively plausible model of rhythm production
Plínio A. Barbosa
A mixture of Gaussians front end for speech recognition
M. N. Stuttle, M. J. F. Gales
Improved maximum mutual information estimation training of continuous density HMMs
Jing Zheng, John Butzberger, Horacio Franco, Andreas Stolcke
Maximum-likelihood training of a bipartite acoustic model for speech recognition
Florent Perronnin, Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua
Analysis of the root-cepstrum for acoustic modeling and fast decoding in speech recognition
Ruhi Sarikaya, John H. L. Hansen
Distinctive features for use in an automatic speech recognition system
Ellen Eide
Improved context-dependent acoustic modeling for continuous Chinese speech recognition
Jiyong Zhang, Fang Zheng, Jing Li, Chunhua Luo, Guoliang Zhang
Class definition in discriminant feature analysis
Jacques Duchateau, Kris Demuynck, Dirk Van Compernolle, Patrick Wambacq
Feature extraction from time-frequency matrices for robust speech recognition
Jose C. Segura, M. Carmen Benitez, Angel de la Torre, Antonio J. Rubio
Using spatial correlation information in speech recognition
Yu Peng, Wang Zuoying
On the choice of classes in MCE based discriminative HMM-training for speech recognizers used in the telephone environment
Josef G. Bauer
Plosive spotting with margin classifiers
Joseph Keshet, Dan Chazan, Ben-Zion Bobrovsky
Model agglomeration for context-dependent acoustic modeling
Fabio Brugnara
Multipass algorithm for acquisition of salient acoustic morphemes
M. Levit, A. L. Gorin, J. H. Wright
Rapid vocal tract length normalization using maximum likelihood estimation
Tadashi Emori, Koichi Shinoda
Towards the creation of acoustic models for stressed Japanese speech
Kozo Okuda, Tomoko Matsui, Satoshi Nakamura
Elderly acoustic model for large vocabulary continuous speech recognition
Akira Baba, Shinichi Yoshizawa, Miichi Yamada, Akinobu Lee, Kiyohiro Shikano
A hybrid approach to enhance task portability of acoustic models in Chinese speech recognition
Jin-Song Zhang, Shu-Wu Zhang, Yoshinori Sagisaka, Satoshi Nakamura
Evaluation of sublexical and lexical models of acoustic disfluencies for spontaneous speech recognition in Spanish
L. J. Rodriguez, I. Torres, A. Varona
Structural learning of dynamic Bayesian networks in speech recognition
Murat Deviren, Khalid Daoudi
Structured language model for class identification of out-of-vocabulary words arising from multiple word classes
Shigehiko Onishi, Hirofumi Yamamoto, Yoshinori Sagisaka
New language models using phrase structures extracted from parse trees
Takatoshi Jitsuhiro, Hirofumi Yamamoto, Setsuo Yamada, Yoshinori Sagisaka
Triggering individual word domains in n-gram language models
E. I. Sicilia-Garcia, Ji Ming, F. J. Smith
A structured statistical language model conditioned by arbitrarily abstracted grammatical categories based on GLR parsing
Tomoyosi Akiba, Katunobu Itou
Speech recognition of broadcast sports news
Atsushi Matsui, Hiroyuki Segi, Akio Kobayashi, Toru Imai, Akio Ando
Improvement of a structured language model: arbori-context tree
Shinsuke Mori, Masafumi Nishimura, Nobuyasu Itoh
Smoothing issues in the structured language model
Woosung Kim, Sanjeev Khudanpur, Jun Wu
The study of the effect of training set on statistical language modeling
Xipeng Shen, Bo Xu
Stochastic finite state automata language model triggered by dialogue states
Yannick Esteve, Frédéric Bechet, Alexis Nasr, Renato De Mori
A baseline method for compiling typed unification grammars into context free language models
Manny Rayner, John Dowding, Beth Ann Hockey
Comparison of width-wise and length-wise language model compression
E. W. D. Whittaker, Bhiksha Raj
Large vocabulary statistical language modeling for continuous speech recognition in Finnish
Vesa Siivola, Mikko Kurimo, Krista Lagus
A new technique based on augmented language models to improve the performance of spoken dialogue systems
R. López-Cózar, D. H. Milone
Pause information for dependency analysis of read Japanese sentences
Kazuyuki Takagi, Kazuhiko Ozeki
An HMM/n-gram-based linguistic processing approach for Mandarin spoken document retrieval
Berlin Chen, Hsin-min Wang, Lin-shan Lee
Probabilistic concept verification for language understanding in spoken dialogue systems
Yi-Chung Lin, Huei-Ming Wang
Turkish word segmentation using morphological analyzer
M. Oguzhan Külekci, Mehmed Özkan
Thai grapheme-to-phoneme using probabilistic GLR parser
Pongthai Tarsaku, Virach Sornlertlamvanich, Rachod Thongprasirt
Aligning prosody and syntax in property grammars
Philippe Blache, Daniel Hirst
From perceptual designs to linguistic typology and automatic language identification: overview and perspectives
Melissa Barkat, Ioana Vasilescu
Morphological approaches for an English pronunciation lexicon
Susan Fitt
An embodiment paradigm for speech recognition systems
Gina Joue, Julie Carson-Berndsen
Multi-parser architecture for query processing
Kui Xu, Fuliang Weng, Helen M. Meng, Po Chui Luk
Two-stage probabilistic approach to text segmentation
Yi-Chia Chen, Yi-Chung Lin
Lexicon optimization for Dutch speech recognition in spoken document retrieval
Roeland Ordelman, Arjan van Hessen, Franciska de Jong
Evaluation of recent speech grammar standardization efforts
Tom Brøndsted
The influence of vocal effort on human speaker identification
Douglas S. Brungart, Kimberly R. Scott, Brian D. Simpson
Improving speaker recognition using phonetically structured Gaussian mixture models
Robert Faltlhauser, Günther Ruske
Information fusion for robust speaker verification
Conrad Sanderson, Kuldip K. Paliwal
A robust speaker verification system against imposture using an HMM-based speech synthesis system
Takayuki Satoh, Takashi Masuko, Takao Kobayashi, Keiichi Tokuda
Sequential decisions for faster and more flexible verification
Arun C. Surendran
Background learning of speaker voices for text-independent speaker identification
Wei-Ho Tsai, Y. C. Chu, Chao-Shih Huang, Wen-Whei Chang
Explicit exploitation of stochastic characteristics of test utterance for text-independent speaker identification
Wei-Ho Tsai, Wen-Whei Chang, Chao-Shih Huang
Improvement of speaker verification for Thai language
Chai Wutiwiwatchai, Varin Achariyakulporn, Sawit Kasuriya
Speaker identification for car infotainment applications
Javier Rodríguez-Saeta, Christian Koechling, Javier Hernando
A system for text dependent speaker verification - field trial evaluation and simulation results
H. Schalk, Herbert Reininger, Stephan Euler
Speaker recognition in a multi-speaker environment
Alvin F. Martin, Mark A. Przybocki
A new DP-like speaker clustering algorithm
Zhijian Ou, Zuoying Wang
On the use of the Bayesian information criterion in multiple speaker detection
P. Sivakumaran, J. Fortuna, A. M. Ariyaeeinia
Preliminary experiments on language identification using broadcast news recordings
Laurent Benarousse, Edouard Geoffrois
Multi-stream statistical n-gram modeling with application to automatic language identification
Katrin Kirchhoff, Sonia Parandekar
Up to what level can acoustical and textual features predict prominence
Barbertje M. Streefkerk, Louis C. W. Pols, Louis F. M. ten Bosch
Linguistic factors affecting timing in Korean with application to speech synthesis
Hyunsong Chung, Mark A. Huckvale
Measuring rhythmic deviation in second language speech
Felix Schaeffler
Good timing: place-dependent voice onset time in ejective stops
Ian Maddieson
Design of an optimal continuous speech database for text-to-speech synthesis considered as a set covering problem
Helene Francois, Olivier Boeffard
Use of clustering information for coarticulation compensation in speech synthesis by word concatenation
Christos Vosnidis, Vassilis Digalakis
Reducing spectral mismatches in concatenative speech synthesis via systematic database enrichment
Maria Founda, George Tambouratzis, Aimilios Chalamandaris, George Carayannis
Hansori 2001 - corpus-based implementation of the Korean Hansori text-to-speech synthesizer
Attila Ferencz, Sung-Woo Choi, Ho-Eun Song, Myoung-Wan Koo
Must diphone synthesis be so unnatural?
William Barry, Claus Nielsen, Ove Andersen
Phonetic effects on listener detection of vowel concatenation
Ann K. Syrdal
Variable-length acoustic units inference for text-to-speech synthesis
Olivier Boeffard
Unit selection for speech synthesis using splicing costs with weighted finite state transducers
Ivan Bulyko, Mari Ostendorf
Cantonese text-to-speech synthesis using sub-syllable units
K. M. Law, Tan Lee, Wai Lau
A comparison of LPC and FFT-based acoustic features for noise robust ASR
Febe de Wet, Bert Cranen, Johan de Veth, Loe Boves
Unsupervised noisy environment adaptation algorithm using MLLR and speaker selection
Miichi Yamada, Akira Baba, Shinichi Yoshizawa, Yuichiro Mera, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano
Applying parallel model compensation with mel-frequency discrete wavelet coefficients for noise-robust speech recognition
Zekeriya Tufekci, John N. Gowdy, Sabri Gurbuz, E. Patterson
Linear interpolation of cepstral variance for noisy speech recognition
Tai-Hwei Hwang, Kuo-Hwei Yuo, Hsiao-Chuan Wang
Evaluation of a generalized dynamic cepstrum in distant speech recognition
Hiroshi Matsumoto, Akihiko Shimizu, Kazumasa Yamamoto
Robust speech/non-speech detection using LDA applied to MFCC for continuous speech recognition
Arnaud Martin, Géraldine Damnati, Laurent Mauuary
Toward noise-tolerant acoustic models
Edmondo Trentin, Marco Gori
Noise estimation without explicit speech, non-speech detection: a comparison of mean, modal and median based approaches
Nicholas W. D. Evans, John S. Mason
Evaluation of front-end features and noise compensation methods for robust Mandarin speech recognition
Rathi Chengalvarayan
ALGONQUIN: iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition
Brendan J. Frey, Li Deng, Alex Acero, Trausti Kristjansson
Robust speech recognition in noise: an evaluation using the SPINE corpus
John H. L. Hansen, Ruhi Sarikaya, Umit Yapanel, Bryan Pellom
Robust speech recognition against packet loss
Manhung Siu, Yu-Chung Chan
Rapid CODEC adaptation for cellular phone speech recognition
Masaki Naito, Shingo Kuroiwa, Tsuneo Kato, Tohru Shimizu, Norio Higuchi
A robust front-end for ASR over IP and GSM networks: an integrated scenario
Ascension Gallardo-Antolin, Carmen Pelaez-Moreno, Fernando Diaz-de-Maria
Robust speech recognition using missing feature theory and vector quantization
Philippe Renevey, Rolf Vetter, Jens Krauss
Modeling the mixtures of known noise and unknown unexpected noise for robust speech recognition
Ji Ming, Peter Jancovic, Philip Hanna, Darryl Stewart
Robust speech recognition based on selective use of missing frequency band HMMs
Takayoshi Kawamura, Kazuya Takeda, Fumitada Itakura
A new method for speech recognition in the presence of non-stationary, unpredictable and high-level noise
Ikuyo Masuda-Katsuse
A computational efficient real time noise robust speech recognition based on improved spectral subtraction method
Bojan Kotnik, Zdravko Kacic, Bogomir Horvat
The use of noisy frame elimination and frequency spectrum magnitude reduction in noise robust speech recognition
Damjan Vlaj, Zdravko Kacic, Bogomir Horvat
Combined linear regression adaptation and Bayesian predictive classification for robust speech recognition
Jen-Tzung Chien
Quantile based histogram equalization for noise robust speech recognition
Florian Hilger, Hermann Ney
Sequential noise compensation by a sequential Kullback proximal algorithm
Kaisheng Yao, Kuldip K. Paliwal, Satoshi Nakamura
Blind speech separation of moving speakers using hybrid neural networks
Athanasios Koutras, Evangelos Dermatas, George Kokkinakis
Computationally efficient frequency-domain combination of acoustic echo cancellation and robust adaptive beamforming
W. Herbordt, H. Buchner, W. Kellermann
Calibration of microphone arrays for improved speech recognition
Michael L. Seltzer, Bhiksha Raj
Improving simultaneous speech recognition in real room environments using overdetermined blind source separation
Athanasios Koutras, Evangelos Dermatas, George Kokkinakis
Real-time sound source localization and separation system and its application to automatic speech recognition
Futoshi Asano, Masataka Goto, Katunobu Itou, Hideki Asoh
An efficient lipreading method using the symmetry of lip
Joohun Lee, JinYoung Kim
Comparing audio- and a-posteriori-probability-based stream confidence measures for audio-visual speech recognition
Martin Heckmann, Thorsten Wild, Frédéric Berthommier, Kristian Kroschel
Large-vocabulary audio-visual speech recognition by machines and humans
Gerasimos Potamianos, Chalapathy Neti, Giridharan Iyengar, Eric Helmuth
Evaluation of an automatically obtained shape and appearance model for automatic audio visual speech recognition
Philippe Daubias, Paul Deleglise
An approach to an Italian talking head
C. Pelachaud, E. Magno-Caldognetto, C. Zmarich, Piero Cosi
Education on the web: launch of three new websites
Anders Eriksson, Gerrit Bloothooft
SProSIG: a special interest group on speech prosody
Daniel Hirst, Bernard Bel, Nick Campbell
SPeaker and Language Characterization (spLC): a special interest group (SIG) of ISCA
Jean-François Bonastre, Ivan Magrin-Chagnolleau, Stephan Euler, François Pellegrino, Régine André-Obrecht, John S. Mason, Frédéric Bimbot
The ISCA special interest group on speech synthesis
Nick Campbell, Wolfgang Hess, Bernd Möbius, Jan van Santen
Auditory visual speech processing
Dominic W. Massaro
The specificity of French speech processing (no proceedings paper)
Frédéric Bimbot, Jean-François Bonastre
SIGdial - special interest group on discourse and dialogue
Laila Dybkjær
Integrating speech technology in language learning: an overview of the activities of inSTIL
Philippe Delcloque
ISCA SALTMIL SIG: speech and language technology for minority languages
Climent Nadeu, Donncha Ó Cróinín, Bojan Petek, Kepa Sarasola, Briony Williams
Training prosodic phrasing rules for Chinese TTS systems
Weijun Chen, Fuzong Lin, Jianmin Li, Bo Zhang
Intonation modelling with a lexicon of natural F0 contours
Per Olav Heggtveit, Jon Emil Natvig
Smooth contour estimation in data-driven pitch modelling
Kim E. A. Silverman, Jerome R. Bellegarda, Kevin A. Lenzo
Generating F0 contours by statistical manipulation of natural F0 shapes
Takashi Saito, Masaharu Sakamoto
Learning prosodic features using a tree representation
Julia Hirschberg, Owen Rambow
Lip-reading from parametric lip contours for audio-visual speech recognition
Sabri Gurbuz, Eric K. Patterson, Zekeriya Tufekci, John N. Gowdy
An investigation of HMM classifier combination strategies for improved audio-visual speech recognition
Simon Lucey, Sridha Sridharan, Vinod Chandran
Combining multi-party speech and text exchanges over the internet
Niels Ole Bernsen, Laila Dybkjær
Real-time multiple speaker tracking by multi-modal integration for mobile robots
Kazuhiro Nakadai, Ken-ichi Hidai, Hiroshi G. Okuno, Hiroaki Kitano
XISL: an attempt to separate multimodal interactions from XML contents
Tsuneo Nitta, Kouichi Katsurada, Hirobumi Yamada, Yusaku Nakamura, Satoshi Kobayashi
Discriminative speaker adaptation with conditional maximum likelihood linear regression
Asela Gunawardana, William Byrne
What is the best type of prior distribution for EMAP speaker adaptation?
Patrick Kenny, Gilles Boulianne, Pierre Dumouchel
Maximum-likelihood affine cepstral filtering (MLACF) technique for speaker normalization
Yoon Kim
A novel algorithm for rapid speaker adaptation based on structural maximum likelihood eigenspace mapping
Bowen Zhou, John H. L. Hansen
Evaluation on unsupervised speaker adaptation based on sufficient HMM statistics of selected speakers
Shinichi Yoshizawa, Akira Baba, Kanako Matsunami, Yuichirou Mera, Miichi Yamada, Akinobu Lee, Kiyohiro Shikano
A novel target-driven MLLR adaptation algorithm with multi-layer structure
Jia Lei, Xu Bo
Scaled likelihood linear regression for hidden Markov model adaptation
Frank Wallhoff, Daniel Willett, Gerhard Rigoll
Fast adaptation using constrained affine transformations with hierarchical priors
Tor Andre Myrvoll, Kuldip K. Paliwal, Torbjørn Svendsen
A context adaptation approach for building context dependent models in LVCSR
Xiaoxing Liu, Baosheng Yuan, Yonghong Yan
Improving genericity for task-independent speech recognition
Fabrice Lefevre, Jean-Luc Gauvain, Lori Lamel
A posteriori and a priori transformations for speaker adaptation in large vocabulary speech recognition systems
Driss Matrouf, Olivier Bellot, Pascal Nocera, Georges Linares, Jean-François Bonastre
Bayesian methods for HMM speech recognition with limited training data
Darryl W. Purnell, Elizabeth C. Botha
Rapid speaker adaptation using MLLR and subspace regression classes
Kwok-Man Wong, Brian Mak
Speaker adaptation of output probabilities and state duration distributions for speech recognition
Nestor Becerra Yoma, Jorge Silva
Cohorts based custom models for rapid speaker and dialect adaptation
Jian Wu, Eric Chang
Speaker adaptation of quantized parameter HMMs
Marcel Vasilache, Olli Viikki
Segmental eigenvoice for rapid speaker adaptation
Yu Tsao, Shang-Ming Lee, Fu-Chiang Chou, Lin-Shan Lee
Speaker adaptation in an ASR system based on nonlinear dynamical systems
Narada D. Warakagoda, Magne H. Johnsen
An interactive directory assistance service for Spanish with large-vocabulary recognition
R. Córdoba, R. San-Segundo, J. M. Montero, J. Colás, J. Ferreiros, J. Macías-Guarasa, Juan M. Pardo
A multilingual-supporting dialog system using a common dialog controller
Yunbiao Xu, Masahiro Araki, Yasuhisa Niimi
Graphic platform for designing and developing practical voice interaction systems
Tomas Nouza, Jan Nouza
Speech translation for French in the NESPOLE! European project
Laurent Besacier, H. Blanchon, Y. Fouquet, J. P. Guilbaud, S. Helme, S. Mazenot, D. Moraru, D. Vaufreydaz
Lessons from the development of a conversational interface
Marianne Hickey, Paul St John Brittan
SCANMail: browsing and searching speech data by content
Julia Hirschberg, Michiel Bacchiani, Don Hindle, Phil Isenhour, Aaron Rosenberg, Litza Stark, Larry Stead, Steve Whittaker, Gary Zamchick
Multi-scale retrieval in MEI: an English-Chinese translingual speech retrieval system
Wai-Kit Lo, Patrick Schone, Helen M. Meng
Compact word graph in spoken dialogue system
Shih-Chieh Chien, Sen-Chia Chang
MINOS-II: a prototype car navigation system with mixed initiative turn taking dialogue
Munehiko Sasajima, Takehide Yano, Taishi Shimomori, Tatsuya Uehara
Use of topic knowledge in spoken dialogue information retrieval system for academic documents
Shinya Kiriyama, Keikichi Hirose, Nobuaki Minematsu
Domain-independent spoken dialogue platform using key-phrase spotting based on combined language model
Kazunori Komatani, Katsuaki Tanaka, Hiroaki Kashima, Tatsuya Kawahara
OASIS natural language call steering trial
Peter J. Durston, Mark Farrell, David Attwater, James Allen, Hong-Kwang Jeff Kuo, Mohamed Afify, Eric Fosler-Lussier, Chin-Hui Lee
First steps toward an adaptive spoken dialogue system in medical domain
Ivano Azzini, Daniele Falavigna, Roberto Gretter, Giordano Lanzola, Marco Orlandi
Mokusei: a telephone-based Japanese conversational system in the weather domain
Mikio Nakano, Yasuhiro Minami, Stephanie Seneff, Timothy J. Hazen, D. Scott Cyphers, James Glass, Joseph Polifroni, Victor Zue
Speechbuilder: facilitating spoken dialogue system development
James Glass, Eugene Weinstein
Voice-IF: a mixed-initiative spoken dialogue system for AT&T conference services
M. Rahim, Giuseppe Di Fabbrizio, C. Kamm, Marilyn Walker, A. Pokrovsky, P. Ruscitti, E. Levin, S. Lee, Ann K. Syrdal, K. Schlosser
Smartkom: multimodal communication with a life-like character
Wolfgang Wahlster, Norbert Reithinger, Anselm Blocher
ISIS: a learning system with combined interaction and delegation dialogs
Helen M. Meng, Shuk Fong Chan, Yee Fong Wong, Cheong Chat Chan, Yiu Wing Wong, Tien Ying Fung, Wai Ching Tsui, Ke Chen, Lan Wang, Ting Yao Wu, Xiaolong Li, Tan Lee, Wing Nin Choi, P. C. Ching, Huisheng Chi
Robust language understanding in MiPad
Ye-Yi Wang
The WITAS multi-modal dialogue system I
Oliver Lemon, Anne Bracy, Alexander Gruenstein, Stanley Peters
Universalizing speech: notes from the USI project
Stefanie Shriver, Roni Rosenfeld, Xiaojin Zhu, Arthur Toth, Alexander I. Rudnicky, Markus Flueckiger
Observations on overlap: findings and implications for automatic processing of multi-party conversation
Elizabeth Shriberg, Andreas Stolcke, Don Baron
Towards SMIL as a foundation for multimodal, multimedia applications
Jennifer L. Beckham, Giuseppe Di Fabbrizio, Nils Klarlund
ANVIL - a generic annotation tool for multimodal dialogue
Michael Kipp
DARPA communicator dialog travel planning systems: the June 2000 data collection
Marilyn Walker, J. Aberdeen, J. Boland, E. Bratt, J. Garofolo, Lynette Hirschman, A. Le, S. Lee, Shrikanth Narayanan, K. Papineni, Bryan Pellom, Joseph Polifroni, Alexandros Potamianos, P. Prabhu, Alexander I. Rudnicky, G. Sanders, Stephanie Seneff, D. Stallard, Steve Whittaker
Analysis of speaker variability
Chao Huang, Tao Chen, Stan Li, Eric Chang, Jianlai Zhou
Speaker recognition by separating phonetic space and speaker space
M. Nishida, Y. Ariki
Eigen-MLLR coefficients as new feature parameters for speaker identification
Nick J.-C. Wang, Wei-Ho Tsai, Lin-Shan Lee
Speaker verification using target and background dependent linear transforms and multi-system fusion
Jiri Navratil, Upendra V. Chaudhari, Ganesh N. Ramaswamy
Testing the perceptual relevance of syntactic completion and melodic configuration for turn-taking in Dutch
Johanneke Caspers
Cues for perceived pitch register
Toni Rietveld, Patricia Vermillion
Language-specific effects of pitch range on the perception of universal intonational meaning
Aoju Chen, Toni Rietveld, Carlos Gussenhoven
Comparing word-level intelligibility after linear vs. non-linear time-compression
Esther Janse
AMSTIVOC (AMsterdam system for transcription of infant VOCalizations) applied to utterances of deaf and normally hearing infants
Florien J. Koopmans-van Beinum, Chris J. Clement, Ineke Van den Dikkenberg-Pot
Using linguopalatal contact patterns to tune a 3D tongue model
Olov Engwall
Electromagnetic articulograph (EMA) based on a nonparametric representation of the magnetic field
Tokihiko Kaburagi, Masaaki Honda
European Portuguese nasal vowels: an EMMA study
A. Teixeira, F. Vaz
The role of the palate in tongue kinematics: an experimental assessment in v sequences from EPG and EMMA data
Susanne Fuchs, Pascal Perrier, Christine Mooshammer
Modelling care of articulation with HMMs is dangerous
Matthew P. Aylett
Spectral tilt as a perturbation-free measurement of noise levels in voice signals
Peter J. Murphy
Estimation of the modulation frequency and modulation depth of the fundamental frequency owing to vocal micro-tremor of the voice source signal
Jean Schoentgen
The perceptual relevance of glottal-pulse parameter variations
Ralph van Dinther, Raymond N.J. Veldhuis, Armin Kohlrausch
Speaker normalization based on test to reference speaker mapping
Marcel Ogner, Zdravko Kacic
A face-to-muscle inversion of a biomechanical face model for audiovisual and motor control research
Michel Pitermann, Kevin G. Munhall
A model of vowel production under positive pressure breathing
Allan J South
Helium speech normalisation by codebook mapping
Adam Podhorski, Marek Czepulonis
Building a corpus of natural speech - and tools for the processing of expressive speech
Nick Campbell
Aspects of modern multi-modal/multi-media corpora exploitation environments
Daan Broeder, Hennie Brugman, Peter Wittenburg
Emerging requirements for multi-modal annotation and analysis tools
Tony Bigbee, Dan Loehr, Lisa Harper
Three-dimensional modelling of speech corpora: added value through visualisation
Toomas Altosaar, Matti Karjalainen, Martti Vainio
The technical processing in smartkom data collection: a case study
Ulrich Türk
Use of real and contaminated speech for training of a hands-free in-car speech recognizer
M. Matassoni, M. Omologo, P. Svaizer
Combined front-end signal processing for in-vehicle speech systems
Jay P. Plucienkowski, John H. L. Hansen, Pongtep Angkititrakul
Robust automatic speech recognition in low-SNR car environments by the application of a connectionist subspace-based approach to the mel-based cepstral coefficients
Sid-Ahmed Selouani, Hesham Tolba, Douglas O'Shaughnessy
Recognition of spelled city names in automotive environments
Andreas Korthauer
Acoustic echo control and noise reduction for cabin car communication
Eduardo Lleida, Enrique Masgrau, Alfonso Ortega
FST-based recognition techniques for multi-lingual and multi-domain spontaneous speech
Timothy J. Hazen, I. Lee Hetherington, Alex Park
A transducer approach to word graph generation
Gilles Boulianne, Pierre Ouellet, Pierre Dumouchel
An efficient implementation of phonological rules using finite-state transducers
I. Lee Hetherington
A weight pushing algorithm for large vocabulary speech recognition
Mehryar Mohri, Michael Riley
Transducer optimizations for tight-coupled decoding
Alexander Seward
A new method for testing communication efficiency and user acceptability of speech communication channels
Sander J. van Wijngaarden, Paula M.T. Smeele, Herman J.M. Steeneken
Phonetic transcriptions in the Spoken Dutch Corpus: how to combine efficiency and good transcription quality
Catia Cucchiarini, Diana Binnenpoorte, Simo Goddijn
A functional approach to speech recognition evaluation
Ben Hutchinson
Instrumental derivation of equipment impairment factors for describing telephone speech codec degradations
Sebastian Möller, Jens Berger
Julius - an open source real-time large vocabulary recognition engine
Akinobu Lee, Tatsuya Kawahara, Kiyohiro Shikano
Local refinement of phonetic boundaries: a general framework and its application using different transition models
Doroteo Torre Toledano, Luis A. Hernández Gómez
Detection of digital transmission systems for voice quality measurements
Thorsten Ludwig, Ulrich Heute
Automatic segmentation of recorded speech into syllables for speech synthesis
Eric Lewis, Mark Tatham
Phonetic events from the labeling the European Portuguese database for speech synthesis, FEUP/IPBDB
João Paulo Teixeira, Diamantino Freitas, Daniela Braga, Maria João Barros, Vagner Latsch
Acoustical and topological experiments for an HMM-based speech segmentation system
Samir Nefti, Olivier Boeffard
TclBLASR: an automatic speech recognition extension for Tcl
Qiru Zhou, Jinsong Zheng, Chin-Hui Lee
Lower WERs do not guarantee better transcriptions
Judith M. Kessens, Helmer Strik
An elitist approach to articulatory-acoustic feature classification
Shuangyu Chang, Steven Greenberg, Mirjam Wester
A Dutch treatment of an elitist approach to articulatory-acoustic feature classification
Mirjam Wester, Steven Greenberg, Shuangyu Chang
Hybrid natural language generation for spoken dialogue systems
Michel Galley, Eric Fosler-Lussier, Alexandros Potamianos
The generation of speech for a search guide
Nicholas J. Cook, Ian D. Benest
An automatic dialogue system generator from the internet information contents
Masahiro Araki, Tasuku Ono, Kiyoshi Ueda, Takuya Nishimoto, Yasuhisa Niimi
Training a sentence planner for spoken dialog: the impact of syntactic and planning features
Monica Rogati, Marilyn Walker, Owen Rambow
A comparative study of MLP-based artificial neural networks in text-independent speaker verification against GMM-based systems
Carlos E. Vivaracho, Javier Ortega-García, Luis Alonso, Quiliano I. Moro
Enhancing GMM scores using SVM "hints"
Shai Fine, Jiri Navratil, Ramesh A. Gopinath
Combining GMM's with support vector machines for text-independent speaker verification
Jamal Kharroubi, Dijana Petrovska-Delacretaz, Gerard Chollet
A text-independent speaker verification system using support vector machines classifier
Yong Gu, Trevor Thomas
A segmental mixture model for speaker recognition
Robert P. Stapert, John S. Mason
Tree based score computation for speaker verification
Raphael Blouet, Frédéric Bimbot
Phonetic speaker recognition
Walter D. Andrews, Mary A. Kohler, Joseph P. Campbell
Speaker recognition based on idiolectal differences between speakers
George Doddington
Advances in automatic speech summarization
Chiori Hori, Sadaoki Furui
A word graph interface for a flexible concept based speech understanding framework
Kadri Hacioglu, Wayne Ward
Comparing grammar-based and robust approaches to speech understanding: a case study
Sylvia Knight, Genevieve Gorrell, Manny Rayner, David Milward, Rob Koeling, Ian Lewin
Integrating multiple knowledge sources for improved speech understanding
Sherif Abdou, Michael Scordilis
Classification of transition sounds with application to automatic speech recognition
Zeev Litichever, Dan Chazan
Gaussian subtraction (GS) algorithms for word spotting in continuous speech
Avi Faizakov, Arnon Cohen, Tzur Vaich
Relating frame accuracy with word error in hybrid ANN-HMM ASR
Michael L. Shire
A two-layer lexical tree based beam search in continuous Chinese speech recognition
Guoliang Zhang, Fang Zheng, Wenhu Wu
Automatic labeling and digesting for lecture speech utilizing repeated speech by shift CDP
Yoshiaki Itoh, Kazuyo Tanaka
Improved phoneme-history-dependent search for large-vocabulary continuous-speech recognition
Takaaki Hori, Yoshiaki Noda, Shoichi Matsunaga
Comparison of MFCC and PLP parameterizations in the speaker independent continuous speech recognition task
Josef Psutka, Ludek Müller, Josef V. Psutka
N-best list generation using word and phoneme recognition fusion
Ernest Pusateri, J.M. Van Thong
A one pass semi-dynamic network decoder based on language model network
Dong-Hoon Ahn, Minhwa Chung
Improving automatic speech recognition using tangent distance
W. Macherey, D. Keysers, J. Dahmen, Hermann Ney
N-best speech hypotheses reordering using linear regression
Ananlada Chotimongkol, Alexander I. Rudnicky
Low-resource hidden Markov model speech recognition
Sabine Deligne, Ellen Eide, Ramesh Gopinath, Dimitri Kanevsky, Benoit Maison, Peder Olsen, Harry Printz, Jan Sedivy
Speech recognition at multiple sampling rates
H. G. Hirsch, K. Hellwig, S. Dobler
Support vector machine with dynamic time-alignment kernel for speech recognition
Hiroshi Shimodaira, Ken-ichi Noma, Mitsuru Nakai, Shigeki Sagayama
Efficient scalable speech compression for scalable speech recognition
Naveen Srinivasamurthy, Antonio Ortega, Shrikanth Narayanan
Voice activity detection in noisy environments
J. Stadermann, V. Stahl, G. Rose
An improved wavelet-based speech enhancement system
Hamid Sheikhzadeh, Hamid Reza Abutalebi
Enhancing distributed speech recognition with back-end speech reconstruction
Tenkasi Ramabadran, Jeff Meunier, Mark Jasiuk, Bill Kushner
Implementation effective one-channel noise reduction system
Jiri Tihelka, Pavel Sovka
Efficient speech enhancement by diffusive gain factors (DGF)
Hyoung-Gook Kim, Klaus Obermayer, Mathias Bode, Dietmar Ruwisch
Correction of the voice timbre distortions on telephone network
Gaël Mahé, André Gilloire
Speech enhancement based on IMM with NPHMM
Yunjung Lee, Joohun Lee, Ki Yong Lee, Katsuhiko Shirai
Speech recognition under musical environments using Kalman filter and iterative MLLR adaptation
M. Fujimoto, Y. Ariki
Dual channel speech enhancement using coherence function and MDL-based subspace approach in bark domain
Rolf Vetter, Philippe Renevey, Jens Krauss
Entropy based voice activity detection in very noisy conditions
Philippe Renevey, Andrzej Drygajlo
Discrimination between speech and music based on a low frequency modulation feature
Stefan Karnebäck
Credibility proof for speech content and speaker verification by fragile watermarking with consecutive frame-based processing
Yiou-Wen Cheng, Lin-Shan Lee
Map estimation for on-line noise compensation of time trajectories of spectral coefficients
I. Potamitis, Nikos Fakotakis, George Kokkinakis
A new method for speech denoising and robust speech recognition using probabilistic models for clean speech and for noise
Hagai Attias, Li Deng, Alex Acero, John C. Platt
Designing very compact decision trees for grapheme-to-phoneme transcription
Anne K. Kienappel, Reinhard Kneser
Using machine learning techniques for grapheme to phoneme transcription
Franco Mana, Paolo Massimino, Alberto Pacchiotti
Knowledge of language origin improves pronunciation accuracy of proper names
Ariadna Font Llitjos, Alan W. Black
On the pronunciation of acronyms in French and in Italian
Philippe Boula de Mareüil, Franck Floricic
Enhancement of noisy speech by using improved global soft decision
Vladimir I. Shin, Doh-Suk Kim, Moo Young Kim, Jeongsu Kim
Enhancement of speech using bark-scaled wavelet packet decomposition
Israel Cohen
A new approach for wavelet speech enhancement
Mohammed Bahoura, Jean Rouat
Speech/noise-dominant decision for speech enhancement
Sukhyun Yoon, Chang D. Yoo
An MCE based classification tree using hierarchical feature-weighting in speech recognition
Fan Wang, Fang Zheng, Wenhu Wu
Selective MCE training strategy in Mandarin speech recognition
Jianlai Zhou, Eric Chang, Chao Huang
Discriminative disfluency modeling for spontaneous speech recognition
Chung-Hsien Wu, Gwo-Lang Yan
Comparative analysis for data-driven temporal filters obtained via principal component analysis (PCA) and linear discriminant analysis (LDA) in speech recognition
Jeih-weih Hung, Hsin-min Wang, Lin-shan Lee
Coding method for successive pitch periods
Ari Heikkinen, Vesa T. Ruoppila, Samuli Pietilä
Objective evaluation of methods for quantization of variable-dimension spectral vectors in WI speech coding
Jani Nurminen, Ari Heikkinen, Jukka Saarinen
Squared error as a measure of phase distortion
Harald Pobloth, W. Bastiaan Kleijn
Non-linear predictive vector quantization of speech
Marcos Faundez-Zanuy
A variable rate hybrid coder based on a synchronized harmonic excitation
Nilantha Katugampala, Ahmet M. Kondoz
A hybrid sub-band sinusoidal coding scheme
M. S. Ho, D. J. Molyneux, B. M. G. Cheetham
Low rate speech coding incorporating simultaneously masked spectrally weighted linear prediction
J. Lukasiak, I. S. Burnett, C. H. Ritz
Narrowband perceptual audio coding: enhancements for speech
Hossein Najaf-Zadeh, Peter Kabal
Techniques for high-quality ACELP coding of wideband speech
B. Bessette, Roch Lefebvre, R. Salami, M. Jelinek, J. Vainio, J. Rotola-Pukkila, H. Mikkola, K. Jarvinen
Wideband ACELP at 16 kb/s with multi-band excitation
Sílvia Pujalte, Asunción Moreno
Wideband speech coding algorithm with application of discrete wavelet transform to upper band
Seung Won Lee, Keun Sung Bae
A switched DPCM/subband coder for pre-echo reduction
S. Satheesh, T. V. Sreenivas
A generalized multistage VQ approach for spectral magnitude quantization
Cagri Özgenc Etemoglu, Vladimir Cuperman
Efficient implementation of ITU-T G.723.1 speech coder for multichannel voice transmission and storage
Sung-Kyo Jung, Young-Cheol Park, Sung-Wan Youn, Kyoung-Tae Kim, Dae-Hee Youn
CU-move: analysis & corpus development for interactive in-vehicle speech systems
John H. L. Hansen, Pongtep Angkititrakul, Jay Plucienkowski, Stephen Gallant, Umit Yapanel, Bryan Pellom, Wayne Ward, Ron Cole
Multimedia data collection of in-car speech communication
Nobuo Kawaguchi, Shigeki Matsubara, Kazuya Takeda, Fumitada Itakura
The U.S. SpeechDat-Car data collection
Peter A. Heeman, David Cole, Andrew Cronk
Word unit based multilingual comparative analysis of text corpora
Géza Németh, Csaba Zainkó
Creating a European English broadcast news transcription corpus and system
Gerhard Backfried, Robert Hecht, Sabine Loots, Norbert Pfannerer, Jürgen Riedler, Christian Schiefer
The NESPOLE! VoIP dialogue database
Susanne Burger, Laurent Besacier, Paolo Coletti, Florian Metze, Céline Morel
Design of speech corpus for text-to-speech synthesis
Jindrich Matousek, Josef Psutka, Jiri Kruta
The IFA corpus: a phonemically segmented Dutch "open source" speech database
Rob J. J. H. van Son, Diana Binnenpoorte, Henk van den Heuvel, Louis C. W. Pols
African speech technology (AST) telephone speech databases: corpus design and contents
Philippa H. Louw, Justus C. Roux, Elizabeth C. Botha
SpeechDat-E: five Eastern European speech databases for voice-operated teleservices completed
Henk van den Heuvel, Jerome Boudy, Zsolt Bakcsi, Jan Cernocky, Valery Galunov, Julia Kochanina, Wojciech Majewski, Petr Pollak, Milan Rusko, Jerzy Sadowski, Piotr Staroniewicz, Herbert S. Tropf
Concordancing for parallel spoken language corpora
Dafydd Gibbon, Thorsten Trippel, Serge Sharoff
Large broadcast news and read speech corpora of spoken Czech
Josef Psutka, Vlasta Radova, Ludek Müller, Jindrich Matousek, Pavel Ircing, David Graff
Development of Russian lexical databases, corpora and supporting tools for speech products
Serge A. Yablonsky
Constructing a segment database for Greek time domain speech synthesis
Stavroula-Evita F. Fotinea, George D. Tambouratzis, George V. Carayannis
Subjective assessment of speech-system interface usability
Kate S. Hone, Robert Graham
An objective measure for estimating MOS of synthesized speech
Min Chu, Hu Peng
Comparing the performance of two CSRs: how to determine the significance level of the differences
Helmer Strik, Catia Cucchiarini, Judith M. Kessens
Prediction of low recognition rate words for isolated word recognition system
Ryuta Terashima, Hiroyuki Hoshino, Toshihiro Wakita
An objective measure for assessment of the concatenative TTS segment inventories
Robert Batusek
Word level confidence annotation using combinations of features
Rong Zhang, Alexander I. Rudnicky
A boosting approach for confidence scoring
Pedro J. Moreno, Beth Logan, Bhiksha Raj
On combining confidence measures for improved rejection of incorrect data
Delphine Charlet, Guy Mercier, Denis Jouvet
Improved word confidence estimation using long range features
David D. Palmer, Mari Ostendorf
Is this conversation on track?
Paul Carpenter, Chun Jin, Daniel Wilson, Rong Zhang, Dan Bohus, Alexander I. Rudnicky
Automatic n-gram language model creation from web resources
Ryuichi Nisimura, Kumiko Komatsu, Yuka Kuroda, Kentaro Nagatomo, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano
On integrating the lexicon with the language model
Diamantino Caseiro, Isabel Trancoso
Back-off smoothing evaluation over syntactic language models
A. Varona, I. Torres
An online incremental language model adaptation method
Genqing Wu, Fang Zheng, Ling Jin, Wenhu Wu
Using boosting and POS word graph tagging to improve speech recognition
Christer Samuelsson, James L. Hieronymus
Robust parsing in spoken dialogue systems
Pengju Yan, Fang Zheng, Mingxing Xu
A theme structure method for the ellipsis resolution
Yinfei Huang, Fang Zheng, Yi Su, Fang Li, Wenhu Wu
Deriving document structure from prosodic cues
Martin Haase, Werner Kriechbaum, Gregor Möhler, Gerhard Stenzel
Design of a semantic parser with support to ellipsis resolution in a Chinese spoken language dialogue system
Yi Su, Fang Zheng, Yinfei Huang
Methodology for dialogue design in telephone-based spoken dialogue systems: a Spanish train information system
R. San-Segundo, J. M. Montero, J. Colás, J. Gutiérrez, J. M. Ramos, Juan M. Pardo
Spoken dialogue management as planning and acting under uncertainty
Bo Zhang, Qingsheng Cai, Jianfeng Mao, Eric Chang, Baining Guo
Modeling of conversational strategy for the robot participating in the group conversation
Yosuke Matsusaka, Shinya Fujie, Tetsunori Kobayashi
Supporting the construction of a user model in speech-only interfaces by adding multi-modality
Jacques Terken, Saskia te Riele
A word- and turn-oriented approach to exploring the structure of Mandarin dialogues
Shu-Chuan Tseng
A rule based approach to extraction of topics and dialog acts in a spoken dialog system
Yasuhisa Niimi, Tomoki Oku, Takuya Nishimoto, Masahiro Araki
Agent-based error handling in spoken dialogue systems
Markku Turunen, Jaakko Hakulinen
Iterative implementation of dialogue system modules
Lars Degerstedt, Arne Jönsson
Off-talk - a problem for human-machine-interaction?
Daniela Oppermann, Florian Schiel, Silke Steininger, Nicole Beringer
Automatic analysis of real dialogues and generating of training corpora
Jana Schwarz, Vaclav Matousek
Natural language understanding using statistical machine translation
Klaus Macherey, Franz Josef Och, Hermann Ney
Improvements in audio processing and language modeling in the CU communicator
Jianping Zhang, Wayne Ward, Bryan Pellom, Xiuyang Yu, Kadri Hacioglu
Dialogue session: management using VoiceXML
Augustine Tsai, Andrew N. Pargellis, Chin-Hui Lee, Joseph P. Olive
Ambiguity representation and resolution in spoken dialogue systems
Egbert Ammicht, Alexandros Potamianos, Eric Fosler-Lussier
Learning of user formulations for business listings in automatic directory assistance
C. Popovici, M. Andorno, P. Laface, L. Fissore, M. Nigra, C. Vair
Mathematical modeling of spoken human - machine dialogues including erroneous confirmations
D. Louloudis, A. Tsopanoglou, Nikos Fakotakis, George Kokkinakis
Limited enquiry negotiation dialogues
Ian Lewin
A comparison of some different techniques for vector based call-routing
Stephen Cox, Ben Shahshahani
Architecture for adaptive multimodal dialog systems based on VoiceXML
Georg Niklfeld, Robert Finan, Michael Pucher
Feature extraction by auditory modeling for unit selection in concatenative speech synthesis
Minoru Tsuzaki
Perceptual cost functions for unit searching in large corpus-based text-to-speech
Minkyu Lee
Pruning of redundant synthesis instances based on weighted vector quantization
Sanghun Kim, Youngjik Lee, Keikichi Hirose
Using real words for recording diphones
Susan Fitt
Application of the trended hidden Markov model to speech synthesis
John Dines, Sridha Sridharan, Miles Moody
Two features to check phonetic transcriptions in text to speech systems
Stefano Sandri, Enrico Zovato
Text-to-speech scripting interface for appropriate vocalisation of e-texts
Gerasimos Xydas, Georgios Kouroupetroglou
Representation of large lexica using finite-state transducers for the multilingual text-to-speech synthesis systems
Matej Rojc, Zdravko Kacic
Corpus-based synthesis of fundamental frequency contours based on a generation process model
Keikichi Hirose, Masaya Eto, Nobuaki Minematsu, Atsuhiro Sakurai
Corpus-based database of residual excitations used for speech reconstruction from MFCCs
Zbyněk Tychtl, Josef Psutka
Mixed excitation for HMM-based speech synthesis
Takayoshi Yoshimura, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura
Aperiodicity control in ARX-based speech analysis-synthesis method
Takahiro Ohtsuka, Hideki Kasuya
Generalized source-filter structures for speech synthesis
Matti Karjalainen, Tuomas Paatero
The speech synthesis environment and parametric modeling of coarticulation
Mikolaj Wypych
Defining constraints for multilinear speech processing
Julie Carson-Berndsen, Michael Walsh
Prosodic models, automatic speech understanding, and speech synthesis: towards the common ground
Anton Batliner, Bernd Möbius, Gregor Möhler, Antje Schweitzer, Elmar Nöth
Introducing phonetically motivated information into ASR
Heidi Christensen, Børge Lindberg, Ove Andersen
Integrating contextual phonological rules in a large vocabulary decoder
Guillaume Gravier, Francois Yvon, Bruno Jacob, Frédéric Bimbot
Automatic learning of finite state automata for pronunciation modeling
M. Pastor-i-Gadea, F. Casacuberta
AMR wideband codec - leap in mobile communication voice quality
J. Rotola-Pukkila, J. Vainio, H. Mikkola, K. Järvinen, B. Bessette, Roch Lefebvre, R. Salami, M. Jelinek
Combined speech and audio coding with bit rate and bandwidth scalability
Maria Farrugia, Ahmet M. Kondoz
Joint speech and audio coding combining sinusoidal modeling and wavelet packets
Mark Fék, Annamária R. Várkonyi-Kóczy, Jean-Marc Boucher
Temporal decomposition: a promising approach to low rate wideband speech compression
C. H. Ritz, I. S. Burnett
Wideband LSF quantization by generalized voronoi codes
Stephane Ragot, Hassan Lahdili, Roch Lefebvre
Separating speaker and environment variabilities for improved recognition in non-stationary conditions
Luca Rigazio, Patrick Nguyen, David Kryze, Jean-Claude Junqua
Robust speech recognition techniques applied to a speech in noise task
Richard C. Rose, Hong Kook Kim, Don Hindle
Minimax classification with parametric neighborhoods for noisy speech recognition
Mohamed Afify, Olivier Siohan, Chin-Hui Lee
Maximum likelihood non-linear transformation for environment adaptation in speech recognition
M. Padmanabhan, S. Dharanipragada
A study of speech coding parameters in speech recognition
Jari Turunen, Damjan Vlaj
Some practical considerations in the deployment of a wireless-communication interactive voice response system
Carmen Garcia-Mateo, Laura Docio-Fernandez, Antonio Cardenal-Lopez
Caller identification for the SCANMail voicemail browser
Aaron Rosenberg, Julia Hirschberg, Michiel Bacchiani, S. Parthasarathy, Philip Isenhour, Larry Stead
Extractive summarization of voicemail using lexical and prosodic feature subset selection
Konstantinos Koumpis, Steve Renals, Mahesan Niranjan
Business listings in automatic directory assistance
Odette Scharenborg, Janienke Sturm, Lou Boves
Eutrans: a speech-to-speech translator prototype
M. Pastor-i-Gadea, A. Sanchis, F. Casacuberta, E. Vidal
Speech recognition over netmeeting connections
Florian Metze, John McDonough, Hagen Soltau
DIARCA: a component approach to voice recognition
Juan C. Díaz Martín, Juan L. García Zapata, José M. Rodríguez García, José F. Álvarez Salgado, Pablo Espada Bueno, Pedro Gómez Vilda
The mvprotek: m-commerce voice verification system
Y. J. Kyung, J. O. Jung, S. M. Sohn, H. J. Chun, S. Y. Moon, M. H. Kim, W. H. Sull
Real-time multilingual communication by means of prestored conversational units
Norman Alm, Mamoru Iwabuchi, Peter N. Andreasen, Kenryu Nakamura, Iain R. Murray
Writing script-based dialogues for AAC
Iain R. Murray, John L. Arnott, Norman Alm, Richard Dye, Gillian Harper
Communication aid for non-vocal people using corpus-based concatenative speech synthesis
Akemi Iida, Yosuke Sakurada, Nick Campbell, Michiaki Yasumura
Social effects on vocal rate with echoic mimicry using prosody-only voice
Noriko Suzuki, Kazuhiko Kakehi, Yugo Takeuchi, Michio Okada
Everyday life sounds and speech analysis for a medical telemonitoring system
Eric Castelli, Dan Istrate
Speaking while driving - preliminary results on spellings in the German speechdat-car database
Christoph Draxler, Klaus Bengler, Christina Olaverri-Monreal
Efficient periodicity extraction based on sine-wave representation and its application to pitch determination of speech signals
Dan Chazan, Meir (Zibulski) Tzur, Ron Hoory, Gilad Cohen
Viseme recognition using multiple feature matching
I. Shdaifat, R. Grigat, Stefan Lütgert
The fundamental frequency of cough by autocorrelation analysis
A. Van Hirtum, D. Berckmans
A fundamental frequency estimation method for noisy speech based on instantaneous amplitude and frequency
Yuichi Ishimoto, Masashi Unoki, Masato Akagi
Robust LP analysis using glottal source HMM with application to high-pitched and noise corrupted speech
Akira Sasou, Kazuyo Tanaka
Fast harmonic estimation using a low resolution pitch for low bit rate harmonic coding
Yong-Soo Choi, Dae-Hee Youn
Comparative evaluation of F0 estimation algorithms
Alain de Cheveigné, Hideki Kawahara
Identification of accent and intonation in sentences for CALL systems
Carlos Toshinori Ishi, Nobuaki Minematsu, Ryuji Nishide, Keikichi Hirose
Systematic F0 glitches around nasal-vowel transitions
Hideki Kawahara, Parham Zolfaghari
Using aerial and geometric features in automatic lip-reading
Jacek C. Wojdel, Leon J. M. Rothkrantz
Inverse filtering of tube models with frequency dependent tube terminations
Karl Schnell, Arild Lacroix
Formant estimation using gammachirp filterbank
Kaïs Ouni, Zied Lachiri, Noureddine Ellouze
Autoregressive time-frequency interpolation in the context of missing data theory for impulsive noise compensation
I. Potamitis, Nikos Fakotakis
Analysis of the voiced speech using the generalized fourier transform with quadratic phase
D. Petrinovic, Vladimir Cuperman
From here to utility - melding phonetic insight with speech technology
Steven Greenberg
Speech quality measure for voIP using wavelet based bark coherence function
Sang-Wook Park, Young-Cheol Park, Dae-Hee Youn
A proposed method for measuring language dependency of narrow band voice coders
Sander J. van Wijngaarden, Herman J. M. Steeneken
An efficient transcoding algorithm for g.723.1 and g.729a speech coders
Sung Wan Yoon, Sung Kyo Jung, Young Cheol Park, Dae Hee Youn
Joint source-channel coding for low bit-rate coding of LSP parameters
Jose L. Perez-Cordoba, Antonio J. Rubio, Antonio M. Peinado, Angel de la Torre
An investigation of modelling aspects for rate-dependent speech recognition
Britta Wrede, Gernot A. Fink, Gerhard Sagerer
Speaking rate dependent acoustic modeling for spontaneous lecture speech recognition
Hiroaki Nanjo, Kazuomi Kato, Tatsuya Kawahara
Analysis of n-best output hypotheses for fast speech in large vocabulary continuous speech recognition
Tibor Fábián, Thilo Pfau, Günther Ruske
Automatic rhythm modeling for language identification
Jérôme Farinas, François Pellegrino
Confidence measure (CM) estimation for large vocabulary speaker-independent continuous speech recognition system
Yaxin Zhang, Raymond Lee, Anton Madievski
Experimental evaluation on confidence of agreement among multiple Japanese LVCSR models
Yasuhiro Kodama, Takehito Utsuro, Hiromitsu Nishizaki, Seiichi Nakagawa
Detection of recognition errors and out of the spelling dictionary names in a spelled name recognizer for Spanish
R. San-Segundo, J. Macías-Guarasa, J. Ferreiros, P. Martín, Juan M. Pardo
Use of acoustic prior information for confidence measure in ASR applications
Erhan Mengusoglu, Christophe Ris
Improving performance of a keyword spotting system by using a new confidence measure
Luciana Ferrer, Claudio Estienne
Word level confidence measures using n-best sub-hypotheses likelihood ratio
Beng T. Tan, Yong Gu, Trevor Thomas
Confidence based lattice segmentation and minimum Bayes-risk decoding
Vaibhava Goel, Shankar Kumar, William Byrne
A data selection strategy for utterance verification in continuous speech recognition
Hui Jiang, Frank K. Soong, Chin-Hui Lee
Improved speech recognition using iterative decoding based on confidence measures
J. Ogata, Y. Ariki
Detection of OOV words using generalized word models and a semantic class language model
Thomas Schaaf
Effects of OOV rates on keyphrase rejection schemes
Gies Bouwman, Janienke Sturm, Lou Boves
A new auditory based microphone array and objective evaluation using e-RASTI
J. L. Sánchez-Bote, J. González-Rodríguez, D. Simón-Zorita
Equivalence between frequency domain blind source separation and frequency domain adaptive null beamformers
Shoko Araki, Shoji Makino, Ryo Mukai, Hiroshi Saruwatari
Separation and dereverberation performance of frequency domain blind source separation for speech in a reverberant environment
Ryo Mukai, Shoko Araki, Shoji Makino
Blind source separation for speech based on fast-convergence algorithm with ICA and beamforming
Hiroshi Saruwatari, Toshiya Kawamura, Kiyohiro Shikano
Noise reduction using paired-microphones for both far-field and near-field sound sources
Mitsunori Mizumachi, Satoshi Nakamura
Statistical sound source identification in a real acoustic environment for robust speech recognition using a microphone array
Takanobu Nishiura, Satoshi Nakamura, Kiyohiro Shikano
Speech enhancement and source separation based on binaural negative beamforming
A. Álvarez-Marquina, P. Gómez-Vilda, R. Martínez-Olalla, V. Nieto-Lluís, V. Rodellar-Biarge
Multiple source separation in the frequency domain using negative beamforming
P. Gómez-Vilda, A. Álvarez-Marquina, V. Nieto-Lluís, V. Rodellar-Biarge, R. Martínez-Olalla
Planar superdirective microphone arrays for speech acquisition in the car
Rainer Martin, Alexey Petrovsky, Thomas Lotter
Is speech data clustered? - statistical analysis of cepstral features
Tomi Kinnunen, Ismo Kärkkäinen, Pasi Fränti
Maximum likelihood adaptation for distant speech recognition of stationary and moving speakers in reverberant environments
George Nokas, Evangelos Dermatas, George Kokkinakis
Model-based blind estimation of reverberation time: application to robust ASR in reverberant environments
Laurent Couvreur, Christophe Ris, Christophe Couvreur
Using the modulation complex wavelet transform for feature extraction in automatic speech recognition
Yasunori Momomura, Kenji Okada, Takayuki Arai, Noboru Kanedera, Yuji Murahara
Separating three simultaneous speeches with two microphones by integrating auditory and visual processing
Hiroshi G. Okuno, Kazuhiro Nakadai, Tino Lourens, Hiroaki Kitano
A time-varying complex AR speech analysis based on GLS and ELS method
Keiichi Funaki
Vocal tract normalization equals linear transformation in cepstral space
Michael Pitz, Sirko Molau, Ralf Schlüter, Hermann Ney
An algorithm for finding line spectrum frequencies of added speech signals and its application to robust speech recognition
An-Tze Yu, Hsiao-Chuan Wang
Improved entropic gain for speech signals analysis/synthesis based on an adaptive time-frequency segmentation scheme
G. Gonon, S. Montrésor, M. Baudry
Automatic word acquisition from continuous speech
Helmut Lucke, Masanori Omote
Why is automatic recognition of children's speech difficult?
Qun Li, Martin J. Russell
Politeness and frustration language in child-machine interactions
Sudha Arunachalam, Dylan Gould, Elaine Andersen, Dani Byrd, Shrikanth Narayanan
Speech emotion recognition using hidden Markov models
Albino Nogueiras, Asunción Moreno, Antonio Bonafonte, José B. Mariño
Speech enhanced remote control for media terminal
Aseel Ibrahim, Jonas Lundberg, Jenny Johansson
The development of a portuguese version of a media watch system
Rui Amaral, Thibault Langlois, Hugo Meinedo, Joao Neto, Nuno Souto, Isabel Trancoso
Classification of video genre using audio
Matthew Roach, John S. Mason
Prosody in finger braille and teletext receiver for finger braille
Yasuo Horiuchi, Akira Ichikawa
Joint channel decoding - Viterbi recognition for wireless applications
Alexis Bernard, Abeer Alwan
MMSE-based channel error mitigation for distributed speech recognition
Antonio M. Peinado, Victoria Sanchez, José C. Segura, José L. Perez-Cordoba
Distributed speech recognition using traditional and hybrid modeling techniques
J. Stadermann, R. Meermeier, Gerhard Rigoll
Graceful degradation of speech recognition performance over lossy packet networks
Eve A. Riskin, Constantinos Boulis, Scott Otterson, Mari Ostendorf
Experiments on cross-language acoustic modeling
Tanja Schultz, Alex Waibel
Crosslingual speech recognition with multilingual acoustic models based on agglomerative and tree-based triphone clustering
Andrej Zgank, Bojan Imperl, Finn Tore Johansen, Zdravko Kacic, Bogomir Horvat
Comparing parameter tying methods for multilingual acoustic modelling
Mikko Harju, Petri Salmela, Jussi Leppänen, Olli Viikki, Jukka Saarinen
Accent-independent universal HMM-based speech recognizer for american, australian and british English
Rathi Chengalvarayan
The effect of time stress on automatic speech recognition accuracy when using second language
Fang Chen, Jonas Sääv
The effect of pitch and lexical tone on different Mandarin speech recognition tasks
Yiu Wing Wong, Eric Chang
Acoustic modeling of foreign words in a German speech recognition system
Georg Stemmer, Elmar Nöth, Heinrich Niemann
Semi-automatic grammar induction for bi-directional English-Chinese machine translation
K. C. Siu, Helen M. Meng
F0 feature extraction by polynomial regression function for monosyllabic Thai tone recognition
Patavee Charnvivit, Somchai Jitapunkul, Visarut Ahkuputra, Ekkarit Maneenoi, Umavasee Thathong, Boonchai Thampanitchawong
The use of prosody in a combined system for punctuation generation and speech recognition
Ji-Hwan Kim, P. C. Woodland
Lexical stress modeling for improved speech recognition of spontaneous telephone speech in the jupiter domain
Chao Wang, Stephanie Seneff
Modeling auxiliary information in Bayesian network based ASR
Todd A. Stephenson, M. Mathew, Herve Bourlard
A new dynamic HMM model for speech recognition
Feili Chen, Eric Chang
Multi-keyword spotting of telephone speech using orthogonal transform-based SBR and RNN prosodic model
Wern-Jun Wang, Chun-Jen Lee, Eng-Fong Huang, Sin-Horng Chen
Recognition of slovenian speech: within and cross-language experiments on monophones using the speechdat(II)
Andrej Iskra, Bojan Petek, Tom Brøndsted
Boiling down prosody for the classification of boundaries and accents in German and English
Anton Batliner, Jan Buckow, Richard Huber, Volker Warnke, Elmar Nöth, Heinrich Niemann
Javaspeakerrecognition - interactive workbench for visualizing speaker recognition concepts on the WWW
Andrzej Drygajlo, Gary Garcia Molina
Prototype of a vocal-tract model for vowel production designed for education in speech science
Takayuki Arai, Nobuyuki Usuki, Yuji Murahara
A tool for automatic feedback on phonemic transcription
Martin Cooke, Maria Luisa Garcia-Lecumberri, John Maidment
Speech lab in a box: a Mandarin speech toolbox to jumpstart speech related research
Eric Chang, Yu Shi, Jianlai Zhou, Chao Huang
Relating phonepass overall scores to the council of europe framework level descriptors
John H. A. L. de Jong, Jared Bernstein
A multilingual, multimodal, speech training system, SPECO
Klára Vicsi, Peter Roach, Anne-Marie Öster, Zdravko Kacic, F. Csatári, A. Sfakianaki, R. Veronik, Géza Gordos
Instantaneous estimation of accentuation habits for Japanese students to learn English pronunciation
Naoki Nakamura, Nobuaki Minematsu, Seiichi Nakagawa
Automatic construction of CALL system from TV news program with captions
Takashi Tanaka, Kazumasa Mori, Satoshi Kobayashi, Seiichi Nakagawa
Pitch-dependent GMMs for text-independent speaker recognition systems
Mijail Arcienega, Andrzej Drygajlo
Towards combining pitch and MFCC for speaker recognition systems
Hassan Ezzaidi, Jean Rouat, Douglas O'Shaughnessy
Formant-broadened CMS using peak-picking in LOG spectrum
Yu-Jin Kim, Hea-Kyoung Jung, Jae-Ho Chung
Improvements in the speaker identification rate using feature-sets
Daniel J. Mashao, N. Tinyiko Baloyi
Minimum classification error training for speaker identification using Gaussian mixture models based on multi-space probability distribution
Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura
Speaker recognition based on feature space trace
Wu Yadong, Li Zhizhu
Additive and convolutional noise canceling in speaker verification using a stochastic weighted viterbi algorithm
Nestor Becerra Yoma, Miguel Villar Fernandez
A multi-SNR subband model for speaker identification under noisy environments
Kenichi Yoshida, Kazuyuki Takagi, Kazuhiko Ozeki