doi: 10.21437/ICSLP.2000
ISSN: 2958-1796
Subglottal pressure and prosody in Swedish
Johan Liljencrants, Gunnar Fant, Anita Kruckenberg
Observation of laryngeal control for voicing and pitch change by magnetic resonance imaging technique
Kiyoshi Honda, Shinobu Masaki, Yasuhiro Shimada
Physiological mechanisms for fundamental frequency control in standard Chinese
Hiroya Fujisaki, Ryou Tomana, Shuichi Narusawa, Sumio Ohno, Changfu Wang
On vocal tract asymmetry/symmetry
René Carré
Are static MRI measurements representative of dynamic speech? results from a comparative study using MRI, EPG and EMA
Olov Engwall
Prosodic control in Chinese TTS system
Shinan Lu, Lin He, Yufang Yang, Jianfen Cao
Multistage coarticulation model combining articulatory, formant and cepstral features
Yuqing Gao, Raimo Bakis, Jing Huang, Bing Xiang
Rhythmic organization and signal characteristics of speech
Osamu Fujimura
Oral culture in the 21st century: the case of speech processing
Sven E. G. Öhman
On the correlation between facial movements, tongue movements and speech acoustics
Jintao Jiang, Abeer Alwan, Lynne E. Bernstein, Patricia Keating, Ed Auer
Coarticulation patterns in identical twins: an acoustic case study
S. P. Whiteside, E. Rixon
Improved lexicon formation through removal of co-articulation and acoustic recognition errors
Philip Hanna, Darryl Stewart, Ji Ming, F. Jack Smith
A two-level approach to the handling of foreign items in Swedish speech technology applications
Anders Lindström, Anna Kasaty
Word repetitions in Japanese spontaneous speech
Yasuharu Den, Herbert H. Clark
The role of language experience in speaker and rate normalization processes
Allard Jongman, Corinne B. Moore
Data-driven importance analysis of linguistic and phonetic information
Achim F. Müller, Jianhua Tao, Rüdiger Hoffmann
Overview of an intelligent system for information retrieval based on human-machine dialogue through spoken language
Hiroya Fujisaki, Katsuhiko Shirai, Shuji Doshita, Seiichi Nakagawa, Keikichi Hirose, Shuichi Itahashi, Tatsuya Kawahara, Sumio Ohno, Hideaki Kikuchi, Kenji Abe, Shinya Kiriyama
The expression and recognition of emotions through prosody
Li-chiung Yang
Prosodic marking of information status in Tokyo Japanese
Marc Swerts, Miki Taniguchi, Yasuhiro Katagiri
Influence of duration on static and dynamic properties of German vowels in spontaneous speech
Britta Wrede, Gernot A. Fink, Gerhard Sagerer
The regular accent in Chinese sentences
Bo Zheng, Bei Wang, Yufang Yang, Shinan Lu, Jianfen Cao
A tool for the synchronization of speech and mouth shapes: LIPS
Odile Mella, Dominique Fohr, Laurent Martin, Andreas Carlen
Semantic tree unification grammar: a new formalism for spoken language processing
Mohamed-Zakaria Kurdi
Identification of utterance intention in Japanese spontaneous spoken dialogue by use of prosody and keyword information
Akira Kurematsu, Yousuke Shionoya
Improved speech understanding using dialogue expectation in sentence parsing
Sherif Abdou, Michael Scordilis
The use of belief networks for mixed-initiative dialog modeling
Helen M. Meng, Carmen Wai, Roberto Pieraccini
Integrating flexibility into a structured dialogue model: some design considerations
Michael F. McTear, Susan Allen, Laura Clatworthy, Noelle Ellison, Colin Lavelle, Helen McCaffery
A task-independent dialogue controller based on the extended frame-driven method
Yasuhisa Niimi, Tomoki Oku, Takuya Nishimoto, Masahiro Araki
Language modeling for dialog system
Wei Xu, Alex Rudnicky
Building stochastic language model networks based on simultaneous word/phrase clustering
Kallirroi Georgila, Nikos Fanotakis, George Kokkinakis
Prosody and topic structuring in spoken dialogue
Li-chiung Yang, Richard Esposito
Elements of conversational computing - a paradigm shift
Stéphane H. Maes
Rejection and key-phrase spotting techniques using a mumble model in a Czech telephone dialog system
Ludek Müller, Filip Jurcicek, Lubos Smidl
Continuous listening for unconstrained spoken dialog
Tim Paek, Eric Horvitz, Eric Ringger
Audio signals in speech interfaces
Stefanie Shriver, Alan W. Black, Ronald Rosenfeld
Visualisation of spoken dialogues
Péter Pál Boda
The construction of speech output to support elderly visually impaired users starting to use the internet
Mary Zajicek
Effects of word string language models on noisy broadcast news speech recognition
Kazuyuki Takagi, Rei Oguro, Kazuhiko Ozeki
Semantic tokenization of verbalized numbers in language modeling
Xiaoqiang Luo, Martin Franz
Automatic transcription of lecture speech using topic-independent language modeling
Kazuomi Kato, Hiroaki Nanjo, Tatsuya Kawahara
Extending grammars based on similar-word recognition
Rocio Guillén, Randal Erman
Particle-based language modelling
E. W. D. Whittaker, P. C. Woodland
Lexical tree decoding with a class-based language model for Chinese speech recognition
W. N. Choi, Y. W. Wong, Tan Lee, P. C. Ching
Impact of bucketing on performance of linearly interpolated language models
K. Visweswariah, H. Printz, M. Picheny
An embedded knowledge integration for hybrid language modelling
Shuwu Zhang, Hirofumi Yamamoto, Yoshinori Sagisaka
Hierarchical statistical language models: experiments on in-domain adaptation
Lucian Galescu, James Allen
A language model for conversational speech recognition using information designed for speech translation
Hirofumi Yamamoto, Kouichi Tanigaki, Yoshinori Sagisaka
Optimizing BNF grammars through source transformations
Bob Carpenter, Sol Lerner, Roberto Pieraccini
On enhancing Katz-smoothing based back-off language model
Jian Wu, Fang Zheng
Can artificial neural networks learn language models?
Wei Xu, Alex Rudnicky
Improving language model perplexity and recognition accuracy for medical dictations via within-domain interpolation with literal and semi-literal corpora
Guergana Savova, Michael Schonwetter, Sergey Pakhomov
Placing structuring elements in a word sequence for generating new statistical language models
Karl Weilhammer, Günther Ruske
Dynamic selection of language models in a dialogue system
Yannick Estève, Frédéric Béchet, Renato de Mori
Stochastic modeling of semantic content for use in a spoken dialogue system
Magne H. Johnsen, Trym Holter, Torbjørn Svendsen, Erik Harborg
Spoken word recognition using the artificial evolution of a set of vocabulary
Tomio Takara, Eiji Nagaki
Deeplistener: harnessing expected utility to guide clarification dialog in spoken language systems
Eric Horvitz, Tim Paek
Chinese spoken language understanding across domain
Yunbin Deng, Bo Xu, Taiyi Huang
Interpolation of stochastic grammar and word bigram models in natural language understanding
Sven C. Martin, Andreas Kellner, Thomas Portele
A portable development tool for spoken dialogue systems
Satoru Kogure, Seiichi Nakagawa
Error-tolerant language understanding for spoken dialogue systems
Yi-Chung Lin, Huei-Ming Wang
Language modeling by stochastic dependency grammar for Japanese speech recognition
Akinori Ito, Chiori Hori, Masaharu Katoh, Masaki Kohda
A tagger-aided language model with a stack decoder
Ruiqiang Zhang, Ezra Black, Andrew Finch, Yoshinori Sagisaka
Generalizing prosodic prediction of speech recognition errors
Julia Hirschberg, Diane Litman, Marc Swerts
Toward unconstrained command and control: data-driven semantic inference
Jerome R. Bellegarda, Kim E. A. Silverman
Continuous speech recognition with parse filtering
Ken Hanazawa, Shinsuke Sakai
Investigating text normalization and pronunciation variants for German broadcast transcription
Martine Adda-Decker, Gilles Adda, Lori Lamel
A comparison of data-derived and knowledge-based modeling of pronunciation variation
Mirjam Wester, Eric Fosler-Lussier
A bottom-up method for obtaining information about pronunciation variation
Judith M. Kessens, Helmer Strik, Catia Cucchiarini
Semi-continuous segmental probability modeling for continuous speech recognition
Jiyong Zhang, Fang Zheng, Mingxing Xu, Ditang Fang
Acoustic modelling using modular/ensemble combinations of heterogeneous neural networks
Christos A. Antoniou, T. Jeff Reynolds
Unifying HMM and phone-pair segment models
Hsiao-Wuen Hon, Shankar Kumar, Kuansan Wang
Multi-group mixture weight HMM
Ming Li, Tiecheng Yu
Application of pattern recognition neural network model to hearing system for continuous speech
Tetsuro Kitazoe, Tomoyuki Ichiki, Makoto Funamori
Data-dependent kernels in SVM classification of speech patterns
Nathan Smith, Mahesan Niranjan
Exploiting frequency-scaling invariance properties of the scale transform for automatic speech recognition
S. Umesh, Richard C. Rose, S. Parthasarathy
Large vocabulary continuous speech recognition under real environments using adaptive sub-band spectral subtraction
Masahiro Fujimoto, Jun Ogata, Yasuo Ariki
Perceptual harmonic cepstral coefficients as the front-end for speech recognition
Liang Gu, Kenneth Rose
Optimization of sub-band weights using simulated noisy speech in multi-band speech recognition
Yik-Cheung Tam, Brian Mak
On the use of speaking rate as a generalized feature to improve decision trees
Robert Faltlhauser, Thilo Pfau, Günther Ruske
Syllable recognition using glides based on a non-linear transformation
Jun Toyama, Masaru Shimbo
Consonant discrimination in elicited and spontaneous speech: a case for signal-adaptive front ends in ASR
Kemal Sönmez, Madelaine Plauché, Elizabeth Shriberg, Horacio Franco
A new approach for multi-band speech recognition based on probabilistic graphical models
Khalid Daoudi, Dominique Fohr, Christophe Antoine
Test of several external posterior weighting functions for multiband full combination ASR
Hervé Glotin, Frédéric Berthommier
Using the modulation wavelet transform for feature extraction in automatic speech recognition
Kanji Okada, Takayuki Arai, Noburu Kanederu, Yasunori Momomura, Yuji Murahara
AM-demodulation of speech spectra and its application to noise robust speech recognition
Qifeng Zhu, Abeer Alwan
Comparison of HMM experts with MLP experts in the full combination multi-band approach to robust ASR
Astrid Hagen, Andrew Morris
Using multiple time scales in the framework of multi-stream speech recognition
Astrid Hagen, Hervé Bourlard
Streamlining the front end of a speech recognizer
Hua Yu, Alex Waibel
Reconstruction of damaged spectrographic features for robust speech recognition
Bhiksha Raj, Michael L. Seltzer, Richard M. Stern
Impact of speaking style and speaking task on acoustic models
Janienke Sturm, Hans Kamperman, Lou Boves, Els den Os
Encoded speech recognition accuracy improvement in adverse environments by enhancing formant spectral bands
Shubha Kadambe, Ron Burns
Soft decisions in missing data techniques for robust automatic speech recognition
Jon Barker, Ljubomir Josifovski, Martin Cooke, Phil Green
New tone recognition methods for Chinese continuous speech
Jian Liu, Tiecheng Yu
Reliable bands guided similarity measure for noise-robust speech recognition
Bo Zhang, Gang Peng, William S.-Y. Wang
A novel feature extraction using multiple acoustic feature planes for HMM-based speech recognition
Tsuneo Nitta, Masashi Takigawa, Takashi Fukuda
Integrating the energy information into MFCC
Fang Zheng, Guoliang Zhang
Speaker independent phoneme recognition by MLP using wavelet features
Omar Farooq, Sekharjit Datta
A corpus-based approach for robust ASR in reverberant environments
Laurent Couvreur, Christophe Couvreur, Christophe Ris
Modeling out-of-vocabulary words for robust speech recognition
Issam Bazzi, James R. Glass
Hidden Markov model environmental compensation for automatic speech recognition on hand-held mobile devices
Bojana Gajic, Richard C. Rose
A neural network for classification with incomplete data: application to robust ASR
Andrew C. Morris, Ljubomir Josifovski, Hervé Bourlard, Martin Cooke, Phil Green
Feature-dependent allophone clustering
Shigeki Matsuda, Mitsuru Nakai, Hiroshi Shimodaira, Shigeki Sagayama
Data-driven lexical modeling of pronunciation variations for ASR
Qian Yang, Jean-Pierre Martens
Fuzzy entropy hidden Markov models for speech recognition
Dat Tran, Michael Wagner
Adjacent node continuous-state HMMs
Carl Quillen
Modelling phonetic context using head-body-tail models for connected digit recognition
Janienke Sturm, Eric Sanders
Using support vector machines for spoken digit recognition
Issam Bazzi, Dina Katabi
Data-driven model construction for continuous speech recognition using overlapping articulatory features
Jiping Sun, Xing Jing, Li Deng
Speech recognition using HMMs with quantized parameters
Marcel Vasilache
A perception and PDE based nonlinear transformation for processing spoken words
Yingyong Qi, Jack Xin
Training of isolated word recognizers with continuous speech
Reinhard Blasig, Georg Rose, Carsten Meyer
Repair patterns in spontaneous Chinese dialogs: morphemes, words, and phrases
Shu-Chuan Tseng
Improvement of a physiological articulatory model for synthesis of vowel sequences
Jianwu Dang, Kiyoshi Honda
Computation of 3-d vocal tract acoustics based on mode-matching technique
Kunitoshi Motoki, Xavier Pelorson, Pierre Badin, Hiroki Matsuzaki
Exploring vowel production strategies from infant to adult by means of articulatory inversion of formant data
Lucie Ménard, Louis-Jean Boë
Segmentation of a speech waveform according to glottal open and closed phases using an autoregressive-HMM
Gavin Smith, Tony Robinson
Comparison of inverse filtering of the flow signal and microphone signal
Rosemary Orr, Bert Cranen, Felix de Jong, Lou Boves
Inter- and intra-speaker variability of glottal flow derivative using the LF model
Markus R. Iseli, Abeer Alwan
Multi-level annotation for spoken language corpora
Philippe Blache, Daniel Hirst
CASS: a phonetically transcribed corpus of Mandarin spontaneous speech
Aijun Li, Fang Zheng, William Byrne, Pascale Fung, Terri Kamm, Yi Liu, Zhanjiang Song, Umar Ruhi, Veera Venkataramani, XiaoXia Chen
Multiple decision-tree strategy for input-error robustness: a simulation of tree combinations
Kazuhide Yamamoto, Eiichiro Sumita
Discriminative training on language model
Zheng Chen, Kai-Fu Lee, Ming-jing Li
N-gram distribution based language model adaptation
Jianfeng Gao, Mingjing Li, Kai-Fu Lee
Towards a common phone alphabet for multilingual speech recognition
Francisco Palou, P. Bravetti, O. Emam, V. Fischer, Eric Janke
What's next: a case study in the multidimensionality of a dialog system
Robert Belvin, Ron Burns, Cheryl Hein
A new dialogue control method based on human listening process to construct an interface for ascertaining a user's inputs
Masanobu Higashida, Kumiko Ohmori
Spoken language understanding in a Chinese spoken dialogue system engine
XianFang Wang, LiMin Du
Statistical methods for topic segmentation
Satya Dharanipragada, Martin Franz, J. Scott McCarley, K. Papineni, Salim Roukos, T. Ward, W.-J. Zhu
Retrieval of Mandarin broadcast news using spoken queries
Berlin Chen, Hsin-min Wang, Lin-shan Lee
CU-move: robust speech processing for in-vehicle speech systems
John H. L. Hansen, Jay Plucienkowski, Stephen Gallant, Bryan Pellom, Wayne Ward
A rule-based named entity recognition system for speech input
Ji-Hwan Kim, Philip C. Woodland
A rule-based approach to Farsi language text-to-phoneme conversion
Mohammad Reza Sadigh, Hamid Sheikhzadeh, M. R. Jahangir, Arash Farzan
Acoustic and perceptual properties of English fricatives
Allard Jongman, Yue Wang, Joan Sereno
The special phonological characteristics of monosyllabic function words in English
Stefanie Shattuck-Hufnagel, Nanette Veilleux
Selection of sublexical units for continuous speech recognition of Basque
Miren Karmele López de Ipiña, Inés Torres, Lourdes Oñederra, Amparo Varona, Luis Javier Rodríguez
Machine learning techniques for the identification of cues for stop place
Madelaine C. Plauché, Kemal Sönmez
Strategies of vowel reduction - a speaker-dependent phenomenon
Christina Widera
Syllable-final /s/ lenition in the LDC's callhome Spanish corpus
Michelle A. Fox
Meaning extraction based on frame representation for Japanese spoken dialogue
Akira Kurematsu, Takeaki Nakazaki
Pitch accents, boundary tones and turn-taking in Dutch map task dialogues
Johanneke Caspers
An annotation scheme of spoken dialogues with topic break indexes
Yoichi Yamashita, Michiyo Murai
Application of the centering framework in spontaneous dialogues
Nanette Veilleux
Automatic lexicon generation and dialogue modeling for spontaneous speech
Hiroki Mori, Hideki Kasuya
Evaluating radio news intonation - autosegmental versus superpositional modelling
Maria Wolters, Hansjörg Mixdorff
A mixed language model for a dialogue system over the telephone
Daniele Falavigna, Roberto Gretter, Marco Orlandi
Positive and negative user feedback in a spoken dialogue corpus
Linda Bell, Joakim Gustafson
Stress and lexical activation in Dutch
Anne Cutler, Mariëtte Koster
Automatic modeling and implementation of intonation for the Arabic language in TTS systems
Safa Nasser Eldin, Hanna Abdel Nour, Rajouani Abdenbi
Modeling word durations
Venkata Ramana Rao Gadde
Japanese intonation synthesis using superposition and linear alignment models
Jennifer J. Venditti, Jan P. H. van Santen
Improving the naturalness of synthetic speech by utilizing the prosody of natural speech
Toshimitsu Minowa, Ryo Mochizuki, Hirofumi Nishimura
A hybrid statistical/RNN approach to prosody synthesis for Taiwanese TTS
Sin-Horng Chen, Chen-Chung Ho
Performance comparison among HMM, DTW, and human abilities in terms of identifying stress patterns of word utterances
Nobuaki Minematsu, Yukiko Fujisawa, Seiichi Nakagawa
Restricted-domain female-voice synthesis in Spanish: from database design to ANN prosodic modeling
Juan Manuel Montero, Ricardo Córdoba, José A. Vallejo, Juana Gutiérrez-Arriola, Emilia Enríquez, Juan Manuel Pardo
A hierarchical intonation model for synthesising F0 contours in Galician language
Xavier Fernández-Salgado, Eduardo R. Banga
Features for F0 contour prediction
Ted H. Applebaum, Nick Kibre, Steve Pearson
Prosodic variation of focused syllables of disyllabic word in Mandarin Chinese
Zhenglai Gu, Hiroki Mori, Hideki Kasuya
Automatic head gesture learning and synthesis from prosodic cues
Stephen M. Chu, Thomas S. Huang
Measuring the importance of morphological information for Finnish speech synthesis
Martti Vainio, Toomas Altosaar, Stefan Werner
Learning the parameters of quantitative prosody models
Oliver Jokisch, Hansjörg Mixdorff, Hans Kruschke, Ulrich Kordon
A method for automatic extraction of parameters of the fundamental frequency contour
Shuichi Narusawa, Hiroya Fujisaki, Sumio Ohno
Recognition of emotional states using voice, face image and thermal image of face
Tetsuro Kitazoe, Sung-Ill Kim, Yasunari Yoshitomi, Tatsuhiko Ikeda
Turn taking and multimodal information in two-people dialog
Keiko Watanuki, Susumu Seki, Hideo Miyoshi
Implementation of a text-to-speech system for Farsi language
Hamid Reza Abutalebi, Mahmood Bijankhan
Recognition of emotion in a realistic dialogue scenario
Richard Huber, Anton Batliner, Jan Buckow, Elmar Nöth, Volker Warnke, Heinrich Niemann
Differentiation in tone production in Cantonese-speaking hearing-impaired children
Johanna Barry, Peter Blamey, Kathy Lee, Dilys Cheung
Learning effects for phonetic properties of synthetic speech
Martine van Zundert, Jacques Terken
An empirical study of the effectiveness of speech-recognition-based pronunciation training
Laura Mayfield Tomokiyo, Le Wang, Maxine Eskenazi
Automatic detection of mispronounced phonemes for language learning tools
Olivier Deroo, Christophe Ris, Sofie Gielen, Johan Vanparys
Estimation of duration models for phonemes in Mexican speech synthesis
Horacio Meza Escalona, Ingrid Kirschning, Ofelia Cervantes Villagómez
Special text processing based external descriptor rule
Xiaoru Wu, Renhua Wang, Guoping Hu
Articulatory synthesis using a vocal-tract model of variable length
Zhenli Yu, Shangcui Zeng
Linguistic-prosodic processing for text-to-speech synthesis in Italian
Philippe Boula de Mareüil
A unified approach for speech synthesis and speech recognition using stochastic Markov graphs
Matthias Eichner, Matthias Wolff, Rüdiger Hoffmann
Using F0 within a phonologically motivated method of unit selection
Andrew Breen, James Salter
Analysis of the degradation of French vowels induced by the TD-PSOLA algorithm, in text-to-speech context
Christophe J. Blouin, Paul C. Bagshaw
Automatic construction of acoustic inventory for the concatenative speech synthesis for Polish
Artur Janicki
Universal and multilingual unit selection for DRESS
Diane Hirschfeld, Matthias Wolff
Improving speech synthesis for high intelligibility under adverse conditions
Davis Pan, Brian Heng, Shiufun Cheung, Ed Chang
Development of a formant-based analysis-synthesis system and generation of high quality liquid sounds of Japanese
Nobuyuki Nishizawa, Nobuaki Minematsu, Keikichi Hirose
Synthesizing and evaluating an artificial language: Klingon
Oliver Jokisch, Matthias Eichner
Non-standard word and homograph resolution for asian language text analysis
Craig Olinsky, Alan W. Black
Re-estimation of LPC coefficients in the sense of L∞ criterion
Zhang Sen, Katsuhiko Shirai
An efficient codebook search algorithm for EVRC
Sung-Kyo Jung, Yong-Soo Choi, Young-Cheol Park, Dae-Hee Youn
The reduction of the search time by the pre-determination of the grid bit in the G.723.1 MP-MLQ
Jong-Kuk Kim, Jeong-Jin Kim, Myung-Jin Bae
Real-time telephone transmission simulation for speech recognizer and dialogue system evaluation and improvement
Sebastian Möller, Hervé Bourlard
HMM-based echo and announcement modeling approaches for noise suppression avoiding the problem of false triggers
Rathinavelu Chengalvarayan, David L. Thomson
Speaker information enhancement
Fangxin Chen
Exhaustive search for lower-bound error-rates in vocal tract length normalization
Hans Dolfing
Use of voicing information to improve the robustness of the spectral parameter set
Dusan Macho, Climent Nadeu
Residual noise compensation by a sequential EM algorithm for robust speech recognition in nonstationary noise
Kaisheng Yao, Bertram E. Shi, Satoshi Nakamura, Zhigang Cao
Principal mixture speaker adaptation for improved continuous speech recognition
Hui Ye, Pascale Fung, Taiyi Huang
Reduced impedance mismatch in speech database access
Toomas Altosaar, Martti Vainio
Internet training system for listening and pronunciation of Chinese stop consonants
Jiapeng Tian, Jouji Miwa
Identification of Japanese double-mora phonemes considering speaking rate for the use in CALL systems
Carlos Toshinori Ishi, Keikichi Hirose, Nobuaki Minematsu
Phonological processing in the auditory system: a new class of stimuli and advances in fMRI techniques
Roy D. Patterson, Stefan Uppenkamp, Dennis Norris, William Marslen-Wilson, Ingrid Johnsrude, Emma Williams
Brain regions responsible for word retrieval, speech production and deficient word fluency in elderly people: a PET activation study
Itaru F. Tatsumi, Michio Senda, Kenji Ishii, Masahiro Mishina, Masashi Oyama, Hinako Toyama, Keiichi Oda, Masayuki Tanaka, Yasuyuki Gondo
MEG-measurements of brain activity reveal the link between human speech production and perception
Paavo Alku, Hannu Tiitinen, Kalle J. Palomäki, Päivi Sivonen
Normal and impaired processing in quasi-regular domains of language: the case of English past-tense verbs
Karalyn Patterson, Matthew A. Lambon Ralph, Helen Bird, John R. Hodges, James L. McClelland
Neuropsychological and computational evidence for a model of lexical processing, verbal short-term memory and learning
Nadine Martin, Eleanor M. Saffran, Gary S. Dell, Myrna F. Schwartz, Prahlad Gupta
Normal and impaired reading of Japanese kanji and kana
Takao Fushimi, Mutsuo Ijuin, Naoko Sakuma, Masayuki Tanaka, Tadahisa Kondo, Shigeaki Amano, Karalyn Patterson, Itaru F. Tatsumi
A connectionist approach to naming disorders of Japanese in dyslexic patients
Mutsuo Ijuin, Takao Fushimi, Karalyn Patterson, Naoko Sakuma, Masayuki Tanaka, Itaru Tatsumi, Tadahisa Kondo, Shigeaki Amano
Impaired pronunciations of kanji words by Japanese CVA patients
Taeko N. Wydell, Takako Shinkai
Disability of phonological versus visual information processes in Japanese dyslexic children
Akira Uno, M. Kaneko, N. Haruhara, M. Kaga
Lexical tone in the spoken word recognition of Chinese
Xiaolin Zhou, Yanxuan Qu
Lexical tone in the speech production of Chinese words
Xiaolin Zhou, Jie Zhuang
Prosody generation in Chinese synthesis using the template of quantified prosodic unit and base intonation contour
Yu Hu, Qin-Feng Liu, Ren-Hua Wang
Multi-strategy data mining on Mandarin prosodic patterns
Yiqiang Chen, Wen Gao, Tingshao Zhu, Jiyong Ma
A unified view on synchronized overlap-add methods for prosodic modifications of speech
Werner Verhelst, Dirk van Compernolle, Patrick Wambacq
Chinese tone modeling with Stem-ML
Chilin Shih, Greg P. Kochanski
Perceptually based automatic prosody labeling and prosodically enriched unit selection improve concatenative text-to-speech synthesis
Colin W. Wightman, Ann K. Syrdal, Georg Stemmer, Alistair Conkie, Mark Beutnagel
Tonal structure of yes-no question intonation in Chaha
Zhiqiang Li, Degif Petros Banksira
Improved tone recognition by normalizing for coarticulation and intonation effects
Chao Wang, Stephanie Seneff
Discriminating Chinese lexical tones by anchoring F0 features
Jin-Song Zhang, Satoshi Nakamura, Keikichi Hirose
Universal and language-specific effects in the perception of question intonation
Carlos Gussenhoven, Aoju Chen
The interplay and interaction between prosody and syntax: evidence from Mandarin Chinese
Chiu-Yu Tseng, Da-De Chen
A quantitative description of German prosody offering symbolic labels as a by-product
Hansjörg Mixdorff, Hiroya Fujisaki
Towards a universal speech interface
Roni Rosenfeld, Xiaojin Zhu, Arthur Toth, Stefanie Shriver, Kevin Lenzo, Alan W. Black
A domain model centered approach to spoken language dialog systems
Dale Russell
From multilingual multimodal spoken language acquisition towards on-line assistance to intermittent human interpreting: SIM*, a versatile environment for SLP
Georges Fafiotte, Jian-She Zhai
Informational characterization of dialogue states
Matthias Denecke
A new method for dialogue management in an intelligent system for information retrieval
Kenji Abe, Kazushige Kurokawa, Kazunari Taketa, Sumio Ohno, Hiroya Fujisaki
The AT&T-DARPA communicator mixed-initiative spoken dialog system
Esther Levin, Shrikanth Narayanan, Roberto Pieraccini, Konstantin Biatov, E. Bocchieri, Giuseppe Di Fabbrizio, Wieland Eckert, S. Lee, A. Pokrovsky, Mazin Rahim, P. Ruscitti, M. Walker
Integrating multimodal language processing with speech recognition
Srinivas Bangalore, Michael Johnston
Task and domain specific modelling in the Carnegie Mellon communicator system
Alexander I. Rudnicky, Christina Bennett, Alan W. Black, Ananlada Chotimongkol, Kevin Lenzo, Alice Oh, Rita Singh
Adapt - a multimodal conversational dialogue system in an apartment domain
Joakim Gustafson, Linda Bell, Jonas Beskow, Johan Boye, Rolf Carlson, Jens Edlund, Björn Granström, David House, Mats Wirén
Implementation of a multimodal dialog system using extended markup languages
Kuansan Wang
ORION: from on-line interaction to off-line delegation
Stephanie Seneff, Chian Chuu, D. Scott Cyphers
Practical spoken language translation using compiled feature structure grammars
Lei Duan, Alexander Franz, Keiko Horiguchi
ISIS: A multilingual spoken dialog system developed with CORBA and KQML agents
Helen Meng, Shuk Fong Chan, Yee Fong Wong, Tien Ying Fung, Wai Ching Tsui, Tin Hang Lo, Cheong Chat Chan, Ke Chen, Lan Wang, Ting Yao Wu, Xiaolong Li, Tan Lee, Wing Nin Choi, Yiu Wing Wong, P. C. Ching, Huisheng Chi
New feature parameters for detecting misunderstandings in a spoken dialogue system
Jun-Ichi Hirasawa, Noboru Miyazaki, Mikio Nakano, Kiyoaki Aikawa
Toward an acoustic-articulatory model of inter-speaker variability
Parham Mokhtari, Frantz Clermont, Kazuyo Tanaka
Degrees of freedom of tongue movements in speech may be constrained by biomechanics
Pascal Perrier, Joseph Perkell, Yohan Payan, Majid Zandipour, Frank Guenther, Ali Khalighi
Gestural overlap, place of articulation and speech rate - an x-ray investigation
Béatrice Vaxelaire, Rudolph Sock, Pascal Perrier
Articulatory compensation and adaptation for unexpected palate shape perturbation
Masaaki Honda, Akinori Fujino
Modeling of a speech production system based on MRI measurement of three-dimensional vocal tract shapes during fricative consonant phonation
Takuya Niikawa, Masafumi Matsumura, Takashi Tachimura, Takeshi Wada
Improving acoustic-to-articulatory inversion by using hypercube codebooks
Slim Ouni, Yves Laprie
Concatenative Arabic speech synthesis using large speech database
Wael M. Hamza, Mohsen A. Rashwan
A new speech classifier based on Yinyang compensatory soft computing theory
Dong Chen, Jingming Kuang, Yan Zhang
New models predicting conversational effects of telephone transmission on speech communication quality
Sebastian Möller, Ute Jekosch, Alexander Raake
A novel search algorithm for LSF VQ
Jinyu Li, Xin Luo, Ren-Hua Wang
Conversational networking: conversational protocols for transport, coding, and control
Stéphane H. Maes, Dan Chazan, Gilad Cohen, Ron Hoory
A low bit rate speech coding method using a formant-articulatory parameter nomogram
Hiroshi Ohmura, Akira Sasou, Kazuyo Tanaka
Variable bit-rate sinusoidal transform coding using variable order spectral estimation
Ning Li, Derek J. Molyneux, Meau Shin Ho, B. M. G. Cheetham
Efficient harmonic-CELP based hybrid coding of speech at low bit rates
Yong-Soo Choi, Sueng-Kyun Ryu, Young-Cheol Park, Dae-Hee Youn
Speech enhancement based on a constrained sinusoidal model
Jesper Jensen, John H. L. Hansen
A bark coherence function for perceived speech quality estimation
Sang-Wook Park, Seung-Kyun Ryu, Young-Cheol Park, Dae-Hee Youn
A high-efficiency scheme for secure speech transmission using spatiotemporal chaos synchronization
Jinyu Kiang, Kun Deng, Ronghuai Huang
Application of speaker authentication technology to a telephone dialogue system
Leandro Rodríguez Liñares, Carmen García Mateo
Language recognition using time-frequency principal component analysis and acoustic modeling
Michel Dutat, Ivan Magrin-Chagnolleau, Frédéric Bimbot
Comparative study of GMM, DTW, and ANN on Thai speaker identification system
Chularat Tanprasert, Varin Achariyakulporn
Efficient mixed-order hidden Markov model inference
Ludwig Schwardt, Johan du Preez
Speaker identification and verification using eigenvoices
Olivier Thyes, Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua
A priori threshold selection for fixed vocabulary speaker verification systems
Arun C. Surendran, Chin-Hui Lee
Application of LDA to speaker recognition
Qin Jin, Alex Waibel
Automatic language identification using mixed-order HMMs and untranscribed corpora
Ludwig Schwardt, Johan du Preez
On the potential threat of using large speech corpora for impostor selection in speaker verification
Johan Lindberg, Mats Blomberg
Phonetic consistency in Spanish for pin-based speaker verification system
J. Ortega-Garcia, J. G. Rodriguez, D. T. Merino
An auditory feature extraction method based on forward-masking and its application in robust speaker identification and speech recognition
Zhimin Liu, Xihong Wu, Bin Zhen, Huisheng Chi
Transition-oriented hidden Markov models for speaker verification
S. Douglas Peters, Matthieu Hébert, Daniel Boies
An LLR-based technique for frame selection for GMM-based text-independent speaker identification
Pang Kuen Tsoi, Pascale Fung
Robust speaker recognition based on high order cumulant
Jiyong Ma, Wen Gao
Two-stage speaker identification system based on VQ and NBDGMM
Luo Si, Qi Xiu Hu
A MAP approach, with synchronous decoding and unit-based normalization for text-dependent speaker verification
Johnny Mariethoz, Johan Lindberg, Frédéric Bimbot
A fast search method of speaker identification for large population using pre-selection and hierarchical matching
Zhibin Pan, Koji Kotani, Tadahiro Ohmi
Optimal fusion of diverse feature sets for speaker identification: an alternative method
Lan Wang, Ke Chen, Huisheng Chi
Transformation enhanced multi-grained modeling for text-independent speaker recognition
Upendra V. Chaudhari, Jiri Navrátil, Stéphane H. Maes, Ramesh Gopinath
Imposture using synthetic speech against speaker verification based on spectrum and pitch
Takashi Masuko, Keiichi Tokuda, Takao Kobayashi
Speaker recognition with recurrent neural networks
Shahla Parveen, Abdul Qadeer, Phil Green
Speaker feature extraction from pitch information based on spectral subtraction for speaker identification
Yoshiroh Itoh, Jun Toyama, Masaru Shimbo
Text-independent speaker identification using Gaussian mixture bigram models
Wei-Ho Tsai, Chiwei Che, Wen-Whei Chang
Comparison of MFCC and pitch synchronous AM, FM parameters for speaker identification
Hassan Ezzaidi, Jean Rouat
Speaker verification in mismatch training and testing conditions
Marcos Faúndez-Zanuy, Adam Slupinski
Determination of threshold for speaker verification using speaker adaptation gain in likelihood during training
Toshiaki Uchibe, Shingo Kuroiwa, Norio Higuchi
Accent-specific Mandarin adaptation based on pronunciation modeling technology
Mingkuan Liu, Bo Xu
In search of paralinguistic features
Hyun Bok Lee
A prominence based model of Swedish intonation
Gunnar Fant, Anita Kruckenberg
Roles of voice source dynamics as a conveyer of paralinguistic features
Hideki Kasuya, Masanori Yoshizawa, Kikuo Maekawa
Influence of paralinguistic information on segmental articulation
Kikuo Maekawa, Takayuki Kagomiya
Analysis and modeling of the effect of paralinguistic information upon the local speech rate
Sumio Ohno, Yoshimitsu Sugiyama, Hiroya Fujisaki
Rhythm of spoken Chinese - linguistic and paralinguistic evidences -
Jianfen Cao
Identification and discrimination of syntactically and pragmatically contrasting intonation patterns by native and non-native speakers of standard Japanese
Sanae Eda
Articulatory characteristics of emotional utterances in spoken English
Donna Erickson, Arthur Abramson, Kikuo Maekawa, Tokihiko Kaburagi
Analytical and perceptual study on the role of acoustic features in realizing emotional speech
Keikichi Hirose, Nobuaki Minematsu, Hiromichi Kawanami
Expression of emotion and attitude through temporal speech variations
Sylvie J. L. Mozziconacci, Dik J. Hermes
A cross-cultural investigation of emotion inferences from voice and speech: implications for speech technology
Klaus R. Scherer
Speaker dependent emotion recognition using speech signals
Bong-Seok Kang, Chul-Hee Han, Sang-Tae Lee, Dae-Hee Youn, Chungyong Lee
Concatenative text-to-speech synthesis based on prototype waveform interpolation (a time frequency approach)
Edmilson S. Morais, Paul Taylor, Fábio Violaro
A corpus-based Chinese speech synthesis with contextual dependent unit selection
Ren-Hua Wang, Zhongke Ma, Wei Li, Donglai Zhu
Segment selection in the L&H Realspeak laboratory TTS system
Geert Coorman, Justin Fackrell, Peter Rutten, Bert Van Coile
A Taiwanese (Min-nan) text-to-speech (TTS) system based on automatically generated synthetic units
Ren-yuan Lyu, Zhen-hong Fu, Yuang-chin Chiang, Hui-mei Liu
Puretalk: a high quality Japanese text-to-speech system
Masayuki Yamada, Yasuo Okutani, Toshiaki Fukada, Takashi Aso, Yasuhiro Komori
Using cross-syllable units for Cantonese speech synthesis
Ka Man Law, Tan Lee
Limited domain synthesis
Alan W. Black, Kevin A. Lenzo
Coupling dialogue and prosody computation in spoken dialogue generation
Christine H. Nakatani, Jennifer Chu-Carroll
A study on the pitch pattern of a singing voice synthesis system based on the cepstral method
Tomio Takara, Kazuto Izumi, Keiichi Funaki
Automatic methods for lexical stress assignment and syllabification
Steve Pearson, Roland Kuhn, Steven Fincke, Nick Kibre
Using Bayesian belief networks for model duration in text-to-speech systems
Olga Goubanova, Paul Taylor
Comparing static and dynamic features for segmental cost function calculation in concatenative speech synthesis
Diane Hirschfeld
Temporal patterns of critical-band spectrum for text-to-speech
Pratibha Jain, Hynek Hermansky
Successive cohort selection (SCS) for text-independent speaker verification
Eric H. C. Choi, Jianming Song
Fuzzy normalisation methods for speaker verification
Dat Tran, Michael Wagner
Speaker verification in operational environments - monitoring for improved service operation
Yong Gu, Hans Jongebloed, Dorota Iskra, Els den Os, Lou Boves
On-line unsupervised adaptation in speaker verification
Larry P. Heck, Nikki Mirghafori
Multiple sub-band systems for speaker verification
P. Sivakumaran, A. M. Ariyaeeinia, Jill A. Hewitt
An orthogonal GMM based speaker verification system
Xiaoxing Liu, Baosheng Yuan, Yonghong Yan
A naive de-lambing method for speaker identification
Qin Jin, Alex Waibel
The Lincoln speaker recognition system: NIST eval2000
Douglas A. Reynolds, R. Bob Dunn, Jack L. McLaughlin
Foldering voicemail messages by caller using text independent speaker recognition
Aaron E. Rosenberg, S. Parthasarathy, Julia Hirschberg, Stephen Whittaker
Structural framework for combining speaker recognition methods
Claude Montacié, Marie-José Caraty
Bootstrapping for speaker recognition
Walter D. Andrews, Joseph P. Campbell, Douglas A. Reynolds
On the importance of components of the MFCC in speech and speaker recognition
Bin Zhen, Xihong Wu, Zhimin Liu, Huisheng Chi
On the influence of rate, pitch, and spectrum on automatic speaker recognition performance
Thomas F. Quatieri, R. Bob Dunn, Douglas A. Reynolds
A model-based transformational approach to robust speaker recognition
Remco Teunen, Ben Shahshahani, Larry Heck
Contrastive lateral clicks and variation in click types
Amanda Miller-Ockhuizen, Bonny E. Sands
Analysis of acoustic models trained on a large-scale Japanese speech database
Tomoko Matsui, Masaki Naito, Yoshinori Sagisaka, Kozo Okuda, Satoshi Nakamura
Farsi vowel compensatory lengthening: an experimental approach
Mahmood Bijankhan
Cortical reorganization associated with the acquisition of Mandarin tones by American learners: an fMRI study
Yue Wang, Joan A. Sereno, Allard Jongman, Joy Hirsch
The production of real and non-words in adult stutterers and non-stutterers: an acoustic study
S. P. Whiteside, R. A. Varley, T. Phillips, H. Garety
A new proposal of laryngeal features for the tonal system of Vietnamese
Masaaki Shimizu, Masatake Dantsuji
How to choose training set for language modeling
Hong Zhang, Bo Xu, Taiyi Huang
High performance "general purpose" phonetic recognition for Italian
Piero Cosi, John-Paul Hosom
First approach to the selection of lexical units for continuous speech recognition of Basque
Miren Karmele López de Ipiña, Inés Torres, Lourdes Oñederra, Amparo Varona, N. Ezeiza, M. Peñagarikano, M. Hernandez, Luis Javier Rodriguez
Assimilation, ambiguity, and the feature parsing problem
David W. Gow Jr.
Optimization of units for continuous-digit recognition task
Sachin S. Kajarakar, Hynek Hermansky
Perceptual features for the identification of Romance languages
Ioana Vasilescu, Francois Pellegrino, Jean-Marie Hombert
Perception of Swedish vowel quantity: tracing late stages of development
Dawn M. Behne, Peter E. Czigler, Kirk P. H. Sullivan
Statistically trained orthographic to sound models for Thai
Ananlada Chotimongkol, Alan W. Black
Speech timing patterning as an indicator of discourse and syntactic boundaries
Janice Fon, Keith Johnson
On the phonetics of geminates: evidence from Cypriot Greek
Amalia Arvaniti, Georgios Tserdanelis
A simple procedure to clarify the relation between text and prosody
Hanny den Ouden, Carel van Wijk, Marc Swerts
Effects of consonantal voicing on English diphthongs: a comparison of L1 and L2 production
Kimiko Tsukada
The challenge of non-lexical speech sounds
Nigel Ward
A method to synthesize Arabic from short phonetic
Yousif A. El-Imam
A Brazilian Portuguese language corpus development
Mauricio C. Schramm, Luis Felipe R. Freitas, Adriano Zanuz, Dante Barone
Visual lipreading of voicing for French stop consonants
C. Colin, Monique Radeau, Didier Demolin, A. Soquet
Acoustic features of vowel production in Mandarin speakers of English
Yang Chen, Michael Robb
Spoken language navigation systems for drivers
Robert Belvin, Ron Burns, Cheryl Hein
An approach to intelligent Chinese dialogue system
Fang Chen, Baozong Yuan
Goal-oriented table-driven design for dialogue manager
Huei-Ming Wang, Yi-Chung Lin
Dialogue management in the Bell Labs communicator system
Alexandros Potamianos, Egbert Ammicht, Hong-Kwang J. Kuo
Dialogue management based on a hierarchical task structure
Jiang Han, Yong Wang
Melodic characteristics of backchannels in Dutch map task dialogues
Johanneke Caspers
Corrections in spoken dialogue systems
Marc Swerts, Diane Litman, Julia Hirschberg
F0 correlates of topic and subject in spontaneous Japanese speech
John Fry
Specification of communicative acts of utterances based on dialogue corpus analysis
Mutsuko Tomokiyo, Solange Hollard
An experimental verification of the prosodic/lexical effects on the occurrence of backchannels
Hiroaki Noguchi, Yasuhiro Katagiri, Yasuharu Den
The acoustic characteristics of Japanese identical vowel sequences in connected speech
Tsutomu Sato, John A. Maidment
Effects of dialog initiative and multi-modal presentation strategies on large directory information access
Shrikanth Narayanan, Giuseppe Di Fabbrizio, C. Kamm, James Hubbell, B. Buntschuh, P. Ruscitti, Jerry H. Wright
A declarative framework for building compositional dialog modules
William Thompson, Harry Bliss
A plan-based dialog system with probabilistic inferences
Kuansan Wang
Generating effective confirmation and guidance using two-level confidence measures for dialogue systems
Kazunori Komatani, Tatsuya Kawahara
Intelligent barge-in in conversational systems
Nikko Ström, Stephanie Seneff
A system for the research into multi-modal man-machine communication within a virtual environment
Andrew Breen, Barry Eggleton, Gavin Churcher, Paul Deans, Simon Downey
Advances in automatic transcription of Italian broadcast news
Fabio Brugnara, Mauro Cettolo, Marcello Federico, Diego Giuliani
Live thesaurus construction for interactive voice-based web search
Shui-Lung Chuang, Hsiao-Tieh Pu, Wen-Hsiang Lu, Lee-Feng Chien
Selecting TV news stories and newswire articles related to a target article of newswire using SVM
Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi
Towards an integrated approach for spoken document retrieval
Kenney Ng
An experimental study of an audio indexing system for the web
Beth Logan, Pedro Moreno, Jean-Manuel van Thong, Ed Whittaker
Title generation for spoken broadcast news using a training corpus
Rong Jin, Alex G. Hauptmann
Evaluating different information retrieval algorithms on real-world data
Manfred Weber, Thomas Kemp
Transcription and summarization of voicemail speech
Konstantinos Koumpis, Steve Renals
Robust rejection for embedded systems
W. C. Tsai, Y. C. Chu
Multimodal signal processing in naturalistic noisy environments
Sharon Oviatt
A multi-modal dialog system for business transactions
Joyce Chai, Sylvie Levesque, Margorzata Budzikowska, Veronika Horvath, Nanda Kambhatla, Nicolas Nicolov, Wlodek Zadrozny
Office message center - a spoken dialogue system
Jiang Han, Yonghong Yan, Zhiwei Lin, Yong Wang, Jian Liu, Danjun Liu, Zhihui Wang
A new method for understanding sequences of utterances by multiple speakers
Noboru Miyazaki, Jun-ichi Hirasawa, Mikio Nakano, Kiyoaki Aikawa
Improvement of dialogue efficiency by dialogue control model according to performance of processes
Hideaki Kikuchi, Katsuhiko Shirai
MUXING: a telephone-access Mandarin conversational system
C. Wang, D. Scott Cyphers, Xiaolong Mou, Joseph Polifroni, Stephanie Seneff, J. Yi, Victor Zue
Jaspis - a framework for multilingual adaptive speech applications
Markku Turunen, Jaakko Hakulinen
The CU communicator: an architecture for dialogue systems
Bryan Pellom, Wayne Ward, Sameer Pradhan
Preferred modalities in dialogue systems
Vildan Bilici, Emiel Krahmer, Saskia te Riele, Raymond Veldhuis
Introduction to the IST-HLT project speech-driven multimodal automatic directory assistance (SMADA)
Fréderic Béchet, Elisabeth den Os, Lou Boves, Jürgen Sienel
Using HPSG to represent multi-modal grammar in multi-modal dialogue
Crusoe Mao, Tony Tuo, Danjun Liu
An efficient dialogue control method under system's limited knowledge
Kohji Dohsaka, Norihito Yasuda, Noboru Miyazaki, Mikio Nakano, Kiyoaki Aikawa
A distributed spoken user interface based on open agent architecture (OAA)
Ying Cheng, Anurag Gupta, Raymond Lee
Bimodal speech recognition using coupled hidden Markov models
Stephen M. Chu, Thomas S. Huang
A parallel multi-stream model for sign language recognition
Jiyong Ma, Wen Gao
MOTHER: a new generation of talking heads providing a flexible articulatory control for video-realistic speech animation
Lionel Revéret, Gérard Bailly, Pierre Badin
Modeling visual coarticulation in synthetic talking heads using a lip motion unit inventory with concatenative synthesis
Steve Minnis, Andrew Breen
A generation system for Chinese texts
Hua Wu, Taiyi Huang, Bo Xu
Formal and natural language generation in the Mercury conversational system
Stephanie Seneff, Joseph Polifroni
A method of creating a new speaker's voicefont in a text-to-speech system
Takashi Saito, Masaharu Sakamoto
Signal approximation in Hilbert space and its application on articulatory speech synthesis
Jun Huang, Stephen Levinson, Mark Hasegawa-Johnson
Quality improvement of PSOLA analysis-synthesis using partial zero-phase conversion
Nobuaki Minematsu, Seiichi Nakagawa
A machine learning approach to Swedish word pronunciation
Hanna Lindgren, Jessica Granberg
An improved speech analysis-synthesis algorithm based on the autoregressive with exogenous input speech production model
Takahiro Ohtsuka, Hideki Kasuya
Combination of temporal trajectory filtering and projection measure for robust speaker identification
Kuo-Hwei Yuo, Tai-Hwei Hwang, Hsiao-Chuan Wang
A combined adaptive and decision tree based speech separation technique for telemedicine applications
Yunxin Zhao, Xiao Zhang, Xiaodong He, Laura Schopp
Additive and convolutional noises compensation for speaker recognition
Olivier Bellot, Driss Matrouf, Teva Merlin, Jean-François Bonastre
Dialect adaptation for Mandarin Chinese speech recognition
Frédéric Beaugendre, Tom Claes, Hugo van Hamme
Can automatic speaker verification be improved by training the algorithms on emotional speech?
Klaus R. Scherer, Tom Johnstone, Gudrun Klasmeyer, Thomas Bänziger
New distance measures for text-independent speaker identification
Zhong-Hua Wang, Cheng Wu, David Lubensky
Automatic speech recognition in Mandarin for embedded platforms
Fengguang Zhao, Prabhu Raghavan, Sunil K. Gupta, Ziyi Lu, Wentao Gu
Confidence measure based unsupervised speaker adaptation
Husheng Li, Jia Liu, Runsheng Liu
Improved variable preselection list length estimation using NNs in a large vocabulary telephone speech recognition system
Javier Macías-Guarasa, Javier Ferreiros, José Colás, A. Gallardo-Antolín, Juan Manuel Pardo
Incorporating multiple-HMM acoustic modeling in a modular large vocabulary speech recognition system in telephone environment
Ascensión Gallardo-Antolín, Javier Ferreiros, Javier Macías-Guarasa, R. de Córdoba, Juan Manuel Pardo
Decision tree based text-to-phoneme mapping for speech recognition
Janne Suontausta, Juha Häkkinen
Reduced traceback matrix storage for small footprint model alignment
Jeff Meunier
Dynamic adaptation of vocabulary independent HMMs to an application environment
Claudio Vair, Luciano Fissore, Pietro Laface
Synergy of spectral and perceptual features in multi-source connectionist speech recognition
Roberto Gemello, Loreta Moisa, Pietro Laface
High performance connected digit recognition through gender-dependent acoustic modelling and vocal tract length normalisation
Ramalingam Hariharan, Olli Viikki
Transcription of broadcast news with a time constraint: IBM's 10xRT HUB4 system
Ellen Eide, Benoît Maison, D. Kanevsky, P. Olsen, S. Chen, L. Mangu, M. Gales, Miroslav Novak, Ramesh Gopinath
Exact alpha-beta computation in logarithmic space with application to MAP word graph construction
Geoffrey Zweig, Mukund Padmanabhan
Relationship among speaking style, inter-phoneme's distance and speech recognition performance
Kazumasa Yamamoto, Seiichi Nakagawa
Spanish recogniser of continuously spelled names over the telephone
Ruben San-Segundo, José Colás, Javier Ferreiros, Javier Macías-Guarasa, Juan Miguel Pardo
Two-stream modeling of Mandarin tones
Frank Seide, Nick J. C. Wang
A neural network speech recognizer based on the both acoustic steady portions and transitions
Seyyed Ali Seyyed Salehi
Belief networks for a syntactic and semantic analysis of spoken utterances for speech understanding
Marc Hofmann, Manfred Lang
A robust speech understanding system using conceptual relational grammar
Jiping Sun, Roberto Togneri, Li Deng
Incorporating tone information into Cantonese large-vocabulary continuous speech recognition
Wai Lau, Tan Lee, Yiu Wing Wong, P. C. Ching
A novel loss function for the overall risk criterion based discriminative training of HMM models
Janez Kaiser, Bogomir Horvat, Zdravko Kacic
Looking for topic similarities of highly inflected languages for language model adaptation
Mirjam Sepesy Maucec, Zdravko Kacic, Bogomir Horvat
Integrating MAP and linear transformation for language model adaptation
David Janiszek, Frédéric Béchet, Renato De Mori
Utterance verification based speech recognition system
Beng Tiong Tan, Yong Gu, Trevor Thomas
Use of linear extrapolation based linear predictive cepstral features (LE-LPCC) for Tamil speech recognition
Rathinavelu Chengalvarayan
Robust fundamental frequency estimation using instantaneous frequencies of harmonic components
Yoshinori Atake, Toshio Irino, Hideki Kawahara, Jinlin Lu, Satoshi Nakamura, Kiyohiro Shikano
Integrating different acoustic and syntactic language models in a continuous speech recognition system
Amparo Varona, Inés Torres, Miren Karmele López de Ipiña, Luis Javier Rodriguez
Combining multiple speech recognizers using voting and language model information
Holger Schwenk, Jean-Luc Gauvain
Dialogue management based on inferred behavioral goal - improving the accuracy of understanding by dialogue context -
Keisuke Watanabe, Yasushi Ishikawa
Speech recognition using context conditional word posterior probabilities
Ralf Schlüter, Frank Wessel, Hermann Ney
The use of syllable segmentation information in continuous speech recognition hybrid systems applied to the Portuguese language
Hugo Meinedo, Joao P. Neto
Combination of acoustic models in continuous speech recognition hybrid systems
Hugo Meinedo, Joao P. Neto
Automatic speech recognition of non-native speakers using consonant-vowel-consonant (CVC) words
David A. van Leeuwen, Sander J. van Wijngaarden
Understanding Chinese in spoken dialogue systems
Gang Zhao, Hong Xu
A front-end using the harmonicity cue for speech enhancement in loud noise
Frédéric Berthommier, Hervé Glotin, Emmanuel Tessier
Lucent automatic speech recognition: a speech recognition engine for internet and telephony service applications
Qiru Zhou, Sergey Kosenko
Automatic speech recognition using dynamic Bayesian networks with both acoustic and articulatory variables
Todd A. Stephenson, Hervé Bourlard, Samy Bengio, Andrew C. Morris
Towards robust telephony speech recognition in office and automobile environments
Subrata Das, David Lubensky
Extracting phonological chunks based on piecewise linear segment lattices
Hiroaki Kojima, Kazuyo Tanaka
Evaluating hierarchical hybrid statistical language models
Lucian Galescu, James Allen
An efficient lexical tree search for large vocabulary continuous speech recognition
Jun Ogata, Yasuo Ariki
Reliability evaluation of speech recognition in acoustic modeling
Bin Jia, Xiaoyan Zhu, Yupin Luo, Dongcheng Hu
Using GMM for voiced/voiceless segmentation and tone decision in Mandarin continuous speech recognition
Ching X. Xu
Auditory spectrum based features (ASBF) for robust speech recognition
Chi H. Yim, Oscar C. Au, Wanggen Wan, Cyan L. Keung, Carrson C. Fung
Large vocabulary Mandarin speech recognition with different approaches in modeling tones
Eric Chang, Jianlai Zhou, Shuo Di, Chao Huang, Kai-Fu Lee
Fast very large vocabulary recognition based on compact DAWG-structured language models
Kalirroi Georgila, Kyriakos Sgarbas, Nikos Fanotakis, George Kokkinakis
Crosslinguistic disfluency modeling: a comparative analysis of Swedish and Tok Pisin human-human ATIS dialogues
Robert Eklund
Vector space representation of language probabilities through SVD of n-gram matrix
Shiro Terashima, Kazuya Takeda, Fumitada Itakura
Spoken language parsing based on incremental disambiguation
Yoshihide Kato, Shigeki Matsubara, Katsuhiko Toyama, Yasuyoshi Inagaki
Jacobian adaptation of HMM with initial model selection for noisy speech recognition
Hiroshi Shimodaira, Yutaka Kato, Toshihiko Akae, Mitsuru Nakai, Shigeki Sagayama
The BBN Byblos 2000 conversational Mandarin LVCSR system
Han Shu, Chuck Wooters, Owen Kimball, Thomas Colthurst, Fred Richardson, Spyros Matsoukas, Herbert Gish
The 2000 BBN Byblos LVCSR system
Thomas Colthurst, Owen Kimball, Fred Richardson, Han Shu, Chuck Wooters, Rukmini Iyer, Herbert Gish
Broadcast news transcription in Mandarin
Langzhou Chen, Lori Lamel, Gilles Adda, Jean-Luc Gauvain
Word concept model: a knowledge representation for dialogue agents
Yang Li, Tong Zhang, Stephen E. Levinson
Audio-visual speech recognition using MCE-based HMMs and model-dependent stream weights
Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura
Automatic diagnosis of recognition errors in large vocabulary continuous speech recognition systems
Hiroaki Nanjo, Akinobu Lee, Tatsuya Kawahara
Taiwanese corpus collection via continuous speech recognition tool
Yuang-Chin Chiang, Zhi-Siang Yang, Ren-Yuan Lyu
Optimal maximum likelihood on phonetic decision tree acoustic model for LVCSR
Baosheng Yuan, Qingwei Zhao, Qing Guo, Xiangdong Zhang, Zhiwei Lin
Frame level likelihood transformations for ASR and utterance verification
Konstantin P. Markov, Satoshi Nakamura
Integrating recognition confidence scoring with language understanding and dialogue modeling
Timothy J. Hazen, Theresa Burianek, Joseph Polifroni, Stephanie Seneff
Speech recognition based on estimation of mutual information
Yibiao Yu, Heming Zhao
Keyword spotting in auto-attendant system
Qing Guo, Yonghong Yan, Zhiwei Lin, Baosheng Yuan, Qingwei Zhao, Jian Liu
A new approach for modeling OOV words
Weimin Ren, Chengfa Wang, Wen Gao, Jinpei Xu
Speech recognition using error spotting
Rachida El Méliani, Douglas O'Shaughnessy
Robust endpoint detection for in-car speech recognition
Chung-Ho Yang, Ming-Shiun Hsieh
Internet speech analysis system using e-mail and web technology
Jouji Miwa, Masaru Kumagai
Multi-class linear dimension reduction by generalized Fisher criteria
Marco Loog, Reinhold Haeb-Umbach
Improving the representation of time structure in front-ends for automatic speech recognition
Wendy J. Holmes
Speech analysis by rule extraction from trained artificial neural networks
Katrin Kirchhoff
Minimum mean square error spectral peak envelope estimation for automatic vowel classification
Jaishree Venugopal, Stephen A. Zahorian, Montri Karnjanadecha
Probabilistic compensation of unreliable feature components for robust speech recognition
Cyan L. Keung, Oscar C. Au, Chi H. Yim, Carrson C. Fung
A new tone conversion method for Mandarin by an adaptive linear prediction analysis
Congxiu Wang, Qihu Li, Guoying Zhao, Li Yin, Shuai Hao, Da Meng
Multimodal interface research: a science without borders
Sharon Oviatt
Studies of audiovisual speech perception using production-based animation
K. G. Munhall, C. Kroos, T. Kuratate, J. Lucero, M. Pitermann, Eric Vatikiotis-Bateson, H. Yehia
Perceptual interfaces for information interaction: joint processing of audio and visual information for human-computer interaction
Chalapathi Neti, Giridharan Iyengar, Gerasimos Potamianos, A. Senior, Benoit Maison
Towards robust lipreading
Wen Gao, Jiyong Ma, Rui Wang, Hongxun Yao
Stream weight optimization of speech and lip image sequence for audio-visual speech recognition
Satoshi Nakamura, Hidetoshi Ito, Kiyohiro Shikano
HMM-based text-to-audio-visual speech synthesis
Shinji Sako, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura
Real-time speech-generated subtitles: problems and solutions
Jill Hewitt, Andi Bateman, Andrew Lambourne, A. Ariyaeeinia, P. Sivakumaran
MiPad: a next generation PDA prototype
Xuedong Huang, Alex Acero, C. Chelba, Li Deng, D. Duchene, Joshua Goodman, H. Hon, D. Jacoby, L. Jiang, R. Loynd, M. Mahajan, P. Mau, S. Meredith, S. Mughal, S. Neto, Mike Plumpe, K. Wang, Y. Wang
Dialogue management for multimodal user registration
Fei Huang, Jie Yang, Alex Waibel
Segmental optical phonetics for human and machine speech processing
Lynne E. Bernstein
Classification of Thai consonant naming using Thai tone
Umavasee Thathong, Somchai Jitapunkul, Visarut Ahkuputra, Ekkarit Maneenoi, Boonchai Thampanitchawong
A high-performance auditory feature for robust speech recognition
Qi Li, Frank K. Soong, Olivier Siohan
A new strategy of formant tracking based on dynamic programming
Kun Xia, Carol Espy-Wilson
Dominant subspace analysis for auditory spectrum
Xugang Lu, Gang Li, Lipo Wang
Spectral and cepstral projection bases constructed by independent component analysis
Ilyas Potamitis, Nikos Fanotakis, George Kokkinakis
Relating LPC modeling to a factor-based articulatory model
Sacha Krstulovic
On data-derived temporal processing in speech feature extraction
Michael L. Shire, Barry Y. Chen
Minimum Bayes error feature selection
George Saon, Mukund Padmanabhan
Using mutual information to design feature combinations
Daniel P. W. Ellis, Jeff A. Bilmes
Multichannel signal separation for cocktail party speech recognition: a dynamic recurrent network
Seungjin Choi, Heonseok Hong, Hervé Glotin, Frédéric Berthommier
An automatic algorithm for segmenting and labelling a connected digit sequence
V. Kamakshi Prasad, Hema A. Murthy
The signal reconstruction of speech by KPCA
Hui Yan, Xuegong Zhang, Yanda Li, Liqin Shen, Weibin Zhu
Blind source separation based on subband ICA and beamforming
Hiroshi Saruwatari, Satoshi Kurita, Kazuya Takeda, Fumitada Itakura, Kiyohiro Shikano
A synchrony front-end using phase-locked-loop techniques
Claudio Estienne, Patricia Pelle
On the use of filter-bank energies driven from the autocorrelation sequence for noisy speech recognition
Javier Hernando
Combining semantic and syntactic structure for language modeling
Rens Bod
Language model size reduction by pruning and clustering
Joshua Goodman, Jianfeng Gao
Efficient training methods for maximum entropy language modeling
Jun Wu, Sanjeev Khudanpur
Statistical language modeling with a class based n-multigram model
Sabine Deligne
A hierarchical language model incorporating class-dependent word models for OOV words recognition
Koichi Tanigaki, Hirofumi Yamamoto, Yoshinori Sagisaka
Input Chinese sentences using digits
Fang Zheng, Jian Wu, Wenhu Wu
Hidden-articulator Markov models: performance improvements and robustness to noise
Matt Richardson, Jeff Bilmes, Chris Diorio
Keyword-based discriminative training of acoustic models
Eric D. Sandness, I. Lee Hetherington
Segmental minimum Bayes-risk ASR voting strategies
Vaibhava Goel, Shankar Kumar, William Byrne
Loosely coupled HMMs for ASR
Harriet J. Nock, Steve J. Young
HMM2 - a novel approach to HMM emission probability estimation
Katrin Weber, Samy Bengio, Hervé Bourlard
Structured redefinition of sound units by merging and splitting for improved speech recognition
Rita Singh, Bhiksha Raj, Richard M. Stern
Speech modeling with state constrained Markov fields over frequency bands
V. Arsigny, Gérard Chollet, Guillaume Gravier, Marc Sigelle
Duration modeling for Chinese synthesis from C-ToBI labeled corpus
Weibin Zhu, Liqin Shen, Xiaochuan Miu
The pitch movement of word stress in Chinese
Bei Wang, Bo Zheng, Shinan Lu, Jianfen Cao, Yufang Yang
The distribution of fillers in lectures in the Japanese language
Michiko Watanabe, Carlos Toshinori Ishi
Research on stress in bisyllabic words of Mongolian
Huhe Harnud, Yuling Zheng, Jiayou Chen
Modelling of the perception of English sentence stress for computer-assisted language learning
Kazunori Imoto, Masatake Dantsuji, Tatsuya Kawahara
Data driven intonation modelling of 6 languages
Jeska Buhmann, Halewijn Vereecken, Justin Fackrell, Jean-Pierre Martens, Bert van Coile
Prosody prediction using a tree-structure similarity metric
Laurent Blin, Mike Edgington
Prosodic features for automatic text-independent evaluation of degree of nativeness for language learners
Carlos Teixeira, Horacio Franco, Elizabeth Shriberg, Kristin Precoda, Kemal Sönmez
Instantaneous estimation of prosodic pronunciation habits for Japanese students to learn English pronunciation
Nobuaki Minematsu, Seiichi Nakagawa
Synthesis of fundamental frequency contours of standard Chinese sentences from tone sandhi and focus conditions
Jinfu Ni, Keikichi Hirose
Syllable duration and its functions in standard Chinese discourse
Yiqing Zu, Xiaoxia Chan, Aijun Li, Wu Hua, Guohua Sun
Generating prosody by superposing multi-parametric overlapping contours
Bleicke Holm, Gérard Bailly
Consistent pitch marking
Raymond Veldhuis
Labeler agreement in transcribing Korean intonation with K-ToBI
Sun-Ah Jun, Sook-Hyang Lee, Keeho Kim, Yong-Ju Lee
Effectiveness of prosodic features in syntactic analysis of read Japanese sentences
Yukiyoshi Hirose, Kazuhiko Ozeki, Kazuyuki Takagi
A study of F0 declination in Japanese: towards a discourse model of prosodic structure
Mieko Banno
Data-driven intonation modeling using a neural network and a command response model
Atsuhiro Sakurai, Nobuaki Minematsu, Keikichi Hirose
Natural F0 contours with a new neural-network-hybrid approach
Caglayan Erdem, Martin Holzapfel, Rüdiger Hoffmann
Prosodic variation with text type
Justin Fackrell, Halewijn Vereecken, Jeska Buhmann, Jean-Pierre Martens, Bert Van Coile
Inter-transcriber reliability of ToBI prosodic labeling
Ann K. Syrdal, Julia McGory
Stem-ML: language-independent prosody description
Greg P. Kochanski, Chilin Shih
Using prosody database in Chinese speech synthesis
Minghui Dong, Kim Teng Lua
Some articulatory and acoustic changes associated with emphasis in spoken English
Donna Erickson, Kikuo Maekawa, Michiko Hashi, Jianwu Dang
Fast speech timing in Dutch: durational correlates of lexical stress and pitch accent
Esther Janse, Anke Sennema, Anneke Slis
On perception of word-based local speech rate in Japanese without focusing attention
Makoto Hiroshige, Kantaro Suzuki, Kenji Araki, Koji Tochinai
Modeling and generation of accentual phrase F0 contours based on discrete HMMs synchronized at mora-unit transitions
Atsuhiro Sakurai, Koji Iwano, Keikichi Hirose
Synthesizing prosody for commands in a Xhosa TTS system
Philippa H. Louw, Justus C. Roux, Elizabeth C. Botha
Design and implementation of a Greek text-to-speech system based on concatenative synthesis
Costas Christogiannis, Yiannis Stavroulas, Yiannis Vamvakoulas, Theodora Varvarigou, Agatha Zappa, Chilin Shih, Amalia Arvaniti
GENESIS-II: a versatile system for language generation in conversational system applications
Lauren Baptist, Stephanie Seneff
New analysis method for harmonic plus noise model based on time-domain periodicity score
Eun-Kyoung Kim, Yung-Hwan Oh
STRAIGHT-based voice conversion algorithm based on Gaussian mixture model
Tomoki Toda, Jinlin Lu, Hiroshi Saruwatari, Kiyohiro Shikano
Syllable-based text-to-phoneme conversion for German
Marion Libossek, Florian Schiel
A hybrid approach for grapheme-to-phoneme conversion based on a combination of partial string matching and a neural network
Horst-Udo Hain
Parametric high definition (PHD) speech synthesis-by-analysis: the development of a fundamentally new system creating connected speech by modifying lexically-represented language units
Hans G. Tillmann, Hartmut R. Pfitzinger
A new synthesis algorithm using phase information for TTS systems
Chul H. Kwon, Minkyu Lee, Joseph P. Olive
Unit fusion for concatenative speech synthesis
Johan Wouters, Michael W. Macon
Diphone collection and synthesis
Kevin A. Lenzo, Alan W. Black
Natural language generation for spoken dialogue
Thomas Portele
Preselection of candidate units in a unit selection-based text-to-speech synthesis system
Alistair Conkie, Mark C. Beutnagel, Ann K. Syrdal, Philip E. Brown
Self-organizing letter code-book for text-to-phoneme neural network model
Kare Jean Jensen, Søren Riis
A flexible, scalable finite-state transducer architecture for corpus-based concatenative speech synthesis
Jon R. W. Yi, James R. Glass, I. Lee Hetherington
Analysis of fundamental frequency contours of standard Chinese in terms of the command-response model and its application to synthesis by rule of intonation
Changfu Wang, Hiroya Fujisaki, Ryou Tomana, Sumio Ohno
Manipulating speech pitch periods according to optimal insertion/deletion position in residual signal for intonation control in speech synthesis
Toshio Hirai, Seiichi Tenpaku, Kiyohiro Shikano
Improving naturalness of Thai text-to-speech synthesis by prosodic rule
Pradit Mittrapiyanuruk, Chatchawarn Hansakunbuntheung, Virongrong Tesprasit, Virach Sornlertlamvanich
Word-level F0 range in Mandarin Chinese and its application to inserting words into a sentence
Dawei Xu, Hiroki Mori, Hideki Kasuya
A new Japanese TTS system based on speech-prosody database and speech modification
Mitsuaki Isogai, Kimihito Tanaka, Satoshi Takano, Hideyuki Mizuno, Masanobu Abe, Sinya Nakajima
Stress assignment in Spanish proper names
Ruben San-Segundo, Juan Manuel Montero, Ricardo de Córdoba, Juana Gutiérrez-Arriola
Segmentation of prosodic phrases for improving the naturalness of synthesized Mandarin Chinese speech
Zhengyu Niu, Peiqi Chai
Practical language modeling: an interpolating method
Xiaohu Liu, Douglas O'Shaughnessy
Combination of different n-grams based on their different assumptions
Gongjun Li, Na Dong, Toshiro Ishikawa
Construction of speech corpus in moving car environment
Nobuo Kawaguchi, Shigeki Matsubara, Hiroyuki Iwa, Shoji Kajita, Kazuya Takeda, Fumitada Itakura, Yasuyoshi Inagaki
Parsing spoken dialogues
Yue-Shi Lee, Hsin-Hsi Chen
A noise robust multilingual reference recogniser based on SPEECHDAT(II)
Børge Lindberg, Finn Tore Johansen, Narada Warakagoda, Gunnar Lehtinen, Zdravko Kacic, Andrej Zgank, Kjell Elenius, Giampiero Salvi
The design and application of a speech database for Chinese TTS system
Muhua Lv, Lianhong Cai
Use of multiple classifiers for speech recognition in wireless CDMA network environments
Rathinavelu Chengalvarayan
An imperative programming language for spoken language translation
Alexander Franz, Keiko Horiguchi, Lei Duan
Fine keyword clustering using a thesaurus and example sentences for speech translation
Yumi Wakita, Kenji Matsui, Yoshinori Sagisaka
Data collection and processing in a Chinese spontaneous speech corpus IIS_CSS
JunLan Feng, XianFang Wang, LiMin Du
Spoken language corpus for machine interpretation research
Yasuyuki Aizawa, Shigeki Matsubara, Nobuo Kawaguchi, Katsuhiko Toyama, Yasuyoshi Inagaki
When will synthetic speech sound human: role of rules and data
Jan van Santen, Michael Macon, Andrew Cronk, John-Paul Hosom, Alexander Kain, Vincent Pagel, Johan Wouters
Corpus-based techniques in the AT&T NextGen synthesis system
Ann K. Syrdal, Colin W. Wightman, Alistair Conkie, Yannis Stylianou, Mark Beutnagel, Juergen Schroeter, Volker Strom, Ki-Seung Lee, Matthew J. Makashay
Limitations to concatenative speech synthesis
Nick Campbell
A design method of speech corpus for text-to-speech synthesis taking account of prosody
Hisashi Kawai, Seiichi Yamamoto, Norio Higuchi, Tohru Shimizu
Corpus-based methods and hand-built methods
Richard Sproat
Heredity and environment in speech recognition: the role of a priori information vs. data
Michael A. Picheny
A constraint-based analysis of compound accent in Japanese
Haruo Kubozono
Language acquisition through a human-robot interface
Naoto Iwahashi
Rules, but what for? - rule description as efficient and robust abstraction of corpora and optimal fitting to applications -
Yoshinori Sagisaka, Hirofumi Yamamoto, Minoru Tsuzaki, Hiroaki Kato
Cross-linguistic aspects of intonation perception
Veronika Makarova
Visual information and the perception of prosody
Haruo Kubozono, Shosuke Haraguchi
Perception of synthesized singing voices with fine fluctuations in their fundamental frequency contours
Masato Akagi, Hironori Kitakaze
Neuromagnetic study on localization of speech sounds
Kalle J. Palomäki, Paavo Alku, Ville Mäkinen, Patrick May, Hannu Tiitinen
Perception of identical vowel sequences in Japanese conversational speech
Yukiyoshi Hirose, Kazuhiko Kakehi
Acoustic cues to perception of vowel quality
Santiago Fernández, Sergio Feijóo
A solution to the reduction of concatenation artefacts in speech synthesis
Esther Klabbers, Raymond Veldhuis, Kim Koppen
Domain-unconstrained language understanding based on CKIP-auto tag, how-net, and ART
Jhing-Fa Wang, Hsien-Chang Wang, Kin-Nan Lee, Chieh-Yi Huang
The generation of representations of word meanings from dictionaries
Chris Powell, Mary Zajicek, David Duce
Grammar partitioning and parser composition for natural language understanding
Po Chui Luk, Helen Meng, Filung Wang
Comprehension of synthesized speech while driving and in the lab
Jennifer Lai, Omer Tsimhoni, Paul Green
Orthographic influences on initial phoneme addition and deletion tasks: the effect of lexical status
Michael D. Tyler, Denis K. Burnham
Investigation of analysis and synthesis parameters of STRAIGHT by subjective evaluation
Parham Zolfaghari, Yoshinori Atake, Kiyohiro Shikano, Hideki Kawahara
Cross-domain classification using generalized domain acts
Andrew N. Pargellis, Alexandros Potamianos
Hierarchical feature-based translation for scalable natural language understanding
Ganesh N. Ramaswamy, Jan Kleindienst
Statistical recursive finite state machine parsing for speech understanding
Alexandros Potamianos, Hong-Kwang J. Kuo
Speaker change detection using minimum message length criterion
Chaojun Liu, Yonghong Yan
Toward the realization of spontaneous speech recognition - introduction of a Japanese priority program and preliminary results -
Sadaoki Furui, Kikuo Maekawa, Hitoshi Isahara, Takahiro Shinozaki, Takashi Ohdaira
A comparative study on acoustic and linguistic characteristics using speech from human-to-human and human-to-machine conversations
Toshiyuki Takezawa, Fumiaki Sugaya, Masaki Naito, Seiichi Yamamoto
Speaker dependent temporal constraints combined with speaker independent HMM for speech recognition in noise
Néstor Becerra Yoma
Forward masking on a generalized logarithmic scale for robust speech recognition
Yoshihiro Ito, Hiroshi Matsumoto, Kazumasa Yamamoto
Noise robustness of heterogeneous features employing minimum classification error feature space transformations
Heidi Christensen, Børge Lindberg, Ove Andersen
Classifier-based mask estimation for missing feature methods of robust speech recognition
Michael L. Seltzer, Bhiksha Raj, Richard M. Stern
Optimized subspace weighting for robust speech recognition in additive noise environments
Kris Hermus, Werner Verhelst, Patrick Wambacq
Robust feature selection using probabilistic union models
Ji Ming, Peter Jancovic, Philip Hanna, Darryl Stewart, F. Jack Smith
Multi-resolution front-end for noise robust speech recognition
Ramalingam Hariharan, Imre Kiss, Olli Viikki, Jilei Tian
Recognition of digit strings in noisy speech with limited resources
Douglas O'Shaughnessy, Marcel Gabrea
Factors affecting native Japanese speakers' production of intrusive (epenthetic) vowels in English words
Keiichi Tajima, Donna Erickson, Kyoko Nagao
Beyond the conventional statistical language models: the variable-length sequences approach
Imed Zitouni, Kamel Smaïli, Jean-Paul Haton
Computer-assisted English vowel learning system for Japanese speakers using cross language formant structures
Yasushi Tsubota, Masatake Dantsuji, Tatsuya Kawahara
ASR-based subtitling of live TV-programs for the hearing impaired
Trym Holter, Erik Harborg, Magne Hallstein Johnsen, Torbjørn Svendsen
Natural language processing for Taiwanese sign language to speech conversion
Chung-Hsien Wu, Yu-Hsien Chiu, Chi-Shiang Guo
Japanese spoken language learning system using Java information technology
Jouji Miwa, Hiroshi Sasaki, Kazunori Tanno
L2 pronunciation quality in read and spontaneous speech
Helmer Strik, Catia Cucchiarini, Diana Binnenpoorte
Designing modulation filters for improving speech intelligibility in reverberant environments
Tomoko Kitamura, Keisuke Kinoshita, Takayuki Arai, Akiko Kusumoto, Yuji Murahara
An environment model-based robust speech recognition
Lei Zhang, Jiqing Han, Chengguo Lv, Chengfa Wang
Particle filtering for non-stationary speech modelling and enhancement
Jaco Vermaak, Christophe Andrieu, Arnaud Doucet
Maximum likelihood noise HMM estimation in model-based robust speech recognition
Martin Graciarena
Microphone array within a handset or face mask for speech enhancement
Qingsheng Zeng, Douglas O'Shaughnessy
Embedding visually recognizable watermarks into digital audio signals
Chengfa Wang, Qiusheng Wang
Auditory perception of amplitude modulated sinusoid using a pure tone and band-limited noises as modulation signals
Mamoru Iwaki
Spectral voice conversion based on unsupervised clustering of acoustic space
Masoud Geravanchizadeh
Removing hum from spoken language resources
Hartmut R. Pfitzinger
Joint pronunciation modelling of non-native speakers using data-driven methods
Ingunn Amdal, Filipp Korkmazskiy, Arun C. Surendran
A comparison of disfluency distribution in a unimodal and a multimodal speech interface
Linda Bell, Robert Eklund, Joakim Gustafson
Modelling pronunciation variations in spontaneous Mandarin speech
Yi Liu, Pascale Fung
A method of generating English pronunciation dictionary for Japanese English recognition systems
Tadashi Suzuki, Jun Ishii, Kunio Nakajima
A framework for evaluating contextual understanding
Hélène Bonneau-Maynard, L. Devillers
Towards high performance continuous Mandarin digit string recognition
Yonggang Deng, Taiyi Huang, Bo Xu
Stochastic suprasegmentals: relationships between redundancy, prosodic structure and care of articulation in spontaneous speech
Matthew Aylett
An automatic pitch-marking method using wavelet transform
Masaharu Sakamoto, Takashi Saitoh
A proposal of a model to extract Japanese voluntary speech rate control
Keiichi Takamaru, Makoto Hiroshige, Kenji Araki, Koji Tochinai
Acoustic characteristics of surprise in Russian questions
Veronika Makarova
Neural network based integration of multiple confidence measures for OOV detection
Yonggang Deng, Yang Cao, Bo Xu
How fast can we really change pitch? maximum speed of pitch change revisited
Yi Xu, Xuejing Sun
Predicting segmental durations for Dutch using the sums-of-products approach
Esther Klabbers, Jan van Santen
A stochastic polynomial tone model for continuous Mandarin speech
Yang Cao, Taiyi Huang, Bo Xu, Chengrong Li
Detection of filled pauses in spontaneous conversational speech
Marcel Gabrea, Douglas O'Shaughnessy
Some observations on different strategies for the timing of fundamental frequency events
Bertil Lyberg, Sonia Sangarig
Research on dynamic characters of Chinese pitch contours
Zhiyong Wu, Lianhong Cai, Tongchun Zhou
Incorporating HMM-state sequence confusion for rapid MLLR adaptation to new speakers
Bing Zhao, Bo Xu
An online incremental speaker adaptation method using speaker-clustered initial models
Zhipeng Zhang, Sadaoki Furui
Prior parameter transformation for unsupervised speaker adaptation
Guoqiang Li, Limin Du, Ziqiang Hou
Improved Jacobian adaptation for fast acoustic model adaptation in noisy speech recognition
Ruhi Sarikaya, John H. L. Hansen
A study of vocal tract length normalization with generation-dependent acoustic models
Keiko Fujita, Yoshio Ono, Yoshihisa Nakatoh
Optimal on-line Bayesian model selection for speaker adaptation
Shaojun Wang, Yunxin Zhao
Unsupervised audio stream segmentation and clustering via the Bayesian information criterion
Bowen Zhou, John H. L. Hansen
Frame-period adaptation for speaking rate robust speech recognition
Satoru Tsuge, Toshiaki Fukada, Kenji Kita
Cross-language use of acoustic information for automatic speech recognition
C. Nieuwoudt, Elizabeth C. Botha
Selective training of HMMs by using two-stage clustering
Shoei Sato, Toru Imai, Hideki Tanaka, Akio Ando
Compensation of noise effects for robust speech recognition in car environments
Angel de la Torre, Dominique Fohr, Jean-Paul Haton
Bayesian speaker adaptation based on probabilistic principal component analysis
Dong Kook Kim, Nam Soo Kim
MLLR-based accent model adaptation without accented data
Wai Kat Liu, Pascale Fung
Fast speaker adaptation using eigenspace-based maximum likelihood linear regression
Kuan-Ting Chen, Wen-Wei Liau, Hsin-Min Wang, Lin-Shan Lee
Stream confidence estimation for audio-visual speech recognition
Gerasimos Potamianos, Chalapathy Neti
The effect of reduced spectral information on Japanese consonant perception: comparison between L1 and L2 listeners
Masahiko Komatsu, Won Tokuma, Shinichi Tokuma, Takayuki Arai
Can Cantonese children with cochlear implants perceive lexical tones?
Valter Ciocca, Rani Aisha, Alex Francis, Lena Wong
Recognition of spoken words in the continuous speech: effects of transitional probability
Michael C. W. Yip
Detection of speech landmarks using temporal cues
Ariel Salomon, Carol Espy-Wilson
A set of Japanese word cohorts rated for relative familiarity
Takashi Otake, Anne Cutler
The phonetic value of the devocalized vowel in Japanese - in case of velar plosive
Kimiko Yamakawa, Hiromitsu Miyazono, Ryoji Baba
Positive and negative influences of the lexicon on phonemic decision-making
James M. McQueen, Anne Cutler, Dennis Norris
Phonotactic and acoustic cues for word segmentation in English
Andrea Weber
Intelligibility of time-compressed speech: three ways of time-compression
Esther Janse
Evidence for demodulation in speech perception
Hartmut Traunmüller
Fast decoding for indexation of broadcast data
Jean-Luc Gauvain, Lori Lamel
Update progress of Sinohear: advanced Mandarin LVCSR system at NLPR
Sheng Gao, Bo Xu, Hong Zhang, Bing Zhao, Chengrong Li, Taiyi Huang
Combined acoustic and linguistic look-ahead for one-pass time-synchronous decoding
Xavier L. Aubert, Reinhard Blasig
Large-vocabulary speech recognition under adverse acoustic environments
Li Deng, Alex Acero, Mike Plumpe, Xuedong Huang
Acoustic language model classes for a large vocabulary continuous speech recognizer
Volker Fischer, S. J. Kunzmann
A hybrid speech recognizer combining HMMs and polynomial classification
Franz Kummert, Gernot A. Fink, Gerhard Sagerer
Accent modeling based on pronunciation dictionary adaptation for large vocabulary Mandarin speech recognition
Chao Huang, Eric Chang, Jianlai Zhou, Kai-Fu Lee
A mixed and code excitation LPC vocoder at 1.76 kb/s
Jinzhong Zhang, Yingmin He, Renshu Yu
Efficient segment quantization of LSP parameters for very low bit speech coding
Minoru Kohata, Ikuya Mitsuya, Motoyuki Suzuki, Shozo Makino
Phonetic vocoder assessment
Carlos M. Ribeiro, Isabel M. Trancoso, Diamantino A. Caseiro
A new low bit rate speech coder based on intraframe waveform interpolation
Hongtao Hu, Limin Du
Discriminatively derived HMM-based announcement modeling approach for noise control avoiding the problem of false alarms
Rathinavelu Chengalvarayan, David L. Thomson
Instantaneous-distortion based weighted acoustic modeling for robust recognition of coded speech
Juan M. Huerta, Richard M. Stern
Adapting phonetic decision trees between languages for continuous speech recognition
Nitendra Rajput, L. Venkata Subramaniam, Ashish Verma
Speaker normalization in the MFCC domain
Stephen Cox
Data-driven phonetic regression class tree estimation for MLLR adaptation
Reinhold Haeb-Umbach
Constrained maximum likelihood linear regression for speaker adaptation
Mohamed Afify, Olivier Siohan
Predictive speaker adaptation based on least squares method
Woo-Yong Choi, Hyung Soon Kim
HMM adaptation using vector Taylor series for noisy speech recognition
Alex Acero, Li Deng, Trausti Kristjansson, Jerry Zhang
Minimum risk acoustic clustering for multilingual acoustic model combination
Dimitra Vergyri, Stavros Tsakalidis, William Byrne
Talking to thimble jellies: children's conversational speech with animated characters
Sharon Oviatt
A high-resolution glottal pulse tracker
Robert Rodman, David McAllister, Donald Bitzer, D. Chappell
Analysis of voice production in breathy, normal and pressed phonation by comparing inverse filtering and videokymography
Paavo Alku, Jan G. Svec, Erkki Vilkman, Frantisek Sram
Model of the mechanical linkage of the upper lip-jaw for the articulatory coordination
Takayuki Ito, Hiroaki Gomi, Masaaki Honda
Measurement of palatolingual contact pressure and tongue force using a force-sensor-mounted palatal plate
Masafumi Matsumura, Takuya Niikawa, Taku Torii, Hitoshi Yamasaki, Hisanaga Hara, Takashi Tachimura, Takeshi Wada
A 3D tongue model based on MRI data
Olov Engwall
Speech quality improvement in TTS system using ABS/OLA sinusoidal model
Jae-Hyun Bae, Heo-Jin Byeon, Yung-Hwan Oh
A study of palatal segments' production by Danish speakers
Marielle Bruyninckx, Bernard Harmegnies
Dynamic selection of feature spaces for robust speech recognition
Bhuvana Ramabhadran, Yuqing Gao, Michael Picheny
A probabilistic model of integration of acoustic cues in FV syllables
Santiago Fernández, Sergio Feijóo
Directed graphical models of classifier combination: application to phone recognition
Jeff A. Bilmes, Katrin Kirchhoff
Real-time multilingual HMM training robust to channel variations
E. E. Jan, Jaime Botella Ordinas, George Saon, Salim Roukos
The intelligibility of German and English speech to Dutch listeners
Sander J. van Wijngaarden, Herman J.M. Steeneken
On the use of bandpass liftering in speaker recognition
Bin Zhen, Xihong Wu, Zhimin Liu, Huisheng Chi
On auditory-phonetic short-term transformation
René Carré, Liliane Sprenger-Charolles, Souhila Messaoud-Galusi, Willy Serniclaes
Predicting the perceptual confusion of synthetic plosive consonants in noise
James J. Hant, Abeer Alwan
Compound splitting and lexical unit recombination for improved performance of a speech recognition system for German parliamentary speeches
Martha Larson, Daniel Willett, Joachim Köhler, Gerhard Rigoll
Learning and transfer of learning for synthetic speech
Martine van Zundert, Jacques Terken
Neural plasticity revealed in perceptual training of a Japanese adult listener to learn American /l-r/ contrast: a whole-head magnetoencephalography study
Yang Zhang, Patricia K. Kuhl, Toshiaki Imada, Paul Iverson, John Pruitt, Makoto Kotani, Erica Stevens
The effect of consonantal context and acoustic characteristics on the discrimination between the English vowel /i/ and /e/ by Japanese learners
Akiyo Joto
A study on emotional feature recognition in speech
Li Zhao, Wei Lu, Ye Jiang, Zhenyang Wu
LPC, LPCC and MFCC parameterisation applied to the detection of voice impairments
Juan I. Godino-Llorente, Santiago Aguilera-Navarro, Pedro Gómez-Vilda
A complementary approach to computer-aided transcription: synergy of statistical-based and knowledge discovery paradigms
Benjamin K. T'sou, Tom B. Y. Lai
Teraspeech2000: a 10,000 speakers database
Marie-José Caraty, Claude Montacié
The MATE workbench - a tool in support of spoken dialogue annotation and information extraction
Laila Dybkjær, Niels Ole Bernsen
Discarding impossible events from statistical language models
Armelle Brun, David Langlois, Kamel Smaili, Jean-Paul Haton
A tool to build a treebank for conversational Chinese
Yves Lepage, Nicolas Auclerc, Satoshi Shirai
Parameter reduction in a text-independent speaker verification system
Roland Auckenthaler, Michael Carey, John Mason
Advances on HMM-based text-dependent speaker verification
Yong Gu, Trevor Thomas
Optimisation of GMM in speaker recognition
Robert Stapert, John S. Mason, Roland Auckenthaler
Distance-based Gaussian mixture model for speaker recognition over the telephone
Ran D. Zilca, Yuval Bistritz
Pruning abnormal data for better making a decision in speaker verification
Jun-Hui Liu, Ke Chen
ASR, dialects, and acoustic/phonological distances
Louis ten Bosch
Speaker verification by integrating dynamic and static features using subspace method
Masafumi Nishida, Yasuo Ariki
Improvement of speaker recognition system by individual information weighting
Se-Hyun Kim, Gil-Jin Jang, Yung-Hwan Oh
Speaker verification in noise using temporal constraints
Néstor Becerra Yoma, Tarciano Facco Pegoraro
Speaker identification using discriminative features selection
Bogdan Sabac, Inge Gavat, Zica Valsan
A further investigation on speech features for speaker characterization
Ivan Magrin-Chagnolleau, Guillaume Gravier, Mouhamadou Seck, Olivier Boeffard, R. Blouet, Frédéric Bimbot
Language identification from short segments of speech
Jyotsana Balleda, Hema A Murthy, T. Nagarajan
Generation of utterances based on visual context information
Susanne Kronenberg, Franz Kummert
A spoken dialogue system for conference/workshop services
Mazin Rahim, Roberto Pieraccini, Wieland Eckert, Esther Levin, Giuseppe Di Fabbrizio, Giuseppe Riccardi, Candy Kamm, Shrikanth Narayanan
Developing robust, user-centred multimodal spoken language systems: the MUeSLI project
Gavin Churcher, Peter Wyard
TABOR - a Norwegian spoken dialogue system for bus travel information
Magne H. Johnsen, Torbjørn Svendsen, Tore Amble, Trym Holter, Erik Harborg
Language understanding component for Chinese dialogue system
Yinfei Huang, Fang Zheng, Mingxing Xu, Pengju Yan, Wenhu Wu
Designing a domain independent platform of spoken dialogue system
Kazumi Aoyama, Izumi Hirano, Hideaki Kikuchi, Katsuhiko Shirai
An enhanced BLSTIP dialogue research platform
Qiru Zhou, Antoine Saad, Sherif Abdou
Using machine learning method and subword unit representations for spoken document categorization
Weidong Qu, Katsuhiko Shirai
ASR satisficing: the effects of ASR accuracy on speech retrieval
Litza Stark, Steve Whittaker, Julia Hirschberg
A system for retrieving broadcast news speech documents using voice input keywords and similarity between words
Hiromitsu Nishizaki, Seiichi Nakagawa
Intention extraction and semantic matching for internet FAQ retrieval using spoken language query
Yu-Sheng Lai, Kuen-Lin Lee, Chung-Hsien Wu
A domain-independent model to improve spelling in a web environment
Robert J. van Vark, Jelle K. de Haan, Leon J. M. Rothkrantz
Expanded vector space model based on word space in cross media retrieval of news speech data
Seiichi Takao, Jun Ogata, Yasuo Ariki
Audio stream phrase recognition for a national gallery of the spoken word: "one small step"
John H. L. Hansen, Bowen Zhou, Murat Akbacak, Ruhi Sarikaya, Bryan Pellom
Pronunciation variants description using recognition error modeling with phonetic derivation hypotheses
Hideharu Nakajima, Yoshinori Sagisaka, Hirofumi Yamamoto
Evaluating responsiveness in spoken dialog systems
Wataru Tsukahara, Nigel Ward
Characteristics of spoken language required for objective quality evaluation of echo cancellers
Nobuhiko Kitawaki, Futoshi Asano, Takeshi Yamada
Evaluation of the ATR-matrix speech translation system with a pair comparison method between the system and humans
Fumiaki Sugaya, Toshiyuki Takezawa, Akio Yokoo, Yoshinori Sagisaka, Seiichi Yamamoto
An automatic timing detection method for superimposing closed captions of TV programs
Ichiro Maruyama, Yoshiharu Abe, Terumasa Ehara, Katsuhiko Shirai
Normalized time-frequency speech representation in articulation training systems
Marcel Ogner, Zdravko Kacic
Semantic transcoding: making the handicapped and the aged free from their barriers in obtaining information on the web
Shinichi Torihara, Katashi Nagao
The use of nonlinear energy transformation for Tamil connected-digit speech recognition
Rathinavelu Chengalvarayan
State based sub-band Wiener filters for speech enhancement in car environments
Aimin Chen, Saeed Vaseghi
Total least squares based subband modelling for scalable speech representations with damped sinusoids
Kris Hermus, Werner Verhelst, Patrick Wambacq, Philippe Lemmerling
Speech enhancement: new approaches to soft decision
Joon-Hyuk Chang, Nam Soo Kim
Data collection and performance evaluation of spoken dialogue systems: the MIT experience
James Glass, Joseph Polifroni, Stephanie Seneff, Victor Zue
Considerations in the design and evaluation of spoken language dialog systems
Lori Lamel, Sophie Rosset, Jean-Luc Gauvain
Labeling audio-visual speech corpora and training an ANN/HMM audio-visual speech recognition system
Martin Heckmann, Frédéric Berthommier, Christophe Savario, Kristian Kroschel
Speech corpus of Chinese discourse and the phonetic research
Aijun Li, Maocan Lin, XiaoXia Chen, Yiqing Zu, Guohua Sun, Wu Hua, Zhigang Yin, Jingzhu Yan
Results of the 1999 topic detection and tracking evaluation in Mandarin and English
Jonathan G. Fiscus, George R. Doddington
Multimodal corpora for human-machine interaction research
Satoshi Nakamura, Keiko Watanuki, Toshiyuki Takezawa, Satoru Hayamizu
The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
David Pearce, Hans-Günter Hirsch
The Bavarian Archive for Speech Signals - serving the speech community
Hans-Günther Tillmann, Florian Schiel, Christoph Draxler, Phil Hoole
The development of spoken language resources in Oceania
J. Bruce Millar
Hands-free human-machine dialogue - corpora, technology and evaluation
Frank K. Soong, Eric A. Woudenberg
On-line learning of acoustic and lexical units for domain-independent ASR
Giuseppe Riccardi
Semi-automatic language model acquisition without large corpora
Tomoyosi Akiba, Katsunobu Itou
Detecting acoustic morphemes in lattices for spoken language understanding
Dijana Petrovska-Delacrétaz, Allen L. Gorin, Jerry H. Wright, Giuseppe Riccardi
Design of robust subtractive beamformer for noisy speech recognition
Mitsunori Mizumachi, Masato Akagi, Satoshi Nakamura
Objective long-term assessment of speech quality changes in pre-lingual cochlear implant children
Hamid Sheikhzadeh, Rassoul Amirfattahi
Automatic stuttering recognition using hidden Markov models
Elmar Nöth, Heinrich Niemann, Tino Haderlein, M. Decher, Uwe Eysholdt, F. Rosanowski, T. Wittenberg
Grounded speech communication
Deb Roy
Acquisition of second language intonation
Sun-Ah Jun, Mira Oh
Computer-aided Mandarin pronunciation learning system
Man-hung Siu, Ka-Ming Wong, Man-Yan Ching, Mei-Sum Lau
Speech recognition software: a tool for people with dyslexia
Michael McTear, Norma Conn, Nicola Phillips
STAR: articulation training for young children
H. Timothy Bunnell, Debra M. Yarrington, James B. Polikoff
Sound pressure distributions and propagation paths in the vocal tract with the pyriform fossa and the larynx
Takayoshi Nakai, Keizo Ishida, Hisayoshi Suzuki
Lip representation by image ellipse
László Czap
An acoustic profile of speech efficiency
Rob J. J. H. van Son, Barbertje M. Streefkerk, Louis C. W. Pols
Multi-scale audio indexing for Chinese spoken document retrieval
Helen M. Meng, W. K. Lo, Yuk Chi Li, P. C. Ching
Phone dependent modeling of hyperarticulated effects
Hagen Soltau, Alex Waibel
Vocabulary-based acoustic model trim down and task adaptation
Qing Guo, Yonghong Yan, Baosheng Yuan, Xiangdong Zhang, Ying Jia, Xiaoxing Liu
Place of articulation cues for voiced and voiceless plosives and fricatives in syllable-initial position
Willa S. Chen, Abeer Alwan
A block cosine transform and its application in speech recognition
Jingdong Chen, Kuldip K. Paliwal, Satoshi Nakamura
Automatic metric-based speech segmentation for broadcast news via principal component analysis
Jeih-Weih Hung, Hsin-Min Wang, Lin-Shan Lee
Maximal rank likelihood as an optimization function for speech recognition
Yuqing Gao, Yongxin Li, Michael Picheny
The effects of room acoustics on MFCC speech parameter
Yue Pan, Alex Waibel
Time-frequency distribution of partial phonetic information measured using mutual information
Mark Hasegawa-Johnson
Subword-dependent speaker clustering for improved speech recognition
Li Jiang, Xuedong Huang
An equivalent-class based MMI learning method for MGCPM
Chunhua Luo, Fang Zheng, Mingxing Xu
Continuous speech recognition using articulatory data
Alan A. Wrench, Korin Richmond
Asynchrony with trained transition probabilities improves performance in multi-band speech recognition
Brian Mak, Yik-Cheung Tam
Discriminative MLPs in HMM-based recognition of speech in cellular telephony
Sunil Sivadas, Pratibha Jain, Hynek Hermansky
Acoustic modeling for spontaneous speech recognition using syllable dependent models
Toshiyuki Hanazawa, Jun Ishii, Yohei Okato, Kunio Nakajima
A robust training strategy against extraneous acoustic variations for spontaneous speech recognition
Hui Jiang, Li Deng
Improved performance and generalization of minimum classification error training for continuous speech recognition
Darryl W. Purnell, Elizabeth C. Botha
Dynamic threshold setting via Bayesian information criterion (BIC) in HMM training
Ying Jia, Yonghong Yan, Baosheng Yuan
Modelling sub-phone insertions and deletions in continuous speech recognition
Thomas Hain, Philip C. Woodland
Improved acoustics modeling for speech recognition using transformation techniques
Carrson C. Fung, Oscar C. Au, Wanggen Wan, Chi H. Yim, Cyan L. Keung
Discriminative training of tied-mixture HMM by deterministic annealing
Liang Gu, Jayanth Nayak, Kenneth Rose
Discriminative training in natural language call routing
Hong-Kwang Jeff Kuo, Chin-Hui Lee
A speech recognition method with a language-independent intermediate phonetic code
Kazuyo Tanaka, Hiroaki Kojima
Confidence measures based on the k-nn probability estimator
Fabrice Lefèvre
On deriving a phoneme model for a new language
Niloy Mukherjee, Nitendra Rajput, L. Venkata Subramaniam, Ashish Verma
Estimation of semantic case of Japanese dialogue by use of distance derived from statistics of dependency
Tomonobu Saito, Kiyoshi Hashimoto
A semantically-based confidence measure for speech recognition
Stephen Cox, Srinandan Dasmahapatra
Support vector machines for automatic data cleanup
Aravind Ganapathiraju, Joseph Picone
Competition-based score analysis for utterance verification in name recognition
Yong Gu, Trevor Thomas
Utterance verification/rejection for speaker-dependent and speaker-independent speech recognition
Yaxin Zhang
Emotion recognition in speech signal: experimental study, development, and application
Valery A. Petrushin
A bi-lingual Mandarin/Taiwanese (Min-nan), large vocabulary, continuous speech recognition system based on the Tong-yong phonetic alphabet (TYPA)
Ren-yuan Lyu, Chi-yu Chen, Yuang-chin Chiang, Min-shung Liang
A data-driven methodology for the production of multilingual conversational systems
Ossama Emam, Jorge Gonzalez, Carsten Günther, Eric Janke, Siegfried Kunzmann, Giulio Maltese, Claire Waast-Richard
Multi-path, context dependent SC-HMM architectures for improved connected word recognition
Tzur Vaich, Arnon Cohen
Robust recognition using multiple utterances
Yoram Meron, Keikichi Hirose
High performance Italian continuous "digit" recognition
Piero Cosi, John-Paul Hosom, Fabio Tesser
The automatic speech recognition engine ESPERE: experiments on telephone speech
Dominique Fohr, Odile Mella, Christophe Antoine
A comparison of distributed and network speech recognition for mobile communication systems
Imre Kiss
An automatic speech recognition system using neural networks and linear dynamic models to recover and model articulatory traces
Joe Frankel, Korin Richmond, Simon King, Paul Taylor
The OGI kids' speech corpus and recognizers
Khaldoun Shobaki, John-Paul Hosom, Ronald A. Cole
Reducing time-synchronous beam search effort using stage based look-ahead and language model rank based pruning
Jian Wu, Fang Zheng
A three-stage solution for flexible vocabulary speech understanding
Grace Chung
Decoding speech in the presence of other sound sources
Jon Barker, Martin Cooke, Daniel P. W. Ellis
Efficient search strategy in large vocabulary continuous speech recognition using prosodic boundary information
Shi-Wook Lee, Keikichi Hirose, Nobuaki Minematsu
Large vocabulary Korean continuous speech recognition using a one-pass algorithm
Ha-Jin Yu, Hoon Kim, Joon-Mo Hong, Min-Seong Kim, Jong-Seok Lee
A tree-trellis n-best decoder for stochastic context-free grammars
Alexander Seward
EWAVES: an efficient decoding algorithm for lexical tree based speech recognition
Patrick Nguyen, Luca Rigazio, Jean-Claude Junqua
Novel two-pass search strategy using time-asynchronous shortest-first second-pass beam search
Atsunori Ogawa, Yoshiaki Noda, Shoichi Matsunaga
Pruning of state-tying tree using Bayesian information criterion with multiple mixtures
Yu-Chung Chan, Manhung Siu, Brian Mak
Improvements of the Philips 2000 Taiwan Mandarin benchmark system
Yuan-Fu Liao, Nick Wang, Max Huang, Hank Huang, Frank Seide
Extending the generation of word graphs for a cross-word m-gram decoder
Christoph Neukirchen, Xavier Aubert, Hans Dolfing
Improvements in search algorithm for large vocabulary continuous speech recognition
Qingwei Zhao, Zhiwei Lin, Baosheng Yuan, Yonghong Yan
New developments in automatic meeting transcription
Hua Yu, Takashi Tomokiyo, Zhirong Wang, Alex Waibel
Effective vector quantization for a highly compact acoustic model for LVCSR
Jielin Pan, Baosheng Yuan, Yonghong Yan
Effective lexical tree search for large vocabulary continuous speech recognition
Hiroki Yamamoto, Toshiaki Fukada, Yasuhiro Komori
Improvements in automatic speech summarization and evaluation methods
Chiori Hori, Sadaoki Furui
Automatic phonetic transcription of spontaneous speech (American English)
Shuangyu Chang, Lokendra Shastri, Steven Greenberg
Speed improvement of the tree-based time asynchronous search
Miroslav Novak, Michael Picheny
Recent improvements in speech recognition performance on large vocabulary conversational speech (voicemail and switchboard)
Jing Huang, B. Kingsbury, L. Mangu, Mukund Padmanabhan, George Saon, Geoffrey Zweig
Speaker normalization training and adaptation for speech recognition
Lei He, Ditang Fang, Wenhu Wu
Lexical and acoustic modeling of non-native speech in LVCSR
Laura Mayfield Tomokiyo
Modeling phone correlation for speaker adaptive speech recognition
Baojie Li, Keikichi Hirose, Nobuaki Minematsu
Very fast adaptation for large vocabulary continuous speech recognition using eigenvoices
Henrik Botterweck
Efficiently using speaker adaptation data
Chengyi Zheng, Yonghong Yan
A combination of speaker normalization and speech rate normalization for automatic speech recognition
Thilo Pfau, Robert Faltlhauser, Günther Ruske
Speech model compensation with direct adaptation of cepstral variance to noisy environment
Tai-Hwei Hwang, Kuo-Hwei Yuo, Hsiao-Chuan Wang
Gaussian similarity analysis and its application in speaker adaptation
Ji Wu, Zuoying Wang
A method for style adaptation to spontaneous speech by using a semi-linear interpolation technique
Nobuyasu Itoh, Masafumi Nishimura, Shinsuke Mori
VODIS - voice-operated driver information systems: a usability study on advanced speech technologies for car environments
Petra Geutner, Luis Arevalo, Joerg Breuninger
Natural language call steering for service applications
Wu Chou, Qiru Zhou, Hong-Kwang Jeff Kuo, Antoine Saad, David Attwater, Peter Durston, Mark Farrell, Frank Scahill
A single-stage top-down probabilistic approach towards understanding spoken and handwritten mathematical formulas
Jörg Hunsinger, Manfred Lang
Low complexity connected digit recognition for mobile applications
Prabhu Raghavan, Sunil K. Gupta
Telephone speech recognition from large lists of Czech words
Jan Nouza
Speech and word detection algorithms for hands-free applications
Duanpei Wu, X. Menendez-Pidal, L. Olorenshaw, R. Chen, M. Tanaka, M. Amador
Large vocabulary continuous speech recognition of read speech over cellular and landline networks
Ashwin Rao, Bob Roth, Venkatesh Nagesha, Don McAllaster, Natalie Liberman, Larry Gillick
Toward speech communications beyond language barrier - research of spoken language translation technologies at ATR -
Seiichi Yamamoto
Speech translation for French within the C-STAR II consortium and future perspectives
Hervé Blanchon, Christian Boitet
Japanese-to-Chinese spoken language translation based on the simple expression
Chengqing Zong, Yumi Wakita, Bo Xu, Zhenbiao Chen, Kenji Matsui
Finite-state models for lexical reordering in spoken language translation
Srinivas Bangalore, Giuseppe Riccardi
CHUNKY: an example based machine translation system for spoken dialogs
Ralf Engel
Spoken translation: challenges and opportunities
Gianni Lazzari
Analysis into a formal task-oriented pivot without clear abstract semantics is best handled as "usual" translation
Christian Boitet, Jean-Philippe Guilbaud
An improved template-based approach to spoken language translation
Chengqing Zong, Taiyi Huang, Bo Xu
An automatic interpretation system for travel conversation
Takao Watanabe, Akitoshi Okumura, Shinsuke Sakai, Kiyoshi Yamabana, Shinichi Doi, Ken Hanazawa
Cellular-phone based speech-to-speech translation system ATR-MATRIX
Rainer Gruhn, Harald Singer, Hajime Tsukada, Masaki Naito, Atsushi Nishino, Atsushi Nakamura, Yoshinori Sagisaka, Satoshi Nakamura
Generation of pronunciation rule sets for automatic segmentation of American English and Japanese
Nicole Beringer, Tsuyoshi Ito, Marcia Neff
Hindi speech database
K. Samudravijaya, P. V. S. Rao, S. S. Agrawal
MAT-2000 - design, collection, and validation of a Mandarin 2000-speaker telephone speech database
Hsiao-Chuan Wang, Frank Seide, Chiu-Yu Tseng, Lin-Shan Lee
Wavesurfer - an open source speech tool
Kåre Sjölander, Jonas Beskow
Automatic labelling of voice-quality in speech databases for synthesis
Nick Campbell, Toru Marumoto
Speech quality evaluation based on AM-FM time-frequency representations
Joe Timoney, J. Brian Foley
Free software toolkit for Japanese large vocabulary continuous speech recognition
Tatsuya Kawahara, Akinobu Lee, Tetsunori Kobayashi, Kazuya Takeda, Nobuaki Minematsu, Shigeki Sagayama, Katsunobu Itou, Akinori Ito, Mikio Yamamoto, Atsushi Yamada, Takehito Utsuro, Kiyohiro Shikano
Robust speech recognition based on off-line elicitation of multiple priors and on-line adaptive prior fusion
Qiang Huo, Bin Ma
Robust speech recognition via modeling spectral coefficients with HMM's with complex Gaussian components
William J.J. Roberts, Sadaoki Furui
Pronunciation variation in ASR: which variation to model?
Mirjam Wester, Judith M. Kessens, Helmer Strik
The use of dynamic reliability scoring in speech recognition
Xiaolong Mou, Victor Zue
Acoustical and lexical based confidence measures for a very large vocabulary telephone speech hypothesis-verification system
Javier Macías-Guarasa, Javier Ferreiros, Ruben San-Segundo, Juan Manuel Montero, Juan Manuel Pardo
Phone-duration-based confidence measures for embedded applications
Silke Goronzy, Krzysztof Marasek, Ralf Kompe, Andreas Haag
Hybrid SVM/HMM architectures for speech recognition
Aravind Ganapathiraju, Jonathan Hamaker, Joseph Picone
Rapid adaptation of n-gram language models using inter-word correlation for speech recognition
Koki Sasaki, Hui Jiang, Keikichi Hirose
Class-based language model adaptation using mixtures of word-class weights
Gareth Moore, Steve Young
A language model adaptation approach based on text classification
Jiasong Sun, Xiaodong Cui, Zuoying Wang, Yang Liu
Automatically incorporating unknown words in JUPITER
Grace Chung
Look-ahead sequential feature vector normalization for noisy speech recognition
Rathinavelu Chengalvarayan
Speaker adaptation in noisy environments based on parameter estimation using uncertain data
Naoto Iwahashi, Akihiko Kawasaki
Speech/noise separation using two microphones and a VQ model of speech signals
Alex Acero, Steven Altschuler, Lani Wu
Using maximum likelihood linear regression for segment clustering and speaker identification
Michiel Bacchiani
Structural maximum a-posteriori linear regression for unsupervised speaker adaptation
Tor André Myrvoll, Olivier Siohan, Chin-Hui Lee, Wu Chou
Transformation-based Bayesian predictive classification for online environmental learning and robust speech recognition
Jen-Tzung Chien, Guo-Hong Liao
Improved MLLR speaker adaptation using confidence measures for conversational speech recognition
Michael Pitz, Frank Wessel, Hermann Ney
Unified acoustic modeling for continuous speech recognition
Rathinavelu Chengalvarayan
A nonlinear unsupervised adaptation technique for speech recognition
Satya Dharanipragada, Mukund Padmanabhan
Using class weighting in inter-class MLLR
Sam-Joo Doh, Richard M. Stern
Burst detection based on measurements of intensity discrimination
John-Paul Hosom, Ronald A. Cole
Using acoustic condition clustering to improve acoustic change detection on broadcast news
Javier Ferreiros López, Daniel P. W. Ellis
Phone transition acoustic modeling: application to speaker independent and spontaneous speech systems
Jon P. Nedel, Rita Singh, Richard M. Stern
The measurement of acoustic similarity and its applications
Liqin Shen, Guokang Fu, Haixin Chai, Yong Qin
Glottal parameters contributing to the perception of loud voices
Sopae Yi, Hyung Soon Kim, One Good Lee
Grapheme based speech recognition for large vocabularies
Christoph Schillo, Gernot A. Fink, Franz Kummert
Automatic subword unit refinement for spontaneous speech recognition via phone splitting
Jon P. Nedel, Rita Singh, Richard M. Stern
Rhythm timing in Japanese English
Takeshi Tarui
A vocal tract area ratio estimation from spectral parameter extracted by STRAIGHT
Mamoru Iwaki
Decision tree based rate of speech modeling for speech recognition
Bhuvana Ramabhadran, Yuqing Gao
Spectral peak tracking and its use in speech recognition
Mukund Padmanabhan
Weighted pairwise scatter to improve linear discriminant analysis
Yongxin Li, Yuqing Gao, Hakan Erdogan
ARTIC: a new Czech text-to-speech system using statistical approach to speech segment database construction
Jindrich Matousek, Josef Psutka
Extended maximum a posterior linear regression (EMAPLR) model adaptation for speech recognition
Wu Chou, Olivier Siohan, Tor André Myrvoll, Chin-Hui Lee
Thai monophthong recognition using continuous density hidden Markov model and LPC cepstral coefficients
Ekkarit Maneenoi, Somchai Jitapunkul, Visarut Ahkuputra, Umavasee Thathong, Boonchai Thampanitchawong, Sudaporn Luksaneeyanawin
Error recovery and sentence verification using statistical partial pattern tree for conversational speech
Chung-Hsien Wu, Yeou-Jiunn Chen, Cher-Yao Yang
Vowel landmark detection
Andrew Wilson Howitt
Rival training: efficient use of data in discriminative training
Carsten Meyer, Georg Rose
Nasal detection module for a knowledge-based speech recognition system
Marilyn Y. Chen
Semi-continuous segmental probability model for speech signals
Jun Liu, Xiaoyan Zhu, Bin Jia
Cross-domain robust acoustic training
Ea-Ee Jan, Jaime Botella Ordinas
A C/V segmentation method for Mandarin speech based on multiscale fractal dimension
Fan Wang, Fang Zheng, Wenhu Wu
An application of SAMPA-C for standard Chinese
Xiaoxia Chen, Aijun Li, Guohua Sun, Wu Hua, Zhigang Yu
Joint speech signal enhancement based on spectral subtraction and SVD filter
Wenkai Lu, Xuegong Zhang, Yanda Li, Shen Liqin, Zhu Weibin
Inverse lattice filtering of speech with adapted non-uniform delays
Sacha Krstulovic, Frédéric Bimbot
Accurate vocal event detection method based on a fixed-point analysis of mapping from time to weighted average group delay
Hideki Kawahara, Yoshinori Atake, Parham Zolfaghari
Filterbank-based feature extraction for speech recognition and its application to voice mail transcription
Jun Huang, Mukund Padmanabhan
A cepstrum-based harmonics-to-noise ratio in voice signals
Peter J. Murphy
A pitch determination algorithm based on subharmonic-to-harmonic ratio
Xuejing Sun
Source separation techniques applied to speech linear prediction
Jordi Solé i Casals, Enric Monte i Moreno, Christian Jutten, Anisse Taleb
Model based voice decomposition method
Masahide Sugiyama
A time-varying complex speech analysis based on IV method
Keiichi Funaki
A sinusoidal model based on frequency-to-instantaneous frequency mapping
Parham Zolfaghari, Hideki Kawahara
Dynamic feature extraction by wavelet analysis
Omar Farooq, Sekharjit Datta
An investigation of variable block length methods for calculation of spectral/temporal features for automatic speech recognition
Montri Karnjanadecha, Stephen A. Zahorian
Glottal excitation modeling using HMM with application to robust analysis of speech signal
Akira Sasou, Kazuyo Tanaka
Automatic segmentation of speech based on hidden Markov models and acoustic features
Laura Docío-Fernández, Carmen García-Mateo
VERBMOBIL dialogues: multifaced analysis
Akira Kurematsu, Youichi Akegami, Susanne Burge, Susanne Jekat, Brigitte Lause, Victoria L. Maclaren, Daniela Oppermann, Tanja Schultz
A computation-efficient parameter adaptation algorithm for the generalized spectral subtraction method
Jin-Jie Zhang, Zhi-Gang Cao, Zheng-Xin Ma
A semantic tagging tool for spoken dialogue corpus
Masahiro Araki, Kiyoshi Ueda, Takuya Nishimoto, Yasuhisa Niimi
The phonetic labeling on read and spontaneous discourse corpora
Aijun Li, Xiaoxia Chen, Guohua Sun, Wu Hua, Zhigang Yin, Yiqing Zu, Fang Zheng, Zhanjiang Song
The quality of multilingual automatic segmentation using German MAUS
Nicole Beringer, Florian Schiel
UWB_S01 corpus - a Czech read-speech corpus
Vlasta Radová, Josef Psutka
Web-based monitoring, logging and reporting tools for multi-service multi-modal systems
Giuseppe Di Fabbrizio, Shrikanth Narayanan
Comparing the recognition performance of CSRs: in search of an adequate metric and statistical significance test
Helmer Strik, Catia Cucchiarini, Judith M. Kessens
Perceptual dimensions of speech sound quality in modern transmission systems
Alexander Raake