doi: 10.21437/Interspeech.2006
ISSN: 2958-1796
Robust interpretation in dialogue by combining confidence scores with contextual features
Matthew Purver, Florin Ratiu, Lawrence Cavedon
A clustering approach to semantic decoding
Hui Ye, Steve Young
A bootstrapping approach for developing language model of new spoken dialogue systems by selecting web texts
Teruhisa Misu, Tatsuya Kawahara
Phoneme-to-grapheme mapping for spoken inquiries to the semantic web
Axel Horndasch, Elmar Nöth, Anton Batliner, Volker Warnke
Bootstrapping language models for dialogue systems
Karl Weilhammer, Matthew N. Stuttle, Steve Young
Question answering with discriminative learning algorithms
Junlan Feng
Feature normalization using smoothed mixture transformations
Patrick Kenny, Vishwa Gupta, G. Boulianne, Pierre Ouellet, Pierre Dumouchel
Stochastic vector mapping-based feature enhancement using prior model and environment adaptation for noisy speech recognition
Chia-Hsin Hsieh, Chung-Hsien Wu, Jun-Yu Lin
A framework for robust MFCC feature extraction using SNR-dependent compression of enhanced mel filter bank energies
Babak Nasersharif, Ahmad Akbari
Coupling particle filters with automatic speech recognition for speech feature enhancement
Friedrich Faubel, Matthias Wölfel
Extension and further analysis of higher order cepstral moment normalization (HOCMN) for robust features in speech recognition
Chang-wen Hsu, Lin-shan Lee
An improved mel-Wiener filter for mel-LPC based speech recognition
Md. Babul Islam, Hiroshi Matsumoto, Kazumasa Yamamoto
A stochastic approach for dialog management based on neural networks
Lluis F. Hurtado, David Griol, Encarna Segarra, Emilio Sanchis
Discourse structure and speech recognition problems
Mihai Rotaru, Diane J. Litman
A TextTiling based approach to topic boundary detection in meetings
Satanjeev Banerjee, Alexander I. Rudnicky
An user-centered development of an intuitive dialog control for speech-controlled music selection in cars
Stefan Schulz, Hilko Donker
Doing research on a deployed spoken dialogue system: one year of Let's Go! experience
Antoine Raux, Dan Bohus, Brian Langner, Alan W. Black, Maxine Eskenazi
Detecting question-bearing turns in spoken tutorial dialogues
Jackson Liscombe, Jennifer J. Venditti, Julia Hirschberg
A computational auditory scene analysis system for robust speech recognition
Soundararajan Srinivasan, Yang Shao, Zhaozhang Jin, DeLiang Wang
CASA based speech separation for robust speech recognition
Runqiang Han, Pei Zhao, Qin Gao, Zhiping Zhang, Hao Wu, Xihong Wu
Enhancement of harmonic content of speech based on a dynamic programming pitch tracking algorithm
Mark R. Every, Philip J.B. Jackson
Recent advances in speech fragment decoding techniques
Jon Barker, André Coy, Ning Ma, Martin Cooke
Speech recognition using factorial hidden Markov models for separation in the feature space
Tuomas Virtanen
Combining missing-feature theory, speech enhancement and speaker-dependent/-independent modeling for speech separation
Ji Ming, Timothy J. Hazen, James R. Glass
Super-human multi-talker speech recognition: the IBM 2006 speech separation challenge system
T. Kristjansson, J. Hershey, P. Olsen, S. Rennie, Ramesh Gopinath
Modified phase opponency based solution to the speech separation challenge
Om D. Deshmukh, Carol Y. Espy-Wilson
The 2006 RWTH parliamentary speeches transcription system
J. Lööf, M. Bisani, Ch. Gollan, G. Heigold, Björn Hoffmeister, Ch. Plahl, Ralf Schlüter, Hermann Ney
Multilingual non-native speech recognition using phonetic confusion-based acoustic model modification and graphemic constraints
G. Bouselmi, D. Fohr, I. Illina, Jean-Paul Haton
Automatic speech recognition of Cantonese-English code-mixing utterances
Joyce Y. C. Chan, P. C. Ching, Tan Lee, Houwei Cao
The ICSI+ multilingual sentence segmentation system
M. Zimmerman, Dilek Hakkani-Tür, J. Fung, N. Mirghafori, L. Gottlieb, Elizabeth Shriberg, Yang Liu
Cross-language evaluation of voice-to-phoneme conversions for voice-tag application in embedded platforms
Yan Ming Cheng, Changxue Ma, Lynette Melnar
A multi-space distribution (MSD) approach to speech recognition of tonal languages
Huanliang Wang, Yao Qian, Frank K. Soong, Jian-Lai Zhou, Jiqing Han
Comparison of acoustic modeling techniques for Vietnamese and Khmer ASR
Viet Bac Le, Laurent Besacier
Multi-accent Chinese speech recognition
Yi Liu, Pascale Fung
Comparative analysis of formants of British, American and Australian accents
Seyed Ghorshi, Saeed Vaseghi, Qin Yan
Automatic initial/final generation for dialectal Chinese speech recognition
Linquan Liu, Thomas Fang Zheng, Wenhu Wu
Maximum entropy modeling for diacritization of Arabic text
Ruhi Sarikaya, Ossama Emam, Imed Zitouni, Yuqing Gao
Comparison of Slovak and Czech speech recognition based on grapheme and phoneme acoustic models
Slavomír Lihan, Jozef Juhár, Anton Cizmár
Integrating Festival and Windows
Rhys James Jones, Ambrose Choy, Briony Williams
Measuring the acceptable word error rate of machine-generated webcast transcripts
Cosmin Munteanu, Gerald Penn, Ron Baecker, Elaine Toms, David James
Analyzing reusability of speech corpus based on statistical multidimensional scaling method
Goshu Nagino, Makoto Shozakai
Redundancy and productivity in the speech technology lexicon - can we do better?
Susan Fitt, Korin Richmond
Word intelligibility estimation of noise-reduced speech
Takeshi Yamada, Masakazu Kumakura, Nobuhiko Kitawaki
Exploring the unknown - collecting 1000 speakers over the internet for the ph@ttsessionz database of adolescent speakers
Christoph Draxler
A new single-ended measure for assessment of speech quality
Timothy Murphy, Dorel Picovici, Abdulhussain E. Mahdi
Speech technology for minority languages: the case of Irish (Gaelic)
Ailbhe Ní Chasaide, John Wogan, Brian Ó Raghallaigh, Áine Ní Bhriain, Eric Zoerner, Harald Berthelsen, Christer Gobl
Further investigations on the relationship between objective measures of speech quality and speech recognition rates in noisy environments
Francisco José Fraga, Carlos Alberto Ynoguti, André Godoi Chiovato
Non-intrusive speech quality assessment with low computational complexity
Volodya Grancharov, David Y. Zhao, Jonas Lindblom, W. Bastiaan Kleijn
Using speech recognition technique for constructing a phonetically transcribed Taiwanese (Min-Nan) text corpus
Min-Siong Liang, Ren-Yuan Lyu, Yuang-Chin Chiang
Sloparl - Slovenian parliamentary speech and text corpus for large vocabulary continuous speech recognition
Andrej Zgank, Tomas Rotovnik, Matej Grasic, Marko Kos, Damjan Vlaj, Zdravko Kacic
An annotation scheme for agreement analysis
Siew Leng Toh, Fan Yang, Peter A. Heeman
Conversational quality estimation model for wideband IP-telephony services
Hitoshi Aoki, Atsuko Kurashima, Akira Takahashi
The vocal joystick data collection effort and vowel corpus
Kelley Kilanski, Jonathan Malkin, Xiao Li, Richard Wright, Jeff A. Bilmes
Comparison of the ITU-T P.85 standard to other methods for the evaluation of text-to-speech systems
Dmitry Sityaev, Katherine Knill, Tina Burrows
An annotation scheme for complex disfluencies
Peter A. Heeman, Andy McMillin, J. Scott Yaruss
Automatic phonetic transcription of large speech corpora: a comparative study
Christophe Van Bael, Lou Boves, Henk van den Heuvel, Helmer Strik
Examining knowledge sources for human error correction
Yongmei Shi, Lina Zhou
Signal modification incorporating perceptual weighting filter
Joon-Hyuk Chang, Woohyung Lim, Nam Soo Kim
Enhanced dynamic codebook reordering for advanced quantizer structures
Jani Nurminen
An efficient segment-based speech compression technique for hand-held TTS systems
Chang-Heon Lee, Sung-Kyo Jung, Thomas Eriksson, Won-Suk Jun, Hong-Goo Kang
An unified unit-selection framework for ultra low bit-rate speech coding
V. Ramasubramanian, D. Harish
Efficient VQ techniques and general noise shaping in noise feedback coding
Jes Thyssen, Juin-Hwey Chen
Classified comfort noise generation for efficient voice transmission
Yasheng Qian, Wei-Shou Hsu, Peter Kabal
Integration of a CELP coder in the ARDOR universal sound codec
Balázs Kövesi, Dominique Massaloux, David Virette, Julien Bensa
Two stage transform vector quantization of LSFs for wideband speech coding
Saikat Chatterjee, T. V. Sreenivas
Comparison of prediction based LSF quantization methods using split VQ
Saikat Chatterjee, T. V. Sreenivas
High-rate data embedding in unvoiced speech
Konrad Hofbauer, Gernot Kubin
Pitch resynchronization while recovering from a late frame in a predictive speech decoder
Kyle D. Anderson, Philippe Gournay
A novel environment-dependent speech enhancement method with optimized memory footprint
Suhadi Suhadi, Sorel Stan, Tim Fingscheidt
Weighted codebook mapping for noisy speech enhancement using harmonic-noise model
Esfandiar Zavarehei, Saeed Vaseghi, Qin Yan
MMSE estimation of complex-valued discrete Fourier coefficients with generalized gamma priors
J. Jensen, R. C. Hendriks, J. S. Erkelens, R. Heusdens
Automatic removal of typed keystrokes from speech signals
Amarnag Subramanya, Michael L. Seltzer, Alex Acero
Lattice LP filtering for noise reduction in speech signals
Erhard Rank, Gernot Kubin
Speech enhancement using modified phase opponency model
Om D. Deshmukh, Carol Y. Espy-Wilson
Single channel speech enhancement by frequency domain constrained optimization and temporal masking
Wen Jin, Michael Scordilis
Speech enhancement based on residual noise shaping
Jong Won Shin, Seung Yeol Lee, Hwan Sik Yun, Nam Soo Kim
Quality improvement of telephone speech by artificial bandwidth expansion - listening tests in three languages
Hannu Pulakka, Laura Laaksonen, Paavo Alku
Role of phase estimation in speech enhancement
Benjamin J. Shannon, Kuldip K. Paliwal
Speech enhancement based on spectral estimation from higher-lag autocorrelation
Benjamin J. Shannon, Kuldip K. Paliwal, Climent Nadeu
Noise update modeling for speech enhancement: when do we do enough?
Nitish Krishnamurthy, John H. L. Hansen
Mapping neural networks for bandwidth extension of narrowband speech
A. Shahina, B. Yegnanarayana
Decision directed constrained iterative speech enhancement
Amit Das, John H. L. Hansen
Adaptive filtering for attenuating musical noise caused by spectral subtraction
Takahiro Murakami, Yoshihisa Ishida
Evaluation of objective measures for speech enhancement
Yi Hu, Philipos C. Loizou
Performance analysis of various single channel speech enhancement algorithms for automatic speech recognition
Myung-Suk Song, Chang-Heon Lee, Hong-Goo Kang
Computer-assisted closed-captioning of live TV broadcasts in French
G. Boulianne, J.-F. Beaumont, M. Boisvert, J. Brousseau, P. Cardinal, C. Chapdelaine, M. Comeau, Pierre Ouellet, F. Osterrath
On the use of morphological analysis for dialectal Arabic speech recognition
Mohamed Afify, Ruhi Sarikaya, Hong-Kwang Jeff Kuo, Laurent Besacier, Yuqing Gao
Recognition of classroom lectures in European Portuguese
Isabel Trancoso, Ricardo Nunes, Luís Neves, Céu Viana, Helena Moniz, Diamantino Caseiro, Ana Isabel Mata
Investigating automatic decomposition for ASR in less represented languages
Thomas Pellegrini, Lori Lamel
Automatic transcription of Somali language
Abdillahi Nimaan, Pascal Nocéra, Jean-François Bonastre
Analysis of overlaps in meetings by dialog factors, hot spots, speakers, and collection site: insights for automatic speech recognition
Özgür Çetin, Elizabeth Shriberg
Improving speech recognition of two simultaneous speech signals by integrating ICA BSS and automatic missing feature mask generation
Ryu Takeda, Shun'ichi Yamamoto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Missing-feature reconstruction for band-limited speech recognition in spoken document retrieval
Wooil Kim, John H. L. Hansen
Incremental learning of MAP context-dependent edit operations for spoken phone number recognition in an embedded platform
Hahn Koo, Yan Ming Cheng
Development and evaluation of speech database in automotive environments for practical speech recognition systems
Yasunari Obuchi, Nobuo Hataoka
An effective and efficient utterance verification technology using word n-gram filler models
Dong Yu, Yun-Cheng Ju, Alex Acero
An efficient bispectrum phase entropy-based algorithm for VAD
J. M. Górriz, Javier Ramírez, C. G. Puntonet, José C. Segura
Two-step unsupervised speaker adaptation based on speaker and gender recognition and HMM combination
Petr Cerva, Jan Nouza, Jan Silovsky
CENSREC2: corpus and evaluation environments for in car continuous digit speech recognition
Satoshi Nakamura, Masakiyo Fujimoto, Kazuya Takeda
Detection of word fragments in Mandarin telephone conversation
Cheng-Tao Chu, Yun-Hsuan Sung, Yuan Zhao, Daniel Jurafsky
A DTW-based dissimilarity measure for left-to-right hidden Markov models and its application to word confusability analysis
Qiang Huo, Wei Li
Multi-flow block interleaving applied to distributed speech recognition over IP networks
Angel M. Gómez, Juan J. Ramos-Muñoz, Antonio M. Peinado, Victoria Sánchez
Moving speech recognition from software to silicon: the in silico vox project
Edward C. Lin, Kai Yu, Rob A. Rutenbar, Tsuhan Chen
A study on detection based automatic speech recognition
Chengyuan Ma, Yu Tsao, Chin-Hui Lee
Novel time domain multi-class SVMs for landmark detection
Rahul Chitturi, Mark Hasegawa-Johnson
Combining acoustic, lexical, and syntactic evidence for automatic unsupervised prosody labeling
Sankaranarayanan Ananthakrishnan, Shrikanth Narayanan
On the correlation between energy and pitch accent in read English speech
Andrew Rosenberg, Julia Hirschberg
Corpus-based generation of fundamental frequency contours using generation process model and considering emotional focuses
Keikichi Hirose, Yasufumi Asano, Nobuaki Minematsu
Prosodic boundaries in Czech: an experiment based on delexicalized speech
Tomás Dubeda
Totally data-driven intonation prediction model using a novel F0 contour parametric representation
Lifu Yi, Jian Li, Xiaoyan Lou, Jie Hao
A comparison of inter-transcriber reliability for two systems of prosodic annotation: RaP (Rhythm and Pitch) and ToBI (Tones and Break Indices)
Laura Dilley, Mara Breen, Marti Bolivar, John Kraemer, Edward Gibson
Saliency parsing for automated directory assistance
Issac Alphonso, Shuangyu Chang
Open-vocabulary spoken document retrieval based on new subword models and subword phonetic similarity
Kohei Iwata, Yoshiaki Itoh, Kazunori Kojima, Masaaki Ishigame, Kazuyo Tanaka, Shi-wook Lee
Improved topic classification over maximum entropy model using k-norm based new objectives
Xiang Li, Ea-Ee Jan, Cheng Wu, David Lubensky
Efficient interactive retrieval of spoken documents with key terms ranked by reinforcement learning
Yi-cheng Pan, Jia-yu Chen, Yen-shin Lee, Yi-sheng Fu, Lin-shan Lee
Discriminative named entity recognition of speech data using speech recognition confidence
Katsuhito Sudoh, Hajime Tsukada, Hideki Isozaki
Using latent semantic indexing for morph-based spoken document retrieval
Ville T. Turunen, Mikko Kurimo
Feature combination using linear discriminant analysis and its pitfalls
Ralf Schlüter, András Zolnay, Hermann Ney
Discriminant linear processing of time-frequency plane
Fabio Valente, Hynek Hermansky
Automatic speech recognition experiments with articulatory data
Esmeralda Uraga, Thomas Hain
Speech recognition with phonological features: some issues to attend
Frederik Stouten, Jean-Pierre Martens
Multi-source far-distance microphone selection and combination for automatic transcription of lectures
Matthias Wölfel, Christian Fügen, Shajith Ikbal, John W. McDonough
Statistical analysis and performance of DFT domain noise reduction filters for robust speech recognition
Colin Breithaupt, Rainer Martin
Normalization of the inter-frame information using smoothing filtering
L. García, José C. Segura, Carmen Benítez, Javier Ramírez, Ángel de la Torre
Comparative study on contributions of pitch-synchronization and peak-amplitude towards robustness issue of ASR
Muhammad Ghulam, Junsei Horikawa, Tsuneo Nitta
Phoneme recognition based on Fisher weight map to higher-order local auto-correlation
Yasuo Ariki, Shunsuke Kato, Tetsuya Takiguchi
Data-driven design of front-end filter bank for Lombard speech recognition
Hynek Boril, Petr Fousek, Petr Pollák
Optimization of class weights for LDA feature transformations
Andrej Ljolje
LDA based feature estimation methods for LVCSR
Janne Pylkkönen
Robust feature extraction based on spectral peaks of group delay and autocorrelation function and phase domain analysis
G. Farahani, S.M. Ahadi, M. Mehdi Homayounpour
Frequency warping by linear transformation of standard MFCC
Sankaran Panchapagesan
Automatic language identification using wavelets
Ana Lilia Reyes-Herrera, Luis Villaseñor-Pineda, Manuel Montes-y-Gómez
Minimum classification error training of hidden Markov models for acoustic language identification
Josef G. Bauer, Ekaterina Timoshenko
Unsupervised adaptation for acoustic language identification
Ekaterina Timoshenko, Josef G. Bauer
Low complexity LID using pruned pattern tables of LZW
S. V. Basavaraja, T. V. Sreenivas
Improved language identification using support vector machines for language modeling
Xi Yang, Lu-Feng Zhai, Manhung Siu, Herbert Gish
Recent advances in phonotactic language recognition using binary-decision trees
Jiri Navratil
Fusion of phonotactic and prosodic knowledge for language identification
Chi-Yueh Lin, Hsiao-Chuan Wang
Vector-based spoken language recognition using output coding
Haizhou Li, Bin Ma, Rong Tong
Basque-Spanish language identification using phone-based methods
Victor G. Guijarrubia, M. Ines Torres
The role of prosody in the perception of US native English accents
Ayako Ikeno, John H. L. Hansen
Perceptual identification and phonetic analysis of 6 foreign accents in French
Bianca Vieru-Dimulescu, Philippe Boula de Mareüil
Unsupervised Spanish dialect classification
Rongqing Huang, John H. L. Hansen
Dynamic extension of a grammar-based dialogue system: constructing an all-recipes knowing robot
Petra Gieselmann, Alex Waibel
Scalable and portable web-based multimodal dialogue interaction with geographical databases
Alexander Gruenstein, Stephanie Seneff, Chao Wang
System- versus user-initiative dialog strategy for driver information systems
Chantal Ackermann, Marion Libossek
Have we met? MDP based speaker ID for robot dialogue
Filip Krsmanovic, Curtis Spencer, Daniel Jurafsky, Andrew Y. Ng
Prominent words as anchors for TRP projection
Rob J. J. H. van Son, Wieneke Wesseling, Louis C. W. Pols
Learning multi-goal dialogue strategies using reinforcement learning with reduced state-action spaces
Heriberto Cuayáhuitl, Steve Renals, Oliver Lemon, Hiroshi Shimodaira
Pitch range and pause duration as markers of discourse hierarchy: perception experiments
Jörg Mayer, Ekaterina Jasinskaja, Ulrike Kölsch
Radiobot-CFF: a spoken dialogue system for military training
Antonio Roque, Anton Leuski, Vivek Rangarajan, Susan Robinson, Ashish Vaswani, Shrikanth Narayanan, David Traum
Is voice quality enough? - study on how the situation and user's awareness influence the utterance features
Shinya Yamada, Toshihiko Itoh, Kenji Araki
Development of Slovak GALAXY/VoiceXML based spoken language dialogue system to retrieve information from the internet
Jozef Juhár, Stanislav Ondas, Anton Cizmár, Milan Rusko, Gregor Rozinaj, Roman Jarina
LINTest: a development tool for testing dialogue systems
Lars Degerstedt, Arne Jönsson
A user simulator based on VoiceXML for evaluation of spoken dialog systems
Akinori Ito, Keisuke Shimada, Motoyuki Suzuki, Shozo Makino
User expectations and real experience on a multimodal interactive system
Kristiina Jokinen, Topi Hurtig
Detecting anger in automated voice portal dialogs
F. Burkhardt, J. Ajmera, Roman Englert, J. Stegmann, W. Burleson
Evaluation of a spoken dialogue system with usability tests and long-term pilot studies: similarities and differences
Markku Turunen, Jaakko Hakulinen, Anssi Kainulainen
CHAT: a conversational helper for automotive tasks
Fuliang Weng, Sebastian Varges, Badri Raghunathan, Florin Ratiu, Heather Pon-Barry, Brian Lathrop, Qi Zhang, Harry Bratt, Tobias Scheideck, Kui Xu, Matthew Purver, Rohit Mishra, Annie Lien, M. Raya, S. Peters, Y. Meng, J. Russell, Lawrence Cavedon, Elizabeth Shriberg, H. Schmidt, R. Prieto
User simulation for spoken dialogue systems: learning and evaluation
Kallirroi Georgila, James Henderson, Oliver Lemon
Improving the characterization of the alternative hypothesis via kernel discriminant analysis for likelihood ratio-based speaker verification
Yi-Hsiang Chao, Wei-Ho Tsai, Hsin-Min Wang, Ruei-Chuan Chang
A discriminative method for speaker verification using the difference information
Zhenchun Lei, Yingchun Yang, Zhaohui Wu
A multiclass framework for speaker verification within an acoustic event sequence system
Nicolas Scheffer, Jean-François Bonastre
Speaker cluster based GMM tokenization for speaker recognition
Bin Ma, Donglai Zhu, Rong Tong, Haizhou Li
Intra-speaker variability compensation in speaker verification with limited enrolling data
Claudio Garreton, Nestor Becerra Yoma, Carlos Molina, Fernando Huenupan
Speaking faces for face-voice speaker identity verification
Girija Chetty, Michael Wagner
Significance of formants from difference spectrum for speaker identification
Kishore Prahallad, Varanasi Sudhakar, Veluru Ranganatham, Krishna M. Bharat, S. Roy Debashish
Using genetic algorithms to weight acoustic features for speaker recognition
Maider Zamalloa, Germán Bordel, Luis Javier Rodríguez, Mikel Penagarikano, Juan Pedro Uribe
Missing feature theory with soft spectral subtraction for speaker verification
Michael T. Padilla, Thomas F. Quatieri, Douglas A. Reynolds
Prosodic features for speaker verification
Leena Mary, B. Yegnanarayana
Unsupervised learning of HMM topology for text-dependent speaker verification
Ming Liu, Thomas S. Huang
On the use of Jacobian adaptation in real speaker verification applications
Jan Anguita, Javier Hernando
A novel framework of text-independent speaker verification based on utterance transform and iterative cohort modeling
Ming Liu, Huazhong Ning, Thomas S. Huang, Zhengyou Zhang
A cohort - UBM approach to mitigate data sparseness for in-set/out-of-set speaker recognition
Vinod Prakash, John H. L. Hansen
Analysis of Lombard effect under different types and levels of noise with application to in-set speaker ID systems
Vaishnevi S. Varadarajan, John H. L. Hansen
Reducing speech coding distortion for speaker identification
Alan McCree
A text-prompted distributed speaker verification system implemented on a cellular phone and a mobile terminal
Tsuneo Kato, Hisashi Kawai
Automatic detection of irregular phonation in continuous speech
Srikanth Vishnubhotla, Carol Y. Espy-Wilson
Highly noise robust text-dependent speaker recognition based on hypothesized Wiener filtering
V. Ramasubramanian, Deepak Vijaywargiay, Kumar V. Praveen
Speaker identification under noisy environments by using harmonic structure extraction and reliable frame weighting
Hiromasa Fujihara, Tetsuro Kitahara, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Enhancing the performance of a GMM-based speaker identification system in a multi-microphone setup
Andreas Stergiou, Aristodemos Pnevmatikakis, Lazaros C. Polymenakos
Discriminative adaptation for speaker verification
C. Longworth, M. J. F. Gales
Within-class covariance normalization for SVM-based speaker recognition
Andrew O. Hatch, Sachin Kajarekar, Andreas Stolcke
A new set of features for text-independent speaker identification
Carol Y. Espy-Wilson, Sandeep Manocha, Srikanth Vishnubhotla
Detection of a third speaker in telephone conversations
Uchechukwu O. Ofoegbu, Ananth N. Iyer, Robert E. Yantorno, Stanley J. Wenndt
Improving speaker clustering using global similarity features
Konstantin Biatov, Joachim Köhler
Voting for two speaker segmentation
Balakrishnan Narayanaswamy, Rashmi Gangadharaiah, Richard M. Stern
Unsupervised model adaptation for speaker verification
Alexandre Preti, Jean-François Bonastre
A quality measure method using Gaussian mixture models and divergence measure for speaker identification
Rong Zheng, Shuwu Zhang, Bo Xu
Gammatone auditory filterbank and independent component analysis for speaker identification
Yushi Zhang, Waleed H. Abdulla
Study on speaker verification on emotional speech
Wei Wu, Thomas Fang Zheng, Ming-Xing Xu, Huan-Jun Bao
On the fusion of prosody, voice spectrum and face features for multimodal person verification
M. Farrús, A. Garde, P. Ejarque, J. Luque, Javier Hernando
An MRI based study of the acoustic effects of sinus cavities and its application to speaker recognition
Tarun Pruthi, Carol Y. Espy-Wilson
Speaker verification with non-audible murmur segments
Mariko Kojima, Tomoko Matsui, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano
Automatic recognition of speakers' age and gender on the basis of empirical studies
Christian Müller
Text-independent speaker identification in birds
E. J. S. Fox, J. D. Roberts, M. Bennamoun
Automatic acoustic identification of insects inspired by the speaker recognition paradigm
Ilyas Potamitis, Todor Ganchev, Nikos Fakotakis
A study on lattice rescoring with knowledge scores for automatic speech recognition
Sabato Marco Siniscalchi, Jinyu Li, Chin-Hui Lee
Cross-system adaptation and combination for continuous speech recognition: the influence of phoneme set and acoustic front-end
Sebastian Stüker, Christian Fügen, Susanne Burger, Matthias Wölfel
Generating complementary systems for speech recognition
C. Breslin, M. J. F. Gales
Investigations of issues for using multiple acoustic models to improve continuous speech recognition
Rong Zhang, Alexander I. Rudnicky
A new framework for system combination based on integrated hypothesis space
I-Fan Chen, Lin-shan Lee
Frame based system combination and a comparison with weighted ROVER and CNC
Björn Hoffmeister, Tobias Klein, Ralf Schlüter, Hermann Ney
Towards an integrated understanding of speaking rate in conversation
Jiahong Yuan, Mark Liberman, Christopher Cieri
Prosody of interrogative and affirmative sentences in Vietnamese language: analysis and perceptive results
Minh Quang Vu, Ðô Ðat Trân, Eric Castelli
Intonational cues to student questions in tutoring dialogs
Jennifer J. Venditti, Julia Hirschberg, Jackson Liscombe
Testing the effect of audiovisual cues to prominence via a reaction-time experiment
Emiel Krahmer, Marc Swerts
Effect of genre, speaker, and word class on the realization of given and new information
Agustín Gravano, Julia Hirschberg
Word order and tonal shape in the production of focus in short Finnish utterances
Martti Vainio, Juhani Järvikivi, Stefan Werner
Modeling sensory-to-motor mappings using neural nets and a 3D articulatory speech synthesizer
Bernd J. Kröger, Peter Birkholz, Jim Kannampuzha, Christiane Neuschaefer-Rube
Semi-automatic extraction of vocal tract movements from cineradiographic data
Julie Fontecave, Frédéric Berthommier
Towards continuous speech recognition using surface electromyography
Szu-Chen Jou, Tanja Schultz, Matthias Walliczek, Florian Kraft, Alex Waibel
A trajectory mixture density network for the acoustic-articulatory inversion mapping
Korin Richmond
Articulatory features for "meeting" speech recognition
Florian Metze
Training of coarticulation models using dominance functions and visual unit selection methods for audio-visual speech synthesis
Zdenek Krnoul, Milos Zelezný, Ludek Müller, Jakub Kanis
Phone recognition analysis for trajectory HMM
Le Zhang, Steve Renals
Discriminative kernel-based phoneme sequence recognition
Joseph Keshet, Shai Shalev-Shwartz, Samy Bengio, Yoram Singer, Dan Chazan
Combining phonetic attributes using conditional random fields
Jeremy Morris, Eric Fosler-Lussier
Discriminative MLE training using a product of Gaussian likelihoods
T. Nagarajan, Douglas O'Shaughnessy
State-level variable modeling for phoneme classification
Hao-Zheng Li, Douglas O'Shaughnessy
A time-synchronous phonetic decoder for a long-contextual-span hidden trajectory model
Xiaolong Li, Li Deng, Dong Yu, Alex Acero
Analysis of HMM temporal evolution for automatic speech recognition and utterance verification
Marta Casar, Jose A. R. Fonollosa
Improvements to bucket box intersection algorithm for fast GMM computation in embedded speech recognition systems
Min Tang, Aravind Ganapathiraju
Forward-backwards training of hybrid HMM/BN acoustic models
Konstantin Markov, Satoshi Nakamura
A comparative study of Gaussian selection methods in large vocabulary continuous speech recognition
Dirk Gehrig, Thomas Schaaf
A successive state and mixture splitting for optimizing the size of models in speech recognition
Soo-Young Suk, Seong-Jun Hahm, Ho-Youl Jung, Hyun-Yeol Chung
Improved source modeling and predictive classification for channel robust speech recognition
Valentin Ion, Reinhold Haeb-Umbach
Automatic English stop consonants classification using wavelet analysis and hidden Markov models
Marco Kühne, Roberto Togneri
Single frame selection for phoneme classification
Tingyao Wu, Dirk Van Compernolle, Jacques Duchateau, Hugo Van hamme
On the relation between maximum spectral transition positions and phone boundaries
Sorin Dusan, Lawrence Rabiner
Objective estimation of suicidal risk using vocal output characteristics
T. Yingthawornsuk, H. Kaymaz Keskinpala, D. France, D. M. Wilkes, R. G. Shiavi, R. M. Salomon
A wavelet-based parameterization for speech/music segmentation
E. Didiot, I. Illina, O. Mella, D. Fohr, Jean-Paul Haton
Distance measure between Gaussian distributions for discriminating speaking styles
Goshu Nagino, Makoto Shozakai
Bayesian networks for phonetic classification using time-scale features
Franz Pernkopf, Tuan Van Pham
Fast and effective retraining on contrastive vocal characteristics with bidirectional long short-term memory nets
Nicole Beringer
Exploiting dendritic autocorrelogram structure to identify spectro-temporal regions dominated by a single sound source
Ning Ma, Phil Green, André Coy
Locating phone boundaries from acoustic discontinuities using a two-staged approach
Pairote Leelaphattarakij, Proadpran Punyabukkana, Atiwong Suchato
Investigation on rescoring using minimum verification error (MVE) detectors
Qiang Fu, Biing-Hwang Juang
Generalization of the minimum classification error (MCE) training based on maximizing generalized posterior probability (GPP)
Qiang Fu, Antonio Moreno-Daniel, Biing-Hwang Juang, Jian-Lai Zhou, Frank K. Soong
Unsupervised detection of whispered speech in the presence of normal phonation
Michael A. Carlin, Brett Y. Smolenski, Stanley J. Wenndt
Friends and enemies: a novel initialization for speaker diarization
Xavier Anguera, Chuck Wooters, Javier Hernando
Acoustic cues for the classification of regular and irregular phonation
Kushan Surana, Janet Slifka
Realizations and representations of Thai tones in monomoraic syllables
Rattima Nitisaroj
Measuring and comparing vowel qualities in a Dutch spontaneous speech corpus
Irene Jacobi, Louis C. W. Pols, Jan Stroop
Phonetic research on accented Chinese in three dialectal regions: Shanghai, Wuhan and Xiamen
Aijun Li, Qiang Fang, Ziyu Xiong
Pronunciation variation modeling for Mandarin with accent
Chi Zhang, Ji Wu, Xi Xiao, Zuoying Wang
Specificity and generalizability of spontaneous phonetic imitation
Kuniko Y. Nielsen
On the sufficiency of automatic phonetic transcriptions for pronunciation variation research
Christophe Van Bael, Hans van Halteren
Automatic detection of voice onset time contrasts for use in pronunciation assessment
Abe Kazemzadeh, Joseph Tepperman, Jorge Silva, Hong You, Sungbok Lee, Abeer Alwan, Shrikanth Narayanan
Unfilled pauses in Japanese sentences read aloud by non-native learners
Hiroko Hirano, Goh Kawai, Keikichi Hirose, Nobuaki Minematsu
Detection of quotations and inserted clauses and its application to dependency structure analysis in spontaneous Japanese
Ryoji Hamabe, Kiyotaka Uchimoto, Tatsuya Kawahara, Hitoshi Isahara
Chinese input method based on reduced Mandarin phonetic alphabet
Chun-Han Tseng, Chia-Ping Chen
Thesaurus expansion using similar word pairs from patent documents
Yoshimi Suzuki, Fumiyo Fukumoto
Low-resource autodiacritization of abjads for speech keyword search
Patrick Schone
A model of the regularities underlying speaker variation: evidence from hybrid synthesis
Susan R. Hertz
Pauses as a tool to ensure rhythmic wellformedness
Augustin Speyer
Factors affecting speakers' choice of fillers in Japanese presentations
Michiko Watanabe, Yasuharu Den, Keikichi Hirose, Shusaku Miwa, Nobuaki Minematsu
Developing consistent pronunciation models for phonemic variants
Marelie Davel, Etienne Barnard
Grapheme-to-phoneme conversion using automatically extracted associative rules for Korean TTS system
Jinsik Lee, Seungwon Kim, Gary Geunbae Lee
Example-based grapheme-to-phoneme conversion for Thai
Paisarn Charoenpornsawat, Tanja Schultz
Building an English-Iraqi Arabic machine translation system for spoken utterances with limited resources
Jason Riesa, Behrang Mohit, Kevin Knight, Daniel Marcu
A phrase-level machine translation approach for disfluency detection using weighted finite state transducers
Sameer Maskey, Bowen Zhou, Yuqing Gao
Improving phrase-based Korean-English statistical machine translation
Jonghoon Lee, Donghyeon Lee, Gary Geunbae Lee
A hybrid phrase-based/statistical speech translation system
David Stallard, Fred Choi, Kriste Krstovski, Prem Natarajan, Rohit Prasad, Shirin Saleem
High-quality speech translation in the flight domain
Chao Wang, Stephanie Seneff
Optimizing components for handheld two-way speech translation for an English-Iraqi Arabic system
Roger Hsiao, Ashish Venugopal, Thilo Köhler, Ying Zhang, Paisarn Charoenpornsawat, Andreas Zollmann, Stephan Vogel, Alan W. Black, Tanja Schultz, Alex Waibel
Distant-talking continuous speech recognition based on a novel reverberation model in the feature domain
Armin Sehr, Marcus Zeller, Walter Kellermann
Robust feature space adaptation for telephony speech recognition
Xin Lei, Jon Hamaker, Xiaodong He
A simulated-data adaptation technique for robust speech recognition
Nattanun Thatphithakkul, Boontee Kruatrachue, Chai Wutiwiwatchai, Sanparith Marukatat, Vataya Boonpiam
A new HMM adaptation approach for the case of a hands-free speech input in reverberant rooms
Hans-Günter Hirsch, Harald Finster
A vector space approach to environment modeling for robust speech recognition
Yu Tsao, Chin-Hui Lee
Subspace modeling and selection for noisy speech recognition
Jen-Tzung Chien, Chuan-Wei Ting
Recognition of interest in human conversational speech
Björn Schuller, Niels Köhler, Ronald Müller, Gerhard Rigoll
Using system and user performance features to improve emotion detection in spoken tutoring dialogs
Hua Ai, Diane J. Litman, Kate Forbes-Riley, Mihai Rotaru, Joel Tetreault, Amruta Purandare
Real-life emotions detection with lexical and paralinguistic cues on human-human call center dialogs
Laurence Devillers, Laurence Vidrascu
Real vs. acted emotional speech
Janneke Wilting, Emiel Krahmer, Marc Swerts
Emotion recognition in spontaneous speech using GMMs
Daniel Neiberg, Kjell Elenius, Kornel Laskowski
Personality factors in human deception detection: comparing human to machine performance
Frank Enos, Stefan Benus, Robin L. Cautin, Martin Graciarena, Julia Hirschberg, Elizabeth Shriberg
Developing an automatic assessment tool for children's oral reading
Leen Cleuren, Jacques Duchateau, Alain Sips, Pol Ghesquière, Hugo Van hamme
Prototyping a CALL system for students of Japanese using dynamic diagram generation and interactive hints
Christopher Waple, Yasushi Tsubota, Masatake Dantsuji, Tatsuya Kawahara
A multilingual embodied conversational agent for tutoring speech and language learning
Dominic W. Massaro, Ying Liu, Trevor H. Chen, Charles Perfetti
Classroom success of an intelligent tutoring system for lexical practice and reading comprehension
Michael Heilman, Kevyn Collins-Thompson, Jamie Callan, Maxine Eskenazi
Assessing the reading level of web pages
Sarah E. Petersen, Mari Ostendorf
Is ASR accurate enough for automated reading tutors, and how can we tell?
Jack Mostow
Development of a program for self assessment of Japanese pronunciation by English learners
Chiharu Tsurutani, Yutaka Yamauchi, Nobuaki Minematsu, Dean Luo, Kazutaka Maruyama, Keikichi Hirose
Pronunciation verification of children's speech for automatic literacy assessment
Joseph Tepperman, Jorge Silva, Abe Kazemzadeh, Hong You, Sungbok Lee, Abeer Alwan, Shrikanth Narayanan
Computer aided pronunciation learning system using speech recognition techniques
Sherif Mahdy Abdou, Salah Eldeen Hamid, Mohsen Rashwan, Abdurrahman Samir, Ossama Abdel-Hamid, Mostafa Shahin, Waleed Nazih
An information theoretic tool for investigating speech perception
Bryce Lobdell, Jont B. Allen
An adaptive sampling procedure for speech perception experiments
Geoffrey Stewart Morrison
Disentangling gestural and auditory contrast accounts of compensation for coarticulation
Navin Viswanathan, James S. Magnuson, Carol A. Fowler
The role of positional probability in the segmentation of Cantonese speech
Michael C. W. Yip
Nasality perception of vowels in different language background
Shahina Haque, Tomio Takara
Steady-state suppression in reverberation: a comparison of native and nonnative speech perception
Nao Hodoshima, Dawn Behne, Takayuki Arai
Effect of dynamic information of formants on discrimination of English vowels in consonantal contexts by Japanese listeners
Akiyo Joto
Native and nonnative audio-visual perception of English fricatives in quiet and cafe-noise backgrounds
Yue Wang, Dawn Behne, Haisheng Jiang, Chad Danyluck
Perceptive and acoustic measurement of average speaking pitch of female and male speakers in German radio news
Sven Grawunder, Ines Bose, Birgit Hertha, Franziska Trauselt, Lutz Christian Anders
Effects of frequency shifts on perceived naturalness and gender information in speech
Peter F. Assmann, Sophia Dembling, Terrance M. Nearey
Influence of pause length on listeners' impressions in simultaneous interpretation
Hitomi Tohyama, Shigeki Matsubara
New measures to chart toddlers' speech perception and language development: a test of the lexical restructuring hypothesis
Iris-Corinna Schwarz, Denis Burnham
Perception of fundamental frequency in cochlear implant patients
Ángel de la Torre, Cristina Roldán, Manuel Sainz
Effects of featural similarity and overlap position on lexical confusions and overt similarity judgments
Sarah C. Creel, Delphine Dahan, Daniel Swingley
Word structure and tone perception in Mandarin
Hansjörg Mixdorff, Yu Hu
Identification of regional accents in French: perception and categorization
Cecile Woehrling, Philippe Boula de Mareüil
Consonant and vowel confusions in speech-weighted noise
Sandeep Phatak, Jont B. Allen
Accident - execute: increased activation in nonnative listening
Mirjam Broersma
Estimation of the quality dimension "directness/frequency content" for the instrumental assessment of speech quality
Kirstin Scholz, Marcel Waltermann, Lu Huo, Alexander Raake, Sebastian Möller, Ulrich Heute
Effects of word frequency on the acoustic durations of affixes
Mark Pluymaekers, Mirjam Ernestus, R. Harald Baayen
A noninvasive, low-cost device to study the velopharyngeal port during speech and some preliminary results
Xiaochuan Niu, Alexander B. Kain, Jan P. H. van Santen
Characterization of cued speech vowels from the inner lip contour
Noureddine Aboutabit, Denis Beautemps, Laurent Besacier
Modelling aspiration noise during phonation using the LF voice source model
Christer Gobl
A simulation based parameter optimization for a coarticulation model
Jianguo Wei, Xugang Lu, Jianwu Dang
Multivariate analysis of frame-based acoustic cues of dysperiodicities in connected speech
A. Kacha, Francis Grenez, Jean Schoentgen
Effects of midline tongue piercing on spectral centroid frequencies of sibilants
Tom Kovacs, Donald S. Finan
Assessment of articulatory sub-systems of dysarthric speech using an isolated-style phoneme recognition system
P. Vijayalakshmi, M. R. Reddy, Douglas O’Shaughnessy
Respiratory/laryngeal interactions during sustained vowel production in children
Donald S. Finan, Carol A. Boliek
Acoustic characterization of children with speech delay
H. Timothy Bunnell, James B. Polikoff
Study of time and frequency variability in pathological speech and error reduction methods for automatic speech recognition
Oscar Saz, Antonio Miguel, Eduardo Lleida, Alfonso Ortega, Luis Buera
Voice source correlates of prosodic features in American English: a pilot study
Markus Iseli, Yen-Liang Shue, Melissa A. Epstein, Patricia Keating, Jody Kreiman, Abeer Alwan
On speech variation and word type differentiation by articulatory feature representations
Louis ten Bosch, R. Harald Baayen, Mirjam Ernestus
A study of emotional speech articulation using a fast magnetic resonance imaging technique
Sungbok Lee, Erik Bresch, Jason Adams, Abe Kazemzadeh, Shrikanth Narayanan
Reconstructing tongue movements from audio and video
Hedvig Kjellström, Olov Engwall, Olle Bälter
New considerations for vowel nasalization based on separate mouth-nose recording
Gang Feng, Cyril Kotenkoff
An acoustic and articulatory study of Lombard speech: global effects on the utterance
Maeva Garnier, Lucie Bailly, Marion Dohen, Pauline Welby, Helene Loevenbruck
Tracking of involuntary formant frequency variations and application to parkinsonian speech
Laurence Cnockaert, Jean Schoentgen, Pascal Auzou, Canan Ozsancak, Francis Grenez
All-pole model estimation of vocal tract on the frequency domain
Luis Weruaga, Amar Al-Khayat
HMM-based MAP prediction of voiced and unvoiced formant frequencies from noisy MFCC vectors
Jonathan Darch, Ben Milner
Extracting formants from short segments of speech using group delay functions
Joseph M. Anand, S. Guruprasad, B. Yegnanarayana
Tracking of visible vocal tract resonances (VVTR) based on Kalman filtering
I. Yücel Özbek, Mübeccel Demirekler
Wavelet ridge track interpretation in terms of formants
Salma Chaari, Kais Ouni, Noureddine Ellouze
Unsupervised segmentation of words into morphemes - Morpho Challenge 2005 application to automatic speech recognition
Mikko Kurimo, Mathias Creutz, Matti Varjokallio, Ebru Arsoy, Murat Saraclar
Lattice extension and rescoring based approaches for LVCSR of Turkish
Ebru Arsoy, Murat Saraclar
Exploiting semantic relations for a spoken language understanding application
Catherine Kobus, Geraldine Damnati, Lionel Delphin-Poulat, Renato De Mori
Sentence boundary detection of spontaneous Japanese using statistical language model and support vector machines
Yuya Akita, Masahiro Saikou, Hiroaki Nanjo, Tatsuya Kawahara
Compact n-gram models by incremental growing and clustering of histories
Sami Virpioja, Mikko Kurimo
Opinion mining in a telephone survey corpus
Nathalie Camelin, Geraldine Damnati, Frederic Bechet, Renato De Mori
An integrated solution for error concealment in DSR systems over wireless channels
Antonio M. Peinado, Angel M. Gómez, Victoria Sánchez, José L. Pérez-Córdoba, Antonio J. Rubio
Interleaving and MMSE estimation with VQ replicas for distributed speech recognition over lossy packet networks
Angel M. Gómez, Antonio M. Peinado, Victoria Sánchez, José L. Carmona, Antonio J. Rubio
Noise-robust speech recognition of conversational telephone speech
Gang Chen, Hesham Tolba, Douglas O’Shaughnessy
Lost speech reconstruction method using speech recognition based on missing feature theory and HMM-based speech synthesis
Shingo Kuroiwa, Satoru Tsuge, Fuji Ren
Speaker adaptation using evolutionary-based linear transform
Sid-Ahmed Selouani, Douglas O’Shaughnessy
A speaker adaptation algorithm using principal curves in noisy environments
Jingying Wang, Zuoying Wang
Limitations of MLLR adaptation with Spanish-accented English: an error analysis
Constance Clarke, Daniel Jurafsky
Issues with uncertainty decoding for noise robust speech recognition
H. Liao, M. J. F. Gales
Vector Taylor series based joint uncertainty decoding
Haitian Xu, Luca Rigazio, David Kryze
A maximum likelihood training approach to irrelevant variability compensation based on piecewise linear transformations
Qiang Huo, Donglai Zhu
Speaker clustered regression-class trees for MLLR adaptation
Arindam Mandal, Mari Ostendorf, Andreas Stolcke
Robust speech recognition over mobile networks using combined weighted Viterbi decoding and subvector based error concealment
Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg
Speaker adaptation of trajectory HMMs using feature-space MLLR
Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda, Tadashi Kitamura
Feature and model space speaker adaptation with full covariance Gaussians
Daniel Povey, George Saon
Linguistic tuple segmentation in n-gram-based statistical machine translation
Adrià de Gispert, José B. Mariño
Sentence boundary detection using sequential dependency analysis combined with CRF-based chunking
Takanobu Oba, Takaaki Hori, Atsushi Nakamura
Sequence classification for machine translation
Srinivas Bangalore, Patrick Haffner, Stephan Kanthak
Two-stage vocabulary-free spoken document retrieval - subword identification and re-recognition of the identified sections
Yoshiaki Itoh, Takayuki Otake, Kohei Iwata, Kazunori Kojima, Masaaki Ishigame, Kazuyo Tanaka, Shi-wook Lee
Design and performance analysis of a factoid question answering system for spontaneous speech transcriptions
Mihai Surdeanu, David Dominguez-Sal, Pere R. Comas
Performance improvement of dialog speech translation by rejecting unreliable utterances
Toshiyuki Takezawa, Tohru Shimizu
Cross-lingual dialog model for speech to speech translation
Emil Ettelaie, Panayiotis G. Georgiou, Shrikanth Narayanan
A robust fusion method for multilingual spoken document retrieval systems employing tiered resources
Murat Akbacak, John H. L. Hansen
Recent advances of IBM’s handheld speech translation system
Weizhong Zhu, Bowen Zhou, Charles Prosser, Pavel Krbec, Yuqing Gao
QASR: question answering using semantic roles for speech interface
Svetlana Stenchikova, Dilek Hakkani-Tür, Gokhan Tur
Towards a multimodal topic tracking system for a mobile robot
Jan F. Maas, Britta Wrede, Gerhard Sagerer
Edge-splitting in a cumulative multimodal system, for a no-wait temporal threshold on information fusion, combined with an under-specified display
Edward C. Kaiser, Paulo Barthelmess
Joint interpretation of input speech and pen gestures for multimodal human-computer interaction
Pui-Yu Hui, Helen M. Meng
Voice activity detector based on enhanced cumulant of LPC residual and on-line EM algorithm
David Cournapeau, Tatsuya Kawahara, Kenji Mase, Tomoji Toriyama
A constrained Baum-Welch algorithm for improved phoneme segmentation and efficient training
David Huggins-Daines, Alexander I. Rudnicky
Infinite models for speaker clustering
Fabio Valente
The segmentation of multi-channel meeting recordings for automatic speech recognition
John Dines, Jithendra Vepa, Thomas Hain
Minimum boundary error training for automatic phonetic segmentation
Jen-Wei Kuo, Hsin-Min Wang
Dynamic evidence models in a DBN phone recognizer
William Schuler, Tim Miller, Stephen Wu, Andrew Exley
The IBM 2006 speech transcription system for European parliamentary speeches
B. Ramabhadran, Olivier Siohan, L. Mangu, G. Zweig, M. Westphal, H. Schulz, A. Soneiro
Advances in lecture recognition: the ISL RT-06s evaluation system
Christian Fügen, Matthias Wölfel, John W. McDonough, Shajith Ikbal, Florian Kraft, Kornel Laskowski, Mari Ostendorf, Sebastian Stüker, Kenichi Kumatani
Investigation on Mandarin broadcast news speech recognition
Mei-Yuh Hwang, Xin Lei, Wen Wang, Takahiro Shinozaki
Improved tone modeling for Mandarin broadcast news speech recognition
Xin Lei, Manhung Siu, Mei-Yuh Hwang, Mari Ostendorf, Tan Lee
Prosodic modeling in large vocabulary Mandarin speech recognition
Jui-Ting Huang, Lin-shan Lee
Experiments on Chinese speech recognition with tonal models and pitch estimation using the Mandarin Speecon data
Ying Sun, Daniel Willett, Raymond Brueckner, Rainer Gruhn, Dirk Bühler
Visual correlates to prominence in several expressive modes
Jonas Beskow, Björn Granström, David House
How auditory and visual prosody is used in end-of-utterance detection
Pashiera Barkhuysen, Emiel Krahmer, Marc Swerts
The importance of different facial areas for signalling visual prominence
Marc Swerts, Emiel Krahmer
Visual speech segmentation and speaker recognition for transcription of TV news
Josef Chaloupka
HMM-based continuous sign language recognition using a fast optical flow parameterization of visual information
G. Cortés, L. García, Carmen Benítez, José C. Segura
Audio-visual speech recognition in the presence of a competing speaker
Xu Shao, Jon Barker
Expressive prosody for unit-selection speech synthesis
Volker Strom, Robert A. J. Clark, Simon King
Cues for hesitation in speech synthesis
Rolf Carlson, Kjell Gustafson, Eva Strangert
Multi-domain text-to-speech synthesis by automatic text classification
Francesc Alías, Joan Claudi Socoró, Xavier Sevillano, Ignasi Iriondo, Xavier Gonzalvo
Phrase break prediction using logistic generalized linear model
Lifu Yi, Jian Li, Xiaoyan Lou, Jie Hao
Joint prosodic and segmental unit selection speech synthesis
Robert A. J. Clark, Simon King
Phonetically enriched labeling in unit selection TTS synthesis
Yeon-Jun Kim, Ann K. Syrdal, Alistair Conkie, Mark C. Beutnagel
Further developments in LSM-based boundary training for unit selection TTS
Jerome R. Bellegarda
A style control technique for speech synthesis using multiple regression HSMM
Takashi Nose, Junichi Yamagishi, Takao Kobayashi
Acoustic model training based on linear transformation and MAP modification for HSMM-based speech synthesis
Katsumi Ogata, Makoto Tachibana, Junichi Yamagishi, Takao Kobayashi
Improving Arabic HMM based speech synthesis quality
Ossama Abdel-Hamid, Sherif Mahdy Abdou, Mohsen Rashwan
Farsbayan: a unit selection based Farsi speech synthesizer
M. Mehdi Homayounpour, Majid Namnabat
Amharic speech synthesis using cepstral method with stress generation rule
Tadesse Anberbir, Tomio Takara
Automatic syllable-pattern induction in statistical Thai text-to-phone transcription
Ausdang Thangthai, Chatchawarn Hansakunbuntheung, Rungkarn Siricharoenchai, Chai Wutiwiwatchai
Development of prototype text-to-speech systems for Northern Sotho
H. J. Oosthuizen, S. T. Phihlela, M. J. D. Manamela
Identify language origin of personal names with normalized appearance number of web pages
Jiali You, Yining Chen, Min Chu, Yong Zhao, Jinlin Wang
Conditional random fields for hierarchical segment selection in text-to-speech synthesis
Christian Weiss, Wolfgang Hess
Corpus design based on the Kullback-Leibler divergence for text-to-speech synthesis application
Aleksandra Krul, Géraldine Damnati, François Yvon, Thierry Moudenc
HMM-based unit selection using frame sized speech segments
Zhen-Hua Ling, Ren-Hua Wang
The target cost formulation in unit selection speech synthesis
Paul Taylor
Unit selection and its relation to symbolic prosody: a new approach
Daniel Tihelka, Jindrich Matousek
Minimum generation error criterion for tree-based clustering of context dependent HMMs
Yi-Jian Wu, Wu Guo, Ren-Hua Wang
Selective-LPC based representation of STRAIGHT spectrum and its applications in spectral smoothing
Heng Kang, Wenju Liu
Towards a comprehensive investigation of factors relevant to peak alignment using a unit selection corpus
Matthias Jilka, Bernd Möbius
Six approaches to limited domain concatenative speech synthesis
Robert J. Utama, Ann K. Syrdal, Alistair Conkie
From pre-recorded prompts to corporate voices: on the migration of interactive voice response applications
V. Fischer, S. Kunzmann
Automatic speech segmentation with multiple statistical models
Seung Seop Park, Jong Won Shin, Nam Soo Kim
Evaluation of perceptual quality of control point reduction in rule-based synthesis
Kimmo Pärssinen, Marko Moberg
Segment connection networks for corpus-based speech synthesis
Geert Coorman
Observations of the spoken language acquisition process based on a multimodal infant behavior corpus
Ryo Tsuji, Tomohiko Kasami, Shogo Ishikawa, Shinya Kiriyama, Yoichi Takebayashi, Shigeyoshi Kitazawa
Infants' ability to extract verbs from continuous speech
Ellen Marklund, Francisco Lacerda
Category formation and the role of spectral quality in the perception and production of English front vowels
Ricardo A.H. Bion, Paola Escudero, Andréia S. Rauber, Barbara O. Baptista
Productions in bilinguism, early foreign language learning and monolinguism: a prosodic comparison
Ranka Bijeljac-Babic, Christelle Dodane, Sabine Metta, Claire Gérard
Training native English speakers to identify Japanese vowel length with fast rate sentences
Yukari Hirata, Elizabeth Whitehurst, Emily Cullings, Jacob Whiton, Carol Glenn
Formant-based English vowel assessment for Chinese in Taiwan
Jiang-Chun Chen, Wei-Tang Hsu, J.-S. Roger Jang, Ren-Yuan Lyu, Yuang-Chin Chiang
Substitute sounds for ventriloquism and speech disorders
Jörg Metzner, Marcel Schmittfull, Karl Schnell
Automatic Mandarin pronunciation scoring for native learners with dialect accent
Si Wei, Qing-Sheng Liu, Yu Hu, Ren-Hua Wang
Quick individual fitting methods of simplified hearing compensation for elderly people
Kengo Fujita, Tsuneo Kato, Hisashi Kawai
An online adaptive filtering algorithm for the vocal joystick
Xiao Li, Jonathan Malkin, Susumu Harada, Jeff A. Bilmes, Richard Wright, James Landay
Speaking aid system for total laryngectomees using voice conversion of body transmitted artificial speech
Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano
A Spanish speech to sign language translation system for assisting deaf-mute people
R. San-Segundo, R. Barra, L. F. D’Haro, J. M. Montero, R. Córdoba, J. Ferreiros
Potential relevance of audio-visual integration in mammals for computational modeling
Eeva Klintfors, Francisco Lacerda
Finding the gaps: applying a connectionist model of word segmentation to noisy phone-recognized speech data
C. Anton Rytting
Rapid speaker adaptation using regression-tree based spectral peak alignment
Shizhen Wang, Xiaodong Cui, Abeer Alwan
Physiologically-motivated synchrony-based processing for robust automatic speech recognition
Chanwoo Kim, Yu-Hsiang Chiu, Richard M. Stern
Sub-word unit based non-audible speech recognition using surface electromyography
Matthias Walliczek, Florian Kraft, Szu-Chen Jou, Tanja Schultz, Alex Waibel
Individual on-line variance adaptation of frequency filtered parameters for robust ASR
Jesús Vicente-Peña, Fernando Díaz-de-María, Bastiaan Kleijn
Recent progress on the discriminative region-dependent transform for speech feature extraction
Bing Zhang, Spyros Matsoukas, Richard Schwartz
Improved warping-invariant features for automatic speech recognition
Jan Rademacher, Matthias Wächter, Alfred Mertins
Summarization evaluation for text and speech: issues and approaches
Ani Nenkova
Summarization of spontaneous conversations
Xiaodan Zhu, Gerald Penn
Perplexity based linguistic model adaptation for speech summarisation
Pierre Chatain, Edward Whittaker, Joanna Mrozinski, Sadaoki Furui
Multi-layered summarization of spoken document archives by information extraction and semantic structuring
Lin-shan Lee, Sheng-yi Kong, Yi-cheng Pan, Yi-sheng Fu, Yu-tsun Huang
Soundbite detection in broadcast news domain
Sameer Maskey, Julia Hirschberg
Dialogue act compression via pitch contour preservation
Gabriel Murray, Steve Renals
Manifold HLDA and its application to robust speech recognition
Toshiaki Kubo, Tetsuji Ogawa, Tetsunori Kobayashi
Time-dependent cross-probability model for multi-environment model based linear normalization
Luis Buera, Eduardo Lleida, Juan A. Nolazco-Flores, Antonio Miguel, Alfonso Ortega
SPAM and full covariance for speech recognition
Daniel Povey
The use of Bayesian network for incorporating accent, gender and wide-context dependency information
Sakriani Sakti, Konstantin Markov, Satoshi Nakamura
Integrating phonetic boundary discrimination explicitly into HMM systems
Yu Wang, Eric Fosler-Lussier
Robust acoustic-based syllable detection
Zhimin Xie, Partha Niyogi
A tone recognition framework for continuous Mandarin speech
Lei He, Jie Hao
Pronunciation variant-based multi-path HMMs for syllables
Annika Hämäläinen, Louis ten Bosch, Lou Boves
A new state-dependent phonetic tied-mixture model with head-body-tail structured HMM for real-time continuous phoneme recognition system
Junho Park, Hanseok Ko
Conversion from phoneme based to grapheme based acoustic models for speech recognition
Andrej Zgank, Zdravko Kacic
Phone vector DHMM to decode a phone recognizer's output
Bong-Wan Kim, Dae-Lim Choi, Yongnam Um, Yong-Ju Lee
Combining multiple-sized sub-word units in a speech recognition system using baseform selection
T. Nagarajan, P. Vijayalakshmi, Douglas O'Shaughnessy
Local transformation models for speech recognition
Antonio Miguel, Eduardo Lleida, Alfons Juan, Luis Buera, Alfonso Ortega, Oscar Saz
Online speech detection and dual-gender speech recognition for captioning broadcast news
Toru Imai, Shoei Sato, Akio Kobayashi, Kazuo Onoe, Shinichi Homma
Automatic alignment and error correction of human generated transcripts for long speech recordings
Timothy J. Hazen
Improving speech recognition accuracy with multi-confidence thresholding
Shuangyu Chang
Conceptual decoding from word lattices: application to the spoken dialogue corpus MEDIA
Christophe Servan, Christian Raymond, Frédéric Béchet, Pascal Nocéra
Improving the performance of out-of-vocabulary word rejection by using support vector machines
Shilei Huang, Xiang Xie, Jingming Kuang
Robust phone lattice decoding
Kris Demuynck, Dirk Van Compernolle, Hugo Van hamme
Imperfect transcript driven speech recognition
Benjamin Lecouteux, Georges Linarès, Pascal Nocéra, Jean-François Bonastre
New improvements in decoding speed and latency for automatic captioning
Jian Xue, Rusheng Hu, Yunxin Zhao
Colloquial Iraqi ASR for speech translation
Shirin Saleem, Rohit Prasad, Prem Natarajan
Reducing computation on parallel decoding using frame-wise confidence scores
Tomohiro Hakamata, Akinobu Lee, Yoshihiko Nankaku, Keiichi Tokuda
Posterior based keyword spotting with a priori thresholds
Hamed Ketabdar, Jithendra Vepa, Samy Bengio, Hervé Bourlard
A multi-pass error detection and correction framework for Mandarin LVCSR
Zhengyu Zhou, Helen M. Meng, Wai Kit Lo
Continual on-line monitoring of Czech spoken broadcast programs
Jan Nouza, Jindrich Zdansky, Petr Cerva, Jan Kolorenc
Fast SVM training based on the choice of effective samples for audio classification
Shilei Zhang, Hongchen Jiang, Shuwu Zhang, Bo Xu
Online speaker change detection by combining BIC with microphone array beamforming
Joerg Schmalenstroeer, Reinhold Haeb-Umbach
Speech/non-speech discrimination combining advanced feature extraction and SVM learning
Javier Ramírez, Pablo Yélamos, J. M. Górriz, José C. Segura, L. García
Cooperation between global and local methods for the automatic segmentation of speech synthesis corpora
Safaa Jarifi, Dominique Pastor, Olivier Rosec
Speaker independent voiced-unvoiced detection evaluated in different speaking styles
Martin Heckmann, Marco Moebus, Frank Joublin, Christian Goerick
Robust speaker diarization for meetings: ICSI RT06s evaluation system
Xavier Anguera, Chuck Wooters, Jose M. Pardo
A multipitch tracker for monaural speech segmentation
André Coy, Jon Barker
Novel entropy based moving average refiners for HMM landmarks
Rahul Chitturi, Mark Hasegawa-Johnson
Two-microphone voice activity detection in the presence of coherent interference
Gibak Kim, Nam Ik Cho
On a greedy learning algorithm for dPLRM with applications to phonetic feature detection
Tor André Myrvoll, Tomoko Matsui
Improving glottal waveform estimation through rank-based glottal quality assessment
Elliot Moore II, Juan Torres
A pitch marks filtering algorithm based on restricted dynamic programming
Francesc Alías, Carlos Monzo, Joan Claudi Socoró
Analysis of nonmodal phonation using minimum entropy deconvolution
Nicolas Malyska, Thomas F. Quatieri
An automatic singing skill evaluation method for unknown melodies using pitch interval accuracy and vibrato features
Tomoyasu Nakano, Masataka Goto, Yuzuru Hiraga
A spectral-temporal method for pitch tracking
Stephen A. Zahorian, Princy Dikshit, Hongbing Hu
Pitch determination using aligned AMDF
M. Shahidur Rahman, Hirobumi Tanaka, Tetsuya Shimamura
Syllable-length path mixture hidden Markov models with trajectory clustering for continuous speech recognition
Yan Han, Lou Boves
Acoustic modeling for spoken dialogue systems based on unsupervised utterance-based selective training
Tobias Cincarek, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano
GMM-based acoustic modeling for embedded speech recognition
Christophe Lévy, Georges Linarès, Jean-François Bonastre
Boosting HMM performance with a memory upgrade
Mathias De Wachter, Kris Demuynck, Dirk Van Compernolle
An integrated approach to improve speech recognition rate for non-native speakers
Y. Deng, X. Li, C. Kwan, R. Xu, B. Raj, Richard M. Stern, D. Williamson
Bayesian decision tree state tying for conversational speech recognition
Rusheng Hu, Yunxin Zhao
Feature extraction for spectral continuity measures in concatenative speech synthesis
Barry Kirkpatrick, Darragh O’Brien, Ronán Scaife
Decision tree-based training of probabilistic concatenation models for corpus-based speech synthesis
Shinsuke Sakai, Tatsuya Kawahara
Constructing stylistic synthesis databases from audio books
Yong Zhao, Di Peng, Lijuan Wang, Min Chu, Yining Chen, Peng Yu, Jun Guo
Expanding phonetic coverage in unit selection synthesis through unit substitution from a donor voice
Alistair Conkie, Ann K. Syrdal
Unifying unit selection and hidden Markov model speech synthesis
Paul Taylor
CLUSTERGEN: a statistical parametric synthesizer using trajectory modeling
Alan W. Black
Cluster-based user simulations for learning dialogue strategies
Verena Rieser, Oliver Lemon
Prompt selection with reinforcement learning in an AT&T call routing application
Charles Lewis, Giuseppe Di Fabbrizio
Developing speech dialogs for multimodal HMIs using finite state machines
Silke Goronzy, Raquel Mochales, Nicole Beringer
Development of advanced dialog systems with PATE
Norbert Pfleger, Jan Schehl
A joint intention-based dialogue engine
Rajah Annamalai Subramanian, Philip Cohen
Memo: towards automatic usability evaluation of spoken dialogue services by user error simulations
Sebastian Möller, Roman Englert, Klaus Engelbrecht, Verena Hafner, Anthony Jameson, Antti Oulasvirta, Alexander Raake, Norbert Reithinger
Synthesizing breathiness in natural speech with sinusoidal modelling
Brett Matthews, Raimo Bakis, Ellen Eide
Voice GMM modelling for FESTIVAL/MBROLA emotive TTS synthesis
Mauro Nicolao, Carlo Drioli, Piero Cosi
Emovoice: a system to generate emotions in speech
João P. Cabral, Luís C. Oliveira
Real-time synthesis of Chinese visual speech and facial expressions using MPEG-4 FAP features in a three-dimensional avatar
Zhiyong Wu, Shen Zhang, Lianhong Cai, Helen M. Meng
Modeling the acoustic correlates of expressive elements in text genres for expressive text-to-speech synthesis
Hongwu Yang, Helen M. Meng, Lianhong Cai
Automatic emotion recognition of speech signal in Mandarin
Sheng Zhang, P. C. Ching, Fanrang Kong
Feature analysis for emotion recognition from Mandarin speech considering the special characteristics of Chinese language
Yi-hao Kao, Lin-shan Lee
Timing levels in segment-based speech emotion recognition
Björn Schuller, Gerhard Rigoll
Analyzing dialogue data for real-world emotional speech classification
Ryuichi Nisimura, Souji Omae, Hideki Kawahara, Toshio Irino
Evolving emotional prosody
Cecilia Ovesdotter Alm, Xavier Llorà
Vocal emotion recognition with cochlear implants
Xin Luo, Qian-Jie Fu, John J. Galvin III
Emotion detection in infants' cries based on a maximum likelihood approach
S. Matsunaga, S. Sakaguchi, M. Yamashita, S. Miyahara, S. Nishitani, K. Shinohara
yeah right: sarcasm recognition for spoken dialogue systems
Joseph Tepperman, David Traum, Shrikanth Narayanan
Identification of confusion and surprise in spoken dialog using prosodic features
Rohit Kumar, Carolyn P. Rosé, Diane J. Litman
Analysis and detection of speech under sleep deprivation
Tin Lay Nwe, Haizhou Li, Minghui Dong
Language, gender, speaking style and language proficiency as factors influencing the autonomous vocalic filler production in spontaneous speech
Ioana Vasilescu, Martine Adda-Decker
How to handle gender and number agreement in statistical language models?
Caroline Lavecchia, Kamel Smaïli, Jean-Paul Haton
Prosodic features for a maximum entropy language model
Oscar Chan, Roberto Togneri
Language model adaptation with a word list and a raw corpus
Shinsuke Mori
Topic-based language modeling with dynamic Bayesian networks
Pascal Wiggers, Léon J.M. Rothkrantz
Speech recognition of foreign out-of-vocabulary words using a hierarchical language model
Hirofumi Yamamoto, Genichiro Kikui, Satoshi Nakamura, Yoshinori Sagisaka
Language modeling of Chinese personal names based on character units for continuous Chinese speech recognition
Xinhui Hu, Hirofumi Yamamoto, Genichiro Kikui, Yoshinori Sagisaka
A syllable based continuous speech recognizer for Tamil
A. Lakshmi, Hema A. Murthy
Spontaneous Thai speech recognition
Monika Woszczyna, Paisarn Charoenpornsawat, Tanja Schultz
Acoustic analysis and automatic recognition of spontaneous children's speech
M. Gerosa, D. Giuliani, Shrikanth Narayanan
Speech and speech recognition during dictation corrections
Keith Vertanen
Comparison of keyword spotting methods for searching in speech
Lubos Smídl, Josef V. Psutka
Automatic generation of statistical language models for interactive voice response applications
Mithun Balakrishna, Cyril Cerovic, Dan Moldovan, Ellis Cave
Call analysis with classification using speech and non-speech features
Yun-Cheng Ju, Ye-Yi Wang, Alex Acero
A spoken language understanding approach using successive learners
Wei-Lin Wu, Ru-Zhan Lu, Hui Liu, Feng Gao
Conversational help desk: vague callers and context switch
Osamuyimen Stewart, Juan Huerta, Ea-Ee Jan, Cheng Wu, Xiang Li, David Lubensky
Integrating spoken dialog and question answering: the Ritel project
Sophie Rosset, Olivier Galibert, Gabriel Illouz, Aurélien Max
Rapid simulation-driven reinforcement learning of multimodal dialog strategies in human-robot interaction
Thomas Prommer, Hartwig Holzapfel, Alex Waibel
Software architectures for incremental understanding of human speech
Gregory Aist, James Allen, Ellen Campana, Lucian Galescu, Carlos A. Gómez Gallo, Scott C. Stoness, Mary Swift, Michael Tanenhaus
Lingua machinae - an unorthodox proposal
Florian Schiel, Christoph Draxler, Marion Libossek
Evaluation of content presentation strategies for an in-car spoken dialogue system
Heather Pon-Barry, Fuliang Weng, Sebastian Varges
On designing context sensitive language models for spoken dialog systems
Vaibhava Goel, Ramesh Gopinath
Using SVM and error-correcting codes for multiclass dialog act classification in meeting corpus
Yang Liu
A multilingual expectations model for contextual utterances in mixed-initiative spoken dialogue
Hartwig Holzapfel, Alex Waibel
Dynamic help generation by estimating user's mental model in spoken dialogue systems
Yuichiro Fukubayashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Dialog act tagging with support vector machines and hidden Markov models
Dinoj Surendran, Gina-Anne Levow
Noise robust model-based voice activity detection
Ángel de la Torre, Javier Ramírez, Carmen Benítez, José C. Segura, L. García, Antonio J. Rubio
Auto-segmentation based VAD for robust ASR
Yu Shi, Frank K. Soong, Jian-Lai Zhou
Improved speech activity detection using cross-channel features for recognition of multiparty meetings
Kofi Boakye, Andreas Stolcke
Evaluation of voice activity detection by combining multiple features with weight adaptation
Yusuke Kida, Tatsuya Kawahara
Voice activity detection in personal audio recordings using autocorrelogram compensation
Keansub Lee, Daniel P. W. Ellis
Discriminating speech and non-speech with regularized least squares
Ryan Rifkin, Nima Mesgarani
Automatic grammar correction for second-language learners
John Lee, Stephanie Seneff
ASR-based corrective feedback on pronunciation: does it really work?
Ambra Neri, Catia Cucchiarini, Helmer Strik
Evaluating prosody of Mandarin speech for language learning
Minghui Dong, Haizhou Li, Tin Lay Nwe
Spoken language technologies applied to digital talking books
Isabel Trancoso, Carlos Duarte, António Serralheiro, Diamantino Caseiro, Luís Carriço, Céu Viana
Building an English speech synthesis system from a Japanese ALS patient's voice
Akemi Iida, Jun Ito, Shimpei Kajima, Tsutomu Sugawara
Multi-modal system ICANDO: intellectual computer assistant for disabled operators
Alexey Karpov, Andrey Ronzhin, Alexandre Cadiou
User responses to prosodic variation in fragmentary grounding utterances in dialog
Gabriel Skantze, David House, Jens Edlund
Analysis of prosodic and linguistic cues of phrase finals for turn-taking and dialog acts
Carlos Toshinori Ishi, Hiroshi Ishiguro, Norihiro Hagita
From reaction to prediction: experiments with computational models of turn-taking
David Schlangen
On speaker-specific prosodic models for automatic dialog act segmentation of multi-party meetings
Jáchym Kolár, Elizabeth Shriberg, Yang Liu
A case study in the identification of prosodic cues to turn-taking: back-channeling in Arabic
Nigel G. Ward, Yaffa Al Bayyari
/nailon/ - software for online analysis of prosody
Jens Edlund, Mattias Heldner
Improved hybrid microphone array post-filter by integrating a robust speech absence probability estimator for speech enhancement
Junfeng Li, Masato Akagi, Yôiti Suzuki
Soft decision combining for dual channel noise reduction
Timo Gerkmann, Rainer Martin
An improved affine projection algorithm based crosstalk resistant adaptive noise canceller
Guo Chen, Vijay Parsa
An optimum microphone array post-filter for speech applications
Stamatis Leukimmiatis, Dimitrios Dimitriadis, Petros Maragos
Multi-microphone periodicity function for robust F0 estimation in real noisy and reverberant environments
Federico Flego, Maurizio Omologo
A new dual-microphone speech enhancement method for oriented noises
H. R. Abutalebi, M. Pourahmadi, M.R. Aghabozorgi
50 years late: repeating Miller-Nicely 1955
Andrew Lovitt, Jont B. Allen
New 20-word lists for word intelligibility test in Japanese
Shuichi Sakamoto, Tadahiro Yoshikawa, Shigeaki Amano, Yôiti Suzuki, Tadahisa Kondo
Sparseness and speech perception in noise
Guoping Li, Mark E. Lutman
An assessment of automatic speech recognition as speech intelligibility estimation in the context of additive noise
Wei M. Liu, John S. D. Mason, Nicholas W. D. Evans, Keith A. Jellyman
Underlying quality dimensions of modern telephone connections
Marcel Wältermann, Kirstin Scholz, Alexander Raake, Ulrich Heute, Sebastian Möller
An ERB loudness pattern based objective speech quality measure
Guo Chen, Vijay Parsa, Susan Scollie
A spectral clustering approach to speaker diarization
Huazhong Ning, Ming Liu, Hao Tang, Thomas S. Huang
BINSEG: an efficient speaker-based segmentation technique
Jindrich Zdansky
Multi-stream speaker diarization systems for the meetings domain
Ascensión Gallardo-Antolín, Xavier Anguera, Chuck Wooters
Improved performance evaluation of speech event detectors
Carla Lopes, Fernando Perdigão
Speaker diarization for multiple distant microphone meetings: mixing acoustic features and inter-channel time differences
Jose M. Pardo, Xavier Anguera, Chuck Wooters
Low-complexity and efficient classification of voiced/unvoiced/silence for noisy environments
Tuan Van Pham, Gernot Kubin
Unsupervised language model adaptation based on automatic text collection from WWW
Motoyuki Suzuki, Yasutomo Kajiura, Akinori Ito, Shozo Makino
Unsupervised language model adaptation using latent semantic marginals
Yik-Cheung Tam, Tanja Schultz
Unsupervised language model adaptation for Mandarin broadcast conversation transcription
David Mrva, Philip C. Woodland
Language model adaptation for tiny adaptation corpora
Dietrich Klakow
Pronunciation dependent language models
Andrej Ljolje
Improving perplexity measures to incorporate acoustic confusability
Amit Anil Nanavati, Nitendra Rajput
Improving the performance of HMM-based voice conversion using context clustering decision tree and appropriate regression matrix format
Long Qin, Yi-Jian Wu, Zhen-Hua Ling, Ren-Hua Wang
MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training
Chung-Han Lee, Chung-Hsien Wu
Novel method for data clustering and mode selection with application in voice conversion
Jani Nurminen, Jilei Tian, Victor Popa
Text-independent cross-language voice conversion
David Sündermann, Harald Höge, Antonio Bonafonte, Hermann Ney, Julia Hirschberg
Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano
Improving body transmitted unvoiced speech with statistical voice conversion
Mikihiro Nakagiri, Tomoki Toda, Hideki Kashioka, Kiyohiro Shikano
An HMM-based singing voice synthesis system
Keijiro Saino, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda
Voice conversion based on mixtures of factor analyzers
Yosuke Uto, Yoshihiko Nankaku, Tomoki Toda, Akinobu Lee, Keiichi Tokuda
Efficient Gaussian mixture model evaluation in voice conversion
Jilei Tian, Jani Nurminen, Victor Popa
Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesis
Yuji Nakano, Makoto Tachibana, Junichi Yamagishi, Takao Kobayashi
Frequency warping based on mapping formant parameters
Zhi-Wei Shuang, Raimo Bakis, Slava Shechtman, Dan Chazan, Yong Qin
Automatic phonetic segmentation by using a SPM-based approach for a Mandarin singing voice corpus
Cheng-Yuan Lin, J.-S. Roger Jang
A comparison of singing evaluation algorithms
Partha Lal
Towards automatic parameter extraction of command-response model for Cantonese
Raymond W. M. Ng, Tan Lee, Wentao Gu
A model for the f0 reset in corpus-based intonation approaches
Francisco Campillo, Jan P. H. van Santen, Eduardo R. Banga
Generating German intonation with a trainable prosodic model
Gérard Bailly, Jan Gorisch
Incorporating second-order information into two-step major phrase break prediction for Korean
Seungwon Kim, Jinsik Lee, Byeongchang Kim, Gary Geunbae Lee
Totally data-driven duration modeling based on generalized linear model for Mandarin TTS
Lifu Yi, Jian Li, Xiaoyan Lou, Jie Hao
Segmental duration modeling in Turkish
Özlem Özturk, Tolga Ciloglu
Lexical stress in continuous speech recognition
Rogier C. van Dalen, Pascal Wiggers, Léon J. M. Rothkrantz
Improving tone recognition with combined frequency and amplitude modelling
Siwei Wang, Gina-Anne Levow
Latent prosodic modeling (LPM) for speech with applications in recognizing spontaneous Mandarin speech with disfluencies
Che-Kuang Lin, Lin-shan Lee
Tone recognition of continuous speech of standard Chinese using neural network and tone nucleus model
Keikichi Hirose, Hui Hu, Xiaodong Wang, Nobuaki Minematsu
Prosodic feature generation for back-channel prediction
Thamar Solorio, Olac Fuentes, Nigel G. Ward, Yaffa Al Bayyari
On the sufficiency and redundancy of pitch for TRP projection
Wieneke Wesseling, Rob J. J. H. van Son, Louis C. W. Pols
Hypothesis spaces for minimum Bayes risk training in large vocabulary speech recognition
Matthew Gibson, Thomas Hain
Minimum divergence based discriminative training
Jun Du, Peng Liu, Frank K. Soong, Jian-Lai Zhou, Ren-Hua Wang
Solving large margin estimation of HMMs via semidefinite programming
Xinwei Li, Hui Jiang
Use of incrementally regulated discriminative margins in MCE training for speech recognition
Dong Yu, Li Deng, Xiaodong He, Alex Acero
Soft margin estimation of hidden Markov model parameters
Jinyu Li, Ming Yuan, Chin-Hui Lee
Discriminative models for spoken language understanding
Ye-Yi Wang, Alex Acero
Evaluating a virtual speech cuer
G. Gibert, Gérard Bailly, F. Elisei
Intelligibility of machine translation output in speech synthesis
Laura Mayfield Tomokiyo, Kay Peterson, Alan W. Black, Kevin A. Lenzo
A technique for controlling voice quality of synthetic speech using multiple regression HSMM
Makoto Tachibana, Takashi Nose, Junichi Yamagishi, Takao Kobayashi
Learning from errors in grapheme-to-phoneme conversion
Tatyana Polyakova, Antonio Bonafonte
Eigenvoice conversion based on Gaussian mixture model
Tomoki Toda, Yamato Ohtani, Kiyohiro Shikano
Generating time-constrained audio presentations of structured information
Brian Langner, Rohit Kumar, Arthur Chan, Lingyun Gu, Alan W. Black
Multimodal authentication using qualitative support vector machines
F. Alsaade, A. Ariyaeeinia, L. Meng, A. Malegaonkar
Adaptive multimodal fusion by uncertainty compensation
Vassilis Pitsikalis, Athanassios Katsamanis, George Papandreou, Petros Maragos
Effects of familiarity with faces and voices on second-language speech processing: components of memory traces
Debra M. Hardison
Automatic metadata generation and video editing based on speech and image recognition for medical education contents
Satoshi Tamura, Koji Hashimoto, Jiong Zhu, Satoru Hayamizu, Hirotsugu Asai, Hideki Tanahashi, Makoto Kanagawa
Analysis of correlation between audio and visual speech features for clean audio feature prediction in noise
Ibrahim Almajai, Ben Milner, Jonathan Darch
TDA: a new trainable trajectory formation system for facial animation
Oxana Govokhina, Gérard Bailly, Gaspard Breton, Paul Bagshaw
Modeling of speech signals based on Bessel-like orthogonal transform
Giorgio Biagetti, Paolo Crippa, Claudio Turchetti
Glottal closure and opening detection for flexible parametric voice coding
Pamornpol Jinachitra
Independent components for acoustic modeling
Jan Trmal, Jan Vanek, Ludek Müller, Jan Zelinka
Pitch-scale modification using the modulated aspiration noise source
Daryush Mehta, Thomas F. Quatieri
Max-Gabor analysis and synthesis of spectrograms
Tony Ezzat, Jake Bouvrie, Tomaso Poggio
Monitoring of the natural voice variations in open and closed phases with frequency warped ARMA modeling
Pedro J. Quintana-Morales, Juan L. Navarro-Mesa, Antonio G. Ravelo-Garcia, Fernando D. Lorenzo-Garcia
Speech analyzer using a joint estimation model of spectral envelope and fine structure
Hirokazu Kameoka, Jonathan Le Roux, Nobutaka Ono, Shigeki Sagayama
An investigation of manifold learning for speech analysis
Andrew Errity, John McKenna
An incremental algorithm for signal reconstruction from short-time Fourier transform magnitude
Jake Bouvrie, Tony Ezzat
Automatic assignment of anchoring points on vowel templates for defining correspondence between time-frequency representations of speech samples
Toru Takahashi, Masashi Nishi, Toshio Irino, Hideki Kawahara
Nonlinear dynamical invariants for speech recognition
S. Prasad, S. Srinivasan, M. Pannuri, G. Lazarou, Joseph Picone
Exploiting polynomial-fit histogram equalization and temporal average for robust speech recognition
Shih-Hsiang Lin, Yao-Ming Yeh, Berlin Chen
Missing data mask models with global frequency and temporal constraints
Sébastien Demange, Christophe Cerisara, Jean-Paul Haton
Multi-stream ASR: an oracle perspective
Hemant Misra, Jithendra Vepa, Hervé Bourlard
A weight estimation method using LDA for multi-band speech recognition
Koji Iwano, Kaname Kojima, Sadaoki Furui
Powered cepstral normalization (p-CN) for robust features in speech recognition
Chang-wen Hsu, Lin-shan Lee
Robust automatic speech recognition for accented Mandarin in car environments
Pei Ding, Lei He, Xiang Yan, Jie Hao
A robust feature extraction based on the MTF concept for speech recognition in reverberant environment
Xugang Lu, Masashi Unoki, Masato Akagi
Clean speech feature estimation based on soft spectral masking
Young Joon Kim, Woohyung Lim, Nam Soo Kim
Robust speech recognition by modifying clean and telephone feature vectors using bidirectional neural network
Mansoor Vali, Seyyed Ali Seyyed Salehi, Kazem Karimi
Silence energy normalization for robust speech recognition in additive noise environment
Chung-fu Tai, Jeih-weih Hung
Handling convolutional noise in missing data automatic speech recognition
Maarten Van Segbroeck, Hugo Van hamme
Noisy speech recognition based on selection of multiple noise suppression methods using noise GMMs
Norihide Kitaoka, Souta Hamaguchi, Seiichi Nakagawa
Using posterior-based features in template matching for speech recognition
Guillermo Aradilla, Jithendra Vepa, Hervé Bourlard
Hypothesis-based feature combination of multiple speech inputs for robust speech recognition in automotive environments
Yasunari Obuchi, Nobuo Hataoka
Continuous time-frequency masking method for blind speech separation with adaptive choice of threshold parameter using ICA
Zbynek Koldovsky, Jan Nouza, Jan Kolorenc
Multistage convolutive blind source separation for speech mixture
Yanxue Liang, Ichiro Hagiwara
Detection and separation of speech events in meeting recordings
Futoshi Asano, Jun Ogata
Audio person tracking in a smart-room environment
Alberto Abad, Carlos Segura, Dušan Macho, Javier Hernando, Climent Nadeu
Tracking and beamforming for multiple simultaneous speakers with probabilistic data association filters
Tobias Gehrig, Ulrich Klee, John W. McDonough, Shajith Ikbal, Matthias Wölfel, Christian Fügen
Modeling the precedence effect for binaural sound source localization in noisy and echoic environments
Martin Heckmann, Tobias Rodemann, Bjorn Scholling, Frank Joublin, Christian Goerick
Using a differential microphone array to estimate the direction of arrival of two acoustic sources
Fotios Talantzis, Anthony G. Constantinides, Lazaros C. Polymenakos
Speaker localization based on oriented global coherence field
Alessio Brutti, Maurizio Omologo, Piergiorgio Svaizer
Performance evaluation of three features for model-based single channel speech separation problem
M. H. Radfar, R. M. Dansereau, A. Sayadiyan
Single-channel speech separation using sparse non-negative matrix factorization
Mikkel N. Schmidt, Rasmus K. Olsson
Adaptive speech enhancement for speech separation in diffuse noise
Rong Hu, Yunxin Zhao
A probabilistic graphical model for microphone array source separation using rich pre-trained source models
H. T. Attias
Geometrically constrained permutation-free source separation in an undercomplete speech unmixing scenario
Erik Visser
Highly directional multi-beam audio loudspeaker
Dirk Olszewski, Klaus Linhard