doi: 10.21437/Interspeech.2004
ISSN: 2958-1796
From decoding-driven to detection-based paradigms for automatic speech recognition
Chin-Hui Lee
In search of a universal phonetic alphabet - theory and application of an organic visible speech-
Hyun-Bok Lee
From X-ray or MRI data to sounds through articulatory synthesis: towards an integrated view of the speech communication process
Jacqueline Vaissière
Stochastic gradient adaptation of front-end parameters
Sreeram Balakrishnan, Karthik Visweswariah, Vaibhava Goel
Maximum-likelihood adaptation of semi-continuous HMMs by latent variable decomposition of state distributions
Antoine Raux, Rita Singh
Transformation and combination of hidden Markov models for speaker selection training
Chao Huang, Tao Chen, Eric Chang
Improving eigenspace-based MLLR adaptation by kernel PCA
Brian Mak, Roger Hsiao
Rapid acoustic model development using Gaussian mixture clustering and language adaptation
Nikos Chatzichrisafis, Vasilios Digalakis, Vasilios Diakoloukas, Costas Harizakis
Adaptation of front end parameters in a speech recognizer
Karthik Visweswariah, Ramesh Gopinath
Speaker normalization through constrained MLLR based transforms
Diego Giuliani, Matteo Gerosa, Fabio Brugnara
Multi-layer structure MLLR adaptation algorithm with subspace regression classes and tying
Xiangyu Mu, Shuwu Zhang, Bo Xu
Adaptation in the pronunciation space for non-native speech recognition
Georg Stemmer, Stefan Steidl, Christian Hacker, Elmar Nöth
Robust ASR model adaptation by feature-based statistical data mapping
Xuechuan Wang, Douglas O'Shaughnessy
A novel target-driven generalized JMAP adaptation algorithm
Zhaobing Han, Shuwu Zhang, Bo Xu
Speedup of kernel eigenvoice speaker adaptation by embedded kernel PCA
Brian Mak, Simon Ho, James T. Kwok
Maximum a posteriori eigenvoice speaker adaptation for Korean connected digit recognition
Hyung Bae Jeon, Dong Kook Kim
Vocal tract normalization based on spectral warping
Wei Wang, Stephen Zahorian
Acoustic model adaptation for coded speech using synthetic speech
Koji Tanaka, Fuji Ren, Shingo Kuroiwa, Satoru Tsuge
Speaker adaptation method for CALL system using bilingual speakers' utterances
Motoyuki Suzuki, Hirokazu Ogasawara, Akinori Ito, Yuichi Ohkawa, Shozo Makino
Acoustic model adaptation based on coarse/fine training of transfer vectors and its application to a speaker adaptation task
Shinji Watanabe
Speaker clustering of speech utterances using a voice characteristic reference space
Wei-Ho Tsai, Shih-Sian Cheng, Hsin-Min Wang
Performance improvement of connected digit recognition using unsupervised fast speaker adaptation
Young Kuk Kim, Hwa Jeon Song, Hyung Soon Kim
Simultaneous estimation of weights of eigenvoices and bias compensation vector for rapid speaker adaptation
Hyung Soon Kim, Hwa Jeon Song
Speaker dependent model order selection of spectral envelopes
Matthias Wölfel
Methods for task adaptation of acoustic models with limited transcribed in-domain data
Enrico Bocchieri, Michael Riley, Murat Saraclar
Unsupervised topic adaptation for lecture speech retrieval
Atsushi Fujii, Tetsuya Ishikawa, Katsunobu Itou, Tomoyosi Akiba
Mean and covariance adaptation based on minimum classification error linear regression for continuous density HMMs
Haibin Liu, Zhenyang Wu
Design of ready-made acoustic model library by two-dimensional visualization of acoustic space
Goshu Nagino, Makoto Shozakai
Language recognition using phone lattices
Jean-Luc Gauvain, Abdel Messaoudi, Holger Schwenk
ACCDIST: a metric for comparing speakers' accents
Mark Huckvale
Aspects of named entity processing
Michael Levit, Allen Gorin, Patrick Haffner, Hiyan Alshawi, Elmar Nöth
Finite-state-based and phrase-based statistical machine translation
Josep M. Crego, José B. Marino, Adria de Gispert
Using word lattice information for a tighter coupling in speech translation systems
Tanja Schultz, Szu-Chen Jou, Stephan Vogel, Shirin Saleem
Confirmation strategy for document retrieval systems with spoken dialog interface
Teruhisa Misu, Tatsuya Kawahara, Kazunori Komatani
Multilayer subword units for open-vocabulary spoken document retrieval
Shi-Wook Lee, Kazuyo Tanaka, Yoshiaki Itoh
An efficient partial matching algorithm toward speech retrieval by speech
Yoshiaki Itoh, Kazuyo Tanaka, Shi-wook Lee
Language detection by neural discrimination
Celestin Sedogbo, Sebastien Herry, Bruno Gas, Jean Luc Zarader
Language identification techniques based on full recognition in an air traffic control task
Ricardo de Córdoba, Javier Ferreiros, Valentin Sama, Javier Macias-Guarasa, Luis F. D'Haro, Fernando Fernandez
Dialect analysis and modeling for automatic classification
John H. L. Hansen, Umit Yapanel, Rongqing Huang, Ayako Ikeno
Rhythm in read British English: interdialect variability
Emmanuel Ferragne, Francois Pellegrino
A grammar-based Chinese to English speech translation system for portable devices
Pascale Fung, Yi Liu, Yongsheng Yang, Yihai Shen, Dekai Wu
Cost-sensitive call classification
Gokhan Tur
An evaluation of a spoken document retrieval baseline system in Finnish
Mikko Kurimo, Ville Turunen, Inger Ekman
Discriminative training of naive Bayes classifiers for natural language call routing
Hui Jiang, Pengfei Liu, Imed Zitouni
Phonetic confusion based document expansion for spoken document retrieval
Nicolas Moreau, Hyoung-Gook Kim, Thomas Sikora
Hybrid named entity recognition for question-answering system
Euisok Chung, Soojong Lim, Yi-Gyu Hwang, Myung-Gil Jang
An online audio indexing system
Jitendra Ajmera, Iain McCowan, Hervé Bourlard
Histogram normalisation and the recognition of names and ontology words in the MUMIS project
Eric Sanders, Febe de Wet
Improving the topic indexation and segmentation modules of a media watch system
Rui Amaral, Isabel Trancoso
Speech timing and rhythmic structure in Arabic dialects: a comparison of two approaches
Melissa Barkat-Defradas, Rym Hamdi, Emmanuel Ferragne, Francois Pellegrino
METRIC-SEQDAC: a hybrid approach for audio segmentation
Hsin-min Wang, Shih-sian Cheng
Statistical Chinese spoken document retrieval using latent topical information
Jen-Wei Kuo, Yao-Min Huang, Berlin Chen, Hsin-min Wang
Keyword recognition and extraction by multiple-LVCSRs with 60,000 words in speech-driven WEB retrieval task
Matsushita Masahiko, Hiromitsu Nishizaki, Seiichi Nakagawa, Takehito Utsuro
Improved spoken language translation using n-best speech recognition hypotheses
Ruiqiang Zhang, Genichiro Kikui, Hirofumi Yamamoto, Frank K. Soong, Taro Watanabe, Eiichiro Sumita, Wai-Kit Lo
Automatic language identification using discrete hidden Markov model
Kakeung Wong, Man-hung Siu
Two-way speech-to-speech translation on handheld devices
Bowen Zhou, Daniel Dechelotte, Yuqing Gao
HLT modules scalability within the NESPOLE! project
Hervé Blanchon
Correlation between VOT and F0 in the perception of Korean stops and affricates
Midam Kim
The development of anticipatory labial coarticulation in French: a pioneering study
Aude Noiray, Lucie Menard, Marie-Agnes Cathiard, Christian Abry, Christophe Savariaux
Speech recognition, syllabification and statistical phonetics
Melvyn John Hunt
Data-driven approaches for automatic detection of syllable boundaries
Jilei Tian
Phonemic repertoire and similarity within the vocabulary
Anne Cutler, Dennis Norris, Nuria Sebastian-Galles
Bootstrapping phonetic lexicons for new languages
Sameer Maskey, Alan Black, Laura Tomokiyo
Lexical representation of non-native phonemes
Mirjam Broersma, K. Marieke Kolkman
A comparative study on the production of inter-stress intervals of English speech by English native speakers and Korean speakers
Jong-Pyo Lee, Tae-Yeoub Jang
Articulatory correlates of voice qualities of good guys and bad guys in Japanese anime: an MRI study
Emi Zuiki Murano, Mihoko Teshigawara
Effects of phonetic contexts on the duration of phonetic segments in fluent read speech
Sorin Dusan
A study on nasal coda loss in continuous speech
Qiang Fang
An improved pair-wise variability index for comparing the timing characteristics of speech
Hua-Li Jian
An acoustic study of speech rhythm in Taiwan English
Hua-Li Jian
Language specific phonetic rules: evidence from domain-initial strengthening
Sung-A Kim
Spectral characteristics of the release bursts in Korean alveolar stops
Hansang Park
Frequency effects on vowel reduction in three typologically different languages (Dutch, Finnish, Russian)
Rob Van Son, Olga Bolotova, Louis C. W. Pols, Mietta Lennes
Assessment of non-native phones in anglicisms by German listeners
Julia Abresch, Stefan Breuer
Phonology of exceptions for Korean grapheme-to-phoneme conversion
Sunhee Kim
Acoustic and prosodic analysis of Japanese vowel-vowel hiatus with laryngeal effect
Kitazawa Shigeyoshi, Shinya Kiriyama
A cross-linguistic acoustic comparison of unreleased word-final stops: Korean and Thai
Kimiko Tsukada
Acoustic correlates of phrase-internal lexical boundaries in Dutch
Taehong Cho, Elizabeth K. Johnson
Phonotactics vs. phonetic cues in native and non-native listening: Dutch and Korean listeners' perception of Dutch and English
Taehong Cho, James M. McQueen
Comparing intonation of two varieties of French using normalized F0 values
Svetlana Kaminskaia, Francois Poire
Phonetic realization of the suffix-suppressed accentual phrase in Korean
Mira Oh, Kee-Ho Kim
Spectral moment vs. Bark cepstral analysis of children's word-initial voiceless stops
H. Timothy Bunnell, James Polikoff, Jane McNicholas
Pronunciation assessment based upon the compatibility between a learner's pronunciation structure and the target language's lexical structure
Nobuaki Minematsu
Spread of high tone in Akita Japanese
Kenji Yoshida
Biomechanical parameter fingerprint in the mucosal wave power spectral density
Juan-Ignacio Godino-Llorente, Victoria Rodellar-Biarge, Pedro Gomez-Vilda, Francisco Diaz-Perez, Agustin Alvarez-Marquina, Rafael Martinez-Olalla
Classification of pathological voice including severely noisy cases
Cheolwoo Jo, Soo-Geon Wang, Byung-Gon Yang, Hyung-Soon Kim, Tao Li
A robust glottal source model estimation technique
Qiang Fu, Peter Murphy
F0 and formant frequency distribution of dysarthric speech - a comparative study
Hiroki Mori, Yasunori Kobayashi, Hideki Kasuya, Hajime Hirose, Noriko Kobayashi
Procedure "senza vibrato": a key component for morphing singing
Hideki Kawahara, Yumi Hirachi, Morise Masanori, Hideki Banno
Thyroplastic medialisation in unilateral vocal fold paralysis: assessing voice quality recovering
Claudia Manfredi, Giorgio Peretti, Laura Magnoni, Fabrizio Dori, Ernesto Iadanza
Voice enhancement of male speakers with laryngeal neoplasm
Gernot Kubin, Martin Hagmueller
A comparison of the perturbation analysis between PRAAT and Computerized Speech Lab
Jong Min Choi, Myung-Whun Sung, Kwang Suk Park, Jeong-Hun Hah
Evaluation of universal compensation on Aurora 2 and 3 and beyond
Ming Ji, Baochun Hou
PROSPECT features and their application to missing data techniques for robust speech recognition
Hugo Van hamme
Accounting for the uncertainty of speech estimates in the context of model-based feature enhancement
Hugo Van hamme, Patrick Wambacq, Veronique Stouten
Applying the Aurora feature extraction schemes to a phoneme based recognition task
Hans-Guenter Hirsch, Harald Finster
Evaluation of tree-structured piecewise linear transformation-based noise adaptation on AURORA2 database
Zhipeng Zhang, Tomoyuki Ohya, Sadaoki Furui
Online minimum mean square error filtering of noisy cepstral coefficients using a sequential EM algorithm
Tor Andre Myrvoll, Satoshi Nakamura
HMM-based feature compensation method: an evaluation using the AURORA2
Akira Sasou, Kazuyo Tanaka, Satoshi Nakamura, Futoshi Asano
Noise adaptation for robust AURORA 2 noisy digit recognition using statistical data mapping
Xuechuan Wang, Douglas O'Shaughnessy
MFCC computation from magnitude spectrum of higher lag autocorrelation coefficients for robust speech recognition
Benjamin J. Shannon, Kuldip K. Paliwal
A noise-robust feature extraction method based on pitch-synchronous ZCPA for ASR
Ghulam Muhammad, Takashi Fukuda, Junsei Horikawa, Tsuneo Nitta
Including uncertainty of speech observations in robust speech recognition
José Carlos Segura, Angel De la Torre, Javier Ramirez, Antonio J. Rubio, Carmen Benitez
Integration of n-best recognition results obtained by multiple noise reduction algorithms
Takeshi Yamada, Jiro Okada, Nobuhiko Kitawaki
Revisiting some model-based and data-driven denoising algorithms in Aurora 2 context
Panji Setiawan, Sorel Stan, Tim Fingscheidt
Exploring high-performance speech recognition in noisy environments using high-order Taylor series expansion
Guo-Hong Ding, Bo Xu
A robust training algorithm based on neighborhood information
Wing-Hei Au, Man-Hung Siu
In-phase feature induction: an effective compensation technique for robust speech recognition
Siu Wa Lee, Pak Chung Ching
Improved performance of Aurora 4 using HTK and unsupervised MLLR adaptation
Siu-Kei Au Yeung, Man-Hung Siu
A new feature extraction front-end for robust speech recognition using progressive histogram equalization and multi-eigenvector temporal filtering
Shang-nien Tsai, Lin-shan Lee
Tight coupling of speech recognition and dialog management - dialog-context dependent grammar weighting for speech recognition
Christian Fügen, Hartwig Holzapfel, Alex Waibel
Noise robust real world spoken dialogue system using GMM based rejection of unintended inputs
Akinobu Lee, Keisuke Nakamura, Ryuichi Nisimura, Hiroshi Saruwatari, Kiyohiro Shikano
Speech interface for name input based on combination of recognition methods using syllable-based n-gram and word dictionary
Hironori Oshikawa, Norihide Kitaoka, Seiichi Nakagawa
Constrained minimization technique for topic identification using discriminative training and support vector machines
Imed Zitouni, Minkyu Lee, Hui Jiang
Characterizing task-oriented dialog using a simulated ASR channel
Jason D. Williams, Steve Young
A spoken dialog system based on automatic grammar generation and template-based weighting for autonomous mobile robots
Takashi Konashi, Motoyuki Suzuki, Akinori Ito, Shozo Makino
Noise adaptive spoken dialog system based on selection of multiple dialog strategies
Akinori Ito, Takanobu Oba, Takashi Konashi, Motoyuki Suzuki, Shozo Makino
Flexible dialogue management using distributed and dynamic dialogue control
Mikko Hartikainen, Markku Turunen, Jaakko Hakulinen, Esa-Pekka Salonen, J. Adam Funk
Contextual revision in information seeking conversation systems
Keith Houck
Cross domain dialogue modelling: an object-based approach
Ian O'Neill, Philip Hanna, Xingkun Liu, Michael McTear
A comparison of confirmation styles for error handling in a speech dialog system
Hirohiko Sagawa, Teruko Mitamura, Eric Nyberg
Using computer simulation to compare two models of mixed-initiative
Fan Yang, Peter A. Heeman
Towards understanding mixed-initiative in task-oriented dialogues
Fan Yang, Peter A. Heeman, Kristy Hollingshead
Spokenquery: an alternate approach to choosing items with speech
Peter Wolf, Joseph Woelfel, Jan Van Gemert, Bhiksha Raj, David Wong
Mining customer care dialogs for "daily news"
Shona Douglas, Deepak Agarwal, Tirso Alonso, Robert Bell, Mazin Rahim, Deborah F. Swayne, Chris Volinsky
Higgins - a spoken dialogue system for investigating error handling techniques
Jens Edlund, Gabriel Skantze, Rolf Carlson
A conversational dialogue system for cognitively overloaded users
Fuliang Weng, Lawrence Cavedon, Badri Raghunathan, Danilo Mirkovic, Hua Cheng, Hauke Schmidt, Harry Bratt, Rohit Mishra, Stanley Peters, Sandra Upson, Elizabeth Shriberg, Carsten Bergmann, Lin Zhao
Modeling generic dialog applications for embedded systems
Gerhard Hanrieder, Stefan W. Hamerich
A framework for dialogue data collection with a simulated ASR channel
Matthew N. Stuttle, Jason D. Williams, Steve Young
A multi-layer conversation management approach for information seeking applications
Shimei Pan
A universal speech interface for appliances
Thomas Kevin Harris, Roni Rosenfeld
Speech understanding, dialogue management and response generation in corpus-based spoken dialogue system
Keita Hayashi, Yuki Irie, Yukiko Yamaguchi, Shigeki Matsubara, Nobuo Kawaguchi
Implementation of dialog applications in an open-source VoiceXML platform
Fernando Fernandez, Valentin Sama, Luis F. D'Haro, Ruben San-Segundo, Ricardo de Córdoba, Juan Manuel Montero
Fuzzy logic decision fusion in a multimodal biometric system
Chun Wai Lau, Bin Ma, Helen Mei-Ling Meng, Yiu-Sang Moon, Yeung Yam
A state model for the realization of visual perceptive feedback in SmartKom
Peter Poller, Norbert Reithinger
A vector-based method for efficiently representing multivariate environmental information
Akemi Iida, Yoshito Ueno, Ryohei Matsuura, Kiyoaki Aikawa
A multi-modal dialog system for a mobile robot
Ioannis Toptsis, Shuyin Li, Britta Wrede, Gernot A. Fink
Structured interview-based evaluation of spoken multimodal conversation with H.C. Andersen
Niels Ole Bernsen, Laila Dybkjaer
Memory efficient decoding graph compilation with wide cross-word acoustic context
Miroslav Novak, Vladimir Bergl
Dynamic beam pruning strategy using adaptive control
Dongbin Zhang, Limin Du
Fast on-the-fly composition for weighted finite-state transducers in 1.8 million-word vocabulary continuous speech recognition
Takaaki Hori, Chiori Hori, Yasuhiro Minami
A hybrid word / phoneme-based approach for improved vocabulary-independent search in spontaneous speech
Peng Yu, Frank Torsten Bernd Seide
Keyword spotting for highly inflectional languages
Lubos Smidl, Ludek Müller
Optimizing an engine network that allows dynamic masking
Frédéric Tendeau
Topic structure extraction for meeting indexing
Katsutoshi Ohtsuki, Nobuaki Hiroshima, Yoshihiko Hayashi, Katsuji Bessho, Shoichi Matsunaga
Automatic detection of dialog acts based on multilevel information
Sophie Rosset, Lori Lamel
Identifying local corrections in human-computer dialogue
Gina-Anne Levow
Hot discussion or frosty dialogue? towards a temperature metric for conversational interactivity
Peter Reichl, Florian Hammer
A dynamic vocabulary spoken dialogue interface
Stephanie Seneff, Chao Wang, Lee Hetherington, Grace Chung
Learning dialogue policies using state aggregation in reinforcement learning
Matthias Denecke, Kohji Dohsaka, Mikio Nakano
Segmenting ambiguous phrases using phoneme duration
Keren B. Shatzman
A compensation method for word-familiarity difference with SNR control in intelligibility test
Shuichi Sakamoto, Yo-iti Suzuki, Shigeaki Amano, Tadahisa Kondo, Naoki Iwaoka
Phoneme-based word activation in spoken-word recognition: evidence from Japanese school children
Takashi Otake, Yoko Sakamoto, Yasuyuki Konomi
Role of segmental and suprasegmental cues in the perception of Maghrebian-accented French
Belynda Brahimi, Philippe Boula de Mareuil, Cedric Gendrot
Effect of speaking rate on the acceptability of change in segment duration
Hiroaki Kato, Yoshinori Sagisaka, Minoru Tsuzaki, Makiko Muto
A cross-linguistic study of diphthongs in spoken word processing in Japanese and English
Kiyoko Yoneyama
Speech translation: past, present and future
Alex Waibel
Multilingual corpora for speech-to-speech translation research
Genichiro Kikui, Toshiyuki Takezawa, Seiichi Yamamoto
Statistical machine translation and its challenges
Hermann Ney
Translingual grammar induction
John Lee, Stephanie Seneff
Usability considerations of speech-to-speech translation system
Youngjik Lee, Jun Park, Seung-Shin Oh
Worldwide ongoing activities on multilingual speech to speech translation
Gianni Lazzari, Alex Waibel, Chengqing Zong
The automatic news transcription system: ANTS, some real time experiments
Dominique Fohr, Odile Mella, Christophe Cerisara, Irina Illina
Use of metadata to improve recognition of spontaneous speech and named entities
Bhuvana Ramabhadran, Olivier Siohan, Geoffrey Zweig
Duration modeling techniques for continuous speech recognition
Janne Pylkkonen, Mikko Kurimo
Large vocabulary continuous speech recognition for Estonian using morpheme classes
Tanel Alumae
Combining agglomerative and tree-based state clustering for high accuracy acoustic modeling
Zhaobing Han, Shuwu Zhang, Bo Xu
Parallel tone score association method for tone language speech recognition
William S-Y. Wang, Gang Peng
Effective acoustic modeling for rate-of-speech variation in large vocabulary conversational speech recognition
Jing Zheng, Horacio Franco, Andreas Stolcke
Automatic transcription of continuous speech using unsupervised and incremental training
G.L. Sarada Ghadiyaram, N. Hemalatha Nagarajan, T. Nagarajan Thangavelu, Hema A. Murthy
Very large vocabulary speech recognition system for automatic transcription of Czech broadcast programs
Jan Nouza, Dana Nejedlova, Jindrich Zdansky, Jan Kolorenc
Speech recognition error analysis on the English MALACH corpus
Olivier Siohan, Bhuvana Ramabhadran, Geoffrey Zweig
A frame level boosting training scheme for acoustic modeling
Rong Zhang, Alexander Rudnicky
Optimizing boosting with discriminative criteria
Rong Zhang, Alexander Rudnicky
Restructuring HMM states for speaker adaptation in Mandarin speech recognition
Xianghua Xu, Qiang Guo, Jie Zhu
A discriminative locally weighted distance measure for speaker independent template based speech recognition
Mike Matton, Mathias De Wachter, Dirk Van Compernolle, Ronald Cools
Deterministic annealing EM algorithm in parameter estimation for acoustic model
Yohei Itaya, Heiga Zen, Yoshihiko Nankaku, Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura
TRAP based features for LVCSR of meeting data
Frantisek Grezl, Martin Karafiat, Jan Cernocky
Optimal acoustic and language model weights for minimizing word verification errors
Frank K. Soong, Wai Kit Lo, Satoshi Nakamura
Structuring of baseball live games based on speech recognition using task-dependent knowledge
Atsushi Sako, Yasuo Ariki
A two-level schema for detecting recognition errors
Zhengyu Zhou, Helen Meng
Large vocabulary continuous speech recognition based on cross-morpheme phonetic information
In-Jeong Choi, Nam-Hoon Kim, Su Youn Yoon
Automatic phonetic base form generation based on maximum context tree
Changxue Ma
Dictionary refinements based on phonetic consensus and non-uniform pronunciation reduction
Gustavo Hernandez-Abrego, Lex Olorenshaw, Raquel Tato, Thomas Schaaf
Transcription of Arabic broadcast news
Abdel Messaoudi, Lori Lamel, Jean-Luc Gauvain
Spontaneous speech recognition using a massively parallel decoder
Takahiro Shinozaki, Sadaoki Furui
Issues in meeting transcription - the ISL meeting transcription system
Tanja Schultz, Qin Jin, Kornel Laskowski, Yue Pan, Florian Metze, Christian Fügen
Multi-pass ASR using vocabulary expansion
Katsutoshi Ohtsuki, Nobuaki Hiroshima, Shoichi Matsunaga, Yoshihiko Hayashi
Pinched lattice minimum Bayes risk discriminative training for large vocabulary continuous speech recognition
Vlasios Doumpiotis, William Byrne
Task-specific minimum Bayes-risk decoding using learned edit distance
Izhak Shafran, William Byrne
Apply n-best list re-ranking to acoustic model combinations of boosting training
Rong Zhang, Alexander Rudnicky
Using VTLN for broadcast news transcription
D. Y. Kim, S. Umesh, M. J. F. Gales, T. Hain, P. C. Woodland
From Switchboard to meetings: development of the 2004 ICSI-SRI-UW meeting recognition system
Andreas Stolcke, Chuck Wooters, Ivan Bulyko, Martin Graciarena, Scott Otterson, Barbara Peskin, Mari Ostendorf, Dave Gelbart, Nikki Mirghafori, Tuomo Pirinen
An efficient repair procedure for quick transcriptions
Anand Venkataraman, Andreas Stolcke, Wen Wang, Dimitra Vergyri, Jing Zheng, Venkata Ramana Rao Gadde
Tone information as a confidence measure for improving Cantonese LVCSR
Yao Qian, Tan Lee, Frank K. Soong
Temporal variables in parkinsonian speech
Danielle Duez
Speaker adaptation of a three-dimensional tongue model
Olov Engwall
Perception of non-native phonemes in noise
Nicole Cooper, Anne Cutler
Intelligibility of degraded speech from smeared STRAIGHT spectrum
Hideki Kawahara, Hideki Banno, Toshio Irino, Jiang Jin
Sound source localization based on zero-crossing peak-amplitude coding
Young-Ik Kim, Rhee Man Kil
Adult and infant sensitivity to phonotactic features in spoken Japanese
Kajikawa Sachiyo, Fais Laurel, Amano Shigeaki, Werker Janet
Revisiting dysarthria assessment intelligibility metrics
Phil Green, James Carmichael
The effect of intonation on perception of Cantonese lexical tones
Valter Ciocca, Tara L. Whitehill, Joan K.-Y. Ma
Maximum short quantity in Japanese and Finnish in two perception tests with F0 and dB variants
Toshiko Isei-Jaakkola
Evaluation of an inverse filtering technique using physical modeling of voice production
Paavo Alku, Matti Airas, Brad Story
Positional and phonotactic effects on the realization of Taiwan Mandarin tone 2
Hui-ju Hsu, Janice Fon
Speech production based on lossy tube models: unit concatenation and sound transitions
Karl Schnell, Arild Lacroix
Modelling and ranking of differences across formants of British, Australian and American accents
Qin Yan, Saeed Vaseghi, Dimitrios Rentzos, Ching-Hsiang Ho
An experimental method for measuring transfer functions of acoustic tubes
Tatsuya Kitamura, Satoru Fujita, Kiyoshi Honda, Hironori Nishimoto
Estimation of the vocal tract spectrum from articulatory movements using phoneme-dependent neural networks
Takuya Tsuji, Tokihiko Kaburagi, Kohei Wakamiya, Jiji Kim
Computation of the acoustic characteristics of vocal-tract models with geometrical perturbation
Kunitoshi Motoki, Hiroki Matsuzaki
Analysis of hypernasality by synthesis
P. Vijayalakshmi, M. Ramasubba Reddy
Adaptive long-term predictive analysis of disordered speech
Abdellah Kacha, Francis Grenez, Frédéric Bettens, Jean Schoentgen
Phoneme restoration in degraded speech communication
Slobodan Jovicic, Sandra Antesevic, Zoran Saric
Automatic detection of vocal fold paralysis and edema
Maria Marinaki, Constantine Kotropoulos, Ioannis Pitas, Nikolaos Maglaveras
A theoretical analysis of speech recognition based on feature trajectory models
Yasuhiro Minami, Erik McDermott, Atsushi Nakamura, Shigeru Katagiri
Discriminative combination of multiple linear predictions for speech recognition
Zhijian Ou, Wang Zuoying
Use of formants in stressed and unstressed continuous speech recognition
Davood Gharavian, Mohammad Ahadi
Integration of articulatory dynamic parameters in HMM/BN based speech recognition system
Konstantin Markov, Satoshi Nakamura, Jianwu Dang
ASR on speech reconstructed from short-time Fourier phase spectra
Leigh David Alsteris, Kuldip K. Paliwal
Estimation of semantic confidences on lattice hierarchies
Robert Lieb, Tibor Fabian, Guenther Ruske, Matthias Thomae
Learning subject drift for topic tracking
Fumiyo Fukumoto, Yoshimi Suzuki
The ICSI-SRI-UW metadata extraction system
Elizabeth Shriberg, Andreas Stolcke, Dustin Hillard, Mari Ostendorf, Barbara Peskin, Mary Harper, Yang Liu
Automatic detection of contrast for speech understanding
Mark Hasegawa-Johnson, Stephen Levinson, Tong Zhang
Integrating layer concept information into n-gram modeling for spoken language understanding
Nick Jui-Chang Wang, Jia-Lin Shen, Ching-Ho Tsai
A robust understanding model for spoken dialogues
Junyan Chen, Ji Wu, Zuoying Wang
Belief-based nonlinear rescoring in Thai speech understanding
Chai Wutiwiwatchai, Sadaoki Furui
An understanding strategy based on plausibility score in recognition history using CSR confidence measure
Toshihiko Itoh, Atsuhiko Kai, Yukihiro Itoh, Tatsuhiro Konishi
Speech recognition error correction using maximum entropy language model
Sangkeun Jung, Minwoo Jeong, Gary Geunbae Lee
Discriminative training of compound-word based multinomial classifiers for speech routing
Xiang Li, Juan Huerta
An information extraction approach for spoken language understanding
Jihyun Eun, Changki Lee, Gary Geunbae Lee
A maximum entropy shallow functional parser for spoken language understanding
David Horowitz, Partha Lal, Pierce Gerard Buckley
Mixture language models for call routing
Qiang Huang, Stephen Cox
Speech act identification using an ontology-based partial pattern tree
Chung-Hsien Wu, Jui-Feng Yeh, Ming-Jun Chen
Creating speech recognition grammars from regular expressions for alphanumeric concepts
Ye-Yi Wang, Yun-Cheng Ju
Poetry assistant
Isabel Trancoso, Paulo Araujo, Ceu Viana, Nuno Mamede
Automatic extraction of key sentences from oral presentations using statistical measure based on discourse markers
Tasuku Kitade, Tatsuya Kawahara, Hiroaki Nanjo
Robust dependency parsing of spontaneous Japanese speech and its evaluation
Tomohiro Ohno, Shigeki Matsubara, Nobuo Kawaguchi, Yasuyoshi Inagaki
Strategies for optimizing a stochastic spoken natural language parser
Wolfgang Minker, Dirk Buehler, Christiane Beuschel
Prolongation in spontaneous Mandarin
Tzu-Lun Lee, Ya-Fang He, Yun-Ju Huang, Shu-Chuan Tseng, Robert Eklund
Speech intention understanding based on decision tree learning
Yuki Irie, Shigeki Matsubara, Nobuo Kawaguchi, Yukiko Yamaguchi, Yasuyoshi Inagaki
Using simple speech-based features to detect the state of a meeting and the roles of the meeting participants
Satanjeev Banerjee, Alexander Rudnicky
An acoustic study of emotions expressed in speech
Serdar Yildirim, Murtaza Bulut, Chul Min Lee, Abe Kazemzadeh, Zhigang Deng, Sungbok Lee, Shrikanth Narayanan, Carlos Busso
Topic classification and verification modeling for out-of-domain utterance detection
Tatsuya Kawahara, Ian Richard Lane, Tomoko Matsui, Satoshi Nakamura
Partially lexicalized parsing model utilizing rich features
So-Young Park, Yong-Jae Kwak, Joon-Ho Lim, Hae-Chang Rim, Soo-Hong Kim
Clustering similar nouns for selecting related news articles
Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi
Chinese text word-segmentation considering semantic links among sentences
Leonardo Badino
Syllable-based probabilistic morphological analysis model of Korean
Do-Gil Lee, Hae-Chang Rim
Scoring unknown speaker clustering: VB vs. BIC
Fabio Valente, Christian Wellekens
Speaker segmentation and clustering in meetings
Qin Jin, Tanja Schultz
Speaker diarization from speech transcripts
Lori Lamel, Jean-Luc Gauvain, Leonardo Canseco-Rodriguez
Evolutive speaker segmentation using a repository system
Xavier Anguera Miro, Javier Hernando Pericas
Speaker indexing in audio archives using test utterance Gaussian mixture modeling
Hagai Aronowitz, David Burshtein, Amihood Amir
Automated lexical adaptation and speaker clustering based on pronunciation habits for non-native speech recognition
Antoine Raux
Scalable distributed speech recognition using multi-frame GMM-based block quantization
Kuldip K. Paliwal, Stephen So
Robust speech recognition over packet networks: an overview
Naveen Srinivasamurthy, Kyu Jeong Han, Shrikanth Narayanan
Theory for speaker recognition over IP
Thomas Eriksson, Samuel Kim, Hong-Goo Kang, Chungyong Lee
Voice portal services in packet network and VoIP environment
Wu Chou, Feng Liu
Synchronization of speaker selection for centralized tandem free VoIP conferencing
Peter Kabal, Colm Elliott
Measuring the perceived importance of time- and frequency-divided speech blocks for transmitting over packet networks
Akitoshi Kataoka, Yusuke Hiwasaki, Toru Morinaga, Jotaro Ikedo
Comparison of transmitter-based packet-loss recovery techniques for voice transmission
Moo Young Kim, W. Bastiaan Kleijn
Context dependent "long units" for speech recognition
Denis Jouvet, Ronaldo Messina
Rapid EM training based on model-integration
Shinichi Yoshizawa, Kiyohiro Shikano
Experiments on the accuracy of phone models and liaison processing in a French broadcast news transcription system
Dominique Fohr, Odile Mella, Irina Illina, Christophe Cerisara
A statistical discrimination measure for hidden Markov models based on divergence
Jorge Silva, Shrikanth Narayanan
A hybrid SVM/HMM acoustic modeling approach to automatic speech recognition
Jan Stadermann, Gerhard Rigoll
Data driven number-of-states selection in HMM topologies
Dirk Knoblauch
Hybrid model using subspace distribution clustering hidden Markov models and semi-continuous hidden Markov models for embedded speech recognizers
Youngkyu Cho, Sung-a Kim, Dongsuk Yook
Fast clustering of Gaussians and the virtue of representing Gaussians in exponential model format
Peder Olsen, Karthik Visweswariah
Feature-based pronunciation modeling with trainable asynchrony probabilities
Karen Livescu, James Glass
Maximum entropy direct model as a unified model for acoustic modeling in speech recognition
Hong-Kwang Jeff Kuo, Yuqing Gao
Explicit duration modeling for Cantonese connected-digit recognition
Yu Zhu, Tan Lee
Four-layer categorization scheme of fast GMM computation techniques in large vocabulary continuous speech recognition systems
Arthur Chan, Ravishankar Mosur, Alexander Rudnicky, Jahanzeb Sherwani
Compact acoustic model for embedded implementation
Junho Park, Hanseok Ko
Increasing the mixture components of non-uniform HMM structures based on a variational Bayesian approach
Takatoshi Jitsuhiro, Satoshi Nakamura
Comparison of ML, MAP, and VB based acoustic models in large vocabulary speech recognition
Panu Juhani Somervuo
Discriminative training with tied covariance matrices
Wolfgang Macherey, Ralf Schlüter, Hermann Ney
Acoustic phonetic modeling using local codebook features
Frank Diehl, Asuncion Moreno
An efficient codebook design in SDCHMM for mobile communication environments
Gue Jun Jung, Su-Hyun Kim, Yung-Hwan Oh
Analysis of speaking styles by two-dimensional visualization of aggregate of acoustic models
Makoto Shozakai, Goshu Nagino
Context dependent phoneme duration modeling with tree-based state tying
Myoung-Wan Koo, Ho-Hyun Jeon, Sang-Hong Lee
Towards better understanding of the model implied by the use of dynamic features in HMMs
John Scott Bridle
Chinese prosody phrase break prediction based on maximum entropy model
Jian-Feng Li, Guo-Ping Hu, Renhua Wang
Intonation modeling for Indian languages
Krothapalli Sreenivasa Rao, Bayya Yegnanarayana
Using multiple linguistic features for Mandarin phrase break prediction in maximum-entropy classification framework
Yu Zheng, Gary Geunbae Lee, Byeongchang Kim
Using part-of-speech for predicting phrase breaks
Ian Read, Stephen Cox
A proposal to quantitatively select the right intonation unit in data-driven intonation modeling
David Escudero-Mancebo, Valentin Cardenoso-Payo
Formulating contextual tonal variations in Mandarin
Jinfu Ni, Hisashi Kawai, Keikichi Hirose
Automatic adaptation of the momel F0 stylisation algorithm to new corpora
Salma Mouline, Olivier Boeffard, Paul Bagshaw
Joint extraction and prediction of Fujisaki's intonation model parameters
Pablo Daniel Aguero, Klaus Wimmer, Antonio Bonafonte
Evaluation of corpus based tone prediction in mismatched environments for Greek TTS synthesis
Panagiotis Zervas, Nikos Fakotakis, George Kokkinakis, George Kouroupetroglou, Gerasimos Xydas
The duration of pitch transition phase and its relative factors
Ziyu Xiong, Juanwen Chen
Polynomial regression model for duration prediction in Mandarin
Yu Hu, Renhua Wang, Lu Sun
Prediction of the glottal LF parameters using regression trees
Michelle Tooher, John G. McKenna
Bonntempo-corpus and bonntempo-tools: a database for the study of speech rhythm and rate
Volker Dellwo, Bianca Aschenberner, Petra Wagner, Jana Dancovicova, Ingmar Steiner
Analysis of F0 contours of Cantonese utterances based on the command-response model
Wentao Gu, Keikichi Hirose, Hiroya Fujisaki
Pre-focal rephrasing, focal enhancement and postfocal deaccentuation in French
Marion Dohen, Helene Loevenbruck
Duration modeling for Hindi text-to-speech synthesis system
Sridhar Krishna Nemala, Partha Pratim Talukdar, Kalika Bali, A. G. Ramakrishnan
A new prosodic phrasing model for Indian language Telugu
Nemala Sridhar Krishna, Hema A. Murthy
Evolutionary optimization of an adaptive prosody model
Oliver Jokisch, Michael Hofmann
An intonation model for embedded devices based on natural F0 samples
Gerasimos Xydas, Georgios Kouroupetroglou
Prosodic characteristics of Czech contrastive topic
Katerina Vesela, Nino Peterek, Eva Hajicova
Combination of standard and throat microphones for robust speech recognition in highly noisy environments
Martin Graciarena, Federico Cesari, Horacio Franco, Greg Myers, Cregg Cowan, Victor Abrash
Noise robust digit recognition using a glottal radar sensor for voicing detection
Cenk Demiroglu, David Anderson
A cepstral domain maximum likelihood beamformer for speech recognition
Dominik Raub, John McDonough, Matthias Wölfel
Recognition of three simultaneous utterance of speech by four-line directivity microphone mounted on head of robot
Naoya Mochiki, Tetsunori Kobayashi, Toshiyuki Sekiya, Tetsuji Ogawa
Complex spectrum circle centroid for microphone-array-based noisy speech recognition
Shigeki Sagayama, Okajima Takashi, Kamamoto Yutaka, Nishimoto Takuya
Automatic speech recognition of co-channel speech: integrated speaker and speech recognition approach
Larry Heck, Mark Mao
A first experience on multilingual acoustic modeling of the languages spoken in Morocco
José B. Marino, Asuncion Moreno, Albino Nogueiras
Data driven multidialectal phone set for Spanish dialects
Monica Caballero, Asuncion Moreno, Albino Nogueiras
Multilingual e-mail text processing for speech synthesis
Daniela Oria, Akos Vetek
Multi-context rules for phonological processing in polyglot TTS synthesis
Harald Romsdorfer, Beat Pfister
A general approach to TTS reading of mixed-language texts
Leonardo Badino, Claudia Barolo, Silvia Quazza
Context dependent statistical augmentation of Persian transcripts
Panayiotis G. Georgiou, Shrikanth S. Narayanan, Hooman Shirani Mehr
A soft decision MMSE amplitude estimator as a noise preprocessor to speech coders using a glottal sensor
Cenk Demiroglu, David V. Anderson
Single acoustic-channel speech enhancement based on glottal correlation using non-acoustic sensor
Rongqiang Hu, David V. Anderson
In-vehicle based speech processing for hearing impaired subjects
Xianxian Zhang, John H. L. Hansen, Kathryn Arehart, Jessica Rossi-Katz
Speech enhancement using adaptive time-domain segmentation
Sriram Srinivasan, W. Bastiaan Kleijn
Harmonicity based monaural speech dereverberation with time warping and F0 adaptive window
Tomohiro Nakatani, Keisuke Kinoshita, Masato Miyoshi, Parham S. Zolfaghari
Dereverberation of speech signals based on linear prediction
Marc Delcroix, Takafumi Hikichi, Masato Miyoshi
Perception of affect in speech - towards an automatic processing of paralinguistic information in spoken conversation
Nick Campbell
Analysis of emotional speech in voice mail messages: the influence of speakers' gender
Noel Chateau, Valerie Maffiolo, Christophe Blouin
Emotion recognition based on phoneme classes
Chul Min Lee, Serdar Yildirim, Murtaza Bulut, Abe Kazemzadeh, Carlos Busso, Zhigang Deng, Sungbok Lee, Shrikanth Narayanan
Visualizing dynamic features of expressions in speech
Peter Robinson, Tal Sobol Shikler
Friendly speech analysis and perception in standard Chinese
Aijun Li, Haibo Wang
Decomposing linguistic and affective components of phonatory quality
Ailbhe Ní Chasaide, Christer Gobl
Classifying emotion in Chinese speech by decomposing prosodic features
Dan-Ning Jiang, Lian-Hong Cai
Detecting user engagement in everyday conversations
Chen Yu, Paul Aoki, Allison Woodruff
Identifying emotion in speech prosody using acoustical cues of harmony
Takashi Fujisawa, Norman D. Cook
Context based emotion detection from text input
Jianhua Tao
Complex emotion recognition system for a specific user using SOM based on prosodic features
Atsushi Iwai, Yoshikazu Yano, Shigeru Okuma
Emotion verification for emotion detection and unknown emotion rejection
Hoon-Young Cho, Kaisheng Yao, Te-Won Lee
Improvement in corpus-based generation of F0 contours using generation process model for emotional speech synthesis
Keikichi Hirose
Continuous speech recognition using joint features derived from the modified group delay function and MFCC
Rajesh Mahanand Hegde, Hema A. Murthy, Venkata Ramana Rao Gadde
Phase-space representation of speech
Hua Yu
The modified group delay feature: a new spectral representation of speech
Hema A. Murthy, Rajesh Mahanand Hegde, Venkata Ramana Rao Gadde
ICA-based feature extraction for phoneme recognition
Oh-Wook Kwon, Te-Won Lee
On using MLP features in LVCSR
Qifeng Zhu, Barry Chen, Nelson Morgan, Andreas Stolcke
Learning long-term temporal features in LVCSR using neural networks
Barry Chen, Qifeng Zhu, Nelson Morgan
Neural "spike rate spectrum" as a noise robust, speaker invariant feature for automatic speech recognition
T. V. Sreenivas, G. V. Kiran, A. G. Krishna
An adaptive MEL-LPC analysis for speech recognition
Yoshihisa Nakatoh, Makoto Nishizaki, Shinichi Yoshizawa, Maki Yamada
Improvement in robustness of speech feature extraction method using sub-band based periodicity and aperiodicity decomposition
Kentaro Ishizuka, Noboru Miyazaki, Tomohiro Nakatani, Yasuhiro Minami
A new acoustic measure for aspiration noise detection
Carlos Toshinori Ishi
Synthesizing speech from speech recognition parameters
Kris Demuynck, Oscar Garcia, Dirk Van Compernolle
LP-TRAP: linear predictive temporal patterns
Marios Athineos, Hynek Hermansky, Daniel P.W. Ellis
Parallel feature generation based on maximizing normalized acoustic likelihood
Xiang Li, Richard Stern
An adaptive band-partitioning spectral entropy based speech detection in realistic noisy environments
Kun-Ching Wang
Improved voice activity detection combining noise reduction and subband divergence measures
Javier Ramirez, José Carlos Segura, Carmen Benitez, Angel de la Torre, Antonio Rubio
Voice activity detection using global soft decision with mixture of Gaussian model
Kiyoung Park, Changkyu Choi, Jeongsu Kim
Environmental robust features for speech detection
Thomas Kemp, Climent Nadeu, Yin Hay Lam, Josep Maria Sola i Caros
Crosscorrelation-based multispeaker speech activity detection
Kornel Laskowski, Qin Jin, Tanja Schultz
Improved robustness of time-frequency principal components (TFPC) by synergy of methods in different domains
Shang-nien Tsai
A quantitative model for formant dynamics and contextually assimilated reduction in fluent speech
Li Deng, Yu Dong, Alex Acero
DWT-based classification of acoustic-phonetic classes and phonetic units
Gernot Kubin, Van Tuan Pham
Learning nonnegative features of spectro-temporal sounds for classification
Yong-Choon Cho, Seungjin Choi
N-gram language modeling of Japanese using bunsetsu boundaries
Sungyup Chung, Keikichi Hirose, Nobuaki Minematsu
Dynamic language modeling for broadcast news
Langzhou Chen, Lori Lamel, Jean-Luc Gauvain, Gilles Adda
A unified framework for large vocabulary speech recognition of mutually unintelligible Chinese "regionalects"
Ren-Yuan Lyu, Dau-Cheng Lyu, Min-Siong Liang, Min-Hong Wang, Yuang-Chin Chiang, Chun-Nan Hsu
The influence of target size and distance on the production of speech and gesture in multimodal referring expressions
Ielka van der Sluis, Emiel Krahmer
Dynamic time windows for multimodal input fusion
Anurag Kumar Gupta, Tasos Anastasakos
MICot: a tool for multimodal input data collection
Raymond H. Lee, Anurag Kumar Gupta
Simulating multimodal applications
Chakib Tadj, Hicham Djenidi, Madjid Haouani, Amar Ramdane-Cherif, Nicole Levy
A multimodal communication aid for global aphasia patients
Jakob Schou Pedersen, Paul Dalsgaard, Borge Lindberg
Mis-recognized utterance detection using hierarchical language model
Hirofumi Yamamoto, Genichiro Kikui, Yoshinori Sagisaka
Cross-lingual phoneme mapping for multilingual synthesis systems
Marko Moberg, Kimmo Parssinen, Juha Iso-Sipila
Robot motion control using listener's back-channels and head gesture information
Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno, Tsuyoshi Tasaki, Takeshi Yamaguchi
Indonesian speech recognition for hearing and speaking impaired people
Sakriani Sakti, Arry Akhmad Arman, Satoshi Nakamura, Paulus Hutagaol
A two phase Arabic language model for speech recognition and other language applications
Mohsen Rashwan
Language model adaptation based on PLSA of topics and speakers
Yuya Akita, Tatsuya Kawahara
Unified language modeling using finite-state transducers with first applications
Hans J. G. A. Dolfing, Pierce Gerard Buckley, David Horowitz
Effects of language modeling on speech-driven question answering
Katsunobu Itou, Atsushi Fujii, Tomoyosi Akiba
Measuring convergence in language model estimation using relative entropy
Abhinav Sethy, Shrikanth Narayanan, Bhuvana Ramabhadran
High-level feature weighted GMM network for audio stream classification
Rongqing Huang, John H. L. Hansen
An improved preprocessor for the automatic transcription of broadcast news audio stream
Jindrich Zdansky, Petr David, Jan Nouza
Speaker-and-environment change detection in broadcast news using the common component GMM-based divergence measure
Yih-Ru Wang, Chi-Han Huang
Beginning of utterance detection algorithm for low complexity ASR engines
Tommi Lahti
Convolutional networks for speech detection
Somsak Sukittanon, Arun C. Surendran, John C. Platt, Chris J.C. Burges
Detection of vowel onset points in continuous speech using autoassociative neural network models
Suryakanth V. Gangashetty, Chellu Chandra Sekhar, Bayya Yegnanarayana
Reconstruction filter design for bone-conducted speech
Toshiki Tamiya, Tetsuya Shimamura
Frequency warped ARMA analysis of the closed and the open phase of voiced speech
Pedro J. Quintana-Morales, Juan L. Navarro-Mesa
Zeros of z-transform (ZZT) decomposition of speech for source-tract separation
Boris Doval, Baris Bozkurt, Christophe D'Alessandro, Thierry Dutoit
Use of neural network mapping and extended Kalman filter to recover vocal tract resonances from the MFCC parameters of speech
Li Deng, Roberto Togneri
Graphical model approach to pitch tracking
Xiao Li, Jonathan Malkin, Jeff Bilmes
A new multicomponent AM-FM demodulation with predicting frequency boundaries and its application to formant estimation
Bo Xu, Jianhua Tao, Yongguo Kang
A concurrent curve strategy for formant tracking
Yves Laprie
A formant tracking LP model for speech processing
Qin Yan, Esfandiar Zavarehei, Saeed Vaseghi, Dimitrios Rentzos
Application of long-term filtering to formant estimation
Hong You
A method for glottal formant frequency estimation
Baris Bozkurt, Thierry Dutoit, Boris Doval, Christophe D'Alessandro
Improved differential phase spectrum processing for formant tracking
Baris Bozkurt, Thierry Dutoit, Boris Doval, Christophe D'Alessandro
MAP prediction of pitch from MFCC vectors for speech reconstruction
Xu Shao, Ben P. Milner
New harmonicity measures for pitch estimation and voice activity detection
An-Tze Yu, Hsiao-Chuan Wang
Multi-pitch trajectory estimation of concurrent speech based on harmonic GMM and nonlinear Kalman filtering
Takuya Nishimoto, Shigeki Sagayama, Hirokazu Kameoka
Automatic pitch marking and reconstruction of glottal closure instants from noisy and deformed electro-glotto-graph signals
Attila Ferencz, Jeongsu Kim, Yong-Beom Lee, Jae-Won Lee
On the use of a weighted autocorrelation based fundamental frequency estimation for a multidimensional speech input
Federico Flego, Luca Armani, Maurizio Omologo
A minimum mean squared error estimator for single channel speaker separation
Aarthi M. Reddy, Bhiksha Raj
Audio source separation from the mixture using empirical mode decomposition with independent subspace analysis
Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu
Audio watermarking in sub-band signals using multiple echo kernels
In-Jung Oh, Hyun-Yeol Chung, Jae-Won Cho, Ho-Youl Jung, R. Prost
A piecewise interpolation method based on log-least square error criterion for HRTF
Jie Zhang, Zhenyang Wu
Modified realizable frequency warped ARMA modeling and its application in synthesis structures for voiced speech
Juan L. Navarro-Mesa, Pedro J. Quintana-Morales
Time-scaling of speech using independent subspace analysis
R. Muralishankar, A. G. Ramakrishnan, Lakshmish N. Kaushik
Long term modeling of phase trajectories within the speech sinusoidal model framework
Laurent Girin, Mohammad Firouzmand, Sylvain Marchand
An acoustic shock limiting algorithm using time and frequency domain speech features
Tina Soltani, Dave Hermann, Etienne Cornu, Hamid Sheikhzadeh, Rob Brennan
Speech probability distribution based on generalized gamma distribution
Jong Won Shin, Joon-Hyuk Chang, Nam Soo Kim
Stop consonant classification by dynamic formant trajectory
Yanli Zheng, Mark Hasegawa-Johnson, Sarah Borys
Estimating detailed spectral envelopes using articulatory clustering
Yoshinori Shiga, Simon King
From real-time MRI to 3D tongue movements
Olov Engwall
Coarticulatory variability and directionality in [s,..]: an EPG study
Mitsuhiro Nakamura
Flow representation through the glottis having a polygonal boundary shape
Yosuke Tanabe, Tokihiko Kaburagi
Analysis of the voice source in different phonation types: simultaneous high-speed imaging of the vocal fold vibration and glottal inverse filtering
Hannu Pulakka, Paavo Alku, Svante Granqvist, Stellan Hertegard, Hans Larsson, Anne-Maria Laukkanen, Per-Ake Lindestad, Erkki Vilkman
Influence of temporal discretization schemes on formant frequencies and bandwidths in time domain simulations of the vocal tract system
Peter Birkholz, Dietmar Jackel
Acoustic-to-articulatory inversion mapping with Gaussian mixture model
Tomoki Toda, Alan Black, Keiichi Tokuda
Audio-visual spoken language processing
Jinyoung Kim, Jeesun Kim, Chris Davis
Issues in the development of auditory-visual speech perception: adults, infants, and children
Kaoru Sekiyama, Denis Burnham
Signaling and detecting uncertainty in audiovisual speech by children and adults
Emiel Krahmer, Marc Swerts
Effect of intensive audiovisual perceptual training on the perception and production of the /l/-/r/ contrast for Japanese learners of English
Valerie Hazan, Anke Sennema, Andrew Faulkner
Visual recalibration of auditory speech versus selective speech adaptation: different build-up courses
Jean Vroomen, Sabine van Linden, Beatrice de Gelder, Paul Bertelson
Of the top of the head: audio-visual speech perception from the nose up
Chris Davis, Jeesun Kim
Aspects of speaking-face data corpus design methodology
J. Bruce Millar, Michael Wagner, Roland Goecke
Modeling audio-visual speech perception: back on fusion architectures and fusion control
Jean-Luc Schwartz, Marie Cathiard
Neurocognition of speech-specific audiovisual perception
Mikko Sams, Ville Ojanen, Jyrki Tuomainen, Vasily Klucharev
Target practice on talking faces
Adriano Vilela Barbosa, Eric Vatikiotis-Bateson, Andreas Daffertshofer
Audiovisual perceptual evaluation of resynthesised speech movements
Matthias Odisio, Gérard Bailly
Video-realistic synthetic speech with a parametric visual speech synthesizer
Sascha Fagel
Mutual information based visual feature selection for lipreading
Patricia Scanlon, Gerasimos Potamianos, Vit Libal, Stephen M. Chu
AVICAR: audio-visual speech corpus in a car environment
Bowon Lee, Mark Hasegawa-Johnson, Camille Goudeseune, Suketu Kamdar, Sarah Borys, Ming Liu, Thomas Huang
Adaptive classifier cascade for multimodal speaker identification
Engin Erzin, Yucel Yemez, A. Murat Tekalp
Use of visual cues in the perception of a labial/labiodental contrast by Spanish-L1 and Japanese-L1 learners of English
Midori Iba, Anke Sennema, Valerie Hazan, Andrew Faulkner
Audio-visual speaker localization for car navigation systems
Xianxian Zhang, Kazuya Takeda, John H. L. Hansen, Toshiki Maeno
Automatic lips reading for audio-visual speech processing and recognition
Josef Chaloupka
Liveness verification in audio-video authentication
Michael Wagner, Girija Chetty
Speech recognition using motion based lipreading
Maria José Sanchez Martinez, Juan Pablo de la Cruz Gutierrez
Comparative study of linear and non-linear models for viseme inversion: modeling of a cortical associative function
Frédéric Berthommier
3D lip-tracking for audio-visual speech recognition in real applications
Petr Cisar, Zdenek Krnoul, Milos Zelezny
The audio-video Australian English speech data corpus AVOZES
J. Bruce Millar, Roland Goecke
Correcting Korean vowel speech recognition errors with limited lip features
Ki-Hyung Hong, Yong-Ju Lee, Jae-Young Suh, Kyong-Nim Lee
Segmental differences in the visual contribution to speech intelligibility
Kuniko Nielsen
Voice conversion for unknown speakers
Hui Ye, Steve Young
Domain adaptation methods in the IBM trainable text-to-speech system
Volker Fischer, Jaime Botella Ordinas, Siegfried Kunzmann
Applying pitch connection control in Mandarin speech synthesis
Yi Zhou, Yiqing Zu, Zhenli Yu, Dongjian Yue, Guilin Chen
A first step towards text-independent voice conversion
Hermann Ney, David Suendermann, Antonio Bonafonte, Harald Hoege
Data pruning approach to unit selection for inventory generation of concatenative embeddable Chinese TTS systems
Zhenli Yu, Kaizhi Wang, Yiqing Zu, Dongjian Yue, Guilin Chen
Subjective evaluation of join cost functions used in unit selection speech synthesis
Jithendra Vepa, Simon King
Constructing emotional speech synthesizers with limited speech database
Heiga Zen, Tadashi Kitamura, Murtaza Bulut, Shrikanth Narayanan, Ryosuke Tsuzuki, Keiichi Tokuda
A two-phase pitch marking method for TD-PSOLA synthesis
Cheng-Yuan Lin, Jyh-Shing Roger Jang
Including dynamic and phonetic information in voice conversion systems
Antonio Bonafonte, Alexander Kain, Jan van Santen, Helenca Duxans
A novel voice conversion system based on codebook mapping with phoneme-tied weighting
Zixiang Wang, Renhua Wang, Zhiwei Shuang, Zhenhua Ling
Compression of speech database by feature separation and pattern clustering using STRAIGHT
Zhenhua Ling, Yu Hu, Zhiwei Shuang, Renhua Wang
Decision-tree backing-off in HMM-based speech synthesis
Shunsuke Kataoka, Nobuaki Mizutani, Keiichi Tokuda, Tadashi Kitamura
Using a depth-restricted search to reduce delays in unit selection
Nobuyuki Nishizawa, Hisashi Kawai
MLLR adaptation for hidden semi-Markov model based speech synthesis
Junichi Yamagishi, Takashi Masuko, Takao Kobayashi
Phoxsy: multi-phone segments for unit selection speech synthesis
Stefan Breuer, Julia Abresch
Perception-guided and phonetic clustering weight tuning based on diphone pairs for unit selection TTS
Francesc Alias, Xavier Llora, Ignasi Iriondo, Joan Claudi Socoro, Xavier Sevillano, Lluis Formiga
A voice conversion method based on joint pitch and spectral envelope transformation
Taoufik En-Najjary, Olivier Rosec, Thierry Chonavel
Fast GMM-based voice conversion for text-to-speech synthesis systems
Taoufik En-Najjary, Olivier Rosec, Thierry Chonavel
A genetic algorithm for unit selection based speech synthesis
Rohit Kumar
A memory efficient grapheme-to-phoneme conversion system for speech processing
Jun Huang, Lex Olorenshaw, Gustavo Hernandez-Abrego, Lei Duan
Automatic pruning of unit selection speech databases for synthesis without loss of naturalness
Rohit Kumar, S. Prahallad Kishore
A database design for a TTS synthesis system using lexical diphones
Tanya Lambert, Andrew Breen
A family-of-models approach to HMM-based segmentation for unit selection speech synthesis
John Kominek, Alan W Black
Mutual-information based segment pre-selection in concatenative text-to-speech
Wei Zhang, Ling Jin, Xijun Ma
Hidden semi-Markov model based speech synthesis
Heiga Zen, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura
DFW-based spectral smoothing for concatenative speech synthesis
Hartmut R. Pfitzinger
Korean prosody generation and artificial neural networks
Kyung-Joong Min, Un-Cheon Lim
A prosodic phrasing model for a Korean text-to-speech synthesis system
Kyuchul Yoon
A comparison of statistical methods and features for the prediction of prosodic structures
Qin Shi, Volker Fischer
Letter-to-sound for small-footprint multilingual TTS engine
Guilin Chen, Ke-Song Han
Grapheme-to-phoneme conversion for Chinese text-to-speech
Jun Xu, Guohong Fu, Haizhou Li
XML representation languages as a way of interconnecting TTS modules
Marc Schröder, Stefan Breuer
Approach to interchange-format based Chinese generation
Wenjie Cao, Chengqing Zong, Bo Xu
Prosodic analysis of a multi-style corpus in the perspective of emotional speech synthesis
Enrico Zovato, Stefano Sandri, Silvia Quazza, Leonardo Badino
Number of output nodes of artificial neural networks for Korean prosody generation
Kyung-Joong Min, Chan-Goo Kang, Un-Cheon Lim
A Korean grapheme-to-phoneme conversion system using selection procedure for exceptions
Sunhee Kim, Ju-Eun Ahn, Soon-Hyob Kim, Yang-Hee Lee
Synthesis of vowels and tones in Thai language by articulatory modeling
Thanate Khaorapapong, Montri Karnjanadecha, Keerati Inthavisas
Source-filter separation for articulation-to-speech synthesis
Yoshinori Shiga, Simon King
Long vowel detection for letter-to-sound conversion for Japanese sourced words transliterated into the alphabet
Asano Hisako, Nakajima Hideharu, Mizuno Hideyuki, Oku Masahiro
Inexactness and robustness in cepstral-to-formant transformation of spoken and sung vowels
Frantz Clermont, Thomas John Millhouse
Analysis of acoustic features affecting "singing-ness" and its application to singing-voice synthesis from speaking-voice
Takeshi Saitou, Naoya Tsuji, Masashi Unoki, Masato Akagi
Statistical corpus-based speech segmentation
Vincent Pollet, Geert Coorman
Recent improvements on ARTIC: Czech text-to-speech system
Jindrich Matousek, Jan Romportl, Daniel Tihelka, Zbynek Tychtl
Learning for transliteration of Arabic-numeral expressions using decision tree for Korean TTS
HyeonSook Nam, Youngim Jung, Donghun Lee, Hyuk-chul Kwon, Aesun Yoon
How to integrate phonetic and linguistic knowledge in a text-to-phoneme conversion task: a syllabic TPC tool for French
Nicole Beringer
Reconciling pronunciation differences between the front-end and the back-end in the IBM speech synthesis system
Wael Hamza, Ellen Eide, Raimo Bakis
High quality text-to-pinyin conversion using two-phase unknown word prediction
Juhong Ha, Yu Zheng, Gary Geunbae Lee, Yoon-Suk Seong, Byeongchang Kim
Pronunciation lexicon adaptation for TTS voice building
Yeon-Jun Kim, Ann Syrdal, Alistair Conkie
Improving letter-to-pronunciation accuracy with automatic morphologically-based stress prediction
Gabriel Webster
The IBM expressive speech synthesis system
Wael Hamza, Ellen Eide, Raimo Bakis, Michael Picheny, John Pitrelli
What concept-to-speech can gain for prosody
Markus Schnell, Rüdiger Hoffmann
Dependency structure analysis and sentence boundary detection in spontaneous Japanese
Tatsuya Kawahara, Kiyotaka Uchimoto, Hitoshi Isahara, Kazuya Shitaoka
Statistical feature language model
Salma Jamoussi, David Langlois, Jean-Paul Haton, Kamel Smaili
Vocabulary and language model adaptation using information retrieval
Brigitte Bigi, Yan Huang, Renato De Mori
Word n-gram probability estimation from a Japanese raw corpus
Shinsuke Mori, Daisuke Takuma
Mining of association patterns for language modeling
Jen-Tzung Chien, Hung-Ying Chen
On latent semantic language modeling and smoothing
Jen-Tzung Chien, Meng-Sung Wu, Hua-Jui Peng
Conditional maximum likelihood estimation for improving annotation performance of n-gram models incorporating stochastic finite state grammars
Vaibhava Goel
Fast parameter estimation for joint maximum entropy language models
Edward James Schofield
Morphology-based language modeling for Arabic speech recognition
Dimitra Vergyri, Katrin Kirchhoff, Kevin Duh, Andreas Stolcke
Speech enhanced multi-span language model
A. Nayeemulla Khan, Bayya Yegnanarayana
Neural network language models for conversational speech recognition
Holger Schwenk, Jean-Luc Gauvain
A PLSA-based language model for conversational telephone speech
David Mrva, Philip C. Woodland
Segmentation and relevance measure for speaker verification
Jerome Louradour, Regine André-Obrecht, Khalid Daoudi
A new nonlinear feature extraction algorithm for speaker verification
Mohamed Chetouani, Bruno Gas, Jean-Luc Zarader, Marcos Faundez-Zanuy
SVM modeling of "SNERF-grams" for speaker recognition
Elizabeth Shriberg, Luciana Ferrer, Anand Venkataraman, Sachin Kajarekar
SVM kernel adaptation in speaker classification and verification
Purdy Ho, Pedro Moreno
Noise-robust speaker verification using F0 features
Koji Iwano, Taichi Asami, Sadaoki Furui
Eigen-prosody analysis for robust speaker recognition under mismatch handset environment
Zi-He Chen, Yuan-Fu Liao, Yau-Tarng Juang
Triphone-based confidence system for speaker identification
Aaron Lawson, Mark Huggins
Improved model training and automatic weight adjustment for multi-SNR multi-band speaker identification system
Kenichi Yoshida, Kazuyuki Takagi, Kazuhiko Ozeki
A new approach to channel robust speaker verification via constrained stochastic feature transformation
Man-Wai Mak, Kwok-Kwong Yiu, Ming-Cheung Cheung, Sun-Yuan Kung
Best speaker-based structure tree for speaker verification
Chakib Tadj, Christian Gargour, Nabil Badri
Robust speaker identification based on perceptual log area ratio and Gaussian mixture models
David Chow, Waleed Abdulla
Channel frequency response correction for speaker recognition
Stanley Wenndt, Richard Floyd
Unseen handset mismatch compensation based on a priori knowledge interpolation for robust speaker recognition
Jyh-Her Yang, Yuan-Fu Liao
A comparison of soft and hard spectral subtraction for speaker verification
Michael Padilla, Thomas Quatieri
Comparison of several speaker verification procedures based on GMM
Vlasta Radova, Ales Padrta
Improving performance of text-independent speaker identification by utilizing contextual principal curves filtering
Yong Guan, Wenju Liu, Hongwei Qi, Jue Wang
Speaker identification using probabilistic PCA model selection
Jen-Tzung Chien, Chuan-Wei Ting
Text independent speaker recognition using speaker dependent word spotting
Hagai Aronowitz, David Burshtein, Amihood Amir
A study on model-based equal error rate estimation for automatic speaker verification
Hsiao-Chuan Wang, Jyh-Min Cheng
Probabilistic speaker identification with dual penalized logistic regression machine
Tomoko Matsui, Kunio Tanabe
Model quality evaluation during enrolment for speaker verification
Javier R. Saeta, Javier Hernando
Real-time speaker identification
Pasi Franti, Evgeny Karpov, Tomi Kinnunen
Multi-codebook vector quantization algorithm for speaker identification
Mohammed Abu El-Yazeed, Nemat Abdel Kader, Mohammed El-Henawy
Multi-sample fusion with constrained feature transformation for robust speaker verification
Ming-Cheung Cheung, Kwok-Kwong Yiu, Man-Wai Mak, Sun-Yuan Kung
Speaker diarization using bottom-up clustering based on a parameter-derived distance between adapted GMMs
Michael Betser, Frédéric Bimbot, Mathieu Ben, Guillaume Gravier
Time-frequency analysis of vocal source signal for speaker recognition
Nengheng Zheng, P. C. Ching, Tan Lee
A novel method for two-speaker segmentation
Rashmi Gangadharaiah, Balakrishnan Narayanaswamy, Narayanaswamy Balakrishnan
Throat microphone signal for speaker recognition
Bayya Yegnanarayana, A. Shahina, M. R. Kesheorey
Posteriori probabilities and likelihoods combination for speech and speaker recognition
Mohamed Faouzi Ben Zeghiba, Hervé Bourlard
The use of typical sequences for robust speaker identification
Mohamed Mihoubi, Douglas O'Shaughnessy, Pierre Dumouchel
A forensic phonetic investigation into the duration and speech rate
KyungHwa Kim
Mixture Gaussian model training against impostor model parameters: an application to speaker identification
T. V. Sreenivas, Sameer Badaskar
Jacobian adaptation with improved noise reference for speaker verification
Jan Anguita, Javier Hernando, Alberto Abad
Objective wavelet packet features for speaker verification
Mihalis Siafarikas, Todor Ganchev, Nikos Fakotakis
Policy analysis framework for conversational biometrics
Upendra V. Chaudhari, Ganesh N. Ramaswamy
A new score normalization method for speaker verification with virtual impostor model
Woo-Yong Choi, Jung Gon Kim, Hyung Soon Kim, Sung Bum Pan
On the time variability of vocal tract for speaker recognition
Samuel Kim, Thomas Eriksson, Hong-Goo Kang
Distributed speaker recognition
Veena Desai, Hema A. Murthy
Cluster-dependent modeling and confidence measure processing for in-set/out-of-set speaker identification
Pongtep Angkititrakul, Sepideh Baghaii, John H. L. Hansen
Distributed speaker recognition using earth mover's distance
Yoshiyuki Umeda, Shingo Kuroiwa, Satoru Tsuge, Fuji Ren
A forensically-motivated tool for selecting cepstrally-consistent steady-states from non-contemporaneous vowel utterances
Michael Barlow, Mehrdad Khodai-Joopari, Frantz Clermont
Scoring and direct methods for the interpretation of evidence in forensic speaker recognition
Anil Alexander, Andrzej Drygajlo
Efficient online cohort selection method for speaker verification
Tomi Kinnunen, Evgeny Karpov, Pasi Franti
Statistical model migration in speaker recognition
Jiri Navratil, Ganesh N. Ramaswamy, Ran D. Zilca
Latent semantic analysis for speaker recognition
A. Nayeemulla Khan, Bayya Yegnanarayana
Model-based sequential organization for cochannel speaker identification
Yang Shao, DeLiang Wang
Articulatory feature-based conditional pronunciation modeling for speaker verification
Ka-Yee Leung, Man-Wai Mak, Sun-Yuan Kung
A comparison of normalization and training approaches for ASR-dependent speaker identification
Alex Park, Timothy J. Hazen
New background modeling for speaker verification
Dat Tran
A trainable prosodic model: learning the contours implementing communicative functions within a superpositional model of intonation
Gérard Bailly, Bleicke Holm, Veronique Auberge
Fujisaki model based F0 contours in Vietnamese TTS
Dung Tien Nguyen, Mai Chi Luong, Bang Kim Vu, Hansjoerg Mixdorff, Huy Hoang Ngo
Estimating speaking rate in spontaneous speech from z-scores of pattern durations
Kazuyuki Ashimura, Hideki Kashioka, Nick Campbell
A style control technique for HMM-based speech synthesis
Takashi Masuko, Takao Kobayashi, Keisuke Miyanaga
Children's emotion recognition in an intelligent tutoring scenario
Mark Hasegawa-Johnson, Stephen Levinson, Tong Zhang
Use of prosodic features for speech recognition
Keikichi Hirose, Nobuaki Minematsu
Transformation-based error correction for speech-to-text systems
Jochen Peters, Christina Drexel
Phone classification in pseudo-Euclidean vector spaces
Alexander Gutkin, Simon King
Combining linguistic knowledge and acoustic information in automatic pronunciation lexicon generation
Grace Chung, Chao Wang, Stephanie Seneff, Ed Filisko, Min Tang
Modeling pronunciation variation using artificial neural networks for English spontaneous speech
Ken Chen, Mark Hasegawa-Johnson
Foreign-accented speaker-independent speech recognition
Stefanie Aalburg, Harald Hoege
Non-audible murmur (NAM) speech recognition using a stethoscopic NAM microphone
Panikos Heracleous, Yoshitaka Nakajima, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano
Recognition of read and spontaneous children's speech using two new corpora
Martin Russell, Shona D'Arcy, Lit Ping Wong
Articulatory feature recognition using dynamic Bayesian networks
Joe Frankel, Mirjam Wester, Simon King
Predicting word correct rate from acoustic and linguistic confusability
Gies Bouwman, Bert Cranen, Lou Boves
Disambiguation in determining phonemes of sound-imitation words for environmental sound recognition
Kazushi Ishihara, Yuya Hattori, Tomohiro Nakatani, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Word confusability prediction in automatic speech recognition
Jan Anguita, Stephane Peillon, Javier Hernando, Alexandre Bramoulle
Adaptation for soft whisper recognition using a throat microphone
Szu-Chen Jou, Tanja Schultz, Alex Waibel
A statistical lexicon for non-native speech recognition
Rainer Gruhn, Konstantin Markov, Satoshi Nakamura
Modeling auxiliary features in tandem systems
Mathew Magimai Doss, Shajith Ikbal, Todd Stephenson, Hervé Bourlard
Survey of spontaneous speech phenomena in a multimodal dialogue system and some implications for ASR
Louis ten Bosch, Lou Boves
Speech recognition for multiple non-native accent groups with speaker-group-dependent acoustic models
Tobias Cincarek, Rainer Gruhn, Satoshi Nakamura
Coping with disfluencies in spontaneous speech recognition
Frederik Stouten, Jean-Pierre Martens
Speaker model quantization for unsupervised speaker indexing
Soonil Kwon, Shrikanth Narayanan
Investigating automatic recognition of non-native children's speech
Matteo Gerosa, Diego Giuliani
Using machine learning to cope with imbalanced classes in natural speech: evidence from sentence boundary and disfluency detection
Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Mary Harper
Hybrid utterance verification based on n-best models and model derived from Kullback-Leibler divergence
Minho Jin, Gyucheol Jang, Sungrack Yun, Chang D. Yoo
Speech spotter: on-demand speech recognition in human-human conversation on the telephone or in face-to-face situations
Masataka Goto, Koji Kitayama, Katsunobu Itou, Tetsunori Kobayashi
Pronunciation lexicon modeling and design for Korean large vocabulary continuous speech recognition
Kyong-Nim Lee, Minhwa Chung
Performance of speech recognition and synthesis in packet-based networks
Sebastian Möller, Jan Felix Krebber, Alexander Raake
A comparison of packet loss compensation methods and interleaving for speech recognition in burst-like packet loss
Alastair Bruce James, Ben P. Milner, Angel Manuel Gomez
An analysis of packet loss models for distributed speech recognition
Ben P. Milner, Alastair Bruce James
Pronunciation assessment based upon the phonological distortions observed in language learners' utterances
Nobuaki Minematsu
Analysis of the phone level contributions to objective evaluation of English speech by non-natives
Yasuo Suzuki, Yoshinori Sagisaka, Katsuhiko Shirai, Makiko Muto
An interactive English pronunciation dictionary for Korean learners
Chao Wang, Mitchell Peabody, Stephanie Seneff, Jong-mi Kim
Development of the knowledge-based spoken English evaluation system and its application
Seok-Chae Rhee, Jeon G. Park
Theory and data in spoken language assessment
Jared Bernstein, Isabella Barbier, Elizabeth Rosenfeld, John H.A.L. de Jong
Practical use of English pronunciation system for Japanese students in the CALL classroom
Tatsuya Kawahara, Masatake Dantsuji, Yasushi Tsubota
Design strategies for a virtual language tutor
Jonas Beskow, Olov Engwall, Bjorn Granstrom, Preben Wik
Evaluating cognitive load in spoken language interfaces using a dual-task paradigm
Ellen Campana, Michael K. Tanenhaus, James F. Allen, Roger W. Remington
The voice-logbook: integrating human factors for a chronic care system
Lesley-Ann Black, Norman Black, Roy Harper, Michelle Lemon, Michael McTear
Communicative competence and adaptation in a spoken dialogue system
Kristiina Jokinen
Evaluation of the difference between the driving behavior of a speech based and a speech-visual based task of an in-car computer
Zhan Fu, Lay Ling Pow, Fang Chen
Evaluating system metaphors via the speech output of a smart home system
Sebastian Möller, Jan Felix Krebber, Paula M. T. Smeele
Elements of interactivity in telephone conversations
Florian Hammer, Peter Reichl, Alexander Raake
Generating gestures from speech
Ruben San-Segundo, Juan Manuel Montero, Javier Macias-Guarasa, Ricardo de Córdoba, Javier Ferreiros, José Manuel Pardo
Subtopic segmentation in the lecture speech
Noboru Kanedera, Sumida Asuka, Takao Ikehata, Tetsuo Funada
Some articulatory measurements of real sadness
Donna Erickson, Caroline Menezes, Akinori Fujino
Application of voice conversion to hearing-impaired Mandarin speech enhancement
Chen-Long Lee, Wen-Whei Chang, Yuan-Chuan Chiang
A Japanese dialogue-based CALL system with mispronunciation and grammar error detection
Oh Pyo Kweon, Akinori Ito, Motoyuki Suzuki, Shozo Makino
Statistics-based direction finding for training vowels
Cheolwoo Jo, Ilsuh Bak
Reference marking in children's computer-directed speech: an integrated analysis of discourse and gestures
Simona Montanari, Serdar Yildirim, Elaine Andersen, Shrikanth Narayanan
What makes a non-native accent?: a study of Korean English
Jong-mi Kim, Suzanne Flynn
Study on emotional speech features in Korean with its application to voice color conversion
Sang-Jin Kim, Kwang-Ki Kim, Minsoo Hahn
Developmental changes in voiced-segment ratio for Japanese infants and parents
Shigeaki Amano, Tomohiro Nakatani, Tadahisa Kondo
Implementation of an intonational quality assessment system for a handheld device
Kisun You, Hoyoun Kim, Wonyong Sung
Characterizing and classifying cued speech vowels from labial parameters
Denis Beautemps, Thomas Burger, Laurent Girin
Cough detection in spoken dialogue system for home health care
Shin-ya Takahashi, Tsuyoshi Morimoto, Sakashi Maeda, Naoyuki Tsuruta
Unsupervised learning from users' error correction in speech dictation
Dong Yu, Mei-Yuh Hwang, Peter Mau, Alex Acero, Li Deng
Robustness aspects of active learning for acoustic modeling
Gerard G. L. Meyer, Teresa M. Kamm
Task adaptation of acoustic and language models based on large quantities of data
Karthik Visweswariah, Ramesh Gopinath, Vaibhava Goel
Unsupervised language model adaptation methods for spontaneous speech
Luc Lussier, Edward W.D. Whittaker, Sadaoki Furui
On-line incremental adaptation based on reinforcement learning for robust speech recognition
Masafumi Nishida, Yoshitaka Mamiya, Yasuo Horiuchi, Akira Ichikawa
Unsupervised speaker adaptation using high confidence portion recognition results by multiple recognition systems
Tomohiro Watanabe, Hiromitsu Nishizaki, Takehito Utsuro, Seiichi Nakagawa
Speech coding using trajectory compression and multiple sensors
Sorin Dusan, James Flanagan, Amod Karve, Mridul Balaraman
How sparse can we make the auditory representation of speech?
Christian Feldbauer, Gernot Kubin
Efficient sub-optimal temporal decomposition with dynamic weighting of speech signals for coding applications
Malah David, Slava Shectman
Perceptual wavelet packet audio coder
Teddy Surya Gunawan, Eliathamby Ambikairajah, Julien Epps
Performance analysis of transcoding algorithms in packet-loss environments
Sung-Kyo Jung, Hong-Goo Kang, Dae-Hee Youn, Chang-Heon Lee
Speech quality estimation using Gaussian mixture models
Tiago Falk, Wai-Yip Chan, Peter Kabal
Why speech recognizers make errors? a robustness view
Hong Kook Kim, Mazin Rahim
An energy normalization scheme for improved robustness in speech recognition
Mohammad Ahadi, Hamid Sheikhzadeh, Robert Brennan, George Freeman
Rapid on-line environment compensation for server-based speech recognition in noisy mobile environments
Juan Huerta, Etienne Marcheret, Sreeram Balakrishnan
Modeling phones coarticulation effects in a neural network based speech recognition system
Leila Ansary, Seyyed Ali Seyyed Salehi
Error-weighted discriminative training for HMM parameter estimation
Daniel Willett
Robust verification of recognized words in noise
Wai Kit Lo, Frank K. Soong, Satoshi Nakamura
Robust automatic speech recognition using an optimal spectral amplitude estimator algorithm in low-SNR car environments
Zili Li, Hesham Tolba, Douglas O'Shaughnessy
Robust speech recognition using data-driven temporal filters based on independent component analysis
Junhui Zhao, Jingming Kuang, Xiang Xie
Robust distant speech recognition based on position dependent CMN
Norihide Kitaoka, Longbiao Wang, Seiichi Nakagawa
Robust speech recognition based on HMM composition and modified Wiener filter
Sumitaka Sakauchi, Yoshikazu Yamaguchi, Satoshi Takahashi, Satoshi Kobashikawa
Feature-dependent compensation in speech recognition
Ivan Brito, Nestor Becerra Yoma, Carlos Molina
Using context to correct phone recognition errors
Stephen Cox
Improved histogram-based feature compensation for robust speech recognition and unsupervised speaker adaptation
Yasunari Obuchi
Weighting observation vectors for robust speech recognition in noisy environments
Zhenyu Xiong, Fang Zheng, Wenhu Wu
Hands-free speech recognition using blind source separation post-processed by two-stage spectral subtraction
Masanori Tsujikawa, Ken-ichi Iso
Robust speech recognition with spectral subtraction in low SNR
Randy Gomez, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano
Active perception: using a priori knowledge from clean speech models to ignore non-target features
Bert Cranen, Johan de Veth
Spectral subtraction with full-wave rectification and likelihood controlled instantaneous noise estimation for robust speech recognition
Haitian Xu, Zheng-Hua Tan, Paul Dalsgaard, Borge Lindberg
Using linear interpolation to improve histogram equalization for speech recognition
Filipp Korkmazsky, Dominique Fohr, Irina Illina
A factorial HMM approach to robust isolated digit recognition in background music
Mark Hasegawa-Johnson, Ameya Deoras
Multi-eigenspace normalization for robust speech recognition in noisy environments
Yoonjae Lee, Hanseok Ko
Exploiting models intrinsic robustness for noisy speech recognition
Christophe Cerisara, Dominique Fohr, Odile Mella, Irina Illina
Speech recognition experiments with the SPEECON database using several robust front-ends
Pere Pujol, Jaume Padrell, Climent Nadeu, Dusan Macho
Spectro-temporal activity pattern (STAP) features for noise robust ASR
Shajith Ikbal, Mathew Magimai Doss, Hemant Misra, Hervé Bourlard
Improvement of confidence measure performance using background model set algorithm
Byoung-Don Kim, Jin-Young Kim, Seung-Ho Choi, Young-Bum Lee, Kyoung-Rok Lee
Using RASTA in task independent TANDEM feature extraction
Guillermo Aradilla, John Dines, Sunil Sivadas
A distributed speech recognition system in multi-user environments
Kyu Jeong Han, Shrikanth Narayanan, Naveen Srinivasamurthy
Soft features for improved distributed speech recognition over wireless networks
Reinhold Haeb-Umbach, Valentin Ion
Analysis on disappearing and thriving of speech applications for ergonomic design guidelines and recommendations
Rinzou Ebukuro
Evaluation of the speech output of a smart-home system in a car environment
Paula M. T. Smeele, Sebastian Möller, Jan Felix Krebber
How does the integration of speech recognition controls and spatialized auditory displays affect user workload?
Ellen Haas
Speech interaction system - how to increase its usability?
Fang Chen
Human language acquisition methods in a machine learning task
Nicole Beringer
New challenges in usability evaluation - beyond task-oriented spoken dialogue systems
Laila Dybkjaer, Niels Ole Bernsen, Wolfgang Minker
Using quick transcriptions to improve conversational speech models
Owen Kimball, Chia-lin Kao, Rukmini Iyer, Teodoro Arvizo, John Makhoul
A Wizard of Oz framework for collecting spoken human-computer dialogs
Rohit Mishra, Elizabeth Shriberg, Sandra Upson, Joyce Chen, Fuliang Weng, Stanley Peters, Lawrence Cavedon, John Niekrasz, Hua Cheng, Harry Bratt
Subjective evaluation of spoken dialogue systems using SERVQUAL method
Mikko Hartikainen, Esa-Pekka Salonen, Markku Turunen
Fiction database for emotion detection in abnormal situations
Ioana Vasilescu, Laurence Devillers, Chloe Clavel, Thibaut Ehrette
Fast semi-automatic semantic annotation for spoken dialog systems
Ruhi Sarikaya, Yuqing Gao, Paola Virga
A study on automatic detection of Japanese vowel devoicing for speech synthesis
Yi-Jian Wu, Hisashi Kawai, Jinfu Ni, Renhua Wang
Orientel-Turkish: telephone speech database description and notes on the experience
Tolga Ciloglu, Dinc Acar, Ahmet Tokatli
Intertranscriber reliability of prosodic labeling on telephone conversation using ToBI
Tae-Jin Yoon, Sandra Chavarria, Jennifer Cole, Mark Hasegawa-Johnson
Efficient compression method for pronunciation dictionaries
Jilei Tian
Construct a multi-lingual speech corpus in Taiwan with extracting phonetically balanced articles
Min-siong Liang, Dau-cheng Lyu, Yuang-chin Chiang, Renyuan Lyu
Automatic prosody labeling of read Norwegian
Per Olav Heggtveit, Jon Emil Natvig
Towards automatic word segmentation of dialect speech
Eric Sanders, Andrea Diersen, Willy Jongenburger, Helmer Strik
New nonsense syllables database - analyses and preliminary ASR experiments
Petr Fousek, Frantisek Grezl, Hynek Hermansky, Petr Svojanovsky
Speech input and output module assessment for remote access to a smart-home spoken dialog system
Jan Felix Krebber, Sebastian Möller, Alexander Raake
An implement of speech DB gathering system using VoiceXML
Dong-Hyun Kim, Yong-Wan Roh, Kwang-Seok Hong
Precise phone boundary detection using wavelet packet and recurrent neural networks
Farshad Almasganj
From WER and RIL to MER and WIL: improved evaluation measures for connected speech recognition
Andrew Cameron Morris, Viktoria Maier, Phil Green
Design and construction of Korean-spoken English corpus
Seok-Chae Rhee, Sook-Hyang Lee, Young-Ju Lee, Seok-Keun Kang
Exploring XML-based technologies and procedures for quality evaluation from a real-life case perspective
Folkert De Vriend, Giulio Maltese
Spoken language interface in ECMA/ISO telecommunication standards
Kuansan Wang
The efficient generation of pronunciation dictionaries: machine learning factors during bootstrapping
Marelie Davel, Etienne Barnard
Towards a new level of annotation detail of multilingual speech corpora
Anja Geumann
CIAIR in-car speech database
Nobuo Kawaguchi, Shigeki Matsubara, Yukiko Yamaguchi, Kazuya Takeda, Fumitada Itakura
Investigating speech style specific pronunciation variation in large spoken language corpora
Christophe Van Bael, Henk van den Heuvel, Helmer Strik
The efficient generation of pronunciation dictionaries: human factors during bootstrapping
Marelie Davel, Etienne Barnard
Modeling data entry rates for ASR and alternative input methods
Roger K. Moore
Speech recognition using synchronization between speech and finger tapping
Hiromitsu Ban, Chiyomi Miyajima, Katsunobu Itou, Fumitada Itakura, Kazuya Takeda
Integration patterns during multimodal interaction
Anurag Kumar Gupta, Tasos Anastasakos
Efficient likelihood computation in multi-stream HMM based audio-visual speech recognition
Etienne Marcheret, Stephen M. Chu, Vaibhava Goel, Gerasimos Potamianos
Separation of multiple concurrent speeches using audio-visual speaker localization and minimum variance beam-forming
Changkyu Choi, Donggeon Kong, Hyoung-Ki Lee, Sang Min Yoon
Multimodal expression for humanoid robots by integration of human speech mimicking and facial color
Tokitomo Ariyoshi, Kazuhiro Nakadai, Hiroshi Tsujino
Towards large vocabulary ASR on embedded platforms
Miroslav Novak
Analysis of in-car speech recognition experiments using a large-scale multi-mode dialogue corpus
Hiroshi Fujimura, Katsunobu Itou, Kazuya Takeda, Fumitada Itakura
On the integration of speech recognition into personal networks
Zheng-Hua Tan, Paul Dalsgaard, Borge Lindberg
Robust speech recognition in client-server scenarios
Richard Rose, Hong Kook Kim
Memory and computation reduction for embedded ASR systems
Sangbae Jeong, Iksang Han, Eugene Jon, Jeongsu Kim
Canonicalization of feature parameters for automatic speech recognition
Takashi Fukuda, Tsuneo Nitta
On binary and ratio time-frequency masks for robust speech recognition
Soundararajan Srinivasan, Nicoleta Roman, DeLiang Wang
New features based on multiple word graphs for utterance verification
Alberto Sanchis, Alfons Juan, Enrique Vidal
Combination of speech features using smoothed heteroscedastic linear discriminant analysis
Lukas Burget
Entropy based combination of tandem representations for noise robust ASR
Shajith Ikbal, Hemant Misra, Sunil Sivadas, Hynek Hermansky, Hervé Bourlard
Fast speech adaptation in linear spectral domain for additive and convolutional noise
Dongsuk Yook, Donghyun Kim
The MIT finite-state transducer toolkit for speech and language processing
Lee Hetherington
Question-answering in webtalk: an evaluation study
Junlan Feng, Srinivas Bangalore, Mazin Rahim
Automatic network optimization of voice applications
Juan Huerta, Chaitanya Ekanadham
Voicebuilder: a framework for automatic speech application development
Miguel Angel Rodriguez-Moreno, Heriberto Cuayahuitl, Juventino Montiel-Hernandez
On the development of telephone applications: some practical issues and evaluation
Andrea Facco, Daniele Falavigna, Roberto Gretter, Marcello Vigano
The GEMINI platform: semi-automatic generation of dialogue applications
Stefan Hamerich, Volker Schless, Basilis Kladis, Volker Schubert, Otilia Kocsis, Stefan Igel, Ricardo de Córdoba, Luis Fernando D'Haro, José Manuel Pardo
A packet loss concealment method using recursive linear prediction
Kazuhiro Kondo, Kiyoshi Nakagawa
On a n-gram model approach for packet loss concealment
Minkyu Lee, Imed Zitouni, Qiru Zhou
Efficient vector quantisation of line spectral frequencies using the switched split vector quantiser
Stephen So, Kuldip K. Paliwal
Enhancement of reverberant speech using excitation source information
M. Chaitanya, S. R. M. Prasanna, Bayya Yegnanarayana
Improving automatic speech recognition performance and speech intelligibility with harmonicity based dereverberation
Keisuke Kinoshita, Tomohiro Nakatani, Masato Miyoshi
Inner product based-multiband vector quantization for wideband speech coding at 16 kbps
Seung Yeol Lee, Nam Soo Kim, Joon-Hyuk Chang
Speech enhancement and recognition by integrating adaptive beamforming and Wiener filtering
Alberto Abad, Javier Hernando
Temporal normalization techniques for transform-type speech coding and application to split-band wideband coders
Kyung-Tae Kim, Sung-Kyo Jung, MiSuk Lee, Hong-Goo Kang, Dae Hee Youn
Interface for barge-in free spoken dialogue system using adaptive sound field control
Tatsunori Asai, Shigeki Miyabe, Hiroshi Saruwatari, Kiyohiro Shikano
Multi-mode harmonic transform excitation LPC coding for speech and music
Jong-Hark Kim, Jae-Hyun Shin, In-Sung Lee
Source separation using particle filters
Mital Gandhi, Mark Hasegawa-Johnson
Segmental speech coding model for storage applications
Anssi Ramo, Jani Nurminen, Sakari Himanen, Ari Heikkinen
Improved speech enhancement by applying time-shift property of DFT on Hankel matrices for signal subspace decomposition
Gwo-hwa Ju, Lin-shan Lee
Minimum phase compensation in speech coding using Hammerstein model
Jari Juhani Turunen, Juha Tanttu, Frank Cameron
Optimizing regression for in-car speech recognition using multiple distributed microphones
Weifeng Li, Fumitada Itakura, Kazuya Takeda
Speech enhancement based on magnitude estimation using the gamma prior
Weifeng Li, Kazuya Takeda, Fumitada Itakura, Huy Dat Tran
Unscented Kalman filtering of line spectral frequencies
Andrew Errity, John McKenna, Stephen Isard
Speech enhancement based on smoothing of spectral noise floor
Hyoung-Gook Kim, Thomas Sikora
Noise reduction using hybrid noise estimation technique and post-filtering
Junfeng Li, Masato Akagi
An adaptive Kalman filter for the enhancement of speech signals
Marcel Gabrea
Improved iterative Wiener filtering for non-stationary noise speech enhancement
T. V. Sreenivas, K. Sharath Rao, A. Sreenivasa Murthy
Highband spectrum envelope estimation of telephone speech using hard/soft-classification
Yasheng Qian, Peter Kabal
Hidden factor dynamic Bayesian networks for speech recognition
Filipp Korkmazsky, Murat Deviren, Dominique Fohr, Irina Illina
Design of compact acoustic models through clustering of tied-covariance Gaussians
Mark Mao, Vincent Vanhoucke
Model composition by Lagrange polynomial approximation for robust speech recognition in noisy environment
Chandra Kant Raut, Takuya Nishimoto, Shigeki Sagayama
A study of minimum classification error training for segmental switching linear Gaussian hidden Markov models
Jian Wu, Donglai Zhu, Qiang Huo
Speech recognition system robust to noise and speaking styles
Shigeki Matsuda, Takatoshi Jitsuhiro, Konstantin Markov, Satoshi Nakamura
The stochastic weighted Viterbi algorithm: a framework to compensate additive noise and low-bit rate coding distortion
Nestor Becerra Yoma, Ivan Brito, Carlos Molina
Shaping spoken input in user-initiative systems
Stefanie Tomko, Roni Rosenfeld
Etiology of user experience with natural language speech
Christopher Pavlovski, Jennifer Lai, Stella Mitchell
Side effect free dialogue management in a voice enabled procedure browser
Manny Rayner, Beth Ann Hockey
Example-based training of dialogue planning incorporating user and situation models
Ian Richard Lane, Tatsuya Kawahara, Shinichi Ueno
Prosody based attitude recognition with feature selection and its application to spoken dialog system as para-linguistic information
Shinya Fujie, Tetsunori Kobayashi, Daizo Yagi, Hideaki Kikuchi
MS connect: a fully featured auto-attendant: system design, implementation and performance
David Ollason, Yun-Cheng Ju, Siddharth Bhatia, Dan Herron, Jackie Liu
Adaptive beamforming combined with particle filtering for acoustic source localization
Reinhold Haeb-Umbach, Sven Peschke, Ernst Warsitz
Time delay estimation using weighted CPSP function
Hongseok Kwon, Siho Kim, Keunsung Bae
DOA estimation of speech signals using semi-blind source separation techniques
Ilyas Potamitis, Panos Zervas, Nikos Fakotakis
Blind separation of speech and sub-Gaussian signals in underdetermined case
SangGyun Kim, Chang D. Yoo
Adaptive cross-channel interference cancellation on blind signal separation outputs using source absence/presence detection and spectral subtraction
Gil-Jin Jang, Changkyu Choi, Yong-Beom Lee, Yung-Hwan Oh
A comparison of simultaneous 3-channel blind source separation to selective separation on channel pairs using 2-channel BSS
Erik Visser, Kwokleung Chan, Stanley Kim, Te-Won Lee
Towards a harmonious coexistence of spoken and written language
Hyun-Bok Lee
Towards a grammar of spoken language - prosody of ill-formed utterances and listener's understanding in discourse -
Miyoko Sugito
Automatic transformation of lecture transcription into document style using statistical framework
Tatsuya Kawahara, Kazuya Shitaoka, Hiroaki Nanjo
Automatic extraction of phonetically rich sentences from large text corpus of indian languages
Karunesh Arora, Sunita Arora, Kapil Verma, Shyam Sunder Agrawal
European initiatives to promote cooperation between speech and text communities
Nicoletta Calzolari
Evaluation of a threshold for detecting local slower phrases in Japanese spontaneous conversational speech
Keiichi Takamaru
Intonation recognition for Indonesian speech based on Fujisaki model
Nazrul Effendy, Ekkarit Maneenoi, Patavee Charnvivit, Somchai Jitapunkul
Efficient tone classification of speaker independent continuous Chinese speech using anchoring based discriminating features
Jin-Song Zhang, Satoshi Nakamura, Keikichi Hirose
Clause types and filled pauses in Japanese spontaneous monologues
Michiko Watanabe, Yasuharu Den, Keikichi Hirose, Nobuaki Minematsu
Effect of voice prosody on the decision making process in human-computer interaction
Yohei Yabuta, Yasuhiro Katagiri, Noriko Suzuki, Yugo Takeuchi
Alignment of human prosodic patterns for spoken dialogue systems
Noriko Suzuki, Yasuhiro Katagiri
Evaluation of a prosodic labeling system utilizing linguistic information
Shinya Kiriyama, Shigeyoshi Kitazawa
Functions of intonation boundaries during spoken language comprehension in English
Allison Blodgett
Voice activation using prosodic features
Marco Kühne, Matthias Wolff, Matthias Eichner, Rüdiger Hoffmann
The role of prosodic cues in word segmentation of Korean
Sahyang Kim
Default phrasing and attachment preference in Korean
Sun-Ah Jun
Modeling and recognition of phonetic and prosodic factors for improvements to acoustic speech recognition models
Sarah Borys, Aaron Cohen, Mark Hasegawa-Johnson, Jennifer Cole
The role of pitch range variation in the discourse structure and intonation structure of Korean
Eunjong Kong
Dependency analysis of read Japanese sentences using pause and F0 information: a speaker independent case
Kazuyuki Takagi, Kazuhiko Ozeki
Effects of prosodic boundaries on ambiguous syntactic clause boundaries in Japanese
Shari Speer, Soyoung Kang
The superior effectiveness of the F0 range for identifying the context from sounds without phonemes
Yasuko Nagasaki, Takanori Komatsu
A study of tone classification for continuous Thai speech recognition
Tan Li, Montri Karnjanadecha, Thanate Khaorapapong
An acoustic-analytic role for the deviation between the scansion and reading of poems
Key-Seop Kim, Un Lim, Dong-Il Shin
Estimating syntactic structure from prosodic features in Japanese speech
Tomoko Ohsuga, Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa
Perceptual discrimination of prosodic types and their preliminary acoustic analysis
Masahiko Komatsu, Tsutomu Sugawara, Takayuki Arai
DORIS, a multiagent/IP platform for multimodal dialogue applications
Johann L'Hour, Olivier Boeffard, Jacques Siroux, Laurent Miclet, Francis Charpentier, Thierry Moudenc
EVITA-RAD: an extensible enterprise voice porTAl - rapid application development tool
Yu Chen
Strategies to reduce design time in multimodal/multilingual dialog applications
Luis F. D'Haro, Ricardo de Córdoba, Ruben San-Segundo, Juan Manuel Montero, Javier Macias-Guarasa, José Manuel Pardo
Three-way system-user-expert interactions help you expand the capabilities of an existing spoken dialogue system
Gregory Aist
Florence: a dialogue manager framework for spoken dialogue systems
Giuseppe Di Fabbrizio, Charles Lewis
Recent progress of open-source LVCSR engine Julius and Japanese model repository
Tatsuya Kawahara, Akinobu Lee, Kazuya Takeda, Katsunobu Itou, Kiyohiro Shikano
Example-based spoken dialogue system with online example augmentation
Hiroya Murao, Nobuo Kawaguchi, Shigeki Matsubara, Yukiko Yamaguchi, Kazuya Takeda, Yasuyoshi Inagaki
Enhancing existing form-based dialogue managers with reasoning capabilities
Dirk Bühler
Robust and adaptive architecture for multilingual spoken dialogue systems
Markku Turunen, Esa-Pekka Salonen, Mikko Hartikainen, Jaakko Hakulinen
Towards ubiquitous task management
Porfirio Filipe, Nuno Mamede