doi: 10.21437/Interspeech.2007
ISSN: 2958-1796
On organic interfaces
Victor Zue
The neural basis of speech perception - a view from functional imaging
Sophie K. Scott
Computer-supported human-human multilingual communication
Alex Waibel, Keni Bernardin, Matthias Wölfel
Self-organization in the evolution of shared systems of speech sounds: a computational study
Pierre-Yves Oudeyer
Soft margin feature extraction for automatic speech recognition
Jinyu Li, Chin-Hui Lee
A fast optimization method for large margin estimation of HMMs based on second order cone programming
Yan Yin, Hui Jiang
Frame margin probability discriminative training algorithm for noisy speech recognition
Hao-Zheng Li, Douglas O'Shaughnessy
Hierarchical neural networks feature extraction for LVCSR system
Fabio Valente, Jithendra Vepa, Christian Plahl, Christian Gollan, Hynek Hermansky, Ralf Schlüter
Bhattacharyya error and divergence using variational importance sampling
Peder A. Olsen, John R. Hershey
Phoneme dependent frame selection preference
Tingyao Wu, Jacques Duchateau, Dirk Compernolle
An articulatory and acoustic study of "retroflex" and "bunched" American English rhotic sound based on MRI
Xinhui Zhou, Carol Y. Espy-Wilson, Mark Tiede, Suzanne Boyce
An MRI study of European Portuguese nasals
Paula Martins, Inês Carbone, Augusto Silva, António J. S. Teixeira
A four-cube FEM model of the extrinsic and intrinsic tongue muscles to simulate the production of vowel /i/
Sayoko Takano, Hiroki Matsuzaki, Kunitoshi Motoki
Performance evaluation of glottal quality measures from the perspective of vocal tract filter consistency
Juan Torres, Elliot Moore
Statistical identification of critical, dependent and redundant articulators
Veena D. Singampalli, Philip J. B. Jackson
An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping
Chao Qin, Miguel Á. Carreira-Perpiñán
Vocal tract length during speech production
Sorin Dusan
Approximation method of subglottal system using ARMA filter
Nobuhiro Miki, Kyohei Hayashi
Enhancing acoustic-to-EPG mapping with lip position information
Asterios Toutios, Konstantinos Margaritis
A model of glottal flow incorporating viscous-inviscid interaction
Tokihiko Kaburagi, Yosuke Tanabe
Thinking outside the cube: modeling language processing tasks in a multiple resource paradigm
Kilian G. Seeber
Experimental validation of direct and inverse glottal flow models for unsteady flow conditions
Julien Cisonni, Annemie Van Hirtum, Jan Willems, Xavier Pelorson
Effect of unsteady glottal flow on the speech production process
Hideyuki Nomura, Tetsuo Funada
Word stress correlates in spontaneous child-directed speech in German
Katrin Schneider, Bernd Möbius
Acquisition and synchronization of multimodal articulatory data
Michael Aron, Nicolas Ferveur, Erwan Kerrien, Marie-Odile Berger, Yves Laprie
A phonetic concatenative approach of labial coarticulation
Vincent Robert, Yves Laprie, Anne Bonneau
Visual analysis of lip coarticulation in VCV utterances
Aseel Turkmani, Adrian Hilton, Philip J. B. Jackson, James Edge
Comparison of multiple voice source parameters in different phonation types
Matti Airas, Paavo Alku
Acoustic and affective comparisons of natural and imaginary infant-, foreigner- and adult-directed speech
Monja Knoll, Lisa Scharrer
Vowel production in two occlusal classes
André Araújo, Luis M. T. Jesus, Isabel M. Costa
Nepalese retroflex stops: a static palatography study of inter- and intra-speaker variability
Rajesh Khatiwada
Effects of testosterone levels on temporal and intonational aspects of speech: more exploratory data
Charles A. Lamoureux, Victor J. Boucher
Fixed-size kernel logistic regression for phoneme classification
Peter Karsmakers, Kristiaan Pelckmans, Johan Suykens, Hugo Van hamme
A multiple-model based framework for automatic speech segmentation
Seung Seop Park, Jong Won Shin, Jong Kyu Kim, Nam Soo Kim
Semi-supervised learning of speech sounds
Aren Jansen, Partha Niyogi
Evaluation of syllable stress using single class classifier
Abhinav Parate, Ashish Verma, Jayanta Basak
Distinctive phonetic feature (DPF) based phone segmentation using hybrid neural networks
Mohammad Nurul Huda, Ghulam Muhammad, Junsei Horikawa, Tsuneo Nitta
A methodology for the automatic detection of perceived prominent syllables in spoken French
J. -Ph. Goldman, M. Avanzi, A. -C. Simon, Anne Lacheret, A. Auchlin
Dual-channel acoustic detection of nasalization states
Xiaochuan Niu, Jan P. H. van Santen
Acoustic parameters for the automatic detection of vowel nasalization
Tarun Pruthi, Carol Y. Espy-Wilson
On the use of time-delay neural networks for highly accurate classification of stop consonants
Jun Hou, Lawrence R. Rabiner, Sorin Dusan
A new approach for phoneme segmentation of speech signals
Ladan Golipour, Douglas O'Shaughnessy
Automatically learning the units of speech by non-negative matrix factorisation
Veronique Stouten, Kris Demuynck, Hugo Van hamme
A saliency-based auditory attention model with applications to unsupervised prominent syllable detection in speech
Ozlem Kalinli, Shrikanth S. Narayanan
Zero-crossing-based ratio masking for sound segregation
Sung Jun An, Young-Ik Kim, Rhee Man Kil
Event detection of speech signals based on auditory processing with a dynamic compressive gammachirp filterbank
Satomi Tanaka, Minoru Tsuzaki, Hiroaki Kato, Yoshinori Sagisaka
Segmentation of speech: child's play?
Odette Scharenborg, Mirjam Ernestus, Vincent Wan
Dimensionality reduction methods applied to both magnitude and phase derived features
Andrew Errity, John McKenna, Barry Kirkpatrick
Voice source and vocal tract variations as cues to emotional states perceived from expressive conversational speech
Hiroki Mori, Hideki Kasuya
Exploring initiative strategies using computer simulation
Fan Yang, Peter A. Heeman
From one base form to multiple output styles - predicting stylistic dynamics of discourse prosody
Chiu-yu Tseng, Zhao-yu Su
Topic in dialogue: prosodic and syntactic features
Claudia Crocco, Renata Savy
Features of pauses and conjunctions at syntactic and discourse boundaries in Japanese monologues
Michiko Watanabe, Yasuharu Den, Keikichi Hirose, Shusaku Miwa, Nobuaki Minematsu
Utilizing online content as domain knowledge in a multi-domain dynamic dialogue system
Craig Wootton, Michael McTear, Terry Anderson
Handling speech input in the Ritel QA dialogue system
Boris van Schooten, Sophie Rosset, Olivier Galibert, Aurélien Max, Rieks op den Akker, Gabriel Illouz
Online call quality monitoring for automating agent-based call centers
Woosung Kim
Analysis of communication failures for spoken dialogue systems
Sebastian Möller, Klaus-Peter Engelbrecht, Antti Oulasvirta
How to access audio files of large data bases using in-car speech dialogue systems
Sandra Mann, André Berton, Ute Ehrlich
Analyzing temporal transition of real user's behaviors in a spoken dialogue system
Kazunori Komatani, Tatsuya Kawahara, Hiroshi G. Okuno
Voicepedia: towards speech-based access to unstructured information
J. Sherwani, Dong Yu, Tim Paek, Mary Czerwinski, Yun-Cheng Ju, Alex Acero
Exploiting prosodic features for dialog act tagging in a discriminative modeling framework
Vivek Rangarajan, Srinivas Bangalore, Shrikanth S. Narayanan
Using information state to improve dialogue move identification in a spoken dialogue system
Hua Ai, Antonio Roque, Anton Leuski, David Traum
Using multiple strategies to manage spoken dialogue
Shiu-Wah Chu, Ian O'Neill, Philip Hanna
An information state based dialogue manager for a mobile robot
Marcelo Quinderé, Luís Seabra Lopes, António J. S. Teixeira
Automated directory assistance system - from theory to practice
Dong Yu, Yun-Cheng Ju, Ye-Yi Wang, Geoffrey Zweig, Alex Acero
The voice-rate dialog system for consumer ratings
Geoffrey Zweig, Patrick Nguyen, Yun-Cheng Ju, Ye-Yi Wang, Dong Yu, Alex Acero
The influence of user tailoring and cognitive load on user performance in spoken dialogue systems
Andi Winterboer, Jiang Hu, Johanna D. Moore, Clifford Nass
Confidence measures for voice search applications
Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Geoffrey Zweig, Alex Acero
Effects of quiz-style information presentation on user understanding
Ryuichiro Higashinaka, Kohji Dohsaka, Shigeaki Amano, Hideki Isozaki
A data visualization and analysis method for natural language call routing system design
Hong-Kwang Jeff Kuo, Vaibhava Goel
Discriminative optimization of language adapted HMMs for a language identification system based on parallel phoneme recognizers
Josef G. Bauer, Bernt Andrassy, Ekaterina Timoshenko
Fusion of contrastive acoustic models for parallel phonotactic spoken language identification
Khe Chai Sim, Haizhou Li
Multi-layer Kohonen self-organizing feature map for language identification
Liang Wang, Eliathamby Ambikairajah, Eric H. C. Choi
Hierarchical language identification based on automatic language clustering
Bo Yin, Eliathamby Ambikairajah, Fang Chen
Using speech rhythm for acoustic language identification
Ekaterina Timoshenko, Harald Höge
A model-based estimation of phonotactic language verification performance
Ka-keung Wong, Man-hung Siu, Brian Mak
A tagging algorithm for mixed language identification in a noisy domain
Mike Rosner, Paulseph-John Farrugia
Improved language recognition using better phonetic decoders and fusion with MFCC and SDC features
Doroteo T. Toledano, Javier Gonzalez-Dominguez, Alejandro Abejon-Gonzalez, Danilo Spada, Ismael Mateos-Garcia, Joaquin Gonzalez-Rodriguez
An open-set detection evaluation methodology applied to language and emotion recognition
David A. van Leeuwen, Khiet P. Truong
Boosting with anti-models for automatic language identification
Xi Yang, Man-hung Siu, Herbert Gish, Brian Mak
Acoustic language identification using fast discriminative training
Fabio Castaldo, Daniele Colibro, Emanuele Dalmasso, Pietro Laface, Claudio Vair
Spoken language identification using score vector modeling and support vector machine
Ming Li, Hongbin Suo, Xiao Wu, Ping Lu, Yonghong Yan
Language identification based on n-gram frequency ranking
R. Cordoba, L. F. D'Haro, F. Fernandez-Martinez, J. Macias-Guarasa, J. Ferreiros
Improving phonotactic language recognition with acoustic adaptation
Wade Shen, Douglas Reynolds
Syllable lattices as a basis for a children's speech reading tracker
Daniel Bolanos, Wayne Ward, Sarel Van Vuuren, Javier Garrido
Mandarin vowel pronunciation quality evaluation by using formant pattern recognition
Fuping Pan, Qingwei Zhao, Yonghong Yan
Automatic detection and classification of disfluent reading miscues in young children's speech for the purpose of assessment
Matthew Black, Joseph Tepperman, Sungbok Lee, Patti Price, Shrikanth S. Narayanan
Structural assessment of language learners' pronunciation
Nobuaki Minematsu, K. Kamata, Satoshi Asakawa, T. Makino, T. Nishimura, Keikichi Hirose
Enhancing usability of CAPL system for Qur'an recitation learning
Abdurrahman Samir, Sherif Mahdy Abdou, Ahmed Husien Khalil, Mohsen Rashwan
Automatic large-scale oral language proficiency assessment
Febe de Wet, Christa van der Walt, Thomas Niesler
Noise-robust hands-free voice activity detection with adaptive zero crossing detection using talker direction estimation
Yuki Denda, Takamasa Tanaka, Masato Nakayama, Takanobu Nishiura, Yoichi Yamashita
A robust mel-scale subband voice activity detector for a car platform
A. Álvarez, R. Martínez, P. Gómez, V. Nieto, V. Rodellar
Noise robust front-end processing with voice activity detection based on periodic to aperiodic component ratio
Kentaro Ishizuka, Tomohiro Nakatani, Masakiyo Fujimoto, Noboru Miyazaki
Feature and distribution normalization schemes for statistical mismatch reduction in reverberant speech recognition
A. M. Toh, Roberto Togneri, Sven Nordholm
Temporal masking for unsupervised minimum Bayes risk speaker adaptation
Matthew Gibson, Thomas Hain
Speech feature compensation based on pseudo stereo codebooks for robust speech recognition in additive noise environments
Tsung-hsueh Hsieh, Jeih-weih Hung
Multiband, multisensor robust features for noisy speech recognition
Dimitrios Dimitriadis, Petros Maragos, Stamatios Lefkimmiatis
Noise robust speech recognition for voice driven wheelchair
Akira Sasou, Hiroaki Kojima
Irrelevant variability normalization based HMM training using VTS approximation of an explicit model of environmental distortions
Yu Hu, Qiang Huo
On the jointly unsupervised feature vector normalization and acoustic model compensation for robust speech recognition
Luis Buera, Antonio Miguel, Eduardo Lleida, Óscar Saz, Alfonso Ortega
An ensemble modeling approach to joint characterization of speaker and speaking environments
Yu Tsao, Chin-Hui Lee
Cluster-based polynomial-fit histogram equalization (CPHEQ) for robust speech recognition
Shih-Hsiang Lin, Yao-Ming Yeh, Berlin Chen
Robust distributed speech recognition using histogram equalization and correlation information
Pedro M. Martinez, Jose C. Segura, Luz Garcia
Predictive minimum Bayes risk classification for robust speech recognition
Jen-Tzung Chien, Koichi Shinoda, Sadaoki Furui
Applying word duration constraints by using unrolled HMMs
Ning Ma, Jon Barker, Phil Green
Evaluating the temporal structure normalisation technique on the Aurora-4 task
Xiong Xiao, Eng Siong Chng, Haizhou Li
Two-stage system for robust neutral/Lombard speech recognition
Hynek Bořil, Petr Fousek, Harald Höge
Noise suppression using search strategy with multi-model compositions
Takatoshi Jitsuhiro, Tomoji Toriyama, Kiyoshi Kogure
Investigations into early and late reflections on distant-talking speech recognition toward suitable reverberation criteria
Takanobu Nishiura, Yoshiki Hirano, Yuki Denda, Masato Nakayama
An approach to iterative speech feature enhancement and recognition
Stefan Windmann, Reinhold Haeb-Umbach
Optimization of temporal filters in the modulation frequency domain for constructing robust features in speech recognition
Jeih-weih Hung
The harming part of room acoustics in automatic speech recognition
Rico Petrick, Kevin Lohde, Matthias Wolff, Rüdiger Hoffmann
A reference model weighting-based method for robust speech recognition
Yuan Fu Liao, Yh-Her Yang, Chi-Hui Hsu, Cheng-Chang Lee, Jing-Teng Zeng
Mel sub-band filtering and compression for robust speech recognition
Babak Nasersharif, Ahmad Akbari, Mohammad Mehdi Homayounpour
Clustered maximum likelihood linear basis for rapid speaker adaptation
Yun Tang, Richard Rose
Rapid speaker adaptation by reference model interpolation
Wenxuan Teng, Guillaume Gravier, Frédéric Bimbot, Frédéric Soufflet
Rapid unsupervised speaker adaptation using single utterance based on MLLR and speaker selection
Randy Gomez, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano
Robustness of several kernel-based fast adaptation methods on noisy LVCSR
Brian Mak, Roger Hsiao
Estimating VTLN warping factors by distribution matching
Janne Pylkkönen
Frequency domain correspondence for speaker normalization
Ming Liu, Xi Zhou, Mark Hasegawa-Johnson, Thomas S. Huang, Zhengyou Zhang
Unsupervised training of adaptation rate using Q-learning in large vocabulary continuous speech recognition
Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa
Application of CMLLR in narrow band wide band adapted systems
Martin Karafiát, Lukáš Burget, Jan Černocký, Thomas Hain
Fast adaptation of GMM-based compact models
Christophe Lévy, Georges Linarès, Jean-François Bonastre
Efficient estimation of speaker-specific projecting feature transforms
Jonas Lööf, Ralf Schlüter, Hermann Ney
Regularized feature-based maximum likelihood linear regression for speech recognition
Mohamed Kamal Omar
Modelling confusion matrices to improve speech recognition accuracy, with an application to dysarthric speech
Omar Caballero Morales, Stephen Cox
An active approach to speaker and task adaptation based on automatic analysis of vocabulary confusability
Qiang Huo, Wei Li
fMPE-MAP: improved discriminative adaptation for modeling new domains
Jing Zheng, Andreas Stolcke
Discriminative MCE-based speaker adaptation of acoustic models for a spoken lecture processing task
Timothy J. Hazen, Erik McDermott
A new kernel for SVM MLLR based speaker recognition
Zahi N. Karam, William M. Campbell
A GMM-based probabilistic sequence kernel for speaker verification
Kong-Aik Lee, Changhuai You, Haizhou Li, Tomi Kinnunen
Speaker recognition using kernel-PCA and intersession variability modeling
Hagai Aronowitz
Linear and non linear kernel GMM supervector machines for speaker verification
Réda Dehak, Najim Dehak, Patrick Kenny, Pierre Dumouchel
Support vector regression for speaker verification
Ignacio Lopez-Moreno, Ismael Mateos-Garcia, Daniel Ramos, Joaquin Gonzalez-Rodriguez
Derivative and parametric kernels for speaker verification
C. Longworth, M. J. F. Gales
Application of shifted delta cepstral features in speaker verification
Jose R. Calvo, Rafael Fernández, Gabriel Hernández
A smoothing kernel for spatially related features and its application to speaker verification
Luciana Ferrer, Kemal Sönmez, Elizabeth Shriberg
VZ-norm: an extension of z-norm to the multivariate case for anchor model based speaker verification
D. Charlet, M. Collet, Frédéric Bimbot
Word-conditioned HMM supervectors for speaker recognition
Howard Lei, Nikki Mirghafori
Speaker clustering using direct maximization of a BIC-based score
Wei-Ho Tsai
Confidence measure based unsupervised target model adaptation for speaker verification
A. Preti, Jean-François Bonastre, Driss Matrouf, F. Capman, B. Ravera
Emotion attribute projection for speaker recognition on emotional speech
Huanjun Bao, Ming-Xing Xu, Thomas Fang Zheng
High-level feature-based speaker verification via articulatory phonetic-class pronunciation modeling
Shi-Xiong Zhang, Man-Wai Mak, Helen Meng
Direct acoustic feature using iterative EM algorithm and spectral energy for classifying suicidal speech
T. Yingthawornsuk, H. Kaymaz Keskinpala, D. M. Wilkes, R. G. Shiavi, R. M. Salomon
On comparing and combining intra-speaker variability compensation and unsupervised model adaptation in speaker verification
Claudio Garreton, Nestor Becerra Yoma, Fernando Huenupán, Carlos Molina
Comparison of two kinds of speaker location representation for SVM-based speaker verification
Xianyu Zhao, Yuan Dong, Hao Yang, Jian Zhao, Liang Lu, Haila Wang
Jitter and shimmer measurements for speaker recognition
Mireia Farrús, Javier Hernando, Pascual Ejarque
Natural-emotion GMM transformation algorithm for emotional speaker recognition
Zhenyu Shan, Yingchun Yang, Ruizhi Ye
Optimized one-bit quantization for adapted GMM-based speaker verification
Ivy H. Tseng, Olivier Verscheure, Deepak S. Turaga, Upendra V. Chaudhari
A comparison of session variability compensation techniques for SVM-based speaker recognition
Mitchell McLaren, Robbie Vogt, Brendan Baker, Sridha Sridharan
Influence of task duration in text-independent speaker verification
Benoît Fauve, Nicholas Evans, Neil Pearson, Jean-François Bonastre, John Mason
A text-constrained prosodic system for speaker verification
Elizabeth Shriberg, Luciana Ferrer
Fusing acoustic, phonetic and data-driven systems for text-independent speaker verification
Asmaa El Hannani, Dijana Petrovska-Delacrétaz
Continuous prosodic features and formant modeling with joint factor analysis for speaker verification
Najim Dehak, Patrick Kenny, Pierre Dumouchel
Loquendo - Politecnico di Torino's 2006 NIST speaker recognition evaluation system
Claudio Vair, Daniele Colibro, Fabio Castaldo, Emanuele Dalmasso, Pietro Laface
A straightforward and efficient implementation of the factor analysis model for speaker verification
Driss Matrouf, Nicolas Scheffer, Benoît Fauve, Jean-François Bonastre
Multi-modal user authentication from video for mobile or variable-environment applications
Timothy J. Hazen, Daniel Schultz
Quasi text-independent speaker-verification based on pattern matching
Michael Gerber, René Beutler, Beat Pfister
Virtual fusion for speaker recognition
Yosef A. Solewicz, Moshe Koppel
Evolutionary minimum verification error learning of the alternative hypothesis model for LLR-based speaker verification
Yi-Hsiang Chao, Wei-Ho Tsai, Shih-Sian Cheng, Hsin-Min Wang, Ruei-Chuan Chang
Speaker recognition by combining MFCC and phase information
Seiichi Nakagawa, Kouhei Asakawa, Longbiao Wang
A semi-automatic approach for speaker mining of tapped telephone conversations
Sandeep Manocha, Carol Y. Espy-Wilson
Cluster adaptive training weights as features in SVM-based speaker verification
Hao Yang, Yuan Dong, Xianyu Zhao, Jian Zhao, Liang Lu, Haila Wang
Study on speaker verification with non-audible murmur segments
Hideki Okamoto, Mariko Kojima, Tomoko Matsui, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano
Dimension reduction for speaker identification based on mutual information
Xugang Lu, Jianwu Dang
Robustness of long time measures of fundamental frequency
Jonas Lindh, Anders Eriksson
Score distribution scaling for speaker recognition
Vinod Prakash, John H. L. Hansen
Global features for rapid identity verification with dynamic biometric data
A. C. Morris, J. Koreman, B. Ly-Van, H. Sellahewa, S. Jassim, R. Llarena Gómez
Robust voice activity detection for narrow-bandwidth speaker verification under adverse environments
Tuan Van Pham, Michael Neffe, Gernot Kubin
Speaker verification with multiple classifier fusion using Bayes based confidence measure
Fernando Huenupán, Nestor Becerra Yoma, Carlos Molina, Claudio Garreton
Audiovisual speaker identity verification based on lip motion features
Girija Chetty, Michael Wagner
Duration and pronunciation conditioned lexical modeling for speaker verification
Gokhan Tur, Elizabeth Shriberg, Andreas Stolcke, Sachin Kajarekar
Artificial impostor voice transformation effects on false acceptance rates
Jean-François Bonastre, Driss Matrouf, Corinne Fredouille
Rapid and accurate spoken term detection
David R. H. Miller, Michael Kleber, Chia-Lin Kao, Owen Kimball, Thomas Colthurst, Stephen A. Lowe, Richard M. Schwartz, Herbert Gish
Subword-based position specific posterior lattices (s-PSPL) for indexing speech information
Yi-cheng Pan, Hung-lin Chang, Berlin Chen, Lin-shan Lee
Improved methods for language model based question classification
Andreas Merkel, Dietrich Klakow
Error-tolerant question answering for spoken documents
Tomoyosi Akiba, Hirofumi Tsujimura
Exploiting information extraction annotations for document retrieval in distillation tasks
Dilek Hakkani-Tür, Gokhan Tur, Michael Levit
Learning spoken document similarity and recommendation using supervised probabilistic latent semantic analysis
K. Thambiratnam, F. Seide
A phonetic search approach to the 2006 NIST spoken term detection evaluation
Roy Wallace, Robbie Vogt, Sridha Sridharan
An integration method of retrieval results using plural subword models for vocabulary-free spoken document retrieval
Yoshiaki Itoh, Kohei Iwata, Kazunori Kojima, Masaaki Ishigame, Kazuyo Tanaka, Shi-wook Lee
The SRI/OGI 2006 spoken term detection system
Dimitra Vergyri, Izhak Shafran, Andreas Stolcke, Ramana R. Gadde, Murat Akbacak, Brian Roark, Wen Wang
Podcastle: a web 2.0 approach to speech recognition research
Masataka Goto, Jun Ogata, Kouichirou Eto
Speech mining in noisy audio message corpus
Nathalie Camelin, Frédéric Béchet, Géraldine Damnati, Renato De Mori
A fast fuzzy keyword spotting algorithm based on syllable confusion network
Jian Shao, Qingwei Zhao, Pengyuan Zhang, Zhaojie Liu, Yonghong Yan
Advances in SpeechFind: transcript reliability estimation employing confidence measure based on discriminative sub-word model for SDR
Wooil Kim, John H. L. Hansen
An interactive timeline for speech database browsing
Benoit Favre, Jean-François Bonastre, Patrice Bellot
Spoken word recognition of Chinese homophones: a further investigation
Michael C. W. Yip
The role of outer hair cell function in the perception of synthetic versus natural speech
Maria Wolters, Pauline Campbell, Christine DePlacido, Amy Liddell, David Owens
Hybridizing conversational and clear speech
Akiko Kusumoto, Alexander B. Kain, John-Paul Hosom, Jan P. H. van Santen
Neighborhood density and neighborhood frequency effects in French spoken word recognition
Sophie Dufour, Ulrich Hans Frauenfelder
Discrimination and recognition of scaled word sounds
Toshio Irino, Yoshie Aoki, Yoshie Hayashi, Hideki Kawahara, Roy D. Patterson
Benchmarking human performance on the acoustic and linguistic subtasks of ASR systems
László Tóth
Contributions of temporal fine structure cues to Chinese speech recognition in cochlear implant simulation
Lin Yang, Jianping Zhang, Yonghong Yan
Effect of number of masking talkers on speech-on-speech masking in Chinese
Xihong Wu, Jing Chen, Zhigang Yang, Qiang Huang, Mengyuan Wang, Liang Li
Do different boundary types induce subtle acoustic cues to which French listeners are sensitive?
Odile Bagou, Sophie Dufour, Cécile Fougeron, Alain Content, Ulrich Hans Frauenfelder
An information theoretic approach to predict speech intelligibility for listeners with normal and impaired hearing
Svante Stadler, Arne Leijon, Björn Hagerman
Speaking rate effects in a landmark-based phonetic exemplar model
Travis Wade, Bernd Möbius
Acoustic correlates of intelligibility enhancements in clearly produced fricatives
Kazumi Maniwa, Allard Jongman, Travis Wade
Modelling the human-machine gap in speech reception: microscopic speech intelligibility prediction for normal-hearing subjects with an auditory model
Tim Jürgens, Thomas Brand, Birger Kollmeier
Lombard speech impact on perceptual speaker recognition
Ayako Ikeno, John H. L. Hansen
Effect of within- and between-talker variability on word identification in noise by younger and older adults
Huiwen Goy, Kathleen Pichora-Fuller, Pascal van Lieshout, Gurjit Singh, Bruce Schneider
Speech perception in children with speech sound disorder
H. Timothy Bunnell, N. Carolyn Schanen, Linda D. Vallino, Thierry G. Morlet, James B. Polikoff, Jennette D. Driscoll, James T. Mantell
Speech coding and information processing by auditory neurons
Huan Wang, Werner Hemmert
What do listeners attend to in hearing prosodic structures? investigating the human speech-parser using short-term recall
Annie C. Gilbert, Victor J. Boucher
Time-compressed speech perception with speech and noise maskers
Douglas S. Brungart, Nandini Iyer
L2 consonant identification in noise: cross-language comparisons
Anne Cutler, Martin Cooke, Maria Luisa Garcia Lecumberri, Dennis Pasveer
Effects of non-native dialects on spoken word recognition
Jennifer T. Le, Catherine T. Best, Michael D. Tyler, Christian Kroos
Identification of natural whistled vowels by non-whistlers
Julien Meyer, Fanny Meunier, Laure Dentel
Prelexical adjustments to speaker idiosyncrasies: are they position-specific?
Alexandra Jesse, James M. McQueen
Top-down effects on compensation for coarticulation are not replicable
Holger Mitterer
Pitch pattern alternation in Goshogawara Japanese: evidence for a prosodic phrase above the domain for downstep
Yosuke Igarashi
Some evidence on the phonetics and phonology of prosodic phrasing in Russian
Irina Nesterenko, Pavel Skrelin
Temporal downtrends in Czech read speech
Jan Volín, Radek Skarnitzl
Empirical evidence for prosodic phrasing: pauses as linguistic annotation in Korean read speech
Hyongsil Cho, Daniel Hirst
Exploiting prosody for PCFGs with latent annotations
Markus Dreyer, Izhak Shafran
Combining length distribution model with decision tree in prosodic phrase prediction
Qin Shi, DanNing Jiang, FanPing Meng, Yong Qin
Duration and pauses as boundary-markers in speech: a cross-linguistic study
Li-chiung Yang
Modeling incompletion phenomenon in Mandarin dialog prosody
Jian Yu, Lixing Huang, Jianhua Tao, Xia Wang
Accent assignment algorithm in Hungarian, based on syntactic analysis
Anne Tamm, Kálmán Abari, Gábor Olaszy
An effective initial/final duration prediction method for corpus-based singing voice synthesis of Mandarin Chinese
Cheng-Yuan Lin, Pei-Chi Jao, J. -S. Roger Jang
Increasing prosodic variability of text-to-speech synthesizers
Géza Németh, Márk Fék, Tamás Gábor Csapó
Unsupervised HMM classification of F0 curves
Damien Lolive, Nelly Barbot, Olivier Boeffard
Automatic pitch accent prediction for text-to-speech synthesis
Ian Read, Stephen Cox
An unsupervised approach to automatic prosodic annotation
Xinqiang Ni, Yining Chen, Frank K. Soong, Min Chu, Ping Zhang
A system for transforming the emotion in speech: combining data-driven conversion techniques for prosody and voice quality
Zeynep Inanoglu, Steve Young
An automatic prosody labeling method for Mandarin speech
Chen-Yu Chiang, Hsiu-Min Yu, Yih-Ru Wang, Sin-Horng Chen
Corpus-based generation of prosodic features from text based on generation process model
Keikichi Hirose, Keiko Ochi, Nobuaki Minematsu
Novel eigenpitch-based prosody model for text-to-speech synthesis
Jilei Tian, Jani Nurminen, Imre Kiss
Modelling prominence and emphasis improves unit-selection synthesis
Volker Strom, Ani Nenkova, Robert Clark, Yolanda Vazquez-Alvarez, Jason Brenier, Simon King, Dan Jurafsky
A framework of reply speech generation for concept-to-speech conversion in spoken dialogue systems
Seiya Takada, Yuji Yagi, Keikichi Hirose, Nobuaki Minematsu
Synthesis of prosodic attitudinal variants in German backchannel ja
Thorsten Stocksmeier, Stefan Kopp, Dafydd Gibbon
Inter-language prosodic style modification experiment using word impression vector for communicative speech generation
Ke Li, Yoko Greenberg, Yoshinori Sagisaka
A conservative aggressive subspace tracker
Koby Crammer
Mutual information and the speech signal
Mattias Nilsson, W. Bastiaan Kleijn
Spectro-temporal analysis of speech using 2-D Gabor filters
Tony Ezzat, Jake Bouvrie, Tomaso Poggio
A comparative study of speech rate estimation techniques
Tomas Dekens, Mike Demol, Werner Verhelst, Piet Verhoeve
Spectro-temporal processing for blind estimation of reverberation time and single-ended quality measurement of reverberant speech
Tiago H. Falk, Hua Yuan, Wai-Yip Chan
Linear prediction of audio signals
Toon van Waterschoot, Marc Moonen
Stabilised weighted linear prediction - a robust all-pole method for speech processing
Carlo Magi, Tom Bäckström, Paavo Alku
Conditionally linear Gaussian models for estimating vocal tract resonances
Daniel Rudoy, Daniel N. Spendley, Patrick J. Wolfe
Time-varying pre-emphasis and inverse filtering of speech
Karl Schnell, Arild Lacroix
Reconstructing audio signals from modified non-coherent Hilbert envelopes
Joachim Thiemann, Peter Kabal
A flexible spectral modification method based on temporal decomposition and Gaussian mixture model
Binh Phu Nguyen, Masato Akagi
A comparison of estimated and MAP-predicted formants and fundamental frequencies with a speech reconstruction application
Jonathan Darch, Ben Milner
Effect of incomplete glottal closures on estimates of glottal waves via inverse filtering of vowel sounds
Huiqun Deng, Douglas O'Shaughnessy
Vocal tract and area function estimation with both lip and glottal losses
Kaustubh Kalgaonkar, Mark A. Clements
Detection of instants of glottal closure using characteristics of excitation source
S Guruprasad, B Yegnanarayana, K Sri Rama Murty
A comparative evaluation of the zeros of Z transform representation for voice source estimation
Nicolas Sturmel, Christophe D'Alessandro, Boris Doval
Ambient telephony: scenarios and research challenges
Aki Härmä
Always listening to you: creating exhaustive audio database in home environments
Yasunari Obuchi, Akio Amano
Joint speaker segmentation, localization and identification for streaming audio
Joerg Schmalenstroeer, Reinhold Haeb-Umbach
Active binaural distance estimation for dynamic sources
Yan-Chen Lu, Martin Cooke, Heidi Christensen
A packetization and variable bitrate interframe compression scheme for vector quantizer-based distributed speech recognition
Bengt J. Borgström, Abeer Alwan
Channel selection by class separability measures for automatic transcriptions on distant microphones
Matthias Wölfel
Conversation detection and speaker segmentation in privacy-sensitive situated speech data
Danny Wyatt, Tanzeem Choudhury, Jeff Bilmes
Audio-based approaches to head orientation estimation in a smart-room
Alberto Abad, Carlos Segura, Climent Nadeu, Javier Hernando
Multi-resolution soft features for channel-robust distributed speech recognition
Valentin Ion, Reinhold Haeb-Umbach
Large-scale random forest language models for speech recognition
Yi Su, Frederick Jelinek, Sanjeev Khudanpur
PLSA-based topic detection in meetings for adaptation of lexicon and language model
Yuya Akita, Yusuke Nemoto, Tatsuya Kawahara
Language modeling using PLSA-based topic HMM
Atsushi Sako, Tetsuya Takiguchi, Yasuo Ariki
Lexicon adaptation with reduced character error (LARCE) - a new direction in Chinese language modeling
Yi-cheng Pan, Lin-shan Lee
Minimum rank error training for language modeling
Meng-Sung Wu, Jen-Tzung Chien
Integrating MAP, marginals, and unsupervised language model adaptation
Wen Wang, Andreas Stolcke
Dynamic language model adaptation using presentation slides for lecture speech recognition
Hiroki Yamazaki, Koji Iwano, Koichi Shinoda, Sadaoki Furui, Haruo Yokota
Web-based language modelling for automatic lecture transcription
Cosmin Munteanu, Gerald Penn, Ron Baecker
LSA-based language model adaptation for highly inflected languages
Tanel Alumäe, Toomas Kirt
Language model adaptation using latent Dirichlet allocation and an efficient topic inference algorithm
Aaron Heidel, Hung-an Chang, Lin-shan Lee
Structural Bayesian language modeling and adaptation
Sibel Yaman, Jen-Tzung Chien, Chin-Hui Lee
Vocabulary selection for a broadcast news transcription system using a morpho-syntactic approach
Ciro Martins, António J. S. Teixeira, João Neto
Handling OOV words in Arabic ASR via flexible morphological constraints
Nguyen Bach, Mohamed Noamany, Ian Lane, Tanja Schultz
Phrases in category-based language models for Spanish and Basque ASR
Raquel Justo, M. Inés Torres
Language modeling for automatic Turkish broadcast news transcription
Ebru Arısoy, Haşim Sak, Murat Saraçlar
Predicting focus through prominence structure
Sasha Calhoun
Analysis of emotional speech prosody in terms of part of speech tags
Murtaza Bulut, Sungbok Lee, Shrikanth S. Narayanan
The neutral tone in question intonation in Mandarin
Fang Liu, Yi Xu
Pointing to a target while naming it with /pata/ or /tapa/: the effect of consonants and stress position on jaw-finger coordination
Amélie Rochet-Capellan, Jean-Luc Schwartz, Rafael Laboissière, Arturo Galvàn
Suprasegmental aspects of pre-lexical speech in cochlear implanted children
Øydis Hide, Steven Gillis, Paul Govaerts
Categorical perception in intonation: a matter of signal dynamics?
Oliver Niebuhr
A HMM recognition of consonant-vowel syllables from lip contours: the cued speech case
Noureddine Aboutabit, Denis Beautemps, Jeanne Clarke, Laurent Besacier
A unified approach to multi-pose audio-visual ASR
Patrick Lucey, Gerasimos Potamianos, Sridha Sridharan
Audio-visual integration for robust speech recognition using maximum weighted stream posteriors
Rowan Seymour, Darryl Stewart, Ji Ming
Continuous-speech phone recognition from ultrasound and optical images of the tongue and lips
Thomas Hueber, Gérard Chollet, Bruce Denby, Gérard Dreyfus, Maureen Stone
Multimodal speech recognition with ultrasonic sensors
Bo Zhu, Timothy J. Hazen, James Glass
Fused HMM-adaptation of multi-stream HMMs for audio-visual speech recognition
David Dean, Patrick Lucey, Sridha Sridharan, Tim Wark
Analysis of head motions and speech in spoken dialogue
Carlos T. Ishi, Hiroshi Ishiguro, Norihiro Hagita
A paradigm for mobile speech-centric services
Lars Bo Larsen, Kasper L. Jensen, Søren Larsen, Morten Rasmussen
Design and recording of Czech sign language corpus for automatic sign language recognition
Pavel Campr, Marek Hrúz, Miloš Železný
Pushy versus meek - using avatars to influence turn-taking behaviour
Jens Edlund, Jonas Beskow
Wavelet-based front-end for electromyographic speech recognition
Michael Wand, Szu-Chen Stan Jou, Tanja Schultz
Intensive gestures in French and their multimodal correlates
Gaëlle Ferré, Roxane Bertrand, Philippe Blache, Robert Espesser, Stéphane Rauzy
Aspects of visual speech in Arabic
Slim Ouni, Kais Ouni
Rigid vs non-rigid face and head motion in phone and tone perception
Denis Burnham, Jessica Reynolds, Guillaume Vignali, Sandra Bollwerk, Caroline Jones
Audio-visual phoneme classification for pronunciation training applications
Hedvig Kjellström, Olov Engwall, Sherif Mahdy Abdou, Olle Bälter
Visual information and redundancy conveyed by internal articulator dynamics in synthetic audiovisual speech
Katja Grauwinkel, Britta Dewitt, Sascha Fagel
A speech rate related lip movement model for speech animation
Wei Zhou, Zengfu Wang
An extension 2DPCA based visual feature extraction method for audio-visual speech recognition
Guanyong Wu, Jie Zhu
Preventing an external acoustic noise from being misrecognized as a speech recognition object by confirming the lip movement image signal
Soo-jong Lee, Jun Park, Eung-kyeu Kim
Automatic head motion prediction from speech data
Gregor Hofer, Hiroshi Shimodaira
Omnidirectional audio-visual talker localizer with dynamic feature fusion based on validity and reliability criteria
Yuki Denda, Takanobu Nishiura, Yoichi Yamashita
Processing image and audio information for recognising discourse participation status through features of face and voice
Nick Campbell, Damien Douxchamps
The effect of the additivity assumption on time and frequency domain Wiener filtering for speech enhancement
Kamil K. Wójcicki, Stephen So, Kuldip K. Paliwal
Noise reduction based on adaptive β-order generalized spectral subtraction for speech enhancement
Junfeng Li, Shuichi Sakamoto, Satoshi Hongo, Masato Akagi, Yôiti Suzuki
Class constrained ROVER based speech enhancement
Amit Das, John H. L. Hansen
EMD based soft-thresholding for speech enhancement
Erhan Deger, Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu, Md. Kamrul Hasan
An approximate solution for perceptually constrained signal subspace speech enhancement method
Adam Borowicz, Alexander Petrovsky
Quality assessment of speech enhancement systems by separation of enhanced speech, noise, and echo
Tim Fingscheidt, Suhadi Suhadi
Perceptual musical noise reduction using critical bands tonality coefficients and masking thresholds
Anis Ben Aicha, Sofia Ben Jebara
On optimal estimation of compressed speech for hearing aids
Dirk Mauler, Anil M. Nagathil, Rainer Martin
DFT domain subspace based noise tracking for speech enhancement
Richard C. Hendriks, Jesper Jensen, Richard Heusdens
Noise tracking for speech systems in adverse environments
Nitish Krishnamurthy, John H. L. Hansen
Speech enhancement using multi-reference noise reduction in a vehicle environment
Abderrahman Essebbar, Tristan Poinsard
Blind adaptive principal eigenvector beamforming for acoustical source separation
Ernst Warsitz, Reinhold Haeb-Umbach, Dang Hai Tran Vu
Time-domain blind audio source separation using advanced ICA methods
Zbyněk Koldovský, Petr Tichavský
Model-based speech separation with single-microphone input
S. W. Lee, Frank K. Soong, P. C. Ching
Multi-step linear prediction based speech dereverberation in noisy reverberant environment
Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Masato Miyoshi
A statistical model based post-filtering algorithm for residual echo suppression
Seung Yeol Lee, Jong Won Shin, Hwan Sik Yun, Nam Soo Kim
An optimal speech enhancement under speech uncertainty probability and masking property of auditory system
Xiaoshan Huang, Xiaoqun Zhao
Temporal episodic memory model: an evolution of MINERVA2
Viktoria Maier, Roger K. Moore
Speech recognition with factorial-HMM syllabic acoustic models
Gianpaolo Coro, Francesco Cutugno, Fulvio Caropreso
Evaluating acoustic distance measures for template based recognition
Mathias De Wachter, Kris Demuynck, Patrick Wambacq, Dirk Van Compernolle
Hierarchical acoustic modeling based on random-effects regression for automatic speech recognition
Yan Han, Lou Boves
Construction and analysis of multiple paths in syllable models
Annika Hämäläinen, Louis ten Bosch, Lou Boves
Landmark-based approach to speech recognition: an alternative to HMMs
Carol Y. Espy-Wilson, Tarun Pruthi, Amit Juneja, Om Deshmukh
Automatic recognition of connected vowels only using speaker-invariant representation of speech dynamics
Satoshi Asakawa, Nobuaki Minematsu, Keikichi Hirose
A structured speech model parameterized by recursive dynamics and neural networks
Roberto Togneri, Li Deng
Structure-based and template-based automatic speech recognition - comparing parametric and non-parametric approaches
Li Deng, Helmer Strik
Learning the inter-frame distance for discriminative template-based keyword detection
David Grangier, Samy Bengio
Handling phonetic context and speaker variation in a structure-based speech recognizer
Dong Yu, Li Deng, Alex Acero
Vector-quantization based mask estimation for missing data automatic speech recognition
Maarten Van Segbroeck, Hugo Van hamme
Accurate marginalization range for missing data recognition
Sébastien Demange, Christophe Cerisara, Jean-Paul Haton
Smooth soft mel-spectrographic masks based on blind sparse source separation
Marco Kühne, Roberto Togneri, Sven Nordholm
Model-driven detection of clean speech patches in noise
Jonathan Laidler, Martin Cooke, Neil D. Lawrence
Polyaural array processing for automatic speech recognition in degraded environments
Richard M. Stern, Evandro B. Gouvêa, Govindarajan Thattai
Adding noise to improve noise robustness in speech recognition
Nicolás Morales, Liang Gu, Yuqing Gao
The Buckeye corpus of speech: updates and enhancements
Eric Fosler-Lussier, Laura Dilley, Na'im Tyson, Mark Pitt
Development of multimodal resources for multilingual information retrieval in the Basque context
N. Barroso, A. Ezeiza, N. Gilisagasti, K. López de Ipiña, A. López, J. M. López
Construction of a phonotactic dialect corpus using semiautomatic annotation
Reva Schwartz, Wade Shen, Joseph Campbell, Shelley Paget, Julie Vonwiller, Dominique Estival, Christopher Cieri
BECAM tool - a semi-automatic tool for bootstrapping emotion corpus annotation and management
Slim Abdennadher, Mohamed Aly, Dirk Bühler, Wolfgang Minker, Johannes Pittermann
Resources for new research directions in speaker recognition: the Mixer 3, 4 and 5 corpora
Christopher Cieri, Linda Corson, David Graff, Kevin Walker
Intercoder reliability in annotating complex disfluencies
Peter A. Heeman, Andy McMillin, J. Scott Yaruss
Single channel speech separation using maximum a posteriori estimation
M. H. Radfar, R. M. Dansereau
Speech enhancement with improved a posteriori SNR computation
Suhadi Suhadi, Tim Fingscheidt
Method of LP-based blind restoration for improving intelligibility of bone-conducted speech
Thang Vu Tat, Germine Seide, Masashi Unoki, Masato Akagi
Noise suppression based on extending a speech-dominated modulation band
Tiago H. Falk, Svante Stadler, W. Bastiaan Kleijn, Wai-Yip Chan
Speech enhancement using PCA and variance of the reconstruction error model identification
Amin Haji Abolhassani, Sid-Ahmed Selouani, Douglas O'Shaughnessy, Mohamed-Faouzi Harkat
Speech reinforcement based on partial specific loudness
Jong Won Shin, Woohyung Lim, Junesig Sung, Nam Soo Kim
The phonetics and phonology of high and low tones in two falling f0-contours in Standard German
Tamara Rathcke, Jonathan Harrington
Temporal alignment of creaky voice in neutralised realisations of an underlying, post-nasal voicing contrast in German
Tina John, Jonathan Harrington
The duration of speech pauses in a multilingual environment
Mike Demol, Werner Verhelst, Piet Verhoeve
Syllable timing patterns in Polish: results from annotation mining
Dafydd Gibbon, Jolanta Bachan, Grażyna Demenko
Minimal pairs and functional loads of sound contrasts obtained from a list of Modern Greek words
Constandinos Kalimeris, Stelios Bakamidis
More on acoustic correlates of stress
Daan Wissing
Comparing Praat and Snack formant measurements on two large corpora of northern and southern French
Cécile Woehrling, Philippe Boula de Mareüil
The phonetic exponency of phrasal accentuation in French and German
William Barry, Bistra Andreeva, Ingmar Steiner
Phonetic geminates in Cypriot Greek: the case of voiceless plosives
Christiana Christodoulou
Predicting vowel duration in spontaneous Canadian French speech
Darcie Williams, François Poiré
Rhotic variation and schwa epenthesis in Windsor French
Ivan Chow, François Poiré
On the categorical nature of the process involved in schwa elision in French
Audrey Bürki, Cécile Fougeron, Cédric Gendrot
Exploring tonal variations via context-dependent tone models
Yue-Ning Hu, Min Chu, Chao Huang, Yan-Ning Zhang
Acoustic analysis of the neutral tone in Mandarin
Philippe Martin, Jun Li
F0 analysis of perceptual distance among Cantonese level tones
Rerrario Shui-Ching Ho, Yoshinori Sagisaka
Extended powered cepstral normalization (p-CN) with range equalization for robust features in speech recognition
Chang-wen Hsu, Lin-shan Lee
Selection of optimal dimensionality reduction method using Chernoff bound for segmental unit input HMM
Makoto Sakai, Norihide Kitaoka, Seiichi Nakagawa
Fepstrum: an improved modulation spectrum for ASR
Vivek Tyagi
Narrowband to wideband feature expansion for robust multilingual ASR
Dušan Macho
Non-linear spectral contrast stretching for in-car speech recognition
Weifeng Li, Hervé Bourlard
Clustering-based two-dimensional linear discriminant analysis for speech recognition
Xiao-Bing Li, Douglas O'Shaughnessy
A study on temporal features derived by analytic signal
Yotaro Kubo, Shigeki Okawa, Akira Kurematsu, Katsuhiko Shirai
Dimensionality reduction of speech features using nonlinear principal components analysis
Stephen A. Zahorian, Tara Singh, Hongbing Hu
Linear transformation approach to VTLN using dynamic frequency warping
D. R. Sanand, D. Dinesh Kumar, S. Umesh
Features interpolation domain for distributed speech recognition and performance for ITU-T G.723.1 CODEC
Vladimir Fabregas Surigué de Alencar, Abraham Alcaim
Dynamic integration of multiple feature streams for robust real-time LVCSR
Shoei Sato, Kazuo Onoe, Akio Kobayashi, Shinich Homma, Toru Imai, Tohru Takagi, Tetsunori Kobayashi
PCA-based feature extraction for fluctuation in speaking style of articulation disorders
Hironori Matsumasa, Tetsuya Takiguchi, Yasuo Ariki, Ichao Li, Toshitaka Nakabayashi
Multi-stream features combination based on Dempster-Shafer rule for LVCSR system
Fabio Valente, Jithendra Vepa, Hynek Hermansky
Dimensionality reduction for speech recognition using neighborhood components analysis
Natasha Singh-Miller, Michael Collins, Timothy J. Hazen
Probabilistic latent speaker analysis for large vocabulary speech recognition
Dan Su, Xihong Wu, Huisheng Chi
MRASTA and PLP in automatic speech recognition
S. R. Mahadeva Prasanna, Hynek Hermansky
Women's vocal aging: a longitudinal approach
Markus Brückl
Effect of intensive voice therapy on vocal tremor for Parkinson speakers
Laurence Cnockaert, Jean Schoentgen, Canan Ozsancak, Pascal Auzou, Francis Grenez
Assessment of vocal dysperiodicities in connected disordered speech
A. Alpan, A. Kacha, Francis Grenez, Jean Schoentgen
Effects of FE modelled consequences of tonsillectomy on perceptual evaluation of voice
Anne-Maria Laukkanen, Jaromír Horáček, Pavel Švancara, Elina Lehtinen
Speech quality after major surgery of the oral cavity and oropharynx with microvascular soft tissue reconstruction
Irma M. Verdonck-de Leeuw, Louis ten Bosch, Li Ying Chao, Rico N. P. M. Rinkel, Pepijn A. Borggreven, Lou Boves, C. René Leemans
Voice fatigue and use of speech recognition: a study of voice quality ratings
Christel de Bruijn, Sandra Whiteside
Complementary approaches for voice disorder assessment
Jean-François Bonastre, Corinne Fredouille, A. Ghio, A. Giovanni, G. Pouchoulin, J. Révis, B. Teston, P. Yu
Frequency study for the characterization of the dysphonic voices
G. Pouchoulin, Corinne Fredouille, Jean-François Bonastre, A. Ghio, A. Giovanni
Acoustic correlates of laryngeal-muscle fatigue: findings for a phonometric prevention of acquired voice pathologies
Victor J. Boucher
Automatic scoring of the intelligibility in patients with cancer of the oral cavity
Andreas Maier, Maria Schuster, Anton Batliner, Elmar Nöth, Emeka Nkenke
Automatic assessment of children's reading level
Jacques Duchateau, Leen Cleuren, Hugo Van hamme, Pol Ghesquière
Using waveform matching techniques in the measurement of shimmer in voiced signals
Carlos Ferrer, María E. Hernández-Díaz, Eduardo González
Analysis of the impact of analogue telephone channel on MFCC parameters for voice pathology detection
R. Fraile, J. I. Godino-Llorente, N. Sáenz-Lechón, V. Osma-Ruiz, P. Gómez-Vilda
Objective parameters from videokymographic images: a user-friendly interface
C. Manfredi, L. Bocchi, G. Cantarella, G. Peretti, G. Guidi, V. Mezzatesta
Integrating audio and visual cues for speaker friendliness in multimodal speech synthesis
David House
The influence of masking words on the prediction of TRPs in a shadowed dialog
Wieneke Wesseling, R. J. J. H. van Son, Louis C. W. Pols
Analysis of the occurrence of laughter in meetings
Kornel Laskowski, Susanne Burger
Incremental perception of acted and real emotional speech
Pashiera Barkhuysen, Emiel Krahmer, Marc Swerts
Speaking through a noisy channel - experiments on inducing clarification behaviour in human-human dialogue
David Schlangen, Raquel Fernández
Computerized chironomy: evaluation of hand-controlled intonation reiteration
Christophe D'Alessandro, Albert Rilliard, Sylvain Le Beux
JAAE: the Java abstract annotation editor
Ivan Habernal, Miloslav Konopík
How to judge reusability of existing speech corpora for target task by utilizing statistical multidimensional scaling
Goshu Nagino, Makoto Shozakai, Kiyohiro Shikano
Feasibility of constructing an expressive speech corpus from television soap opera dialogue
Peter Rutten
Collection of empirical data for standardization of generic vocabularies in speech driven ICT devices and services
Rosemary Orr, Bernat González i Llinares, Françoise Petersen, Helge Hüttenrauch, Martin Böcker, Michael Tate
Acoustic-phonetic features for refining the explicit speech segmentation
Antonio Marcos Selmini, Fábio Violaro
Text island spotting in large speech databases
B. Lecouteux, Georges Linarès, Frédéric Beaugendre, Pascal Nocera
People watcher: a game for eliciting human-transcribed data for automated directory assistance
Tim Paek, Yun-Cheng Ju, Christopher Meek
The effect of speech interface accuracy on driving performance
Andrew Kun, Tim Paek, Zeljko Medenica
Context constrained-generalized posterior probability for verifying phone transcriptions
Hua Zhang, Lijuan Wang, Frank K. Soong, Wenju Liu
Getting start with UTDrive: driver-behavior modeling and assessment of distraction for in-vehicle speech systems
Pongtep Angkititrakul, DongGu Kwak, SangJo Choi, JeongHee Kim, Anh PhucPhan, Amardeep Sathyanarayana, John H. L. Hansen
Relative evaluation of informativeness in machine generated summaries
BalaKrishna Kolluru, Yoshihiko Gotoh
A method for evaluating task-oriented spoken dialog translation systems based on communication efficiency
Toshiyuki Takezawa, Masahide Mizushima, Tohru Shimizu, Genichiro Kikui
Using eye movements for online evaluation of speech synthesis
Charlotte van Hooijdonk, Edwin Commandeur, Reinier Cozijn, Emiel Krahmer, Erwin Marsi
Sentence level intelligibility evaluation for Mandarin text-to-speech systems using semantically unpredictable sentences
Jian Li, Dmitry Sityaev, Jie Hao
N-best: the Northern- and Southern-Dutch benchmark evaluation of speech recognition technology
Judith Kessens, David A. van Leeuwen
A MAP based approach to adaptive speech intelligibility measurements
Trym Holter, Svein Sørsdal
Phone boundary detection using selective refinements and context-dependent acoustic features
Sirinoot Boonsuk, Proadpran Punyabukkana, Atiwong Suchato
Modeling context and language variation for non-native speech recognition
Tien-Ping Tan, Laurent Besacier
An evaluation of cross-language adaptation and native speech training for rapid HMM construction based on very limited training data
Xufang Zhao, Douglas O'Shaughnessy
Never-ending learning with dynamic hidden Markov network
Konstantin Markov, Satoshi Nakamura
Building multiple complementary systems using directed decision trees
C. Breslin, M. J. F. Gales
Automatic speech recognition framework for multilingual audio contents
Hiroaki Nanjo, Yuichi Oku, Takehiko Yoshimi
Combined acoustic and pronunciation modelling for non-native speech recognition
G. Bouselmi, Dominique Fohr, I. Illina
Automatic estimation of scaling factors among probabilistic models in speech recognition
Tadashi Emori, Yoshifumi Onishi, Koichi Shinoda
Memory efficient modeling of polyphone context with weighted finite-state transducers
Emilian Stoimenov, John McDonough
Extra large vocabulary continuous speech recognition algorithm based on information retrieval
Valeriy Pylypenko
PocketSUMMIT: small-footprint continuous speech recognition
I. Lee Hetherington
Development of preschool children subsystem for ASR and Q&A in a real-environment speech-oriented guidance task
Tobias Cincarek, Izumi Shindo, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano
A study on word detector design and knowledge-based pruning and rescoring
Chengyuan Ma, Chin-Hui Lee
Parameter tuning for fast speech recognition
Thomas Colthurst, Tresi Arvizo, Chia-Lin Kao, Owen Kimball, Stephen A. Lowe, David R. H. Miller, Jim Van Sciver
A computational model for unsupervised word discovery
Louis ten Bosch, Bert Cranen
Phoneme confusions in human and automatic speech recognition
Bernd T. Meyer, Matthias Wächter, Thomas Brand, Birger Kollmeier
Construction of spoken language model including fillers using filler prediction model
Kengo Ohta, Masatoshi Tsuchiya, Seiichi Nakagawa
Attention shift decoding for conversational speech recognition
Raghunandan Kumaran, Jeff Bilmes, Katrin Kirchhoff
A morpho-graphemic approach for the recognition of spontaneous speech in agglutinative languages - like Hungarian
Péter Mihajlik, Tibor Fegyó, Zoltán Tüske, Pavel Ircing
A semi-supervised learning approach for morpheme segmentation for an Arabic dialect
Mei Yang, Jing Zheng, Andreas Kathol
Accelerating the annotation of lexical data for less-resourced languages
Gerhard B. van Huyssteen, Martin J. Puttkammer
On web-based creation of speech resources for less-resourced languages
Christoph Draxler
Building an information retrieval system for Serbian - challenges and solutions
Miroslav Martinović, Srdjan Vesić, Goran Rakić
Bootstrapping morphological analysis of Gĩkũyũ using unsupervised maximum entropy learning
Guy De Pauw, Peter Waiganjo Wagacha
The voiceTRAN machine translation system
Jerneja Žganec Gros, Stanislav Gruden
MuLAS: a framework for automatically building multi-tier corpora
Sérgio Paulo, Luís C. Oliveira
Creating multimedia dictionaries of endangered languages using LEXUS
Jacquelijn Ringersma, Marc Kemps-Snijders
IceNLP: a natural language processing toolkit for Icelandic
Hrafn Loftsson, Eiríkur Rögnvaldsson
Phonotactic spoken language identification with limited training data
Marius Peche, Marelie Davel, Etienne Barnard
Automatic speech recognition for an under-resourced language - Amharic
Solomon Teferra Abate, Wolfgang Menzel
Information retrieval strategies for accessing African audio corpora
Abdillahi Nimaan, Pascal Nocera, Frédéric Béchet, Jean-François Bonastre
Morfessor and variKN machine learning tools for speech and language technology
Vesa Siivola, Mathias Creutz, Mikko Kurimo
Towards better language modeling for Thai LVCSR
Markpong Jongtaveesataporn, Issara Thienlikit, Chai Wutiwiwatchai, Sadaoki Furui
Generative and discriminative algorithms for spoken language understanding
Christian Raymond, Giuseppe Riccardi
A soft-clustering algorithm for automatic induction of semantic classes
Elias Iosif, Alexandros Potamianos
Classification of discourse functions of affirmative words in spoken dialogue
Agustín Gravano, Stefan Benus, Julia Hirschberg, Shira Mitchell, Ilia Vovsha
Conditional use of word lattices, confusion networks and 1-best string hypotheses in a sequential interpretation strategy
Bogdan Minescu, Géraldine Damnati, Frédéric Béchet, Renato De Mori
Speaker adaptation of language models for automatic dialog act segmentation of meetings
Jáchym Kolář, Yang Liu, Elizabeth Shriberg
Unsupervised categorisation approaches for technical support automated agents
Amparo Albalate, Dimitar Dimitrov, Roberto Pieraccini
Joint position-pitch extraction from multichannel audio
Michael Wohlmayr, Marián Képesi
Morphological pre-processing technique and its applications on speech signal
Hyun Soo Kim
A pitch extraction system based on phase locked loops and consensus decision
Patricia A. Pelle, Claudio F. Estienne
A robust multi-phase pitch-mark detection algorithm
Milan Legát, Jindřich Matoušek, Daniel Tihelka
Pitch estimation of noisy speech signals using empirical mode decomposition
Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu, Md. Kamrul Hasan
Evaluating two versions of the Momel pitch modelling algorithm on a corpus of read speech in Korean
Daniel Hirst, Hyongsil Cho, Sunhee Kim, Hyunji Yu
Hybrid electroglottograph and speech signal based algorithm for pitch marking
Hussein Hussein, Oliver Jokisch
A fine pitch model for speech
Jasha Droppo, Alex Acero
Pitch period estimation using multipulse model and wavelet transform
Prasanta Kumar Ghosh, Antonio Ortega, Shrikanth S. Narayanan
Combining rate and place information for robust pitch extraction
Martin Heckmann, Frank Joublin, Christian Goerick
Integrating pitch and localisation cues at a speech fragment level
Heidi Christensen, Ning Ma, Stuart N. Wrigley, Jon Barker
Speech fundamental frequency estimation using the alternate comb
Jean-Sylvain Liénard, François Signol, Claude Barras
Detecting pitch accent using pitch-corrected energy-based predictors
Andrew Rosenberg, Julia Hirschberg
Normalized two stage SVQ for minimum complexity wide-band LSF quantization
Saikat Chatterjee, T. V. Sreenivas
A novel 2kb/s waveform interpolation speech coder based on non-negative matrix factorization
Peng Zhang, Chang-chun Bao
A novel energy distribution comparison approach for robust speech spectrum vector quantization
Ahmed Ismail, Yasser Dakroury, Hazem Abbas
Novel low-band phase representation for low bit-rate speech coding
Ahmed Ismail, Yasser Dakroury, Hazem Abbas
Perceptual-based playout mechanisms for multi-stream voice over IP networks
Chun-Feng Wu, Cheng-Lung Lee, Wen-Whei Chang
Time-warping and re-phasing in packet loss concealment
Robert Zopf, Jes Thyssen, Juin-Hwey Chen
The harmonic model codec (HMC) framework for VoIP
Yannis Agiomyrgiannakis, Yannis Stylianou
Bit-erasure channel decoding for GMM-based multiple description coding
Yannis Agiomyrgiannakis, Yannis Stylianou
Degradation-classification assisted single-ended quality measurement of speech
Hua Yuan, Tiago H. Falk, Wai-Yip Chan
Concept and evaluation of a downward-compatible system for spatial teleconferencing using automatic speaker clustering
Alexander Raake, Sascha Spors, Jens Ahrens, Jitendra Ajmera
Speech quality estimation using packet loss effects in CELP-type speech coders
Min-Ki Lee, Kyung-Tae Kim, Hong-Goo Kang, Dae Hee Youn
An 8-32 kbit/s scalable wideband coder extended with MDCT-based bandwidth extension on top of a 6.8 kbit/s narrowband CELP coder
Masahiro Oshikiri, Hiroyuki Ehara, Toshiyuki Morii, Tomofumi Yamanashi, Kaoru Satoh, Koji Yoshida
Comparison of HMM and DTW methods in automatic recognition of pathological phoneme pronunciation
Robert Wielgat, Tomasz P. Zieliński, Paweł Świętojański, Piotr Żołądź, Daniel Król, Tomasz Woźniak, Stanisław Grabias
Unsupervised training with directed manual transcription for recognising Mandarin broadcast audio
K. Yu, M. J. F. Gales, P. C. Woodland
Context dependent syllable acoustic model for continuous Chinese speech recognition
Hao Wu, Xihong Wu
A sub-optimal Viterbi-like search for linear dynamic models classification
Dimitris Oikonomidis, Vassilis Diakoloukas, Vassilis Digalakis
On the equivalence of Gaussian HMM and Gaussian HMM-like hidden conditional random fields
Georg Heigold, Ralf Schlüter, Hermann Ney
Speeding-up neural network training using sentence and frame selection
Stefano Scanzio, Pietro Laface, Roberto Gemello, Franco Mana
Using a small development set to build a robust dialectal Chinese speech recognizer
Linquan Liu, Thomas Fang Zheng, Makoto Akabane, Ruxin Chen, Wenhu Wu
Unsupervised re-scoring of observation probability in Viterbi based on reinforcement learning by using confidence measure and HMM neighborhood
Carlos Molina, Nestor Becerra Yoma, Fernando Huenupán, Claudio Garreton
Optimization on decoding graphs by discriminative training
Shiuan-Sung Lin, François Yvon
Morphosyntactic processing of n-best lists for improved recognition and confidence measure computation
Stéphane Huet, Guillaume Gravier, Pascale Sébillot
How predictable is ASR confidence in dialog applications?
Xiang Li, Juan M. Huerta
Error detection in confusion network
Alexandre Allauzen
An approach to efficient generation of high-accuracy and compact error-corrective models for speech recognition
Takanobu Oba, Takaaki Hori, Atsushi Nakamura
Detection of out-of-vocabulary words in posterior based ASR
Hamed Ketabdar, Mirko Hannemann, Hynek Hermansky
Homograph ambiguity resolution in front-end design for Portuguese TTS systems
Daniela Braga, Luís Coelho, Fernando Gil V. Resende
New word acquisition using subword modeling
Ghinwa F. Choueiter, Stephanie Seneff, James Glass
Language identification of person names using CF-IOF based weighing function
Samuel Thomas, Ashish Verma
G2P conversion of names: what can we do (better)?
Henk van den Heuvel, Jean-Pierre Martens, Nanneke Konings
A learning method for Thai phonetization of English words
Ausdang Thangthai, Chai Wutiwiwatchai, Anocha Ragchatjaroen, Sittipong Saychum
Spontaneous speech synthesis by pronunciation variant selection - a comparison to natural speech
Steffen Werner, Rüdiger Hoffmann
A generic methodology of converting transliterated text to phonetic strings case study: Greeklish
Nikos Tsourakis, Vassilis Digalakis
Probabilistic deduction of symbol mappings for extension of lexicons
Rita Singh, Evandro B. Gouvêa, Bhiksha Raj
Use of syllable center detection for improved duration modeling in Chinese Mandarin connected digits recognition
Sergey Astrov, Joachim Hofer, Harald Höge
Using phonetic features in unsupervised word decompounding for ASR with application to a less-represented language
Thomas Pellegrini, Lori Lamel
Robust F0 modeling for Mandarin speech recognition in noise
Sheng Qiang, Yao Qian, Frank K. Soong, Congfu Xu
Word duration modeling for word graph rescoring in LVCSR
Dino Seppi, Daniele Falavigna, Georg Stemmer, Roberto Gretter
On automatic prominence detection for German
Fabio Tamburini, Petra Wagner
Prosody-enriched lattices for improved syllable recognition
Sankaranarayanan Ananthakrishnan, Shrikanth S. Narayanan
Exploiting phoneme similarities in hybrid HMM-ANN keyword spotting
Joel Pinto, Andrew Lovitt, Hynek Hermansky
Online vocabulary adaptation using limited adaptation data
C. E. Liu, K. Thambiratnam, F. Seide
An overview on automatic speech attribute transcription (ASAT)
Chin-Hui Lee, Mark A. Clements, Sorin Dusan, Eric Fosler-Lussier, Keith Johnson, Biing-Hwang Juang, Lawrence R. Rabiner
Detection-based ASR in the automatic speech attribute transcription project
Ilana Bromberg, Qian Qian, Jun Hou, Jinyu Li, Chengyuan Ma, Brett Matthews, Antonio Moreno-Daniel, Jeremy Morris, Sabato Marco Siniscalchi, Yu Tsao, Yu Wang
Attribute-based Mandarin speech recognition using conditional random fields
Chi-Yueh Lin, Hsiao-Chuan Wang
Comparing classifiers for pronunciation error detection
Helmer Strik, Khiet P. Truong, Febe de Wet, Catia Cucchiarini
Using prosodic and spectral characteristics for sleepiness detection
Jarek Krajewski, Bernd Kröger
Score fusion for articulatory feature detection
Brian M. Ore, Raymond E. Slyh
Improved location features for meeting speaker diarization
Scott Otterson
A robust stopping criterion for agglomerative hierarchical clustering in a speaker diarization system
Kyu J. Han, Shrikanth S. Narayanan
The blame game: performance analysis of speaker diarization system components
Marijn Huijbregts, Chuck Wooters
Trainable speaker diarization
Hagai Aronowitz
Improving speaker diarization for CHIL lecture meetings
Jing Huang, Etienne Marcheret, Karthik Visweswariah
Speaker diarization using normalized cross likelihood ratio
Viet-Bac Le, Odile Mella, Dominique Fohr
Tone production by the speakers of different age-and-gender groups
Wai-Sum Lee
Vowels and tones in infant directed speech: hyperarticulation for both, but different developmental patterns
Nan Xu, Denis Burnham, Christine Kitamura
Acquisition of vowel duration in children speaking American English
Eon-Suk Ko
F0 models show Chinese speakers of Japanese insert intonational boundaries and drop pitch
Hiroko Hirano, Keikichi Hirose, Goh Kawai, Wentao Gu, Nobuaki Minematsu
Formal modelling of L1 and L2 perceptual learning: computational linguistics versus machine learning
Paola Escudero, Jelle Kastelein, Klara Weiand, R. J. J. H. van Son
Kettle hinders cat, shadow does not hinder shed: activation of ‘almost embedded’ words in nonnative listening
Mirjam Broersma
An HMM-based speech synthesis system applied to German and its adaptation to a limited set of expressive football announcements
Sacha Krstulović, Anna Hunecke, Marc Schröder
Statistical vowelization of Arabic text for speech synthesis in speech-to-speech translation systems
Liang Gu, Wei Zhang, Lazkin Tahir, Yuqing Gao
A pair-based language model for the robust lexical analysis in Chinese text-to-speech synthesis
Wu Liu, Dezhi Huang, Yuan Dong, Xinnian Mao, Haila Wang
A trainable excitation model for HMM-based speech synthesis
R. Maia, Tomoki Toda, Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda
Cross-language phonemisation in German text-to-speech synthesis
Jochen Steigner, Marc Schröder
Preliminary experiments toward automatic generation of new TTS voices from recorded speech alone
Ryuki Tachibana, Tohru Nagano, Gakuto Kurata, Masafumi Nishimura, Noboru Babaguchi
Implementation and evaluation of an HMM-based Thai speech synthesis system
Suphattharachai Chomphan, Takao Kobayashi
Speech synthesis enhancement in noisy environments
Davide Bonardo, Enrico Zovato
Tagging syllable boundaries with joint n-gram models
Helmut Schmid, Bernd Möbius, Julia Weidenkaff
Hierarchical non-uniform unit selection based on prosodic structure
Jun Xu, Dezhi Huang, Yongxin Wang, Yuan Dong, Lianhong Cai, Haila Wang
Control of an articulatory speech synthesizer based on dynamic approximation of spatial articulatory targets
Peter Birkholz
A preselection method based on cost degradation from the optimal sequence for concatenative speech synthesis
Nobuyuki Nishizawa, Hisashi Kawai
Line cepstral quefrencies and their use for acoustic inventory coding
Guntram Strecha, Matthias Eichner, Rüdiger Hoffmann
Articulatory acoustic feature applications in speech synthesis
Peter Cahill, Daniel Aioanei, Julie Carson-Berndsen
Approaches for adaptive database reduction for text-to-speech synthesis
Aleksandra Krul, Géraldine Damnati, François Yvon, Cédric Boidin, Thierry Moudenc
Exploiting unlabeled internal data in conditional random fields to reduce word segmentation errors for Chinese texts
Richard Tzong-Han Tsai, Hsi-Chuan Hung, Hong-Jie Dai, Wen-Lian Hsu
On the role of spectral dynamics in unit selection speech synthesis
Barry Kirkpatrick, Darragh O'Brien, Ronán Scaife, Andrew Errity
uGloss: a framework for improving spoken language generation understandability
Brian Langner, Alan W. Black
Combination of LSF and pole based parameter interpolation for model-based diphone concatenation
Karl Schnell, Arild Lacroix
Automatic building of synthetic voices from large multi-paragraph speech databases
Kishore Prahallad, Arthur R. Toth, Alan W. Black
Automatic phonetic segmentation of Spanish emotional speech
A. Gallardo-Antolín, R. Barra, Marc Schröder, Sacha Krstulović, J. M. Montero
Iterative unit selection with unnatural prosody detection
Dacheng Lin, Yong Zhao, Frank K. Soong, Min Chu, Jieyu Zhao
F0 transformation within the voice conversion framework
Zdeněk Hanzlíček, Jindřich Matoušek
Weighted frequency warping for voice conversion
Daniel Erro, Asunción Moreno
Frame alignment method for cross-lingual voice conversion
Daniel Erro, Asunción Moreno
Voicing level control with application in voice conversion
Jani Nurminen, Jilei Tian, Victor Popa
New algorithm for LPC residual estimation from LSF vectors for a voice conversion system
Winston S. Percybrooks, Elliot Moore
Speaker adaptive training for one-to-many eigenvoice conversion based on Gaussian mixture model
Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano
Improving the phase vocoder approach to pitch-shifting
Petko N. Petkov, W. Bastiaan Kleijn
Comparing GMM-based speech transformation systems
Larbi Mesbahi, Vincent Barreaud, Olivier Boeffard
Improved HMM/SVM methods for automatic phoneme segmentation
Jen-Wei Kuo, Hung-Yi Lo, Hsin-Min Wang
Gaussian mixture optimization for HMM based on efficient cross-validation
Takahiro Shinozaki, Tatsuya Kawahara
Model-space MLLR for trajectory HMMs
Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda
In-context phone posteriors as complementary features for tandem ASR
Hamed Ketabdar, Hervé Bourlard
Phone-discriminating minimum classification error (p-MCE) training for phonetic recognition
Qian Qian, Xiaodong He, Li Deng
Improved acoustic modeling for transcribing Arabic broadcast data
Lori Lamel, Abdel. Messaoudi, Jean-Luc Gauvain
String and lattice based discriminative training for the corpus of spontaneous Japanese lecture transcription task
Erik McDermott, Atsushi Nakamura
Discriminative noise adaptive training approach for an environment migration
Byung-Ok Kang, Ho-Young Jung, Yun-Keun Lee
Word confusability - measuring hidden Markov model similarity
Jia-Yu Chen, Peder A. Olsen, John R. Hershey
Speech recognition with state-based nearest neighbour classifiers
Thomas Deselaers, Georg Heigold, Hermann Ney
HMM-based speech recognition using decision trees instead of GMMs
Remco Teunen, Masami Akamine
An improved method for unsupervised training of LVCSR systems
Christian Gollan, Stefan Hahn, Ralf Schlüter, Hermann Ney
A variational approach to robust maximum likelihood estimation for speech recognition
Mohamed Kamal Omar
Generating small, accurate acoustic models with a modified Bayesian information criterion
Kai Yu, Rob A. Rutenbar
Sparse Gaussian graphical models for speech recognition
Peter Bell, Simon King
An HMM acoustic model incorporating various additional knowledge sources
Sakriani Sakti, Konstantin Markov, Satoshi Nakamura
Comparison of subspace methods for Gaussian mixture models in speech recognition
Matti Varjokallio, Mikko Kurimo
SPICE: web-based tools for rapid language adaptation in speech processing systems
Tanja Schultz, Alan W. Black, Sameer Badaskar, Matthew Hornyak, John Kominek
Introduction to multilingual corpus-based concatenative speech synthesis
Filip Deprez, Jan Odijk, Jan De Moortel
Recognition of foreign names spoken by native speakers
Frederik Stouten, Jean-Pierre Martens
Language identification using several sources of information with a multiple-Gaussian classifier
R. Cordoba, L. F. D'Haro, F. Fernandez-Martinez, J. M. Montero, R. Barra
Dynamic language change in MIMUS
Carmen Del Solar, Guillermo Pérez, Eva Florencio, David Moral, Gabriel Amores, Pilar Manchón
The RWTH 2007 TC-STAR evaluation system for European English and Spanish
Jonas Lööf, Christian Gollan, Stefan Hahn, Georg Heigold, B. Hoffmeister, Christian Plahl, David Rybach, Ralf Schlüter, Hermann Ney
Using direction of arrival estimate and acoustic feature information in speaker diarization
Eugene Chin Wei Koh, Hanwu Sun, Tin Lay Nwe, Trung Hieu Nguyen, Bin Ma, Eng Siong Chng, Haizhou Li, Susanto Rahardja
Recovering punctuation marks for automatic speech recognition
Fernando Batista, Diamantino Caseiro, Nuno Mamede, Isabel Trancoso
Disfluency correction of spontaneous speech using conditional random fields with variable-length features
Jui-Feng Yeh, Chung-Hsien Wu, Wei-Yen Wu
Detection, diarization, and transcription of far-field lecture speech
Jing Huang, Etienne Marcheret, Karthik Visweswariah, Vit Libal, Gerasimos Potamianos
Speech-based annotation and retrieval of digital photographs
Timothy J. Hazen, Brennan Sherry, Mark Adler
Co-training using prosodic and lexical information for sentence segmentation
Umit Guz, Sébastien Cuendet, Dilek Hakkani-Tür, Gokhan Tur
Extracting true speaker identities from transcriptions
Yannick Estève, Sylvain Meignier, Paul Deléglise, Julie Mauclair
An improved speaker diarization system
Rong Fu, Ian D. Benest
The ISL 2007 English speech transcription system for European parliament speeches
Sebastian Stüker, Christian Fügen, Florian Kraft, Matthias Wölfel
Advances in Mandarin broadcast speech recognition
Mei-Yuh Hwang, Wen Wang, Xin Lei, Jing Zheng, Ozgur Cetin, Gang Peng
Automatic transcription for a web 2.0 service to search podcasts
Jun Ogata, Masataka Goto, Kouichirou Eto
A text-free approach to assessing nonnative intonation
Joseph Tepperman, Abe Kazemzadeh, Shrikanth S. Narayanan
Automatic generation of cloze items for prepositions
John Lee, Stephanie Seneff
Evaluating and optimizing Japanese tutor system featuring dynamic question generation and interactive guidance
Christopher Waple, Hongcui Wang, Tatsuya Kawahara, Yasushi Tsubota, Masatake Dantsuji
ASR-based pronunciation training: scoring accuracy and pedagogical effectiveness of a system for Dutch L2 learners
Catia Cucchiarini, Ambra Neri, Febe de Wet, Helmer Strik
A Bayesian network classifier for word-level reading assessment
Joseph Tepperman, Matthew Black, Patti Price, Sungbok Lee, Abe Kazemzadeh, Matteo Gerosa, Margaret Heritage, Abeer Alwan, Shrikanth S. Narayanan
Behavior models for learning and receptionist dialogs
Hartwig Holzapfel, Alex Waibel
Design of a rich multimodal interface for mobile spoken route guidance
Markku Turunen, Jaakko Hakulinen, Anssi Kainulainen, Aleksi Melto, Topi Hurtig
The virtual guide: a direction giving embodied conversational agent
Mariët Theune, Dennis Hofs, Marco van Kessel
Creating spoken dialogue characters from corpora without annotations
Sudeep Gandhe, David Traum
Complementarity and redundancy in multimodal user inputs with speech and pen gestures
Pui-Yu Hui, Zhengyu Zhou, Helen Meng
Children's convergence in referring expressions to graphical objects in a speech-enabled computer game
Linda Bell, Joakim Gustafson
An analysis of individual differences in the f0 contour and the duration of anger utterances at several degrees
Hiromi Kawatsu, Sumio Ohno
Acoustic features of anger utterances during natural dialog
Yoshiko Arimoto, Sumio Ohno, Hitoshi Iida
Comparing American and Palestinian perceptions of charisma using acoustic-prosodic and lexical analysis
Fadi Biadsy, Julia Hirschberg, Andrew Rosenberg, Wisam Dakka
Using neutral speech models for emotional speech analysis
Carlos Busso, Sungbok Lee, Shrikanth S. Narayanan
Emotion clustering using the results of subjective opinion tests for emotion recognition in infants' cries
N. Satoh, K. Yamauchi, S. Matsunaga, M. Yamashita, R. Nakagawa, K. Shinohara
On the limitations of voice conversion techniques in emotion identification tasks
R. Barra, J. M. Montero, J. Macias-Guarasa, J. Gutiérrez-Arriola, J. Ferreiros, J. M. Pardo
Use of lexical and affective prosodic cues to emotion by younger and older adults
Kate Dupuis, Kathleen Pichora-Fuller
Two-stream emotion recognition for call center monitoring
Purnima Gupta, Nitendra Rajput
The role of intonation and voice quality in the affective speech perception
Ioulia Grichkovtsova, Anne Lacheret, Michel Morel
Combining frame and turn-level information for robust recognition of emotions within speech
Bogdan Vlasenko, Björn Schuller, Andreas Wendemuth, Gerhard Rigoll
The relevance of feature type for the automatic classification of emotional user states: low level descriptors and functionals
Björn Schuller, Anton Batliner, Dino Seppi, Stefan Steidl, Thurid Vogt, Johannes Wagner, Laurence Devillers, Laurence Vidrascu, Noam Amir, Loic Kessous, Vered Aharonson
Automatic question detection: prosodic-lexical features and crosslingual experiments
Vũ Minh Quang, Laurent Besacier, Eric Castelli
Performance evaluation of HMM-based style classification with a small amount of training data
Makoto Tachibana, Keigo Kawashima, Junichi Yamagishi, Takao Kobayashi
Visualizing acoustic similarities between emotions in speech: an acoustic map of emotions
Khiet P. Truong, David A. van Leeuwen
Fusion of global statistical and segmental spectral features for speech emotion recognition
Hao Hu, Ming-Xing Xu, Wei Wu
Group delay features for emotion detection
Vidhyasaharan Sethu, Eliathamby Ambikairajah, Julien Epps
Combining short-term cepstral and long-term pitch features for automatic recognition of speaker age
Christian Müller, Felix Burkhardt
Detecting deception using critical segments
Frank Enos, Elizabeth Shriberg, Martin Graciarena, Julia Hirschberg, Andreas Stolcke
Style estimation of speech based on multiple regression hidden semi-Markov model
Takashi Nose, Yoichi Kato, Takao Kobayashi
Analysis and classification of speech mode: whispered through shouted
Chi Zhang, John H. L. Hansen
Perception and production of word-final alveolar stops by Brazilian Portuguese learners of English
Melissa Bettoni-Techio, Andréia S. Rauber, Rosana Denise Koerich
The relationship between the perception and production of English nasal codas by Brazilian learners of English
Denise Cristina Kluge, Andréia S. Rauber, Mara Silvia Reis, Ricardo A. Hoffmann Bion
CALL courseware for learning reactive tokens in face-to-face dialogs
Takafumi Utashiro, Goh Kawai
The developmental analysis of demonstrative expression skills utilizing a multimodal infant behavior corpus
Shinya Kiriyama, Ryo Tsuji, Tomohiko Kasami, Shogo Ishikawa, Naofumi Otani, Hiroaki Horiuchi, Yoichi Takebayashi, Shigeyoshi Kitazawa
Russian vowels system acoustic features development in ontogenesis
Elena E. Lyakso, Olga V. Frolova
The role of metrical stress in comprehension and production in Dutch children at-risk of dyslexia
Petra van Alphen, Elise de Bree, Paula Fikkert, Frank Wijnen
A statistical method of evaluating pronunciation proficiency for presentation in English
Seiichi Nakagawa, Kei Ohta
The intelligibility and its relations to acoustic characteristics of English /s/ and /esh/ produced by native speakers of Japanese
Akiyo Joto, Yoshiki Nagase, Seiya Funatsu
The limits of multidimensional category learning
Martijn Goudbeek, Daniel Swingley, Keith R. Kluender
Mobile adaptive CALL (MAC): a lightweight speech-based intervention for mobile language learners
Maria Uther, James Uther, Panos Athanasopoulos, Pushpendra Singh, Reiko Akahane-Yamada
English and French speakers' perception of voicing distinctions in non-native lateral consonant syllable onsets
Catherine T. Best, Pierre A. Hallé, Jennifer S. Pardo
Predicting the consequences of vocalizations in early infancy
Francisco Lacerda, Lisa Gustavsson
Learning tone distinctions for Mandarin Chinese
David Weenink, Guangqin Chen, Zongyan Chen, Stefan de Konink, Dennis Vierkant, Eveline van Hagen, R. J. J. H. van Son
Perception of disfluency: language differences and listener bias
Catherine Lai, Kyle Gorman, Jiahong Yuan, Mark Liberman
Design and characterization of the non-native military air traffic communications database (nnMATC)
Stephane Pigeon, Wade Shen, Aaron Lawson, David A. van Leeuwen
A comparison of speaker clustering and speech recognition techniques for air situational awareness
Wade Shen, Douglas Reynolds
Advanced front-end for robust speech recognition in extremely adverse environments
Dimitrios Dimitriadis, Jose C. Segura, Luz Garcia, Alexandros Potamianos, Petros Maragos, Vassilis Pitsikalis
Experiments on HIWIRE database using denoising and adaptation with a hybrid HMM-ANN model
Roberto Gemello, Franco Mana, Stefano Scanzio
Detection and removal of switching noise in push-to-talk and voice operated exchange communications systems
Brett Y. Smolenski
Evaluation of the combined use of MEMLIN and MLLR on the non-native adaptation task of HIWIRE project database
Luis Buera, Antonio Miguel, Óscar Saz, Eduardo Lleida, Alfonso Ortega
Improved machine translation of speech-to-text outputs
Daniel Déchelotte, Holger Schwenk, Gilles Adda, Jean-Luc Gauvain
Improvements in machine translation for English/Iraqi speech translation
S. Saleem, K. Subramanian, R. Prasad, David Stallard, Chia-Lin Kao, P. Natarajan, R. Suleiman
Improving speech translation with automatic boundary prediction
Evgeny Matusov, Dustin Hillard, Mathew Magimai-Doss, Dilek Hakkani-Tür, Mari Ostendorf, Hermann Ney
Punctuating confusion networks for speech translation
Roldano Cattoni, Nicola Bertoldi, Marcello Federico
Integration of ASR and machine translation models in a document translation task
Aarthi Reddy, Richard Rose, Alain Désilets
Bilingual LSA-based translation lexicon adaptation for spoken language translation
Yik-Cheung Tam, Tanja Schultz
The BBN 2007 displayless English/Iraqi speech-to-speech translation system
David Stallard, Fred Choi, Chia-Lin Kao, Kriste Krstovski, P. Natarajan, R. Prasad, S. Saleem, K. Subramanian
Context dependent word modeling for statistical machine translation using part-of-speech tags
Ruhi Sarikaya, Yonggang Deng, Yuqing Gao
Translating conversational speech to standard linguistic form
Darren Scott Appling, Nick Campbell
Using inter-lingual triggers for machine translation
Caroline Lavecchia, Kamel Smaïli, David Langlois, Jean-Paul Haton
The IRST English-Spanish translation system for European parliament speeches
Daniele Falavigna, Nicola Bertoldi, Fabio Brugnara, Roldano Cattoni, Mauro Cettolo, Boxing Chen, Marcello Federico, Diego Giuliani, Roberto Gretter, Deepa Gupta, Dino Seppi
The influence of utterance chunking on machine translation performance
Christian Fügen, Muntsin Kolss
IraqComm: a next generation translation system
Kristin Precoda, Jing Zheng, Dimitra Vergyri, Horacio Franco, Colleen Richey, Andreas Kathol, Sachin Kajarekar
Optimizing sentence segmentation for spoken language translation
Sharath Rao, Ian Lane, Tanja Schultz
A multitask learning perspective on acoustic-articulatory inversion
Korin Richmond
A comparison of acoustic features for articulatory inversion
Chao Qin, Miguel Á. Carreira-Perpiñán
Can unquantised articulatory feature continuums be modelled?
Odette Scharenborg, Vincent Wan
Estimation of place of articulation in stop consonants for visual feedback
Milind S. Shah, Prem C. Pandey
Compact representations of the articulatory-to-acoustic mapping
Blaise Potard, Yves Laprie
Articulatory feature classifiers trained on 2000 hours of telephone speech
Joe Frankel, Mathew Magimai-Doss, Simon King, Karen Livescu, Özgür Çetin
Objective analysis of the effect of memory inclusion on bandwidth extension of narrowband speech
Amr H. Nour-Eldin, Peter Kabal
Artificial bandwidth extension without side information for ITU-T G.729.1
Bernd Geiser, Hervé Taddei, Peter Vary
The effect of highband harmonic structure in the artificial bandwidth expansion of telephone speech
Hannu Pulakka, Paavo Alku, Laura Laaksonen, Päivi Valve
Artificial bandwidth extension for speech signals using speech recognition
Shingo Kuroiwa, Masashi Takashina, Satoru Tsuge, Ren Fuji
Voicing-based codebook in low-rate wideband CELP coding
Driss Guerchi, Tamer Rabie, Abdelrhani Louzi
Performance of speaker-dependent wideband speech coding
Ethan R. Duni, Bhaskar D. Rao
Speech recognition techniques for a sign language recognition system
Philippe Dreuw, David Rybach, Thomas Deselaers, Morteza Zahedi, Hermann Ney
Impact of various small sound source signals on voice conversion accuracy in speech communication aid for laryngectomees
Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano
Design and development of voice controlled aids for motor-handicapped persons
Petr Cerva, Jan Nouza
Management of static/dynamic properties in a multimodal interaction system
Kouichi Katsurada, Yuji Okuma, Makoto Yano, Yurie Iribe, Tsuneo Nitta
Evaluation of alternatives on speech to sign language translation
R. San-Segundo, A. Pérez, D. Ortiz, L. F. D'Haro, M. Inés Torres, F. Casacuberta
Speech based drug information system for aged and visually impaired persons
Géza Németh, Gábor Olaszy, Mátyás Bartalis, Géza Kiss, Csaba Zainkó, Péter Mihajlik
Automatic speech recognition with a cochlear implant front-end
Waldo Nogueira, Tamás Harczos, Bernd Edler, Jörn Ostermann, Andreas Büchner
Voice activated powered wheelchair with non-voice rejection algorithm
Soo-Young Suk, Hiroaki Kojima
Phonetic based sentence level rewriting of questions typed by dyslexic spellers in an information retrieval context
Laurianne Sitbon, Patrice Bellot, Philippe Blache
How to integrate speech-operated internet information dialogs into a car
André Berton, Peter Regel-Brietzmann, Hans-Ulrich Block, Stefanie Schachtl, Manfred Gehrke
Recent progress in the MIT spoken lecture processing project
James Glass, Timothy J. Hazen, Scott Cyphers, Igor Malioutov, David Huynh, Regina Barzilay
How to personalize speech applications for web-based information in a car
Philipp Fischer, Andreas Österle, André Berton, Peter Regel-Brietzmann
Topic estimation with domain extensibility for guiding user's out-of-grammar utterances in multi-domain spoken dialogue systems
Satoshi Ikeda, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Prosody change and response timing analysis in spontaneously spoken dialogs and their modeling in a spoken dialog system
Ryota Nishimura, Norihide Kitaoka, Seiichi Nakagawa
GEMSIS - a novel application of speech recognition to emergency and disaster medicine
Satoshi Tamura, Kunihiko Takamatsu, Shinji Ogura, Satoru Hayamizu
Application of speech technology in a home based assessment kiosk for early detection of Alzheimer's disease
Rachel Coulston, Esther Klabbers, Jacques de Villiers, John-Paul Hosom
Ontology-based multimodal high level fusion involving natural language analysis for aged people home care application
Olga Vybornova, Monica Gemo, Ronald Moncarey, Benoit Macq
Modeling the statistical behavior of lexical chains to capture word cohesiveness for automatic story segmentation
Shing-kai Chan, Lei Xie, Helen Meng
Cross-linguistic analysis of prosodic features for sentence segmentation
James G. Fung, Dilek Hakkani-Tür, Mathew Magimai-Doss, Elizabeth Shriberg, Sébastien Cuendet, Nikki Mirghafori
Varying input segmentation for story boundary detection in English, Arabic and Mandarin broadcast news
Andrew Rosenberg, Mehrbod Sharifi, Julia Hirschberg
Speaker role based structural classification of broadcast news stories
BalaKrishna Kolluru, Yoshihiko Gotoh
The influence of vowel quality features on peak alignment
Matthias Jilka, Bernd Möbius
Pitch accent versus lexical stress: quantifying acoustic measures related to the voice source
Yen-Liang Shue, Markus Iseli, Nanette Veilleux, Abeer Alwan
Prosody, emotions, and… ‘whatever’
Stefan Benus, Agustín Gravano, Julia Hirschberg
Modeling tones in Hakka on the basis of the command-response model
Wentao Gu, Rerrario Shui-Ching Ho, Tan Lee
Length, ordering preference and intonational phrasing: evidence from pauses
Gerrit Kentner
Alignment of the second low target in Dutch falling-rising pitch contours
Jörg Peters, Judith Hanssen, Carlos Gussenhoven
On filled-pauses and prolongations in European Portuguese
Helena Moniz, Ana Isabel Mata, M. Céu Viana
Dependence of tone perception on syllable perception
Michael Olsberg, Yi Xu, Jeremy Green
Testing the relevance of speech rate, pitch and a glottal chink for the perception of age in synthesized speech using formant synthesis
Ralf Winkler
Utterance-final glottalization as a cue for familiar speaker recognition
Tamás Böhm, Stefanie Shattuck-Hufnagel
A rule-based speech morphing for verifying an expressive speech perception model
Chun-Fang Huang, Masato Akagi
On the importance of pure prosody in the perception of speaker identity
Elina E. Helander, Jani Nurminen
Perceptual relevance of pitch contours of Mandarin tones and its efficacy in prosody generation of speech synthesis
Shi-Han Chen, Chih-Chung Kuo
The effect of filled pauses in a lecture speech on impressive evaluation of listeners
Hiromitsu Nishizaki, Mitsuhiro Sohmiya, Kenji Kobayashi, Yoshihiro Sekiguchi
Perceptual equivalence of approximated Cantonese tone contours
Yujia Li, Tan Lee
Audiovisual emotional speech of game playing children: effects of age and culture
Suleman Shahid, Emiel Krahmer, Marc Swerts
Machine learning for spoken dialogue systems
Oliver Lemon, Olivier Pietquin
Learning dialogue strategies for interactive database search
Verena Rieser, Oliver Lemon
Hierarchical dialogue optimization using semi-Markov decision processes
Heriberto Cuayáhuitl, Steve Renals, Oliver Lemon, Hiroshi Shimodaira
Knowledge consistent user simulations for dialog systems
Hua Ai, Diane J. Litman
Reducing recognition error rate based on context relationships among dialogue turns
Hsu-Chih Wu, Stephanie Seneff
Bayes risk-based optimization of dialogue management for document retrieval system with speech interface
Teruhisa Misu, Tatsuya Kawahara
Realisations and alternations in German /r/-realisation
Christiane Ulbrich, Horst Ulbrich
Singleton and geminate stops in Finnish - acoustic correlates
Christopher S. Doty, Kaori Idemaru, Susan G. Guion
Segment deletion in spontaneous speech: a corpus study using mixed effects models with crossed random effects
Christophe Van Bael, Harald Baayen, Helmer Strik
Categorical perception of Cantonese tones in context: a cross-linguistic study
Hongying Zheng, Peter W. M. Tsang, William S. -Y. Wang
A corpus study of the 3rd tone sandhi in standard Chinese
Yiya Chen, Jiahong Yuan
Age-related changes in fundamental frequency and formants: a longitudinal study of four speakers
Jonathan Harrington, Sallyanne Palethorpe, Catherine I. Watson
A comparative study on speech summarization of broadcast news and lecture speech
Jian Zhang, Ho Yin Chan, Pascale Fung, Lu Cao
Towards online speech summarization
Gabriel Murray, Steve Renals
System request detection in conversation based on acoustic and speaker alternation features
Tomoyuki Yamagata, Atsushi Sako, Tetsuya Takiguchi, Yasuo Ariki
Selecting on-topic sentences from natural language corpora
Michael Levit, Elizabeth Boschee, Marjorie Freedman
A semi-supervised method for efficient construction of statistical spoken language understanding resources
Seokhwan Kim, Minwoo Jeong, Gary Geunbae Lee
Automatic extraction of cue phrases for important sentences in lecture speech and automatic lecture speech summarization
Yasuhisa Fujii, Norihide Kitaoka, Seiichi Nakagawa
A unified probabilistic generative framework for extractive spoken document summarization
Yi-Ting Chen, Hsuan-Sheng Chiu, Hsin-Min Wang, Berlin Chen
Generic class-based statistical language models for robust speech understanding in directed dialog applications
Matthieu Hébert
Robust location understanding in spoken dialog systems using intersections
Michael L. Seltzer, Yun-Cheng Ju, Ivan Tashev, Alex Acero
Speech-nonspeech discrimination using the information bottleneck method and spectro-temporal modulation index
Maria Markaki, Michael Wohlmayr, Yannis Stylianou
A uniformly most powerful test for statistical model-based voice activity detection
Keun Won Jang, Dong Kook Kim, Joon-Hyuk Chang
Direct optimisation of a multilayer perceptron for the estimation of cepstral mean and variance statistics
John Dines, Jithendra Vepa
Filtering the unknown: speech activity detection in heterogeneous video collections
Marijn Huijbregts, Chuck Wooters, Roeland Ordelman
Environmentally aware voice activity detector
Abhijeet Sangwan, Nitish Krishnamurthy, John H. L. Hansen
Noise robust voice activity detection based on switching Kalman filter
Masakiyo Fujimoto, Kentaro Ishizuka
Voice activity detection based on support vector machine using effective feature vectors
Q-Haing Jo, Yun-Sik Park, Kye-Hwan Lee, Ji-Hyun Song, Joon-Hyuk Chang
Voice activity detection in degraded speech using excitation source information
K Sri Rama Murty, B Yegnanarayana, S Guruprasad
Evaluation of real-time voice activity detection based on high order statistics
David Cournapeau, Tatsuya Kawahara
Robust voice activity detection based on adaptive sub-band energy sequence analysis and harmonic detection
Yanmeng Guo, Qian Qian, Yonghong Yan
The influence of speech activity detection and overlap on speaker diarization for meeting room recordings
Corinne Fredouille, Nicholas Evans
Voice activity detection using the phase vector in microphone array
Gibak Kim, Nam Ik Cho
Adaptive weighting of microphone arrays for distant-talking F0 and voiced/unvoiced estimation
Federico Flego, Christian Zieger, Maurizio Omologo
Robust and high-resolution voiced/unvoiced classification in noisy speech using a signal smoothness criterion
A. Sreenivasa Murthy, S. Chandra Sekhar, T. V. Sreenivas
Audio classification using extended Baum-Welch transformations
Tara N. Sainath, Victor Zue, Dimitri Kanevsky
Automatic laughter detection using neural networks
Mary Tai Knox, Nikki Mirghafori
Automatic acoustic segmentation for speech recognition on broadcast recordings
Gang Peng, Mei-Yuh Hwang, Mari Ostendorf
Articulatory synthesis of singing
Peter Birkholz
Vocal conversion from speaking voice to singing voice using STRAIGHT
Takeshi Saitou, Masataka Goto, Masashi Unoki, Masato Akagi
Speech to chant transformation with the phase vocoder
Axel Roebel, Joshua Fineberg
VOCALOID - commercial singing synthesizer based on sample concatenation
Hideki Kenmochi, Hayato Ohshita
RAMCESS/HandSketch: a multi-representation framework for realtime and expressive singing synthesis
Nicolas D’Alessandro, Thierry Dutoit
Formant-based synthesis of singing
Sten Ternström, Johan Sundberg
ELAN: a free and open-source multimedia annotation tool
Han Sloetjes, Albert Russel, Alexander Klassmann
SpeechIndexer in action: managing endangered Formosan languages
Jozsef Szakos, Ulrike Glavitsch
A portable record player for wax cylinders using a laser-beam reflection method
Tohru Ifukube, Yasuyuki Shimizu