Interspeech 2004

Jeju Island, Korea
4-8 October 2004

General Chair: Dae Hee Youn
doi: 10.21437/Interspeech.2004
ISSN: 2958-1796

Speech Recognition - Adaptation


Stochastic gradient adaptation of front-end parameters
Sreeram Balakrishnan, Karthik Visweswariah, Vaibhava Goel

Maximum-likelihood adaptation of semi-continuous HMMs by latent variable decomposition of state distributions
Antoine Raux, Rita Singh

Transformation and combination of hidden Markov models for speaker selection training
Chao Huang, Tao Chen, Eric Chang

Improving eigenspace-based MLLR adaptation by kernel PCA
Brian Mak, Roger Hsiao

Rapid acoustic model development using Gaussian mixture clustering and language adaptation
Nikos Chatzichrisafis, Vasilios Digalakis, Vasilios Diakoloukas, Costas Harizakis

Adaptation of front end parameters in a speech recognizer
Karthik Visweswariah, Ramesh Gopinath

Speaker normalization through constrained MLLR based transforms
Diego Giuliani, Matteo Gerosa, Fabio Brugnara

Multi-layer structure MLLR adaptation algorithm with subspace regression classes and tying
Xiangyu Mu, Shuwu Zhang, Bo Xu

Adaptation in the pronunciation space for non-native speech recognition
Georg Stemmer, Stefan Steidl, Christian Hacker, Elmar Nöth

Robust ASR model adaptation by feature-based statistical data mapping
Xuechuan Wang, Douglas O'Shaughnessy

A novel target-driven generalized JMAP adaptation algorithm
Zhaobing Han, Shuwu Zhang, Bo Xu

Speedup of kernel eigenvoice speaker adaptation by embedded kernel PCA
Brian Mak, Simon Ho, James T. Kwok

Maximum a posteriori eigenvoice speaker adaptation for Korean connected digit recognition
Hyung Bae Jeon, Dong Kook Kim

Vocal tract normalization based on spectral warping
Wei Wang, Stephen Zahorian

Acoustic model adaptation for coded speech using synthetic speech
Koji Tanaka, Fuji Ren, Shingo Kuroiwa, Satoru Tsuge

Speaker adaptation method for CALL system using bilingual speakers' utterances
Motoyuki Suzuki, Hirokazu Ogasawara, Akinori Ito, Yuichi Ohkawa, Shozo Makino

Acoustic model adaptation based on coarse/fine training of transfer vectors and its application to a speaker adaptation task
Shinji Watanabe

Speaker clustering of speech utterances using a voice characteristic reference space
Wei-Ho Tsai, Shih-Sian Cheng, Hsin-Min Wang

Performance improvement of connected digit recognition using unsupervised fast speaker adaptation
Young Kuk Kim, Hwa Jeon Song, Hyung Soon Kim

Simultaneous estimation of weights of eigenvoices and bias compensation vector for rapid speaker adaptation
Hyung Soon Kim, Hwa Jeon Song

Speaker dependent model order selection of spectral envelopes
Matthias Wölfel

Methods for task adaptation of acoustic models with limited transcribed in-domain data
Enrico Bocchieri, Michael Riley, Murat Saraclar

Unsupervised topic adaptation for lecture speech retrieval
Atsushi Fujii, Tetsuya Ishikawa, Katsunobu Itou, Tomoyosi Akiba

Mean and covariance adaptation based on minimum classification error linear regression for continuous density HMMs
Haibin Liu, Zhenyang Wu

Design of ready-made acoustic model library by two-dimensional visualization of acoustic space
Goshu Nagino, Makoto Shozakai


Spoken Language Identification, Translation and Retrieval I


Language recognition using phone lattices
Jean-Luc Gauvain, Abdel Messaoudi, Holger Schwenk

ACCDIST: a metric for comparing speakers' accents
Mark Huckvale

Aspects of named entity processing
Michael Levit, Allen Gorin, Patrick Haffner, Hiyan Alshawi, Elmar Nöth

Finite-state-based and phrase-based statistical machine translation
Josep M. Crego, José B. Marino, Adria de Gispert

Using word lattice information for a tighter coupling in speech translation systems
Tanja Schultz, Szu-Chen Jou, Stephan Vogel, Shirin Saleem

Confirmation strategy for document retrieval systems with spoken dialog interface
Teruhisa Misu, Tatsuya Kawahara, Kazunori Komatani

Multilayer subword units for open-vocabulary spoken document retrieval
Shi-Wook Lee, Kazuyo Tanaka, Yoshiaki Itoh

An efficient partial matching algorithm toward speech retrieval by speech
Yoshiaki Itoh, Kazuyo Tanaka, Shi-wook Lee

Language detection by neural discrimination
Celestin Sedogbo, Sebastien Herry, Bruno Gas, Jean Luc Zarader

Language identification techniques based on full recognition in an air traffic control task
Ricardo de Córdoba, Javier Ferreiros, Valentin Sama, Javier Macias-Guarasa, Luis F. D'Haro, Fernando Fernandez

Dialect analysis and modeling for automatic classification
John H. L. Hansen, Umit Yapanel, Rongqing Huang, Ayako Ikeno

Rhythm in read British English: interdialect variability
Emmanuel Ferragne, Francois Pellegrino

A grammar-based Chinese to English speech translation system for portable devices
Pascale Fung, Yi Liu, Yongsheng Yang, Yihai Shen, Dekai Wu

Cost-sensitive call classification
Gokhan Tur

An evaluation of a spoken document retrieval baseline system in Finnish
Mikko Kurimo, Ville Turunen, Inger Ekman

Discriminative training of naive Bayes classifiers for natural language call routing
Hui Jiang, Pengfei Liu, Imed Zitouni

Phonetic confusion based document expansion for spoken document retrieval
Nicolas Moreau, Hyoung-Gook Kim, Thomas Sikora

Hybrid named entity recognition for question-answering system
Euisok Chung, Soojong Lim, Yi-Gyu Hwang, Myung-Gil Jang

An online audio indexing system
Jitendra Ajmera, Iain McCowan, Hervé Bourlard

Histogram normalisation and the recognition of names and ontology words in the MUMIS project
Eric Sanders, Febe de Wet

Improving the topic indexation and segmentation modules of a media watch system
Rui Amaral, Isabel Trancoso

Speech timing and rhythmic structure in Arabic dialects: a comparison of two approaches
Melissa Barkat-Defradas, Rym Hamdi, Emmanuel Ferragne, Francois Pellegrino

METRIC-SEQDAC: a hybrid approach for audio segmentation
Hsin-min Wang, Shih-sian Cheng

Statistical Chinese spoken document retrieval using latent topical information
Jen-Wei Kuo, Yao-Min Huang, Berlin Chen, Hsin-min Wang

Keyword recognition and extraction by multiple-LVCSRs with 60,000 words in speech-driven WEB retrieval task
Matsushita Masahiko, Hiromitsu Nishizaki, Seiichi Nakagawa, Takehito Utsuro

Improved spoken language translation using n-best speech recognition hypotheses
Ruiqiang Zhang, Genichiro Kikui, Hirofumi Yamamoto, Frank K. Soong, Taro Watanabe, Eiichiro Sumita, Wai-Kit Lo

Automatic language identification using discrete hidden Markov model
Kakeung Wong, Man-hung Siu

Two-way speech-to-speech translation on handheld devices
Bowen Zhou, Daniel Dechelotte, Yuqing Gao

HLT modules scalability within the NESPOLE! project
Hervé Blanchon


Linguistics, Phonology, and Phonetics


Correlation between VOT and F0 in the perception of Korean stops and affricates
Midam Kim

The development of anticipatory labial coarticulation in French: a pioneering study
Aude Noiray, Lucie Menard, Marie-Agnes Cathiard, Christian Abry, Christophe Savariaux

Speech recognition, syllabification and statistical phonetics
Melvyn John Hunt

Data-driven approaches for automatic detection of syllable boundaries
Jilei Tian

Phonemic repertoire and similarity within the vocabulary
Anne Cutler, Dennis Norris, Nuria Sebastian-Galles

Bootstrapping phonetic lexicons for new languages
Sameer Maskey, Alan Black, Laura Tomokiya

Lexical representation of non-native phonemes
Mirjam Broersma, K. Marieke Kolkman

A comparative study on the production of inter-stress intervals of English speech by English native speakers and Korean speakers
Jong-Pyo Lee, Tae-Yeoub Jang

Articulatory correlates of voice qualities of good guys and bad guys in Japanese anime: an MRI study
Emi Zuiki Murano, Mihoko Teshigawara

Effects of phonetic contexts on the duration of phonetic segments in fluent read speech
Sorin Dusan

A study on nasal coda loss in continuous speech
Qiang Fang

An improved pair-wise variability index for comparing the timing characteristics of speech
Hua-Li Jian

An acoustic study of speech rhythm in Taiwan English
Hua-Li Jian

Language specific phonetic rules: evidence from domain-initial strengthening
Sung-A Kim

Spectral characteristics of the release bursts in Korean alveolar stops
Hansang Park

Frequency effects on vowel reduction in three typologically different languages (Dutch, Finnish, Russian)
Rob Van Son, Olga Bolotova, Louis C. W. Pols, Mietta Lennes

Assessment of non-native phones in anglicisms by German listeners
Julia Abresch, Stefan Breuer

Phonology of exceptions for Korean grapheme-to-phoneme conversion
Sunhee Kim

Acoustic and prosodic analysis of Japanese vowel-vowel hiatus with laryngeal effect
Kitazawa Shigeyoshi, Shinya Kiriyama

A cross-linguistic acoustic comparison of unreleased word-final stops: Korean and Thai
Kimiko Tsukada

Acoustic correlates of phrase-internal lexical boundaries in Dutch
Taehong Cho, Elizabeth K. Johnson

Phonotactics vs. phonetic cues in native and non-native listening: Dutch and Korean listeners' perception of Dutch and English
Taehong Cho, James M. McQueen

Comparing intonation of two varieties of French using normalized F0 values
Svetlana Kaminskaia, Francois Poire

Phonetic realization of the suffix-suppressed accentual phrase in Korean
Mira Oh, Kee-Ho Kim

Spectral moment vs. Bark cepstral analysis of children's word-initial voiceless stops
H. Timothy Bunnell, James Polikoff, Jane McNicholas

Pronunciation assessment based upon the compatibility between a learner's pronunciation structure and the target language's lexical structure
Nobuaki Minematsu

Spread of high tone in Akita Japanese
Kenji Yoshida



Robust Speech Recognition on AURORA


Evaluation of universal compensation on Aurora 2 and 3 and beyond
Ming Ji, Baochun Hou

PROSPECT features and their application to missing data techniques for robust speech recognition
Hugo Van hamme

Accounting for the uncertainty of speech estimates in the context of model-based feature enhancement
Hugo Van hamme, Patrick Wambacq, Veronique Stouten

Applying the Aurora feature extraction schemes to a phoneme based recognition task
Hans-Guenter Hirsch, Harald Finster

Evaluation of tree-structured piecewise linear transformation-based noise adaptation on AURORA2 database
Zhipeng Zhang, Tomoyuki Ohya, Sadaoki Furui

Online minimum mean square error filtering of noisy cepstral coefficients using a sequential EM algorithm
Tor Andre Myrvoll, Satoshi Nakamura

HMM-based feature compensation method: an evaluation using the AURORA2
Akira Sasou, Kazuyo Tanaka, Satoshi Nakamura, Futoshi Asano

Noise adaptation for robust AURORA 2 noisy digit recognition using statistical data mapping
Xuechuan Wang, Douglas O'Shaughnessy

MFCC computation from magnitude spectrum of higher lag autocorrelation coefficients for robust speech recognition
Benjamin J. Shannon, Kuldip K. Paliwal

A noise-robust feature extraction method based on pitch-synchronous ZCPA for ASR
Ghulam Muhammad, Takashi Fukuda, Junsei Horikawa, Tsuneo Nitta

Including uncertainty of speech observations in robust speech recognition
José Carlos Segura, Angel De la Torre, Javier Ramirez, Antonio J. Rubio, Carmen Benitez

Integration of n-best recognition results obtained by multiple noise reduction algorithms
Takeshi Yamada, Jiro Okada, Nobuhiko Kitawaki

Revisiting some model-based and data-driven denoising algorithms in Aurora 2 context
Panji Setiawan, Sorel Stan, Tim Fingscheidt

Exploring high-performance speech recognition in noisy environments using high-order Taylor series expansion
Guo-Hong Ding, Bo Xu

A robust training algorithm based on neighborhood information
Wing-Hei Au, Man-Hung Siu

In-phase feature induction: an effective compensation technique for robust speech recognition
Siu Wa Lee, Pak Chung Ching

Improved performance of Aurora 4 using HTK and unsupervised MLLR adaptation
Siu-Kei Au Yeung, Man-Hung Siu

A new feature extraction front-end for robust speech recognition using progressive histogram equalization and multi-eigenvector temporal filtering
Shang-nien Tsai, Lin-shan Lee


Spoken / Multimodal Dialogue System


Tight coupling of speech recognition and dialog management - dialog-context dependent grammar weighting for speech recognition
Christian Fügen, Hartwig Holzapfel, Alex Waibel

Noise robust real world spoken dialogue system using GMM based rejection of unintended inputs
Akinobu Lee, Keisuke Nakamura, Ryuichi Nisimura, Hiroshi Saruwatari, Kiyohiro Shikano

Speech interface for name input based on combination of recognition methods using syllable-based n-gram and word dictionary
Hironori Oshikawa, Norihide Kitaoka, Seiichi Nakagawa

Constrained minimization technique for topic identification using discriminative training and support vector machines
Imed Zitouni, Minkyu Lee, Hui Jiang

Characterizing task-oriented dialog using a simulated ASR channel
Jason D. Williams, Steve Young

A spoken dialog system based on automatic grammar generation and template-based weighting for autonomous mobile robots
Takashi Konashi, Motoyuki Suzuki, Akinori Ito, Shozo Makino

Noise adaptive spoken dialog system based on selection of multiple dialog strategies
Akinori Ito, Takanobu Oba, Takashi Konashi, Motoyuki Suzuki, Shozo Makino

Flexible dialogue management using distributed and dynamic dialogue control
Mikko Hartikainen, Markku Turunen, Jaakko Hakulinen, Esa-Pekka Salonen, J. Adam Funk

Contextual revision in information seeking conversation systems
Keith Houck

Cross domain dialogue modelling: an object-based approach
Ian O'Neill, Philip Hanna, Xingkun Liu, Michael McTear

A comparison of confirmation styles for error handling in a speech dialog system
Hirohiko Sagawa, Teruko Mitamura, Eric Nyberg

Using computer simulation to compare two models of mixed-initiative
Fan Yang, Peter A. Heeman

Towards understanding mixed-initiative in task-oriented dialogues
Fan Yang, Peter A. Heeman, Kristy Hollingshead

Spokenquery: an alternate approach to choosing items with speech
Peter Wolf, Joseph Woelfel, Jan Van Gemert, Bhiksha Raj, David Wong

Mining customer care dialogs for "daily news"
Shona Douglas, Deepak Agarwal, Tirso Alonso, Robert Bell, Mazin Rahim, Deborah F. Swayne, Chris Volinsky

Higgins - a spoken dialogue system for investigating error handling techniques
Jens Edlund, Gabriel Skantze, Rolf Carlson

A conversational dialogue system for cognitively overloaded users
Fuliang Weng, Lawrence Cavedon, Badri Raghunathan, Danilo Mirkovic, Hua Cheng, Hauke Schmidt, Harry Bratt, Rohit Mishra, Stanley Peters, Sandra Upson, Elizabeth Shriberg, Carsten Bergmann, Lin Zhao

Modeling generic dialog applications for embedded systems
Gerhard Hanrieder, Stefan W. Hamerich

A framework for dialogue data collection with a simulated ASR channel
Matthew N. Stuttle, Jason D. Williams, Steve Young

A multi-layer conversation management approach for information seeking applications
Shimei Pan

A universal speech interface for appliances
Thomas Kevin Harris, Roni Rosenfeld

Speech understanding, dialogue management and response generation in corpus-based spoken dialogue system
Keita Hayashi, Yuki Irie, Yukiko Yamaguchi, Shigeki Matsubara, Nobuo Kawaguchi

Implementation of dialog applications in an open-source VoiceXML platform
Fernando Fernandez, Valentin Sama, Luis F. D'Haro, Ruben San-Segundo, Ricardo de Córdoba, Juan Manuel Montero

Fuzzy logic decision fusion in a multimodal biometric system
Chun Wai Lau, Bin Ma, Helen Mei-Ling Meng, Yiu-Sang Moon, Yeung Yam

A state model for the realization of visual perceptive feedback in SmartKom
Peter Poller, Norbert Reithinger

A vector-based method for efficiently representing multivariate environmental information
Akemi Iida, Yoshito Ueno, Ryohei Matsuura, Kiyoaki Aikawa

A multi-modal dialog system for a mobile robot
Ioannis Toptsis, Shuyin Li, Britta Wrede, Gernot A. Fink

Structured interview-based evaluation of spoken multimodal conversation with H.C. Andersen
Niels Ole Bernsen, Laila Dybkjaer






Speech Recognition - Large Vocabulary


The automatic news transcription system: ANTS, some real time experiments
Dominique Fohr, Odile Mella, Christophe Cerisara, Irina Illina

Use of metadata to improve recognition of spontaneous speech and named entities
Bhuvana Ramabhadran, Olivier Siohan, Geoffrey Zweig

Duration modeling techniques for continuous speech recognition
Janne Pylkkonen, Mikko Kurimo

Large vocabulary continuous speech recognition for Estonian using morpheme classes
Tanel Alumae

Combining agglomerative and tree-based state clustering for high accuracy acoustic modeling
Zhaobing Han, Shuwu Zhang, Bo Xu

Parallel tone score association method for tone language speech recognition
William S-Y. Wang, Gang Peng

Effective acoustic modeling for rate-of-speech variation in large vocabulary conversational speech recognition
Jing Zheng, Horacio Franco, Andreas Stolcke

Automatic transcription of continuous speech using unsupervised and incremental training
G.L. Sarada Ghadiyaram, N. Hemalatha Nagarajan, T. Nagarajan Thangavelu, Hema A. Murthy

Very large vocabulary speech recognition system for automatic transcription of Czech broadcast programs
Jan Nouza, Dana Nejedlova, Jindrich Zdansky, Jan Kolorenc

Speech recognition error analysis on the English MALACH corpus
Olivier Siohan, Bhuvana Ramabhadran, Geoffrey Zweig

A frame level boosting training scheme for acoustic modeling
Rong Zhang, Alexander Rudnicky

Optimizing boosting with discriminative criteria
Rong Zhang, Alexander Rudnicky

Restructuring HMM states for speaker adaptation in Mandarin speech recognition
Xianghua Xu, Qiang Guo, Jie Zhu

A discriminative locally weighted distance measure for speaker independent template based speech recognition
Mike Matton, Mathias De Wachter, Dirk Van Compernolle, Ronald Cools

Deterministic annealing EM algorithm in parameter estimation for acoustic model
Yohei Itaya, Heiga Zen, Yoshihiko Nankaku, Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura

TRAP based features for LVCSR of meeting data
Frantisek Grezl, Martin Karafiat, Jan Cernocky

Optimal acoustic and language model weights for minimizing word verification errors
Frank K. Soong, Wai Kit Lo, Satoshi Nakamura

Structuring of baseball live games based on speech recognition using task dependant knowledge
Atsushi Sako, Yasuo Ariki

A two-level schema for detecting recognition errors
Zhengyu Zhou, Helen Meng

Large vocabulary continuous speech recognition based on cross-morpheme phonetic information
In-Jeong Choi, Nam-Hoon Kim, Su Youn Yoon

Automatic phonetic base form generation based on maximum context tree
Changxue Ma

Dictionary refinements based on phonetic consensus and non-uniform pronunciation reduction
Gustavo Hernandez-Abrego, Lex Olorenshaw, Raquel Tato, Thomas Schaaf

Transcription of Arabic broadcast news
Abdel Messaoudi, Lori Lamel, Jean-Luc Gauvain

Spontaneous speech recognition using a massively parallel decoder
Takahiro Shinozaki, Sadaoki Furui

Issues in meeting transcription - the ISL meeting transcription system
Tanja Schultz, Qin Jin, Kornel Laskowski, Yue Pan, Florian Metze, Christian Fügen

Multi-pass ASR using vocabulary expansion
Katsutoshi Ohtsuki, Nobuaki Hiroshima, Shoichi Matsunaga, Yoshihiko Hayashi

Pinched lattice minimum Bayes risk discriminative training for large vocabulary continuous speech recognition
Vlasios Doumpiotis, William Byrne

Task-specific minimum Bayes-risk decoding using learned edit distance
Izhak Shafran, William Byrne

Apply n-best list re-ranking to acoustic model combinations of boosting training
Rong Zhang, Alexander Rudnicky

Using VTLN for broadcast news transcription
D. Y. Kim, S. Umesh, M. J. F. Gales, T. Hain, P. C. Woodland

From switchboard to meetings: development of the 2004 ICSI-SRI-UW meeting recognition system
Andreas Stolcke, Chuck Wooters, Ivan Bulyko, Martin Graciarena, Scott Otterson, Barbara Peskin, Mari Ostendorf, Dave Gelbart, Nikki Mirghafori, Tuomo Pirinen

An efficient repair procedure for quick transcriptions
Anand Venkataraman, Andreas Stolcke, Wen Wang, Dimitra Vergyri, Jing Zheng, Venkata Ramana Rao Gadde

Tone information as a confidence measure for improving Cantonese LVCSR
Yao Qian, Tan Lee, Frank K. Soong


Speech Science


Temporal variables in parkinsonian speech
Danielle Duez

Speaker adaptation of a three-dimensional tongue model
Olov Engwall

Perception of non-native phonemes in noise
Nicole Cooper, Anne Cutler

Intelligibility of degraded speech from smeared STRAIGHT spectrum
Hideki Kawahara, Hideki Banno, Toshio Irino, Jiang Jin

Sound source localization based on zero-crossing peak-amplitude coding
Young-Ik Kim, Rhee Man Kil

Adult and infant sensitivity to phonotactic features in spoken Japanese
Kajikawa Sachiyo, Fais Laurel, Amano Shigeaki, Werker Janet

Revisiting dysarthria assessment intelligibility metrics
Phil Green, James Carmichael

The effect of intonation on perception of Cantonese lexical tones
Valter Ciocca, Tara L. Whitehill, Joan K.-Y. Ma

Maximum short quantity in Japanese and Finnish in two perception tests with F0 and dB variants
Toshiko Isei-Jaakkola

Evaluation of an inverse filtering technique using physical modeling of voice production
Paavo Alku, Matti Airas, Brad Story

Positional and phonotactic effects on the realization of Taiwan Mandarin tone 2
Hui-ju Hsu, Janice Fon

Speech production based on lossy tube models: unit concatenation and sound transitions
Karl Schnell, Arild Lacroix

Modelling and ranking of differences across formants of British, Australian and American accents
Qin Yan, Saeed Vaseghi, Dimitrios Rentzos, Ching-Hsiang Ho

An experimental method for measuring transfer functions of acoustic tubes
Tatsuya Kitamura, Satoru Fujita, Kiyoshi Honda, Hironori Nishimoto

Estimation of the vocal tract spectrum from articulatory movements using phoneme-dependent neural networks
Takuya Tsuji, Tokihiko Kaburagi, Kohei Wakamiya, Jiji Kim

Computation of the acoustic characteristics of vocal-tract models with geometrical perturbation
Kunitoshi Motoki, Hiroki Matsuzaki

Analysis of hypernasality by synthesis
P. Vijayalakshmi, M. Ramasubba Reddy

Adaptive long-term predictive analysis of disordered speech
Abdellah Kacha, Francis Grenez, Frédéric Bettens, Jean Schoentgen

Phoneme restoration in degraded speech communication
Slobodan Jovicic, Sandra Antesevic, Zoran Saric

Automatic detection of vocal fold paralysis and edema
Maria Marinaki, Constantine Kotropoulos, Ioannis Pitas, Nikolaos Maglaveras



Spoken and Natural Language Understanding


Estimation of semantic confidences on lattice hierarchies
Robert Lieb, Tibor Fabian, Guenther Ruske, Matthias Thomae

Learning subject drift for topic tracking
Fumiyo Fukumoto, Yoshimi Suzuki

The ICSI-SRI-UW metadata extraction system
Elizabeth Shriberg, Andreas Stolcke, Dustin Hillard, Mari Ostendorf, Barbara Peskin, Mary Harper, Yang Liu

Automatic detection of contrast for speech understanding
Mark Hasegawa-Johnson, Stephen Levinson, Tong Zhang

Integrating layer concept information into n-gram modeling for spoken language understanding
Nick Jui-Chang Wang, Jia-Lin Shen, Ching-Ho Tsai

A robust understanding model for spoken dialogues
Junyan Chen, Ji Wu, Zuoying Wang

Belief-based nonlinear rescoring in Thai speech understanding
Chai Wutiwiwatchai, Sadaoki Furui

An understanding strategy based on plausibility score in recognition history using CSR confidence measure
Toshihiko Itoh, Atsuhiko Kai, Yukihiro Itoh, Tatsuhiro Konishi

Speech recognition error correction using maximum entropy language model
Sangkeun Jung, Minwoo Jeong, Gary Geunbae Lee

Discriminative training of compound-word based multinomial classifiers for speech routing
Xiang Li, Juan Huerta

An information extraction approach for spoken language understanding
Jihyun Eun, Changki Lee, Gary Geunbae Lee

A maximum entropy shallow functional parser for spoken language understanding
David Horowitz, Partha Lal, Pierce Gerard Buckley

Mixture language models for call routing
Qiang Huang, Stephen Cox

Speech act identification using an ontology-based partial pattern tree
Chung-Hsien Wu, Jui-Feng Yeh, Ming-Jun Chen

Creating speech recognition grammars from regular expressions for alphanumeric concepts
Ye-Yi Wang, Yun-Cheng Ju

Poetry assistant
Isabel Trancoso, Paulo Araujo, Ceu Viana, Nuno Mamede

Automatic extraction of key sentences from oral presentations using statistical measure based on discourse markers
Tasuku Kitade, Tatsuya Kawahara, Hiroaki Nanjo

Robust dependency parsing of spontaneous Japanese speech and its evaluation
Tomohiro Ohno, Shigeki Matsubara, Nobuo Kawaguchi, Yasuyoshi Inagaki

Strategies for optimizing a stochastic spoken natural language parser
Wolfgang Minker, Dirk Buehler, Christiane Beuschel

Prolongation in spontaneous Mandarin
Tzu-Lun Lee, Ya-Fang He, Yun-Ju Huang, Shu-Chuan Tseng, Robert Eklund

Speech intention understanding based on decision tree learning
Yuki Irie, Shigeki Matsubara, Nobuo Kawaguchi, Yukiko Yamaguchi, Yasuyoshi Inagaki

Using simple speech-based features to detect the state of a meeting and the roles of the meeting participants
Satanjeev Banerjee, Alexander Rudnicky

An acoustic study of emotions expressed in speech
Serdar Yildirim, Murtaza Bulut, Chul Min Lee, Abe Kazemzadeh, Zhigang Deng, Sungbok Lee, Shrikanth Narayanan, Carlos Busso

Topic classification and verification modeling for out-of-domain utterance detection
Tatsuya Kawahara, Ian Richard Lane, Tomoko Matsui, Satoshi Nakamura

Partially lexicalized parsing model utilizing rich features
So-Young Park, Yong-Jae Kwak, Joon-Ho Lim, Hae-Chang Rim, Soo-Hong Kim

Clustering similar nouns for selecting related news articles
Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi

Chinese text word-segmentation considering semantic links among sentences
Leonardo Badino

Syllable-based probabilistic morphological analysis model of Korean
Do-Gil Lee, Hae-Chang Rim




Acoustic Modeling


Context dependent "long units" for speech recognition
Denis Jouvet, Ronaldo Messina

Rapid EM training based on model-integration
Shinichi Yoshizawa, Kiyohiro Shikano

Experiments on the accuracy of phone models and liaison processing in a French broadcast news transcription system
Dominique Fohr, Odile Mella, Irina Illina, Christophe Cerisara

A statistical discrimination measure for hidden Markov models based on divergence
Jorge Silva, Shrikanth Narayanan

A hybrid SVM/HMM acoustic modeling approach to automatic speech recognition
Jan Stadermann, Gerhard Rigoll

Data driven number-of-states selection in HMM topologies
Dirk Knoblauch

Hybrid model using subspace distribution clustering hidden Markov models and semi-continuous hidden Markov models for embedded speech recognizers
Youngkyu Cho, Sung-a Kim, Dongsuk Yook

Fast clustering of Gaussians and the virtue of representing Gaussians in exponential model format
Peder Olsen, Karthik Visweswariah

Feature-based pronunciation modeling with trainable asynchrony probabilities
Karen Livescu, James Glass

Maximum entropy direct model as a unified model for acoustic modeling in speech recognition
Hong-Kwang Jeff Kuo, Yuqing Gao

Explicit duration modeling for Cantonese connected-digit recognition
Yu Zhu, Tan Lee

Four-layer categorization scheme of fast GMM computation techniques in large vocabulary continuous speech recognition systems
Arthur Chan, Ravishankar Mosur, Alexander Rudnicky, Jahanzeb Sherwani

Compact acoustic model for embedded implementation
Junho Park, Hanseok Ko

Increasing the mixture components of non-uniform HMM structures based on a variational Bayesian approach
Takatoshi Jitsuhiro, Satoshi Nakamura

Comparison of ML, MAP, and VB based acoustic models in large vocabulary speech recognition
Panu Juhani Somervuo

Discriminative training with tied covariance matrices
Wolfgang Macherey, Ralf Schlüter, Hermann Ney

Acoustic phonetic modeling using local codebook features
Frank Diehl, Asuncion Moreno

An efficient codebook design in SDCHMM for mobile communication environments
Gue Jun Jung, Su-Hyun Kim, Yung-Hwan Oh

Analysis of speaking styles by two-dimensional visualization of aggregate of acoustic models
Makoto Shozakai, Goshu Nagino

Context dependent phoneme duration modeling with tree-based state tying
Myoung-Wan Koo, Ho-Hyun Jeon, Sang-Hong Lee

Towards better understanding of the model implied by the use of dynamic features in HMMs
John Scott Bridle


Prosody Modeling and Generation


Chinese prosody phrase break prediction based on maximum entropy model
Jian-Feng Li, Guo-Ping Hu, Renhua Wang

Intonation modeling for Indian languages
Krothapalli Sreenivasa Rao, Bayya Yegnanarayana

Using multiple linguistic features for Mandarin phrase break prediction in maximum-entropy classification framework
Yu Zheng, Gary Geunbae Lee, Byeongchang Kim

Using part-of-speech for predicting phrase breaks
Ian Read, Stephen Cox

A proposal to quantitatively select the right intonation unit in data-driven intonation modeling
David Escudero-Mancebo, Valentin Cardenoso-Payo

Formulating contextual tonal variations in Mandarin
Jinfu Ni, Hisashi Kawai, Keikichi Hirose

Automatic adaptation of the Momel F0 stylisation algorithm to new corpora
Salma Mouline, Olivier Boeffard, Paul Bagshaw

Joint extraction and prediction of Fujisaki's intonation model parameters
Pablo Daniel Aguero, Klaus Wimmer, Antonio Bonafonte

Evaluation of corpus based tone prediction in mismatched environments for Greek TTS synthesis
Panagiotis Zervas, Nikos Fakotakis, George Kokkinakis, George Kouroupetroglou, Gerasimos Xydas

The duration of pitch transition phase and its relative factors
Ziyu Xiong, Juanwen Chen

Polynomial regression model for duration prediction in Mandarin
Yu Hu, Renhua Wang, Lu Sun

Prediction of the glottal LF parameters using regression trees
Michelle Tooher, John G. McKenna

BonnTempo-Corpus and BonnTempo-Tools: a database for the study of speech rhythm and rate
Volker Dellwo, Bianca Aschenberner, Petra Wagner, Jana Dancovicova, Ingmar Steiner

Analysis of F0 contours of Cantonese utterances based on the command-response model
Wentao Gu, Keikichi Hirose, Hiroya Fujisaki

Pre-focal rephrasing, focal enhancement and postfocal deaccentuation in French
Marion Dohen, Helene Loevenbruck

Duration modeling for Hindi text-to-speech synthesis system
Sridhar Krishna Nemala, Partha Pratim Talukdar, Kalika Bali, A. G. Ramakrishnan

A new prosodic phrasing model for Indian language Telugu
Nemala Sridhar Krishna, Hema A. Murthy

Evolutionary optimization of an adaptive prosody model
Oliver Jokisch, Michael Hofmann

An intonation model for embedded devices based on natural F0 samples
Gerasimos Xydas, Georgios Kouroupetroglou

Prosodic characteristics of Czech contrastive topic
Katerina Vesela, Nino Peterek, Eva Hajicova






Speech Features


Continuous speech recognition using joint features derived from the modified group delay function and MFCC
Rajesh Mahanand Hegde, Hema A. Murthy, Venkata Ramana Rao Gadde

Phase-space representation of speech
Hua Yu

The modified group delay feature: a new spectral representation of speech
Hema A. Murthy, Rajesh Mahanand Hegde, Venkata Ramana Rao Gadde

ICA-based feature extraction for phoneme recognition
Oh-Wook Kwon, Te-Won Lee

On using MLP features in LVCSR
Qifeng Zhu, Barry Chen, Nelson Morgan, Andreas Stolcke

Learning long-term temporal features in LVCSR using neural networks
Barry Chen, Qifeng Zhu, Nelson Morgan

Neural "spike rate spectrum" as a noise robust, speaker invariant feature for automatic speech recognition
T. V. Sreenivas, G. V. Kiran, A. G. Krishna

An adaptive MEL-LPC analysis for speech recognition
Yoshihisa Nakatoh, Makoto Nishizaki, Shinichi Yoshizawa, Maki Yamada

Improvement in robustness of speech feature extraction method using sub-band based periodicity and aperiodicity decomposition
Kentaro Ishizuka, Noboru Miyazaki, Tomohiro Nakatani, Yasuhiro Minami

A new acoustic measure for aspiration noise detection
Carlos Toshinori Ishi

Synthesizing speech from speech recognition parameters
Kris Demuynck, Oscar Garcia, Dirk Van Compernolle

LP-TRAP: linear predictive temporal patterns
Marios Athineos, Hynek Hermansky, Daniel P.W. Ellis

Parallel feature generation based on maximizing normalized acoustic likelihood
Xiang Li, Richard Stern

An adaptive band-partitioning spectral entropy based speech detection in realistic noisy environments
Kun-Ching Wang

Improved voice activity detection combining noise reduction and subband divergence measures
Javier Ramirez, José Carlos Segura, Carmen Benitez, Angel de la Torre, Antonio Rubio

Voice activity detection using global soft decision with mixture of Gaussian model
Kiyoung Park, Changkyu Choi, Jeongsu Kim

Environmental robust features for speech detection
Thomas Kemp, Climent Nadeu, Yin Hay Lam, Josep Maria Sola i Caros

Crosscorrelation-based multispeaker speech activity detection
Kornel Laskowski, Qin Jin, Tanja Schultz

Improved robustness of time-frequency principal components (TFPC) by synergy of methods in different domains
Shang-nien Tsai

A quantitative model for formant dynamics and contextually assimilated reduction in fluent speech
Li Deng, Yu Dong, Alex Acero

DWT-based classification of acoustic-phonetic classes and phonetic units
Gernot Kubin, Van Tuan Pham

Learning nonnegative features of spectro-temporal sounds for classification
Yong-Choon Cho, Seungjin Choi


Language Modeling, Multimodal & Multilingual Speech Processing


N-gram language modeling of Japanese using bunsetsu boundaries
Sungyup Chung, Keikichi Hirose, Nobuaki Minematsu

Dynamic language modeling for broadcast news
Langzhou Chen, Lori Lamel, Jean-Luc Gauvain, Gilles Adda

A unified framework for large vocabulary speech recognition of mutually unintelligible Chinese "regionalects"
Ren-Yuan Lyu, Dau-Cheng Lyu, Min-Siong Liang, Min-Hong Wang, Yuang-Chin Chiang, Chun-Nan Hsu

The influence of target size and distance on the production of speech and gesture in multimodal referring expressions
Ielka van der Sluis, Emiel Krahmer

Dynamic time windows for multimodal input fusion
Anurag Kumar Gupta, Tasos Anastasakos

MICot: a tool for multimodal input data collection
Raymond H. Lee, Anurag Kumar Gupta

Simulating multimodal applications
Chakib Tadj, Hicham Djenidi, Madjid Haouani, Amar Ramdane-Cherif, Nicole Levy

A multimodal communication aid for global aphasia patients
Jakob Schou Pedersen, Paul Dalsgaard, Borge Lindberg

Mis-recognized utterance detection using hierarchical language model
Hirofumi Yamamoto, Genichiro Kikui, Yoshinori Sagisaka

Cross-lingual phoneme mapping for multilingual synthesis systems
Marko Moberg, Kimmo Parssinen, Juha Iso-Sipila

Robot motion control using listener's back-channels and head gesture information
Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno, Tsuyoshi Tasaki, Takeshi Yamaguchi

Indonesian speech recognition for hearing and speaking impaired people
Sakriani Sakti, Arry Akhmad Arman, Satoshi Nakamura, Paulus Hutagaol

A two phase Arabic language model for speech recognition and other language applications
Mohsen Rashwan

Language model adaptation based on PLSA of topics and speakers
Yuya Akita, Tatsuya Kawahara

Unified language modeling using finite-state transducers with first applications
Hans J. G. A. Dolfing, Pierce Gerard Buckley, David Horowitz

Effects of language modeling on speech-driven question answering
Katsunobu Itou, Atsushi Fujii, Tomoyosi Akiba

Measuring convergence in language model estimation using relative entropy
Abhinav Sethy, Shrikanth Narayanan, Bhuvana Ramabhadran



Speech Analysis


Reconstruction filter design for bone-conducted speech
Toshiki Tamiya, Tetsuya Shimamura

Frequency warped ARMA analysis of the closed and the open phase of voiced speech
Pedro J. Quintana-Morales, Juan L. Navarro-Mesa

Zeros of z-transform (ZZT) decomposition of speech for source-tract separation
Boris Doval, Baris Bozkurt, Christophe D'Alessandro, Thierry Dutoit

Use of neural network mapping and extended Kalman filter to recover vocal tract resonances from the MFCC parameters of speech
Li Deng, Roberto Togneri

Graphical model approach to pitch tracking
Xiao Li, Jonathan Malkin, Jeff Bilmes

A new multicomponent AM-FM demodulation with predicting frequency boundaries and its application to formant estimation
Bo Xu, Jianhua Tao, Yongguo Kang

A concurrent curve strategy for formant tracking
Yves Laprie

A formant tracking LP model for speech processing
Qin Yan, Esfandiar Zavarehei, Saeed Vaseghi, Dimitrios Rentzos

Application of long-term filtering to formant estimation
Hong You

A method for glottal formant frequency estimation
Baris Bozkurt, Thierry Dutoit, Boris Doval, Christophe D'Alessandro

Improved differential phase spectrum processing for formant tracking
Baris Bozkurt, Thierry Dutoit, Boris Doval, Christophe D'Alessandro

MAP prediction of pitch from MFCC vectors for speech reconstruction
Xu Shao, Ben P. Milner

New harmonicity measures for pitch estimation and voice activity detection
An-Tze Yu, Hsiao-Chuan Wang

Multi-pitch trajectory estimation of concurrent speech based on harmonic GMM and nonlinear Kalman filtering
Takuya Nishimoto, Shigeki Sagayama, Hirokazu Kameoka

Automatic pitch marking and reconstruction of glottal closure instants from noisy and deformed electro-glotto-graph signals
Attila Ferencz, Jeongsu Kim, Yong-Beom Lee, Jae-Won Lee

On the use of a weighted autocorrelation based fundamental frequency estimation for a multidimensional speech input
Federico Flego, Luca Armani, Maurizio Omologo

A minimum mean squared error estimator for single channel speaker separation
Aarthi M. Reddy, Bhiksha Raj

Audio source separation from the mixture using empirical mode decomposition with independent subspace analysis
Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu

Audio watermarking in sub-band signals using multiple echo kernels
In-Jung Oh, Hyun-Yeol Chung, Jae-Won Cho, Ho-Youl Jung, R. Prost

A piecewise interpolation method based on log-least square error criterion for HRTF
Jie Zhang, Zhenyang Wu

Modified realizable frequency warped ARMA modeling and its application in synthesis structures for voiced speech
Juan L. Navarro-Mesa, Pedro J. Quintana-Morales

Time-scaling of speech using independent subspace analysis
R. Muralishankar, A. G. Ramakrishnan, Lakshmish N. Kaushik

Long term modeling of phase trajectories within the speech sinusoidal model framework
Laurent Girin, Mohammad Firouzmand, Sylvain Marchand

An acoustic shock limiting algorithm using time and frequency domain speech features
Tina Soltani, Dave Hermann, Etienne Cornu, Hamid Sheikhzadeh, Rob Brennan

Speech probability distribution based on generalized gamma distribution
Jong Won Shin, Joon-Hyuk Chang, Nam Soo Kim

Stop consonant classification by dynamic formant trajectory
Yanli Zheng, Mark Hasegawa-Johnson, Sarah Borys

Estimating detailed spectral envelopes using articulatory clustering
Yoshinori Shiga, Simon King



Audio-Visual Speech Processing


Audio-visual spoken language processing
Jinyoung Kim, Jeesun Kim, Chris Davis

Issues in the development of auditory-visual speech perception: adults, infants, and children
Kaoru Sekiyama, Denis Burnham

Signaling and detecting uncertainty in audiovisual speech by children and adults
Emiel Krahmer, Marc Swerts

Effect of intensive audiovisual perceptual training on the perception and production of the /l/-/r/ contrast for Japanese learners of English
Valerie Hazan, Anke Sennema, Andrew Faulkner

Visual recalibration of auditory speech versus selective speech adaptation: different build-up courses
Jean Vroomen, Sabine van Linden, Beatrice de Gelder, Paul Bertelson

Of the top of the head: audio-visual speech perception from the nose up
Chris Davis, Jeesun Kim

Aspects of speaking-face data corpus design methodology
J. Bruce Millar, Michael Wagner, Roland Goecke

Modeling audio-visual speech perception: back on fusion architectures and fusion control
Jean-Luc Schwartz, Marie Cathiard

Neurocognition of speech-specific audiovisual perception
Mikko Sams, Ville Ojanen, Jyrki Tuomainen, Vasily Klucharev

Target practice on talking faces
Adriano Vilela Barbosa, Eric Vatikiotis-Bateson, Andreas Daffertshofer

Audiovisual perceptual evaluation of resynthesised speech movements
Matthias Odisio, Gérard Bailly

Video-realistic synthetic speech with a parametric visual speech synthesizer
Sascha Fagel

Mutual information based visual feature selection for lipreading
Patricia Scanlon, Gerasimos Potamianos, Vit Libal, Stephen M. Chu

AVICAR: audio-visual speech corpus in a car environment
Bowon Lee, Mark Hasegawa-Johnson, Camille Goudeseune, Suketu Kamdar, Sarah Borys, Ming Liu, Thomas Huang

Adaptive classifier cascade for multimodal speaker identification
Engin Erzin, Yucel Yemez, A. Murat Tekalp

Use of visual cues in the perception of a labial/labiodental contrast by Spanish-L1 and Japanese-L1 learners of English
Midori Iba, Anke Sennema, Valerie Hazan, Andrew Faulkner

Audio-visual speaker localization for car navigation systems
Xianxian Zhang, Kazuya Takeda, John H. L. Hansen, Toshiki Maeno

Automatic lips reading for audio-visual speech processing and recognition
Josef Chaloupka

Liveness verification in audio-video authentication
Michael Wagner, Girija Chetty

Speech recognition using motion based lipreading
Maria José Sanchez Martinez, Juan Pablo de la Cruz Gutierrez

Comparative study of linear and non-linear models for viseme inversion: modeling of a cortical associative function
Frédéric Berthommier

3D lip-tracking for audio-visual speech recognition in real applications
Petr Cisar, Zdenek Krnoul, Milos Zelezny

The audio-video Australian English speech data corpus AVOZES
J. Bruce Millar, Roland Goecke

Correcting Korean vowel speech recognition errors with limited lip features
Ki-Hyung Hong, Yong-Ju Lee, Jae-Young Suh, Kyong-Nim Lee

Segmental differences in the visual contribution to speech intelligibility
Kuniko Nielsen


Spoken Language Generation and Synthesis III


Voice conversion for unknown speakers
Hui Ye, Steve Young

Domain adaptation methods in the IBM trainable text-to-speech system
Volker Fischer, Jaime Botella Ordinas, Siegfried Kunzmann

Applying pitch connection control in Mandarin speech synthesis
Yi Zhou, Yiqing Zu, Zhenli Yu, Dongjian Yue, Guilin Chen

A first step towards text-independent voice conversion
Hermann Ney, David Suendermann, Antonio Bonafonte, Harald Hoege

Data pruning approach to unit selection for inventory generation of concatenative embeddable Chinese TTS systems
Zhenli Yu, Kaizhi Wang, Yiqing Zu, Dongjian Yue, Guilin Chen

Subjective evaluation of join cost functions used in unit selection speech synthesis
Jithendra Vepa, Simon King

Constructing emotional speech synthesizers with limited speech database
Heiga Zen, Tadashi Kitamura, Murtaza Bulut, Shrikanth Narayanan, Ryosuke Tsuzuki, Keiichi Tokuda

A two-phase pitch marking method for TD-PSOLA synthesis
Cheng-Yuan Lin, Jyh-Shing Roger Jang

Including dynamic and phonetic information in voice conversion systems
Antonio Bonafonte, Alexander Kain, Jan van Santen, Helenca Duxans

A novel voice conversion system based on codebook mapping with phoneme-tied weighting
Zixiang Wang, Renhua Wang, Zhiwei Shuang, Zhenhua Ling

Compression of speech database by feature separation and pattern clustering using STRAIGHT
Zhenhua Ling, Yu Hu, Zhiwei Shuang, Renhua Wang

Decision-tree backing-off in HMM-based speech synthesis
Shunsuke Kataoka, Nobuaki Mizutani, Keiichi Tokuda, Tadashi Kitamura

Using a depth-restricted search to reduce delays in unit selection
Nobuyuki Nishizawa, Hisashi Kawai

MLLR adaptation for hidden semi-Markov model based speech synthesis
Junichi Yamagishi, Takashi Masuko, Takao Kobayashi

Phoxsy: multi-phone segments for unit selection speech synthesis
Stefan Breuer, Julia Abresch

Perception-guided and phonetic clustering weight tuning based on diphone pairs for unit selection TTS
Francesc Alias, Xavier Llora, Ignasi Iriondo, Joan Claudi Socoro, Xavier Sevillano, Lluis Formiga

A voice conversion method based on joint pitch and spectral envelope transformation
Taoufik En-Najjary, Olivier Rosec, Thierry Chonavel

Fast GMM-based voice conversion for text-to-speech synthesis systems
Taoufik En-Najjary, Olivier Rosec, Thierry Chonavel

A genetic algorithm for unit selection based speech synthesis
Rohit Kumar

A memory efficient grapheme-to-phoneme conversion system for speech processing
Jun Huang, Lex Olorenshaw, Gustavo Hernandez-Abrego, Lei Duan

Automatic pruning of unit selection speech databases for synthesis without loss of naturalness
Rohit Kumar, S. Prahallad Kishore

A database design for a TTS synthesis system using lexical diphones
Tanya Lambert, Andrew Breen

A family-of-models approach to HMM-based segmentation for unit selection speech synthesis
John Kominek, Alan W Black

Mutual-information based segment pre-selection in concatenative text-to-speech
Wei Zhang, Ling Jin, Xijun Ma

Hidden semi-Markov model based speech synthesis
Heiga Zen, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura

DFW-based spectral smoothing for concatenative speech synthesis
Hartmut R. Pfitzinger

Korean prosody generation and artificial neural networks
Kyung-Joong Min, Un-Cheon Lim

A prosodic phrasing model for a Korean text-to-speech synthesis system
Kyuchul Yoon

A comparison of statistical methods and features for the prediction of prosodic structures
Qin Shi, Volker Fischer

Letter-to-sound for small-footprint multilingual TTS engine
Guilin Chen, Ke-Song Han

Grapheme-to-phoneme conversion for Chinese text-to-speech
Jun Xu, Guohong Fu, Haizhou Li

XML representation languages as a way of interconnecting TTS modules
Marc Schröder, Stefan Breuer

Approach to interchange-format based Chinese generation
Wenjie Cao, Chengqing Zong, Bo Xu

Prosodic analysis of a multi-style corpus in the perspective of emotional speech synthesis
Enrico Zovato, Stefano Sandri, Silvia Quazza, Leonardo Badino

Number of output nodes of artificial neural networks for Korean prosody generation
Kyung-Joong Min, Chan-Goo Kang, Un-Cheon Lim

A Korean grapheme-to-phoneme conversion system using selection procedure for exceptions
Sunhee Kim, Ju-Eun Ahn, Soon-Hyob Kim, Yang-Hee Lee

Synthesis of vowels and tones in Thai language by articulatory modeling
Thanate Khaorapapong, Montri Karnjanadecha, Keerati Inthavisas

Source-filter separation for articulation-to-speech synthesis
Yoshinori Shiga, Simon King

Long vowel detection for letter-to-sound conversion for Japanese sourced words transliterated into the alphabet
Asano Hisako, Nakajima Hideharu, Mizuno Hideyuki, Oku Masahiro

Inexactness and robustness in cepstral-to-formant transformation of spoken and sung vowels
Frantz Clermont, Thomas John Millhouse

Analysis of acoustic features affecting "singing-ness" and its application to singing-voice synthesis from speaking-voice
Takeshi Saitou, Naoya Tsuji, Masashi Unoki, Masato Akagi

Statistical corpus-based speech segmentation
Vincent Pollet, Geert Coorman

Recent improvements on ARTIC: Czech text-to-speech system
Jindrich Matousek, Jan Romportl, Daniel Tihelka, Zbynek Tychtl

Learning for transliteration of Arabic-numeral expressions using decision tree for Korean TTS
HyeonSook Nam, Youngim Jung, Donghun Lee, Hyuk-chul Kwon, Aesun Yoon

How to integrate phonetic and linguistic knowledge in a text-to-phoneme conversion task: a syllabic TPC tool for French
Nicole Beringer

Reconciling pronunciation differences between the front-end and the back-end in the IBM speech synthesis system
Wael Hamza, Ellen Eide, Raimo Bakis

High quality text-to-pinyin conversion using two-phase unknown word prediction
Juhong Ha, Yu Zheng, Gary Geunbae Lee, Yoon-Suk Seong, Byeongchang Kim

Pronunciation lexicon adaptation for TTS voice building
Yeon-Jun Kim, Ann Syrdal, Alistair Conkie

Improving letter-to-pronunciation accuracy with automatic morphologically-based stress prediction
Gabriel Webster

The IBM expressive speech synthesis system
Wael Hamza, Ellen Eide, Raimo Bakis, Michael Picheny, John Pitrelli

What concept-to-speech can gain for prosody
Markus Schnell, Rüdiger Hoffmann



Speaker Recognition


Segmentation and relevance measure for speaker verification
Jerome Louradour, Regine André-Obrecht, Khalid Daoudi

A new nonlinear feature extraction algorithm for speaker verification
Mohamed Chetouani, Bruno Gas, Jean-Luc Zarader, Marcos Faundez-Zanuy

SVM modeling of "SNERF-grams" for speaker recognition
Elizabeth Shriberg, Luciana Ferrer, Anand Venkataraman, Sachin Kajarekar

SVM kernel adaptation in speaker classification and verification
Purdy Ho, Pedro Moreno

Noise-robust speaker verification using F0 features
Koji Iwano, Taichi Asami, Sadaoki Furui

Eigen-prosody analysis for robust speaker recognition under mismatch handset environment
Zi-He Chen, Yuan-Fu Liao, Yau-Tarng Juang

Triphone-based confidence system for speaker identification
Aaron Lawson, Mark Huggins

Improved model training and automatic weight adjustment for multi-SNR multi-band speaker identification system
Kenichi Yoshida, Kazuyuki Takagi, Kazuhiko Ozeki

A new approach to channel robust speaker verification via constrained stochastic feature transformation
Man-Wai Mak, Kwok-kwong Yiu, Ming-Cheung Cheung, Sun-Yuan Kung

Best speaker-based structure tree for speaker verification
Chakib Tadj, Christian Gargour, Nabil Badri

Robust speaker identification based on perceptual log area ratio and Gaussian mixture models
David Chow, Waleed Abdulla

Channel frequency response correction for speaker recognition
Stanley Wenndt, Richard Floyd

Unseen handset mismatch compensation based on a priori knowledge interpolation for robust speaker recognition
Jyh-Her Yang, Yuan-Fu Liao

A comparison of soft and hard spectral subtraction for speaker verification
Michael Padilla, Thomas Quatieri

Comparison of several speaker verification procedures based on GMM
Vlasta Radova, Ales Padrta

Improving performance of text-independent speaker identification by utilizing contextual principal curves filtering
Yong Guan, Wenju Liu, Hongwei Qi, Jue Wang

Speaker identification using probabilistic PCA model selection
Jen-Tzung Chien, Chuan-Wei Ting

Text independent speaker recognition using speaker dependent word spotting
Hagai Aronowitz, David Burshtein, Amihood Amir

A study on model-based equal error rate estimation for automatic speaker verification
Hsiao-Chuan Wang, Jyh-Min Cheng

Probabilistic speaker identification with dual penalized logistic regression machine
Tomoko Matsui, Kunio Tanabe

Model quality evaluation during enrolment for speaker verification
Javier R. Saeta, Javier Hernando

Real-time speaker identification
Pasi Franti, Evgeny Karpov, Tomi Kinnunen

Multi-codebook vector quantization algorithm for speaker identification
Mohammed Abu El-Yazeed, Nemat Abdel Kader, Mohammed El-Henawy

Multi-sample fusion with constrained feature transformation for robust speaker verification
Ming-Cheung Cheung, Kwok-Kwong Yiu, Man-Wai Mak, Sun-Yuan Kung

Speaker diarization using bottom-up clustering based on a parameter-derived distance between adapted GMMs
Michael Betser, Frédéric Bimbot, Mathieu Ben, Guillaume Gravier

Time-frequency analysis of vocal source signal for speaker recognition
Nengheng Zheng, P. C. Ching, Tan Lee

A novel method for two-speaker segmentation
Rashmi Gangadharaiah, Balakrishnan Narayanaswamy, Narayanaswamy Balakrishnan

Throat microphone signal for speaker recognition
Bayya Yegnanarayana, A. Shahina, M. R. Kesheorey

Posteriori probabilities and likelihoods combination for speech and speaker recognition
Mohamed Faouzi Ben Zeghiba, Hervé Bourlard

The use of typical sequences for robust speaker identification
Mohamed Mihoubi, Douglas O'Shaughnessy, Pierre Dumouchel

A forensic phonetic investigation into the duration and speech rate
KyungHwa Kim

Mixture Gaussian model training against impostor model parameters: an application to speaker identification
T. V. Sreenivas, Sameer Badaskar

Jacobian adaptation with improved noise reference for speaker verification
Jan Anguita, Javier Hernando, Alberto Abad

Objective wavelet packet features for speaker verification
Mihalis Siafarikas, Todor Ganchev, Nikos Fakotakis

Policy analysis framework for conversational biometrics
Upendra V. Chaudhari, Ganesh N. Ramaswamy

A new score normalization method for speaker verification with virtual impostor model
Woo-Yong Choi, Jung Gon Kim, Hyung Soon Kim, Sung Bum Pan

On the time variability of vocal tract for speaker recognition
Samuel Kim, Thomas Eriksson, Hong-Goo Kang

Distributed speaker recognition
Veena Desai, Hema A. Murthy

Cluster-dependent modeling and confidence measure processing for in-set/out-of-set speaker identification
Pongtep Angkititrakul, Sepideh Baghaii, John H. L. Hansen

Distributed speaker recognition using earth mover's distance
Yoshiyuki Umeda, Shingo Kuroiwa, Satoru Tsuge, Fuji Ren

A forensically-motivated tool for selecting cepstrally-consistent steady-states from non-contemporaneous vowel utterances
Michael Barlow, Mehrdad Khodai-Joopari, Frantz Clermont

Scoring and direct methods for the interpretation of evidence in forensic speaker recognition
Anil Alexander, Andrzej Drygajlo

Efficient online cohort selection method for speaker verification
Tomi Kinnunen, Evgeny Karpov, Pasi Franti

Statistical model migration in speaker recognition
Jiri Navratil, Ganesh N. Ramaswamy, Ran D. Zilca

Latent semantic analysis for speaker recognition
A. Nayeemulla Khan, Bayya Yegnanarayana

Model-based sequential organization for cochannel speaker identification
Yang Shao, DeLiang Wang

Articulatory feature-based conditional pronunciation modeling for speaker verification
Ka-Yee Leung, Man-Wai Mak, Sun-Yuan Kung

A comparison of normalization and training approaches for ASR-dependent speaker identification
Alex Park, Timothy J. Hazen

New background modeling for speaker verification
Dat Tran



Contemporary Issues in ASR


Transformation-based error correction for speech-to-text systems
Jochen Peters, Christina Drexel

Phone classification in pseudo-Euclidean vector spaces
Alexander Gutkin, Simon King

Combining linguistic knowledge and acoustic information in automatic pronunciation lexicon generation
Grace Chung, Chao Wang, Stephanie Seneff, Ed Filisko, Min Tang

Modeling pronunciation variation using artificial neural networks for English spontaneous speech
Ken Chen, Mark Hasegawa-Johnson

Foreign-accented speaker-independent speech recognition
Stefanie Aalburg, Harald Hoege

Non-audible murmur (NAM) speech recognition using a stethoscopic NAM microphone
Panikos Heracleous, Yoshitaka Nakajima, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano

Recognition of read and spontaneous children's speech using two new corpora
Martin Russell, Shona D'Arcy, Lit Ping Wong

Articulatory feature recognition using dynamic Bayesian networks
Joe Frankel, Mirjam Wester, Simon King

Predicting word correct rate from acoustic and linguistic confusability
Gies Bouwman, Bert Cranen, Lou Boves

Disambiguation in determining phonemes of sound-imitation words for environmental sound recognition
Kazushi Ishihara, Yuya Hattori, Tomohiro Nakatani, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

Word confusability prediction in automatic speech recognition
Jan Anguita, Stephane Peillon, Javier Hernando, Alexandre Bramoulle

Adaptation for soft whisper recognition using a throat microphone
Szu-Chen Jou, Tanja Schultz, Alex Waibel

A statistical lexicon for non-native speech recognition
Rainer Gruhn, Konstantin Markov, Satoshi Nakamura

Modeling auxiliary features in tandem systems
Mathew Magimai Doss, Shajith Ikbal, Todd Stephenson, Hervé Bourlard

Survey of spontaneous speech phenomena in a multimodal dialogue system and some implications for ASR
Louis ten Bosch, Lou Boves

Speech recognition for multiple non-native accent groups with speaker-group-dependent acoustic models
Tobias Cincarek, Rainer Gruhn, Satoshi Nakamura

Coping with disfluencies in spontaneous speech recognition
Frederik Stouten, Jean-Pierre Martens

Speaker model quantization for unsupervised speaker indexing
Soonil Kwon, Shrikanth Narayanan

Investigating automatic recognition of non-native children's speech
Matteo Gerosa, Diego Giuliani

Using machine learning to cope with imbalanced classes in natural speech: evidence from sentence boundary and disfluency detection
Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Mary Harper

Hybrid utterance verification based on n-best models and model derived from Kullback-Leibler divergence
Minho Jin, Gyucheol Jang, Sungrack Yun, Chang D. Yoo

Speech spotter: on-demand speech recognition in human-human conversation on the telephone or in face-to-face situations
Masataka Goto, Koji Kitayama, Katsunobu Itou, Tetsunori Kobayashi

Pronunciation lexicon modeling and design for Korean large vocabulary continuous speech recognition
Kyong-Nim Lee, Minhwa Chung

Performance of speech recognition and synthesis in packet-based networks
Sebastian Möller, Jan Felix Krebber, Alexander Raake

A comparison of packet loss compensation methods and interleaving for speech recognition in burst-like packet loss
Alastair Bruce James, Ben P. Milner, Angel Manuel Gomez

An analysis of packet loss models for distributed speech recognition
Ben P. Milner, Alastair Bruce James




Interdisciplinary Topics in Spoken Language Processing


Generating gestures from speech
Ruben San-Segundo, Juan Manuel Montero, Javier Macias-Guarasa, Ricardo de Córdoba, Javier Ferreiros, José Manuel Pardo

Subtopic segmentation in the lecture speech
Noboru Kanedera, Sumida Asuka, Takao Ikehata, Tetsuo Funada

Some articulatory measurements of real sadness
Donna Erickson, Caroline Menezes, Akinori Fujino

Application of voice conversion to hearing-impaired Mandarin speech enhancement
Chen-Long Lee, Wen-Whei Chang, Yuan-Chuan Chiang

A Japanese dialogue-based CALL system with mispronunciation and grammar error detection
Oh Pyo Kweon, Akinori Ito, Motoyuki Suzuki, Shozo Makino

Statistics-based direction finding for training vowels
Cheolwoo Jo, Ilsuh Bak

Reference marking in children's computer-directed speech: an integrated analysis of discourse and gestures
Simona Montanari, Serdar Yildirim, Elaine Andersen, Shrikanth Narayanan

What makes a non-native accent?: a study of Korean English
Jong-mi Kim, Suzanne Flynn

Study on emotional speech features in Korean with its application to voice color conversion
Sang-Jin Kim, Kwang-Ki Kim, Minsoo Hahn

Developmental changes in voiced-segment ratio for Japanese infants and parents
Shigeaki Amano, Tomohiro Nakatani, Tadahisa Kondo

Implementation of an intonational quality assessment system for a handheld device
Kisun You, Hoyoun Kim, Wonyong Sung

Characterizing and classifying cued speech vowels from labial parameters
Denis Beautemps, Thomas Burger, Laurent Girin

Cough detection in spoken dialogue system for home health care
Shin-ya Takahashi, Tsuyoshi Morimoto, Sakashi Maeda, Naoyuki Tsuruta




Robust ASR


Why speech recognizers make errors? A robustness view
Hong Kook Kim, Mazin Rahim

An energy normalization scheme for improved robustness in speech recognition
Mohammad Ahadi, Hamid Sheikhzadeh, Robert Brennan, George Freeman

Rapid on-line environment compensation for server-based speech recognition in noisy mobile environments
Juan Huerta, Etienne Marcheret, Sreeram Balakrishnan

Modeling phones coarticulation effects in a neural network based speech recognition system
Leila Ansary, Seyyed Ali Seyyed Salehi

Error-weighted discriminative training for HMM parameter estimation
Daniel Willett

Robust verification of recognized words in noise
Wai Kit Lo, Frank K. Soong, Satoshi Nakamura

Robust automatic speech recognition using an optimal spectral amplitude estimator algorithm in low-SNR car environments
Zili Li, Hesham Tolba, Douglas O'Shaughnessy

Robust speech recognition using data-driven temporal filters based on independent component analysis
Junhui Zhao, Jingming Kuang, Xiang Xie

Robust distant speech recognition based on position dependent CMN
Norihide Kitaoka, Longbiao Wang, Seiichi Nakagawa

Robust speech recognition based on HMM composition and modified Wiener filter
Sumitaka Sakauchi, Yoshikazu Yamaguchi, Satoshi Takahashi, Satoshi Kobashikawa

Feature-dependent compensation in speech recognition
Ivan Brito, Nestor Becerra Yoma, Carlos Molina

Using context to correct phone recognition errors
Stephen Cox

Improved histogram-based feature compensation for robust speech recognition and unsupervised speaker adaptation
Yasunari Obuchi

Weighting observation vectors for robust speech recognition in noisy environments
Zhenyu Xiong, Fang Zheng, Wenhu Wu

Hands-free speech recognition using blind source separation post-processed by two-stage spectral subtraction
Masanori Tsujikawa, Ken-ichi Iso

Robust speech recognition with spectral subtraction in low SNR
Randy Gomez, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano

Active perception: using a priori knowledge from clean speech models to ignore non-target features
Bert Cranen, Johan de Veth

Spectral subtraction with full-wave rectification and likelihood controlled instantaneous noise estimation for robust speech recognition
Haitian Xu, Zheng-Hua Tan, Paul Dalsgaard, Borge Lindberg

Using linear interpolation to improve histogram equalization for speech recognition
Filipp Korkmazsky, Dominique Fohr, Irina Illina

A factorial HMM approach to robust isolated digit recognition in background music
Mark Hasegawa-Johnson, Ameya Deoras

Multi-eigenspace normalization for robust speech recognition in noisy environments
Yoonjae Lee, Hanseok Ko

Exploiting models intrinsic robustness for noisy speech recognition
Christophe Cerisara, Dominique Fohr, Odile Mella, Irina Illina

Speech recognition experiments with the SPEECON database using several robust front-ends
Pere Pujol, Jaume Padrell, Climent Nadeu, Dusan Macho

Spectro-temporal activity pattern (STAP) features for noise robust ASR
Shajith Ikbal, Mathew Magimai Doss, Hemant Misra, Hervé Bourlard

Improvement of confidence measure performance using background model set algorithm
Byoung-Don Kim, Jin-Young Kim, Seung-Ho Choi, Young-Bum Lee, Kyoung-Rok Lee

Using RASTA in task independent TANDEM feature extraction
Guillermo Aradilla, John Dines, Sunil Sivadas

A distributed speech recognition system in multi-user environments
Kyu Jeong Han, Shrikanth Narayanan, Naveen Srinivasamurthy

Soft features for improved distributed speech recognition over wireless networks
Reinhold Haeb-Umbach, Valentin Ion



Spoken Language Resources and Technology Evaluation I


New challenges in usability evaluation - beyond task-oriented spoken dialogue systems
Laila Dybkjaer, Niels Ole Bernsen, Wolfgang Minker

Using quick transcriptions to improve conversational speech models
Owen Kimball, Chia-lin Kao, Rukmini Iyer, Teodoro Arvizo, John Makhoul

A Wizard of Oz framework for collecting spoken human-computer dialogs
Rohit Mishra, Elizabeth Shriberg, Sandra Upson, Joyce Chen, Fuliang Weng, Stanley Peters, Lawrence Cavedon, John Niekrasz, Hua Cheng, Harry Bratt

Subjective evaluation of spoken dialogue systems using SERVQUAL method
Mikko Hartikainen, Esa-Pekka Salonen, Markku Turunen

Fiction database for emotion detection in abnormal situations
Ioana Vasilescu, Laurence Devillers, Chloe Clavel, Thibaut Ehrette

Fast semi-automatic semantic annotation for spoken dialog systems
Ruhi Sarikaya, Yuqing Gao, Paola Virga

A study on automatic detection of Japanese vowel devoicing for speech synthesis
Yi-Jian Wu, Hisashi Kawai, Jinfu Ni, Renhua Wang

OrienTel-Turkish: telephone speech database description and notes on the experience
Tolga Ciloglu, Dinc Acar, Ahmet Tokatli

Intertranscriber reliability of prosodic labeling on telephone conversation using ToBI
Tae-Jin Yoon, Sandra Chavarria, Jennifer Cole, Mark Hasegawa-Johnson

Efficient compression method for pronunciation dictionaries
Jilei Tian

Construct a multi-lingual speech corpus in Taiwan with extracting phonetically balanced articles
Min-siong Liang, Dau-cheng Lyu, Yuang-chin Chiang, Renyuan Lyu

Automatic prosody labeling of read Norwegian
Per Olav Heggtveit, Jon Emil Natvig

Towards automatic word segmentation of dialect speech
Eric Sanders, Andrea Diersen, Willy Jongenburger, Helmer Strik

New nonsense syllables database - analyses and preliminary ASR experiments
Petr Fousek, Frantisek Grezl, Hynek Hermansky, Petr Svojanovsky

Speech input and output module assessment for remote access to a smart-home spoken dialog system
Jan Felix Krebber, Sebastian Möller, Alexander Raake

An implement of speech DB gathering system using VoiceXML
Dong-Hyun Kim, Yong-Wan Roh, Kwang-Seok Hong

Precise phone boundary detection using wavelet packet and recurrent neural networks
Farshad Almasganj

From WER and RIL to MER and WIL: improved evaluation measures for connected speech recognition
Andrew Cameron Morris, Viktoria Maier, Phil Green

Design and construction of Korean-spoken English corpus
Seok-Chae Rhee, Sook-Hyang Lee, Young-Ju Lee, Seok-Keun Kang

Exploring XML-based technologies and procedures for quality evaluation from a real-life case perspective
Folkert De Vriend, Giulio Maltese

Spoken language interface in ECMA/ISO telecommunication standards
Kuansan Wang

The efficient generation of pronunciation dictionaries: machine learning factors during bootstrapping
Marelie Davel, Etienne Barnard

Towards a new level of annotation detail of multilingual speech corpora
Anja Geumann

CIAIR in-car speech database
Nobuo Kawaguchi, Shigeki Matsubara, Yukiko Yamaguchi, Kazuya Takeda, Fumitada Itakura

Investigating speech style specific pronunciation variation in large spoken language corpora
Christophe Van Bael, Henk van den Heuvel, Helmer Strik

The efficient generation of pronunciation dictionaries: human factors during bootstrapping
Marelie Davel, Etienne Barnard






Speech Coding and Enhancement


A packet loss concealment method using recursive linear prediction
Kazuhiro Kondo, Kiyoshi Nakagawa

On a n-gram model approach for packet loss concealment
Minkyu Lee, Imed Zitouni, Qiru Zhou

Efficient vector quantisation of line spectral frequencies using the switched split vector quantiser
Stephen So, Kuldip K. Paliwal

Enhancement of reverberant speech using excitation source information
M. Chaitanya, S. R. M. Prasanna, Bayya Yegnanarayana

Improving automatic speech recognition performance and speech intelligibility with harmonicity based dereverberation
Keisuke Kinoshita, Tomohiro Nakatani, Masato Miyoshi

Inner product based-multiband vector quantization for wideband speech coding at 16 kbps
Seung Yeol Lee, Nam Soo Kim, Joon-Hyuk Chang

Speech enhancement and recognition by integrating adaptive beamforming and Wiener filtering
Alberto Abad, Javier Hernando

Temporal normalization techniques for transform-type speech coding and application to split-band wideband coders
Kyung-Tae Kim, Sung-Kyo Jung, MiSuk Lee, Hong-Goo Kang, Dae Hee Youn

Interface for barge-in free spoken dialogue system using adaptive sound field control
Tatsunori Asai, Shigeki Miyabe, Hiroshi Saruwatari, Kiyohiro Shikano

Multi-mode harmonic transform excitation LPC coding for speech and music
Jong-Hark Kim, Jae-Hyun Shin, In-Sung Lee

Source separation using particle filters
Mital Gandhi, Mark Hasegawa-Johnson

Segmental speech coding model for storage applications
Anssi Ramo, Jani Nurminen, Sakari Himanen, Ari Heikkinen

Improved speech enhancement by applying time-shift property of DFT on Hankel matrices for signal subspace decomposition
Gwo-hwa Ju, Lin-shan Lee

Minimum phase compensation in speech coding using Hammerstein model
Jari Juhani Turunen, Juha Tanttu, Frank Cameron

Optimizing regression for in-car speech recognition using multiple distributed microphones
Weifeng Li, Fumitada Itakura, Kazuya Takeda

Speech enhancement based on magnitude estimation using the gamma prior
Weifeng Li, Kazuya Takeda, Fumitada Itakura, Huy Dat Tran

Unscented Kalman filtering of line spectral frequencies
Andrew Errity, John McKenna, Stephen Isard

Speech enhancement based on smoothing of spectral noise floor
Hyoung-Gook Kim, Thomas Sikora

Noise reduction using hybrid noise estimation technique and post-filtering
Junfeng Li, Masato Akagi

An adaptive Kalman filter for the enhancement of speech signals
Marcel Gabrea

Improved iterative Wiener filtering for non-stationary noise speech enhancement
T. V. Sreenivas, K. Sharath Rao, A. Sreenivasa Murthy

Highband spectrum envelope estimation of telephone speech using hard/soft-classification
Yasheng Qian, Peter Kabal






Prosodic Recognition and Analysis


Evaluation of a threshold for detecting local slower phrases in Japanese spontaneous conversational speech
Keiichi Takamaru

Intonation recognition for Indonesian speech based on Fujisaki model
Nazrul Effendy, Ekkarit Maneenoi, Patavee Charnvivit, Somchai Jitapunkul

Efficient tone classification of speaker independent continuous Chinese speech using anchoring based discriminating features
Jin-Song Zhang, Satoshi Nakamura, Keikichi Hirose

Clause types and filled pauses in Japanese spontaneous monologues
Michiko Watanabe, Yasuharu Den, Keikichi Hirose, Nobuaki Minematsu

Effect of voice prosody on the decision making process in human-computer interaction
Yohei Yabuta, Yasuhiro Katagiri, Noriko Suzuki, Yugo Takeuchi

Alignment of human prosodic patterns for spoken dialogue systems
Noriko Suzuki, Yasuhiro Katagiri

Evaluation of a prosodic labeling system utilizing linguistic information
Shinya Kiriyama, Shigeyoshi Kitazawa

Functions of intonation boundaries during spoken language comprehension in English
Allison Blodgett

Voice activation using prosodic features
Marco Kühne, Matthias Wolff, Matthias Eichner, Rüdiger Hoffmann

The role of prosodic cues in word segmentation of Korean
Sahyang Kim

Default phrasing and attachment preference in Korean
Sun-Ah Jun

Modeling and recognition of phonetic and prosodic factors for improvements to acoustic speech recognition models
Sarah Borys, Aaron Cohen, Mark Hasegawa-Johnson, Jennifer Cole

The role of pitch range variation in the discourse structure and intonation structure of Korean
Eunjong Kong

Dependency analysis of read Japanese sentences using pause and F0 information: a speaker independent case
Kazuyuki Takagi, Kazuhiko Ozeki

Effects of prosodic boundaries on ambiguous syntactic clause boundaries in Japanese
Shari Speer, Soyoung Kang

The superior effectiveness of the F0 range for identifying the context from sounds without phonemes
Yasuko Nagasaki, Takanori Komatsu

A study of tone classification for continuous Thai speech recognition
Tan Li, Montri Karnjanadecha, Thanate Khaorapapong

An acoustic-analytic role for the deviation between the scansion and reading of poems
Key-Seop Kim, Un Lim, Dong-Il Shin

Estimating syntactic structure from prosodic features in Japanese speech
Tomoko Ohsuga, Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa

Perceptual discrimination of prosodic types and their preliminary acoustic analysis
Masahiko Komatsu, Tsutomu Sugawara, Takayuki Arai




Plenary Talks

Speech Recognition - Adaptation

Spoken Language Identification, Translation and Retrieval I

Linguistics, Phonology, and Phonetics

Biomedical Applications of Speech Analysis

Robust Speech Recognition on AURORA

Spoken / Multimodal Dialogue System

Speech Recognition - Search

Spoken Dialogue and Systems

Speech Perception

Multi-Lingual Speech-to-Speech Translation

Speech Recognition - Large Vocabulary

Speech Science

Novel Features in ASR

Spoken and Natural Language Understanding

Speaker Segmentation and Clustering

Speech Processing in a Packet Network Environment

Acoustic Modeling

Prosody Modeling and Generation

Multi-Sensor ASR

Multi-Lingual Speech Processing

Speech Enhancement

Speech and Affect

Speech Features

Language Modeling, Multimodal & Multilingual Speech Processing

Detection and Classification in ASR

Speech Analysis

Speech Production

Audio-Visual Speech Processing

Spoken Language Generation and Synthesis III

Speech Recognition - Language Model

Speaker Recognition

Processing of Prosody by Humans and Machines

Contemporary Issues in ASR

Second Language Learning and Spoken Language Processing

Emerging Research: Human Factors in Speech and Communication Systems

Interdisciplinary Topics in Spoken Language Processing

Towards Adaptive Machines: Active and Unsupervised Learning

Speech Coding

Robust ASR

Emerging Research

Spoken Language Resources and Technology Evaluation I

Multi-Modal / Multi-Media Processing

Automatic Speech Recognition in the Context of Mobile Communications

Robust Features for ASR

Towards Rapid Speech and Natural Language Application Development: Tooling, Architectures, Components and Standards

Speech Coding and Enhancement

Acoustic Modeling for Robust ASR

Spoken Dialogue Technology and Systems

Multi-Channel Speech Processing

Intersection of Spoken Language Processing and Written Language Processing

Prosodic Recognition and Analysis

Towards Rapid Speech and Natural Language Application Development