ISCA Archive Eurospeech 1999


6th European Conference on Speech Communication and Technology

Budapest, Hungary
5-9 September 1999

General Chair: Géza Gordos
doi: 10.21437/Eurospeech.1999




Speech Recognition - Acoustic Processing


Robust energy normalization using speech/nonspeech discriminator for German connected digit recognition
Rathinavelu Chengalvarayan

Acoustic pre-processing for optimal effectivity of missing feature theory
Johan de Veth, Bert Cranen, Febe de Wet, Louis Boves

Simultaneous recognition of multiple sound sources based on 3-d n-best search using microphone array
Panikos Heracleous, Takeshi Yamada, Satoshi Nakamura, Kiyohiro Shikano

Down-sampling speech representation in ASR
Hynek Hermansky, Pratibha Jain

Comparison of time & frequency filtering and cepstral-time matrix approaches in ASR
Dusan Macho, Climent Nadeu, Peter Jancovic, Gregor Rozinaj, Javier Hernando

Syllable onset detection applied to the Portuguese language
Hugo Meinedo, Joao P. Neto, Luis B. Almeida

Decorrelated and liftered filter-bank energies for robust speech recognition
Kuldip K. Paliwal

Optimization algorithms for estimating modulation spectrum domain filters
Pau Paches-Leal, Richard C. Rose, Climent Nadeu

Efficient vector quantization using an n-path binary tree search algorithm
R. San-Segundo, R. Córdoba, J. Ferreiros, A. Gallardo, J. Colás, J. Pastor, Y. López

Neural network based optimal feature extraction for ASR
Narada D. Warakagoda, Magne H. Johnsen

A study of speech recognition for the elderly
Fumihiro Yato, Naomi Inoue, Kazuo Hashimoto

The analysis and application of a new endpoint detection method based on distance of autocorrelated similarity
Jie Zhu, Fei-li Chen


Articulatory Measurements And Modelling


Hyper-articulated speech: auditory and visual intelligibility
Denis Beautemps, Pascal Borel, Sébastien Manolios

Modeling of the vocal tract in three dimensions
Olov Engwall

Articulatory reduction in emotional speech
Miriam Kienast, Astrid Paeschke, Walter Sendlmeier

A trajectory formation model of articulatory movements using a multidimensional phonemic task
Tokihiko Kaburagi, Masaaki Honda, Takeshi Okadome

LPC-based inversion of the DRM articulatory model
Sacha Krstulovic

A vocal tract model using multi-line equivalent circuits
Nobuhiro Miki, Thoru Yokoyama, Takeshi Ohtani, Shinobu Masaki, Ikuhiro Shimada, Ichiro Fujimoto, Yuji Nakamura

Acoustic nature of the whisper
Masahiro Matsuda, Hideki Kasuya

Relations between utterance speed and articulatory movements
Takeshi Okadome, Tokihiko Kaburagi, Masaaki Honda

Design of hypercube codebooks for the acoustic-to-articulatory inversion respecting the non-linearities of the articulatory-to-acoustic mapping
Slim Ouni, Yves Laprie

A missing-word test comparison of human and statistical language model performance
Marie Owens, Anja Krüger, Paul Donnelly, F J Smith, Ji Ming

Estimating velum height from acoustics during continuous speech
Korin Richmond

On improving the decision algorithm for articulatory codebook search
C. Silva, S. Chennoukh, Isabel Trancoso

Extraction of articulators in x-ray image sequences
G. Thimm, J. Luettin

Effects of source-tract interaction in perception of nasality
António Teixeira, Francisco Vaz, José Carlos Príncipe

Perceiving anticipatory phonetic gestures in French
Béatrice Vaxelaire, Rudolph Sock, Véronique Hecker

Motor equivalence evidenced by articulatory modelling
Anne Vilain, Christian Abry, Pierre Badin






Speech Recognition - Confidence Measures 2


Accurate recognition of city names with spelling as a fall back strategy
Josef G. Bauer, Jochen Junkawitsch

Selective prosodic post-processing for improving recognition of French telephone numbers
Katarina Bartkova, Denis Jouvet

Improving rejection with semantic slot-based confidence scores
Eric I. Chang

The IBM conversational telephony system for financial applications
K. Davies, R. Donovan, M. Epstein, Martin Franz, Abraham Ittycheriah, E. E. Jan, J. M. LeRoux, David Lubensky, Chalapathy Neti, Mukund Padmanabhan, K. Papineni, Salim Roukos, A. Sakrajda, Jeffrey S. Sorensen, B. Tydlitat, T. Ward

Error spotting using syllabic fillers in spontaneous conversational speech recognition
Rachida El Méliani, Douglas O’Shaughnessy

Recognition of spelled names over the telephone and rejection of data out of the spelling lexicon
Denis Jouvet, Jean Monné

An utterance verification system based on subword modeling for a vocabulary independent speech recognition system
Myoung-Wan Koo, Sun-Jeong Lee

Use of a confidence measure based on frame level likelihood ratios for the rejection of incorrect data
Nicolas Moreau, Denis Jouvet

Variable preselection list length estimation using neural networks in a telephone speech hypothesis-verification system
J. Macías-Guarasa, J. Ferreiros, A. Gallardo, R. San-Segundo, Juan Manuel Pardo, L. Villarrubia

Speaker normalization and pronunciation variant modeling: helpful methods for improving recognition of fast speech
Thilo Pfau, Robert Faltlhauser, Günther Ruske

Automatic speech recognition using acoustic confidence conditioned language models
Richard C. Rose, Giuseppe Riccardi

Utilizing prosody for unconstrained morpheme recognition
Volker Strom, Henrik Heine

Modeling the prosody of hidden events for improved word recognition
Andreas Stolcke, Elizabeth Shriberg, Dilek Hakkani-Tür, Gökhan Tür

A comparison of word graph and n-best list based confidence measures
Frank Wessel, Klaus Macherey, Hermann Ney


Speech Analysis And Tools


C++ software environment for speech signal processing
Marcus M. Prätzas, Ulrich Balss, Herbert Reininger, Harald Wüst

Improvement of electrolaryngeal speech by introducing normal excitation information
Kun Ma, Pelin Demirel, Carol Espy-Wilson, Joel MacAuslan

Detecting user speech in barge-in over prompts using speaker identification methods
Abraham Ittycheriah, Richard J. Mammone

Speaker and channel-normalized set of formant parameters for telephone speech recognition
Boris Lobanov, T. Levkovskaya, Igor E. Kheidorov

Fuzzy segmentation of lip image using cluster analysis
Alan W.C. Liew, K. L. Sum, S. H. Leung, Wai H. Lau

Software to support research and development of spoken dialogue systems
Michael F. McTear

Analysis of sources of variability in speech
Sachin Kajarekar, Narendranath Malayath, Hynek Hermansky

Adaptive nonlinear prediction based on order statistics for speech signals
Tetsuya Shimamura, Haruko Hayakawa

Developing a voiced information retrieval system for the Portuguese language capable to handle both Brazilian and Portuguese spoken versions
M. N. Souza, E. J. Caprini, C. G. Machado, M. V. Ludolf, L. P. Calôba, J. M. Seixas, F. G. Resende, S. L. Netto, Diamantino R. Freitas, Joao Paulo Teixeira, C. Espain, V. Pera, F. Moreira

Real-time speech modeling using computationally efficient locally recurrent neural networks (CERNs)
John J. Soraghan, Amir Hussain, Ivy Shim

Effectiveness of KL-transformation in spectral delta expansion
M. Tokuhira, Y. Ariki





Speech Recognition - Search And Pronunciation Modelling


A two-stage speech recognition method with an error correction model
Yoshiharu Abe, Hiroyasu Itsui, Yuzo Maruta, Kunio Nakajima

Speech recognition with automatic punctuation
C. Julian Chen

Automatic modeling of pronunciation variations
Ellen Eide

Reducing search complexity in low perplexity tasks
Martin Franz, Miroslav Novak

A two-stage speech recognition method for information retrieval applications
Paolo Coletti, Marcello Federico

Multi-level decision trees for static and dynamic pronunciation models
Eric Fosler-Lussier

Modeling and efficient decoding of large vocabulary conversational speech
Michael Finke, Jürgen Fritsch, Detlef Koll, Alex Waibel

Evaluation of a segmentation system based on multi-level lattices
Jean-Luc Husson

The application of an improved DP match for automatic lexicon generation
Philip Hanna, Darryl Stewart, Ji Ming

Modeling trajectories in the HMM framework
Rukmini Iyer, Owen Kimball, Herbert Gish

Korean large vocabulary continuous speech recognition using pseudomorpheme units
Oh-Wook Kwon, Kyuwoong Hwang, Jun Park

Navigating German cities by spontaneous French queries
Harouna Kabré, Alexander Waibel

Generating alternative pronunciations from a dictionary
Filipp Korkmazskiy, Chin-Hui Lee

Finding consensus among words: lattice-based word error minimization
Lidia Mangu, Eric Brill, Andreas Stolcke

An efficient decoding method for real time speech recognition
Stefan Ortmanns, Wolfgang Reichl, Wu Chou

Recent improvements in voicemail transcription
Mukund Padmanabhan, G. Saon, S. Basu, Jing Huang, Geoffrey Zweig

Acoustics-based baseform generation with pronunciation and/or phonotactic models
Bhuvana Ramabhadran, Sabine Deligne, Abraham Ittycheriah

Improving recognition correct rate of important words in large vocabulary speech recognition
Yasuo Shirosaki, Hideaki Kikuchi, Katsuhiko Shirai

Pronunciation modeling by sharing Gaussian densities across phonetic models
Murat Saraclar, Harriet Nock, Sanjeev Khudanpur

One pass cross word decoding for large vocabularies based on a lexical tree search organization
Xavier L. Aubert







Speech Recognition - Broadcast News


The Philips/RWTH system for transcription of broadcast news
Peter Beyerlein, Xavier Aubert, Reinhold Haeb-Umbach, Matthew Harris, Dietrich Klakow, A. Wendemuth, Sirko Molau, Michael Pitz, A. Sixtus

Toward realtime transcription of broadcast news
Jason Davenport, Long Nguyen, Spyros Matsoukas, Richard Schwartz, John Makhoul

Recent advances in transcribing television and radio broadcasts
Jean-Luc Gauvain, Lori Lamel, Gilles Adda, Michèle Jardino

Selection for acoustic coverage from unlimited speech extracted from closed-captioned TV
Photina Jaeyun Jang, Alexander G. Hauptmann

Laughter extracted from television closed captions as speech recognizer training data
Paul E. Kennedy, Alexander G. Hauptmann

Further advances in transcription of broadcast news
Long Nguyen, Spyros Matsoukas, Jason Davenport, Daben Liu, Jay Billa, Francis Kubala, John Makhoul

Recent advances in Japanese broadcast news transcription
Katsutoshi Ohtsuki, Sadaoki Furui, Naoyuki Sakurai, Atsushi Iwasaki, Zhi-Peng Zhang

Automatic verification of broadcast news transcriptions
Michael Pitz, Sirko Molau

Improved speaker segmentation and segments clustering using the Bayesian information criterion
Alain Tritschler, Ramesh A. Gopinath

Development of the 1998 OGI-FONIX broadcast news transcription system
Xintian Wu, Yonghong Yan

Speech/music discrimination based on posterior probability features
Gethin Williams, Daniel P.W. Ellis

Dragon Systems' 1998 broadcast news transcription system
Steven Wegmann, Puming Zhan, Ira Carp, Michael Newman, Jon Yamron, Larry Gillick

Progress in automatic meeting transcription
Hua Yu, Michael Finke, Alex Waibel

A study of broadcast news audio stream segmentation and segment clustering
Matthew Harris, Xavier Aubert, Reinhold Haeb-Umbach, Peter Beyerlein

Fast speaker change detection for broadcast news transcription and indexing
Daben Liu, Francis Kubala

Robust information extraction from spoken language data
David D. Palmer, Mari Ostendorf, John D. Burger

Integrated transcription and identification of named entities in broadcast speech
Steve Renals, Yoshihiko Gotoh

Improvements in accuracy and speed in the HTK broadcast news transcription system
P. C. Woodland, J. J. Odell, T. Hain, G. L. Moore, T. R. Niesler, Andreas Tuerk, E. W. D. Whittaker



Speaker Recognition - Acoustic Features And Robustness


Experimental evaluation of text-independent speaker verification on laboratory and field test databases in the M2VTS project
Laurent Besacier, J. Luettin, G. Maitre, E. Meurville

Channel estimation and normalization by coherent spectral averaging for robust speaker verification
Rajesh Balchandran, Vidhya Ramanujam, Richard J. Mammone

Time-frequency principal components of speech: application to speaker identification
Ivan Magrin-Chagnolleau, Geoffrey Durou

Speaker recognition by means of a combination of linear and nonlinear predictive models
Marcos Faúndez-Zanuy

Feature vector transformation using independent component analysis and its application to speaker identification
Gil-Jin Jang, Seong-Jin Yun, Yung-Hwan Oh

The prototype model in speaker identification
Yizhar Lavner, Judith Rosenhouse, Isak Gath

A new cepstrum-based channel compensation method for speaker verification
T. F. Lo, M. W. Mak, K. K. Yiu

Speaker recognition based on discriminative feature extraction - optimization of mel-cepstral features using second-order all-pass warping function
Chiyomi Miyajima, Hideyuki Watanabe, Tadashi Kitamura, Shigeru Katagiri

Facing severe channel variability in forensic speaker verification conditions
Javier Ortega-Garcia, Santiago Cruz-Llanas, Joaquin Gonzalez-Rodriguez

Speaker and language recognition using speech codec parameters
Thomas F. Quatieri, E. Singer, R. B. Dunn, Douglas A. Reynolds, J. P. Campbell

Robust speaker verification in noisy conditions by modification of spectral time trajectories
Vidhya Ramanujam, Rajesh Balchandran, Richard J. Mammone

Toward parametric representation of speech for speaker recognition systems
Rivarol Vergin, Douglas O'Shaughnessy, Pierre Dumouchel

Text independent speaker identification using LSP codebook speaker models and linear discriminant functions
R. D. Zilca, Y. Bistritz





Speech Recognition - Multilinguality


Recognition of continuous Persian speech using a medium-sized vocabulary speech corpus
S. M. Ahadi

Multi-lingual speech recognition based on demi-syllable subword units
Tibor Fegyó, Péter Tatai

MAP-based cross-language adaptation augmented by linguistic knowledge: from English to Chinese
Pascale Fung, Chi Yuen Ma, Wai Kat Liu

Analysis of HMM models in alphabet letters recognition
Stefan Grocholewski

Tone recognition of Chinese continuous speech using tone critical segments
Keikichi Hirose, Jin-song Zhang

Phonetic state tied-mixture tone modeling for large vocabulary continuous Mandarin speech recognition
Tai-Hsuan Ho, Chin-Jung Liu, Herman Sun, Ming-Yi Tsai, Lin-Shan Lee

The clustering algorithm for the definition of multilingual set of context dependent speech models
Bojan Imperl, Bogomir Horvat

Study on tone classification of Chinese continuous speech in speech recognition system
Jian Liu, Xiaodong He, Fuyuan Mo, Tiecheng Yu

Decision tree-based triphones are robust and practical for Mandarin speech recognition
Yi Liu, Pascale Fung

Decision trees for inter-word context dependencies in Spanish continuous speech recognition tasks
K. López de Ipiña, A. Varona, I. Torres, L. J. Rodríguez

End points detection for noisy speech using a wavelet based algorithm
Amin M. Nassar, Nemat S. Abdel Kader, Amr M. Refat

Adaptation of acoustic models for multilingual recognition
C. Nieuwoudt, E. C. Botha

Recognition of non-native German speech with multilingual recognizers
Ulla Uebler, Manuela Boros


Systems, Architectures, Interfaces


Relational vs. object-oriented models for representing speech: a comparison using ANDOSL data
Toomas Altosaar, Bruce Millar, Martti Vainio

First experiences of the German SpeechDat-Car database collection in mobile environments
Christoph Draxler, Robert Grudszus, Stephan Euler, Klaus Bengler

OASIS - a framework for spoken language call steering
Mike Edgington, David Attwater, Peter Durston

VOCAPI - small standard API for command & control
Eike Gegenmantel

Standardised speech interfaces - key for objective evaluation of recognition accuracy
Christel Müller, Karsten Schröder

A medical rehabilitation diagnoses transcription method that integrates continuous and isolated word recognition
Shoichi Matsunaga, Yoshiaki Noda, Katsutoshi Ohtsuki, Eiji Doi, Tomio Itoh

Problems of creating a flexible e-mail reader for Hungarian
Géza Németh, Csaba Zainkó, Gábor Olaszy, Gábor Prószéky

Interactive, TTS supported speech message composer for large, limited vocabulary, but open information systems
Gábor Olaszy, Géza Németh, Péter Olaszi, Géza Gordos

ALE for speech: a translation prototype
Gerald Penn, Bob Carpenter

An integrated system for Spanish CSR tasks
L. J. Rodríguez, M. I. Torres, J. M. Alcaide, A. Varona, K. López de Ipiña, M. Penagarikano, G. Bordel

Use of speech synthesis in an application
Angelien Sanderman, Ellen Bosgoed, Hans de Graaff, Peter van Splunder

Text-to-audio-visual speech synthesis based on parameter generation from HMM
Masatsune Tamura, Shigekazu Kondo, Takashi Masuko, Takao Kobayashi

Authoring tools for speech synthesis using the SABLE markup standard
Johan Wouters, Brian Rundle, Michael W. Macon


Speaker Recognition - Scoring And Decision


Dynamic weighting of the distortion sequence in text-dependent speaker verification
A. M. Ariyaeeinia, P. Sivakumaran, M. Pawlewski, M. J. Loomes

On the use of supra model information from multiple classifiers for robust speaker identification
Hakan Altincay, Mübeccel Demirekler

Missing features detection and handling for robust speaker verification
Mounir El-Maliki, Andrzej Drygajlo

High performance text-independent speaker recognition system based on voiced/unvoiced segmentation and multiple neural nets
Nikos Fakotakis, John Sirigos, George Kokkinakis

Similarity normalization method based on world model and a posteriori probability for speaker verification
Corinne Fredouille, Jean-François Bonastre, Teva Merlin

Text-independent speaker verification using virtual speaker based cohort normalization
Toshihiro Isobe, Jun-ichi Takahashi

Robust person verification based on speech and facial images
J. Luettin, S. Ben-Yacoub

A neural network-based text-dependent speaker verification system using suprasegmental features
M. Mathew, B. Yegnanarayana, R. Sundar

Modelling output probability distributions for enhancing speaker recognition
Jason Pelecanos, Sridha Sridharan

On the use of neural networks to combine utterance and speaker verification systems in a text-dependent speaker verification task
L. Rodríguez-Linares, C. García-Mateo, J. L. Alba-Castro

Genesys: a neural network model for speaker identification
B. Ruiz-Mezcua, R. Rodríguez-Galán, Luis A. Hernández-Gómez, Paloma Domingo-García, Enrique Bailly-Baillière Gutiérrez

Speaker verification with growing cell structures
Bogdan Sabac, Inge Gavat

Environment adaptation and long term parameters in speaker identification
Chakib Tadj, Pierre Dumouchel, Mohamed Mihoubi, Pierre Ouellet

Speaker identification using subband HMMs
K. Yoshida, K. Takagi, K. Ozeki

A priori threshold determination for phrase-prompted speaker verification
W. D. Zhang, K. K. Yiu, M. W. Mak, C. K. Li, M. X. He




Speech Recognition - Acoustic Modelling 1


Experiments in constrained maximum likelihood extraction of temporal features for speech recognition
Gilles Boulianne, Julie Brousseau, Nathalie Talbot, Pierre Dumouchel

Model selection in acoustic modeling
S. S. Chen, Ramesh A. Gopinath

Acoustic modeling and language modeling for Cantonese LVCSR
Y. W. Wong, K. F. Chow, Wai H. Lau, W. K. Lo, Tan Lee, P. C. Ching

Context dependent hybrid HMM/ANN systems for large vocabulary continuous speech recognition system
O. Deroo, C. Ris, S. Dupont

Reduced Gaussian mixture models in a large vocabulary continuous speech recognizer
V. Fischer, T. Ross

Mixture trees - hierarchically tied mixture densities for modeling HMM emission probabilities
J. Fritsch

Reinforcement learning for phoneme recognition
Akira Ichikawa, Tomoyuki Shimizu, Yasuo Horiuchi

Combined temporal and spectral multi-resolution phonetic modelling
Paul McCourt, Naomi Harte, Saeed Vaseghi

Speed improvement of the time-asynchronous acoustic fast match
Miroslav Novak, Michael Picheny

A hybrid ANN/HMM syllable recognition module based on vowel spotting
John Sirigos, Nikos Fakotakis, George Kokkinakis

Data-driven modulation filter design under adverse acoustic conditions and using phonetic and syllabic units
Michael L. Shire

Accuracy versus complexity in context dependent phone modeling
Wei Xu, Jacques Duchateau, Kris Demuynck, Ioannis Dologlou, Patrick Wambacq, Dirk van Compernolle, Hugo van Hamme

A new hybrid structure of speech recognizer based on HMM and neural network
Jianlai Zhou, Xiaodong He, Tiecheng Yu, Fuyuan Mo

Dependency modeling with Bayesian networks in a voicemail transcription system
Geoffrey Zweig, Mukund Padmanabhan

A statistical coarticulatory model for the hidden vocal-tract-resonance dynamics
Li Deng, Jeff Ma

A study on the effect of adding new dimensions to trajectories in the acoustic space
D. Albesano, R. De Mori, R. Gemello, F. Mana

Tail distribution modelling using the Richter and power exponential distributions
M. J. F. Gales, P. A. Olsen

A study of duration in continuous speech recognition based on DDBHMM
Qingwei Zhao, Zuoying Wang, Dajin Lu

Comparison of continuous-density and semi-continuous HMM in isolated words recognition systems
T. Vaich, A. Cohen







Speech Recognition - Acoustic Modelling 2


Hybrid connectionist-structural acoustical modeling in the ATROS system
M. J. Castro, F. Casacuberta

Path-dependent Kalman estimation of a cepstral bias
Lionel Delphin-Poulat, Jérôme Idier

Acoustical modelling of phone transitions: biphones and diphones - what are the differences?
S. Dobrisek, F. Mihelic, N. Pavesic

Optimal feature sub-space selection based on discriminant analysis
Kris Demuynck, Jacques Duchateau, Dirk Van Compernolle

Phoneme recognition system based on HMM with distributed VQ codebook
Mohamed Debyeche, Mohamed Afify, Jean-Paul Haton

Research on speech units modeling in continuous speech recognition
Xiaodong He, Jian Liu, Jianlai Zhou, Tiecheng Yu

An investigation of cepstral parameterisations for large vocabulary speech recognition
Reinhold Haeb-Umbach, Marco Loog

Dynamic HMM selection for continuous speech recognition
T. Hain, P. C. Woodland

Unified decoding and feature representation for improved speech recognition
Li Jiang, Xuedong Huang

High accuracy acoustic modeling based on multi-stage decision tree
DongHwa Kim, Chaojun Liu, Xintian Wu, Yonghong Yan

Optimization of dynamic regimes in a statistical hidden dynamic model for conversational speech recognition
Jeff Ma, Li Deng

Top-down bottom-up hybrid clustering algorithm for acoustic-phonetic modeling of speech
José B. Marino, Albino Nogueiras-Rodríguez

Data collection in real acoustical environments for sound scene understanding and hands-free speech recognition
Satoshi Nakamura, Kazuo Hiyane, Futoshi Asano, Takeshi Yamada, Takashi Endo

Diphone subspace models for phone-based HMM complementation
Klaus Reinhard, Mahesan Niranjan

Unified framework for acoustic topology modelling: ML-SSS and question-based decision trees
Harald Singer, Atsushi Nakamura

Linear transformations in sub-band groups for speech recognition
B. Doherty, Saeed Vaseghi, Paul McCourt

A prototype of Mandarin speech telephone number inquiry system
Peng-Ren Lu, Wei-Tyng Hong, Sheng-Lun Chiang, Yih-Ru Wang, Sin-Horng Chen

Off-line acoustic modelling of non-native accents
Silke Witt, Steve Young

Semi-supervised adaptation of acoustic models for large-volume dictation
Colin W. Wightman, Ted A. Harder

Enhanced likelihood computation using regression
Peter de Souza, Bhuvana Ramabhadran, Yuqing Gao, Michael Picheny

High accuracy acoustic modeling using two-level decision-tree based state-tying
Chaojun Liu, Xintian Wu, Yonghong Yan

Domain adduced state tying for cross-domain acoustic modelling
R. Singh, B. Raj, Richard M. Stern

Parameter tying and Gaussian clustering for faster, better, and smaller speech recognition
Ananth Sankar, Venkata Ramana Rao Gadde

A combined maximum mutual information and maximum likelihood approach for mixture density splitting
Ralf Schlüter, Wolfgang Macherey, Boris Müller, Hermann Ney


Dialogue 2


Knowledge collection for natural language spoken dialog systems
Egbert Ammicht, Allen Gorin, Tirso Alonso

Improving discourse management in TRIPS-98
Donna K. Byron

Speech act modeling in a spoken dialogue system using fuzzy hidden Markov model and Bayes' decision criterion
Chung-Hsien Wu, Gwo-Lang Yan, Chien-Liang Lin

Task hierarchies representing sub-dialogs in speech dialog systems
Ute Ehrlich

Effects of system barge-in responses on user impressions
Jun-Ichi Hirasawa, Mikio Nakano, Takeshi Kawabata, Kiyoaki Aikawa

A new word-confidence threshold technique to enhance the performance of spoken dialogue systems
R. López-Cózar, Antonio J. Rubio, P. García, J. C. Segura

Confirmation strategies to improve correction rates in a telephonic inquiry dialogue system
C. Alexia Lavelle, Martine de Calmés, Guy Pérennou

Mathematical analysis of dialogue control strategies
Yasuhisa Niimi, Takuya Nishimoto

Processing of anaphoric and elliptic sentences in a spoken dialog system
Jana Ocelikova, Vaclav Matousek

Free-flow dialog management using forms
K. A. Papineni, Salim Roukos, T. Ward

Towards the detection and description of textual meaning indicators in spontaneous conversations
Klaus Ries

Dialogue management in the Dutch ARISE train timetable information system
Janienke Sturm, Els den Os, Lou Boves

Problem spotting in human-machine interaction
Emiel Krahmer, Marc Swerts, Mariet Theune, Mieke Weegels

Consistent dialogue across concurrent topics based on an expert system model
Bor-shen Lin, Hsin-min Wang, Lin-shan Lee


Speech Coding


Secondary codebook storage quantisation
Thomas M. Chapman, C. S. Xydeas

Pseudo-articulatory representations: promise, progress and problems
W. H. Edmondson, D. J. Iskra, P. Kienzle

A 1.7KBPS waveform interpolation speech coder using decomposition of pitch cycle waveform
Ge Gao, P. C. Ching

Enhanced analysis-by-synthesis waveform interpolative coding at 4 KBPS
Oded Gottesman, Allen Gersho

Joint source-channel decoding by channel-coded optimal estimation (CCOE) for a CELP speech codec
Norbert Görtz

Analysis-by-synthesis low-rate multimode harmonic speech coding
Chunyan Li, Allen Gersho, Vladimir Cuperman

Variable length coding of transformed LSF coefficients
László Lois

Low bit-rate speech coding using quantization of variable length segments
R. Mayrench, D. Malah

Low delay analysis/synthesis schemes for joint speech enhancement and low bit rate speech coding
Rainer Martin, Hong-Goo Kang, Richard V. Cox

A comparative study of several ADPCM schemes with linear and nonlinear prediction
Oscar Oliva, Marcos Faúndez-Zanuy

Segmental feature extraction and coding for speech synthesis
H. Ohmura, K. Tanaka

Backward adaptive RBF-based hybrid predictors for CELP-type coders at medium bit-rates
C. Peláez-Moreno, F. Díaz-de-María

An improved speech model with allowance for time-varying pitch harmonic amplitudes and frequencies in low bit-rate MBE coders
Valentin V. Sercov, Alexander A. Petrovsky

Sparse vector linear prediction matrices with multidiagonal structure
Davor Petrinovic, Davorka Petrinovic

Source-dependent variable rate speech coding below 3 KBPS
M. Stefanovic, A. Kondoz

A novel speech coding approach based on half-wave vector quantization
Xiaoping Chen, Yantao Song, Tiecheng Yu

Speech coding using mixture of Gaussians polynomial model
Parham Zolfaghari, Tony Robinson




Speech Recognition - Language Modelling


Language modeling based on automatic word concatenations
Christel Beaujard, Michèle Jardino

Recognition performance of a structured language model
Ciprian Chelba, Frederick Jelinek

On the use of right context in sense-disambiguating language models
Vincent Chow, Dekai Wu

Language modelling with hierarchical domains
Paul G. Donnelly, F. J. Smith, E. Sicilia, Ji Ming

Integration of several information sources for robust class-based statistical language modelling
Géraldine Damnati

Efficient language model adaptation through MDI estimation
Marcello Federico

Syntax-based speech recognition: how a syntactic parser can help a recognition system
Arnaud Gaudinat, Jean-Philippe Goldman, Eric Wehrli

A new metric for stochastic language model evaluation
Akinori Ito, Masaki Kohda, Mari Ostendorf

Phrase-based language models for speech recognition
Hong-Kwang Jeff Kuo, Wolfgang Reichl

Class-combined word n-gram for robust language modeling
Norihiko Kobayashi, Tetsunori Kobayashi

Using partial morphological analysis in language modeling estimation for large vocabulary Portuguese speech recognition
Ciro Martins, Joao P. Neto, Luís B. Almeida

Discriminative training of language model classifiers
Uwe Ohler, Stefan Harbeck, Heinrich Niemann

Improving n-gram modeling using distance-related unit association maximum entropy language modeling
Shuwu Zhang, Harald Singer, Dekai Wu, Yoshinori Sagisaka

Language modeling for broadcast news transcription
Gilles Adda, Michèle Jardino, Jean-Luc Gauvain

Large Span statistical language models: application to homophone disambiguation for large vocabulary speech recognition in French
Frédéric Béchet, Alexis Nasr, Thierry Spriet, Renato de Mori

Language modelling and spoken dialogue systems - the ARISE experience
P. Baggia, A. Kellner, Guy Pérennou, C. Popovici, Janienke Sturm, Frank Wessel

Language model level vs. lexical level for modeling pronunciation variation in a French CSR
Laure Brieussel-Pousse, Guy Pérennou

Characteristics of Chinese language models for large vocabulary telephone speech
Roger H.Y. Leung, Chi-Yan Choy, Hong C. Leung

A new based distance language model for a dictation machine: application to MAUD
D. Langlois, K. Smaïli

Using various language model smoothing techniques for the transcription of a weather forecast broadcasted by the Czech radio
Ludek Müller, Josef Psutka

Studies in acoustic training and language modeling using simulated speech data
Don McAllaster, Larry Gillick

Language model adaptation using minimum discrimination information
Wolfgang Reichl

Automatic and manual clustering for large vocabulary speech recognition: a comparative study
K. Smaïli, A. Brun, I. Zitouni, Jean-Paul Haton

Learning of stochastic context-free grammars by means of estimation algorithms
Joan-Andreu Sánchez, José-Miguel Benedí

Part-of-speech n-gram and word n-gram fused language model
Hirofumi Yamamoto, Yoshinori Sagisaka

Linguistic features for whole sentence maximum entropy language models
Xiaojin Zhu, Stanley F. Chen, Ronald Rosenfeld

Variable-length sequence language model for large vocabulary continuous dictation machine
I. Zitouni, J. F. Mari, K. Smaïli, Jean-Paul Haton

Using detailed linguistic structure in language modelling
Ruiqiang Zhang, Ezra Black, Andrew Finch


Prosody - Study Of Prosody For Speech Synthesis


Decision tree micro-prosody structures for text to speech synthesis
Aimin Chen, Shu Lian Wong, Saeed Vaseghi, Charles Ho

Automatic modeling of duration in a Spanish text-to-speech system using neural networks
R. Córdoba, J. A. Vallejo, J. M. Montero, J. Gutierrez-Arriola, M. A. López, Juan Manuel Pardo

Objective methods for evaluating synthetic intonation
Robert A.J. Clark, Kurt E. Dusterhoff

Using decision trees within the tilt intonation model to predict F0 contours
Kurt E. Dusterhoff, Alan W. Black, Paul Taylor

Levels of prosodic representation in spoken discourse: an empirical approach
Richard Esposito, Li-chiung Yang

Segmental duration modelling in a text-to-speech system for the Galician language
Xavier Fernández-Salgado, Eduardo R. Banga

The symbolic coding of segmental duration and tonal alignment: an extension to the INTSINT system
Daniel Hirst

Training an application-dependent prosodic model: corpus, model and evaluation
Yann Morlec, Gérard Bailly, Véronique Aubergé

Farsi language prosodic structure, research and implementation using a speech synthesizer
H. Sheikhzadeh, A. Eshkevari, M. Khayatian, R. Sadigh, S. M. Ahadi

Acoustical characterisation of the accented syllable in Portuguese, a contribution to the naturalness of speech synthesis
Joao Paulo Teixeira, Elisabete Rosa Paulo, Diamantino Freitas, Maria da Graca Pinto

Analysis and synthesis of the four tones in connected speech of the standard Chinese based on a command-response model
Changfu Wang, Hiroya Fujisaki, Sumio Ohno, Tomohiro Kodama

A profile of the discourse and intonational structures of route descriptions
Sandra Williams, Catherine I. Watson





Speech Generation And Synthesis - Prosody


Predicting gradient F0 variation: pitch range and accent prominence
Ivan Bulyko, Mari Ostendorf

CART-based duration modeling using a novel method of extracting prosodic features
Paul Deans, Andrew Breen, Peter Jackson

A primary study on the randomness control of the prosodic boundary index for natural synthetic speech
Ki-Wan Eom, Jin-Young Kim, Sun-Mi Kim

On a hybrid time domain-LPC technique for prosody superimposing used for speech synthesis
Attila Ferencz, István Nagy, Tünde-Csilla Kovács, Teodora Ratiu, Maria Ferencz

Multilingual prosody modelling using cascades of regression trees and neural networks
J. W. A. Fackrell, H. Vereecken, J.-P. Martens, Bert Van Coile

An efficient speaker adaptation method for TTS duration model
Wentao Gu, Chilin Shih, Jan P.H. van Santen

Child-directed speech synthesis: evaluation of prosodic variation for an educational computer program
David House, Linda Bell, Kjell Gustafson, Linn Johansson

Representation and processing of linguistic structures for an all-prosodic synthesis system using XML
Mark Huckvale

A study on a pitch alteration by using the formant and phase compensation technique
Won Park, Hyung-Bin Park, Myung-Jin Bae

Micro-prosodic control in Cantonese text-to-speech synthesis
Tan Lee, Helen M. Meng, Wai H. Lau, W. K. Lo, P. C. Ching

Exploring the naturalness of several German high-quality-text-to-speech systems
Hansjörg Mixdorff, Dieter Mehnert

Detecting accent sandhi in Japanese using a superpositional F0 model
A. Sakurai, Hiromichi Kawanami, Keikichi Hirose

Focus detection by comparison of speech waveforms
Satoshi Kitagawa, Nick Campbell

An advanced intonation model for synthesis
Mark Tatham, Eric Lewis, Katherine Morton

A new F0 modification algorithm by manipulating harmonics of magnitude spectrum
Satoshi Takano, Masanobu Abe

A mixed strategy approach to Spanish prosody
Juan Manuel Villar Navarro, Eduardo López Gonzalo, José Relaño Gil






Speech Understanding - Miscellaneous Topics


Linguistic phrase spotting in a simple application spoken dialogue system
Manuela Boros, Paul Heisterkamp

Learning of domain dependent knowledge in semantic networks
F. Deinzer, J. Fischer, U. Ahlrichs, Elmar Nöth

Combining words and prosody for information extraction from speech
Dilek Hakkani-Tür, Gökhan Tür, Andreas Stolcke, Elizabeth Shriberg

Error correction translation using text corpora
Kai Ishikawa, Eiichiro Sumita

Efficient sentence disambiguation by preferred constituent order
S. Kronenberg, K. Skuplik

Identifying linguistic segmentations in Chinese spoken dialogue
Yue-Shi Lee, Hsin-Hsi Chen

Error recovery for robust language understanding in spoken dialogue systems
Tung-Hui Chiang, Yi-Chung Lin

A monolingual semantic decoder based on word sense disambiguation for mixed language understanding
Xiaohu Liu, Pascale Fung, Chi Shun Cheung

To believe is to understand
Helen M. Meng, Wai Lam, Carmen Wai

A hybrid approach to spoken dialogue understanding: prosody, statistics and partial parsing
Elmar Nöth, Jürgen Haas, Volker Warnke, Florian Gallwitz, Manuela Boros

Portable speech interpreter which has voice input and sophisticated correction functions
Yasunari Obuchi, Atsuko Koizumi, Yoshinori Kitahara, Jun'ichi Matsuda, Toshihisa Tsukada

Categorical understanding using statistical ngram models
Alexandros Potamianos, Giuseppe Riccardi, Shrikanth Narayanan

Detection and correction of speech repairs in word lattices
Jörg Spilker, Hans Weber, Günther Görz

Connectionist language models for speech understanding: the problem of word order variation
Igor Schadle, Jean-Yves Antoine, Daniel Memmi

Semi-automatic acquisition of domain-specific semantic structures
Kai-Chung Siu, Helen M. Meng

Transformation into language processing units by dividing and connecting utterance units
Toshiyuki Takezawa

Learning a lightweight robust deterministic parser
Aboy Wong, Dekai Wu

An information-based method for selecting feature types for word prediction
Dekai Wu, Zhifang Sui, Jun Zhao

A robust parser for spoken language understanding
Ye-Yi Wang


Speech Generation And Synthesis - Systems, Linguistic Processing


Aiuruete: a high-quality concatenative text-to-speech system for Brazilian Portuguese with demisyllabic analysis-based units and a hierarchical model of rhythm production
Plínio A. Barbosa, Fábio Violaro, Eleonora C. Albano, Flávio Simoes, Patrícia Aquino, Sandra Madureira, Edson Francozo

A parser-based text preprocessor for Romanian language TTS synthesis
Dragos Burileanu, Claudius Dan, Mihai Sima, Corneliu Burileanu

Nparse - a shallow n-gram-based grammatical-phrase parser
Alice Carlberger

A language-independent probabilistic model for automatic conversion between graphemic and phonemic transcription of words
Evangelos Dermatas, George Kokkinakis

Acquisition of an extensive rule set for Slovene grapheme-to-allophone transcription
Jerneja Gros, F. Mihelic

Voice conversion between UK and US accented English
Ching-Hsiang Ho, Saeed Vaseghi, Aimin Chen

Development of speech design tool "SESIGN99" to enhance synthesized speech
Hideyuki Mizuno, Masanobu Abe, Shin'ya Nakajima

Automation of the training procedures for neural networks performing multi-lingual grapheme to phoneme conversion
Horst-Udo Hain

Parsing Hungarian sentences in order to determine their prosodic structures in a multilingual TTS system
Ilona Koutny

Text-to-speech synthesis of Estonian
Meelis Mihkla, Arvo Eek, Einar Meister

Development of an emotional speech synthesiser in Spanish
J. M. Montero, J. Gutiérrez-Arriola, J. Colás, J. Macías-Guarasa, E. Enríquez, Juan Manuel Pardo

S5: the SQEL Slovene speech synthesis system
N. Pavesic, Jerneja Gros

A multilingual text processing engine for the PAPAGENO text-to-speech synthesis system
Matej Rojc, Janez Stergar, Ralph Wilhelm, Horst-Udo Hain, Martin Holzapfel, Bogomir Horvat

Toshiba English text-to-speech synthesizer (TESS)
Chang K. Suh, Takehiko Kagoshima, Masahiro Morita, Shigenobu Seto, Masami Akamine

Towards the generation of French phonetic inflected forms
Frédérique Sannier, Véronique Aubergé

Canadian French text-to-speech synthesis: modeling an optimal set of realizations for dialect markers
Evelyne Tzoukermann, Lucie Ménard, Marise Ouellet

Machine learning of word pronunciation: the case against abstraction
Bertjan Busser, Walter Daelemans, Antal van den Bosch


Speech & The Internet


A public domain speech-to-text system
M. Ordowski, N. Deshmukh, A. Ganapathiraju, J. Hamaker, Joseph Picone

Human speech production - an internet-based interactive multimodal tutorial
Klaus Fellbaum, Joerg Richter

Terminology principles and support for spoken language system development
Dafydd Gibbon, Silke Kölsch, Inge Mertins, Michaela Schulte, Thorsten Trippel

New WWW browser for visually impaired people using interactive voice technology
Yasuo Horiuchi, Atsushi Fujiwara, Akira Ichikawa

Text to speech control protocol
Jirí Hanika, Petr Horák

Multilinguality and human language technology courseware
Bojan Petek

Multimodal information seeking dialogues on the world wide web
José Rouillard, Jean Caelen

Compression of acoustic features - are perceptual quality and recognition performance incompatible goals?
Roger Tucker, Tony Robinson, James Christie

A network architecture for building applications that use speech recognition and/or synthesis
Dominique Vaufreydaz, José Rouillard, Mohammad Akbar

Criteria for evaluating internet tutorials in speech communication sciences
Chris Bowerman, Anders Eriksson, Mark Huckvale, Mike Rosner, Mark Tatham, Maria Wolters

Javaspeechlab - interactive speech analysis laboratory on the world-wide web
Andrzej Drygajlo, Guy Delafontaine

Reviving discrete HMMs: the myth about the superiority of continuous HMMs
Vassilis Digalakis, Stavros Tsakalidis, Leonardo Neumeyer

Principles and design of an intelligent system for information retrieval over the internet with a multimodal dialogue interface
Hiroya Fujisaki, Hiroyuki Kameda, Sumio Ohno, Kenji Abe, Michio Iijima, Masayoshi Suzuki, Kazunari Taketa

An asynchronous virtual meeting system for bi-directional speech dialog
Takuya Nishimoto, Hidehiro Yuki, Takehiko Kawahara, Yasuhisa Niimi





Corpora


The acquisition of a speech corpus for limited domain translation
Demetrio Aiello, Loredana Cerrato, Cristina Delogu, Andrea Di Carlo

Tagging spoken corpus
Yue-Shi Lee, Hsin-Hsi Chen

A Hungarian child database for speech processing applications
F. Csatári, Zs. Bakcsi, Klára Vicsi

A generic lexicon tool for word model definition in multimodal applications
Julie Carson-Berndsen

Compiling multi-tiered speech databases into the relational model: experiments with the emu system
Steve Cassidy

Two Swedish SpeechDat databases - some experiences and results
Kjell Elenius

A multimodal database of gestures and speech
Satoru Hayamizu, Shigeki Nagaya, Keiko Watanuki, Masayuki Nakazawa, Shuichi Nobe, Takashi Yoshimura

Japanese spontaneous speech database with wide regional and age distribution
Tomoko Matsui, Masaki Naito, Harald Singer, Atsushi Nakamura, Yoshinori Sagisaka

Data collection in real acoustical environments for sound scene understanding and hands-free speech recognition
Satoshi Nakamura, Kazuo Hiyane, Futoshi Asano, Takeshi Yamada, Takashi Endo

Automatic labeling of Japanese prosody using J-ToBI style description
Hiroaki Noguchi, Kazuhisa Kiriyama, Hiroshi Matsuda, Miki Taniguchi, Yasuharu Den, Yasuhiro Katagiri

Czech language database of car speech and environmental noise
Petr Pollák, Josef Vopička, Pavel Sovka

Language model selection based on the analysis of Japanese spontaneous speech on travel arrangement task
Akira Kurematsu, Atsushi Sukenori

New resources at BAS: acoustic, multimodal, linguistic
Florian Schiel, Christoph Draxler, Phil Hoole, Hans G. Tillmann

Building speech databases for cellular networks
Eric Sanders, Henk van den Heuvel, Khalid Choukri

The SpeechDat-Car multilingual speech databases for in-car applications: some first validation results
Henk van den Heuvel, Jerôme Boudy, Robrecht Comeyne, Stephan Euler, Asuncion Moreno, Gael Richard

A Welsh speech database: preliminary results
Briony Williams

New developments within the European Language Resources Association (ELRA)
Khalid Choukri, Valérie Mapelli, Jeff Allen

Data collection and processing in the Carnegie Mellon Communicator
Maxine Eskenazi, Alexander I. Rudnicky, Karin Gregory, Paul Constantinides, Robert Brennan, Christina Bennett, Jwan Allen

SpeechDat multilingual speech databases for teleservices: across the finish line
Harald Höge, Christoph Draxler, Henk van den Heuvel, Finn Tore Johansen, Eric Sanders, Herbert S. Tropf

Enhancing reusability of speech corpora by hyperlinked query output
Andreas Mengel, Ulrich Heid

Design and collection of a corpus of polyphones and prosodic contexts for speech synthesis research and development
Kim Silverman, Victoria Anderson, Jerome Bellegarda, Kevin Lenzo, Devang Naik


Speech Generation And Synthesis - Acoustic Synthesis And Units


Sinusoidal representation and auditory model-based parametric matching and smoothing and its application in speech analysis/synthesis
Oscar C. Au, Wanggen Wan, Cyan L. Keung, Chi H. Yim

Choose the best to modify the least: a new generation concatenative synthesis system
Marcello Balestri, Alberto Pacchiotti, Silvia Quazza, Pier Luigi Salza, Stefano Sandri

Selection of waveform units for corpus-based Mandarin speech synthesis based on decision trees and prosodic modification costs
Fu-chiang Chou, Chiu-yu Tseng, Lin-shan Lee

Improving quality in a speech synthesizer based on the MBROLA algorithm
B. Etxebarria, I. Hernáez, I. Madariaga, E. Navas, J. C. Rodríguez, R. Gándara

A novel model TD-PSPTP for speech synthesis
Yan Huang, Bo Xu

Detection of non-stationarity in speech signals and its application to time-scaling
David Kapilow, Yannis Stylianou, Juergen Schroeter

A v-CV waveform based speech synthesis using global minimization of pitch conversion and concatenation distortion in v-CV unit sequence
Takao Koyama, Jun-ichi Takahashi

Stable speech synthesis using recurrent radial basis functions
Iain Mann, Steve McLaughlin

Efficient weight training for selection based synthesis
Yoram Meron, Keikichi Hirose

Speech synthesis using HMM-based acoustic unit inventory
Jindrich Matousek

An enhanced ABS/OLA sinusoidal model for waveform synthesis in TTS
Michael W. Macon, Mark A. Clements

High vowel /i y u/ in Canadian and continental French: an analysis for a TTS system
Marise Ouellet, Evelyne Tzoukermann, Lucie Ménard

Speech production based on the mel-frequency cepstral coefficients
Zbyněk Tychtl, Josef Psutka

Exploiting improved parameter smoothing within a hybrid concatenative/LPC speech synthesizer
Erhard Rank

Synchronization of speech frames based on phase data with application to concatenative speech synthesis
Yannis Stylianou

Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
Takayoshi Yoshimura, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura


Speech And Noise 1


A CASA-labelling model using the localisation cue for robust cocktail-party speech recognition
Hervé Glotin, Frédéric Berthommier, Emmanuel Tessier

Noise-invariant representation for speech signals
Aruna Bayya, B. Yegnanarayana

Natural-quality background noise coding using residual substitution
Khaled El-Maleh, Peter Kabal

Microphone array design for robust speech acquisition and recognition
Julian Fernández, Eduardo Lleida, Enrique Masgrau

Study of the influence of noise pre-processing on the performance of a low bit rate parametric speech coder
Gwénaél Guilmin, Régine Le Bouquin-Jeannès, Philippe Gournay

MLP network for enhancement of noisy MFCC vectors
Hemmo Haverinen, Petri Salmela, Juha Häkkinen, Mikko Lehtokangas, Jukka Saarinen

Hands-free voice activation in noisy car environment
J. Iso-Sipilä, K. Laurila, Ramalingam Hariharan, Olli Viikki

A wavelet denoising technique to improve endpoint detection in adverse conditions
Lamia Karray, Emmanuel Polard

Speech enhancement for linear-predictive-analysis-by-synthesis coders
Marcin Kuropatwinski, Dieter Leckschat, Kristian Kroschel, Andrzej Czyzewski, Chaz Hales

Robust HMM to variation of noisy environments based on variance extension of noise models
Hiroshi Matsumoto, Hiroaki Ubukata

The fourth-order cumulant of speech signals with application to voice activity detection
Elias Nemer, Rafik Goubran, Samy Mahmoud

The dependence of feature vectors under adverse noise
Woei-Chyang Shieh, Sen-Chia Chang

Speech detection and SNR prediction basing on amplitude modulation pattern recognition
Jürgen Tchorz, Birger Kollmeier

Fast active noise control for robust speech acquisition
Luis Vicente, Stephen J. Elliott, Enrique Masgrau

Missing data theory, spectral subtraction and signal-to-noise estimation for robust ASR: an integrated study
Ascension Vizinho, Phil Green, M. Cooke, Ljubomir Josifovski

Single channel speech enhancement using principal component analysis and MDL subspace selection
Rolf Vetter, Nathalie Virag, Philippe Renevey, Jean-Marc Vesin




Speech Recognition - Adaptation


Prosodic effects on segmental durations in Greek
Antonis Botinis, Marios Fourakis, Irini Prinou

Within-utterance correlation for speech recognition
Mats Blomberg

Techniques for robust speech recognition in the car environment
Philippe Gelin, Jean-Claude Junqua

An on-line acoustic compensation technique for robust speech recognition
Diego Giuliani

Using adaptive signal limiter together with noise-robust techniques for noisy speech recognition
Wei-Wen Hung, Hsiao-Chuan Wang

A robust environment-effects suppression training algorithm for adverse Mandarin speech recognition
Wei-Tyng Hong, Sin-Horng Chen

Robust speaker adaptation of continuous density HMMs using multilayer perceptron network
Mikko Harju, Petri Salmela, Olli Viikki, Mikko Lehtokangas, Jukka Saarinen

Regression class selection and speaker adaptation with MLLR in Mandarin continuous speech recognition
Chengrong Li, Jingdong Chen, Bo Xu

Regression transformation of prior means for speaker adaptation
Guoqiang Li, Limin Du, Ziqiang Hou

Linguistic tree based maximum likelihood model interpolation
Liu Feng, Chi-wei Che, Peng Yu, Zuoying Wang

Model-based speaker normalization methods for speech recognition
Masaki Naito, Li Deng, Yoshinori Sagisaka

Maximum likelihood eigenspace and MLLR for speech recognition in noisy environments
Patrick Nguyen, Christian Wellekens, Jean-Claude Junqua

A study of speaker adaptation for speaker independent speech recognition method using phoneme similarity vector
Yoshio Ono, Maki Yamada, Masakatsu Hoshimi

An investigation into vocal tract length normalisation
L. F. Uebel, P. C. Woodland

Adaptation to environment and speaker using maximum likelihood neural networks
Zong Suk Yuk, James Flanagan, Mahesh Krishnamoorthy, Krishna Dayanidhi

Corrective training for speaker adaptation
Xiuyang Yu, Wayne Ward

A robust speaker-independent CPU-based ASR system
R. Obradovic, D. Pekar, S. Krco, V. Delic, V. Senk


Enhancements, Echo Cancellation, And Quality Measures


Delay estimation for transform domain acoustical echo cancellation
Rabih Abouchakra, Peter Kabal

Noise reduction using perceptual spectral change
C. Beaugeant, Pascal Scalart

Intelligibility improvements using diverse sub-band processing applied to noisy speech
Amir Hussain, Douglas R. Campbell

Recognizing simultaneous speech: a genetic algorithm approach
Athanasios Koutras, Evangelos Dermatas, George Kokkinakis

Speech enhancement system for hands-free telephone based on the psychoacoustically motivated filter bank with allpass frequency transformation
Krzysztof Bielawski, Alexander A. Petrovsky

Speech enhancement using a multi-microphone sub-band adaptive Griffiths-Jim noise canceller
P. W. Shields, Douglas R. Campbell

Qualiphone-a: a perceptual speech quality evaluation system for analog mobile networks
M. Szarvas, T. Fegyó, P. Tatai, Géza Gordos

Speech enhancement using nonlinear microphone array under nonstationary noise conditions
Hiroshi Saruwatari, Shoji Kajita, Kazuya Takeda, Fumitada Itakura

Auditory masking threshold estimation for broadband noise sources with application to speech enhancement
Ruhi Sarikaya, John H. L. Hansen

Segregation of vowel in background noise using the model of segregating two acoustic sources based on auditory scene analysis
Masashi Unoki, Masato Akagi

Analysis and on-line detection of audible distortions in GSM telephony
Christophe Veaux, Pascal Scalart, André Gilloire

A parameter-based 2-talker detection apparatus for echo cancellation
Wen Rong Ru, Shih-Chen Lin, Po-Cheng Chen, Chun-Hung Kuo

Co-channel speech separation in the presence of correlated and uncorrelated noises
Kuan-Chieh Yen, Jun Huang, Yunxin Zhao


Speech And Noise 2


Speech enhancement using a mixture-maximum model
David Burshtein, Sharon Gannot

Concurrent speakers separation through binaural processing of stereo recordings
Joaquin Gonzalez-Rodriguez, Santiago Cruz-Llanas, Javier Ortega-Garcia

Spectral subtraction with adaptive averaging of the gain function
Harald Gustafsson, Sven Nordholm, Ingvar Claesson

A reliability criterion for time-frequency labeling based on periodicity in an auditory scene
François Gaillard, Frédéric Berthommier, Gang Feng, Jean-Luc Schwartz

Broadband noise cancellation systems: new approach to working performance optimization
Serguei Koval, Mikhail Stolbov, Mikhail Khitrov

Noise subtraction with parametric recursive gain curves
Klaus Linhard, Tim Haulick

Performance comparison of several adaptive schemes for microphone array beamforming
Enrique Masgrau, Luis Aguilar, Eduardo Lleida

An objective distortion estimator for hearing aids and its application to noise reduction
Mitsunori Mizumachi, Masato Akagi

Speech enhancement using fourth-order cumulants and time-domain optimal filters
Elias Nemer, Rafik Goubran, Samy Mahmoud

Missing feature theory and probabilistic estimation of clean speech components for robust speech recognition
Philippe Renevey, Andrzej Drygajlo

Distortion effects of several cumulant-based Wiener filtering algorithms
Josep M. Salavedra, Xavier Bou

Combined noise suppression system for monaural cochlear implants
Milan Svoboda, Pavel Sovka, Petr Pollák

Objective prediction of speech intelligibility at high ambient noise levels using the speech transmission index
Sander J. van Wijngaarden, Herman J. M. Steeneken

Noise-regularized adaptive filtering for speech enhancement
Eric A. Wan, Rudolph van der Merwe

Speech enhancement using Karhunen-Loève transformation and Wiener filtering in critical bands
F. Zarubin, A. Kovtonyuk, K. Zadiraka






Speech And Noise 3


A robust isolated word recognizer for highly non-stationary environments. recognition results
A. Álvarez, R. Martínez, P. Gómez, V. Nieto, M. M. Pérez

Sequential bias compensation for robust speech recognition
Mohamed Afify

Use of simulated data for robust telephone speech recognition
Tarcisio Coianiz, Daniele Falavigna, Roberto Gretter, Marco Orlandi

On the use of time alignments for noisy speech recognition
Y. Hauptman, Y. Bistritz

Improved feature vector normalization for noise robust connected speech recognition
Juha Häkkinen, J. Suontausta, Ramalingam Hariharan, M. Vasilache, K. Laurila

State based imputation of missing data for robust speech recognition and speech enhancement
Ljubomir Josifovski, Martin Cooke, Phil Green, Ascension Vizinho

A comparison of two strategies for ASR in additive noise: missing data and spectral subtraction
Christopher Kermorvant, Andrew Morris

A comparison of techniques for tone compensation in payphone-based speech recognition
Ben Milner, Mark Farrell

Front-end improvements to reduce stationary & variable channel and noise distortions in continuous speech recognition tasks
Xavier Menéndez-Pidal, Ruxin Chen, Duanpei Wu, Mick Tanaka

Speech recognition in noisy reverberant rooms using a frequency domain blind deconvolution method
G. Nokas, E. Dermatas

Optimization of a speech recognizer for aircraft environments
Volker Schless, Fritz Class, Peter Sandl

Temporal constraints in Viterbi alignment for speech recognition in noise
Nestor Becerra Yoma, Lee Luan Ling, Sandra Dotto Stump

HMM composition of segmental unit input HMM for noisy speech recognition
Kazumasa Yamamoto, Seiichi Nakagawa

Robust connected word speech recognition using weighted Viterbi algorithm and context-dependent temporal constraints
Nestor Becerra Yoma, Lee Luan Ling, Sandra Dotto Stump

Liftered forward masking procedure for robust digits recognition
Kaisheng Yao, Bertram Shi, Pascale Fung, Zhigang Cao

Channel identification and spectrum estimation for robust automatic speech recognition
Yunxin Zhao


