The Speaker and Language Recognition Workshop

Les Sables d'Olonne, France
26-29 June 2018

Chairs: Anthony Larcher and Jean-François Bonastre
doi: 10.21437/Odyssey.2018

Language Recognition

The LEAP Language Recognition System for LRE 2017 Challenge - Improvements and Error Analysis
Bharat Padi, Shreyas Ramoji, Vaishnavi Yeruva, Satish Kumar, Sriram Ganapathy

Analysis of DNN-based Embeddings for Language Recognition on the NIST LRE 2017
Alicia Lozano-Diez, Oldrich Plchot, Pavel Matejka, Ondrej Novotny, Joaquin Gonzalez-Rodriguez

Analysis of BUT-PT Submission for NIST LRE 2017
Oldřich Plchot, Pavel Matějka, Ondřej Novotný, Sandro Cumani, Alicia Lozano-Diez, Josef Slavíček, Mireia Diez, František Grézl, Ondřej Glembek, Mounika Kamsali, Anna Silnova, Lukáš Burget, Lucas Ondel, Santosh Kesiraju, Johan Rohdin

The MIT Lincoln Laboratory / JHU / EPITA-LSE LRE17 System
Fred Richardson, Pedro Torres-Carrasquillo, Jonas Borgstrom, Douglas Sturim, Youngjune Gwon, Jesus Villalba, Jan Trmal, Nanxin Chen, Reda Dehak, Najim Dehak

Staircase Network: structural language identification via hierarchical attentive units
Trung Ngo Trong, Ville Hautamaki, Kristiina Jokinen

Language Recognition for Telephone and Video Speech: The JHU HLTCOE Submission for NIST LRE17
Alan McCree, David Snyder, Greg Sell, Daniel Garcia-Romero

Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System
Weicheng Cai, Jinkun Chen, Ming Li

The 2017 NIST Language Recognition Evaluation
Seyed Omid Sadjadi, Timothee Kheyrkhah, Audrey Tong, Craig Greenberg, Douglas Reynolds, Elliot Singer, Lisa Mason, Jaime Hernandez-Cordero

Approaches to Multi-domain Language Recognition
Mitchell McLaren, Mahesh Kumar Nandwana, Diego Castán, Luciana Ferrer

Convolutional Neural Network and Language Embeddings for End-to-End Dialect Recognition
Suwon Shon, Ahmed Ali, James Glass

Spoken Language Recognition using X-vectors
David Snyder, Daniel Garcia-Romero, Alan McCree, Gregory Sell, Daniel Povey, Sanjeev Khudanpur

End-to-End versus Embedding Neural Networks for Language Recognition in Mismatched Conditions
Jesus Antonio Villalba Lopez, Niko Brummer, Najim Dehak

Voice conversion and spoofing

The NU Non-Parallel Voice Conversion System for the Voice Conversion Challenge 2018
Yichiao Wu, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda

NU Voice Conversion System for the Voice Conversion Challenge 2018
Patrick Lumban Tobing, Yichiao Wu, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda

Average Modeling Approach to Voice Conversion with Non-Parallel Data
Xiaohai Tian, Junchao Wang, Haihua Xu, Eng-Siong Chng, Haizhou Li

Voice liveness detection using phoneme-based pop-noise detector for speaker verification
Shihono Mochizuki, Sayaka Shiota, Hitoshi Kiya

Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama’s voice using GAN, WaveNet and low-quality found data
Jaime Lorenzo-Trueba, Fuming Fang, Xin Wang, Isao Echizen, Junichi Yamagishi, Tomi Kinnunen

The HCCL-CUHK System for the Voice Conversion Challenge 2018
Songxiang Liu, Lifa Sun, Xixin Wu, Xunying Liu, Helen Meng

Convolutional Neural Network Based Speaker De-Identification
Fahimeh Bahmaninezhad, Chunlei Zhang, John Hansen

Bidirectional Voice Conversion Based on Joint Training Using Gaussian-Gaussian Deep Relational Model
Kentaro Sone, Shinji Takaki, Toru Nakashika

Phonetically Aware Exemplar-Based Prosody Transformation
Berrak Sisman, Grandee Lee, Haizhou Li

A Regression Model of Recurrent Deep Neural Networks for Noise Robust Estimation of the Fundamental Frequency Contour of Speech
Akihiro Kato, Tomi Kinnunen

BUT/Phonexia Bottleneck Feature Extractor
Anna Silnova, Pavel Matejka, Ondrej Glembek, Oldrich Plchot, Ondrej Novotny, Frantisek Grezl, Petr Schwarz, Lukas Burget, Jan Cernocky

Keynote: Els Kindt

Speaker Recognition I

Language Recognition

Speaker diarization

Noise Robustness

Keynote: Simon King

Voice conversion

Voice conversion and spoofing


Keynote: Pascal Belin

Speaker recognition II

Text-dependent speaker recognition