doi: 10.21437/Odyssey.2022
Magnitude-Aware Probabilistic Speaker Embeddings
Nikita Kuzmin, Igor Fedorov, Alexey Sholokhov
Analyzing Speaker Verification Embedding Extractors and Back-Ends Under Language and Channel Mismatch
Anna Silnova, Themos Stafylakis, Ladislav Mošner, Oldřich Plchot, Johan Rohdin, Pavel Matějka, Lukáš Burget, Ondřej Glembek, Niko Brummer
Progressive Contrastive Learning for Self-Supervised Text-Independent Speaker Verification
Junyi Peng, Chunlei Zhang, Jan "Honza" Černocký, Dong Yu
Impostor Score Statistics as Quality Measures for the Calibration of Speaker Verification Systems
Sandro Cumani, Salvatore Sarni
Hybrid Neural Network-Based Deep Embedding Extractors for Text-Independent Speaker Verification
Jahangir Alam, Woo Hyun Kang, Abderrahim Fathan
Learning Noise Robust ResNet-Based Speaker Embedding for Speaker Recognition
Mohammad MohammadAmini, Driss Matrouf, Jean-François Bonastre, Sandipana Dowerah, Romain Serizel, Denis Jouvet
Teager Energy Based-Detection of One-point and Two-point Replay Attacks: Towards Cross-Database Generalization
Anand Therattil, Priyanka Gupta, Piyushkumar K. Chodingala, Hemant A. Patil
Investigation on Mixup Strategies for End-to-End Voice Spoof Detection System
Woo Hyun Kang, Jahangir Alam, Abderrahim Fathan
Speaker-Targeted Synthetic Speech Detection
Diego Castan, Md Hafizur Rahman, Sarah Bakst, Chris Cobo-Kroenke, Mitchell McLaren, Martin Graciarena, Aaron Lawson
Explainable Deepfake and Spoofing Detection: An Attack Analysis Using SHapley Additive exPlanations
Wanying Ge, Massimiliano Todisco, Nicholas Evans
A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification
You Zhang, Ge Zhu, Zhiyao Duan
Spoofing-Aware Speaker Verification with Unsupervised Domain Adaptation
Xuechen Liu, Md Sahidullah, Tomi Kinnunen
Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion
Haibin Wu, Jiawen Kang, Lingwei Meng, Yang Zhang, Xixin Wu, Zhiyong Wu, Hung-yi Lee, Helen Meng
Investigating Self-Supervised Front Ends for Speech Spoofing Countermeasures
Xin Wang, Junichi Yamagishi
A Novel Feature Based on Graph Signal Processing for Detection of Physical Access Attacks
Longting Xu, Mianxin Tian, Xing Guo, Zhiyong Shan, Jie Jia, Yiyuan Peng, Jichen Yang, Rohan Kumar Das
Automatic Speaker Verification Spoofing and Deepfake Detection Using Wav2vec 2.0 and Data Augmentation
Hemlata Tak, Massimiliano Todisco, Xin Wang, Jee-weon Jung, Junichi Yamagishi, Nicholas Evans
A Multi-Resolution Front-End for End-to-End Speech Anti-Spoofing
Wei Liu, Meng Sun, Xiongwei Zhang, Hugo Van hamme, Thomas Fang Zheng
Robust Cross-SubBand Countermeasure Against Replay Attacks
Jingze Lu, Yuxiang Zhang, Wenchao Wang, Pengyuan Zhang
Improving the Naturalness of Simulated Conversations for End-to-End Neural Diarization
Natsuo Yamashita, Shota Horiguchi, Takeshi Homma
Collar-Aware Training for Streaming Speaker Change Detection in Broadcast Speech
Joonas Kalda, Tanel Alumäe
BIT Submission for the Conversational Speaker Diarization Challenge
Chenguang Hu, Qingran Zhan, Miao Liu, Xiang Xie
DP-Means: An Efficient Bayesian Nonparametric Model for Speaker Diarization
Yijun Gong, Xiao-Lei Zhang
Low-Latency Online Speaker Diarization with Graph-Based Label Generation
Yucong Zhang, Qinjian Lin, Weiqing Wang, Lin Yang, Xuyang Wang, Junjie Wang, Ming Li
A Quick and Effective Speaker Diarization System
Zuoer Chen, Liang He
Domain Generalized Speaker Embedding Learning via Mutual Information Minimization
Woo Hyun Kang, Jahangir Alam, Abderrahim Fathan
Baselines and Protocols for Household Speaker Recognition
Alexey Sholokhov, Xuechen Liu, Md Sahidullah, Tomi Kinnunen
Speaker Recognition on Mono-Channel Telephony Recordings
Yosef Solewicz, Noa Cohen, Johan Rohdin, Srikanth Madikeri, Jan "Honza" Černocký
Parameter-Free Attentive Scoring for Speaker Verification
Jason Pelecanos, Quan Wang, Yiling Huang, Ignacio Lopez Moreno
Time-Varying Score Reliability Prediction in Speaker Identification
Sarah Bakst, Chris Cobo-Kroenke, Aaron Lawson, Mitchell McLaren, Allen Stauffer
Advances in Cross-Lingual and Cross-Source Audio-Visual Speaker Recognition: The JHU-MIT System for NIST SRE21
Jesús Villalba, Bengt J. Borgstrom, Saurabh Kataria, Magdalena Rybicka, Carlos D. Castillo, Jaejin Cho, L. Paola García-Perera, Pedro A. Torres-Carrasquillo, Najim Dehak
Few-Shot Speaker Identification Using Depthwise Separable Convolutional Network with Channel Attention
Yanxiong Li, Wucheng Wang, Hao Chen, Wenchang Cao, Wei Li, Qianhua He
Deep Representation Decomposition for Rate-Invariant Speaker Verification
Fuchuan Tong, Siqi Zheng, Haodong Zhou, Xingjia Xie, Qingyang Hong, Lin Li
A Study of Multimodal Person Verification Using Audio-Visual-Thermal Data
Madina Abdrakhmanova, Saniya Abushakimova, Yerbolat Khassanov, Huseyin Atakan Varol
Pretraining Approaches for Spoken Language Recognition: TalTech Submission to the OLR 2021 Challenge
Tanel Alumäe, Kunnar Kukk
Enhancing Language Identification Using Dual-Mode Model with Knowledge Distillation
Hexin Liu, Leibny Paola Garcia Perera, Andy W. H. Khong, Justin Dauwels, Suzy J. Styles, Sanjeev Khudanpur
Attentive Temporal Pooling for Conformer-Based Streaming Language Identification in Long-Form Speech
Quan Wang, Yang Yu, Jason Pelecanos, Yiling Huang, Ignacio Lopez Moreno
BreizhCorpus: A Large Breton Language Speech Corpus and Its Use for Text-to-Speech Synthesis
David Guennec, Hassan Hajipoor, Gwénolé Lecorvé, Pascal Lintanf, Damien Lolive, Antoine Perquin, Gaëlle Vidal
Cycleflow: Purify Information Factors by Cycle Loss
Haoran Sun, Chen Chen, Lantian Li, Dong Wang
Language-Independent Speaker Anonymization Approach Using Self-Supervised Pre-Trained Models
Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi, Natalia Tomashenko
Robustness of Signal Processing-Based Pseudonymization Method Against Decryption Attack
Hiroto Kai, Shinnosuke Takamichi, Sayaka Shiota, Hitoshi Kiya
Closing the Gap Between Single-User and Multi-User VoiceFilter-Lite
Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He, Ian McGraw
Single-Channel Target Speaker Separation Using Joint Training with Target Speaker's Pitch Information
Jincheng He, Yuanyuan Bao, Na Xu, Hongfeng Li, Shicong Li, Linzhang Wang, Fei Xiang, Ming Li
C-P Map: A Novel Evaluation Toolkit for Speaker Verification
Lantian Li, Di Wang, Wenqiang Du, Dong Wang
The NIST CTS Speaker Recognition Challenge
Seyed Omid Sadjadi, Craig Greenberg, Elliot Singer, Lisa Mason, Douglas Reynolds
The 2021 NIST Speaker Recognition Evaluation
Seyed Omid Sadjadi, Craig Greenberg, Elliot Singer, Lisa Mason, Douglas Reynolds
Baseline Systems for the First Spoofing-Aware Speaker Verification Challenge: Score and Embedding Fusion
Hye-jin Shim, Hemlata Tak, Xuechen Liu, Hee-Soo Heo, Jee-weon Jung, Joon Son Chung, Soo-Whan Chung, Ha-Jin Yu, Bong-Jin Lee, Massimiliano Todisco, Héctor Delgado, Kong Aik Lee, Md Sahidullah, Tomi Kinnunen, Nicholas Evans
Advances in Speaker Recognition for Multilingual Conversational Telephone Speech: The JHU-MIT System for NIST SRE20 CTS Challenge
Jesús Villalba, Bengt J. Borgstrom, Saurabh Kataria, Jaejin Cho, Pedro A. Torres-Carrasquillo, Najim Dehak
Development of ABC Systems for the 2021 Edition of NIST Speaker Recognition Evaluation
Jahangir Alam, Radek Beneš, Marián Beszédeš, Lukáš Burget, Mohamed Dahmane, Abderrahim Fathan, Hamed Ghodrati, Ondřej Glembek, Woo Hyun Kang, Pavel Matějka, Ladislav Mošner, Oldřich Plchot, Johan Rohdin, Anna Silnova, Themos Stafylakis
STC Speaker Recognition System for the NIST SRE 2021
Galina Lavrentyeva, Sergey Novoselov, Vladimir Volokhov, Anastasia Avdeeva, Aleksei Gusev, Alisa Vinogradova, Igor Korsunov, Alexander Kozlov, Timur Pekhovsky, Andrey Shulipa, Evgeny Smirnov, Vasily Galyuk
The Volkswagen-Mobvoi System for CN-Celeb Speaker Recognition Challenge 2022
YingWei Tan, XueFeng Ding
Cross-Scene Speaker Verification Based on Dynamic Convolution for the CNSRC 2022 Challenge
Jialin Zhang, Qinghua Ren, Youcai Qin, Zikai Wan, Qirong Mao
Investigation on Deep Speaker Embedding Extraction Methods for Multi-Genre Speaker Verification
Woo Hyun Kang, Jahangir Alam
Combination of Multiple Embeddings for Speaker Retrieval
Xinmei Su, Qingran Zhan, Chenguang Hu, Xiang Xie
An Empirical Study of Weakly Supervised Audio Tagging Embeddings for General Audio Representations
Heinrich Dinkel, Zhiyong Yan, Yongqing Wang, Junbo Zhang, Yujun Wang
Formant Dynamics of Chinese Compound Vowels with Implications for Forensic Speaker Identification
Jintao Kang, Aijun Li, Jingyang Li
Generating TTS Based Adversarial Samples for Training Wake-Up Word Detection Systems Against Confusing Words
Haoxu Wang, Yan Jia, Zeqing Zhao, Xuyang Wang, Junjie Wang, Ming Li
Multimodal Emotion Recognition Using Transfer Learning from Speaker Recognition and BERT-Based Models
Sarala Padi, Seyed Omid Sadjadi, Dinesh Manocha, Ram D. Sriram
Self-Adaptive Multilingual ASR Rescoring with Language Identification and Unified Language Model
Zhuo Gong, Daisuke Saito, Longfei Yang, Takahiro Shinozaki, Sheng Li, Hisashi Kawai, Nobuaki Minematsu
Gamified Speaker Comparison by Listening
Sandip Ghimire, Tomi Kinnunen, Rosa González Hautamäki