doi: 10.21437/IberSPEECH.2024
Data augmentation techniques for Physical Access in voice anti-spoofing
Jose Carlos Sanchez, Antonio M. Peinado, Angel M. Gomez
The Influence of the Acoustic Context on the Generation and Detection of Audio Deepfakes
Alba Martínez, Claudia Montero, Carmen Peláez
Integrating the Perceptual PMSQE Loss into DNN-based Speech Watermarking
Pablo Hernández-Manrique, Antonio M. Peinado, Angel M. Gomez
Speech Watermarking removal by DNN-based Speech Enhancement Attacks
Álvaro López-López, Eros Rosello, Angel M. Gomez
Acoustic signal correlates of vocal quality and voice dynamics in an adult with hearing loss
Andréa Maia, Aline Almeida, Ana Ghirardi, Luis Jesus
Pronunciation Assessment and Automated Analysis of Speech in Individuals with Down Syndrome: Phonetic and Fluency Dimensions
Mario Corrales-Astorgano, César González-Ferreras, David Escudero-Mancebo, Lourdes Aguilar, Valle Flores-Lucas, Valentín Cardeñoso-Payo, Carlos Vivaracho-Pascual
Multi-Triplet Loss-Based Models for Categorical Depression Recognition from Speech
Adria Mallol-Ragolta, Manuel Milling, Björn Schuller
Comparative Analysis of Power Dynamics from Invasive EEG Recordings During Overt and Covert Speech Production in Epilepsy Patients
Asma Hasan Sbaih, Marc Ouellet, José L. Pérez-Córdoba, Sneha Raman, Ana B. Chica, Owais Mujtaba Khanday, Alberto Galdón, Gonzalo Olivares, Jose A. Gonzalez-Lopez
Assessing the Impact and Potential of TTS for Pathological Voice Data Augmentation on Pathology Detection Systems
Santiago Rubio Felipo, Dayana Ribas González, Eduardo Lleida Solano, Alfonso Ortega Giménez, Antonio Miguel Artiaga
Applying Transfer-Learning on Embeddings of Language Models for Negative Thoughts Classification
Cristina Luna Jiménez, Jonas Jostschulte, Wolfgang Minker, David Griol, Zoraida Callejas
OPENER - Open-NER in domains without annotated resources
Emanuel Matos, Gabriel Silva, Mário Rodrigues, António Teixeira
Analysing Customer-Support Trends in Social Networks through Dialogue Flow Discovery
Isabel Carvalho, Patrícia Ferreira, Ana Alves, Catarina Silva, Hugo Gonçalo Oliveira
Advancing Open Information Extraction for Portuguese by Leveraging Graph Structures and Large Language Models
Gabriel Silva, Mário Rodrigues, António Teixeira, Marlene Amorim
Interactive Machine Translation with Large Language Models in Low Resources Languages
Sergio Gómez, Miguel Domingo, Francisco Casacuberta
Characterising Speech Under Stress through Glottal Source Features based on Quasi-Closed Phase Inverse Filtering
Luis Joglar-Ongay, Francesc Alías-Pujol
Phone Pair Classification During Speech Production Using MEG Recordings
Xabier de Zuazo, Eva Navas, Ibon Saratxaga, Mathieu Bourguignon, Nicola Molinaro
Comparative Analysis of Mono-speaker and Multi-speaker Models for EMG-to-Speech Conversion
Eder del Blanco, Inge Salomons, Víctor García, Eva Navas, Inma Hernáez
Direct Speech Synthesis from Non-audible Speech Biosignals: A Comparative Study
Javier Lobato Martín, José Luis Pérez Córdoba, Jose A. Gonzalez-Lopez
Nos_Celtia-GL: an Open High-Quality Speech Synthesis Resource for Galician
Noelia García Díaz, Marta Vázquez Abuín, Carmen Magariños, Adina Ioana Vladu, Antonio Moscoso Sánchez, Elisa Fernández Rei
On Speech Pre-emphasis as a Simple and Inexpensive Method to Boost Speech Enhancement
Iván López-Espejo, Aditya Joglekar, Antonio M. Peinado, Jesper Jensen
Towards an Efficient and Accurate Speech Enhancement by a Comprehensive Ablation Study
Lidia Abad, Fernando López, Jordi Luque
Evaluation of Automatic Embeddings for Supervised Soundscape Classification in-the-wild
Claudia Montero-Ramirez, Esther Rituerto-González, Carmen Peláez-Moreno
Towards Efficient Conformer-based Sound Event Detection
Sara Barahona, Juan Ignacio Alvarez-Trejos, Doroteo Toledano, Alicia Lozano-Diez
Fourier Attention: The Attention Mechanism as a Frequency Analyzer
Michail Raptakis, Yannis Pantazis
Analyzing Speech Muscle Activity Using Generalized Additive Modeling
Inge Salomons, Inma Hernáez, Eva Navas, Martijn Wieling
Decoding the Mind: Neural Differences and Semantic Representation in Perception and Imagination Across Modalities
Owais Mujtaba Khanday, Marc Ouellet, José L. Pérez-Córdoba, Asma Hasan Sbaih, Laura Miccoli, Jose A. Gonzalez-Lopez
Face Mask Type and Coverage Area Recognition from Speech with Prototypical Networks
Adria Mallol-Ragolta, Anika Spiesberger, Björn Schuller
Intelligent audio-based signal processing for automatic detection of obstructive sleep apnea
Mercedes Velasco, Ning Ma, Jose A. Gonzalez-Lopez
MalCoLiP: A Maltese Corpus for Linguistic Profiling
Amanda Muscat
MentalQuery: a proposal for conversational human-robot interaction to promote mental health literacy
Juan Barrionuevo-Valenzuela, David Griol, Zoraida Callejas
On the Use of Audio to Improve Dialogue Policies
Daniel Roncel Díaz, Federico Costa, Javier Hernando
STEPI: System for Triplet Extraction of Personal Information in Dialogue Systems
Maria Villa Monedero, Jaime Bellver, Mohamed Imed Eddine Ghebriout, Ricardo De Cordoba, Luis Fernando D'Haro
Semantic Information Retrieval through Autonomous Agents
María García Cutando, Eduardo Lleida Solano, Virginia Bazán Gil, Alfonso Ortega Giménez, Antonio Miguel Artiaga
Towards Parameter-Efficient Non-Autoregressive Spanish Audio-Visual Speech Recognition
David Gimeno-Gomez, Carlos David Martinez Hinarejos
Whisper Meets FalAI: From Speech Recognition to End-to-End Spoken Language Understanding
Andrés Piñeiro-Martín, Carmen García-Mateo, Laura Docío-Fernández, María del Carmen López-Pérez
3CatParla: A New Open-Source Corpus of Broadcast TV in Catalan for Automatic Speech Recognition
Carlos Daniel Hernández Mena, Carme Armentano Oller, Sarah Solito, Baybars Külebi
Analysis of the domain mismatch problem in the Speech Emotion Recognition Task
Miguel A. Pastor, Alfonso Ortega, Dayana Ribas
Analyzing DiaPer EEND Speaker Diarization Models on the RTVE2022 Dataset
Jérémie Touati, Juan Ignacio Alvarez-Trejos, Beltrán Labrador, Alicia Lozano-Diez
Encouraging Internal Representations with Speaker Information in End-to-End Neural Diarization by Adding Speaker Loss
Victoria Mingote, Alfonso Ortega, Antonio Miguel, Eduardo Lleida
Enhancing Crowdsourced Audio for Text-to-Speech Models
Jose Giraldo, Martí Llopart, Alex Peiró-Lilja, Carme Armentano-Oller, Gerard Sant, Baybars Külebi
Extending LIP-RTVE: Towards A Large-Scale Audio-Visual Dataset for Continuous Spanish in the Wild
Miguel Zaragozá-Portolés, David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos
LaFresCat: A Catalan Multi-Accent Speech Dataset for Text-to-Speech
Alex Peiró-Lilja, Martí Llopart-Font, Carme Armentano-Oller, Jose Giraldo, Ignasi Esquerra, Mireia Farrús, Baybars Külebi
Multilingual Speech Emotion Recognition combining Audio and Text using Small Language Models
Jaime Bellver, Mario Rodríguez Cantelar, Marcos Estecha Garitagoitia, Ricardo De Córdoba Herralde, Luis Fernando D'Haro
Open-Source Multispeaker Text-to-Speech Model and Synthetic Speech Corpus with a Mexican Accent through a Web Spanish Dictionary
Carlos Daniel Hernández Mena, Jose Omar Giraldo Valencia, Irene Baucells De La Peña, Alfonso Medina Urrea, Baybars Kulebi
Prototypical Networks for Speech Emotion Recognition in Spanish
Adria Mallol-Ragolta, Anika Spiesberger, Antonio Barba Salvador, Björn Schuller
Analysis of Speaker Label Matching for Diarization of Long Audios on RTVE2022 Dataset
Juan Ignacio Alvarez-Trejos, Laura Herrera, Jérémie Touati, Alicia Lozano-Diez
Into the Sound: Analysis of Technical Quality of Audio and Speech
Dayana Ribas, Juan Antonio Navarro, Fernando Macías, Luis Guillen Civera, José Javier Castejón, Fernando Barreiro-Lostres, Oihane Albizuri, Andrés Carrión, José María Vinacua, Luis Benavente, Martín Sagardía, José Luis Cortina, Juan Luis Moreno, José Angel de la Cruz
Advances in Binary and Multiclass Audio Segmentation with Deep Learning Techniques: A PhD Thesis Overview
Pablo Gimeno, Alfonso Ortega
Signal and Neural Processing against Spoofing Attacks and Deepfakes for Secure Voice Interaction (ASASVI)
Angel M. Gómez, Antonio M. Peinado, Victoria E. Sánchez, Iván López-Espejo, Alejandro Gómez-Alanis, Eros Roselló, José C. Sánchez-Valera, Juan M. Martín-Doñas
Privacy-preserving Machine Learning for Remote Speech Processing
Francisco Teixeira, Alberto Abad, Bhiksha Raj, Isabel Trancoso
Towards improved Automatic Speech Recognition for children
Thomas Rolland, Alberto Abad
#neural2speech: Decoding Speech and Language from the Human Brain
Xabier de Zuazo, Vincenzo Verbeni, Li-Chuan Ku, Ekain Arrieta, Ander Barrena, Anastasia Klimovich-Gray, Ibon Saratxaga, Eva Navas, Eneko Agirre, Nicola Molinaro
Biologically Informed Neural Speech Synthesis
Mateo Cámara, José Luis Blanco
FEMVoQ Project: Three-dimensional finite element simulation of voice quality, considering the influence of phonation types and vocal tract shaping
Oriol Guasch, Francesc Alías-Pujol, Marc Arnela, Marc Freixes, Joan Claudi Socoró, Luis Joglar-Ongay
Adversarial Learning to Remove Sources of Variability in Speech Applications
Juan M. Perero-Codosero, Fernando M. Espinoza-Cuadros, Luis A. Hernández-Gómez
VisIA Project: design of an automated AI-based emotional distress and suicide risk detection system
José Manuel Ramírez Sánchez, Mario Manso, Carmen García-Mateo, Beatriz Gómez-Gómez, Beatriz Pinal, Antía Brañas, Alejandro García Caballero, Laura Docío-Fernandez, M. J. Fernández-Iglesias
DENDRITE Project Overview: Personalized Medicine for the Early Detection of Preclinical Cognitive Decline
Darío Tilves-Santiago, Carmen García-Mateo, Laura Docío-Fernández, Andrea Ropero, Rodrigo Barderas, Ángeles Almeida
BeNeXT project: Biomarker enhanced diagnostic and prognostic tools for rare disorders – using X-chromosome alterations in Turner syndrome as a model
Marc Freixes, Xavier Sevillano, Esther Esteban, Aroa Casado, Carmen Garrido, Alejandro González, Álvaro Heredia-Lidón, Jordi Malé, Joan Claudi Socoró, Luis Joglar-Ongay, Isabella Monlleó, Debora Michelatto, Estephania Candelo, Harry Pachajoa, Rolando González-José, Carina Argüelles, Carola Cheroki, Paula González, Yann Heuzé, Neus Martínez-Abadías
Voices from the South: Study and Synthesis of Andalusian Accents with Artificial Intelligence
Jose Andres Gonzalez Lopez, Antonio Manuel Castilla Rubia, Alfredo Herrero de Haro, Angel Gomez, Antonio M. Peinado
Speech Technologies in the ILENIA Project: Generating Resources to Develop Voice Applications in the Official Languages of Spain
Baybars Külebi, Inma Hernáez, Elisa Fernández Rei, Andres Montoyo, Sarah Solito, Carme Armentano-Oller, Javier Hernando, Eva Navas, Carmen Magariños, Adina Vladu, Ibon Saratxaga, Jon Sánchez, Victor García Romillo, Asier Herranz, Christoforos Souganidis, Noelia García, Antonio Moscoso Sánchez, Xose Luis Regueira, Francisc Dubert, Yoan Gutiérrez
Accelerat.AI: INESC-ID/IST-Universidade de Lisboa contributions towards improved conversational agents in European Portuguese
Alberto Abad, Sérgio Paulo, Rubén Solera-Ureña, Anna Pompili
iRead4Skills @ IberSPEECH 2024: Project presentation and developments for the Portuguese language
Jorge Baptista, Eugénio Ribeiro, Nuno Mamede
AUDIAS System for the ALBAYZIN 2024 WuW Detection Challenge
Enrique Ernesto de Alvear Doñate, Doroteo Torre Toledano
The Vicomtech Speech Transcription Systems for the Albayzín 2024 Bilingual Basque-Spanish Speech to Text (BBS-S2T) Challenge
Juan Camilo Vásquez-Correa, Aitor Álvarez, Haritz Arzelus, Santiago Andrés Moreno Acevedo, Ander González-Docasal, Juan Manuel Martín-Doñas
The PRHLT Speech Recognition System for the Albayzín 2024 Bilingual Basque-Spanish Speech to Text Challenge
David Gimeno-Gomez, Carlos David Martinez Hinarejos
HiTZ-AhoLab ASR System for the Albayzin Bilingual Basque-Spanish Speech to Text Challenge
Asier Herranz, Adrián García-Sebastián, Christoforos Souganidis, Victor García-Romillo, Aitor Bellanco, Eva Navas, Inma Hernáez-Rioja, Ibon Saratxaga
Albayzin 2024 Bilingual Basque-Spanish Speech to Text (BBS-S2T) Challenge: Datasets, Systems and Results
Mikel Peñagarikano, Amparo Varona, Germán Bordel, Luis Javier Rodriguez-Fuentes
Fine-tuning Segmentation Models for the Albayzín diarization challenge
Mirari San Martín, Jónathan Heras, Gadea Mata
HiTZ-Aholab Speaker Diarization System for Albayzin Evaluations of IberSPEECH 2024
Christoforos Souganidis, Gemma Meseguer, Asier Herranz, Inma Hernáez Rioja, Eva Navas, Ibon Saratxaga
AUDIAS-UAM System Description for the Albayzin-RTVE 2024 Speaker Diarization Challenge
Alicia Lozano-Diez, Juan Ignacio Alvarez-Trejos, Laura Herrera, Beltran Labrador, Jeremie Touati, Sara Barahona
ILENIA_VOZ ASR System Fusion for Albayzin 2024 Speech to Text Challenge
Abir Messaoudi, Sarah Solito, Federico Costa, Carlos Daniel Hernández Mena, Marc Casals-Salvador, Lucas Takanori Sanchez Shiromizu, Marti Cortada Garcia, Carme Armentano-Oller, Antonio Moscoso Sánchez, Carmen Magariños, Javier González Corbelle, Asier Herranz, Christoforos Souganidis, Inma Hernáez Rioja, Ibon Saratxaga, Eva Navas