Adrián Aranda Márquez

Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities

May 14, 2025 | June 11, 2025 | Adrián Aranda Márquez

Speaker: Sara Barahona Quirós. Abstract: Understanding and reasoning over non-speech sounds and music are crucial for both humans and AI agents to interact effectively with their environments. In this paper, we introduce Audio Flamingo 2 (AF2), an Audio-Language Model (ALM)…

Language-Based Audio Retrieval (DCASE Evaluations)

April 30, 2025 | June 11, 2025 | Adrián Aranda Márquez

Speaker: Doroteo Torre Toledanos. Abstract: Language-based audio retrieval is the task of retrieving audio segments containing sound described in a natural language text. This task was first proposed in a DCASE Challenge in 2022 as a subtask of the audio…

Development of a Guardrail System for Bank Movement Assistant

April 22, 2025 | June 11, 2025 | Adrián Aranda Márquez

Speaker: Miguel Ángel Martínez Pay. Abstract: This seminar outlines the process of creating a guardrail for a banking transactions assistant. The guardrail acts as a security system that filters user queries, determining which can be processed by the assistant and…

Neural Discrete Representation Learning Revisited: Applications of VQ-VAE

April 9, 2025 | June 11, 2025 | Adrián Aranda Márquez

Speaker: Manuel Fernando Mollón Laorca. Abstract: Since the publication of Neural Discrete Representation Learning in 2017, Vector Quantized Variational Autoencoders (VQ-VAEs) have gained significant attention for their ability to bridge continuous and discrete representations. In particular, their integration with transformer…

You Only Hear Once: A YOLO-like Algorithm for Audio Segmentation and Sound Event Detection

March 26, 2025 | June 11, 2025 | Adrián Aranda Márquez

Speaker: Rosa María Hornero Romera. Abstract: Presentation of the paper https://arxiv.org/abs/2109.00962. Audio segmentation and sound event detection are essential aspects of machine listening, focusing on identifying acoustic classes and their boundaries. These tasks play a key role in applications such as…

The Expected Cost: One Performance Metric to Rule Them All

March 12, 2025 | June 11, 2025 | Adrián Aranda Márquez

Speaker: Daniel Ramos Castro. Abstract: Based on https://openreview.net/forum?id=3mN9QNWArl. Abstract of the original paper: “The expected cost (EC) is one of the main classification metrics introduced in statistical and machine learning books. It is based on the assumption that, for a given…

Cybersecurity Today: Attackers and Defenders

March 5, 2025 | June 11, 2025 | Adrián Aranda Márquez

Speaker: Pablo González Escribano. Abstract: As cyber threats continue to evolve at a rapid pace, understanding the tactics, techniques, and procedures (TTPs) employed by attackers is crucial for enhancing defense strategies. In this session, we explored the current landscape of…

Deep Learning Insights Inspired by Reinforcement Learning Research

February 5, 2025 | June 11, 2025 | Adrián Aranda Márquez

Speaker: Tamas Endrei. Abstract: Despite deep reinforcement learning being around for more than 10 years, traditional deep learning best practices have largely avoided the field until now. This talk elaborates on deep learning techniques uncovered through RL-motivated research, touching on…

Joint Automatic Speech Recognition And Structure Learning For Better Speech Understanding

January 29, 2025 | January 30, 2025 | Adrián Aranda Márquez

Speaker: María Pilar Fernández Gallego. Abstract: Spoken language understanding (SLU) is a structure prediction task in the field of speech. Recently, many works on SLU that treat it as a sequence-to-sequence task have achieved great success. However, this method is…

A Whisper-based Query-by-Example Spoken Term Detection approach for search on speech

January 22, 2025 | January 30, 2025 | Adrián Aranda Márquez

Speaker: Javier Tejedor Noguerales. Abstract: In the digital era, the amount of information stored in audio repositories keeps growing. This makes it necessary to develop efficient, automatic methods for searching audio content. To address this, search…

Author: Adrián Aranda Márquez