Audio, Data Intelligence and Speech (AUDIAS)
AUDIAS Seminars - "Los Cafés de AUDIAS"

The AUDIAS series of seminars, known as "Los Cafés de AUDIAS", is our weekly meeting where members and guests present their current activities such as projects, papers, theses, presentations, conference summaries, etc. Presentation proposals and requests for slides from past presentations must be submitted to the Seminar Coordinator.
Seminars are scheduled on Tuesdays, 12:00-13:00, in Laboratory C-109 (C building, EPS-UAM).

Seminar Coordinator: Joaquín González-Rodríguez

Past seminars



2017

Considering Accent Recognition Technology for Forensic Applications

Presenter:

Georgina Brown  (PhD Student in Linguistics, University of York, UK)

Date of presentation:

April 04, 2017

Presentation place:

C-109, 12:00-13:00

Abstract:

Developments in automatic speaker recognition have opened up the possibility of providing forensic speech scientists with additional tools for speaker comparison tasks. Other tasks that a forensic speech scientist might be asked to conduct include speaker profiling, where we might want to extract information about an unknown speaker (e.g. geographical origin). This work explores whether we could use automatic accent recognition technology for this kind of task. With particular attention paid to one specific automatic accent recognition system (the Y-ACCDIST system), this research observes its performance on different speech corpora to uncover its strengths and weaknesses.

Short Bio:

Georgina is currently in her third year of her PhD which looks at automatic accent recognition systems for forensic applications. Before joining the Forensic Speech Science research group at the University of York for her postgraduate studies, she gained her undergraduate Linguistics degree at the University of Edinburgh.

Redes Bayesianas para Investigación Policial en Agresiones Sexuales

Presenter:

Daniel Ramos

Date of presentation:

March 31, 2017

Presentation place:

C-109, 12:00-13:00

Abstract:

This talk presents an introduction to Bayesian networks, including representation, inference and learning. It then describes a project carried out at AUDIAS in collaboration with the Instituto de Ciencias Forenses y de la Seguridad, in which Bayesian networks are used to predict offender variables in sexual assault cases with an unknown victim, together with preliminary performance results.
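
As a toy illustration of the kind of inference such a network performs, the sketch below applies Bayes' rule to a single binary offender attribute given one observed case variable; the variables and probabilities are invented for illustration and are not taken from the project:

    # Hypothetical example: posterior over a binary offender attribute A
    # given one observed case variable E, using P(A|E) proportional to P(E|A) * P(A).
    p_a = {"yes": 0.3, "no": 0.7}                              # prior over A
    p_e_given_a = {"yes": {"indoors": 0.8, "outdoors": 0.2},   # likelihoods P(E|A)
                   "no":  {"indoors": 0.4, "outdoors": 0.6}}

    def posterior(evidence):
        unnorm = {a: p_e_given_a[a][evidence] * p_a[a] for a in p_a}
        z = sum(unnorm.values())
        return {a: v / z for a, v in unnorm.items()}

    print(posterior("indoors"))   # approximately {'yes': 0.46, 'no': 0.54}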

Análisis y Procesado de Señales Multicanal Procedentes de Sensores Industriales

Presenter:

Adrián García Cantalapiedra  (Electrical Engineering Master Student)

Date of presentation:

February 23, 2017

Presentation place:

C-109, 12:00-13:00

Abstract:

This Project focuses on the study and development of techniques to facilitate the detection of irregularities present in certain industrial components, from temporary signals acquired with different types of sensors. To mitigate the undesired effect of the noise in the signals, methods used in other fields, such as speech processing, have been analyzed. After this, different algorithms have been applied to identify possible events in the signals. In order to combine multichannel information and to improve the efficiency of single-channel detection, fusion methods have also been developed. Since the amount of information available is limited, a synthetic signal generation system has been developed to increase the size of the database, and its degree of representativity has been analyzed in comparison with actual signals.

2016

Odyssey 2016, the Speaker and Language Recognition Workshop: Scientific Report

Presenter:

Daniel Ramos

Date of presentation:

July 11, 2016

Presentation place:

C-109, 15:00-16:00

Abstract:

This seminar is in fact an informal description of Odyssey 2016, the Speaker and Language Recognition Workshop, held in Bilbao, Spain, 21-24 June 2016. We will briefly describe the conference, and will give an insight on some contributions that can be currently interesting to ATVS.

Implementación y análisis de efectos de audio en GUI para prácticas de la EPS

Presenter:

Juan Manuel Albañil Aguado

Date of presentation:

July 05, 2016

Presentation place:

C-109, 15:00-16:00

Abstract:

In this TFG (Bachelor's Thesis), several digital audio effects have been implemented in Matlab™; both the generated effects and effects available in the Reaper™ DAW have been analyzed, and a graphical interface has been implemented in Matlab™ to analyze the effects. The blocks developed are:

- Implementation and analysis of effects in Matlab™: several effects have been implemented and analyzed with Matlab.
- Analysis of Reaper™ effects in Matlab™: several Reaper™ effects have been selected and analyzed in depth.
- Implementation of a Matlab™ graphical interface: a GUI has been implemented to analyze the effects in a simple way.

The purpose of this Bachelor's Thesis has been to generate additional teaching material for the laboratory sessions of the fourth-year course Tecnologías de Audio of the Degree in Telecommunication Technologies and Services Engineering.
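
As a flavour of the kind of effect covered, here is a minimal sketch of a feedback delay (echo); the original work was done in Matlab™, so this Python version is only an illustrative approximation, not the TFG's code:

    import numpy as np

    def echo(x, sr, delay_s=0.3, feedback=0.4, mix=0.5):
        # Simple feedback delay: each output sample adds a scaled copy of the
        # output delayed by delay_s seconds, then dry and wet signals are mixed.
        d = int(delay_s * sr)
        y = x.astype(float).copy()
        for n in range(d, len(x)):
            y[n] = x[n] + feedback * y[n - d]
        return (1 - mix) * x + mix * y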

Etiquetado automático de segmentos de audio en distintas unidades fonéticas

Presenter:

Daniel Herreros

Date of presentation:

July 04, 2016

Presentation place:

C-109, 15:00-16:00

Abstract:

The aim of this Bachelor's Thesis is to label audio segments into phonemes and triphones. To that end, we run experiments based on the extraction of MFCC coefficients, Delta and Delta-Delta features, SAT, MMI and fMMI. Finally, different deep neural networks are trained to perform the labeling.
For training we mainly use two tools: Kaldi and Theano. Kaldi provides the tools for training the systems prior to the neural networks, but it also allows us to train a neural network. The Theano libraries, in turn, allow us to train another set of deep neural networks using GPU architectures.
Deep neural networks (DNNs) yield better speech labeling results than the rest of the experiments carried out during this work, so we focus on studying their results. We compare the results obtained by the DNNs with those of the experiments mentioned above, and we also compare the different DNN results with one another, taking into account training time and accuracy in terms of correctly labeled words or frames.
The database used is the LDC database with catalog number LDC97S62, which contains around 2,400 two-speaker conversations in English. The database is split into different subsets for training and validation of the trained systems.
Finally, we discuss future research lines for DNN-based labeling of audio segments, such as convolutional and long short-term memory networks.
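
As an illustration of the front-end described above (not the exact Kaldi recipe of the TFG), the following sketch stacks 13 MFCCs with their delta and delta-delta features, as commonly fed to a DNN frame classifier over phoneme or triphone targets; the file name and sampling rate are assumptions:

    import numpy as np
    import librosa

    y, sr = librosa.load("utterance.wav", sr=8000)       # hypothetical audio file
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # 13 cepstral coefficients
    delta = librosa.feature.delta(mfcc)                  # first-order derivatives
    delta2 = librosa.feature.delta(mfcc, order=2)        # second-order derivatives
    features = np.vstack([mfcc, delta, delta2]).T        # shape: (frames, 39)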

Reconocimiento automático de locutor e idioma mediante caracterización acústica de unidades lingüísticas

Presenter:

Javier Franco Pedroso

Date of presentation:

June 13, 2016

Presentation place:

C-109, 13:00-14:00

Abstract:

This Thesis focuses on the use and application of characterization and modeling techniques of an eminently interpretable nature, so that they not only allow speakers or languages to be distinguished, but also provide information about which aspects make them different. One of the tools used in this Thesis for this purpose is the likelihood ratio (LR), which has been adopted in the forensic field as the most appropriate theoretical framework for the presentation of evidence. It represents the ratio of the probabilities of the evidence given two opposing hypotheses: that the questioned and known samples come from the same source, on the one hand, and that they come from different sources, on the other. Thus, the result of comparing two sets of samples has a direct interpretation, in contrast to the raw scores, meaningless in themselves, provided by classical recognition systems. Moreover, its probabilistic interpretation makes it easy to combine information coming from different sources.

The Dissertation begins by addressing the problem of audio segmentation, a crucial stage for speaker and language identification in uncontrolled environments, since it allows the portions of the recording containing only speech to be isolated. The general goal of audio segmentation is to divide the audio stream into segments that are homogeneous with regard to their acoustic content. This segmentation is usually approached as a classification problem in which each audio frame must be assigned to one of the possible classes (for example, speech with background noise, speech with music, clean speech, etc.). Given the breadth of these possibilities, this Thesis poses segmentation as the detection of acoustic classes in a broader sense, so that the final segmentation is given by the combination of several such detectors. Thus, for example, there would be a detector of speech in any context (clean speech, speech with background noise, speech with background music, etc.), a detector of noise (isolated noise, noise in the presence of speech, etc.) and a detector of music (isolated music, etc.). Each of these detectors assigns an LR to each audio frame, so their outputs can be combined naturally to determine the presence of any combination of them, or even the absence of all of them.

The usual way of obtaining LRs in automatic recognition is to apply a transformation to the raw scores provided by the system, which requires an additional data set on which to train that transformation. However, in the forensic field data are often a scarce resource, so the usual approach is based on probabilistic models that yield LRs directly from the features employed. The specific model to apply depends on the distribution of samples in the so-called reference population; when this distribution cannot be approximated parametrically, estimation techniques must be used to characterize it. This Thesis also addresses this problem, proposing a new approach based on Gaussian mixture models (GMMs) as an alternative to the classical estimation by means of a kernel function.

Another tool used in this Thesis to provide interpretable information beyond the recognition process itself is the characterization of speakers and languages in phonetic-acoustic terms. In the case of language recognition, the information coming from classical acoustic systems (based on cepstral features) has been combined with that provided by phonotactic systems. These systems use phonetic information to build a model of the language based on the frequency of occurrence of certain phoneme sequences (n-grams), with the particularity that the phone set of the recognizer may differ from that of the language to be identified. As will be shown, the combination of several phonotactic systems together with acoustic systems leads to significant improvements, while also providing easily interpretable information about the differences between languages.

A similar approach has been followed to address the speaker recognition problem. One of the most direct applications of automatic speaker recognition systems is forensic speaker recognition, where the aim is to determine whether the person speaking in a given recording (questioned sample) is the defendant in question, based on other recordings of that person (known samples). However, forensic speaker recognition and automatic speaker recognition have followed separate paths, largely due to the difficulty of interpreting the procedures employed by the latter, which are seen by most of the forensic community as black-box systems. This is due, on the one hand, to the fact that automatic systems rely on features whose relation to the anatomical properties of individuals is blurred by the processing chain applied to the speech signal in order to remove unwanted signal components and enhance the discriminant information; and, on the other hand, to the fact that automatic systems reduce the comparison between two voice recordings to a single score that integrates all the information present in both recordings. Forensic voice comparison, in contrast, usually makes use of features directly linked to anatomical aspects of the individual, such as formant frequencies, and the comparison is usually carried out according to phonetic-acoustic criteria, comparing units that are linguistically equivalent to one another.

This Thesis addresses speaker recognition from a perspective that tries to link both branches, automatic and forensic. To that end, automatic systems are used to segment the audio signal according to its phonetic content and to extract easily interpretable features such as formant frequencies. From this information, independent automatic speaker recognition systems are built for each linguistic unit, which makes it possible to analyze which units are more discriminative on average, or whether speakers show particularities that make them more distinguishable from one another on the basis of certain linguistic units. In addition, the discriminative information spread across the different linguistic units can be combined naturally thanks to the computation of likelihood ratios for each unit.
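
For reference, the likelihood ratio mentioned throughout is the ratio of the probabilities of the evidence E under the same-source and different-source hypotheses; the per-unit LRs can then be combined, for instance by multiplication if the units are assumed independent (that independence assumption is ours, for illustration only):

    \mathrm{LR} = \frac{p(E \mid H_{\mathrm{same}})}{p(E \mid H_{\mathrm{diff}})}, \qquad \mathrm{LR}_{\mathrm{total}} = \prod_{u} \mathrm{LR}_{u}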


Desarrollo de un software de sincronización automática entre tonalidades y colores en musicoterapia audiovisual

Presenter:

Natalia Delgado  (Technical University of Eindhoven)

Date of presentation:

June 10, 2016

Presentation place:

Smart Room @ C-109, 10:30-11:00

Abstract:

This Bachelor's final project consists in the automation of music and colour synchronization designed to be used in music therapy. The idea behind this concept is to use colour as a new dimension to visually interpret the complex variations in music, and this project contributes to it by improving its efficiency through automation.
Studies have demonstrated that the tensions in a musical piece are related to the emotion or mood the piece transmits [1], in the same way that different colours have been proved to induce a certain mood or emotion in people [2]. These conclusions have been used by a doctoral student at the Technical University of Eindhoven to back up the proposal of making emotion the common variable between music and colour [3][4] for music therapy purposes. This project is the technological part of the research work just mentioned. To develop it, a thorough understanding of the doctoral student's work is required, in order to then propose a software solution that can fulfil its demands.
To start with, this work studies Music Information Retrieval research, the MIDI file format and the relation between them. This is necessary to develop a program capable of reading a MIDI file and converting it into a list of notes with their corresponding timings. Afterwards, high-level music features such as chord and tonality changes are extracted, making use of music theory knowledge to reinforce MIR methods. Once the MIDI file has been interpreted as a list of chords, these are expressed as musical intervals, i.e. relative distances between them. This step is done prior to carrying out an automatic mapping of musical intervals to colours, following findings and conclusions to make a coherent matching between both disciplines [3][4]. The music-to-colour mapping results in a list of colours associated with timings. Finally, to visualize the outcome of the previous steps, coloured lights are changed following the colour list while the music is played synchronously. This last section is controlled by software that establishes a connection with the lamps via Wi-Fi and executes the colour-change commands.
To conclude the project, an evaluation of an external piece of software employed is carried out using a method based on speaker diarization. Finally, conclusions regarding the expectations of the project are drawn, and ideas for future work and improvement are suggested.

Detección de Intrusiones en Redes de Ordenadores (Ana Chevasco), Búsquedas en Voz y Detección de Menciones (María Pilar Fernández Gallego), y Reconocimiento del locutor dependiente del texto sobre RSR2015 (Álvaro Mesa Castellanos)

Presenter:

Ana Chevasco, María Pilar Fernández & Álvaro Mesa

Date of presentation:

June 07, 2016

Presentation place:

C-109, 15:00-16:00

Abstract:

Ana Chevasco
Title: Reconocimiento de Patrones Aplicado a la Detección de Intrusiones en Redes de Ordenadores
Abstract:
This Bachelor's Thesis consists in the analysis of a WiFi intrusion detection database (AWID - Aegean Wireless Intrusion Detection) and the application of pattern recognition techniques to try to detect the frames corresponding to attacks. The reduced database used in this work includes close to 1.8 million training frames and more than half a million test frames. The work starts from the results of the creators of the database and tries to reproduce and improve them. To that end, a manual analysis and selection of attributes was first carried out, going from 156 to 83 attributes while obtaining results practically identical to those obtained by the database creators with all the attributes. Subsequently, an automatic attribute selection was performed over the 83 remaining attributes, automatically selecting 18 attributes with which results similar to those obtained by the database creators with a manual selection of 20 attributes were achieved. As a final and most innovative result, the database has been reorganized according to the communications between pairs of MAC addresses and a context expansion of the frame has been performed, so that the extended frames have a temporal context of -2 to +2 frames within a communication between two MAC addresses. This context expansion has substantially improved the detection of a type of attack whose frame-by-frame detection was not very effective, thereby improving the results obtained by the creators of the database.

María Pilar Fernández Gallego
Title: Mejoras de un Sistema de Búsquedas en Voz y Aplicación a Detección de Menciones en Medios de Comunicación
Abstract:
Advertising mentions are non-prerecorded advertising content that radio or TV hosts usually read out to promote a product or company. The difficulty of detecting advertising mentions is that the audio is not repeated identically each time, as happens with conventional advertising spots, for which more effective techniques such as audio fingerprinting can be used. This Bachelor's Thesis proposes the use of a Spanish keyword search system for the detection of advertising mentions.
Initially, the goal of the TFG was to improve an existing Spanish keyword search system in order to apply it to the detection of advertising mentions. In the end, a new system has been built practically from scratch. To do so, it was first necessary to train and evaluate a new Spanish speech recognizer using the Kaldi toolkit and the Fisher Spanish and Callhome Spanish databases. With this process, the word error rate over conversational telephone speech has been reduced from the 49.88% obtained with the previous Spanish recognizer of the ATVS group to 41.10%.
For the evaluation of mention detection via keywords, a Spanish database, which we have called ATVS-Radio, has also been created as part of this TFG and in collaboration with other TFG students of the group. It contains about 300 hours of audio, of which 25 hours have been labeled with several kinds of information; in particular, the 62 advertising mentions that appeared have been labeled for this work. For mention detection, the lexicon of the recognizer has been modified to include 51 keywords to be detected in the mentions, and the recognizer has been applied to all the advertising mentions, detecting 74% of them. This result could still be substantially improved, since a better adaptation of the recognizer to the task is possible, in particular by adapting the language model, which has not been modified for keyword detection.
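
For reference, the word error rate quoted above is the standard measure computed from the substitutions S, deletions D and insertions I of the recognizer output with respect to a reference transcription of N words:

    \mathrm{WER} = \frac{S + D + I}{N}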

Álvaro Mesa Castellanos
Title: Reconocimiento del locutor dependiente del texto: Experimentos con la base de datos RSR2015
Abstract:
The initial goal of this Bachelor's Thesis was to take the text-dependent speaker recognition systems of the ATVS group, developed with the HTK toolkit and the TIMIT and YOHO databases, and evaluate them on the new RSR2015 database. This goal was achieved relatively early, although with not very satisfactory results given the complexity of the RSR2015 database and its mismatch with the TIMIT database used to train the acoustic models, among other reasons because RSR2015 contains telephone speech in realistic conditions (rather than microphone speech in controlled conditions, as in YOHO and TIMIT). From that point on, a completely new text-dependent speaker recognition system has been developed using the Kaldi toolkit and the speech recognition models trained with Switchboard (around 300 hours of conversational telephone speech) as part of another TFG of the group.
After adapting the language model of the recognizer to the selected RSR2015 task (10-digit sequences), a word error rate (WER) of approximately 5% was obtained, which was considered very reasonable taking into account that RSR2015 contains English spoken by Asian speakers through mobile phones, whereas Switchboard was recorded in the United States mostly over landlines. Considering that the RSR2015 sentences we work with contain 10 digits, the 5% WER translated into roughly 50% of sentences recognized entirely correctly. With this result it seemed difficult to verify that the 10-digit code uttered was the correct one. However, using N-best decoding with up to 100 recognition hypotheses and not requiring the correct recognition of every digit to accept a verification, equal error rates of around 2% have been achieved for the verification of the digit sequence, a result far better than the one published as a reference for the RSR2015 database.
To discriminate between speakers, the same mechanism used in the previous ATVS systems has been employed: comparing the acoustic score of the recognition with a model adapted to the speaker against the acoustic score of the recognition with a model not adapted to the speaker. For this purpose, Kaldi's fMLLR speaker adaptation has been used. In the first tests carried out, the results obtained are similar to those published for the RSR2015 database with i-vector systems.
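
As an illustration of the score comparison just described (with hypothetical numbers, not results from the TFG), the verification score can be computed as a per-frame difference between the acoustic log-likelihoods obtained with the speaker-adapted and the unadapted models:

    def verification_score(loglik_adapted, loglik_unadapted, n_frames):
        # Per-frame log-likelihood ratio between adapted and unadapted models.
        return (loglik_adapted - loglik_unadapted) / n_frames

    score = verification_score(loglik_adapted=-5210.4,    # hypothetical values
                               loglik_unadapted=-5268.9,
                               n_frames=620)
    accept = score > 0.05                                 # threshold tuned on development data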

Interfaz gráfica de etiquetado de atributos faciales (Beatriz Cid) y Adquisición de base de datos y sistema basado en firma manuscrita (Beatriz Cervantes)

Presenter:

Beatriz Cid y Beatriz Cervantes

Date of presentation:

June 06, 2016

Presentation place:

C-109, 15:00-16:00

Abstract:

Interfaz gráfica de etiquetado de atributos faciales (Beatriz Cid)

Traditional identification systems present some shortcomings, since they rely on information such as passwords, IDs or access cards, which can be lost, stolen or even easily falsified. Biometric recognition thus arose with the purpose of identifying individuals through their own characteristics such as the face, fingerprint, etc. Complementary characteristics of this kind are known as Soft Biometrics. They involve physical human characteristics (e.g. eye color, hair color, the presence of a beard, etc.), behavioral characteristics (e.g. typing dynamics, writing style, gait, etc.) and characteristics attached to the person (e.g. clothing color, the presence of tattoos, accessories, etc.), used over time by human beings to differentiate individuals. Physical and behavioral characteristics are increasingly used in security applications due to their several advantages, such as universality, robustness, permanence and accessibility. The present TFG studies, implements and evaluates a recognition system based on Soft Biometrics. Soft Biometrics are interesting since they may provide information about an individual, potentially becoming a reliable primary biometric system. To carry this out, a facial labeling interface is developed with the objective of studying and evaluating how a set of Soft Biometrics may affect the identification of an individual. As a starting point, the state of the art in facial biometric systems and its evolution over time has been studied. Later, the interface for facial attribute labeling has been designed, and the LFW database, which has no Soft Biometric labels, has been labeled; this database was provided by the ATVS Biometric Recognition Group. Once the labels were obtained, the results were quantified in order to extract conclusions from the data. In the analysis part, the correlation among the different Soft Biometrics included in this project has been studied and evaluated. In the experimental part, a system based on a set of Soft Biometrics has been designed, studying the performance of individual attributes and of several sets of them according to different criteria. Finally, the conclusions drawn throughout the project are presented and future lines of work are proposed.

Title: Adquisición de base de datos y desarrollo de sistema de reconocimiento biométrico basado en firma manuscrita (Beatriz Cervantes)

Abstract: This project is based on the acquisition and evaluation of a dynamic signature database captured with several fixed and mobile devices, in two sessions separated in time. The devices used to capture this database were a smartphone and two tablets: a WACOM tablet, designed exclusively for capturing dynamic signatures with a stylus, and a general-purpose tablet and smartphone running the Android operating system.

As a starting point, the state of the art is studied. This part explains the different modes of operation in biometric recognition. For the verification mode, special emphasis is placed on the architecture of this type of system, from the data capture stage through to score normalization.

Once the state of the art was understood from a theoretical point of view, the next step was to define and describe the design of the database. This database is multi-session, multimodal, and large enough to yield reliable results. Genuine signatures, handwriting samples and forgeries have been collected. The users who produced these forgeries did so following different procedures, some more realistic than others.

Once the database was finally acquired, the experimental part was carried out in two stages. In the first stage, the objective was to evaluate the performance of the dynamic signature verification system without interoperability. Here, different types of comparisons are distinguished when evaluating the system's performance, stylus-versus-finger comparison being one of the most notable. In the second stage, the system's performance was studied again, this time applying interoperability between devices, with the objective of achieving a performance as close as possible to the system without interoperability. Finally, the conclusions drawn throughout this work are presented, as well as possible future work.

Diarización de Locutores en Audio Broadcast

Presenter:

Gonzalo Soriano Morancho

Date of presentation:

June 03, 2016

Presentation place:

Smart Room @ C-109, 16:30-17:00

Abstract:

Audio diarization is a field of study that has received relatively little attention over the last years, and it therefore holds undeniable interest for current research.
This Bachelor's Thesis reviews the current state of the art in this field. We mainly study an existing system that has been proven to work reasonably well: LIUM, an open-source system from the University of Le Mans that won the ESTER 2, ETAPE and REPERE evaluations.
First, we study the system using the Albayzin 2010 benchmark, which gives us an overview of the system's usage, performance and reliability. Later, we evaluate it on a database created at the beginning of this Thesis, in collaboration with other university students, from a set of radio shows currently broadcast in Spain.
The last step is to adapt the system as well as possible to our new database, in order to tune its performance to the type of corpus we want to study. This also gives us the opportunity to study each component of the system separately, so that we know which one needs the most effort to be improved. This step consists of studying a large amount of data coming from different simulations with different input parameters, adjusted manually for the system under study.

Investigación reproducible en BEAT (Roberto Daza) y Autenticación mediante dinámica de ratón (Belén Mérida)

Presenter:

Roberto Daza & Belén Mérida

Date of presentation:

June 01, 2016

Presentation place:

C-109, 15:00-16:00

Abstract:

Author: Roberto Daza

Title: Investigación reproducible: uso de la plataforma BEAT para la evaluación tecnológica de algoritmos de reconocimiento biométrico

Abstract: Reproducibility of research is a matter of great concern within the scientific community. A large number of the scientific articles published nowadays lack the code needed to generate the published results, and sometimes even the databases with which those results were obtained. This work presents a technology evaluation tool developed on top of the Biometric Evaluation and Testing (BEAT) platform, aimed at fostering reproducible experimentation in the area of biometric recognition. To that end, an international competition (Keystroke Biometric Ongoing Competition - KBOC) has been proposed, focused on the evaluation of user recognition systems based on keystroke dynamics. The main goals of this competition are to promote research in keystroke dynamics (attracting new researchers), to promote the BEAT platform, and to serve as a common benchmark for comparing biometric systems.
The competition includes one of the largest keystroke dynamics databases, with more than 300 users and 4 different sessions (users type their first and last name).
There are two ways to participate in KBOC (same database for both): the Ongoing competition (which will remain active indefinitely), developed on the BEAT platform, and the Offline competition, which serves as a reference for the Ongoing one. This TFG introduces the platform (the resources created for the competition, etc.), details the characteristics of the competition, and presents the ongoing results (experiments provided by the competition) as well as the results and systems of the participants in the offline competition.


Author: Belén Mérida

Title: Mejora de sistemas de autenticación de personas basados en dinámica de ratón

Abstract: This Bachelor's Thesis studies and implements a person authentication system based on mouse dynamics. Authentication systems based on behavioral biometric traits are increasingly in demand, and in this sense mouse dynamics has become an interesting research line in recent years. Motivated by the impact of web and behavioral biometrics, this trait shows great potential in a multitude of scenarios given its low cost and easy deployment. Like keystroke dynamics, mouse dynamics is posed as an alternative to traditional authentication systems and a guarantee of the all-important cyber security. Moreover, unlike other technologies, it allows continuous monitoring of users, i.e. continuous authentication during the use of a service, as opposed to the single one-off authentication performed when accessing the service. All these factors have turned mouse dynamics into a very attractive tool for the digital industry.
The works carried out in this field highlight its usefulness and reinforce its validity with their results. Thus, the objective of the present work is to evaluate the performance of the systems proposed in the literature and to study possible improvements to them. Therefore, the first step of this work is a review of the state of the art. From all the proposed works, a reference article is chosen, which presents a performance evaluation of different implementations of the problem. Based on it, the best-performing system is designed and its performance is analyzed according to the different variables of the problem (users, tasks, features). In parallel, new approaches to the problem are studied, already used for traits such as the signature, based on neuromotor models associated with the production of the dynamic signals involved in mouse dynamics. After designing the different systems, a series of experiments is carried out in order to perform different evaluations. Finally, conclusions are presented and future lines of work are proposed.

Improving Security and Privacy in Biometric Systems

Presenter:

Marta Gómez Barrero

Date of presentation:

May 09, 2016

Presentation place:

C-109, 15:00-16:00

Abstract:

With the increased need for reliable and automatic identity verification, biometrics have emerged in the last decades as a pushing alternative to traditional authentication methods. Certainly, biometrics are very attractive and useful for the general public: forget about PINs and passwords, you are your own key. However, the wide deployment of biometric recognition systems in both large-scale applications (e.g., border management at European level or national identity systems) and everyday tasks (e.g., smartphone or PC access) has raised some concerns regarding the use and storage of such sensitive data. Therefore, understanding the threats which can affect those systems and analysing to what extent the subject's privacy is protected is of the utmost importance. In this context, the present PhD Thesis aims to shed some light on the difficult problem of security and privacy evaluation of biometric systems. To that end, a systematic analysis of the privacy provided by unprotected templates is carried out, and new biometric template protection schemes are proposed to deal with the unveiled privacy issues, with their robustness to the mentioned privacy threats thoroughly assessed. This way, the experimental studies presented in this Dissertation can help to further develop the ongoing standardization efforts on the assessment of template protection schemes.

Chemometric-assisted classification of physicochemical data for forensic purposes

Presenter:

Patrick Wlasiuk

Date of presentation:

March 14, 2016

Presentation place:

C-109, 15:00-16:00

Abstract:

Assessing the evidential value of physicochemical patterns is of interest to the analytical chemist appointed as a forensic expert in a given case. The analysis of such patterns poses some challenges which ought to be resolved when a likelihood ratio has to be given. These obstacles consist of issues related to the probabilistic approach in a feature space, which have been tackled using discriminative approaches with subsequent formulation of likelihoods. Examples based on glass and olive oil datasets are used as instances of background-information scarcity and calibration issues. Aspects of these topics will be addressed during the talk.

Short Bio:

Patryk Własiuk is a PhD candidate in the Department of Analytical Chemistry at Jagiellonian University in Kraków (Poland). He conducts his thesis research in Institute of Forensic Research in Kraków. His doctoral research is focused on the use of chemometrics for forensic purposes. He is interested in evaluating the evidential value of physicochemical patterns coming from various analytical methods. Part of challenges related to this topic is to be explored during his stay in ATVS research group.

Segmentation and Detection of Audio Sources in Broadcast and Telephone Audio Streams (Speaker Diarization)

Presenter:

Cristian Sánchez Rodríguez

Date of presentation:

March 07, 2016

Presentation place:

C-109, 15:00-16:00

Abstract:

This master thesis includes the tasks of documentation, research, design and development of a Speaker Diarization System for Broadcast News. It was done in collaboration with the Biometric Recognition Group - ATVS at Universidad Autónoma de Madrid (UAM).

Speaker diarization aims to answer the question "who spoke when?", i.e. to indicate which speaker is talking at each moment. This task is performed without any prior information: neither the number of speakers nor their identities. The audio broadcast domain makes the task more difficult as it contains many different audio sources and the speech parts are usually found in the presence of music or background noise.

The system developed works independently as it incorporates its own Speech Activity Detector (SAD). The design of the SAD is done by studying the trajectories of the harmonics found in the spectrogram of the speech signal. It can also be used separately for other speech-related tasks.

The speaker diarization system is optimized for the evaluation data of the ‘Albayzin 2010’ speaker diarization campaign. The Diarization Error Rate (DER) is used to measure the performance of the task.
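
For reference, the DER used here is the usual NIST-style diarization metric: the fraction of speech time that is wrongly attributed, combining missed speech, false-alarm speech and speaker-confusion time (the exact scoring options, e.g. collars, are an assumption here):

    \mathrm{DER} = \frac{T_{\mathrm{miss}} + T_{\mathrm{fa}} + T_{\mathrm{conf}}}{T_{\mathrm{speech}}}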

LIUM, an open-source speaker diarization system which got excellent results in the ESTER 2, ETAPE and REPERE evaluations, is also tested on the ‘Albayzin 2010’ database. Its results are compared to the ones of the system developed.

Finally, some ways of improvement are identified, the ongoing work is detailed and different lines of research are proposed for further work.

Speaker Recognition in Unconstrained Environments

Presenter:

Andreas Nautsch

Date of presentation:

February 01, 2016

Presentation place:

C-109, 15:00-16:00

Abstract:

Biometrics on mobile devices is on the rise. Sample capturing environments can no longer be restricted, as they are in conventional commercial applications installed at fixed stations. Since speaker recognition is particularly prominent on mobile phones due to the permanence of the capture device, research on unconstrained conditions is well motivated. Different aspects of unconstrained duration and noise conditions are emphasized: estimating biometric information, unconstrained score normalization, and i-vector reconstruction in a mobile scenario. Furthermore, recent developments in a related speaker and signature recognition project (BioMobile) are depicted, including standardization work on a voice data format, testing and reporting of presentation attack detection, and the proposal of a biometric forensic guideline. Collaborative outcomes from the BioMobile project with the ATVS-UAM lab are put up for discussion.

Short Bio:

Andreas Nautsch received his B.Sc. and M.Sc. degrees from Hochschule Darmstadt (h_da) in a cooperative study program in 2012 and 2014, respectively, and joined Mälardalens Högskola (MDH), Västerås, Sweden, for an Erasmus exchange in 2013. From 2008 to 2014 he worked as a student software engineer with a focus on automatic speech recognition and speaker recognition. In June 2014, he became a Ph.D. student member of the da/sec research group at the Center for Advanced Security Research Darmstadt (CASED). In 2012 and 2014, he participated in NIST SREs, and he became editor of the ISO/IEC 19794-13 voice data project in 2015. In December 2015, he joined the ATVS research group at Universidad Autónoma de Madrid (UAM), Spain, for a 3-month collaborative Ph.D. internship.

Análisis de Técnicas Caligráficas para la Mejora del Reconocimiento Automático de Firma

Presenter:

Francisco José Fernández Herrero

Date of presentation:

January 18, 2016

Presentation place:

C-109, 15:00-16:00

Abstract:

This project studies calligraphic techniques for use in the improvement of semi-automatic handwritten signature recognition systems. The calligraphic attributes that model the signature have been studied and analyzed, and the most suitable ones have been selected for this work. Specifically, 10 different attributes have been selected from a much larger set used by forensic experts. A tool has been created to manually annotate these selected attributes on each signature of each user, thus obtaining a new database of attributes manually extracted from the signatures.

For the experimental chapter, the 3 systems used in this project have been evaluated (an online system based on DTW, an offline system based on SIFT, and the proposed attribute-based system). In addition to these main experiments, experiments involving only the selected calligraphic attributes have been carried out in order to compare their performance. The manual system based on attribute labeling, fused with the automatic DTW (Dynamic Time Warping) system, achieves an improvement of 98% for random forgeries (the forger reproduces the genuine signature without any knowledge of it) and 28% for skilled forgeries (the forger reproduces the genuine signature after having observed and practised it beforehand), compared with the automatic system based on dynamic information, DTW. Finally, the conclusions drawn throughout the work are presented, as well as possible lines of future work.
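
Since both this work and the automatic baseline rely on DTW, here is a minimal sketch of the elastic matching involved (an illustrative implementation, not the system's actual code):

    import numpy as np

    def dtw_distance(a, b):
        # Classic DTW between two sequences of feature vectors (frames x dims),
        # returning a length-normalized alignment cost.
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m] / (n + m)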

A Semi-supervised Pipelined Deep Learner for an Adaptive and Efficient Biometric System

Presenter:

Abhijit Das  (Griffith University)

Date of presentation:

January 11, 2016

Presentation place:

C-109, 15:00-16:00

Abstract:

The presentation will be organised into three sections. The first section will highlight my work on sclera biometrics, followed by the proposed deep learner scheme, and the last section will describe the IEEE grant/fellowship I have received to carry out my present proposal.

2015

ATVS submission and overview of NIST Language Recognition Evaluation 2015

Presenter:

Alicia Lozano & Rubén Zazo

Date of presentation:

December 14, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

The ATVS submission, consisting of LSTM Recurrent Neural Network and i-vector systems, will be fully described. An overview of last week's LRE 2015 workshop will also be provided.

Towards human-assisted signature recognition

Presenter:

Derlin Morocho Checa  (Escuela Politécnica del Ejército, Ecuador)

Date of presentation:

November 30, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

This café will present the work carried out in recent months on signature recognition through human-assisted semi-automatic systems. Although human-assisted systems are popular in areas such as face recognition, these collaborative schemes are novel in the field of signature recognition and show great research potential. The presentation will focus on two proposals studied during the last few months: signature recognition through attributes inspired by the analysis of FDEs (Forensic Document Examiners), and Human Intelligence Tasks (HITs) carried out through crowdsourcing.
During the café, the paper "Signature Recognition: Establishing Human Baseline Performance via Crowdsourcing", recently submitted to the International Workshop on Biometrics and Forensics (IWBF) and currently under review, will also be presented.

Occlusions in Face Recognition: Experimental Comparison of Region-based Matchers

Presenter:

Ester González

Date of presentation:

November 23, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

The latest research efforts in the face recognition community have focused on improving the robustness of systems under different variability conditions such as changes of pose, expression, illumination, low resolution and occlusions. In our society, there are plenty of situations in which people have their face covered partially or completely, hindering the possibility of being recognized. Occlusions are also a way of evading identification, commonly used when committing crimes or thefts. Several previous works have proved the value of using the non-occluded regions in scenarios in which occlusions are present. In this work, we report experiments based on the ARFace database for three different systems: the well-known Face++ commercial software and two other systems based on LBP approaches. Results prove the robustness of the commercial software under two types of occlusion (sunglasses and scarf) and the convenience of increasing the number of facial regions up to 15 when working with the LBP system.
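
To make the LBP-based systems mentioned above more concrete, here is a sketch of a region-based LBP descriptor (the grid layout and parameters are assumptions, not the exact systems evaluated in this work):

    import numpy as np
    from skimage.feature import local_binary_pattern

    def lbp_region_histograms(gray_face, grid=(5, 3), P=8, R=1):
        # Uniform LBP over a grayscale face, histogrammed per region of a
        # grid (here 5x3 = 15 regions) and concatenated into one descriptor.
        lbp = local_binary_pattern(gray_face, P, R, method="uniform")
        h, w = lbp.shape
        hists = []
        for i in range(grid[0]):
            for j in range(grid[1]):
                block = lbp[i * h // grid[0]:(i + 1) * h // grid[0],
                            j * w // grid[1]:(j + 1) * w // grid[1]]
                hist, _ = np.histogram(block, bins=P + 2, range=(0, P + 2), density=True)
                hists.append(hist)
        return np.concatenate(hists)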

7th European Academy of Forensic Science Conference, EAFS 2015: Scientific Report

Presenter:

Daniel Ramos

Date of presentation:

November 16, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

This seminar is in fact an informal description of the 7th European Academy of Forensic Science Conference, EAFS 2015, held in Prague, Czech Republic, 6-11 September 2015. We will briefly describe the conference, and will give an insight on some contributions that can be currently interesting to ATVS.

Update Strategies for HMM-Based Dynamic Signature Biometric Systems

Presenter:

Rubén Tolosana Moranchel

Date of presentation:

November 02, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

Biometric authentication on devices such as smartphones and tablets has increased significantly in the last years. One of the most accepted and increasingly used traits is the handwritten signature, as it has been used in financial and legal agreement scenarios for over a century. Nowadays, it is common to sign on digitizing tablets in banking and commercial settings. For these reasons, it is necessary to consider a new scenario where the number of training signatures available to generate the user template is variable and where, in addition, the lapse of time between them (inter-session variability) has to be taken into account. In this work we focus on dynamic signature verification. The main goal of this work is to study system configuration update strategies for time-function-based systems such as Hidden Markov Models (HMM) and Gaussian Mixture Models (GMM). Therefore, two different cases have been considered: first, the usual case of an HMM-based system with a fixed configuration (i.e. the Baseline System); second, HMM-based and GMM-based systems whose configurations are optimized according to the number of training signatures available to generate the user template. The experimental work has been carried out using an extended version of the Signature Long-Term database, taking into account skilled and random (zero-effort) forgeries. This database comprises a total of 6 different sessions distributed over a 15-month time span. Analyzing the results, the proposed systems achieve an average absolute improvement of 4.6% EER for the skilled forgery cases compared to the Baseline System, whereas the average absolute improvement for the random forgery cases is 2.7% EER. These results show the importance of optimizing the configuration of the systems, compared to a fixed-configuration system, as the number of training signatures available to generate the user template increases.

"Comparison of binaural microphones for externalization of sound" & "TFM Project Plan: Speaker Diarization"[Go to top]

Presenter:

Cristian Sánchez

Date of presentation:

October 26, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

Cristian Sánchez will present, on the one hand, the work carried out at the Center for Applied Hearing Research of DTU (Technical University of Denmark), entitled:

Comparison of binaural microphones for externalization of sound

as well as the work plan for his Master's Thesis (TFM) at ATVS-UAM, entitled:

Segmentation and Detection of Audio Sources in Broadcast and Telephone Audio Streams (Speaker Diarization)

Bottleneck Features for Speaker Recognition

Presenter:

Alicia Lozano

Date of presentation:

October 19, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

Deep Neural Networks (DNNs) have been shown to model the speech signal properly. Recently, several approaches based on DNNs have been successfully applied to the field of speaker recognition. One of these approaches is to use a DNN as a feature extractor: a DNN with a relatively small layer ("bottleneck") is trained to discriminate between phoneme states; then, a forward pass is performed and the activations obtained in the bottleneck layer are used as input to a classical i-vector/PLDA speaker recognition system. In this talk, this technique will be described, and some experiments following this approach, obtained as a result of the work done during a research internship in the Speech Processing Group (Brno University of Technology), will be presented.
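
The following sketch illustrates the bottleneck idea in PyTorch (layer sizes and input dimensionality are assumptions, not the configuration used in the internship): a DNN trained to classify phoneme states, with a small bottleneck layer whose activations are later used as features for the i-vector/PLDA back-end:

    import torch
    import torch.nn as nn

    class BottleneckDNN(nn.Module):
        def __init__(self, n_in=440, n_hidden=1500, n_bottleneck=80, n_states=3000):
            super().__init__()
            self.front = nn.Sequential(
                nn.Linear(n_in, n_hidden), nn.Sigmoid(),
                nn.Linear(n_hidden, n_hidden), nn.Sigmoid(),
                nn.Linear(n_hidden, n_bottleneck),        # bottleneck layer
            )
            self.back = nn.Sequential(
                nn.Sigmoid(),
                nn.Linear(n_bottleneck, n_hidden), nn.Sigmoid(),
                nn.Linear(n_hidden, n_states),            # phoneme-state outputs
            )

        def forward(self, x):                 # used for cross-entropy training
            return self.back(self.front(x))

        def bottleneck_features(self, x):     # forward pass stopped at the bottleneck
            return self.front(x)

    feats = BottleneckDNN().bottleneck_features(torch.randn(16, 440))   # (16, 80)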

Scientific Report IEEE Seventh International Conference on Biometrics: Theory, Applications and Systems (BTAS15)

Presenter:

Aythami Morales & Rubén Vera

Date of presentation:

October 05, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

This seminar is an informal description of BTAS 2015, held in Arlington, USA, 8-11 September 2015. We will briefly describe the conference, and will give an insight on the contributions that can be currently interesting to ATVS, including the two works presented by the group: "e-BioSign Tool: Towards Scientific Assessment of Dynamic Signatures under Forensic Conditions" and "Keystroke Dynamics Recognition based on Personal Data: A Comparative Experimental Evaluation Implementing Reproducible Research".

Evidential value of spectra assessed using the likelihood ratio supported by chemometric tools

Presenter:

Agnieszka Martyna

Date of presentation:

September 28, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

The application of various chemometric tools, which account for many important aspects of data mining, feature selection and the extraction of the most relevant information, combined with the LR approach, will be presented for some examples involving three databases within the comparison problem: 27 infrared (FTIR) spectra recorded for polypropylene samples originating from car body parts and plastic containers, and Raman spectra for 30 solid and 30 metallic blue automotive paints.

Short Bio:

Agnieszka Martyna is a PhD student at the Jagiellonian University, Krakow, Poland. She is engaged in research focused on evaluating the evidential value of physicochemical data for forensic purposes. This research is conducted at the Institute of Forensic Research, Krakow, Poland. Her work focuses on the analysis of microtraces of glass and polymers and the application of statistical and chemometric methods for assessing their evidential value. Agnieszka Martyna holds an MSc (2011) in chemistry from the Faculty of Chemistry, Jagiellonian University, Krakow, Poland and an Eng. (2012) from the University of Science and Technology, Krakow, Poland. She is also the co-author of the book Zadora G., Martyna A., Ramos D., Aitken C., Statistical Analysis in Forensic Science. Evidential Value of Multivariate Physicochemical Data; John Wiley and Sons, 2014 devoted to the evaluation of evidence.

Deep Learning Architectures for Voice Activity Detection

Presenter:

Rubén Zazo

Date of presentation:

September 21, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

Choosing the right set of features and the proper model for Voice Activity Detection have been active areas of research. In this paper we propose a method based on a state-of-the-art CLDNN architecture fed with the raw waveform that aims to face both problems at the same time. The proposed method is trained on a large dataset (~4000 h) and tested in clean and noisy conditions. Specifically, we will show the benefit of temporal modeling for this task, as well as the gains introduced by learning the proper features directly from the raw waveform. The proposed system achieves over 78% relative improvement in both clean and noisy conditions when compared to a standard DNN of comparable size fed with the widely used log-mel features as input and trained on the same conditions. In addition, a deeper study of the impact of the model size and the characteristics of the learned features has been performed in order to better understand the outstanding performance shown.
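
A rough sketch of a CLDNN-style detector on framed raw waveform is shown below; the framing, layer sizes and pooling are assumptions made for illustration, not the architecture of the paper:

    import torch
    import torch.nn as nn

    class CLDNNVad(nn.Module):
        # Convolutional front-end on raw samples, LSTM for temporal context,
        # and a small DNN producing per-frame speech/non-speech logits.
        def __init__(self, conv_ch=64, lstm_h=64):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv1d(1, conv_ch, kernel_size=25, stride=10), nn.ReLU(),
                nn.AdaptiveMaxPool1d(1),
            )
            self.lstm = nn.LSTM(conv_ch, lstm_h, batch_first=True)
            self.dnn = nn.Sequential(nn.Linear(lstm_h, 32), nn.ReLU(), nn.Linear(32, 2))

        def forward(self, frames):            # frames: (batch, n_frames, samples_per_frame)
            b, t, l = frames.shape
            x = self.conv(frames.reshape(b * t, 1, l)).squeeze(-1)
            x, _ = self.lstm(x.reshape(b, t, -1))
            return self.dnn(x)                # (batch, n_frames, 2)

    logits = CLDNNVad()(torch.randn(2, 100, 400))   # dummy batch: 2 utterances x 100 frames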

Fully Unlinkable and Irreversible Template Protection Based on Bloom Filters

Presenter:

Marta Gomez-Barrero

Date of presentation:

September 02, 2015

Presentation place:

C-109, 15:30-16:30

Abstract:

In this paper we propose a new biometric template protection scheme providing irreversibility and unlinkability, while preserving verification performance, in accordance with the ISO/IEC IS 24745 on biometric information protection. The original approach based on Bloom filters is improved in order to ensure unlinkability and robustness to cross-matching attacks. A thorough experimental evaluation, carried out on a publicly available database and system, shows that those three major requirements are met (verification performance preservation, irreversibility and unlinkability).
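
As a toy illustration of the Bloom-filter idea behind such schemes (the hashing and parameters below are our own simplification, not the construction evaluated in the paper), binary feature blocks are mapped into a fixed-size bit set that is hard to invert yet still comparable:

    import hashlib

    class BloomTemplate:
        def __init__(self, n_bits=1024, n_hashes=3):
            self.n_bits, self.n_hashes, self.bits = n_bits, n_hashes, set()

        def add(self, block: bytes):
            # Map one binary feature block to n_hashes positions in the bit array.
            for k in range(self.n_hashes):
                h = hashlib.sha256(bytes([k]) + block).digest()
                self.bits.add(int.from_bytes(h, "big") % self.n_bits)

        def similarity(self, other):
            # Jaccard-style overlap between two protected templates.
            union = len(self.bits | other.bits) or 1
            return len(self.bits & other.bits) / union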

Caracterización de hablantes mediante extracción de información de cualidad vocal

Presenter:

Adrián García Cantalapiedra

Date of presentation:

July 09, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

Speaker recognition has typically been based on vocal tract characteristics. In this work a completely novel system is proposed in which parameters related to the glottis are employed. To this end, the most advanced algorithms in this field, collected in a freely available repository called COVAREP, are used and briefly explained. An analysis of the different glottal characteristics that may be of interest is carried out, and a speaker identification system is built with the most common and robust methods for this task. Next, possible groupings of speakers into different types according to these glottal characteristics are analyzed. Finally, a basic MFCC system is built and the verification results obtained with both types of features are combined.
Keywords:
Speaker characterization, glottis, GMM-UBM, NIST, glottal pulse, GCI, COVAREP, agglomerative clustering, score fusion

Optimal Feature Selection and Inter-Operability Compensation for On-Line Biometric Signature Authentication

Presenter:

Javier Ortega García

Date of presentation:

July 06, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

Due to the technological evolution and the increasing popularity of smartphones, people can access an application with many different devices. This device interoperability is a very challenging problem for biometrics. In this paper we focus on device interoperability compensation for on-line signature verification. The proposed approach is based on two main stages. The first one is a preprocessing stage where data acquired from different devices are processed in order to normalize the signals into similar ranges. The second one is based on a feature selection of time functions taking into account the inter-device comparisons, in order to select features which are robust in these conditions. The experimental work has been carried out with the Biosecure database using a Wacom tablet (DS2) and a PDA (DS3), and the system developed is based on the dynamic time warping (DTW) elastic measure over the selected time functions. The performance of the proposed system is very similar to that of an ideal system. Moreover, the proposed approach provides average relative improvements for the inter-operability comparisons of 26.5% for random forgeries and around 14.2% for skilled forgeries, compared to the results of a system specifically tuned for each device, proving the robustness of the proposed approach. These results open the door for future works using devices such as smartphones or tablets, commonly used nowadays.
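
Since the verifier is built on a DTW elastic measure over selected time functions, a minimal sketch of such a distance may help fix ideas; the local cost, the length normalisation and the toy data below are illustrative only.

```python
# Minimal sketch of a DTW elastic distance between two signatures represented
# as sequences of (selected) time functions, e.g. normalised x, y and pressure.
import numpy as np

def dtw_distance(a, b):
    """a, b: arrays of shape (len, n_features). Returns length-normalised cost."""
    na, nb = len(a), len(b)
    d = np.full((na + 1, nb + 1), np.inf)
    d[0, 0] = 0.0
    for i in range(1, na + 1):
        for j in range(1, nb + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])     # local distance
            d[i, j] = cost + min(d[i - 1, j], d[i, j - 1], d[i - 1, j - 1])
    return d[na, nb] / (na + nb)

# Toy usage: a resampled, noisy copy of the reference scores lower than a
# completely unrelated trajectory.
rng = np.random.default_rng(0)
ref = rng.standard_normal((150, 3)).cumsum(axis=0)
gen = ref[::2] + 0.05 * rng.standard_normal((75, 3))
imp = rng.standard_normal((120, 3)).cumsum(axis=0)
print(dtw_distance(ref, gen), dtw_distance(ref, imp))
```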

Fingerprint Recognition for Forensic Applications (revised version)[Go to top]

Presenter:

Ram P. Krish

Date of presentation:

June 17, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

The latent fingerprints obtained from crime scenes are highly partial in nature and also of poor quality. Identifying a partial latent fingerprint in a huge criminal database using automated fingerprint matchers is a challenging problem. In this talk, we will focus on methods to improve the identification accuracy of fingerprint matchers in this forensic scenario. Many automated fingerprint matchers assume approximately the same size of the minutiae sets under comparison for best performance. We propose a method to reduce the minutiae search space of the full fingerprint roughly to the size of the partial fingerprint by registering the orientation fields of an input partial fingerprint and a full good-quality fingerprint (usually the case in forensic casework), thereby increasing the identification accuracy of the whole system.

Automated Fingerprint Identification Systems (AFIS) commonly use typical minutiae features such as ridge-endings and bifurcations for matching. We also propose a method to improve the identification accuracies of minutiae-based matchers by incorporating extended fingerprint feature sets (rare minutiae features). We also propose a robust evidence evaluation framework based on likelihood ratio which uses the AFIS scores from the method proposed to incorporate extended fingerprint feature sets.

Monitorización de parámetros biométricos para aplicaciones web[Go to top]

Presenter:

Elena Luna

Date of presentation:

June 10, 2015

Presentation place:

C-109, 15:30-16:30

Abstract:

Technological development in recent years has contributed to the access to and massive storage of digital information. This increasing development, along with the proliferation of Web services, involves the need for large-scale identity management systems. Cyber security is a matter of great importance that concerns governments, companies and users. Because of the many disadvantages of current password-based authentication systems, such as loss, theft or hacking, systems based on biometric authentication are potential substitutes for them.
Keystroke dynamics has become an active area of research because of its low cost and easy integration into devices. Being a behavioural biometric trait, its data and results are highly variable. Score normalization has proved a useful technique to mitigate these effects.
This work studies the usefulness of score normalization in a system based on keystroke dynamics. It assesses different algorithms on two different databases, one developed in-house and the other publicly available. It studies the impact of normalization on each database and evaluates the performance of each algorithm.
Finally, potential targets to improve the results and future research lines are presented.
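
As an illustration of the kind of score normalization studied, the sketch below applies z-norm, standardising the scores of a claimed user with statistics estimated against an impostor cohort; the cohort and the score convention (distance-like, lower is more genuine) are assumptions made for the example.

```python
# Minimal sketch of z-norm score normalisation for a keystroke-dynamics
# verifier: raw scores for a claimed user are standardised with the mean and
# standard deviation of that user's scores against an impostor cohort.
import numpy as np

def znorm(raw_scores, cohort_scores):
    """raw_scores: scores of the trials to normalise (claimed identity fixed).
    cohort_scores: scores of the same user model against an impostor cohort."""
    mu, sigma = np.mean(cohort_scores), np.std(cohort_scores) + 1e-12
    return (np.asarray(raw_scores) - mu) / sigma

rng = np.random.default_rng(0)
cohort = rng.normal(2.0, 0.5, 200)          # impostor cohort scores (toy)
genuine = rng.normal(1.0, 0.3, 10)          # lower distance = more genuine
print(znorm(genuine, cohort))               # genuine trials become clearly negative
```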

Comparación de Algoritmos de Cálculo de Ratios de Verosimilitudes para Interpretación Forense[Go to top]

Presenter:

Juan Maroñas

Date of presentation:

June 08, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

This Bachelor's thesis (TFG) compares several algorithms for computing likelihood ratios (LR) in an environment of real forensic cases. The application framework is the automatic speaker recognition systems used in real forensic casework. LRs are currently the recommendation for issuing evaluative conclusions in forensic reports in Europe.

In the first part of the TFG, three popular methods for computing LRs from the scores of biometric systems are used: logistic regression (LogReg), maximum-likelihood Gaussian modelling (Gauss-ML) and Gaussian Kernel Density Functions (KDF). The comparison of these methods is carried out in the context of the NIST 2012 speaker recognition evaluation (NIST SRE 2012), which presents several realistic conversational telephone and microphone speech scenarios. The analysis highlights typical problems in this kind of environment, such as database mismatch, model overfitting, etc.

In the second part of the TFG, the use of Bayesian Gaussian modelling (Gauss-Bayes), recently proposed as an alternative to the three methods analysed above, is introduced. It is concluded that, when a large amount of model training data is available, this method is equivalent to Gauss-ML. However, when training data are scarce, Gauss-Bayes widely outperforms Gauss-ML.

Afterwards, the TFG presents an original applied contribution: a real forensic-science scenario is proposed in which the propositions defined in the case lead to a considerable lack of training scores for the models. This situation is very common in forensic science, as described. It is shown that the Gauss-Bayes method largely outperforms Gauss-ML in this scenario, yielding robust and coherent LR computations. Moreover, two schemes for computing training scores (also called anchoring schemes) that may be suitable for use in real forensic cases are proposed, and the performance of both schemes is presented, confirming that Gauss-Bayes is the best option in those scenarios. The results of this section are relevant and are expected to be submitted for publication after the TFG is completed.
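
As an illustration of one of the compared methods, the sketch below computes log-likelihood ratios from scores with maximum-likelihood Gaussian modelling (Gauss-ML): one Gaussian fitted to same-source training scores and one to different-source scores. The training scores are synthetic, and the sketch ignores the calibration and data-scarcity issues that motivate Gauss-Bayes.

```python
# Minimal sketch of score-to-LR computation with Gaussian maximum-likelihood
# modelling (Gauss-ML): fit one Gaussian to same-source training scores and one
# to different-source scores, and evaluate the density ratio for a new score.
import numpy as np
from scipy.stats import norm

def gauss_ml_llr(score, same_scores, diff_scores):
    p_same = norm(np.mean(same_scores), np.std(same_scores))
    p_diff = norm(np.mean(diff_scores), np.std(diff_scores))
    return p_same.logpdf(score) - p_diff.logpdf(score)   # natural-log LR

rng = np.random.default_rng(0)
same = rng.normal(3.0, 1.0, 500)           # toy same-source training scores
diff = rng.normal(0.0, 1.0, 5000)          # toy different-source training scores
print(gauss_ml_llr(2.5, same, diff))       # positive: supports the same-source proposition
print(gauss_ml_llr(-1.0, same, diff))      # negative: supports the different-source proposition
```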

Epistemology applied to conclusions of expert reports[Go to top]

Presenter:

José Juan Lucena-Molina

Date of presentation:

June 01, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

It is believed that building a robust reasoning logic to make probabilistic inferences in forensic science from a merely mathematical or logical viewpoint is not enough. Mathematical logic is the positive science of reasoning and, as such, it is only interested in the positive calculus of its validity, regardless of any prior ontological assumption. But without a determined ontology and epistemology, which imply defining the concepts to be used, it seems difficult that the proposed scientifically correct mathematical solution will succeed as a European standard for drawing conclusions in forensic reports, because it has to be based on judicial language.

Forensic experts and Courts aren’t interested in the development of a positive science but in a practical science: in clarifying whether certain known facts are related to a possible crime. Therefore, not only the coherence of the demonstrative logic reasoning used (logic of propositions) is important, but also the precision of the concepts used by language and consistency among them in reasoning (logic of concepts).

There is a linguistic level essential for successful communication between the forensic practitioner and the Court, which is mainly related, in our opinion, to semantics and figures of speech. The first is involved because words used in forensic conclusions often have different meanings (they are said to be polysemic), and the second because there is often metonymy as well. Besides, semantic differences among languages regarding words with the same etymological root add another difficulty to a better mutual understanding.

The two main European judicial systems inherit a wide and deep culture related to evidence in criminal proceedings, and each of them has coined its own terminology; but there are two other, more abstract levels, the logical and the epistemological, where we can find solid arguments by which the terms used at the legal level in the conclusions of forensic reports could be accurate and consistent for all users of an intended EU guideline. An effort has been made to elucidate the following terms: truth, certainty, uncertainty, opinion, conjecture, probability, evidence, belief, credibility, determinism, indeterminacy, cause, principle, condition, and occasion.

Segmentación de audio mediante características cromáticas en ficheros de noticias[Go to top]

Presenter:

Elena Gómez Rincón

Date of presentation:

May 25, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

This project focuses on the implementation and analysis of several feature extraction techniques for audio segmentation in broadcast news files. Two types of features are distinguished, timbre and chroma features, although the main objective is to investigate in greater depth the use of the latter, whose introduction is motivated by a potentially greater ability to distinguish between speech and music.
In the first phase of the project a system based on MFCC-SDC timbre features is optimized which, besides serving as a reference system, allowed the ATVS group to participate in the Albayzín 2014 audio segmentation evaluation. This system is based on three GMM-UBM detectors, each designed to detect the presence of one of the acoustic classes considered in the evaluation: speech, music and noise. The database provided by the organization for the development of the segmentation systems has also served as the experimental framework to develop and evaluate the new feature extraction techniques proposed in this project.
In the second phase, the project has focused on the use of chroma features for audio segmentation. First, a previous audio segmentation system based on statistics of the chroma entropy has been updated and adapted, comparing its performance with the system based on timbre features (MFCC-SDC) in the same experimental framework (development database of the Albayzín 2014 evaluation). Subsequently, two harmonic feature extractors have been implemented: one based on grouping the energy by subbands and the other on grouping by octaves. Both features have been used jointly to develop a new audio segmentation system, evaluated on the same experimental framework (Albayzín 2014).
Finally, the complementarity of the audio segmentation systems based on the different types of features has been analysed by combining them both at the feature level and at the score level.

Mejora de la robustez frente al ruido en un sistema de búsqueda rápida de audio en audio[Go to top]

Presenter:

Andrés Martín López

Date of presentation:

May 11, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

Starting from an initial audio-in-audio search system with good behaviour under optimal noise conditions, improvements have been introduced to increase the robustness of the system under realistic noise conditions, in which the behaviour of the initial system was very limited. Feature-domain robustness techniques have been introduced, including Cepstral Mean and Variance Normalization (CMVN) at several levels, which has improved the results but has not been sufficient to solve other situations in which the noise is clearly dominant. For these other situations, novel robustness techniques have been developed based on the analysis of the temporal trajectories defined by the detections, which not only improve the results but also reliably detect the situations in which the searches are not correct, making it possible to solve these situations by using a larger amount of audio for the search.
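
As an illustration of the feature-domain techniques mentioned, the sketch below applies per-utterance Cepstral Mean and Variance Normalization (CMVN) to a matrix of cepstral features; the feature dimensions are arbitrary.

```python
# Minimal sketch of Cepstral Mean and Variance Normalization (CMVN), one of
# the feature-domain robustness techniques mentioned above, applied per
# utterance over a matrix of cepstral features (frames x coefficients).
import numpy as np

def cmvn(features, eps=1e-12):
    mu = features.mean(axis=0, keepdims=True)
    sigma = features.std(axis=0, keepdims=True) + eps
    return (features - mu) / sigma

# Toy usage: a constant channel offset added to all frames is removed by CMVN.
rng = np.random.default_rng(0)
feats = rng.standard_normal((300, 13))
print(np.allclose(cmvn(feats), cmvn(feats + 5.0)))   # True
```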

Fingerprint Recognition for Forensic Applications[Go to top]

Presenter:

Ram P. Krish

Date of presentation:

May 04, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

The latent fingerprints obtained from crime scenes are highly partial in nature and also of poor quality. Identifying a partial latent fingerprint in a huge criminal database using automated fingerprint matchers is a challenging problem. In this talk, we will focus on methods to improve the identification accuracy of fingerprint matchers in this forensic scenario. Many automated fingerprint matchers assume approximately the same size of the minutiae sets under comparison for best performance. We propose a method to reduce the minutiae search space of the full fingerprint roughly to the size of the partial fingerprint by registering the orientation fields of an input partial fingerprint and a full good-quality fingerprint (usually the case in forensic casework), thereby increasing the identification accuracy of the whole system.

Automated Fingerprint Identification Systems (AFIS) commonly use typical minutiae features such as ridge-endings and bifurcations for matching. We also propose a method to improve the identification accuracies of minutiae-based matchers by incorporating extended fingerprint feature sets (rare minutiae features). We also propose a robust evidence evaluation framework based on likelihood ratio which uses the AFIS scores from the method proposed to incorporate extended fingerprint feature sets.

Biometric Template Protection based on Bloom Filters: Finger Vein and Unlinkability[Go to top]

Presenter:

Marta Gómez-Barrero

Date of presentation:

April 27, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

Biometric data are considered sensitive personal data and any privacy leakage poses severe security risks. Biometric templates should hence be protected, obscuring the biometric signal in a non-reversible manner, while preserving the unprotected system's performance. Bloom filters have shown their capabilities for protecting face-, iris- and fingerprint-based templates while maintaining the verification performance. During the present research internship, we will test the feasibility of protecting finger vein templates and fusing several modalities at feature level, in order to further improve the security provided, the irreversibility of the templates and the verification performance. A general method is also presented in order to achieve the second requirement of biometric template protection schemes: unlinkability.

Mejora de algoritmos de reconocimiento de huellas dactilares en entornos forenses[Go to top]

Presenter:

Fátima García Donday

Date of presentation:

April 22, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

This project focuses on the use of fingerprints for person recognition. Building on their great discriminative power as a biometric trait, this final-year project (PFC) develops improvements to a biometric system for automatic feature extraction.
To run tests and experiments on the improvements made to the system, a database of real forensic cases has been acquired in collaboration with the Dirección General de la Guardia Civil, as part of a grant awarded by the ATVS biometric research laboratory. This database, which existed on paper in the fingerprint (Lofoscopia) laboratory of the Criminalistics department, has been digitized and converted into a convenient format for later use.
Additionally, in collaboration with other final-year projects, two tools have been developed for the automatic marking of minutiae on the fingerprint images and for the computation of likelihood ratios (LR). Tests have been carried out on these tools for error detection and correction. Both have been used in real cases by the DGGC.
The improvement made to the system consists of removing false minutiae detected outside the fingerprint region, or region of interest (ROI). To remove them, the images in the database are segmented in order to separate the fingerprint from the background. The segmentation is based on Gabor filters. After the ROI is identified, the false minutiae are removed.
The test and experiment plan makes use of software based on cylinder-code algorithms. It analyses the results of the biometric system before and after the ROI extraction, which show a notable improvement in the recognition process.
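
The sketch below illustrates the general idea of a Gabor-energy foreground mask used to discard minutiae outside the region of interest; the filter parameters, the threshold and the minutiae filtering rule are illustrative, not the exact procedure of the project.

```python
# Minimal sketch of Gabor-based segmentation of the fingerprint region of
# interest (ROI): pixels with high Gabor energy across orientations are kept
# as foreground, and minutiae outside the mask are discarded. Illustrative only.
import numpy as np
from skimage.filters import gabor

def roi_mask(image, frequency=0.12, n_orientations=4, threshold=0.05):
    energy = np.zeros_like(image, dtype=float)
    for k in range(n_orientations):
        real, imag = gabor(image, frequency=frequency,
                           theta=k * np.pi / n_orientations)
        energy += real ** 2 + imag ** 2          # accumulate response energy
    return energy > threshold * energy.max()

def filter_minutiae(minutiae, mask):
    """minutiae: list of (row, col, angle); keep only those inside the ROI."""
    return [m for m in minutiae if mask[int(m[0]), int(m[1])]]

img = np.random.default_rng(0).random((128, 128))    # stand-in for a print image
mask = roi_mask(img)
print(len(filter_minutiae([(10, 10, 0.3), (120, 5, 1.1)], mask)))
```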

International Workshop on Biometrics and Forensics, IWBF 2015: Scientific Report[Go to top]

Presenter:

Daniel Ramos

Date of presentation:

March 23, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

This seminar is in fact an informal description of the International Workshop on Biometrics and Forensics 2015, held in Gjøvik, Norway, on 3 and 4 March 2015. We will briefly describe the conference and give an insight into the contributions that may currently be of interest to ATVS.

Web-Based Biometric Recognition Using Keystroke Dynamics[Go to top]

Presenter:

Mario Falanga

Date of presentation:

February 19, 2015

Presentation place:

C-109, 13:00-14:00

Abstract:

This work analyses the discriminative ability of keystroke dynamics based on the biometric signature of subjects acquired when they type their personal data (e.g. name, surname, email, nationality and ID number) on a web form. The contributions are summarized below: 1) a data acquisition platform based on keystroke dynamics; 2) a novel database including keystroke dynamics from 53 users; 3) a deep analysis of the performance of the most popular features and classifiers for keystroke dynamics recognition in the proposed environment.

This work includes the analysis of the performance of 4 popular keystroke dynamics classification algorithms: Normalized Distance, Nearest Neighbour Manhattan Distance, Nearest Neighbour Manhattan + Mahalanobis Distance, and Manhattan Scaled classifiers. The best result is achieved by the Normalized Distance classifier and the combination of five keystroke signatures (name, surname, email, nationality and ID number), with a mean Equal Error Rate of 2.9%. These results encourage further research in this area, with several potential applications.
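
As an illustration of one family of the evaluated classifiers, the sketch below implements a scaled Manhattan distance: enrollment timing vectors define per-feature means and mean absolute deviations, and a test vector is scored by the scaled L1 distance. The feature dimensionality and the toy data are assumptions made for the example.

```python
# Minimal sketch of a (scaled) Manhattan distance classifier for keystroke
# dynamics: enrollment timing vectors define a per-feature mean and mean
# absolute deviation; a test vector is scored by the scaled L1 distance.
import numpy as np

def train(enroll):
    """enroll: (n_samples, n_features) of hold / inter-key times (toy data)."""
    mean = enroll.mean(axis=0)
    mad = np.mean(np.abs(enroll - mean), axis=0) + 1e-12
    return mean, mad

def score(test, model):
    mean, mad = model
    return np.sum(np.abs(test - mean) / mad)     # lower = more likely genuine

rng = np.random.default_rng(0)
user = rng.normal(0.2, 0.03, (10, 20))           # 10 enrollment samples
model = train(user)
print(score(rng.normal(0.2, 0.03, 20), model))   # genuine-like attempt
print(score(rng.normal(0.3, 0.05, 20), model))   # impostor-like attempt
```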

Short Bio:

Mario Falanga is an MSc student in Computer Science and Engineering at the Polytechnic School (Scuola Politecnica e delle scienze di base) of the University of Naples 'Federico II', where he also received his Bachelor's Degree defending a thesis on "Opinion Analysis from web sources". Since September 2014, he has been working on his thesis at UAM (Universidad Autonoma de Madrid) on the topic of behavioral biometrics related to human-computer interactions (mouse, keystroke, etc.) within the Biometric Recognition Group - ATVS. The title of his thesis is "Web-Based Biometric Recognition Using Keystroke Dynamics". The work was carried out under the supervision of Prof. Carlo Sansone, Prof. Javier Ortega-Garcia and Dr. Aythami Morales Moreno.

Desarrollo de herramientas de apoyo para comparación forense de firmas manuscritas[Go to top]

Presenter:

Alejandro Acién Ayala

Date of presentation:

February 09, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

This project studies and develops a software application that includes forensic support tools for the comparison of handwritten signatures captured with digital devices. Specifically, tools are designed with functionalities that exploit the information captured by such devices (e.g. pressure, velocity, inclination, etc.), providing great assistance to forensic experts when carrying out expert comparisons.

A multi-modal eye biometrics[Go to top]

Presenter:

Abhijit Das  (Associate Professor, Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur)

Date of presentation:

January 13, 2015

Presentation place:

C-109, 15:00-16:00

Abstract:

A security breach due to mis-identification of an individual poses one of the greatest threats in today's world. Biometrics is a technology that can substantially minimise this threat. Traditional biometric systems, however, can neither be applied universally nor are they sufficiently robust to changes in acquisition and environmental conditions, and they are also not equipped to distinguish fake from real data.
In order to mitigate the above-mentioned disadvantages and to overcome the addressed challenges, the proposed research was conceived. Among the biometric traits, eye traits are considered a good choice. The eye offers a wide range of unique biometric characteristics and greater stability than other biometric traits. The combination of the iris with the sclera pattern is one of the most reliable and user-friendly ocular biometrics.
However, in order to establish the proposed concept of liveness-based multi-modal eye biometrics, combining both iris and sclera, it is first necessary to assess whether sufficient discriminatory information for biometric identification can be gained from the sclera patterns individually as well as in combination with the iris patterns. It is also important to investigate its adaptability with respect to changes in environmental conditions, population, data acquisition techniques and time span. To date, sclera biometrics has not been extensively studied, and so the related literature is still in its infancy.
Consequently, the first part of this research concentrates on designing an image processing and pattern recognition module for evaluating the potential of the sclera biometric with regard to accuracy and its adaptability to changes in environmental conditions, as well as evaluating its performance in combination with the iris pattern; the latter half of the research addresses the liveness issue of the mentioned biometrics.

2014

IberSPEECH 2014: Overview and Selected Papers[Go to top]

Presenter:

Alicia Lozano and Doroteo Torre

Date of presentation:

December 15, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

This talk will consist of two parts. Firstly, an overview of the conference program will be presented. Then, some selected papers will be briefly explained. These papers have been chosen according to research lines that are of special interest within the ATVS group, such as improvements of Probabilistic Linear Discriminant Analysis (PLDA) techniques for speaker recognition tasks, or Deep Neural Networks (DNNs) applied to speech modeling.

Avances en biometría de firma dinámica: habilidades humanas para el reconocimiento y nuevas características discriminantes[Go to top]

Presenter:

Derling Morocho

Date of presentation:

December 10, 2014

Presentation place:

C-109, 15:30-16:30

Abstract:

In this presentation the speaker will outline his current PhD thesis proposal.
On the one hand, the goal is to develop a software infrastructure that allows massive data collection through Amazon Mechanical Turk regarding the abilities of non-expert humans to recognise handwritten signatures at different levels: with only the signature image, when partial dynamic information is considered, and when complete dynamic information is considered. This development will be used both to analyse human abilities in this task and to generate new discriminative features that improve current biometric systems for dynamic handwritten signature comparison.
On the other hand, the procedures and features used by forensic experts in the comparison of handwritten signatures, both static and dynamic, will also be analysed, again in order to generate new discriminative features that improve current biometric systems.

ICPR 2014: Overview and selected works on biometrics[Go to top]

Presenter:

Julian Fierrez, Ram P. Krish, Marta Gómez Barrero, Ester Gonzalez

Date of presentation:

November 17, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

The seminar will begin with a short overview of the conference program, with emphasis on the plenary talks. This will be followed by a short summary of selected works related to PhD research currently being conducted in the ATVS lab, namely: advances in fingerprint recognition, unconstrained face recognition (including the use of LBP features and quality measures), and direct attacks to iris biometrics.

Construcción de una nueva base de datos para el reconocimiento automático de caracteres manuscritos y generación de resultados de referencia[Go to top]

Presenter:

Sara García Mina

Date of presentation:

October 29, 2014

Presentation place:

C-109, 15:30-16:30

Abstract:

In this project a database of handwritten characters of the Spanish alphabet is acquired, complementary to the MNIST database. The database will be publicly available for the research community.

After an introduction to handwritten character recognition and a study of the state of the art in offline recognition and reference databases, the most representative tools are chosen to carry out the project.

The database is not only acquired but also processed, using image processing tools to normalize it. The aim is to obtain a large database and baseline results with state-of-the-art handwritten character recognition systems.

In the experimental section a set of recognition results was generated for the acquired database. The results will be used as a reference to evaluate the performance of future handwritten character recognition algorithms.

Different experiments were performed, depending on the kind of handwritten characters used in each experiment. The different kinds are:

Numbers
Capital letters
Lowercase letters

The results gathered for the experimental section are the error rates of each experiment as well as the confusion matrix and the CMC curves, which will be explained, analyzed and commented on. To complete the experimental section, a comparison between the different experiments is performed. Finally, conclusions and future work are presented.

Forensic comparison of ink color and paper structure[Go to top]

Presenter:

Charles Berger  (Chief Scientist at the NFI (Netherlands Forensic Institute))

Date of presentation:

October 28, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

This talk addresses the inference of identity of source in forensic comparison.
In particular the comparison of ink color and paper structure are discussed.
Univariate and bivariate methods are used to assign likelihood ratios.

Short Bio:

Charles Berger was appointed to the chair of Criminalistics at Leiden University in 2011. His field of interest is the logically correct interpretation and use of forensic evidence in police investigations and court proceedings. His research focuses particularly on objective methods for the analysis of evidence and the assignment of (numerical) evidential values. Professor Berger holds a PhD in Applied Physics from the University of Twente, and has extensive international experience, having worked in the USA (Naval Research Labs and the University of California) and France (the University of Bordeaux and the École Normale Supérieure, Paris).

Adaptación de un Sistema de Búsqueda de Palabras Clave al Castellano[Go to top]

Presenter:

Junchen Xu "Chen"

Date of presentation:

October 13, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

La proliferación de contenidos multimedia hoy en día, en especial archivos de audio-vídeo, lleva consigo una nueva necesidad a la hora de localizar piezas específicas de este tipo a través de su contenido. Este proyecto se centra en adaptar al castellano un sistema de búsqueda de palabras clave, herramienta que se presenta como una posible solución a la problemática anterior, entre otros usos.
El sistema completo lo forman dos subsistemas bien diferenciados que realizan las tareas de reconocimiento de voz y reconocimiento de palabras clave, respectivamente. El entrenamiento y desarrollo del sistema se realiza con una base de datos compuesta por un gran volumen de conversaciones telefónicas entre locutores hispanohablantes.
Para comprobar su rendimiento, además de ser sometido a las pertinentes pruebas a nivel interno, el sistema participará también en la evaluación “ALBAYZIN 2014 Search on Speech”.

Estudio de interoperabilidad en sistemas biométricos de firma manuscrita dinámica[Go to top]

Presenter:

Rubén Tolosana Moranchel

Date of presentation:

October 06, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

En este proyecto se estudian, implementan y evalúan sistemas de reconocimiento biométrico de firma dinámica en presencia de firmas procedentes de distintos dispositivos de captura. Para llevarlo a cabo se han utilizado y comparado diversas técnicas del estado del arte en reconocimiento de firma. A su vez se ha realizado un estudio de las diversas técnicas de normalización de datos usadas en el ámbito de reconocimiento biométrico para conseguir un sistema robusto independientemente del dispositivo de captura utilizado para entrenar o testear el sistema.

Como punto de partida del proyecto se ha realizado un estudio de las diferentes técnicas que han ido marcando el estado del arte, haciendo especial hincapié en los sistemas basados en características globales y en los sistemas basados en características locales o funciones temporales.

Una vez entendido el estado del arte desde el punto de vista teórico, el siguiente paso ha sido definir la tarea sobre la que se han evaluado las diferentes técnicas. Históricamente, la tarea principal en evaluaciones de firma dinámica ha consistido en entrenar y testear el sistema con firmas obtenidas de un mismo dispositivo de captura (sin interoperabilidad). En la tarea que hemos llevado a cabo para la realización de este proyecto disponemos de firmas de un mismo usuario obtenidas con distintos dispositivos de captura.

Para la parte experimental se han llevado a cabo tres etapas. Durante la primera etapa el objetivo fue evaluar el rendimiento del sistema de verificación de firma dinámica con y sin interoperabilidad siguiendo el protocolo de las evaluaciones Biosecure Multimodal Evaluation Campaign (BMEC). En la segunda etapa se estudió y se aplicó al sistema con interoperabilidad técnicas de normalización presentes en el ámbito de reconocimiento biométrico con el objetivo de conseguir un rendimiento lo más parecido posible al sistema sin interoperabilidad. En la última etapa se ha aplicado técnicas de selección y fusión de características para obtener un sistema global robusto ante firmas de test provenientes de distintos dispositivos de captura.

Finalmente, se presentan las conclusiones extraídas a lo largo de este trabajo, así como las posibles líneas de trabajo futuro.

Towards predicting good users for biometric recognition based on keystroke dynamics[Go to top]

Presenter:

Aythami Morales Moreno

Date of presentation:

September 22, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

This paper studies ways to detect good users for biometric recognition based on keystroke dynamics. Keystroke dynamics is an active research field for the biometric scientific community. Despite the great efforts made during the last decades, the performance of keystroke dynamics recognition systems is far from the performance achieved by traditional hard biometrics. This is very pronounced for some users, who generate many recognition errors even with the most sophisticated recognition algorithms. On the other hand, previous works have demonstrated that some other users behave particularly well even with the simplest recognition algorithms. Our purpose here is to study ways to distinguish such classes of users using only the genuine enrollment data. The experiments comprise a public database and two popular recognition algorithms. The results show the effectiveness of the Kullback-Leibler divergence as a quality measure to categorize users, in comparison with four other statistical measures.
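
One plausible way (not necessarily the paper's exact protocol) to use the Kullback-Leibler divergence as a quality measure computed from genuine enrollment data alone is sketched below: a diagonal Gaussian fitted to a user's enrollment samples is compared against a background model, under the assumption that users farther from the population are more distinctive and hence easier to recognise. The background model, feature dimensionality and toy data are assumptions for the example.

```python
# Minimal sketch (an illustrative reading, not necessarily the paper's exact
# protocol): fit a diagonal Gaussian to a user's genuine enrollment timing
# vectors and measure its symmetric KL divergence from a background model
# built on pooled data of many users.
import numpy as np

def kl_diag_gauss(mu0, var0, mu1, var1):
    """KL( N(mu0,var0) || N(mu1,var1) ), diagonal covariances."""
    return 0.5 * np.sum(np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

def divergence_from_background(enroll, bg_mean, bg_var):
    mu, var = enroll.mean(0), enroll.var(0) + 1e-6
    return 0.5 * (kl_diag_gauss(mu, var, bg_mean, bg_var) +
                  kl_diag_gauss(bg_mean, bg_var, mu, var))

rng = np.random.default_rng(0)
background = rng.normal(0.20, 0.05, (2000, 15))            # pooled population data
bg_mean, bg_var = background.mean(0), background.var(0)
typical = rng.normal(0.20, 0.05, (10, 15))                 # user close to the population
distinct = rng.normal(0.35, 0.02, (10, 15))                # user far from the population
print(divergence_from_background(typical, bg_mean, bg_var))
print(divergence_from_background(distinct, bg_mean, bg_var))   # much larger
```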

Odyssey 2014 Greatest Hits[Go to top]

Presenter:

Alicia Lozano, Javier Franco and Rubén Zazo

Date of presentation:

September 08, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

Each speaker will present two relevant contributions presented in June 2014 in Joensuu (Finland) during Odyssey 2014, covering the main novelties in the areas of domain adaptation, the i-vector challenge & clustering, trial-based calibration, performance in forensic conditions, text-dependent recognition and Deep Neural Networks.

Graphical Password-based User Authentication with Free-Form Doodles[Go to top]

Presenter:

Marcos Martinez Díaz

Date of presentation:

September 03, 2014

Presentation place:

C-109, 13:00

Abstract:

This session is a preview of what will be presented by the author at the EAB 2014 award ceremony. We analyze doodle-based graphical passwords, as a behavioral biometric. Verification schemes based on Dynamic Time Warping and Gaussian Mixture Models are proposed. The best performing features are studied using feature selection. Experiments are carried out using the recently presented DooDB database. Results show varying performances depending on the verification scheme, and indicate a possible higher consistency in features related to vertical movements.

Evaluación de características musicales para detección de tipos de audio[Go to top]

Presenter:

Ricardo Landriz

Date of presentation:

September 02, 2014

Presentation place:

C-109, 15:30-16:30

Abstract:

Automatic segmentation of audiovisual content has become a growing research field in recent years. Technologies such as audio segmentation are gaining weight in international evaluations such as ISMIR or MIREX.
The goal of this study is to develop a system capable of segmenting broadcast radio audio into different acoustic classes (speech, music, noise...) using musical features derived from the analysis commonly used in speech recognition and speaker identification. A fusion with the system submitted by the ATVS group of Escuela Politécnica Superior, UAM, to the 2010 ALBAYZIN Audio Segmentation evaluation is also proposed.

Laboratorio de Tecnologías de Audio[Go to top]

Presenter:

Sergio Carrero

Date of presentation:

July 17, 2014

Presentation place:

C-109, July 17, 16:15-16:45

Abstract:

In this project several laboratory assignments for the course Tecnologías del Audio (4th year of the ITST degree at EPS-UAM) are implemented, with the aim of developing teaching material for students in the field of musical audio processing. The three main blocks developed throughout the project are:
* Subtractive synthesis: a subtractive synthesizer with a graphical interface is analysed, studied and implemented in Matlab.
* Granular synthesis: the concepts of grain-based synthesis are developed in order to implement a granular synthesizer in Python, using an API called EarSketch together with the Cockos Reaper Digital Audio Workstation.
* Digital audio effects: a series of audio processing effects for music production is developed in Matlab, analysing their behaviour in depth.

Calibración de puntuaciones procedentes de sistemas biométricos[Go to top]

Presenter:

Sandra Uceda

Date of presentation:

July 16, 2014

Presentation place:

C-109, 15:30-16:30

Abstract:

The project is based on the computation of likelihood ratios in order to provide a weight, or degree of support, for the final decision of the judge. The aim is to avoid categorical statements, so as not to fall into errors such as declaring an innocent person guilty, or vice versa. To this end, the performance of LRs computed with three different models is studied: Gaussian, logistic regression and KDF. The robustness, calibration and accuracy of the proposed methods have been analysed, giving different recommendations for the various possible scenarios.

For this project, samples from fingerprints of real forensic cases have been used, from which a database has been built.

Comparison of Body Shape Descriptors for Biometric Recognition using MMW Images[Go to top]

Presenter:

Ester González Sosa

Date of presentation:

July 14, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

The use of millimetre wave images has been proposed recently in the biometric field to overcome certain limitations of images acquired at visible frequencies. In this paper, several body shape-based techniques were applied to model the silhouette of images of people acquired at 94 GHz. We put forward several methods for the parameterization and classification stages with the objective of finding the best configuration in terms of biometric recognition performance. Contour coordinates, shape contexts, Fourier descriptors and silhouette landmarks were used as feature approaches, and for classification we utilized the Euclidean distance and a dynamic programming method. Results showed that the dynamic programming algorithm improved the performance of the system with respect to the Euclidean distance baseline, and that a minimum contour resolution is necessary to achieve promising equal error rates. The contour coordinates are the most suitable feature for the system in terms of performance and computational cost when at least 3 images are available for model training. Besides, Fourier descriptors are more robust against rotations, which may be of interest when dealing with few training images.
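
As an illustration of one of the compared features, the sketch below computes magnitude-normalised Fourier descriptors of a closed silhouette contour, which is what makes them robust to rotation as noted above; the normalisation choices and toy contour are illustrative.

```python
# Minimal sketch of Fourier descriptors for a closed body-silhouette contour:
# the (x, y) boundary points are treated as complex numbers and the Fourier
# coefficient magnitudes are normalised so the descriptor is invariant to
# translation, scale, starting point and rotation.
import numpy as np

def fourier_descriptors(contour, n_coeffs=16):
    """contour: (N, 2) array of ordered boundary points."""
    z = contour[:, 0] + 1j * contour[:, 1]
    coeffs = np.fft.fft(z)
    coeffs[0] = 0.0                          # drop DC term -> translation invariance
    mags = np.abs(coeffs)                    # magnitudes -> rotation/start invariance
    mags /= (mags[1] + 1e-12)                # divide by first harmonic -> scale invariance
    return mags[1:n_coeffs + 1]

# Toy usage: an ellipse and its rotated copy yield (almost) the same descriptor.
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
ellipse = np.c_[3 * np.cos(t), np.sin(t)]
angle = 0.7
rot = np.array([[np.cos(angle), -np.sin(angle)], [np.sin(angle), np.cos(angle)]])
print(np.allclose(fourier_descriptors(ellipse),
                  fourier_descriptors(ellipse @ rot.T), atol=1e-6))   # True
```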

Clasificación de géneros musicales basada en contenido[Go to top]

Presenter:

Ángel Pérez Lemonche

Date of presentation:

July 10, 2014

Presentation place:

C-109, 13:00-14:00

Abstract:

Content-based recognition of musical genres is a classic task in the area known as Music Information Retrieval (MIR). Knowing the genre of a song from the musical information it provides would help with problems such as automatic indexing of musical content or recommendation systems.

In this Bachelor's thesis a musical genre classifier based on timbre information is built from scratch. An MFCC-based feature extractor and GMM-based modelling are used. The performance of this classifier is analysed while varying parameters and features such as the use of a UBM, score normalization, deltas, frequency-based features (zero-crossing rate), etc. All of this is done with a public and accessible database (Marsyas) to allow reproducibility of the results.
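
A minimal sketch of the described pipeline (MFCC features, one GMM per genre, classification by average log-likelihood) is shown below, using librosa and scikit-learn on synthetic audio; it omits the UBM, deltas and score normalization variants explored in the work.

```python
# Minimal sketch of a timbre-based genre classifier: MFCC features per track,
# one GMM per genre, and classification by the highest average log-likelihood.
# Uses librosa and scikit-learn; the audio here is synthetic noise.
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_feats(y, sr=22050, n_mfcc=13):
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T   # (frames, n_mfcc)

def train_genre_models(tracks_by_genre, n_components=8):
    models = {}
    for genre, tracks in tracks_by_genre.items():
        feats = np.vstack([mfcc_feats(y) for y in tracks])
        models[genre] = GaussianMixture(n_components, covariance_type='diag',
                                        random_state=0).fit(feats)
    return models

def classify(y, models):
    feats = mfcc_feats(y)
    return max(models, key=lambda g: models[g].score(feats))   # mean log-likelihood

rng = np.random.default_rng(0)
toy = {'rock': [rng.standard_normal(22050 * 2) for _ in range(3)],
       'jazz': [0.1 * rng.standard_normal(22050 * 2) for _ in range(3)]}
models = train_genre_models(toy)
print(classify(toy['rock'][0], models))
```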

Protected Facial Biometric Templates Based on Local Gabor Patterns and Adaptive Bloom Filters[Go to top]

Presenter:

Marta Gómez Barrero

Date of presentation:

July 07, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

Biometric data are considered sensitive personal data and any privacy leakage poses severe security risks. Biometric templates should hence be protected, obscuring the biometric signal in a non-reversible manner, while preserving the unprotected system’s performance. In the present work, irreversible face templates based on adaptive Bloom filters are proposed. Experiments are carried out on the publicly available BioSecure DB utilizing the free Bob image processing toolbox, so that research is fully reproducible. The performance and security evaluations prove the irreversibility of the protected templates, while preserving the verification performance. Furthermore, template size is considerably reduced.

What are we missing with i-vectors? A perceptual analysis of i-vector-based falsely accepted trials[Go to top]

Presenter:

Joaquín González-Rodríguez

Date of presentation:

June 30, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

Speaker comparison, as stressed by the current NIST i-vector Machine Learning Challenge where the speech signals are not available, can be effectively performed through pattern recognition algorithms comparing compact representations of the speaker identity information in a given utterance. However, this i-vector representation ignores relevant segmental (non-cepstral) and supra-segmental speaker information present in the original speech signal that could significantly improve the decision making process. In order to confirm this hypothesis in the context of NIST SRE trials, two experienced phoneticians have performed a detailed perceptual and instrumental analysis of 18 i-vector-based falsely accepted trials from NIST HASR 2010 and SRE 2010, trying to find noticeable differences between the two utterances in each given trial. Remarkable differences were obtained in all trials under detailed analysis, where the combinations of observed differences vary for every trial as expected, showing especially significant differences in voice quality (creakiness, breathiness, etc.), rhythmic and tonal features, and pronunciation patterns, some of them compatible with possible variations across recording sessions and others highly incompatible with the same-speaker hypothesis. The results of this analysis suggest the interest in developing banks of non-cepstral segmental and supra-segmental attribute detectors, imitating some of the trained abilities of a non-native phonetician. Those detectors can contribute in a bottom-up decision approach to speaker recognition and provide descriptive information of the different contributions to identity in a given speaker comparison.

Análisis y caracterización de series temporales financieras[Go to top]

Presenter:

Alfredo Serrano

Date of presentation:

June 23, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

The objective of this project is the study of financial time series, addressing topics ranging from characterization and prediction to visualization and synthetic generation. First, an analysis of the most relevant characteristics of financial time series has been carried out. Then, the state of the art in current prediction methods, such as moving averages, has been studied, together with the development of a new method consisting of first applying a PCA-based dimensionality-reduction algorithm and then computing moving averages, with the goal of improving the results of the former.

In addition, a graphical user interface has been implemented to visualise, among other functionalities, the financial series of the available database.

Finally, based on the conclusions drawn on prediction, the effect of applying them to the generation of synthetic series will be studied, using the generator available in the ATVS group, in order to see whether the characteristics of the generated synthetic series become more faithful to those of real financial series.

Extracción de información en señales de voz para el agrupamiento por locutores de locuciones anónimas[Go to top]

Presenter:

Iván Gómez Piris

Date of presentation:

June 09, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

One of the critical points of the variability-compensation technologies (such as LDA and PLDA) commonly used in speaker recognition systems is that they require knowledge of speaker identity labels. Historically, in the NIST SRE (National Institute of Standards and Technology Speaker Recognition Evaluation) campaigns, the thousands of audio files delivered for system development were labelled with the speaker identity. However, such a labelled audio database will not always be available. For the development of new applications that require collecting a large amount of audio, the work involved in labelling all the audio manually can greatly increase the time and cost needed to deploy those applications.
This work focuses on the situation of having audio without speaker identity labels and on extracting them automatically so that variability-compensation techniques can be exploited. The proposed solution is based on obtaining speaker groupings by clustering (taking the i-vectors of the utterances as observations) that allow a correct estimation of the variability subspaces, thus reaching performances similar to those that would be obtained if known speaker labels were available.
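
A minimal sketch of the core idea (length-normalised i-vectors grouped by agglomerative clustering on cosine distance, with cluster indices used as pseudo speaker labels) is given below on synthetic i-vectors; the threshold and linkage are illustrative, and the exact clustering algorithm used in the work may differ.

```python
# Minimal sketch of automatic speaker labelling: length-normalise i-vectors,
# group them with agglomerative clustering on cosine distance, and use the
# cluster indices as pseudo speaker labels for LDA/PLDA training.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def pseudo_speaker_labels(ivectors, distance_threshold=0.6):
    x = ivectors / np.linalg.norm(ivectors, axis=1, keepdims=True)   # length norm
    clusterer = AgglomerativeClustering(n_clusters=None,
                                        distance_threshold=distance_threshold,
                                        metric='cosine', linkage='average')
    # Note: scikit-learn >= 1.2 uses `metric`; older releases used `affinity`.
    return clusterer.fit_predict(x)

# Toy usage: 3 synthetic "speakers", 20 utterances each, 100-dim i-vectors.
rng = np.random.default_rng(0)
centres = rng.standard_normal((3, 100))
ivecs = np.vstack([c + 0.3 * rng.standard_normal((20, 100)) for c in centres])
labels = pseudo_speaker_labels(ivecs)
print(len(set(labels)))          # ideally close to 3
```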

International Workshop on Biometrics and Forensics, IWBF 2014: Scientific Report[Go to top]

Presenter:

Daniel Ramos

Date of presentation:

June 04, 2014

Presentation place:

C-109, 15:30-16:30

Abstract:

This seminar is in fact a description of the International Workshop on Biometrics and Forensics 2014, held in Valletta, Malta, on 27 and 28 March. Firstly, a note is given on the importance of presenting a report on conferences where ATVS is present, in order to foster this kind of presentation in future ATVS seminars. Second, the conference will be described from the scientific point of view, and the contributions that may currently be of interest to ATVS will be highlighted.

Cognitive approach for Signature Generation[Go to top]

Presenter:

Miguel Ángel Ferrer

Date of presentation:

May 28, 2014

Presentation place:

C-109, 15:30-16:30

Abstract:

The talk will describe a procedure for generating synthetic handwritten signature images imitating the mechanism of motor equivalence, which divides human handwriting into two steps: the working out of an effector-independent action plan and its execution via the corresponding neuromuscular path. The proposed method allows generating synthetic signatures containing text and flourish, if there is one; it is also possible to generate forgeries. An ink deposition model, applied pixel by pixel to the pen trajectory, provides realistic static signature images. The lexical and morphological properties of the synthesized signatures, as well as the range of the synthesis parameters, have been estimated from databases of real signatures such as the MCYT Off-line and GPDS960GraySignature corpuses. Performance and perceptual experiments show the realism of the synthesized signatures. The utility of the synthesized signatures is demonstrated by studying the influence of the pen type and the number of users on an automatic signature verifier.

Short Bio:

Miguel A. Ferrer received the M.Sc. degree in telecommunications in 1988 and his Ph.D. degree in 1994, both from the Universidad Politécnica de Madrid, Spain. He belongs to the Digital Signal Processing research group (GPDS) of the research institute for technological development and Communication Innovation (IDeTIC) at the University of Las Palmas de Gran Canaria, Spain, where he has been an Associate Professor since 1990. His research interests lie in the fields of computer vision, pattern recognition, biometrics (mainly those based on hand and handwriting), audio quality (mainly for health and machinery condition evaluation) and vision applications to fisheries and aquaculture.

Deep Neural Networks for Speaker and Language Identification[Go to top]

Presenter:

Ignacio López-Moreno & Javier González-Domínguez

Date of presentation:

May 14, 2014

Presentation place:

C-109, 15:30-16:30

Abstract:

In this talk we present recent Google efforts* in both speaker and language identification by using Deep Neural Networks (DNNs). Motivated by their recent success in acoustic modeling, we adapt DNNs to the problem of identifying the language/speaker of a given spoken utterance from short-term acoustic features. Results on Google 5M LID corpus and NIST LRE 2009 show how language identification can largely benefit from using DNNs (up to 70% of improvement), especially when a large amount of training data is available. Results on speaker verification, while preliminary, also show the benefits of using deep neural networks combined with classical approaches, such as i-vector based systems.

*published in ICASSP 2014 conference:

Automatic Language Identification using Deep Neural Networks
Ignacio Lopez Moreno (Google Inc., USA); Javier Gonzalez-Dominguez (Universidad Autónoma de Madrid, Spain); Oldrich Plchot (Brno University of Technology, Czech Republic); David Martínez González (University of Zaragoza, Spain); Joaquin Gonzalez-Rodriguez (Universidad Autonoma de Madrid, Spain); Pedro J Moreno (Google, Inc., USA)

Deep Neural Networks for Small Footprint Text-Dependent Speaker Verification
Ehsan Variani (Johns Hopkins University, USA); Xin Lei (Google Inc., USA); Erik McDermott (Google Inc., USA); Ignacio Lopez Moreno (Google Inc., USA); Javier Gonzalez-Dominguez (Universidad Autónoma de Madrid, Spain)

Short Bio:

Ignacio Lopez-Moreno received his M.S. degree in Electrical Engineering in 2009 from Universidad Politecnica de Madrid (UPM). He is currently pursuing his PhD degree with the Biometric Recognition Group - ATVS at Universidad Autonoma de Madrid, where he has been working as an assistant researcher since 2004. He has participated in several national projects and technology evaluations, such as the NIST speaker and language recognition evaluations carried out since 2005. His research interests include speaker verification, language identification, pattern recognition, speech processing, statistics and forensic evaluation of the evidence. He has been the recipient of several awards and distinctions, such as the IBM Research Best Student Paper in 2009. After two summer research stays in the Speech Group at Google Research, he has been permanent staff in the Google Speech group in New York since 2012.

Javier Gonzalez-Dominguez received his M.S. degree in Computer Science in 2005 from Universidad Autonoma de Madrid, Spain. In 2005 he joined the Biometric Recognition Group - ATVS at Universidad Autonoma de Madrid (UAM) as a Ph.D. student. In 2007 he obtained the postgraduate Master in Computer Science and Electrical Engineering from UAM and received an FPI research fellowship from the Spanish Ministerio de Educacion y Ciencia. His research interests are focused on robust speaker and language recognition. He has been the recipient of several awards and fellowships, such as the Microsoft Best Student Paper at the SIG-IL 2009 conference. Javier Gonzalez-Dominguez has actively participated in and led several ATVS systems submitted to the NIST speaker and language recognition evaluations since 2006. During his Ph.D. he has been a member of several research sites, such as SAIVT-QUT (2008, Brisbane, Australia), TNO (2009, Utrecht, The Netherlands) and Google Inc. Research (2010, New York, USA). Since 2012, he is a Teaching Assistant (Profesor Ayudante Doctor) at UAM. During the 2013-2014 term, he is on a research stay at the Speech Group in Google Research New York.

Biometría de la Mano: más allá de la huella dactilar[Go to top]

Presenter:

Aythami Morales

Date of presentation:

May 05, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

The hand is one of the regions of the human body richest in biometric information for person recognition. Among these traits the fingerprint stands out, but in the last 15 years many others have been studied and analysed, such as palm texture, finger texture, geometry, the vascular pattern, etc. The seminar will present the main contributions of Aythami Morales through a tour of these 15 years of the state of the art.

Short Bio:

Aythami Morales received his PhD from the Universidad de Las Palmas de Gran Canaria in 2011 with the thesis entitled: Estrategias para la Identificación de Personas Mediante Biometría de la Mano Sin Contacto.

Speech and handwriting biometrics: exploring some cross-modality commonalities[Go to top]

Presenter:

Carmen García Mateo

Date of presentation:

April 28, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

In this talk I will discuss how to apply to writer verification some state-of-the-art techniques currently used in the field of speaker and language verification. GMM-UBM and i-vector modelling and scoring approaches are explored. Regarding feature extraction, SIFT and SURF keypoint descriptors are used and compared. Results are presented using the IAM handwriting database.

Short Bio:

Prof. Carmen Garcia Mateo is the principal investigator of the Multimedia Technologies Group (GTM) at the AtlantTIC Research Centre of the University of Vigo (http://gtm.uvigo.es). She is Full Professor at the “Escuela de Ingeniería de Telecomunicación, Universidade de Vigo”. Her research interests are focused on speech technology: speech and speaker recognition, audio segmentation and multibiometrics. She has published over 100 international contributions, including book chapters, refereed journal and conference papers. She and her group have extensive experience in the development of software for multimedia applications.

Template Protection for Face Biometrics Based on Adaptive Bloom Filters[Go to top]

Presenter:

Marta Gómez Barrero

Date of presentation:

March 31, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

Biometric data are considered sensitive personal data and any privacy leakage poses severe security risks. Biometric templates should hence be protected, obscuring the biometric signal in a non-reversible manner, while preserving the unprotected system's performance. In the present work, irreversible face templates based on adaptive Bloom filters are proposed. Experiments are carried out on the publicly available BioSecure DB utilizing the free Bob image processing toolbox, so that research is fully reproducible. The performance and security evaluations prove the irreversibility of the protected templates, while preserving the verification performance. Furthermore, the template size is considerably reduced.

Latent fingerprint pre-alignment based on orientation-field[Go to top]

Presenter:

Julián Fierrez Aguilar

Date of presentation:

February 24, 2014

Presentation place:

C-109, 15:00-16:00

Abstract:

Forensic fingerprint identification deals with some unique challenges that make its full automation impractical with current technology, especially when dealing with partial latent prints of low image quality. In such cases, a human expert usually marks several minutiae manually, which are then used to align the latent to the impression under comparison from the suspect database. In this talk, we will present the ongoing research in ATVS aimed at developing robust fingerprint pre-alignment methods that can work even with low-quality partial prints. For that purpose, the pre-alignment methods considered only use information from the orientation field of the image, which can be estimated in a very robust way. Initial results show that incorporating such methods as an initial step in commercial matchers can improve their performance significantly.
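
As background for orientation-field-based pre-alignment, the sketch below estimates a block-wise orientation field with the classical averaged squared-gradient method; the block size is illustrative, smoothing is omitted, and the registration step itself is not shown.

```python
# Minimal sketch of block-wise fingerprint orientation field estimation using
# the classical averaged squared-gradient method; such a field is the input to
# orientation-based pre-alignment. Block size is illustrative.
import numpy as np

def orientation_field(image, block=16):
    gy, gx = np.gradient(image.astype(float))
    # Square/cross terms so opposite gradient directions reinforce each other.
    gxx, gyy, gxy = gx * gx, gy * gy, gx * gy
    h, w = image.shape
    field = np.zeros((h // block, w // block))
    for i in range(h // block):
        for j in range(w // block):
            sl = np.s_[i * block:(i + 1) * block, j * block:(j + 1) * block]
            vx = 2 * gxy[sl].sum()
            vy = (gxx[sl] - gyy[sl]).sum()
            field[i, j] = 0.5 * np.arctan2(vx, vy)   # dominant gradient angle; ridges are orthogonal
    return field                                      # radians, one value per block

img = np.random.default_rng(0).random((256, 256))     # stand-in for a fingerprint image
print(orientation_field(img).shape)                    # (16, 16)
```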

Información de locutor basado en características glotales sobre unidades lingüísticas[Go to top]

Presenter:

Ignacio Rodríguez Ortega

Date of presentation:

January 13, 2014

Presentation place:

C-109, 13:00-14:00

Abstract:

In this study we aim to obtain glottal information that classifies speakers according to their voice quality. The first step is to obtain the glottal pulse from the speech signal, already segmented into phonemes, choosing among different algorithms with the criterion of making our system as robust and stable as possible. We then extract certain parameters from the glottal signal following the LF model. Finally, an inter/intra-speaker variability analysis is carried out with 200 speakers from the TIMIT database.

2013

Medidas de Similitud de Audio para Recuperación de Información Musical[Go to top]

Presenter:

Ricardo Landriz

Date of presentation:

December 16, 2013

Presentation place:

C-109, 13:00-14:00

Abstract:

Abstract pending

Participación en los proyectos recientes de lofoscopia forense de ATVS[Go to top]

Presenter:

Sandra Uceda y Fátima Garcia Donday

Date of presentation:

December 11, 2013

Presentation place:

C-109, 15:30-16:30

Abstract:

- Motivation: the DGGC fingerprint project
- Starting point: the original database
- Work carried out: a detailed explanation of the process we followed to go from what we had at the beginning to the database we finally obtained (image processing, minutiae marking, etc.)
- Tools we have worked with
- Final outcome of the work
- Short live demonstration of how the tools work

Dealing with variability factors and its application to biometrics at a distance[Go to top]

Presenter:

Pedro Tomé

Date of presentation:

December 02, 2013

Presentation place:

C-109, 13:00-14:00

Abstract:

This Thesis focuses on dealing with variability factors in biometric recognition and on applications of biometrics at a distance. In particular, this PhD Thesis explores the problem of assessing variability factors and how to deal with them by incorporating soft biometric information in order to improve person recognition systems working at a distance. The proposed methods, supported by experimental results, show the benefits of adapting the system to the variability of the sample at hand.

Although relatively young compared to other mature and long-used security technologies, biometrics have emerged in the last decade as a compelling alternative for applications where automatic recognition of people is needed. Certainly, biometrics are very attractive and useful for video-surveillance systems at a distance, now widespread in our lives, and for the final user: forget about PINs and passwords, you are your own key. However, we cannot forget that, as with any technology intended to provide a security service, biometric systems should ensure reliable performance in any scenario. Thus, it is of special relevance to understand and analyse the variability factors to which they are subjected in order to ensure suitable performance and increase their benefits for the users.

In this context, the present PhD Thesis gives an insight into the difficult problem of evaluating variability factors through the systematic study of biometric scenarios at a distance and the analysis of effective compensation methodologies that can minimize their effects, with the aim of increasing the performance of remote person recognition in this thriving technology. In this way, the experimental studies presented in this Dissertation can help to further develop the ongoing variability compensation efforts, and may be used as guidelines to adapt existing systems in biometrics at a distance and make them more secure and stable.

The problem of variability compensation in biometric systems had already been addressed in some previous works, but in most cases without using the acquisition distance, to which the variability factors are related, to identify and define scenarios. In this Dissertation, after summarizing and classifying the most relevant works related to the Thesis and defining what we understand as a scenario at a distance, we describe the methods applied throughout the experimental chapters. These experimental chapters are dedicated first to the study of variability factors (scenario analysis), and then to the application of the proposed techniques to deal with them (soft biometrics and adaptive fusion). All experiments are conducted using standard biometric data and benchmarks.

The experimental part of the Thesis starts with the scenario evaluation of the variability factors found in face recognition systems. We evaluate, among others, the relationship between variability factors and the acquisition distance in this kind of system, the variability of facial landmarks in mugshot and CCTV images, and the performance variability of different facial regions of the human face in various forensic scenarios at a distance. In addition to providing useful background information that can guide and help experts to interpret and evaluate face evidence, these findings can have a significant impact on the design of face recognition algorithms.

We then study various types of soft biometric information available in biometrics at a distance suitable for video-surveillance and forensic applications. These soft labels can be visually identified at a distance by humans (or an automatic system), and their discriminative information will vary depending on the distance. It is worth noting that this relation between scenarios at a distance and the performance of soft biometrics for person recognition has not been studied in this way before. Moreover, the largest set of morphological facial soft biometric features extracted following forensic protocols is also introduced and evaluated. The experimental results using this set of features show that a forensic system based entirely on facial soft biometric features is feasible.

Finally, we study experimentally various types of adaptive fusion exploiting soft biometrics. In particular, we study: scenario-based, soft biometrics-based, facial regions-based, and color facial regions-based schemes of score-level fusion and their benefits in systems at a distance. The proposed adaptive fusion schemes achieve notable improvements demonstrating their utility in biometrics at a distance.

The research work described in this Dissertation has led to novel contributions, which include the development of two new methods to deal with variability factors in biometric systems at a distance, namely: i) soft biometrics suitable for video-surveillance and forensics, and ii) adaptive fusion schemes at score level based on the acquisition scenario, soft biometrics, facial regions, and colour facial regions. Moreover, several original experimental studies have been carried out during the development of the Thesis (e.g., the relation between scenarios at a distance and variability factors). In addition, the research work completed throughout the Thesis includes several literature reviews and the generation of new biometric resources.

Construcción de una base de datos para el reconocimiento automático de caracteres manuscritos y generación de resultados de referencia[Go to top]

Presenter:

Sara García Mina

Date of presentation:

November 25, 2013

Presentation place:

C-109, 13:00-14:00

Abstract:

The development of automatic handwritten character recognition systems requires the availability of large databases containing enough inter-sample variability to be representative of the different ways in which people write. In addition, the number of samples must be high enough to obtain statistically significant results on the performance of the recognition systems. Several public databases are currently available to researchers in the field of handwritten character recognition. The most widely used, because of its size and because it has become the de facto benchmark for evaluating these algorithms, is the MNIST database. However, this database has several limitations. First, it only contains samples of the 10 digits and not of the full set of characters of the Western alphabet. Moreover, all the samples were produced by users from an English-speaking culture, which may bias the results when applying conclusions obtained on it to data captured from people of different descent (for example, Latin). Finally, handwritten character recognition systems are beginning to saturate on this database, with error rates below 0.1%.

In this context, this final degree project, and the café talk, will present a new character database that complements MNIST (in terms of format and structure) while trying to overcome its limitations: presence of all the characters of the Western alphabet (digits, uppercase and lowercase letters), data captured from users of Latin (Spanish) descent, and the presence of different training and test scenarios (controlled and uncontrolled) that make the database more demanding. In addition to describing the design, acquisition and processing of the database, initial results will also be presented that serve as a baseline for the database and as a reference for future comparison with other recognition algorithms, so that a clear evolution of the state of the art can be established, in the same way as has happened with MNIST:

http://yann.lecun.com/exdb/mnist/

The presentation will also include a practical demonstration of how the "biógrafo" tool works; this tool was developed at ATVS and was used to capture the data.

Análisis y caracterización de series temporales financieras[Go to top]

Presenter:

Alfredo Serrano

Date of presentation:

November 18, 2013

Presentation place:

C-109

Abstract:

Contents:
- Introduction to financial time series and their main characteristics
- Errors made by two different prediction methods
- Plotting of the results through a graphical user interface

Weighted Complex Spectral Minutiae Representation for Forensic Fingerprint and Palmprint Comparison[Go to top]

Presenter:

Ruifang Wang

Date of presentation:

October 21, 2013

Presentation place:

C-109, 13:00-14:00

Abstract:

The large non-linear distortion present in finger/palm marks compared to fingerprints/palmprints has been a major concern when designing forensic fingerprint/palmprint recognition systems. In this work, we first present a new method of distortion assessment based on various window shapes (i.e., rectangle, ellipse, circle) at feature level, using the information of paired minutiae, including position, direction and corresponding labels. Then, inspired by the observations obtained with the circular window on the forensic fingerprint databases NIST SD27 and NFI DB58, we propose a new weighted complex spectral minutiae representation (Weighted-SMC) incorporating position and direction weights for each minutia spectrum. Finally, we propose a new comparison scheme for forensic fingerprint and palmprint recognition based on the proposed Weighted-SMC. We evaluate the Weighted-SMC comparator on forensic fingerprint and palmprint databases. For fingerprints, a rank-1 identification rate of 94.83% is achieved on NFI DB58, which contains 58 pairs of finger marks and fingerprints, while a rank-1 identification rate of 72.87% is achieved on NIST SD27, which contains 258 pairs of finger marks and fingerprints. On a forensic palmprint database including 22 palm marks and 100 full palmprints, a rank-1 identification rate of 86.36% is achieved.

Análisis de parámetros nasales en unidades lingüísticas para el reconocimiento de locutor [Go to top]

Presenter:

Fernando Espinoza

Date of presentation:

October 14, 2013

Presentation place:

C-109, 13:00-14:00

Abstract:

In recent years, automatic speaker recognition systems based on GMM-UBM, i-vectors or PLDA have demonstrated good performance for text-independent speaker recognition and high robustness to speaker and session variability. Similarly, the use of linguistic units (high-level features) in speaker recognition has shown desirable properties such as high discrimination and great power when fused with short-time spectral systems. However, these methods used in automatic speaker recognition differ from human speaker identification, since the latter relies on perceptual speaker identification (PSI). Perceptual speaker identification is related to the speaker individualities present in speech, and its accuracy depends on what types of sound are presented to the listeners. For example, listeners can identify speakers more accurately when vowels and voiced consonants are presented to them, specifically liquids (in English) and nasal consonants. The latter have been consistently effective for PSI due to their greater inter-speaker variability and smaller intra-speaker variability. In this sense, there are cues to speaker individuality due to nasalization, and several researchers have found a number of acoustic and perceptual correlates of nasality. Hence, a set of acoustic parameters (AP) has been proposed to capture these acoustic correlates and to measure the degree of nasalization. Many front-ends have been developed based on the analysis of these AP, the most important references for this Master's Thesis being: automatic detection of vowel nasalization, nasality measures for speaker recognition data selection and performance prediction, and clinical assessment of nasal speech quality.

Therefore, this Master's Thesis aims to explore this set of acoustic parameters in phonetic units and to analyse their levels of discriminability, in terms of inter- and intra-speaker variability, that might contribute to improving speaker identification.

Feature Extraction for Biometric Recognition using Millimetre-Wave Images[Go to top]

Presenter:

Ester González

Date of presentation:

October 07, 2013

Presentation place:

C-109

Abstract:

The use of millimetre-wave (MMW) images has been proposed recently in the biometric field, aiming to overcome certain limitations of images acquired at visible frequencies. In this work, a body-shape-based biometric system has been developed using the information of the human contour extracted from MMW images. Images are taken from BIOGIGA, a synthetic database which simulates the effect of 94 GHz radiation on the human body. The images in this database are derived from real measurements of 50 people.

We propose several methods for the parameterization and classification stages with the objective of finding the best configuration of the biometric system. The methods proposed for the parameterization stage include: the contour coordinates, which constitute the baseline technique; the shape-context descriptor, which uses a log-polar histogram to describe the relative position of all points within the shape with respect to a specific point; Fourier descriptors, which apply the Fourier transform to the contour coordinates; and finally landmarks, a reduced set of points describing singular points of the human body shape. In the classification stage, we use two methods: a naive classifier based on the Euclidean distance (ED), and a classifier based on dynamic programming, the dynamic time warping (DTW) algorithm.

Several experiments are carried out with the objective of selecting the most discriminative feature set and classifier from all the proposed approaches. The experiments follow two main types of protocols, depending on the number of images used per person in the training and evaluation stages.

The results show to what extent DTW improves the performance of the system with respect to the baseline Euclidean distance, and the need for a high-resolution contour in order to obtain good system performance. The contour coordinates are the most suitable feature for the system in terms of performance and computational cost. Even though the results obtained with more complex features such as shape contexts or Fourier descriptors are quite reasonable, their computational cost makes them less appropriate for practical scenarios.
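
The dynamic-programming classifier mentioned above can be sketched as follows, assuming contours are given as sequences of 2-D points; the local cost, normalization and any path constraints used in the actual system may differ.

    import numpy as np

    def dtw_distance(a, b):
        """a, b: arrays of shape (len, 2) with contour coordinates."""
        n, m = len(a), len(b)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = np.linalg.norm(a[i - 1] - b[j - 1])       # local Euclidean cost
                cost[i, j] = d + min(cost[i - 1, j],           # insertion
                                     cost[i, j - 1],           # deletion
                                     cost[i - 1, j - 1])       # match
        return cost[n, m] / (n + m)                            # length-normalized distance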

Automatic Language Recognition using Deep Neural Networks[Go to top]

Presenter:

Alicia Lozano

Date of presentation:

September 23, 2013

Presentation place:

C-109

Abstract:

In recent years, deep learning has arisen as a new paradigm within the machine learning field. In particular, Deep Neural Networks (DNNs) are an important part of this new paradigm. This set of architectures has properties that make it suitable for difficult tasks, among which automatic language recognition (or Spoken Language Recognition, SLR) stands out. Their capability to model complex functions in high-dimensional spaces and to obtain a good representation of the input data makes these architectures and algorithms appropriate for processing complex signals such as, for instance, the voice signal. Thereby, they can be used as a technique to automatically distinguish the language used in a specific segment of speech.

This Master's Thesis is intended to provide a new approach that, combining the deep learning and automatic language recognition fields, improves the SLR task by obtaining a better representation of voice signals for classification purposes, so that the language used in a given voice signal can be identified.

In order to do this, both DNNs and state-of-the-art SLR systems have been studied thoroughly. Firstly, the application of DNNs to speech recognition tasks has been reviewed. Then, convolutional deep neural networks, in particular, have been adapted to the language recognition problem and their performance has been evaluated on a challenging dataset, NIST LRE 2009 (National Institute of Standards and Technology Language Recognition Evaluation).

Although some results do not always outperform the reference system that has been considered in the experimental part of this work, the new approach based on DNNs can be seen as a starting point to improve current SLR systems.

Análisis Multifactor de Series Temporales Financieras mediante Descomposición en Subespacios[Go to top]

Presenter:

Álvaro Diéguez

Date of presentation:

September 16, 2013

Presentation place:

C-109

Abstract:

One objective of financial engineering is the analysis of financial time series. In order to carry out more accurate analyses, estimates and predictions, econometric models are needed. Due to the weaknesses of classical econometric models, a set of risk factors has to be included, giving rise to multifactor models. These factors are variables whose changes affect the value of assets in some way, and modelling these factors and their dependencies can lead to better estimates. The purpose of this Master's Thesis has therefore been the implementation of a multifactor analysis system that, by means of subspace decomposition, provides information about the series that cannot be extracted directly. An optimization framework has also been developed that, making use of signal processing techniques (such as the Kalman filter) and a technique that fuses different systems, improves the estimation of the factor exposures of the series.
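
As a purely illustrative sketch of the Kalman-filter idea mentioned above, the snippet below tracks a time-varying exposure (beta) of an asset return to a single factor under a random-walk state model; the thesis' multifactor model, subspace decomposition and fusion scheme are more elaborate, and the noise variances here are placeholder values.

    import numpy as np

    def kalman_beta(returns, factor, q=1e-4, r=1e-2):
        """returns, factor: 1-D arrays of asset and factor returns; q, r: process/obs. variances."""
        beta, p = 0.0, 1.0                  # state estimate (exposure) and its variance
        betas = []
        for y, f in zip(returns, factor):
            p += q                          # predict: beta follows a random walk
            k = p * f / (f * f * p + r)     # Kalman gain for observation y = f * beta + noise
            beta += k * (y - f * beta)      # update with the prediction error
            p *= (1.0 - k * f)
            betas.append(beta)
        return np.array(betas)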

Reconocimiento Facial en el Ámbito Forense[Go to top]

Presenter:

Maya Binetskaya

Date of presentation:

September 09, 2013

Presentation place:

C-109

Abstract:

This project studies, develops and evaluates a forensic facial biometric recognition system for controlled environments, based on the morphological analysis of the human face following the protocols used by forensic laboratories such as the Dirección General de la Guardia Civil and the Netherlands Forensic Institute. The development process can be divided into three stages: i) the necessary image pre-processing (aimed at removing all sources of noise), ii) morphological characterization of all facial features, and finally iii) extraction of two large sets of continuous and discrete features used for biometric recognition and evaluated with different similarity measures. The experimental part is structured in two phases: in the first, the proposed features are analysed in detail, and in the second, their performance is evaluated on the databases used, demonstrating their viability for person recognition.

Reconocimiento facial en tiempo real[Go to top]

Presenter:

Javier Eslava Ríos

Date of presentation:

July 15, 2013

Presentation place:

C-109

Abstract:

This project studies, implements and evaluates a complete real-time face recognition system. To carry it out, several state-of-the-art face recognition techniques have been used and their application to video systems has been studied in order to take advantage of the benefits that video can offer. The design followed two lines of work organized over time: first, an off-line face recognition system was implemented based on OpenCV/C++ libraries; the design and implementation were then adapted and improved to produce a real-time (on-line) system that exploits the advantages of video sequences. The project culminates in a fully functional final system (demonstrator/prototype) available in the research laboratory where the project was developed.

Deep Learning: Motivation, Theory and Applications[Go to top]

Presenter:

Javier González Domínguez

Date of presentation:

July 08, 2013

Presentation place:

C-109

Abstract:

During the last few years, Deep Learning has emerged as a new methodology with a great impact on the machine learning research community. As a key factor of its success, the main concept behind all deep learning approaches is the automated discovery of data abstractions through different information levels: avoiding hand-crafted representations while yielding multiple levels of representation, with higher-level features representing more abstract aspects of the data. The use of deep architectures has led to breakthroughs in several machine learning areas, such as computer vision, speech and image recognition, reducing previous state-of-the-art error rates by 30% to 50% on well-known and difficult-to-beat benchmarks. In this talk, the motivation, theory and some of the most significant advances in Deep Learning will be reviewed.

Biometric Security: A New Multimodal Hill-Climbing Attack[Go to top]

Presenter:

Marta Gómez Barrero

Date of presentation:

July 01, 2013

Presentation place:

C-109

Abstract:

As with any technology intended to provide a security service, biometric systems are exposed to external attacks which could compromise their integrity. Thus, it is of special relevance to understand the threats to which they are subjected and to analyse their vulnerabilities in order to prevent possible attacks and increase their benefits for the users.

In this work, new indirect attacks based on hill-climbing algorithms against both unimodal and multimodal systems have been proposed. Their performance has been thoroughly analysed on systems based on face and iris, working on widespread multimodal databases, and compared to existing state-of-the-art algorithms.

The results show to what extent the proposed techniques affect the security offered by the tested biometric systems, and the need for new measures to counteract these types of threats.

Face Recognition from Still Images to Video Sequences[Go to top]

Presenter:

Rubén Vera Rodríguez

Date of presentation:

June 24, 2013

Presentation place:

C-109

Abstract:

This talk will review some of the recent advances in the field of face recognition at a distance. First, a robust method for still-image face recognition, “Multi-Region Probabilistic Histograms”, will be described in depth. This method has been shown to provide good performance for face recognition under several concurrent and uncontrolled factors, such as variations in pose, expression, illumination and resolution, as well as scale and misalignment problems. Second, an extension of this method for video-based face recognition will be covered, comparing different options with the application to recognition in CCTV surveillance systems in mind.

The ATVS System at NIST OpenKWS (KeyWord Search) 2013 Evaluation[Go to top]

Presenter:

Doroteo Torre Toledano

Date of presentation:

June 17, 2013

Presentation place:

Abstract:

With the exponential growth of multimedia information, the need for efficient search on speech is growing in applications such as Internet search, call centre quality assurance and even homeland security. In this context, the National Institute of Standards and Technology has launched a new series of competitive evaluations called Open Keyword Search (OpenKWS). In this talk I will give an introduction to the field of speech search and to the NIST OpenKWS evaluation series, and I will present the work that we carried out in collaboration with CSTL at Tsinghua University (Beijing) and HTCLab-UAM to build a system for searching keywords in a new language (Vietnamese) in less than a month and to successfully process the 3 days of test speech, searching for 1000 terms, in only 4 days.

Evaluación MOBIO de reconocimiento de locutor en entornos móviles[Go to top]

Presenter:

Rubén Zazo

Date of presentation:

June 10, 2013

Presentation place:

Abstract:

The evaluation organized by IDIAP for ICB2013 combines face recognition with text-independent speaker recognition. In this café we will analyse the performance of ATVS in the speaker recognition task, together with a summary of the task in general. The database used, MOBIO, is particularly interesting because it combines recordings made with mobile devices with large variability in utterance duration.

Ensayos Pre-ICB: (1) Variations of Handwritten Signatures with Time: A Sigma-Lognormal Analysis, (2) Formant Trajectories in Linguistic Units for Text-Independent Speaker Recognition[Go to top]

Presenter:

(1) Marta Gómez Barrero y (2) Javier Franco Pedroso

Date of presentation:

May 27, 2013

Presentation place:

C-109

Abstract:

Abstract (1)

The variation of dynamic signatures with time is analysed for the first time using the Kinematic Theory, following a general, consistent and fully reproducible protocol. Experiments are carried out on a new long-term database captured in 6 sessions uniformly distributed over a 15 month
time span, under almost identical conditions. Signatures are represented with the Sigma Log-Normal model, which takes into account the effects of body ageing closely related to handwriting, such as neuromuscular response times. After studying the evolution of signatures with time, an analysis on age groups based on the model parameters is carried out.

Abstract (2)

Inspired by successful work in forensic speaker identification, this work presents a higher level system for text-independent speaker recognition by means of the temporal trajectories of formant frequencies in linguistic units. Feature extraction from unit-dependent trajectories provides a very flexible system able to be applied in different scenarios. At a fine-grained level, it is possible to provide a calibrated likelihood ratio per linguistic unit under analysis (extremely useful in applications such as forensics), and at a coarse-grained level, the individual contributions of different units can be combined to obtain a more discriminative single system with high potential for combination with short term spectral systems. With development data being extracted from NIST SRE 2004 and 2005 datasets, this approach has been tested on NIST SRE 2006 1side-1side task, English-only male trials, consisting of 9,720 trials from 219 speakers. Remarkable results have been obtained for some single units from extremely short segments of speech, and the combination of several units leads to a relative improvement of 17.2% on EER when fusing with an i-vector system.

Exploiting user differences in biometrics with an application to keystroke verification[Go to top]

Presenter:

Julian Fierrez

Date of presentation:

May 20, 2013

Presentation place:

Abstract:

Independently of how good the features or the modelling are in biometric recognition, there are usually different kinds of users in terms of their behaviour against impostors or their consistency with respect to their own client data. This effect results in recognition performance that varies depending on the user at hand, and the variation may be significantly large in some cases, such as those involving behavioural biometrics. The talk will briefly review how this topic has been addressed in the literature, outline potential applications of adequately modelling this factor, and study this effect in a practical scenario with keystroke biometrics.

Making Machines Understand Us in Reverberant Rooms, by T. Yoshioka et al.[Go to top]

Presenter:

Joaquín González  (ATVS-UAM)

Date of presentation:

May 13, 2013

Presentation place:

Abstract:

This article, which appeared in the IEEE Signal Processing Magazine of November 2012, is an excellent summary of the state of the art in the complex problem of fighting the effects of reverberation in automatic speech recognition systems. After briefly reviewing the basics of recognition systems and presenting some fundamentals of room acoustics, we will look at the reasons why the usual noise and channel compensation techniques (MLLR, PMC, VTS) are not sufficient for the reverberation problem (reverberation being just another "channel"). Once the need for specific techniques to combat reverberation has been justified, the different alternatives proposed, many of them very recently, to tackle the problem both in the front-end and in the back-end of recognition systems will be presented.

El papel del fonetista en el análisis de voces con fines forenses[Go to top]

Presenter:

Juana Gil  (CSIC)

Date of presentation:

May 06, 2013

Presentation place:

Abstract:

In the field of speech analysis for legal purposes, a growing number of experts advocate the use of "hybrid" approaches, that is, approaches based both on purely linguistic-phonetic analysis (acoustic, articulatory or perceptual) and on the use of automatic recognition and comparison tools. In this meeting, the aim is therefore to explain to a predominantly non-linguist audience what a phonetician can contribute to a mixed analysis of this kind in each of its phases: the characterization of a speaker, the description of a voice, and the comparison of voices for forensic purposes.

Short Bio:

Juana Gil is currently the director of the Phonetics Laboratory of the Instituto de Lengua, Literatura y Antropología (ILLA) of the Consejo Superior de Investigaciones Científicas (CSIC). She was previously a lecturer at the Universidad Autónoma de Madrid and at the Universidad Nacional de Educación a Distancia, where she remains a tenured professor. Since 2011 she has been a member of the faculty of the Graduate School of Purdue University, USA. Her fields of interest are the phonetics-phonology interface and some applications of phonetics, in particular forensic phonetics and the teaching of pronunciation. She has published and/or edited several books and numerous articles. In 2007 she created the Official Postgraduate Programme in Phonic Studies (CSIC / UIMP), which she has directed since then, and in 2013 she founded at CSIC the digital journal Loquens. Spanish Journal of Speech Sciences.

Enfoques Probabilísticos para el Reconocimiento Forense de Huellas Dactilares y Palmares[Go to top]

Presenter:

Daniel Ramos

Date of presentation:

April 29, 2013

Presentation place:

Abstract:

The talk will focus on the work carried out over the last 4 years by ATVS within a project with the Lofoscopy and SAID areas of the Identification Department of the Guardia Civil. First, we will give an introduction justifying the use of probabilistic approaches in fingerprint comparison. We will then briefly describe the objectives of the work carried out in the early stages of the project, before focusing on the achievements of the 2012 project year: software tools that make it possible to compute a likelihood ratio (LR) from the comparison of two minutiae patterns, one coming from a crime-scene mark (questioned print) and one from a good-quality inked print of known origin (reference print).

On-Line Handwritten Biometrics @ ATVS[Go to top]

Presenter:

Javier Ortega-Garcia

Date of presentation:

April 15, 2013

Presentation place:

Abstract:

The talk will address the problem of user authentication based on dynamic (on-line) recognition of handwriting, reviewing the whole trajectory of ATVS in this line of research. The problem and its application scenarios will be introduced, as well as the different perspectives for approaching authentication, analysing the acquisition and pre-processing stages (normalization, spatial/temporal sampling, etc.), the representation of the information (global vs. local features derived from time sequences), and the similarity/matching computation both for systems based on feature vectors and for systems based on time functions. We will then discuss the problem of (competitive) system evaluations, and the talk will end with the most recent advances in the area (security/privacy and cryptosystems, synthetic generation, quality and entropy, ...).

From the iriscode to the iris: a new vulnerability of iris recognition systems[Go to top]

Presenter:

Javier Galbally

Date of presentation:

April 08, 2013

Presentation place:

Abstract:

A binary iriscode is a very compact representation of an iris image, and, for a long time, it has been assumed that it did not contain enough information to allow the reconstruction of the original iris. The present work proposes a novel probabilistic approach to reconstruct iris images from binary templates and analyzes to what extent the reconstructed samples are similar to the original ones (that is, those from which the templates were extracted). The performance of the reconstruction technique is assessed by estimating the success chances of an attack carried out with the synthetic iris patterns against a commercial iris recognition system. The experimental results show that the reconstructed images are very realistic and that, even
though a human expert would not be easily deceived by them, there is a high chance that they can break into an iris recognition system.

Algoritmos y Sistemas para Análisis del Habla[Go to top]

Presenter:

Eduardo González Moreira

Date of presentation:

March 11, 2013

Presentation place:

Abstract:

The aim of this work is the development of algorithms and software for speech analysis. The results have been obtained along four closely interrelated lines:
- Development of algorithms for the detection of events of interest
- Development of algorithms to obtain indices or parameters to characterize the voice
- Contributions to the analysis of speech alterations and pathologies
- Development of application software for speech analysis and training

Dynamic Signature Verification on Smart Phones[Go to top]

Presenter:

Ram Prasad Krish

Date of presentation:

March 04, 2013

Presentation place:

Abstract:

This work focuses on dynamic signature verification on state-of-the-art smart phones, including a performance evaluation. The analysis was performed on a database consisting of 25 users and 500 signatures in total, acquired with a Samsung Galaxy Note. The verification algorithm tested combines two approaches: feature-based (using the Mahalanobis distance) and function-based (using DTW), and the results are reported in terms of EER values. A number of experimental findings associated with signature verification in this scenario are obtained, e.g., the dominant challenge of intra-class variability across time. As a result of adapting the algorithm to the mobile scenario and using a state-of-the-art smart phone, and contrary to what has been reported in previous works, we demonstrate that signature verification on smart phones can achieve verification performance similar to that obtained with more ergonomic stylus-based pen tablets. In particular, the best result achieved is an EER of 0.525%.
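
A minimal sketch of the feature-based half of such a verifier is given below: a Mahalanobis distance between a test signature's global feature vector and the statistics of the user's enrolment signatures, later fused at score level with a DTW score. The feature set, the regularization and the fusion rule are assumptions for illustration, not the algorithm of the talk.

    import numpy as np

    def mahalanobis_score(enrolment_feats, test_feat):
        """enrolment_feats: (n_signatures, n_features); test_feat: (n_features,)."""
        mu = enrolment_feats.mean(axis=0)
        cov = np.cov(enrolment_feats, rowvar=False) + 1e-6 * np.eye(enrolment_feats.shape[1])
        diff = test_feat - mu
        d2 = diff @ np.linalg.inv(cov) @ diff
        return -np.sqrt(d2)                      # higher = more likely genuine

    def fuse(feature_score, dtw_score, w=0.5):
        """Simple weighted sum, assuming both scores were first normalized to a common range."""
        return w * feature_score + (1.0 - w) * dtw_score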

Looking for hand shape based biometric devices interoperability[Go to top]

Presenter:

Ester González-Sosa

Date of presentation:

February 25, 2013

Presentation place:

Abstract:

Identification of people through hand-based biometrics has lately been researched by different scientific groups due to its simplicity, reliability and acceptability. Over the years, several works based on different acquisition devices or procedures have been presented. We focus our proposal on a property of hand-shape biometric devices that has barely been studied: interoperability. In this work, we present a preliminary study based on a database of 6240 hand images acquired with 6 different hand-shape biometric approaches, including a flat scanner, webcams at different wavelengths, high-quality cameras, and contactless devices acquiring both sides of the hand. Our results suggest that acceptable interoperability results are within reach.

i-vector Based Speaker Recognition on Short Utterances[Go to top]

Presenter:

Ahilan Kanagasundaram

Date of presentation:

February 18, 2013

Presentation place:

Abstract:

Robust speaker verification on short utterances remains a key challenge, since a significant amount of speech is usually required for speaker model enrolment and verification, especially in the presence of large inter-session variability. This talk introduces the Source- and Utterance-Normalized Linear Discriminant Analysis (SUN-LDA) and Utterance Variance Discrimination (UVD) approaches for short-utterance i-vector speaker verification. By capturing source and utterance variation information from short- and full-length development i-vectors, the SUN-LDA approach is shown to improve speaker verification performance on short-utterance evaluation conditions over the traditional Linear Discriminant Analysis (LDA) approach. The UVD approach is used to compensate for utterance variation, and the combined channel- and utterance-variation-compensated approach is shown to improve over the channel-compensated approach. Based on the results obtained on the NIST 2008 Speaker Recognition Evaluation dataset, we believe that SUN-LDA and UVD are the best approaches for short-utterance i-vector speaker verification.
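
For reference, the standard LDA step that SUN-LDA extends can be sketched as below, assuming scikit-learn: development i-vectors are projected onto directions that maximize between-speaker over within-speaker variability, then compared with cosine scoring. The dimensionality and the scoring back-end are illustrative assumptions, not the configuration of the talk.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    def train_lda(dev_ivectors, speaker_labels, dim=150):
        """dev_ivectors: (n_utterances, ivector_dim); speaker_labels: one label per utterance."""
        lda = LinearDiscriminantAnalysis(n_components=dim)
        lda.fit(dev_ivectors, speaker_labels)
        return lda

    def cosine_score(lda, enrol_ivec, test_ivec):
        """Cosine similarity between length-normalized, LDA-projected i-vectors."""
        e = lda.transform(enrol_ivec.reshape(1, -1))[0]
        t = lda.transform(test_ivec.reshape(1, -1))[0]
        e /= np.linalg.norm(e)
        t /= np.linalg.norm(t)
        return float(e @ t)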

Reconocimiento facial basado en puntos característicos de la cara en entornos no controlados[Go to top]

Presenter:

Luis Blázquez

Date of presentation:

February 04, 2013

Presentation place:

Abstract:

This project studies, implements and evaluates an automatic system for detecting and correcting wrongly placed facial landmarks obtained with a commercial automatic system. Several databases, freely available to the scientific community and emulating controlled and uncontrolled environments, are used for the experiments. An anthropometric analysis has been carried out on the controlled environment, proving its potential, together with an analysis of each of the facial features.

Download presentation

Actividad neuronal y funcionamiento cerebral[Go to top]

Presenter:

Juan A. Sigüenza

Date of presentation:

January 21, 2013

Presentation place:

Abstract:

This second part on techniques for investigating brain function will be devoted mainly to the electrical properties of nervous tissue: generation of the nerve impulse, its transmission and integration, leading to aspects related to behaviour, and ending with a review of some of the most popular theories about how the brain works.

Indirect Attacks based on Hill-Climbing Algorithms[Go to top]

Presenter:

Marta Gómez-Barrero

Date of presentation:

January 14, 2013

Presentation place:

Abstract:

The talk will focus on the vulnerabilities of biometric systems to indirect attacks based on hill-climbing schemes. The vulnerabilities of standard iris and face verification systems to two novel indirect attacks, based on a binary genetic algorithm and the Uphill Simplex algorithm, are studied. The experiments are carried out on the iris and face subcorpora of the publicly available BioSecure DB. The attacks have shown remarkable performance, thus proving the lack of robustness of the tested systems to these types of threats. Furthermore, some countermeasures are studied: quantization of the scores and the use of the most consistent bits in the iriscode.
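
As a didactic simplification of this family of attacks (not the genetic algorithm or Uphill Simplex variants discussed in the talk), the sketch below perturbs a candidate template and keeps only the changes that raise the score returned by a matcher oracle, until the decision threshold is crossed; `score_fn`, the step size and the template representation are assumptions.

    import numpy as np

    def hill_climb(score_fn, template_len, threshold, step=0.05, max_iters=10000, seed=0):
        """score_fn: callable returning the matcher's score for a candidate template."""
        rng = np.random.default_rng(seed)
        candidate = rng.uniform(0.0, 1.0, template_len)   # random starting template
        best = score_fn(candidate)
        for _ in range(max_iters):
            if best >= threshold:                         # access granted: attack succeeded
                return candidate, best
            trial = candidate + rng.normal(0.0, step, template_len)
            s = score_fn(trial)
            if s > best:                                  # keep only improving perturbations
                candidate, best = trial, s
        return candidate, best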

2012

Assessment of Gait Recognition Based on the Lower Part of the Human Body[Go to top]

Presenter:

Silvia Gabriel-Sanz

Date of presentation:

December 17, 2012

Presentation place:

Abstract:

This talk will focus on the assessment of gait recognition in a constrained scenario, where limited information can be extracted from the gait image sequences. In particular, we are interested in assessing the performance of gait recognition when only the lower part of the body is acquired by the camera and just half of a gait cycle is available (SFootBD database).
Thus, various state-of-the-art feature approaches have been followed and applied to the data. A comparison with a standard and ideal gait database (the USF database) is also carried out using similar experimental protocols.
Results show that good recognition performance can be achieved with such limited data for gait biometrics (around 85% rank-5 identification rate and 8.6% EER). The comparison with a standard database shows that different feature approaches perform differently on each database, with the best individual results achieved by the MPCA and EGEI methods for the SFootBD and USF databases, respectively.

Forensic palmprint recognition[Go to top]

Presenter:

Ruifang Wang

Date of presentation:

December 10, 2012

Presentation place:

Abstract:

Forensic palmprint recognition mainly deals with high-resolution palmprints and focuses on latent-to-full palmprint comparison. The problems we try to solve are:
(1) How to implement robust latent-to-full palmprint comparison for source searching;
(2) How to improve latent-to-full palmprint comparison combining palmprint evidence from different algorithms and different features;
(3) How to combine information from different regions of the palmprint to implement realistic source searching;
(4) How to evaluate the evidence in palmprints using forensic palmprint recognition systems.

Firstly, we proposed and implemented a baseline system composed of a feature extractor (MinutiaLine) and a radial triangulation (RT) based comparator, with a rank-1 identification rate of 68.2% on a forensic database including 22 latent palm marks and 8680 full palmprints. Then we built a MinutiaCode-based palmprint recognition system to implement multi-algorithm combination, and we applied spectral minutiae (SM) as a kind of global feature for palmprints to implement multi-feature combination. As future work related to problems (3) and (4), we will implement palmprint region classification and segmentation for regional combination, and evidence evaluation using standard calibration procedures such as logistic regression.

Understanding the Discrimination Power of Different Facial Regions in Forensic Casework[Go to top]

Presenter:

Pedro Tome

Date of presentation:

December 03, 2012

Presentation place:

Abstract:

This talk focuses on automatic facial region extraction for forensic applications. Forensic examiners compare different facial areas between face images obtained from uncontrolled environments and controlled images taken from the suspect. In this work, we study and compare the discriminative capabilities of 15 facial regions considered in forensic practice, such as the full face, nose, eye, eyebrow, mouth, etc. This study is useful because it can statistically support the current practice of forensic facial comparison. It is also of interest to biometrics because a more robust general-purpose face recognition system can be built by fusing the similarity scores obtained from the comparison of different individual parts of the face.

On the use of Total Variability and Probabilistic Linear Discriminant Analysis for Speaker Verification on Short Utterances[Go to top]

Presenter:

Javier Gonzalez-Dominguez

Date of presentation:

November 26, 2012

Presentation place:

Abstract:

This paper explores the use of state-of-the-art acoustic systems, namely Total Variability (TV) and Probabilistic Linear Discriminant Analysis (PLDA), for speaker verification on short utterances. While recent advances in dealing with the session variability problem have greatly improved speaker verification in typical scenarios where a reasonable amount of speech is available, this performance rapidly degrades in the presence of limited data in both the enrolment and verification stages. This paper studies the behaviour of TV and PLDA in scenarios where only a scarce amount of speech (∼10 s) is available to enrol and test a speaker identity. The analysis has been carried out on the well-defined and standard 10s-10s task of the NIST Speaker Recognition Evaluation 2010 (NIST SRE10), and it explores the multiple parameters that define TV and PLDA in order to give some insight into their relevance in this specific scenario.

Cepstral Trajectories in Linguistic Units for Text-Independent Speaker Recognition[Go to top]

Presenter:

Javier Franco-Pedroso

Date of presentation:

November 19, 2012

Presentation place:

Abstract:

In this paper, the contributions of different linguistic units to the speaker recognition task are explored by means of the temporal trajectories of their MFCC features. Inspired by successful work in forensic speaker identification, we extend the approach based on temporal contours of formant frequencies in linguistic units to design a fully automatic system that brings together the forensic and automatic speaker recognition worlds. The combination of MFCC features and unit-dependent trajectories provides a powerful tool to extract individualizing information. At a fine-grained level, we provide a calibrated likelihood ratio per linguistic unit under analysis (extremely useful in applications such as forensics), and at a coarse-grained level, we combine the individual contributions of the different units to obtain a highly discriminative single system. This approach has been tested with the NIST SRE 2006 datasets and protocols, consisting of 9,720 trials from 219 male speakers for the 1side-1side English-only task, with development data extracted from 367 male speakers and 1,808 conversations from the NIST SRE 2004 and 2005 datasets.
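
To illustrate the general idea of summarizing a temporal MFCC trajectory within one linguistic unit, the sketch below compresses each cepstral contour into a few DCT coefficients, giving a fixed-length vector per unit occurrence; the number of coefficients and the exact parameterization used in the paper are not reproduced here.

    import numpy as np
    from scipy.fftpack import dct

    def unit_trajectory_features(mfcc_frames, n_coeffs=5):
        """mfcc_frames: (n_frames, n_cepstra) MFCCs of the frames spanning one unit."""
        feats = []
        for cep in mfcc_frames.T:                 # one temporal contour per cepstral coefficient
            c = dct(cep, norm="ortho")[:n_coeffs] # low-order DCT captures the contour shape
            if len(c) < n_coeffs:                 # pad very short units
                c = np.pad(c, (0, n_coeffs - len(c)))
            feats.append(c)
        return np.concatenate(feats)              # fixed-length vector for this unit occurrence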

Capital Asset Pricing Model and modern portfolio Theory[Go to top]

Presenter:

Álvaro Diéguez

Date of presentation:

November 12, 2012

Presentation place:

Abstract:

Knowledge and analysis of economic time series is increasingly important, since there is a need to assess possible economic scenarios in many areas. It has therefore become necessary to develop a new technology and research area known as computational intelligence for financial engineering. The origin of this discipline dates back to the fifties. Although financial systems have been a subject of study for many years, the increase in computing power has opened up a very important range of possibilities. In the last 20 years, the field of financial engineering has expanded to virtually all areas of finance and demand has grown dramatically. The main objective of this talk is to give a brief introduction to the Capital Asset Pricing Model, explaining how asset pricing works, and to show the importance of diversification. Using mathematical and statistical tools and techniques from quantitative and computational finance to analyse financial data, I will estimate statistical models and show how to construct optimized portfolios.
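
A minimal sketch of the two ideas mentioned follows: the CAPM relation E[r_i] = r_f + beta_i (E[r_m] - r_f), with beta estimated by regression, and a global minimum-variance portfolio (weights proportional to the inverse covariance applied to a vector of ones) illustrating diversification. Inputs and numbers are placeholders, not data from the talk.

    import numpy as np

    def capm_beta(asset_returns, market_returns):
        """OLS beta: cov(r_i, r_m) / var(r_m)."""
        cov = np.cov(asset_returns, market_returns)
        return cov[0, 1] / cov[1, 1]

    def expected_return_capm(beta, risk_free, market_premium):
        """CAPM expected return: r_f + beta * (E[r_m] - r_f)."""
        return risk_free + beta * market_premium

    def min_variance_weights(returns_matrix):
        """returns_matrix: (n_periods, n_assets); weights of the global minimum-variance portfolio."""
        sigma = np.cov(returns_matrix, rowvar=False)
        ones = np.ones(sigma.shape[0])
        w = np.linalg.solve(sigma, ones)
        return w / w.sum()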

Reconocimiento Biométrico de Personas a Distancia y en Movimiento.[Go to top]

Presenter:

Rubén Vera

Date of presentation:

November 05, 2012

Presentation place:

Abstract:

Today's talk will deal with biometric recognition of people at a distance and on the move, and its application is proposed for the specific case of the security gates found in airports and other high-security access points. In these scenarios, the biometric modalities that can be used without "bothering" the users too much are: the way of walking, both using pressure sensors (footstep biometrics) and using image sequences of the movement (gait recognition); face recognition; and also the use of body images outside the visible band. Advances in these biometric modalities will be discussed, including multimodal fusion in some of the cases.

Speaker recognition using temporal contours in linguistic units on large databases[Go to top]

Presenter:

Fernando Espinoza

Date of presentation:

October 22, 2012

Presentation place:

Abstract:

The use of linguistic units (high-level features) in speaker recognition has demonstrated desirable properties such as high discrimination, interpretability, acceptance and great power when fused with short-time spectral systems.

In this talk we present a new approach to automatic speaker recognition based on the modelling of Temporal Contours in Linguistic Units (TCLU), inspired by successful work in forensic speaker identification. The contributions of different linguistic units to speaker recognition are explored by means of the temporal trajectories of their MFCC (Mel Frequency Cepstral Coefficient) features. The combination of MFCC features and unit-dependent trajectories provides a powerful tool to extract individualizing information. At a fine-grained level, the system provides an individual score per unit under analysis, and at a coarse-grained level, we combine the individual contributions of the different units to obtain a highly discriminative single system.

Thus, this new approach yields good discrimination capabilities, achieving speaker detection performance levels similar to those of equivalent acoustic/spectral systems.

Speech processing applied to the diagnosis of severe obstructive sleep apnoea.[Go to top]

Presenter:

Doroteo Torre-Toledano

Date of presentation:

October 15, 2012

Presentation place:

Abstract:

Obstructive sleep apnoea is a highly prevalent disease, affecting an estimated 2-4% of the adult male population, that is difficult and very costly to diagnose because symptoms can remain unnoticed for years and the reference diagnostic method (polysomnography) requires the patient to spend a night at the hospital monitored by specialized equipment. Therefore, non-intrusive, fast and convenient screening techniques would be very helpful for setting priorities before proceeding to the polysomnography diagnosis. In this talk we will first employ standard speaker recognition techniques for the task of detecting severe apnoea. Then a set of voice features that could be related to apnoea is defined, based on previous results from other authors and our own analysis. These features are analysed first in isolation and then in combination to assess their discriminative power in classifying voices as belonging to apnoea patients or healthy subjects. Our results indicate that the proposed method, which only requires a few minutes to record the patient's voice during the visit to the specialist, could help in the development of non-intrusive, fast and convenient screening techniques for obstructive sleep apnoea.

Deep Belief Networks & Extreme Learning Machines[Go to top]

Presenter:

Vasileios Vasilakakis

Date of presentation:

October 08, 2012

Presentation place:

Abstract:

Neural networks have been widely used in pattern recognition during the last decades. However, only recently has greedy layer-wise pre-training of a stack of RBMs (Restricted Boltzmann Machines) made it possible to train deeper networks effectively and apply them to real-life problems such as image and speech recognition, compression and classification. Additionally, a recently introduced single-hidden-layer feed-forward network approach, the Extreme Learning Machine (ELM), is presented and a comparison between RBMs and ELMs is performed.
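
The ELM idea is simple enough to sketch in a few lines: a single hidden layer with random, fixed input weights, where only the output weights are learned in closed form by a least-squares fit (pseudo-inverse). Layer sizes, the tanh activation and the class interface below are illustrative assumptions.

    import numpy as np

    class ELM:
        def __init__(self, n_inputs, n_hidden, seed=0):
            rng = np.random.default_rng(seed)
            self.W = rng.normal(size=(n_inputs, n_hidden))   # random input weights (never trained)
            self.b = rng.normal(size=n_hidden)                # random hidden biases
            self.beta = None                                  # output weights, learned in fit()

        def _hidden(self, X):
            return np.tanh(X @ self.W + self.b)               # hidden-layer activations

        def fit(self, X, Y):
            H = self._hidden(X)
            self.beta = np.linalg.pinv(H) @ Y                 # closed-form least-squares solution
            return self

        def predict(self, X):
            return self._hidden(X) @ self.beta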

Short Bio:

Vasileios Vasilakakis obtained his MSc in Artificial Intelligence at the University of Edinburgh, UK, in 2009. He is currently pursuing a PhD in forensic speaker verification at the Politecnico di Torino in Italy. His research interests include deep learning, pattern recognition and machine learning techniques and their application in biometrics and forensics. He is currently focusing on the use of deep belief networks for speaker verification.

La investigación cerebral a través de sus técnicas de estudio[Go to top]

Presenter:

Juan A. Sigüenza

Date of presentation:

October 01, 2012

Presentation place:

Abstract:

More than a decade after the "decade of the brain" (1990-2000), neuroscience research remains as alive as it was in its beginnings in the time of Ramón y Cajal, and although substantial progress has been made, many unknowns remain to be unveiled, especially everything related to brain ageing and neurodegenerative processes. The talk will give a general overview of current knowledge about the brain through its study techniques, starting with the classical morphological techniques, moving on to neurophysiological ones and ending with the so-called imaging techniques. The study of the brain has followed an organizational sequence that has advanced from the macro- and microscopic components to the purely biophysical and genetic ones. To conclude, some of the theories, not free of controversy, about how the brain works will be outlined.

Reliable support: Measuring calibration of likelihood ratios[Go to top]

Presenter:

Daniel Ramos

Date of presentation:

September 24, 2012

Presentation place:

Abstract:

Calculation of likelihood ratios for evidence evaluation still presents major challenges in many forensic disciplines, leading to the risk of forensic reports supporting the wrong hypothesis in a given case. In this context, measuring the performance of likelihood ratio calculation methods is a fundamental step towards their validation for use in casework. In this talk we propose a framework for measuring the performance of likelihood ratios in an empirical way, by means of an information-theoretic metric known as cross-entropy. We start by identifying the desirable properties of the likelihood ratio, adopting a decision-theoretic perspective to describe the inferential process in a given forensic case. We then introduce the proposed cross-entropy metric, showing that it consists of two important components: refinement and calibration. The former is related to the ability of likelihood ratios to discriminate between cases where each of the proposed propositions is true. The latter has also been called reliability, and we highlight its importance as a desirable property of the likelihood ratio. We present some examples of empirical performance measurement with speech databases from NIST evaluation campaigns (some of them including hundreds of speakers) and with glass evidence from real forensic cases, showing that the quantitative measure of the calibration of the likelihood ratios illustrates their reliability. We finally relate cross-entropy, calibration and refinement to other performance representations such as Tippett plots.
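
A minimal sketch of the kind of empirical, cross-entropy-style metric described (the widely used Cllr formulation) is given below; it is the standard textbook formula, not necessarily the exact variant presented in the talk.

    import numpy as np

    def cllr(lr_same_source, lr_different_source):
        """lr_same_source: LRs from cases where the same-source proposition is true;
        lr_different_source: LRs from cases where the different-source proposition is true."""
        lr_ss = np.asarray(lr_same_source, dtype=float)
        lr_ds = np.asarray(lr_different_source, dtype=float)
        loss_ss = np.mean(np.log2(1.0 + 1.0 / lr_ss))   # penalizes small LRs when same-source is true
        loss_ds = np.mean(np.log2(1.0 + lr_ds))         # penalizes large LRs when different-source is true
        return 0.5 * (loss_ss + loss_ds)                 # 0 = perfect; 1 corresponds to uninformative LR = 1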

Aspectos forenses e identificación de locutores mediante trayectorias temporales en unidades lingüísticas[Go to top]

Presenter:

Joaquín González-Rodríguez

Date of presentation:

September 17, 2012

Presentation place:

Abstract:

The talk will review several forensic aspects rarely discussed within the group, such as the difference between a scientific mindset and a police mindset in the analysis of forensic evidence, and will show the classical approach to the computation of likelihood ratios (LRs) from DNA profiles. The second half of the talk will deal with the decomposition of identity information into multiple contributions from each of the linguistic units under analysis, either through formant trajectories or through MFCC trajectories in those units.