Speaker: Pilar Fernández Gallego

Abstract:

Automatic Speech Recognition (ASR) systems show notable performance disparities across demographic groups, raising important concerns about fairness in AI technologies. Recent investigations have documented gender, dialect, and skin tone biases in ASR models. One study analyzed the intersection of gender and minority dialects in the United States, building a podcast dataset annotated for gender and dialect, including African American Vernacular English (AAVE), Chicano English, and Spanglish. Another study examined the performance of the Whisper ASR system across a diverse range of native and non-native English accents. Collectively, these studies highlight how current ASR systems fall short of equitable performance, particularly for speakers of minority dialects and speakers from diverse demographic backgrounds.