
Prof. Carlos Busso

Keynote Speech 4: Tuesday 09:00-10:00, December 19th, 2023



Can the Production and Perception of Human Emotions Inspire Speech-Based Affective Computing?

Abstract

Emotions play an important role in human-human interactions, influencing our decision-making processes, the manner in which we express ourselves, and how our interlocutors respond to us. It is therefore important to advance affective computing systems that analyze, recognize, and synthesize emotions using computational models. This talk will focus on speech emotion recognition (SER), although the observations presented are also relevant to other speech tasks in affective computing. The intrinsic variability with which we express and perceive emotions makes SER a unique and challenging research problem that differs from other classical machine learning (ML) tasks. A key difference from other problems is that we do not have ground-truth labels describing the felt emotion of the speaker of a target sentence. The prevalent strategy therefore relies on perceptual assessments collected from diverse evaluators, which can lead to variability in their perceived emotional interpretations. Some may regard SER as a noisy ML problem, given the inter-evaluator differences affecting the labels. Our thesis, however, is that the way we externalize and perceive emotions conveys valuable information that can inform the design of better speech-based affective computing technology. This talk will describe principled observations rooted in the production and perception of emotions that have direct implications for the design of SER systems, including (1) the ordinal nature of emotion, (2) the nonuniform externalization of emotions, (3) the specific modulation observed in speech for each emotional attribute, and (4) the intrinsic relation between speech and other modalities, including facial expressions, that can be leveraged even if the ultimate goal is a speech-based system.

 

Biography

Carlos Busso is a Professor in the Department of Electrical and Computer Engineering at the University of Texas at Dallas, where he directs the Multimodal Signal Processing (MSP) Laboratory. His research interests lie in human-centered multimodal machine intelligence and its applications, with a focus on the broad areas of affective computing, multimodal human-machine interfaces, in-vehicle active safety systems, and machine learning methods for multimodal processing. He has worked on audio-visual emotion recognition, analysis of emotional modulation in gestures and speech, design of realistic human-like virtual characters, and detection of driver distractions. He is a recipient of an NSF CAREER Award. In 2014, he received the ICMI Ten-Year Technical Impact Award. In 2015, his student (N. Li) received third prize in the IEEE ITSS Best Dissertation Award competition. He also received the Hewlett Packard Best Paper Award at IEEE ICME 2011 (with J. Jain) and the Best Paper Award at AAAC ACII 2017 (with Yannakakis and Cowie). He received the Best of IEEE Transactions on Affective Computing Paper Collection award in 2021 (with R. Lotfian) and the Best Paper Award from IEEE Transactions on Affective Computing in 2022 (with Yannakakis and Cowie). In 2023, he received the Distinguished Alumni Award in the Mid-Career/Academia category from the Signal and Image Processing Institute (SIPI) at the University of Southern California. He currently serves as an associate editor of IEEE Transactions on Affective Computing. He is an IEEE Fellow, a member of ISCA and AAAC, and a senior member of ACM.


WELCOME TO TAIWAN


Congress Secretariat

Elite Professional Conference Organizer
