Introduction

The ILHAIRE project will last 36 months. It will gather data on laughter using high-quality sound, video, and facial and upper-body motion capture. Database collection will be grounded in psychological foundations, and the data will be used to validate computational and theoretical models for the analysis and synthesis of audio-visual laughter. Dialogue systems between humans and avatars will be developed, and studies will be conducted to capture the qualitative experience evoked by a laughing avatar.
- Work Package 1
- Work Package 2
- Work Package 3
- Work Package 4
- Work Package 5
- Work Package 6
- Work Package 7
Incremental Database

WP1 provides the resources required by most of the other work packages, assembling existing resources and generating new ones to create an incremental database. Initially, existing resources containing audio-visual records of laughter will be assembled. These will be used to construct an annotated database of multimodal records of laughter in naturalistic interactions, incorporating a system of labels that distinguishes the main kinds and functions of laughter represented in it.
WP1 will also use focus groups and large-scale questionnaires to identify a range of material that reliably makes different kinds of people laugh. In the early stages this will focus on material in English; at a later stage it will be extended to other languages and cultures. The laughter-inducing material will also inform experiments designed to induce and record naturalistic laughter using both audio-visual and motion capture recording techniques.
Multimodal Analysis and Recognition of Laughter

WP2 concerns multimodal analysis and modelling of laughter, covering both hilarious and conversational (social) laughter. Its goals are:
- to infer the sequences of phonemes and facial action units of laughs;
- to define a set of expressive gesture features for analysing gesture during laughter;
- to develop novel fusion algorithms, based on the current signal context, for integrating information from the auditory, facial, and gestural channels;
- to automatically detect and classify laughter based on this integrated multimodal information.
Moreover, WP2 aims to investigate the influence of culture and gender to improve laughter detection. The models and techniques developed here will form the basis for the adaptive models employed in laughter synthesis. The work is organised in steps, following a spiral research and development approach designed to converge towards the final results: a first step uses existing material to produce baseline methods; these methods are then refined for hilarious laughter and finally integrated for specific analyses of social laughter.
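To make the fusion idea concrete, here is a minimal late-fusion sketch. The channel names, scores, and weights are hypothetical stand-ins, not the project's actual algorithms: each modality's classifier emits a laughter probability, and a context-dependent weighting (e.g. down-weighting an unreliable channel) combines them into one decision.

```python
def fuse_scores(scores, weights):
    """Weighted average of per-channel laughter probabilities."""
    total = sum(weights.values())
    return sum(scores[ch] * weights[ch] for ch in scores) / total

def detect_laughter(scores, weights, threshold=0.5):
    """Classify the current frame as laughter if the fused score passes a threshold."""
    return fuse_scores(scores, weights) >= threshold

# Hypothetical frame: audio strongly suggests laughter, face moderately,
# gesture weakly; audio is weighted highest in this context.
scores = {"audio": 0.9, "facial": 0.6, "gesture": 0.3}
weights = {"audio": 0.5, "facial": 0.3, "gesture": 0.2}
is_laugh = detect_laughter(scores, weights)  # fused score 0.69 -> True
```

In practice the weights themselves would be learned and adapted to the signal context rather than fixed as above.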
Multimodal Generation and Synthesis of Laughter

WP3 deals with the development of models for the generation and synthesis of audiovisual laughter.
The task of the Generation stage is to describe laughter episodes at the behavioural level (audio structure, body posture, facial action unit sequences), so as to make them appropriate in a given conversational context. The input of the Generation stage is provided by the Dialogue module (WP4), which makes decisions on the timing, duration, and style of laughter as a function of the application scenario and according to user interaction.
The Synthesis stage then produces the acoustic and visual representation of laughter from this behavioural description. Audio synthesis will be based on statistical parametric synthesis (HMMs), adapted to the specifically inarticulate nature of laughter. Visual synthesis is based on Finite State Machines in which states are mostly associated with body, head, or facial postures, and transitions provide natural-looking movements from and to these postures.
Both stages rely heavily on the annotated laughter databases obtained from WP1 and analysed in WP2: first, for training the HMM models involved in audio synthesis and for providing a large collection of examples of visual laughter to choose from for visual synthesis; second, for copy-synthesis experiments that check each stage separately.
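The Finite State Machine idea behind visual synthesis can be sketched as follows. The states and events below are hypothetical illustrations; the real states and transitions would be derived from the annotated databases.

```python
class PostureFSM:
    """Minimal FSM: states are postures, transitions are triggered by events."""

    def __init__(self, transitions, start):
        self.transitions = transitions  # state -> {event: next_state}
        self.state = start

    def step(self, event):
        nxt = self.transitions.get(self.state, {}).get(event)
        if nxt is None:
            raise ValueError(f"no transition from {self.state!r} on {event!r}")
        self.state = nxt
        return self.state

# Hypothetical posture states and laughter events for illustration only.
transitions = {
    "neutral": {"laugh_onset": "smile"},
    "smile": {"intensify": "open_mouth_laugh", "decay": "neutral"},
    "open_mouth_laugh": {"decay": "smile"},
}

fsm = PostureFSM(transitions, "neutral")
fsm.step("laugh_onset")  # -> "smile"
fsm.step("intensify")    # -> "open_mouth_laugh"
```

In the full system, each transition would additionally drive an animation segment producing a natural-looking movement between the two postures, rather than an instantaneous state change.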
Laughter-enabled Dialogue Modelling

The objective of WP4 is to design an adaptive, data-driven multimodal dialogue management system for achieving natural and user-friendly interactions integrating laughter. The dialogue management task will focus on non-linguistic information to generate acceptable and appropriate types of laughter at appropriate times. Machine learning methods such as reinforcement learning will serve to optimise the interaction strategy and to decide the most appropriate moments at which to generate laughter. WP4 will first use existing data and data collected in WP1 to generate baseline dialogue strategies, and will later use inputs from WP2 to adapt the strategy online. It will provide information to the laughter synthesis system developed in WP3.
More specifically, the goals of this WP are to:
- Develop and evaluate a robust and adaptive dialogue manager capable of integrating information from the multimodal analysis of the user inputs and laughter recognition (WP2);
- Develop data-driven Reinforcement Learning methods able to handle large state spaces for optimising dialogue management, based on the current context, user inputs, and laughter gathered from WP2 and data collected in WP1;
- Develop imitation-based algorithms, based on rules or on learning from data, to artificially expand data sets and to mimic human users.
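The reinforcement-learning idea above can be illustrated with a deliberately tiny tabular sketch. The states, actions, and reward values here are hypothetical stand-ins; the actual dialogue manager would operate over large state spaces built from WP1 data and WP2 recognition outputs.

```python
import random

STATES = ["user_laughing", "user_neutral"]
ACTIONS = ["laugh", "stay_silent"]

def reward(state, action):
    # Hypothetical reward: joining in when the user laughs is rewarded,
    # laughing out of context is penalised.
    if state == "user_laughing":
        return 1.0 if action == "laugh" else 0.0
    return -1.0 if action == "laugh" else 0.2

def train(episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Epsilon-greedy one-step value learning over a toy state space."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice(STATES)
        if rng.random() < epsilon:
            a = rng.choice(ACTIONS)  # explore
        else:
            a = max(ACTIONS, key=lambda act: q[(s, act)])  # exploit
        # One-step update; a full dialogue MDP would also bootstrap from
        # the value of the next state.
        q[(s, a)] += alpha * (reward(s, a) - q[(s, a)])
    return q

q = train()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}
# Learned policy: laugh along when the user laughs, otherwise stay silent.
```

The real system faces the harder problems the bullets above name: large state spaces, online adaptation, and data augmentation via user simulation.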
Psychological Foundations of Laughter

WP5 will lay the psychological foundations of laughter. The goal is to understand the factors affecting the conveyance and perception of laughter expressions. Psychologically appropriate methods (quantitative and qualitative) for assessing affective and cognitive responses to laughing avatars will be developed.
In experimental settings, personality characteristics, such as trait cheerfulness or dispositions to being laughed at and ridicule (e.g., gelotophobia, the fear of being laughed at) will be considered, as they predispose individuals to different responses to laughter and laughter-related stimuli. It is expected that persons with different personalities will respond differently to avatars and their laughter.
Knowledge of these differences will help identify the factors that make an avatar's laughter contagious and positively valued (sound, facial expression, intensity). A further goal is to identify a model of mimicry, counter-mimicry, and emotional contagion in multimodal responses, also in light of cultural differences.
Integration and Evaluation

This WP has two main tasks: one related to the integration of the various technologies developed within ILHAIRE, and one regarding the evaluation of the integrated system and of its components.
The first task concerns integrating the various components developed within the ILHAIRE project (laughter recognition, dialogue manager, laughing conversational model, contagion model, etc.). System integration will proceed in three phases, one per year.
Regarding the evaluation studies, we will verify the contribution of expressive features in laughter through experimental procedures. Congruent and incongruent combinations of facial expressions and body movements arising during acted and spontaneous laughter will be shown to participants.
Both quantitative and qualitative approaches (e.g., semi-structured interviews, effect on task performance) will be used to evaluate the level of emotional contagion triggered by the avatar's laughter expressions. The results will be used to refine the emotional contagion probabilistic model driving the ECA.
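As a rough illustration of what a contagion model driving the ECA might look like, here is a minimal update rule. Its form, parameters, and values are assumptions for illustration, not the project's model: the avatar's laughter intensity nudges the estimated probability that the user is amused, with relaxation toward a baseline when the stimulus fades.

```python
def contagion_step(p_amused, avatar_intensity, gain=0.3, decay=0.1, baseline=0.2):
    """One update of a hypothetical contagion estimate, kept in [0, 1]."""
    # Avatar laughter pushes the estimate up, proportionally to the room left.
    p = p_amused + gain * avatar_intensity * (1.0 - p_amused)
    # Without stimulus, the estimate relaxes toward a resting baseline.
    p += decay * (baseline - p)
    return min(1.0, max(0.0, p))
```

Evaluation data of the kind described above (interviews, task-performance effects) would be what refines such a model's structure and parameters.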
The WP will also establish the dimensions under which the evaluation will be conducted, and perform those evaluations, in particular in terms of acceptability, believability, added-value and impact.
Dissemination and Exploitation

This work package includes the creation of this website for the dissemination of project results. Other communication channels will also be used, through participation in conferences and the creation of showcases such as a laughter authenticity detector, a hilarious contagious laughter machine, and a laughter-driven virtual agent. These showcases will be inspired by the evaluation WP.
Other actions will be carried out to maximise the reuse potential of project knowledge and software, in connection with other projects of the partners.
- University of Mons
- University of Augsburg
- Università degli Studi di Genova
- University College London
- Queen's University Belfast
- University of Zurich