People highlight the intended interpretation of their utterances within a larger discourse by a diverse set of non-verbal signals. These signals represent a key challenge for animated conversational agents because they are pervasive, variable, and need to be coordinated judiciously in an effective contribution to conversation. In this paper, we describe a freely available cross-platform real-time facial animation system, RUTH, that animates such high-level signals in synchrony with speech and lip movements. RUTH adopts an open, layered architecture in which fine-grained features of the animation can be derived by rule from inferred linguistic structure, allowing us to use RUTH, in conjunction with annotation of observed discourse, to investigate the meaningful high-level elements of conversational facial movement for American English speakers.
All Science Journal Classification (ASJC) codes
- Computer Graphics and Computer-Aided Design
- Embodied conversational agents
- Facial animation