ISSN: 2229-371X


Editorial | Open Access

Computer Graphics 2016: Co-articulation and speech synchronization in MPEG-4 based facial animation - Abdennour El Rhalibi - Liverpool John Moores University

Abstract

In this talk, Professor Abdennour El Rhalibi will present a summary of his research in game technologies at LJMU. He will present some recent projects developed with BBC R&D on game middleware development and facial animation. In particular, he will introduce a novel framework for co-articulation and speech synchronization in MPEG-4 based facial animation. The system, referred to as Charisma, enables the creation, editing, and playback of high-resolution 3D models and MPEG-4 animation streams, and is compatible with well-known related systems such as Greta and Xface. It supports text-to-speech for dynamic speech synchronization, and it enables real-time model simplification using quadric-based surfaces. The co-articulation approach provides realistic, high-performance lip-sync animation, based on Cohen-Massaro's model of co-articulation adapted to the MPEG-4 facial animation (FA) specification. He will also discuss experiments showing that the co-articulation technique gives overall good results when compared to related state-of-the-art techniques.

Facial animation has long held immense fascination and interest for the research community and industry, and its uses are immensely broad: in the movie industry, it is used to generate CGI characters or even to recreate real actors in CGI animation; in the games industry, coupled with advances in both hardware and software, it has permitted the creation of realistic characters that immerse players like never before. More recently, facial animation has also reached other sectors and applications, such as virtual presence and medical research, and a few toolkits and frameworks, such as FaceGen, are dedicated to it. This broad potential has generated intense, dedicated research over the past three decades, leading to several branches that focus on key aspects such as modelling (e.g. wrinkles, skin, and hair) and embodied agent systems that enhance the user's experience by endowing virtual characters with subsystems for personality, emotions, and speech moods. These subsystems help to mimic our behaviours and physical appearance more realistically, and all play an important role in communication, either directly through speech or indirectly through body gestures, gaze, moods, and expressions. However, speech is the principal direct means of communication between embodied agents and the user, which justifies the efforts made by the research community over the last thirty years toward realistic synthetic speech and lip movements for virtual characters. In the past two decades, several advances have permitted the creation of synthetic audio-visual speech for virtual characters, including speech processing and text-to-speech engines such as MaryTTS (Schröder, 2001).
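As an illustration of the simplification step, below is a minimal sketch of quadric-error-metric simplification in the style of Garland and Heckbert, which is most plausibly what the "quadric-based surfaces" mentioned above refer to. The mesh layout and function names are illustrative assumptions, not the Charisma API.

```python
# Sketch of quadric-error-metric mesh simplification (Garland-Heckbert style).
# Assumptions: vertices is an (n, 3) float array, faces a list of index triples.
import numpy as np

def plane_quadric(p0, p1, p2):
    """Fundamental quadric K = p p^T for the plane through one triangle."""
    n = np.cross(p1 - p0, p2 - p0)
    n = n / np.linalg.norm(n)        # unit normal (assumes non-degenerate face)
    d = -np.dot(n, p0)
    p = np.append(n, d)              # plane [a, b, c, d] with ax + by + cz + d = 0
    return np.outer(p, p)            # 4x4 symmetric quadric

def vertex_quadrics(vertices, faces):
    """Each vertex accumulates the quadrics of its incident triangles."""
    Q = np.zeros((len(vertices), 4, 4))
    for i, j, k in faces:
        K = plane_quadric(vertices[i], vertices[j], vertices[k])
        Q[i] += K; Q[j] += K; Q[k] += K
    return Q

def contraction_cost(Qi, Qj, v):
    """Cost of collapsing edge (i, j) to position v: h^T (Qi + Qj) h."""
    h = np.append(v, 1.0)            # homogeneous coordinate
    return h @ (Qi + Qj) @ h
```

A full simplifier would repeatedly collapse the cheapest edge, placing the new vertex at the minimizer of the combined quadric (or simply at the edge midpoint), and update the affected quadrics until the target triangle count is reached.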
With the creation of these speech engines, coarticulation also saw several advances (Massaro, 1993; Pelachaud, 2002; Sumedha and Nadia, 2003; Terry and Katsaggelos, 2008), which have been further complemented by advances in facial animation. Initial lip-sync studies attempted to concatenate phonetic segments; however, it was found that phonemes do not always reach their ideal target shape, due to the influence of consecutive phonetic segments on each other. This phenomenon is known as coarticulation: the phonetic overlap, or changes in articulation, that occur between segments and prevent a segment from reaching its perfect target shape. Coarticulation effects fall into two main phenomena: perseverative coarticulation, when the affected segments are the preceding ones, and anticipatory coarticulation, when segments are affected by the upcoming ones.
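To make the mechanism concrete, here is a minimal sketch of Cohen-Massaro dominance blending, the coarticulation model the talk adapts to MPEG-4 facial animation parameters (FAPs). The parameter values and per-segment data are illustrative assumptions, not the values used in Charisma.

```python
# Sketch of Cohen-Massaro dominance blending for one facial parameter.
import math

def dominance(t, center, alpha, theta, c=1.0):
    """Negative-exponential dominance of a segment at time t (seconds)."""
    return alpha * math.exp(-theta * abs(t - center) ** c)

def blended_parameter(t, segments):
    """Dominance-weighted average of viseme targets for one FAP.

    Each segment is (center_time, target_value, alpha, theta).
    """
    num = den = 0.0
    for center, target, alpha, theta in segments:
        d = dominance(t, center, alpha, theta)
        num += d * target
        den += d
    return num / den if den else 0.0

# Hypothetical lip-opening targets for the phoneme sequence /m a m/:
segments = [(0.00, 0.1, 1.0, 12.0),   # /m/: lips nearly closed
            (0.15, 0.9, 1.0, 8.0),    # /a/: mouth wide open
            (0.30, 0.1, 1.0, 12.0)]   # /m/: lips nearly closed
trajectory = [blended_parameter(0.01 * k, segments) for k in range(31)]
```

Because every segment's dominance function extends over its neighbours, the blended trajectory never snaps to a pure viseme target: upcoming segments pull the parameter early (anticipatory coarticulation) and preceding segments linger (perseverative coarticulation).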

Biography:

Abdennour El Rhalibi is a Professor of Entertainment Computing and Head of Strategic Projects at Liverpool John Moores University. He is Head of the Computer Games Research Lab at the Protect Research Centre. He has over 22 years' experience in research and teaching in computer science. He has worked as Lead Researcher on three EU projects in France and the UK. His current research involves game technologies and applied artificial intelligence. For six years he has been leading several projects in Entertainment Computing funded by the BBC and UK-based games companies, involving cross-platform development tools for games, 3D web-based game middleware development, state synchronisation in multiplayer online games, peer-to-peer MMOGs, and 3D character animation. He has published over 150 publications in these areas. He serves on many journal editorial boards, including ACM Computers in Entertainment and the International Journal of Computer Games Technology. He has served as Chair and IPC member at over 100 conferences on computer entertainment, AI, and VR. He is a member of many international research committees in AI and Entertainment Computing, including IEEE MMTC IG: 3D Rendering, Processing and Communications (3DRPCIG), the IEEE Task Force on Computational Intelligence in Video Games, and IFIP WG 14.4 Games and Entertainment Computing.

Abdennour El Rhalibi
