This entry describes Audiovisual Speech Processing, edited by Gerard Bailly, Pascal Perrier, and Eric Vatikiotis-Bateson and published by Cambridge University Press. The book examines the relationship between visible facial movements and audible speech acoustics: how these signals are produced and perceived, how they interact in human communication, and how auditory and visual information is integrated to access the mental lexicon and processed in the brain. It also covers the production and perception of multimodal speech, including the coordination of structures across modalities, and closes with overviews of recent advances in machine-based audiovisual speech recognition and synthesis.
The book addresses fundamental questions about human audiovisual capabilities, the coordination of speech production and perception, and current machine approaches to speech synthesis and recognition. It is intended for researchers, students, and professionals in linguistics, phonetics, computer science, and cognitive science, offering a detailed account of how visible and audible cues contribute to spoken communication and how they can be exploited in technological applications.
When we speak, we configure the vocal tract, which shapes the visible motions of the face and the patterning of the audible speech acoustics. Similarly, we use these visible and audible behaviors to perceive speech. This book showcases a broad range of research investigating how these two types of signals are used in spoken communication, how they interact, and how they can be used to enhance the realistic synthesis and recognition of audible and visible speech. The volume begins by addressing two important questions about human audiovisual performance: how auditory and visual signals combine to access the mental lexicon and where in the brain this and related processes take place. It then turns to the production and perception of multimodal speech and how structures are coordinated within and across the two modalities. Finally, the book presents overviews and recent developments in machine-based speech recognition and synthesis of AV speech.
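The blurb's closing point about machine-based recognition of AV speech can be made concrete with a small illustration. A common baseline in that literature is decision-level ("late") fusion, in which per-class scores from separate audio and visual classifiers are combined with a reliability weight. The sketch below is purely illustrative and is not taken from the book; the function name, the toy scores, and the weight value are all assumptions.

```python
import numpy as np

def late_fusion(audio_log_probs, visual_log_probs, alpha=0.7):
    """Decision-level (late) fusion of audio and visual streams.

    Combines per-class log-probabilities from two independent
    classifiers with a reliability weight alpha in [0, 1]:
    higher alpha trusts the audio stream more (e.g. clean audio),
    lower alpha shifts weight to the visual stream (e.g. noise).
    Returns the index of the winning class.
    """
    fused = alpha * audio_log_probs + (1.0 - alpha) * visual_log_probs
    return int(np.argmax(fused))

# Toy example: three candidate words, scores as log-probabilities.
audio = np.log(np.array([0.5, 0.3, 0.2]))     # audio classifier output
visual = np.log(np.array([0.2, 0.2, 0.6]))    # visual (lip-reading) output
print(late_fusion(audio, visual, alpha=0.4))  # noisy audio: trusts lips, prints 2
```

The weight alpha stands in for any estimate of stream reliability; in practice it might be set from the acoustic signal-to-noise ratio rather than fixed by hand.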
Editor: Bailly, Gerard
Editor: Perrier, Pascal
Editor: Vatikiotis-Bateson, Eric
Publisher: Cambridge University Press
Illustration: N
Language: ENG
Title: Audiovisual Speech Processing
Pages: 500 (Encrypted PDF)
On Sale: 2012-04-30
SKU-13/ISBN: 9781107006829
Category: Language Arts & Disciplines : Linguistics - Phonetics & Phonology