This manual provides comprehensive information on the Experience-Based Language Acquisition (EBLA) system, a computational model designed to understand natural language in a humanlike way. Informed by cognitive development in infants, EBLA offers an open framework for visual perception and grounded language acquisition. It processes video input to identify objects and object-object relationships, and from that input acquires a protolanguage of nouns and verbs. The system can then perform basic scene analysis and generate descriptions of novel visual content, marking a notable step forward in artificial language acquisition.
The purpose of this manual is to detail the architecture and operational procedures of the EBLA system. It covers the three-stage process of vision processing, entity extraction, and lexical resolution, along with the underlying algorithms, such as mean shift image segmentation and cross-situational learning. The guide is intended for researchers, developers, and students working in natural language processing, computer vision, and artificial intelligence, offering insight into the system's design, implementation, and evaluation metrics for acquisition speed and description accuracy.
Almost from the very beginning of the digital age, people have sought better ways to communicate with computers. This research investigates how computers might be enabled to understand natural language in a more humanlike way. Based, in part, on cognitive development in infants, we introduce an open computational framework for visual perception and grounded language acquisition called Experience-Based Language Acquisition (EBLA). EBLA can watch a series of short videos and acquire a simple language of nouns and verbs corresponding to the objects and object-object relations in those videos. Upon acquiring this protolanguage, EBLA can perform basic scene analysis to generate descriptions of novel videos.

The general architecture of EBLA comprises three stages: vision processing, entity extraction, and lexical resolution. In the vision processing stage, EBLA processes the individual frames of short videos, using a variation of the mean shift analysis image segmentation algorithm to identify and store information about significant objects. In the entity extraction stage, EBLA abstracts information about the significant objects in each video, and the relationships among those objects, into internal representations called entities. Finally, in the lexical resolution stage, EBLA extracts the individual lexemes (words) from simple descriptions of each video and attempts to generate entity-lexeme mappings using an inference technique called cross-situational learning. EBLA is not primed with a base lexicon, so it must bootstrap its lexicon from scratch.

The performance of EBLA has been evaluated on acquisition speed and accuracy of scene descriptions. For a test set of simple animations, EBLA achieved average acquisition success rates as high as 100% and average description success rates as high as 96.7%. For a larger set of real videos, it achieved average acquisition success rates as high as 95.8% and average description success rates as high as 65.3%.
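The mean shift idea underlying the vision processing stage can be illustrated with a minimal sketch. The code below is not EBLA's segmentation code; it is a simplified flat-kernel mean shift clusterer over arbitrary feature vectors, written for illustration (for image segmentation, each feature vector would typically combine a pixel's color with its position). Each point is iteratively shifted to the mean of its neighborhood; points that converge to the same density mode fall into the same segment. The `bandwidth` parameter and all names are illustrative choices, not values from EBLA.

```python
import numpy as np

def mean_shift(points, bandwidth=2.0, max_iter=50, tol=1e-3):
    """Flat-kernel mean shift: move each point toward the mean of the
    original points within `bandwidth`; group converged points by mode."""
    points = np.asarray(points, dtype=float)
    shifted = points.copy()
    for _ in range(max_iter):
        moved = 0.0
        for i, p in enumerate(shifted):
            # neighbors of p among the original points
            dists = np.linalg.norm(points - p, axis=1)
            neighbors = points[dists <= bandwidth]
            new_p = neighbors.mean(axis=0)
            moved = max(moved, np.linalg.norm(new_p - p))
            shifted[i] = new_p
        if moved < tol:
            break
    # merge converged points into modes; one label per mode
    labels, modes = [], []
    for p in shifted:
        for j, m in enumerate(modes):
            if np.linalg.norm(p - m) < bandwidth / 2:
                labels.append(j)
                break
        else:
            modes.append(p)
            labels.append(len(modes) - 1)
    return np.array(labels), np.array(modes)
```

Applied to pixel features, the resulting labels partition a frame into regions, from which per-object statistics can then be gathered.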
The lower description success rate for the real videos is attributed to the wide variance in entities across those videos. While several prior systems have learned object or event labels for videos, EBLA is the first known system to acquire both nouns and verbs using a grounded computer vision system.
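The cross-situational inference named in the abstract can be sketched in a few lines: for each lexeme, intersect the sets of entities present in every video whose description contains that lexeme, then iteratively commit any lexeme left with a single candidate and remove that entity from the other lexemes' candidate sets. This is a simplified illustration of the general technique, not EBLA's actual algorithm; the lexemes and entity labels below are invented for the example.

```python
def cross_situational(observations):
    """observations: list of (lexemes, entities) pairs, one per video.
    A lexeme's candidate meanings are the entities present in every
    video whose description contains that lexeme."""
    candidates = {}
    for lexemes, entities in observations:
        for w in lexemes:
            if w in candidates:
                candidates[w] &= set(entities)
            else:
                candidates[w] = set(entities)
    return candidates

def resolve(candidates):
    """Commit unambiguous mappings, then propagate: a resolved entity
    is removed from every other lexeme's candidate set, which may
    leave further lexemes unambiguous."""
    resolved = {}
    changed = True
    while changed:
        changed = False
        for w, cands in candidates.items():
            if w not in resolved and len(cands) == 1:
                (entity,) = cands
                resolved[w] = entity
                for other, other_cands in candidates.items():
                    if other != w and entity in other_cands:
                        other_cands.discard(entity)
                        changed = True
    return resolved
```

With three hypothetical videos, the overlap between descriptions is enough to disambiguate every lexeme, which mirrors why the animations, with their repeated entities, scored higher than the more varied real videos.

```python
observations = [
    ({"ball", "rolls"}, {"BALL", "ROLL"}),
    ({"ball", "bounces"}, {"BALL", "BOUNCE"}),
    ({"box", "rolls"}, {"BOX", "ROLL"}),
]
mappings = resolve(cross_situational(observations))
```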
Author: Pangburn, Brian E.
Publisher: Dissertation.Com
Illustration: N
Language: ENG
Title: Experience-Based Language Acquisition: A Computational Model of Human Language Acquisition
Pages: 00142 (Encrypted PDF)
On Sale: 2005-10-31
SKU-13/ISBN: 9781581121711
Category: Computers : General