ISI Language Grounding Data Set
Data on this page is described in the following publication:
Yonatan Bisk, Daniel Marcu, and William Wong. Towards a Dataset for Human Computer Communication via Grounded Language Acquisition. In Proceedings of the AAAI 2016 Workshop on Symbiotic Cognitive Systems.
Preliminary models on this data are described in the following forthcoming publication:
Yonatan Bisk, Deniz Yuret, and Daniel Marcu. Natural Language Communication with Robots. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2016).
All data takes the form of Problem-Solution Sequences (PSSs), like the one pictured above. A series of images containing blocks in a 3D environment are rearranged to accomplish some goal. The initial dataset release focuses on drawing 100 digits from the MNIST corpus, which have been downsampled to require 20 or fewer blocks. The data was collected via Amazon Mechanical Turk: annotators were asked to provide directions (as they would to a friend) for completing the task, which might be the movement of a single block or a sequence of actions. No restrictions were placed on the annotators' language. This leads to substantial ambiguity, both in the phrasing of similar actions and in the task of grounding the specific entities being referenced.
Images are needed only if vision algorithms are to be employed; otherwise, the location and ID of every block are available in the JSON files (<1 MB vs. 360 MB for images).
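Since the JSON files already contain the location and ID of every block, a vision-free pipeline can work directly from them. The sketch below shows one way to pull (ID, position) pairs out of a world state; the field names ("block_state", "id", "position") are illustrative assumptions, not the release's actual schema, so check the files or the GroundedLanguage repository for the real keys.

```python
# Minimal sketch for reading block positions from a PSS world state.
# NOTE: the schema below ("block_state", "id", "position") is a guess
# for illustration; consult the released JSON files for the real keys.
def load_blocks(record):
    """Return a list of (block_id, (x, y, z)) pairs from one world state."""
    return [(b["id"], tuple(b["position"])) for b in record["block_state"]]

# Toy example in the assumed format:
example = {
    "block_state": [
        {"id": 1, "position": [0.1, 0.0, 0.2]},
        {"id": 2, "position": [0.4, 0.0, 0.2]},
    ]
}
print(load_blocks(example))
```

With positions in hand, the difference between two consecutive world states identifies which block moved and where, which is the supervision signal the single-action (A0) annotations describe.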
Code
Conveniently manipulating the data or training models: https://github.com/ybisk/GroundedLanguage
Our simulator/environment: https://github.com/danielmarcu/ISI-CWIC/
FAQ
Block Decoration: Each sequence (a JSON object in the files) has a field labeled "decoration", which takes the values logo/digit/blank.
A0/A1/A2: A0 refers to single actions, A1 to short sequences, and A2 to annotations of the full sequences.
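Since each sequence carries a "decoration" field with the values logo/digit/blank, splitting the corpus by decoration type is a one-pass grouping. A minimal sketch, assuming sequences are already loaded as a list of dicts (the surrounding file layout is not specified here):

```python
# Group loaded sequences by their "decoration" field, as described in the FAQ.
# The "decoration" key and its logo/digit/blank values come from the dataset
# description; how the sequences are stored on disk is assumed, not specified.
def group_by_decoration(sequences):
    groups = {"logo": [], "digit": [], "blank": []}
    for seq in sequences:
        groups[seq["decoration"]].append(seq)
    return groups

# Toy example:
seqs = [{"decoration": "digit"}, {"decoration": "blank"}, {"decoration": "digit"}]
groups = group_by_decoration(seqs)
print({k: len(v) for k, v in groups.items()})
```

A filter like this is useful because the initial release focuses on the digit decoration, so experiments on that subset can select it explicitly rather than assume the whole corpus.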