The MegaAcceptability dataset

Authors: Aaron Steven White, Hannah YoungEun An, and Kyle Rawlins

Contact: aaron.white@rochester.edu, yan2@ur.rochester.edu, kgr@jhu.edu

Version: 2.0

Release date: 14 Aug 2019

Overview

The MegaAcceptability dataset consists of ordinal acceptability judgments for 1,007 clause-embedding verbs of English in 50 surface-syntactic frames and three matrix tenses. It combines MegaAcceptability version 1.0 with data for 25,000 additional verb-frame pairs, collected on Amazon’s Mechanical Turk using Ibex.

For a detailed description of the dataset, the item construction and collection methods, and discussion of how to use a dataset on this scale to address questions in linguistic theory, please see the following papers:

White, A.S. & K. Rawlins. 2020. Frequency, acceptability, and selection: A case study of clause-embedding. Accepted to Glossa.

An, H.Y. & A.S. White. 2020. The lexical and grammatical sources of neg-raising inferences. Proceedings of the Society for Computation in Linguistics 3:23, 220-233.

White, A. S. & K. Rawlins. 2016. A computational model of S-selection. In M. Moroney, C-R. Little, J. Collard & D. Burgdorf (eds.), Semantics and Linguistic Theory 26, 641-663. Ithaca, NY: CLC Publications.

If you make use of this dataset in a presentation or publication, we ask that you please cite these papers.

Version history

1.0: first public release, 30 Oct 2016
1.1: formatting update, 14 Aug 2019
2.0: first public release, 14 Aug 2019

Description

Column	Description	Values
participant	anonymous integer identifier for participant that provided the response	0…1293
list	integer identifier for list participant was responding to	0…1499
presentationorder	relative position of item in list	1…50
verb	clause-embedding verb found in the item	see paper
frame	clausal complement found in the item	see paper
tense	matrix tense found in the item	`present`, `past`, `past_progressive`
response	ordinal scale acceptability response	1…7
nativeenglish	whether the participant reported speaking American English natively	`True`, `False`
sentence	sentence that was judged	see paper
version	MegaAcceptability dataset version where the judgment first appeared	1, 2
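
As a rough illustration of how the columns above might be read into typed records, here is a minimal sketch using only the Python standard library. It assumes the data are distributed as a tab-separated file with a header row matching the column names, and that missing responses are written as the string `NA`; neither assumption is confirmed here. The sample rows are invented, and `VERB_A`, `FRAME_A`, and the sentences are placeholders rather than real items.

```python
import csv
import io

# Invented rows for illustration only. The verb, frame, and sentence values
# are placeholders, not items from the dataset.
SAMPLE_TSV = (
    "participant\tlist\tpresentationorder\tverb\tframe\ttense\t"
    "response\tnativeenglish\tsentence\tversion\n"
    "0\t12\t1\tVERB_A\tFRAME_A\tpast\t6\tTrue\tplaceholder sentence one\t1\n"
    "1\t12\t2\tVERB_B\tFRAME_B\tpresent\tNA\tTrue\tplaceholder sentence two\t1\n"
    "2\t840\t50\tVERB_C\tFRAME_C\tpast_progressive\t2\tFalse\tplaceholder sentence three\t2\n"
)

INT_COLUMNS = ("participant", "list", "presentationorder", "version")
TENSES = {"present", "past", "past_progressive"}


def parse_row(row):
    """Convert one row of strings into typed values."""
    parsed = dict(row)
    for col in INT_COLUMNS:
        parsed[col] = int(row[col])
    # Assumption: a missing response is encoded as the literal string "NA".
    parsed["response"] = None if row["response"] == "NA" else int(row["response"])
    parsed["nativeenglish"] = row["nativeenglish"] == "True"
    if parsed["tense"] not in TENSES:
        raise ValueError(f"unexpected tense: {parsed['tense']!r}")
    return parsed


def load_judgments(handle):
    """Read tab-separated judgments from an open text handle."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [parse_row(row) for row in reader]


judgments = load_judgments(io.StringIO(SAMPLE_TSV))
```

To read an actual file, an open file handle can be passed to `load_judgments` in place of the `io.StringIO` object.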

Notes

A JavaScript error produced 10 NA values for `response`, no two of which affect the same verb-frame pair. All such values are inherited from version 1.x.
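
One way to handle these missing values before analysis is to set them aside and confirm that no verb-frame pair loses more than one judgment. The sketch below assumes the missing responses appear as the string `NA` once the data are read in as text; the records are invented placeholders, and `split_missing` is a hypothetical helper name.

```python
from collections import Counter

# Invented records for illustration; only the columns needed here are shown.
records = [
    {"verb": "VERB_A", "frame": "FRAME_A", "response": "6"},
    {"verb": "VERB_A", "frame": "FRAME_A", "response": "NA"},
    {"verb": "VERB_B", "frame": "FRAME_A", "response": "NA"},
    {"verb": "VERB_B", "frame": "FRAME_B", "response": "3"},
]


def split_missing(rows, na_token="NA"):
    """Separate rows with a usable response from rows with a missing one."""
    kept = [r for r in rows if r["response"] != na_token]
    missing = [r for r in rows if r["response"] == na_token]
    return kept, missing


kept, missing = split_missing(records)

# Count missing responses per verb-frame pair; each count should be at most 1
# if no two missing values fall on the same pair.
na_per_pair = Counter((r["verb"], r["frame"]) for r in missing)
max_na_per_pair = max(na_per_pair.values(), default=0)
```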