The MegaAcceptability dataset
Authors: Aaron Steven White and Kyle Rawlins
Contact: aaron.white@rochester.edu, kgr@jhu.edu
Version: 1.2
Release date: 7 Apr 2020
Overview
This dataset consists of ordinal acceptability judgments for 1,000 clause-embedding verbs of English in 50 surface-syntactic frames. The data were collected on Amazon’s Mechanical Turk using Turktools.
For a detailed description of the dataset, the item construction and collection methods, and discussion of how to use a dataset on this scale to address questions in linguistic theory, please see the following papers:
White, A. S. & K. Rawlins. 2020. Frequency, acceptability, and selection: A case study of clause-embedding. Accepted to Glossa.
White, A. S. & K. Rawlins. 2016. A computational model of S-selection. In M. Moroney, C-R. Little, J. Collard & D. Burgdorf (eds.), Semantics and Linguistic Theory 26, 641-663. Ithaca, NY: CLC Publications.
If you make use of this dataset in a presentation or publication, we ask that you please cite these papers.
Version history
- 1.0: first public release, 30 Oct 2016
- 1.1: formatting update, 14 Aug 2019
- 1.2: adds normalized data, 7 Apr 2020
Description
mega-attitude-v1.tsv contains the raw data.
Column | Description | Values |
---|---|---|
participant | anonymous integer identifier for participant that provided the response | 0…728 |
list | integer identifier for list participant was responding to | 0…999 |
presentationorder | relative position of item in list | 1…50 |
verb | lemma of clause-embedding verb found in the item | see paper |
frame | clausal complement found in the item | see paper |
response | ordinal scale acceptability response | 1…7 |
nativeenglish | whether the participant reported speaking American English natively | True, False |
sentence | sentence that was judged | see paper |
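
As a rough illustration of working with the columns above, the following sketch reads the raw file with Python's standard `csv` module and converts each field to a typed value. It rests on several assumptions that this README does not state: that the file is tab-separated with a header row using exactly the column names in the table, that missing responses are spelled `NA`, and that `nativeenglish` is spelled `True` or `False`. The function name `read_raw` is hypothetical.

```python
import csv


def read_raw(handle):
    """Parse rows of the raw file into dicts with typed values.

    Assumes a tab-separated file whose header row matches the column
    names documented above (an assumption, not confirmed by the README).
    """
    rows = []
    # QUOTE_NONE keeps any quotation marks inside sentences as literal text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for rec in reader:
        raw_response = rec["response"]
        rows.append({
            "participant": int(rec["participant"]),
            "list": int(rec["list"]),
            "presentationorder": int(rec["presentationorder"]),
            "verb": rec["verb"],
            "frame": rec["frame"],
            # Assumption: missing responses are spelled "NA"; int(float(...))
            # also tolerates values written as "7.0".
            "response": None if raw_response == "NA" else int(float(raw_response)),
            # Assumption: the flag is spelled "True" or "False".
            "nativeenglish": rec["nativeenglish"] == "True",
            "sentence": rec["sentence"],
        })
    return rows
```

With the file in the working directory, usage would look like `with open("mega-attitude-v1.tsv", newline="") as f: rows = read_raw(f)`.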
mega-attitude-v1-normalized.tsv contains data normalized using the procedure described in White & Rawlins 2020.
Column | Description | Values |
---|---|---|
verb | lemma of clause-embedding verb found in the item | see paper |
verbform | form of clause-embedding verb found in the item | see paper |
frame | clausal complement found in the item | see paper |
responsenorm | normalized ordinal scale acceptability response | [-3.84, 4.94] |
responsevar | variability of ordinal scale acceptability responses | [-3.64, -0.19] |
sentence | sentence that was judged, white-space tokenized | see paper |
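
One way the normalized columns might be used is to rank the frames of a single verb by `responsenorm`. The sketch below assumes, as with the raw file, a tab-separated file with a header row matching the column names in the table. The helper names `read_normalized` and `rank_frames` are hypothetical, and the code simply sorts by score without assuming anything further about how the scores should be interpreted.

```python
import csv


def read_normalized(handle):
    """Parse rows of the normalized file, converting the two scores to floats.

    Assumes a tab-separated file with a header row (not confirmed here).
    """
    rows = []
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for rec in reader:
        rec["responsenorm"] = float(rec["responsenorm"])
        rec["responsevar"] = float(rec["responsevar"])
        rows.append(rec)
    return rows


def rank_frames(rows, verb):
    """Return (frame, responsenorm) pairs for one verb, highest score first."""
    pairs = [(r["frame"], r["responsenorm"]) for r in rows if r["verb"] == verb]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```

Sorting on the score alone keeps the example independent of the frame labels, which are documented in the paper rather than in this file.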
Notes
- A JavaScript error produced 10 NA values for response, no two of which fall on the same verb-frame pair.
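
Because of these NA values, any aggregate over the raw responses needs to skip missing entries. The following minimal sketch computes a plain mean response per verb-frame pair while ignoring missing values, represented here as `None`. It is only a simple illustration of NA handling, not the normalization procedure of White & Rawlins 2020, and the function name `mean_response_by_pair` is hypothetical.

```python
from collections import defaultdict


def mean_response_by_pair(records):
    """Average responses per (verb, frame), skipping missing values.

    `records` is an iterable of (verb, frame, response) triples in which
    a missing response is represented as None (an assumption of this sketch).
    """
    totals = defaultdict(lambda: [0, 0])  # (verb, frame) -> [sum, count]
    for verb, frame, response in records:
        if response is None:
            continue  # skip NA responses
        cell = totals[(verb, frame)]
        cell[0] += response
        cell[1] += 1
    return {pair: total / count for pair, (total, count) in totals.items()}
```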