The MegaVeridicality Dataset

The formal semantics literature has long been concerned with the complex array of inferences that different open-class lexical items trigger. For example, why does (1a) give rise to the inference (2a), while the structurally identical (1b) triggers the inference (2b)?

(1) a. Jo doesn’t believe that Bo left.
    b. Jo doesn’t know that Bo left.
(2) a. Jo believes that Bo didn’t leave.
    b. Bo left.
    c. Bo didn’t leave.

A major finding of this literature is that lexically triggered inferences are conditioned by surprising aspects of the syntactic context in which a word occurs. For example, while (3a), (3b), and (4a) all trigger the inference (2b), (4b) triggers the inference (2c).

(3) a. Jo remembered that Bo left.
    b. Jo didn’t remember that Bo left.
(4) a. Bo remembered to leave.
    b. Bo didn’t remember to leave.

For a detailed description of the MegaVeridicality datasets, including item construction, collection methods, and a discussion of how to use a dataset of this scale to address questions in linguistic theory, please see the references below.

Data

Sentences	Predicates	Frames	Download	Citation
1088	517	2	v1 (zip)	White & Rawlins 2018
3938	773	9	v2.1 (zip)	White & Rawlins 2018; White et al. 2018
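
As a quick sanity check after downloading, the counts in the table above can in principle be recomputed from the data itself. The following is a minimal sketch, assuming the archive unpacks to a tab-separated file with one row per response. The column names `sentence`, `verb`, and `frame`, the frame labels, and the helper `summarize` are assumptions made for illustration and should be checked against the header of the actual file. The toy rows are built from examples (3) and (4) above and are not rows from the dataset.

```python
import csv
import io


def summarize(tsv_file, sentence_col="sentence", predicate_col="verb", frame_col="frame"):
    """Count distinct sentences, predicates, and frames in a tab-separated file.

    The default column names are assumptions; pass the names that appear
    in the header of the file you actually downloaded.
    """
    sentences, predicates, frames = set(), set(), set()
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        sentences.add(row[sentence_col])
        predicates.add(row[predicate_col])
        frames.add(row[frame_col])
    return {
        "sentences": len(sentences),
        "predicates": len(predicates),
        "frames": len(frames),
    }


# Toy input built from examples (3) and (4); the frame labels are hypothetical.
toy = io.StringIO(
    "sentence\tverb\tframe\n"
    "Jo remembered that Bo left.\tremember\tthat_S\n"
    "Jo didn't remember that Bo left.\tremember\tthat_S\n"
    "Bo remembered to leave.\tremember\tto_VP\n"
    "Bo didn't remember to leave.\tremember\tto_VP\n"
)
toy_summary = summarize(toy)
print(toy_summary)
```

To run this on a real file, open it with `open(path, newline="", encoding="utf-8")` and pass the file object to `summarize`, adjusting the column arguments as needed.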

References

White, Aaron Steven. accepted with revisions. On believing and hoping whether. Semantics & Pragmatics. [pdf, code]

White, Aaron Steven, Rachel Rudinger, Kyle Rawlins, and Benjamin Van Durme. 2018. Lexicosyntactic Inference in Neural Models. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 4717–4724. Brussels, Belgium: Association for Computational Linguistics. [pdf, doi]

White, Aaron Steven, and Kyle Rawlins. 2018. The Role of Veridicality and Factivity in Clause Selection. In Proceedings of the 48th Annual Meeting of the North East Linguistic Society, edited by Sherry Hucklebridge and Max Nelson, 221–234. Amherst, MA: GLSA Publications. [pdf (preprint)]

Researchers

Aaron Steven White, Kyle Rawlins, Benjamin Van Durme, Rachel Rudinger