The Noun Compound Synonym Substitution in Books (NCSSB) datasets contain in-context instances of potentially idiomatic English noun compounds. The instances were obtained by replacing synonyms that occur in public domain books from the Project Gutenberg corpus with the corresponding idioms.
Ethics
The dataset contains no personal data and no data that requires ethical approval
Policy
The data complies with the institution's and funders' policies on access and sharing
Sharing and access restrictions
The uploaded data can be shared openly
Data description
The file formats are open or commonly used
Methodology, headings and units
There is a file including methodology, headings and units, such as a readme.txt