Keeping Properties with the Data: CL-MetaHeaders - An Open Specification

Vidler, John and Wattam, Stephen (2017) Keeping Properties with the Data: CL-MetaHeaders - An Open Specification. In: Proceedings of the Workshop on Challenges in the Management of Large Corpora and Big Data and Natural Language Processing (CMLC-5+ BigNLP), pp. 35-41.

Full text not available from this repository.

Abstract

Corpus researchers, like researchers in many other scientific disciplines, are under continual pressure to demonstrate accountability and reproducibility in their work. This is unsurprisingly difficult when the researcher faces a wide array of methods and tools; simply tracking the operations performed can be problematic, especially when toolchains are configured by their developers but left largely as a black box to the user. Here we present a scheme for encoding this ‘meta data’ inside the corpus files themselves in a structured data format, along with a proof-of-concept tool to record the operations performed on a file.
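
As a rough illustration of the idea in the abstract (metadata travelling inside the corpus file, with each processing step logged), the following is a minimal sketch only. It is not the CL-MetaHeaders specification: the `#META ` marker, the single-line JSON layout, the field names, and the functions `split_header` and `record_operation` are all assumptions made for this example.

```python
import json
from datetime import datetime, timezone

# Hypothetical marker for a one-line JSON header; NOT taken from the
# CL-MetaHeaders specification.
HEADER_PREFIX = "#META "


def split_header(text):
    """Return (metadata dict, body). Metadata is empty if no header line exists."""
    first, _, rest = text.partition("\n")
    if first.startswith(HEADER_PREFIX):
        return json.loads(first[len(HEADER_PREFIX):]), rest
    return {"operations": []}, text


def record_operation(text, tool, description):
    """Return the corpus text with one more operation logged in its header."""
    meta, body = split_header(text)
    meta.setdefault("operations", []).append({
        "tool": tool,                # assumed field name
        "description": description,  # assumed field name
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return HEADER_PREFIX + json.dumps(meta, sort_keys=True) + "\n" + body


# Usage: two processing steps leave a two-entry log and an untouched body.
corpus = "the cat sat on the mat\n"
corpus = record_operation(corpus, "tokeniser", "split on whitespace")
corpus = record_operation(corpus, "lowercaser", "lowercased all tokens")
meta, body = split_header(corpus)
```

Keeping the log in the same file means the record of operations cannot be separated from the data it describes, which is the property the abstract emphasises; the actual header format should be taken from the published specification.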

Item Type:
Contribution in Book/Report/Proceedings
Uncontrolled Keywords:
Research Output Funding/yes_internally_funded
ID Code:
236764
Deposited On:
23 Apr 2026 14:40
Refereed?:
Yes
Published?:
Published
Last Modified:
23 Apr 2026 21:45