The ParlaMint corpora of parliamentary proceedings

Erjavec, Tomaž and Ogrodniczuk, Maciej and Osenova, Petya and Ljubešić, Nikola and Simov, Kiril and Pančur, Andrej and Rudolf, Michał and Kopp, Matyáš and Barkarson, Starkaður and Steingrímsson, Steinþór and Çöltekin, Çağrı and de Does, Jesse and Depuydt, Katrien and Agnoloni, Tommaso and Venturi, Giulia and Pérez, María Calzada and de Macedo, Luciana D. and Navarretta, Costanza and Luxardo, Giancarlo and Coole, Matthew and Rayson, Paul and Morkevičius, Vaidas and Krilavičius, Tomas and Darģis, Roberts and Ring, Orsolya and van Heusden, Ruben and Marx, Maarten and Fišer, Darja (2023) The ParlaMint corpora of parliamentary proceedings. Language Resources and Evaluation, 57 (1). pp. 415-448. ISSN 1574-0218

Text (s10579-021-09574-0)
s10579_021_09574_0.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

This paper presents the ParlaMint corpora, which contain transcriptions of the sessions of 17 European national parliaments and total half a billion words. The corpora are uniformly encoded, contain rich metadata about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project’s GitHub repository. The complete corpora are openly available for download via the CLARIN.SI repository, and for online exploration and analysis through the NoSketch Engine and KonText concordancers and the Parlameter interface.

Item Type:

Journal Article

Journal or Publication Title:

Language Resources and Evaluation

Uncontrolled Keywords:

/dk/atira/pure/subjectarea/asjc/3300/3310

Subjects:

parliamentary proceedings; comparable corpora; TEI; linguistics and language; library and information sciences

Departments:

Faculty of Science and Technology > School of Computing & Communications

ID Code:

165473

Deposited By:

ep_importer_pure

Deposited On:

04 Feb 2022 14:15

Refereed?:

Yes

Published?:

Published

Last Modified:

11 Dec 2025 06:33

URI:

https://eprints.lancs.ac.uk/id/eprint/165473