Developing Asian language corpora: standards and practice.

Xiao, R. Z. and McEnery, A. M. and Baker, J. P. and Hardie, Andrew (2004) Developing Asian language corpora: standards and practice. In: The 4th Workshop on Asian Language Resources, 2004-03-25.

Abstract

This paper first discusses standards for developing Asian language corpora so as to facilitate international data exchange. We then present two corpora of Asian languages developed at Lancaster University: the EMILLE Corpus, which contains 14 South Asian languages, and the Lancaster Corpus of Mandarin Chinese. Finally, we demonstrate how to explore these corpora using Xara and other corpus tools.

Item Type: Contribution to Conference (Paper)
Journal or Publication Title: The 4th Workshop on Asian Language Resources
Departments: Faculty of Arts & Social Sciences > Linguistics & English Language
ID Code: 64
Deposited By: Dr Richard Xiao
Deposited On: 17 Jun 2005
Refereed?: Yes
Published?: Published
Last Modified: 22 Aug 2019 00:11
URI: https://eprints.lancs.ac.uk/id/eprint/64
