Lancaster EPrints

Developing Asian language corpora: standards and practice.

Xiao, R. Z. and McEnery, A. M. and Baker, J. P. and Hardie, Andrew (2004) Developing Asian language corpora: standards and practice. In: The 4th Workshop on Asian Language Resources, 2004-03-25, Sanya, China.

    Abstract

    This paper first discusses standards for developing Asian language corpora so as to facilitate international data exchange. We then present two corpora of Asian languages developed at Lancaster University: the EMILLE Corpus, which contains 14 South Asian languages, and the Lancaster Corpus of Mandarin Chinese. Finally, we demonstrate how to explore these corpora using Xara and other corpus tools.

    Item Type: Conference or Workshop Item (Paper)
    Journal or Publication Title: The 4th Workshop on Asian Language Resources
    Uncontrolled Keywords: standards ; corpora ; Asian languages
    Subjects: P Language and Literature > P Philology. Linguistics
    Departments: Faculty of Arts & Social Sciences > Linguistics & English Language
    ID Code: 64
    Deposited By: Dr Richard Xiao
    Deposited On: 17 Jun 2005
    Refereed?: Yes
    Published?: Published
    Last Modified: 11 Mar 2013 16:22
    URI: http://eprints.lancs.ac.uk/id/eprint/64
