Towards a Welsh semantic tagger : creating lexicons for a resource poor language

Piao, Scott Songlin and Rayson, Paul Edward and Knight, Dawn and Watkins, Gareth and Donnelly, Kevin (2017) Towards a Welsh semantic tagger : creating lexicons for a resource poor language. In: The Corpus Linguistics Conference 2017, 2017-07-24 - 2017-07-28, University of Birmingham.

Full text not available from this repository.

Abstract

Semantic annotation is an important part of corpus linguistics. A major tool for semantic tagging is USAS, developed at Lancaster University, which was originally designed for English but has since been extended to cover many more languages. In the CorCenCC Project (http://sites.cardiff.ac.uk/corcencc), we are extending USAS to automatically annotate Welsh language data with the USAS semantic tagset. In this paper, we report on the development of Welsh semantic lexicons for the semantic tagger: we have built a Welsh semantic lexicon containing 143,290 entries, which achieved a lexical coverage of 72.42% in an initial evaluation. An initial version of the Welsh semantic tagger has already been developed based on this lexical resource.
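
The abstract does not say how lexical coverage was measured. A common reading is the percentage of tokens in a test text that have an entry in the lexicon, and the sketch below illustrates that reading together with a simple lexicon lookup. It is a minimal illustration under stated assumptions, not the authors' implementation: the names `TOY_LEXICON`, `tag_tokens` and `lexical_coverage` are hypothetical, the four entries are invented for the example rather than taken from the Welsh lexicon, and Z99 is used as the tag for unmatched words.

```python
from typing import Dict, List, Tuple

# Toy lexicon mapping a word form to candidate USAS tags.
# These entries are illustrative assumptions, not data from the Welsh lexicon.
TOY_LEXICON: Dict[str, List[str]] = {
    "mae": ["A3+"],
    "y": ["Z5"],
    "ci": ["L2"],
    "cath": ["L2"],
}

UNMATCHED_TAG = "Z99"  # tag assigned here to words missing from the lexicon


def tag_tokens(tokens: List[str], lexicon: Dict[str, List[str]]) -> List[Tuple[str, List[str]]]:
    """Pair each token with its candidate tags, or the unmatched tag."""
    return [(tok, lexicon.get(tok.lower(), [UNMATCHED_TAG])) for tok in tokens]


def lexical_coverage(tokens: List[str], lexicon: Dict[str, List[str]]) -> float:
    """Percentage of tokens that have an entry in the lexicon."""
    if not tokens:
        return 0.0
    matched = sum(1 for tok in tokens if tok.lower() in lexicon)
    return 100.0 * matched / len(tokens)


if __name__ == "__main__":
    sample = ["Mae", "y", "ci", "yn", "cysgu"]
    for token, tags in tag_tokens(sample, TOY_LEXICON):
        print(token, tags)
    print(f"coverage: {lexical_coverage(sample, TOY_LEXICON):.2f}%")
```

Lookup here is by lowercased surface form only. A real tagger would typically also use lemmas, part-of-speech information and multiword expressions, none of which this sketch attempts.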

Item Type:
Contribution to Conference (Paper)
Journal or Publication Title:
The Corpus Linguistics Conference 2017
Uncontrolled Keywords:
/dk/atira/pure/subjectarea/asjc/1200/1203
Subjects:
semantic tagger; Welsh semantic lexicon; corpus linguistics; natural language processing; semantic annotation; language and linguistics; artificial intelligence
ID Code:
84728
Deposited By:
Deposited On:
23 Mar 2017 13:30
Refereed?:
Yes
Published?:
Published
Last Modified:
12 Sep 2024 14:40