Towards a Welsh semantic tagger: creating lexicons for a resource poor language

Piao, Scott Songlin and Rayson, Paul Edward and Knight, Dawn and Watkins, Gareth and Donnelly, Kevin (2017) Towards a Welsh semantic tagger: creating lexicons for a resource poor language. In: The Corpus Linguistics Conference 2017, 2017-07-24 to 2017-07-28, University of Birmingham.

Full text not available from this repository.

Abstract

Semantic annotation is an important part of corpus linguistics. A major tool for semantic tagging is USAS, developed at Lancaster University, which was originally designed for English but has since been extended to cover many more languages. In the CorCenCC Project (http://sites.cardiff.ac.uk/corcencc), we are extending USAS to automatically annotate Welsh language data with the USAS semantic tagset. In this paper, we report on the development of Welsh semantic lexicons for the semantic tagger: we have built a Welsh semantic lexicon containing 143,290 entries, which achieved a lexical coverage of 72.42% in an initial evaluation. An initial version of the Welsh semantic tagger has already been developed based on this lexical resource.
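
The abstract does not say how lexical coverage was measured. As a rough illustration only, the sketch below assumes coverage is the percentage of corpus tokens that have an entry in the lexicon, matched case-insensitively. The function name, the toy lexicon, and the tags attached to each word are hypothetical and are not taken from the project's resources.

```python
from typing import Dict, Iterable, List


def lexical_coverage(tokens: Iterable[str], lexicon: Dict[str, List[str]]) -> float:
    """Return the percentage of tokens that have an entry in the lexicon.

    Assumption: a token counts as covered if its lowercased form is a key
    in the lexicon. The evaluation in the paper may be defined differently.
    """
    total = 0
    covered = 0
    for token in tokens:
        total += 1
        if token.lower() in lexicon:
            covered += 1
    if total == 0:
        return 0.0  # avoid division by zero on an empty corpus
    return 100.0 * covered / total


# Hypothetical toy lexicon: word form -> list of candidate semantic tags.
# The tag strings are placeholders for illustration only.
toy_lexicon = {
    "bara": ["F1"],
    "dŵr": ["O1.2"],
    "ysgol": ["P1"],
}

sample_tokens = ["Bara", "a", "dŵr", "yn", "yr", "ysgol"]
coverage = lexical_coverage(sample_tokens, toy_lexicon)
print(f"{coverage:.2f}%")  # 3 of 6 tokens are in the toy lexicon: 50.00%
```

Under this definition, function words missing from the lexicon (here "a", "yn", "yr") lower the figure just as content words do, so the reported percentage depends heavily on which word classes the lexicon includes.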

Item Type:
Contribution to Conference (Paper)
Journal or Publication Title:
The Corpus Linguistics Conference 2017
Uncontrolled Keywords:
/dk/atira/pure/subjectarea/asjc/1700/1702
ID Code:
84728
Deposited On:
23 Mar 2017 13:30
Refereed?:
Yes
Published?:
Published
Last Modified:
13 Nov 2020 08:34