Piao, Scott Songlin and Rayson, Paul Edward and Knight, Dawn and Watkins, Gareth and Donnelly, Kevin (2017) Towards a Welsh semantic tagger: creating lexicons for a resource poor language. In: The Corpus Linguistics Conference 2017, 2017-07-24 - 2017-07-28, University of Birmingham.
Abstract
Semantic annotation is an important part of corpus linguistics. A major tool for semantic tagging is USAS, developed at Lancaster University, which was originally designed for English but has since been extended to cover many more languages. In the CorCenCC Project (http://sites.cardiff.ac.uk/corcencc), we are extending USAS to automatically annotate Welsh language data with the USAS semantic tagset. In this paper, we report on the development of Welsh semantic lexicons for the semantic tagger. So far we have built a Welsh semantic lexicon containing 143,290 entries, which achieved a lexical coverage of 72.42% in an initial evaluation. An initial version of the Welsh semantic tagger has already been developed based on this lexical resource.
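As a rough illustration of what a lexical coverage figure measures, the sketch below computes the percentage of tokens in a text that have an entry in a semantic lexicon. The abstract does not describe how the evaluation was carried out, so the token-level definition, the function name `lexical_coverage`, the case-insensitive lookup, and the three-entry toy lexicon with its tags are all assumptions made for illustration, not the project's actual method or data.

```python
def lexical_coverage(tokens, lexicon):
    """Return the percentage of tokens that have an entry in the lexicon.

    Assumption: coverage is counted per token and lookup is case-insensitive.
    The evaluation reported in the paper may have been defined differently.
    """
    if not tokens:
        return 0.0
    covered = sum(1 for token in tokens if token.lower() in lexicon)
    return 100.0 * covered / len(tokens)


# Hypothetical toy lexicon mapping Welsh word forms to USAS-style tags.
# The entries and tags are illustrative, not taken from the real lexicon.
toy_lexicon = {
    "ci": ["L2"],    # "dog"
    "tŷ": ["H1"],    # "house"
    "mynd": ["M1"],  # "to go"
}

# A hand-tokenised example; a real pipeline would use a Welsh tokeniser.
tokens = ["Ci", "yn", "mynd", "i'r", "tŷ"]

coverage = lexical_coverage(tokens, toy_lexicon)
print(f"Lexical coverage: {coverage:.2f}%")  # 3 of 5 tokens found: 60.00%
```

Under this definition, a coverage of 72.42% would mean that roughly 72 of every 100 tokens in the evaluation text were found in the lexicon; the remainder would need to be handled by other means before they could receive a semantic tag.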