Rayson, Paul Edward and Mariani, John Amedeo and Anderson-Cooper, Bryce and Baron, Alistair and Gullick, David Stephen and Moore, Andrew and Wattam, Steve (2017) Towards Interactive Multidimensional Visualisations for Corpus Linguistics. Journal for Language Technology and Computational Linguistics, 31 (1). pp. 27-49. ISSN 2190-6858
RaysonEtAl_jlcl_2017.pdf - Accepted Version
Available under License Creative Commons Attribution.
Abstract
We propose the novel application of dynamic and interactive visualisation techniques to support the iterative and exploratory investigations typical of the corpus linguistics methodology. Very large-scale text analysis is already carried out in corpus-based language analysis by employing methods such as frequency profiling, keywords, concordancing, collocations and n-grams. However, at present only basic visualisation methods are utilised. In this paper, we describe case studies of multiple types of key word clouds and explorer tools for collocation networks, and we compare network and language distance visualisations for online social networks. These are shown to fit better with the iterative data-driven corpus methodology, and permit some level of scalability to cope with ever-increasing corpus size and complexity. In addition, they will allow corpus linguistic methods to be used more widely in the digital humanities and social sciences, since the learning curve with visualisations is shallower for non-experts.