Digital Resources: Artificial Intelligence, Computational Approaches, and Geographical Text Analysis to Investigate Early Colonial Mexico

Murrieta-Flores, Patricia and Jiménez Badillo, Diego and Martins, Bruno (2022) Digital Resources: Artificial Intelligence, Computational Approaches, and Geographical Text Analysis to Investigate Early Colonial Mexico. In: Oxford Research Encyclopedia of Latin American History. Oxford University Press.

Murrieta_et_al_2022_preprint.pdf - Accepted Version
Available under License Creative Commons Attribution.

Abstract

The application of digital technologies within interdisciplinary environments is enabling the development of more efficient methods and techniques for analyzing historical corpora at scales that were not feasible before. The project “Digging into Early Colonial Mexico” is an example of cooperation among archaeologists, historians, computer scientists, and geographers engaged in designing and implementing methods for text mining and large-scale analysis of primary and secondary historical sources. Specifically, these methods automate the identification of key analytical concepts linked to locational references, revealing the spatial and geographic context of the historical narrative. As a case study, the project focuses on the Relaciones Geográficas de la Nueva España (Geographic Reports of New Spain, or RGs). This is a corpus of textual and pictographic documents produced in 1577–1585 CE that provides one of the most complete and extensive accounts of Mexico and Guatemala’s history and the social situation at the time. The research team is developing digital tools and datasets, including (a) a comprehensive historical gazetteer containing thousands of georeferenced toponyms integrated within a geographical information system (GIS); (b) two digital versions of the RGs corpus, one fully annotated and ready for information extraction, and another suitable for further experimentation with machine learning (ML), natural language processing (NLP), and corpus linguistics (CL) methods; and (c) software tools that support a research method called geographical text analysis (GTA). GTA applies natural language processing based on deep learning algorithms for named entity recognition, disambiguation, and classification. This enables texts to be parsed and words referring to place names to be marked up automatically; these place names are later associated with analytical concepts through a technique called geographic collocation analysis.
By leveraging the GTA methodology and resources, the research team is investigating questions related to the landscape and territorial transformations experienced during the colonization of Mexico, and is seeking to discover social, economic, political, and religious patterns in the way of life of Indigenous and Spanish communities of New Spain toward the last quarter of the 16th century. All datasets and research products will be released under an open-access license for the free use of scholars engaged in Latin American studies or interested in computational approaches to history.
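
The abstract describes geographic collocation analysis only in outline: place names are marked up and then associated with analytical concepts. As a rough illustration of that general idea, and not the project's actual implementation, the Python sketch below counts concept terms that occur within a fixed token window of a place name. It assumes a plain set lookup in place of the deep learning named entity recognition the abstract mentions; the function names, the window size, the term lists, and the sample sentences are all hypothetical and are not drawn from the RGs corpus.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens; \w also matches accented letters in Python 3.
    return re.findall(r"\w+", text.lower())


def geographic_collocations(text, toponyms, concepts, window=5):
    """Count (toponym, concept) pairs where the concept term occurs
    within `window` tokens on either side of the toponym.

    Toponyms are found by simple set lookup here (an assumption made
    for brevity), standing in for a trained entity recognizer.
    """
    tokens = tokenize(text)
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok not in toponyms:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] in concepts:
                counts[(tok, tokens[j])] += 1
    return counts


# Invented sample text, for illustration only.
sample = (
    "The town of Cholula holds a market every five days for all the "
    "people of the province. Near Tlaxcala the river floods the maize fields."
)
toponyms = {"cholula", "tlaxcala"}          # hypothetical mini-gazetteer
concepts = {"market", "river", "maize"}     # hypothetical concept terms

result = geographic_collocations(sample, toponyms, concepts, window=5)
for (place, concept), n in sorted(result.items()):
    print(place, concept, n)
```

A fixed token window is only one possible definition of proximity; sentence or paragraph boundaries could serve instead, and the abstract does not say which the project uses.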

Item Type:
Contribution in Book/Report/Proceedings
Uncontrolled Keywords:
Research Output Funding/yes_externally_funded
ID Code:
213914
Deposited On:
20 Feb 2024 15:55
Refereed?:
No
Published?:
Published
Last Modified:
15 Aug 2024 23:18