Murrieta-Flores, Patricia and Vega-Sánchez, Rodrigo and Sánchez-Diaz, Alexander and Cruz-Ríos, Héctor Francisco (2025) Unlocking colonial records with Artificial Intelligence. Achieving the automated transcription of large-scale 16th and 17th-century Latin American historical collections. STAR: Science & Technology of Archaeological Research, 11 (1): e2484828. ISSN 2054-8923
Full text not available from this repository.
Abstract
Between the 16th and 18th centuries, Spanish and Indigenous people produced millions of colonial documents in diverse calligraphic styles in Latin American territories. Accessing their rich information requires specialised palaeographic skills and often Indigenous-language expertise. Recent advancements in Machine Learning offer promising solutions to these challenges. This study developed two computational methods: (1) a historical document classifier using Convolutional Neural Networks (CNNs) and Support Vector Machines (SVMs), and (2) Handwritten Text Recognition (HTR) models trained on 16th- and 17th-century Spanish manuscripts via Transkribus. The classifier achieved high accuracy, with F1 scores over 90% for most calligraphic styles, while the HTR models produced competitive Character Error Rates, notably 5.25% for Redonda, 8.92% for Itálica Cursiva, and 14.15% for Procesal Simple. These automated tools now allow us to transform previously “unreadable” archives into accessible data. This means that soon, Latin American archives and libraries will be able to unlock centuries of historical information.
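For readers unfamiliar with the metric, the Character Error Rate figures quoted above can be read against the common definition of CER: the character-level edit distance between a reference transcription and the model output, divided by the length of the reference. The following is a minimal sketch of that definition only. The record does not state how the authors or Transkribus computed the reported values, and the function names and the sample strings are hypothetical illustrations, not data from the study.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and
    substitutions needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # drop a reference character
                curr[j - 1] + 1,     # drop a hypothesis character
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER as edit distance divided by reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: one substituted character in a nine-letter word.
    reference = "escribano"
    hypothesis = "escrivano"
    print(f"CER = {character_error_rate(reference, hypothesis):.2%}")
```

Under this definition, a CER of 5.25% corresponds to roughly five character edits per hundred characters of reference text.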