Séminaires du CENTAL: Quantitative approaches to historical texts: some (non-)issues and how to tackle them

Name: Séminaires du CENTAL: Quantitative approaches to historical texts: some (non-)issues and how to tackle them
Start: 2023-03-08T13:00:00+02:00
End: 2023-03-08T15:00:00+02:00
Location: Louvain-la-Neuve, Belgium

Simon Hengchen

Abstract

Quantitative methods for historical text analysis offer exciting opportunities for researchers interested in gaining new insights into long-studied texts. However, the methodological underpinnings of these methods remain under-explored. In the first part of the talk, I will use a case study to show and discuss the (non-)effect that the OCR process has on a range of quantitative text analyses. In the second part, I will present a novel, fully unsupervised OCR post-correction method applied to the same dataset, as well as its most recent evolution for a highly inflected language, Finnish.

Date

Mar 8, 2023 1:00 PM — 3:00 PM

Data Science, Digital Humanities