Using eScriptorium to work on documents: eScriptorium applied on a 13th century French cartulary

DOI:

https://doi.org/10.24206/lh.v10i1.63294

Keywords:

Diplomatics, Digital Humanities, Cartulary, eScriptorium

Abstract

As part of research on Digital Diplomatics and on new tools that can be used in the study of Diplomatics, this essay discusses the experience of processing a 13th-century French cartulary with eScriptorium. After examining how this open-source software works, which was created mainly for the automatic transcription of any type of document, in any language and on any medium, we follow step by step its application to the cartulary of the church of Saint-Etienne-des-Grès. The starting hypothesis proposed using the software, on the one hand, to analyze both the internal and external features of the document and, on the other, to experiment with several HTR models on a document typology that by nature can include a great variety of scripts, as it does in this case. The work proved satisfactory for recognizing the external features of the document, and a good result was achieved in automatic transcription. The analysis of the internal features, however, cannot currently be carried out directly in the software. At this stage, the work done in eScriptorium can nevertheless be exported and then transformed into TEI; the eScriptorium team is working on integrating TEI principles into the tool, which promises good results in the near future.

Published

2024-03-15