Using eScriptorium to work on documents: eScriptorium applied on a 13th century French cartulary
DOI:
https://doi.org/10.24206/lh.v10i1.63294
Keywords:
Diplomatics, Digital Humanities, Cartulary, eScriptorium
Abstract
As part of research on Digital Diplomatics and on new tools usable in the study of Diplomatics, this essay discusses the experience of processing a 13th-century French cartulary with eScriptorium. This open-source software was created mainly for the automatic transcription of any type of document, in any language and on any medium. By analyzing how it works, we follow step by step its application to the cartulary of the church of Saint-Etienne-des-Grès. The starting hypothesis proposed using the software, on the one hand, to analyze both internal and external features and, on the other, to experiment with several HTR models on a document typology which by nature can include a great variety of different scripts, as in this case. The work proved satisfactory for recognizing the external features of the document, and a good result was achieved in automatic transcription. The analysis of the internal features, however, cannot currently be carried out directly in the software. At this stage, the work done in eScriptorium can be exported and then transformed into TEI; the eScriptorium team is working on integrating TEI principles into the tool, promising good results in the near future.
License
Copyright (c) 2024 Michela Galli
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Authors who publish with this journal agree to the following:
a. The authors hold copyright of the published papers; authors are solely responsible for the content of their published papers; the published paper is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, which allows the sharing of the publication as long as authorship and publication by Revista LaborHistórico are acknowledged.
b. Authors should seek prior permission from the journal in order to publish their articles as book chapters. Such publications should acknowledge first publication by LaborHistórico.
c. Authors may publish and distribute their papers (for example, at institutional repositories, author's sites) at any time during or after the editorial process by Revista LaborHistórico.