Using eScriptorium to Work on Documents: eScriptorium Applied to a Thirteenth-Century French Cartulary

DOI:

https://doi.org/10.24206/lh.v10i1.63294

Keywords:

Diplomatics, Digital Humanities, Cartulary, eScriptorium

Abstract

As part of research on Digital Diplomatics, or on new technologies applicable to the study of Diplomatics, this study reports on the experience of processing a thirteenth-century French cartulary with eScriptorium. After examining how this open-source software works, which was created mainly for the automatic transcription of any type of document, in any language and on any medium, we walk step by step through its application to the cartulary of the church of Saint-Etienne-des-Grès. The starting hypothesis envisaged, on the one hand, using the software to analyse the document's internal and external features and, on the other, testing several HTR models on a document type that by its nature can contain a wide variety of scripts, as is the case here. The work proved satisfactory for recognising the document's external features, and automatic transcription gave good results. Analysis of the internal features, by contrast, cannot currently be carried out directly in the software. At this stage, however, work done in eScriptorium can be exported and then transformed into TEI; the eScriptorium team is working on integrating the fundamentals of TEI into the platform, which promises good results in the near future.

Published

2024-03-15