efontes-upos / README.md
license: cc-by-nc-sa-4.0
language:
  - la
metrics:
  - f1
pipeline_tag: token-classification
widget:
  - text: >-
      Testes Prandote de Sproua contra Sbisconem : Sbisco de Olesnicza, quod
      scit et testatur Czescz, Sbisco de Marczinczouicz in testimonium,
      Stanislaus Lantka in testimonium, Wirzchoslaus in testimonium, Benik de
      Thopola, Thomas de Tzsczenecz
    example_title: Court Proceedings
  - text: >-
      Iohannis de Iacussouicze heredes sana ac libera voluntate post araturas
      suas decimas, videlicet Nicolaus in Micolayow, Scropezin et in Zagayow,
      Stanislaus et Bogussius in Semlicze, Osmolicze, Gawoltow, Bogumilouicze,
      Sbiluth et Petrus in Sbiluthouicze et Mnogolicze, Msczugius in
      Jacuszouicze, nunc vero Bliszne, Srzednye, Dalne Jacussouicze wlgariter
      nuncupantur, ad honorem et laudem Omnipotentis Dei et ob reverenciam B. et
      Gloriose Dei Genitricis Virginis Marie et omnium sanctorum dicte ecclesie
      in Parwa Cazimirza dictas decimas dederunt, contulerunt et perpetuis
      temporibus assignaverunt.
    example_title: Document (Poland)
  - text: >-
      Eugenius episcopus servus servorum dei venerabili fratri Warnero
      Ulotizlauensi episcopo eiusque successoribus canonice substituendis in
      perpetuum.
    example_title: Document (papal)
  - text: >-
      Hec sunt nomina pontificum Cracoviensium. Prohorius. Proculphus. Poppo.
      Gompo. Rachelinus. Aaron archiepiscopus quintus. Sula cognominatus
      Lambertus. Beatus Stanyzlaus martir.
    example_title: Catalogue of bishops
  - text: >-
      Hoc eventu Bolezlauus cum eodem exercitu de Pomoranis se vindicare
      disposuit, iamque cepta via Bohemos in Poloniam exire fama precurrens
      innotuit.
    example_title: Historiography
tags:
  - Medieval Latin
  - Latin
  - Lemmatization
library_name: transformers

An XLM-RoBERTa-large model fine-tuned on the UD and eFontes corpora to improve part-of-speech (PoS) tagging of Polish Medieval Latin texts across different genres.
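The following is a minimal usage sketch, not an official example. It assumes the model loads through the standard `transformers` token-classification pipeline, as the `pipeline_tag` and `library_name` metadata suggest. The repository namespace is not stated in this README, so `MODEL_ID` is a placeholder, and the helper names `load_tagger` and `merge_subwords` are hypothetical. Sub-word pieces are merged by hand on the SentencePiece word-start marker that XLM-RoBERTa tokenizers use, because the pipeline's built-in aggregation strategies also merge adjacent words that share a tag.

```python
# Placeholder id: the namespace is not given in this README (assumption).
MODEL_ID = "<namespace>/efontes-upos"

# SentencePiece marks the first piece of each whitespace-separated word with this character.
WORD_START = "\u2581"


def load_tagger(model_id=MODEL_ID):
    """Build a token-classification pipeline that returns one prediction per sub-word."""
    # Imported lazily so merge_subwords() can be used without transformers installed.
    from transformers import pipeline

    return pipeline("token-classification", model=model_id, aggregation_strategy="none")


def merge_subwords(predictions):
    """Join sub-word pieces into words, keeping the tag of each word's first piece.

    Punctuation written without a preceding space stays attached to its word.
    """
    words = []
    for pred in predictions:
        piece = pred["word"]
        if piece.startswith(WORD_START) or not words:
            # A new word begins: store its text (marker removed) and its tag.
            words.append([piece.lstrip(WORD_START), pred["entity"]])
        else:
            # Continuation piece: extend the current word, keep the first tag.
            words[-1][0] += piece
    return [(text, tag) for text, tag in words]


if __name__ == "__main__":
    tagger = load_tagger()
    sample = "Hec sunt nomina pontificum Cracoviensium."
    for word, tag in merge_subwords(tagger(sample)):
        print(f"{word}\t{tag}")
```

Taking the tag of the first piece is one common convention for word-level labels. Whether it matches how the model was trained is not stated here, so treat the merged output as an approximation.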

Details can be found in the paper:

Nowak, Krzysztof, Jędrzej Ziębura, Krzysztof Wróbel and Aleksander Smywiński-Pohl. "eFontes. Part of Speech Tagging and Lemmatization of Medieval Latin Texts. A Cross-Genre Survey." (2024). https://doi.org/10.48550/arXiv.2407.00418