---
license: apache-2.0
language:
  - multilingual
library_name: gliner
datasets:
  - medieval-data/medieval-latin-ner-HOME-Alcar-sents
pipeline_tag: token-classification
---

# About

This is a GLiNER model fine-tuned on medieval Latin to improve the identification of two entity types, PERSON and LOC. It was fine-tuned from [urchade/gliner_multi-v2.1](https://huggingface.co/urchade/gliner_multi-v2.1) on 1,500 annotations from the [HOME-Alcar sentences](https://huggingface.co/datasets/medieval-data/medieval-latin-ner-HOME-Alcar-sents) dataset. Only 1,500 annotations were selected, to prevent catastrophic forgetting.

GLiNER is a Named Entity Recognition (NER) model capable of identifying any entity type using a bidirectional transformer encoder (BERT-like). It provides a practical alternative both to traditional NER models, which are limited to predefined entity types, and to Large Language Models (LLMs), which, despite their flexibility, are too large and costly for resource-constrained scenarios.

## Installation
To use this model, install the GLiNER Python library. In a notebook cell, prefix the command with `!`:
```
pip install gliner
```

## Usage
Once the GLiNER library is installed, import the `GLiNER` class, load this model with `GLiNER.from_pretrained`, and predict entities by passing the text and a list of labels to `predict_entities`.

```python
from gliner import GLiNER

# Load the fine-tuned model (fetched from the Hugging Face Hub on first use)
model = GLiNER.from_pretrained("medieval-data/gliner_multi-v2.1-medieval-latin")

# Sample medieval Latin witness list
text = """
Testes : magister Stephanus cantor Autissiodorensis , Petrus capellanus comitis , Gaufridus clericus , Hugo de Argenteolo , Milo Filluns , Johannes Maleherbe , Nivardus de Argenteolo , Columbus tunc prepositus Tornodorensis , Johannes prepositus Autissiodorensis , Johannes Brisebarra .
"""

# Entity types this model was fine-tuned to recognize
labels = ["PERSON", "LOC"]

entities = model.predict_entities(text, labels)

# Each entity is a dict; print the matched span and its predicted label
for entity in entities:
    print(entity["text"], "=>", entity["label"])
```

Output:

```
Stephanus => PERSON
Autissiodorensis => LOC
Petrus => PERSON
Gaufridus => PERSON
Hugo de Argenteolo => PERSON
Milo Filluns => PERSON
Johannes Maleherbe => PERSON
Nivardus de Argenteolo => PERSON
Columbus => PERSON
Tornodorensis => LOC
Johannes => PERSON
Autissiodorensis => LOC
Johannes Brisebarra => PERSON
```
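
The list returned by `predict_entities` can be post-processed with plain Python. The following is a minimal sketch, assuming only that each entity is a dict with `text` and `label` keys, as used in the loop above. The helper name `group_by_label` is hypothetical (it is not part of the GLiNER library), and the sample list is hand-copied from the output above rather than produced by a live model call.

```python
from collections import defaultdict


def group_by_label(entities):
    """Group entity texts by label, keeping first-seen order and dropping repeats."""
    grouped = defaultdict(list)
    for entity in entities:
        texts = grouped[entity["label"]]
        if entity["text"] not in texts:
            texts.append(entity["text"])
    return dict(grouped)


# Hand-copied subset of the output shown above (assumption: not a live model call).
sample_entities = [
    {"text": "Stephanus", "label": "PERSON"},
    {"text": "Autissiodorensis", "label": "LOC"},
    {"text": "Petrus", "label": "PERSON"},
    {"text": "Tornodorensis", "label": "LOC"},
    {"text": "Autissiodorensis", "label": "LOC"},
]

grouped = group_by_label(sample_entities)
for label, texts in grouped.items():
    print(label, "=>", ", ".join(texts))
```

In practice, the `entities` list from the usage example could be passed to the helper in place of `sample_entities`, which may be convenient for building a simple index of the people and places named in a charter.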


## Citation to Original GLiNER Model
```bibtex
@misc{zaratiana2023gliner,
      title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer}, 
      author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
      year={2023},
      eprint={2311.08526},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```