---
library_name: transformers
license: apache-2.0
base_model: distilbert/distilbert-base-uncased
tags:
- generated_from_trainer
datasets:
- conll2003
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: base-NER
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: conll2003
      type: conll2003
      config: conll2003
      split: test
      args: conll2003
    metrics:
    - name: Precision
      type: precision
      value: 0.8845085098992705
    - name: Recall
      type: recall
      value: 0.9017351274787535
    - name: F1
      type: f1
      value: 0.8930387515342801
    - name: Accuracy
      type: accuracy
      value: 0.9782491655001615
---

# base-NER: A Named Entity Recognition (NER) Model

`base-NER` is a fine-tuned version of [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) on the CoNLL2003 dataset, built for the task of **Named Entity Recognition (NER)**. The model identifies entities such as persons, organizations, locations, and miscellaneous named entities in English text.

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForTokenClassification.from_pretrained("eddiegulay/base-NER")
tokenizer = AutoTokenizer.from_pretrained("eddiegulay/base-NER")

# Build a token-classification (NER) pipeline and run it on a sample sentence
classifier = pipeline("ner", model=model, tokenizer=tokenizer)
result = classifier("My name is Edgar and I stay in Dar es Salaam")
print(result)
```
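
Each entry in `result` is a dictionary containing the predicted tag (`entity`, e.g. `B-PER`), a confidence `score`, the matched `word`, and its `start`/`end` character offsets. Predictions are made per subword token; see the usage example further below for grouping them into whole entity spans.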

## Model Performance

The model achieved the following results on the CoNLL2003 test set:

- **Precision**: 0.8845
- **Recall**: 0.9017
- **F1-Score**: 0.8930
- **Accuracy**: 0.9782

The validation loss at the end of training was 0.1129.
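
NER metrics like these are conventionally computed at the entity level with seqeval. Below is a minimal sketch, assuming the `evaluate` library and IOB2-tagged predictions; the toy tag sequences stand in for real model output:

```python
import evaluate

# seqeval scores entities rather than tokens: a prediction counts as correct
# only when the full span and its type match the reference.
seqeval = evaluate.load("seqeval")

# Toy IOB2 tag sequences standing in for real predictions/references.
predictions = [["O", "B-PER", "I-PER", "O", "B-LOC"]]
references = [["O", "B-PER", "I-PER", "O", "B-LOC"]]

results = seqeval.compute(predictions=predictions, references=references)
print(results["overall_precision"], results["overall_recall"],
      results["overall_f1"], results["overall_accuracy"])
```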

## Model Description

This model uses the DistilBERT architecture, a distilled version of BERT that is roughly 40% smaller and 60% faster while retaining most of BERT's accuracy. Fine-tuned specifically for NER, it is well suited to entity extraction in various domains such as finance, healthcare, or general text analytics, particularly where inference speed or memory is a constraint.
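
To see exactly which tags the model emits, you can inspect the label mapping stored in its config. A quick check, assuming the standard `id2label` mapping was saved with the checkpoint:

```python
from transformers import AutoConfig

# The config stores the integer-to-tag mapping used by the classification head.
config = AutoConfig.from_pretrained("eddiegulay/base-NER")
print(config.id2label)
# CoNLL2003 fine-tunes typically use the nine IOB2 tags:
# O, B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC
```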

## Intended Uses & Limitations

**Intended Uses**:
- Extracting named entities (persons, organizations, locations, and miscellaneous entities) from English sentences.
- Production applications where lightweight models are preferred due to memory or speed constraints.

**Limitations**:
- The model is limited to English text, as it was trained on the CoNLL2003 dataset.
- Performance may degrade on domain-specific entities not represented in CoNLL2003 (e.g., technical or biomedical terms).
- It may struggle with ambiguous or context-dependent entity classifications.
- Because the base model is uncased, capitalization cues that often signal named entities are unavailable to it.

## Training and Evaluation Data

The model was trained on the **CoNLL2003** dataset, which contains annotations for named entities in English text. It is a widely used benchmark for NER, covering four entity types: **person**, **organization**, **location**, and **miscellaneous**.

### Dataset Configuration
- **Dataset**: CoNLL2003
- **Split**: Test set used for evaluation
- **Entity Types**: Person, Organization, Location, Miscellaneous
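
The dataset is available on the Hugging Face Hub. A minimal loading sketch with the `datasets` library (depending on your `datasets` version, this script-based dataset may require `trust_remote_code=True`):

```python
from datasets import load_dataset

# Splits: train (~14k sentences), validation, and test.
dataset = load_dataset("conll2003")
print(dataset)

# NER labels are stored as integer IDs over the nine IOB2 tags.
print(dataset["train"].features["ner_tags"].feature.names)
# ['O', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG', 'B-LOC', 'I-LOC', 'B-MISC', 'I-MISC']
```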

## Training Procedure

The model was fine-tuned for 2 epochs with the Adam optimizer and a linear learning rate scheduler.

### Training Hyperparameters

The following hyperparameters were used during training:
- **Learning Rate**: 2e-5
- **Batch Size**: 16 (train and eval)
- **Seed**: 42
- **Optimizer**: Adam (betas=(0.9, 0.999), epsilon=1e-8)
- **Scheduler**: Linear
- **Epochs**: 2
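
A minimal `Trainer` sketch matching these hyperparameters, not the exact training script; `tokenized_train` and `tokenized_eval` are hypothetical datasets already tokenized with labels aligned to subword tokens:

```python
from transformers import (
    AutoModelForTokenClassification,
    Trainer,
    TrainingArguments,
)

# Nine output labels for the CoNLL2003 IOB2 tag set.
model = AutoModelForTokenClassification.from_pretrained(
    "distilbert/distilbert-base-uncased", num_labels=9
)

args = TrainingArguments(
    output_dir="base-NER",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=2,
    lr_scheduler_type="linear",
    seed=42,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,  # hypothetical preprocessed datasets
    eval_dataset=tokenized_eval,
)
trainer.train()
```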

### Training Results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| 0.0595        | 1.0   | 878  | 0.1046          | 0.8676    | 0.8909 | 0.8791 | 0.9762   |
| 0.0319        | 2.0   | 1756 | 0.1129          | 0.8845    | 0.9017 | 0.8930 | 0.9782   |

## Usage Example

The quickstart above returns one prediction per subword token. For human-readable entity spans, the pipeline's `aggregation_strategy` option merges subword pieces into whole entities, as in the sketch below:
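
```python
from transformers import pipeline

# "simple" grouping merges consecutive subword tokens into one entity span
classifier = pipeline(
    "ner", model="eddiegulay/base-NER", aggregation_strategy="simple"
)

for entity in classifier("My name is Edgar and I stay in Dar es Salaam"):
    # Grouped results carry `entity_group` instead of per-token `entity` tags
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))
```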

## Framework Versions

- Transformers 4.44.2
- Pytorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1

## Future Improvements

- Fine-tuning the model on more domain-specific datasets for improved generalization.
- Extending recognition to additional entity types, such as products, dates, and technical terms.