klue-roberta-small-ner-identified
This model is a fine-tuned version of klue/roberta-small (listed as the base model in the model tree) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.0082
- Precision: 0.9930
- Recall: 0.9988
- F1: 0.9959
- Accuracy: 0.9988
Model description
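This model provides named entity recognition for the following items, for use in de-identifying personal information.
- Person name [PS]
- Address (both old lot-number addresses and road-name addresses) [AD]
- Card number [CN]
- Bank account number [BN]
- Driver's license number [DN]
- Resident registration number [RN]
- Passport number [PN]
- Phone number [PH]
- Email address [EM]
- Date [DT]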
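The labels above can drive a simple masking step. The following is a minimal sketch, assuming the entities arrive as dictionaries with `entity_group`, `start`, and `end` keys (the shape produced by the aggregated `ner` pipeline shown under Use). The helper name `mask_entities` and the hand-built entity list are hypothetical and are not part of the model or of Transformers.

```python
# Hypothetical helper for illustration; not shipped with the model.
def mask_entities(text, entities, labels=None):
    """Replace each detected span with its bracketed label, e.g. [PS].

    labels: optional set of entity groups to mask; None masks everything.
    """
    masked = text
    # Go right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        label = ent["entity_group"]
        if labels is not None and label not in labels:
            continue
        masked = masked[:ent["start"]] + f"[{label}]" + masked[ent["end"]:]
    return masked


# Hand-built entities standing in for real pipeline output (assumption).
text = "저는 김철수입니다. 전화번호는 010-1234-5678입니다."
name, phone = "김철수", "010-1234-5678"
entities = [
    {"entity_group": "PS", "start": text.index(name), "end": text.index(name) + len(name)},
    {"entity_group": "PH", "start": text.index(phone), "end": text.index(phone) + len(phone)},
]

print(mask_entities(text, entities))
# 저는 [PS]입니다. 전화번호는 [PH]입니다.
```

Passing `labels={"PH"}` would mask only phone numbers and leave the other spans untouched.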
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
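To make the `linear` scheduler setting concrete, here is a rough sketch of how the learning rate would decay over the run. It assumes zero warmup steps (the card does not list a warmup value) and takes the total of 183 optimizer steps from the Training results table. The function `linear_lr` is a hypothetical re-implementation for illustration, not the Transformers scheduler itself.

```python
BASE_LR = 5e-05     # learning_rate from the list above
TOTAL_STEPS = 183   # 3 epochs x 61 steps, per the Training results table
WARMUP_STEPS = 0    # assumption: no warmup value is given in the card


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start and at each epoch boundary.
for step in (0, 61, 122, 183):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the rate falls by one third of its initial value per epoch and reaches zero at the final step.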
Training results
Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
---|---|---|---|---|---|---|---|
No log | 1.0 | 61 | 0.0128 | 0.9871 | 0.9929 | 0.9900 | 0.9979 |
No log | 2.0 | 122 | 0.0098 | 0.9895 | 0.9976 | 0.9935 | 0.9987 |
No log | 3.0 | 183 | 0.0082 | 0.9930 | 0.9988 | 0.9959 | 0.9988 |
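As a quick consistency check, the F1 column can be recomputed from the precision and recall columns as their harmonic mean. This sketch only uses the rounded figures from the table above, so agreement is expected to about four decimal places; the variable names are illustrative.

```python
# (epoch, precision, recall, reported F1) copied from the table above.
rows = [
    (1, 0.9871, 0.9929, 0.9900),
    (2, 0.9895, 0.9976, 0.9935),
    (3, 0.9930, 0.9988, 0.9959),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for epoch, precision, recall, reported in rows:
    computed = f1_score(precision, recall)
    print(f"epoch {epoch}: computed F1 = {computed:.4f}, reported F1 = {reported:.4f}")
```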
Framework versions
- Transformers 4.40.2
- Pytorch 2.3.0+cu118
- Datasets 2.19.1
- Tokenizers 0.19.1
Use
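from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

# Load the tokenizer and the fine-tuned token-classification model from the Hub.
tokenizer = AutoTokenizer.from_pretrained("vitus9988/klue-roberta-small-ner-identified")
model = AutoModelForTokenClassification.from_pretrained("vitus9988/klue-roberta-small-ner-identified")

# aggregation_strategy="simple" merges sub-word tokens into whole entity spans.
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

# Korean sample text: "I am Kim Cheol-su. My home is at Gangnam-daero, Seoul, my phone number is
# 010-1234-5678, and my resident registration number is 123456-1234567. My email address is
# hugging@face.com. I plan to leave the country on October 25."
example = "저는 김철수입니다. 집은 서울특별시 강남대로이고 전화번호는 010-1234-5678, 주민등록번호는 123456-1234567입니다. 메일주소는 hugging@face.com입니다. 저는 10월 25일에 출국할 예정입니다."

ner_results = nlp(example)
for i in ner_results:
    print(i)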
#{'entity_group': 'PS', 'score': 0.9617835, 'word': '김철수', 'start': 3, 'end': 6}
#{'entity_group': 'AD', 'score': 0.9839702, 'word': '서울특별시 강남대로', 'start': 14, 'end': 24}
#{'entity_group': 'PH', 'score': 0.9906756, 'word': '010 - 1234 - 5678', 'start': 33, 'end': 46}
#{'entity_group': 'RN', 'score': 0.9904553, 'word': '123456 - 1234567', 'start': 56, 'end': 70}
#{'entity_group': 'EM', 'score': 0.99022245, 'word': 'hugging @ face. com', 'start': 81, 'end': 97}
#{'entity_group': 'DT', 'score': 0.985629, 'word': '10월 25일', 'start': 105, 'end': 112}
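Note that the `word` field in the output above is rebuilt from tokens, so it can contain extra spaces (for example `'010 - 1234 - 5678'`). A minimal sketch of recovering the exact substrings from the `start` and `end` offsets follows. It assumes the offsets index into the sample sentence without any leading newline; the helper name `exact_spans` is hypothetical, and the entity list is copied from the output above rather than produced by running the model.

```python
# Hypothetical helper for illustration; not part of Transformers.
def exact_spans(text, entities):
    """Return (label, original substring) pairs using character offsets."""
    return [(ent["entity_group"], text[ent["start"]:ent["end"]]) for ent in entities]


# Sample sentence as a single line (assumption: no leading newline).
text = "저는 김철수입니다. 집은 서울특별시 강남대로이고 전화번호는 010-1234-5678, 주민등록번호는 123456-1234567입니다. 메일주소는 hugging@face.com입니다. 저는 10월 25일에 출국할 예정입니다."

# Offsets copied from the printed results above.
entities = [
    {"entity_group": "PS", "start": 3, "end": 6},
    {"entity_group": "AD", "start": 14, "end": 24},
    {"entity_group": "PH", "start": 33, "end": 46},
    {"entity_group": "RN", "start": 56, "end": 70},
    {"entity_group": "EM", "start": 81, "end": 97},
    {"entity_group": "DT", "start": 105, "end": 112},
]

spans = exact_spans(text, entities)
for label, value in spans:
    print(label, value)
```

Slicing by offset is usually the safer choice when the detected values are later masked or logged, since it preserves the original formatting of numbers and email addresses.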
Model tree for vitus9988/klue-roberta-small-ner-identified
Base model
klue/roberta-small