---
language: zh
tags:
- sentiment-analysis
- pytorch
widget:
- text: "房间非常非常小,内窗,特别不透气,因为夜里走廊灯光是亮的,内窗对着走廊,窗帘又不能完全拉死,怎么都会有一道光射进来。"
- text: "尽快有洗衣房就好了。"
- text: "很好,干净整洁,交通方便。"
- text: "干净整洁很好"
---
# Note
A BERT-based sentiment analysis model, fine-tuned from https://huggingface.co/IDEA-CCNL/Erlangshen-Roberta-330M-Sentiment .
The model was trained on a **Chinese dataset of human-written hotel reviews**.
# Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TextClassificationPipeline
MODEL = "tezign/Erlangshen-Sentiment-FineTune"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, trust_remote_code=True)
classifier = TextClassificationPipeline(model=model, tokenizer=tokenizer)
result = classifier("很好,干净整洁,交通方便。")
print(result)
# Example output:
# [{'label': 'Positive', 'score': 0.989660382270813}]
```
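To score several reviews at once, the same pipeline object can be called with a list of strings; it then returns one `{'label', 'score'}` dict per input. The sketch below wraps that call in two small helpers. The names `classify_batch` and `label_counts` are hypothetical and not part of this repository, and `classifier` is assumed to be the pipeline built in the snippet above.

```python
from collections import Counter


def classify_batch(classifier, texts):
    """Pair each input text with the label and score the classifier returns for it.

    `classifier` is assumed to be a callable that accepts a list of strings and
    returns one dict with 'label' and 'score' keys per input, as the
    TextClassificationPipeline above does.
    """
    texts = list(texts)
    results = classifier(texts)
    return [(text, r["label"], r["score"]) for text, r in zip(texts, results)]


def label_counts(rows):
    """Count how many texts received each label."""
    return Counter(label for _, label, _ in rows)


# Usage (requires the `classifier` from the snippet above):
# rows = classify_batch(classifier, ["很好,干净整洁,交通方便。", "尽快有洗衣房就好了。"])
# for text, label, score in rows:
#     print(f"{label}\t{score:.3f}\t{text}")
# print(label_counts(rows))
```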
# Evaluate
We evaluated **our fine-tuned model** and the **original Erlangshen model** on the **hotel review test set** (5,429 negative reviews and 1,251 positive reviews).
Our model substantially improved precision and recall on positive reviews: the original model scored 0.00 on both, while the fine-tuned model reached 0.92 precision and 0.95 recall.
```text
Our finetune model:
              precision    recall  f1-score   support

    Negative       0.99      0.98      0.98      5429
    Positive       0.92      0.95      0.93      1251

    accuracy                           0.97      6680
   macro avg       0.95      0.96      0.96      6680
weighted avg       0.97      0.97      0.97      6680
======================================================
Original Erlangshen model:
              precision    recall  f1-score   support

    Negative       0.81      1.00      0.90      5429
    Positive       0.00      0.00      0.00      1251

    accuracy                           0.81      6680
   macro avg       0.41      0.50      0.45      6680
weighted avg       0.66      0.81      0.73      6680
```
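The baseline row is worth reading carefully: 0.00 precision and recall on Positive together with 1.00 recall on Negative is the pattern a classifier produces when it assigns Negative to every review. The source does not state that this is what happened, so treat it as an inference. The sketch below only checks the arithmetic: it takes the support counts from the report, assumes a hypothetical always-Negative classifier, and recomputes the per-class, macro, and weighted metrics in plain Python.

```python
# Support counts taken from the evaluation report.
N_NEG, N_POS = 5429, 1251
TOTAL = N_NEG + N_POS


def prf(tp, fp, fn):
    """Return (precision, recall, f1), using 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Assumption: a hypothetical classifier that labels every review as Negative.
neg = prf(tp=N_NEG, fp=N_POS, fn=0)   # every positive review becomes a false "Negative"
pos = prf(tp=0, fp=0, fn=N_POS)       # no review is ever predicted Positive

accuracy = N_NEG / TOTAL
# Macro average: unweighted mean over classes.
macro = tuple((n + p) / 2 for n, p in zip(neg, pos))
# Weighted average: mean over classes weighted by support.
weighted = tuple((n * N_NEG + p * N_POS) / TOTAL for n, p in zip(neg, pos))

rows = [("Negative", neg), ("Positive", pos), ("macro avg", macro), ("weighted avg", weighted)]
for name, row in rows:
    print(f"{name:<12} " + " ".join(f"{v:.2f}" for v in row))
print(f"{'accuracy':<12} {accuracy:.2f}")
```

Rounded to two decimals, these values agree with the "Original Erlangshen model" report, which is why the accuracy of 0.81 there says little on its own: it equals the share of negative reviews in the test set.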