DictaBERT: A State-of-the-Art BERT Suite for Modern Hebrew

State-of-the-art language model for Hebrew.

This is the fine-tuned BERT-base model for the sentiment-analysis task, trained on the HebrewSentiment dataset released by the Israeli National Program for Hebrew and Arabic NLP.

For the bert-base models for other tasks, see here.

Sample usage:

from transformers import pipeline

oracle = pipeline('sentiment-analysis', model='dicta-il/dictabert-sentiment')

# Hebrew for: "I am very happy that this model is released for free use"
sentence = '''אני מאוד שמח שהמודל הזה משוחרר לשימוש חופשי'''
oracle(sentence)

Output:

[
    {
        "label": "Positive",
        "score": 0.9999868869781494
    }
]
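
The pipeline returns a list of dicts, each with a "label" and a "score", as in the output above. As a minimal sketch of how a caller might consume that structure, the helper below picks the highest-scoring prediction and falls back to a placeholder when the score is low. The function name `top_label`, the `min_score` threshold, and the "Uncertain" fallback are illustrative assumptions and not part of the model or the transformers API. The sample data is copied from the output shown above, so the snippet runs without downloading the model.

```python
from typing import Dict, List, Union

# One prediction in the shape shown in the sample output above.
Prediction = Dict[str, Union[str, float]]


def top_label(predictions: List[Prediction], min_score: float = 0.5) -> str:
    """Return the label of the highest-scoring prediction.

    `min_score` and the "Uncertain" fallback are hypothetical choices
    made for this sketch; tune or drop them for a real application.
    """
    best = max(predictions, key=lambda p: p["score"])
    if best["score"] < min_score:
        return "Uncertain"
    return str(best["label"])


# Literal copy of the sample output from this model card.
sample_output = [{"label": "Positive", "score": 0.9999868869781494}]
print(top_label(sample_output))
```

In practice, the list passed to `top_label` would be the value returned by `oracle(sentence)`.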

Citation

If you use DictaBERT in your research, please cite "DictaBERT: A State-of-the-Art BERT Suite for Modern Hebrew".

BibTeX:

@misc{shmidman2023dictabert,
      title={DictaBERT: A State-of-the-Art BERT Suite for Modern Hebrew}, 
      author={Shaltiel Shmidman and Avi Shmidman and Moshe Koppel},
      year={2023},
      eprint={2308.16687},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Model size: 184M params (Safetensors, tensor type F32)