metadata

language: fa
license: apache-2.0

ParsBERT (v3.0)

A Transformer-based Model for Persian Language Understanding

The new version of BERT v3.0 for Persian is available today and can tackle the zero-width non-joiner character for Persian writing. Also, the model was trained on new multi-types corpora with a new set of vocabulary.

Introduction

ParsBERT is a monolingual language model based on Google’s BERT architecture. This model is pre-trained on large Persian corpora with various writing styles from numerous subjects (e.g., scientific, novels, news).

Paper presenting ParsBERT: arXiv:2005.12515

BibTeX entry and citation info

Please cite in publications as the following:

@article{ParsBERT,
    title={ParsBERT: Transformer-based Model for Persian Language Understanding},
    author={Mehrdad Farahani, Mohammad Gharachorloo, Marzieh Farahani, Mohammad Manthouri},
    journal={ArXiv},
    year={2020},
    volume={abs/2005.12515}
}

Questions?

Post a Github issue on the ParsBERT Issues repo.