tamil-Roberta-small / README.md
metadata
language:
  - Tamil
tags:
  - Tamil-Tokenizer
  - Tamil-language-model
license: apache-2.0
datasets:
  - oscar

tokenizer - BPE, vocabulary size 30,522
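
As a rough illustration of what a BPE tokenizer learns, the sketch below trains merge rules on a tiny corpus. It is a character-level toy, not the script used to build this tokenizer; the function names (`most_frequent_pair`, `merge_pair`, `learn_bpe`) are hypothetical. A real run would keep merging until the vocabulary reaches the target size (30,522 here).

```python
from collections import Counter


def most_frequent_pair(words):
    """Return the most frequent adjacent symbol pair, weighted by word count."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get) if pairs else None


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from whitespace-split text."""
    # Each word starts as a tuple of single characters (assumption: no byte-level step).
    words = dict(Counter(tuple(w) for w in corpus.split()))
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:  # every word is already a single symbol
            break
        merges.append(pair)
        words = merge_pair(words, pair)
    return merges
```

Calling `learn_bpe(text, n)` returns the merge rules in the order they were learned; applying them in that order to new text reproduces the tokenization.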

model - RoBERTa

    trained using MLM (masked language modeling)
    on the OSCAR dataset
    train data size: 5000 lines only
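
For readers unfamiliar with MLM, here is a minimal sketch of the masking step. It assumes the commonly used settings (15% of tokens selected; of those, 80% replaced by the mask token, 10% by a random token, 10% left unchanged), which this card does not confirm. `mask_tokens` and its parameters are hypothetical names, and unlike a production data collator the sketch does not skip special tokens.

```python
import random


def mask_tokens(token_ids, mask_id, vocab_size, mask_prob=0.15, seed=None):
    """Return (inputs, labels) for one masked-language-modeling example.

    labels holds the original id at selected positions and -100 elsewhere;
    -100 is the value loss functions conventionally ignore (an assumption here).
    """
    rng = random.Random(seed)
    inputs = list(token_ids)
    labels = [-100] * len(inputs)
    for i, tok in enumerate(token_ids):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = tok
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = mask_id  # replace with the mask token
        elif roll < 0.9:
            inputs[i] = rng.randrange(vocab_size)  # replace with a random token
        # otherwise keep the original token unchanged
    return inputs, labels


# Example call; the mask id 4 is a placeholder, 30522 is the vocabulary size above.
inputs, labels = mask_tokens([11, 57, 902, 3301, 78], mask_id=4, vocab_size=30522, seed=0)
```

The model is then trained to predict the original ids at the positions where `labels` is not -100.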