update format in model card
README.md
CHANGED
@@ -16,8 +16,8 @@ pipeline_tag: token-classification
 
 # Multi-criteria BERT base Chinese with Lattice for Word Segmentation
 
-This is a variant of the pre-trained model [BERT](https://github.com/google-research/bert) model
-The model was pre-trained on texts in the Chinese language and fine-tuned for word segmentation.
+This is a variant of the pre-trained [BERT](https://github.com/google-research/bert) model.
+The model was pre-trained on Chinese text and fine-tuned for word segmentation, based on [bert-base-chinese](https://huggingface.co/bert-base-chinese).
 This version of the model processes input texts at the character level, with word-level information incorporated through a lattice structure.
 
 The scripts for the pre-training are available at [tchayintr/latte-ptm-ws](https://github.com/tchayintr/latte-ptm-ws).
@@ -28,7 +28,7 @@ The model architecture is described in this [paper](https://www.jstage.jst.go.jp
 
 ## Training Data
 
-The model is trained on multiple Chinese word segmented datasets, including
+The model is trained on multiple Chinese word-segmented datasets: ctb6, sighan2005 (as, cityu, msra, pku), sighan2008 (sxu), and cnc.
 The datasets can be accessed from [here](https://github.com/hankcs/multi-criteria-cws/tree/master/data).
 
 ## Licenses
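Since the card is tagged `token-classification`, a usage sketch may help readers of the updated README. The snippet below is a minimal sketch, assuming the checkpoint loads through the stock `transformers` token-classification head and predicts per-character boundary tags in a BMES scheme; the hub ID (`your-org/latte-mc-bert-base-chinese-ws`) and the tag names are placeholders, not taken from the card, and the full lattice model may instead require the custom code in the linked tchayintr/latte-ptm-ws repository.

```python
# Minimal sketch: Chinese word segmentation as token classification.
# Assumptions (not from the model card): the hub ID is a placeholder,
# and the checkpoint emits BMES-style boundary tags, one per character.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

model_id = "your-org/latte-mc-bert-base-chinese-ws"  # placeholder ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

text = "今天天气很好"
# Chinese BERT tokenizers split text into single characters, so each
# character position receives exactly one boundary tag.
inputs = tokenizer(list(text), is_split_into_words=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
pred_ids = logits.argmax(dim=-1)[0].tolist()
tags = [model.config.id2label[i] for i in pred_ids]

# Skip the [CLS] position, then cut a word after every E or S tag.
char_tags = tags[1 : 1 + len(text)]
words, start = [], 0
for i, tag in enumerate(char_tags):
    if tag in ("E", "S"):  # assumed tag names; check model.config.id2label
        words.append(text[start : i + 1])
        start = i + 1
if start < len(text):
    words.append(text[start:])
print(words)
```

The decoding loop is the standard BMES convention (a word ends at an E or S tag); if the actual checkpoint uses a different label set, only the tag test in the loop needs to change.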