update format in model card
README.md
CHANGED
@@ -16,8 +16,8 @@ pipeline_tag: token-classification
 
 # Multi-criteria BERT base Chinese with Lattice for Word Segmentation
 
-This is a variant of the pre-trained model [BERT](https://github.com/google-research/bert) model
-The model was pre-trained on texts in the Chinese language and fine-tuned for word segmentation.
+This is a variant of the pre-trained [BERT](https://github.com/google-research/bert) model.
+The model was pre-trained on Chinese text and fine-tuned for word segmentation, based on [bert-base-chinese](https://huggingface.co/bert-base-chinese).
 This version of the model processes input texts at the character level, with word-level information incorporated through a lattice structure.
 
 The scripts for the pre-training are available at [tchayintr/latte-ptm-ws](https://github.com/tchayintr/latte-ptm-ws).
@@ -28,7 +28,7 @@ The model architecture is described in this [paper](https://www.jstage.jst.go.jp
 
 ## Training Data
 
-The model is trained on multiple Chinese word segmented datasets, including
+The model is trained on multiple Chinese word-segmented datasets: ctb6, sighan2005 (as, cityu, msra, pku), sighan2008 (sxu), and cnc.
 The datasets can be accessed from [here](https://github.com/hankcs/multi-criteria-cws/tree/master/data).
 
 ## Licenses
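Since the card is tagged `token-classification`, a usage sketch may help readers of the updated README. The snippet below is a minimal sketch, assuming the checkpoint loads through the stock `transformers` token-classification head and predicts per-character boundary tags in a BMES scheme; the hub ID (`your-org/latte-mc-bert-base-chinese-ws`) and the tag names are placeholders, not taken from the card, and the full lattice model may instead require the custom code in the linked tchayintr/latte-ptm-ws repository.

```python
# Minimal sketch: Chinese word segmentation as token classification.
# Assumptions (not from the model card): the hub ID is a placeholder,
# and the checkpoint emits BMES-style boundary tags, one per character.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

model_id = "your-org/latte-mc-bert-base-chinese-ws"  # placeholder ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

text = "今天天气很好"
# Chinese BERT tokenizers split text into single characters, so each
# character position receives exactly one boundary tag.
inputs = tokenizer(list(text), is_split_into_words=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
pred_ids = logits.argmax(dim=-1)[0].tolist()
tags = [model.config.id2label[i] for i in pred_ids]

# Skip the [CLS] position, then cut a word after every E or S tag.
char_tags = tags[1 : 1 + len(text)]
words, start = [], 0
for i, tag in enumerate(char_tags):
    if tag in ("E", "S"):  # assumed tag names; check model.config.id2label
        words.append(text[start : i + 1])
        start = i + 1
if start < len(text):
    words.append(text[start:])
print(words)
```

The decoding loop is the standard BMES convention (a word ends at an E or S tag); if the actual checkpoint uses a different label set, only the tag test in the loop needs to change.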