Commit 37efffa by suolyer
1 parent: b5221a2

Update README.md

  1. README.md +126 -0
README.md CHANGED
@@ -1,3 +1,129 @@
  ---
+ language:
+ - zh
  license: apache-2.0
+
+ tags:
+ - bert
+ - NLU
+ - Sentiment
+ - Chinese
+
+ inference: false
+
  ---
+ # Erlangshen-Ubert-330M (Chinese), one model of [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM/tree/dev/yangping/fengshen/examples/ubert)
+ We fine-tune on 70+ Chinese-language datasets, 1,065,069 samples in total. The model is mainly based on [macbert](https://huggingface.co/hfl/chinese-macbert-base).
+
+ Ubert is the solution we proposed for the [2022 AIWIN World Artificial Intelligence Innovation Competition](http://ailab.aiwin.org.cn/competitions/68#results), where it took first place on the A/B leaderboards, 20 percentage points above the officially provided baseline. Ubert handles common extraction tasks such as entity recognition and event extraction, as well as classification tasks such as news classification and natural language inference.
+
+ More details are available in our [GitHub repository](https://github.com/IDEA-CCNL/Fengshenbang-LM/tree/dev/yangping/fengshen/examples/ubert).
+
+ ## Usage
+ Install fengshen from source:
+ ```shell
+ git clone https://github.com/IDEA-CCNL/Fengshenbang-LM.git
+ cd Fengshenbang-LM
+ pip install --editable ./
+ ```
+
+ Run the code:
+ ```python
+ import argparse
+ from fengshen import UbertPiplines
+
+ # Let the pipeline register its own command-line options, then parse them.
+ total_parser = argparse.ArgumentParser("TASK NAME")
+ total_parser = UbertPiplines.piplines_args(total_parser)
+ args = total_parser.parse_args()
+
+ # One entity-recognition sample; the candidate entity types go under "choices".
+ test_data = [
+     {
+         "task_type": "抽取任务",      # extraction task
+         "subtask_type": "实体识别",   # entity recognition
+         "text": "这也让很多业主据此认为,雅清苑是政府公务员挤对了国家的经适房政策。",
+         "choices": [
+             {"entity_type": "小区名字"},  # residential community name
+             {"entity_type": "岗位职责"}   # job responsibilities
+         ],
+         "id": 0
+     }
+ ]
+
+ model = UbertPiplines(args)
+ result = model.predict(test_data)
+ for line in result:
+     print(line)
+ ```
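The example above reads its options from the command line, which can fail in a notebook where `sys.argv` carries unrelated arguments. The sketch below illustrates the same parser-extension pattern with a stand-in function. `add_pipeline_args` and its `--batch_size` and `--max_length` flags are hypothetical and are not the options that fengshen defines; only the `argparse` calls themselves are standard library behavior.

```python
import argparse


def add_pipeline_args(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    """Stand-in for a library hook that registers its own options (hypothetical flags)."""
    group = parser.add_argument_group("pipeline")
    group.add_argument("--batch_size", type=int, default=8)
    group.add_argument("--max_length", type=int, default=128)
    return parser


parser = argparse.ArgumentParser("TASK NAME")
parser = add_pipeline_args(parser)

# An empty list makes argparse ignore sys.argv and fall back to the defaults.
default_args = parser.parse_args([])

# An explicit list overrides selected options without touching the command line.
custom_args = parser.parse_args(["--batch_size", "32"])

print(default_args.batch_size, custom_args.batch_size)
```

If the same approach is applied to the README example, it would mean passing a list to `total_parser.parse_args(...)`; which flags are accepted depends on what the library registers.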
+
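Each input record in the example uses the same five keys (`task_type`, `subtask_type`, `text`, `choices`, `id`). A minimal sketch of a helper that builds records in that shape follows. `make_ubert_sample` and `make_batch` are hypothetical names, not part of fengshen, and the task strings for tasks other than entity recognition are not given in this README and would need to be taken from the repository.

```python
from typing import Any, Dict, List


def make_ubert_sample(
    task_type: str,
    subtask_type: str,
    text: str,
    entity_types: List[str],
    sample_id: int,
) -> Dict[str, Any]:
    """Build one record in the shape of the README example (hypothetical helper)."""
    if not text:
        raise ValueError("text must be a non-empty string")
    if not entity_types:
        raise ValueError("at least one candidate entity type is required")
    return {
        "task_type": task_type,
        "subtask_type": subtask_type,
        "text": text,
        # Each candidate label is wrapped in its own {"entity_type": ...} dict.
        "choices": [{"entity_type": name} for name in entity_types],
        "id": sample_id,
    }


def make_batch(
    task_type: str,
    subtask_type: str,
    texts: List[str],
    entity_types: List[str],
) -> List[Dict[str, Any]]:
    """Number the texts from 0 and share one list of candidate labels across them."""
    return [
        make_ubert_sample(task_type, subtask_type, text, entity_types, i)
        for i, text in enumerate(texts)
    ]


# Rebuild the README's entity-recognition sample with the helper.
batch = make_batch(
    "抽取任务",
    "实体识别",
    ["这也让很多业主据此认为,雅清苑是政府公务员挤对了国家的经适房政策。"],
    ["小区名字", "岗位职责"],
)
print(batch[0]["choices"])
```

The resulting list has the same structure as `test_data` in the example, so it could be passed to `model.predict` in its place.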
+ If you find this resource useful, please cite the following website in your paper.
+ ```
+ @misc{Fengshenbang-LM,
+   title={Fengshenbang-LM},
+   author={IDEA-CCNL},
+   year={2021},
+   howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
+ }
+ ```