2f-dev committed · Commit 66c5c08 · 1 parent: 4d8750a

Update README.md (update model card)

Files changed (1): README.md (+169, −3)

---
license: mit
language:
- ko
- vi
metrics:
- bleu
base_model:
- facebook/mbart-large-50-many-to-many-mmt
pipeline_tag: translation
library_name: transformers
tags:
- mbart
- mbart-50
- text2text-generation
---

# Model Card for mbart-large-50-mmt-ko-vi

This model is fine-tuned from mBART-large-50 on a multilingual corpus of Korean legal documents for Korean-to-Vietnamese translation.

---

## Table of Contents

- [Model Card for mbart-large-50-mmt-ko-vi](#model-card-for-mbart-large-50-mmt-ko-vi)
- [Table of Contents](#table-of-contents)
- [Model Details](#model-details)
  - [Model Description](#model-description)
- [Uses](#uses)
  - [Direct Use](#direct-use)
  - [Out-of-Scope Use](#out-of-scope-use)
- [Bias, Risks, and Limitations](#bias-risks-and-limitations)
- [Training Details](#training-details)
  - [Training Data](#training-data)
  - [Training Procedure](#training-procedure)
    - [Preprocessing](#preprocessing)
  - [Speeds, Sizes, Times](#speeds-sizes-times)
- [Evaluation](#evaluation)
  - [Testing Data](#testing-data)
  - [Metrics](#metrics)
  - [Results](#results)
- [Environmental Impact](#environmental-impact)
- [Technical Specifications](#technical-specifications)
- [Citation](#citation)
- [Model Card Contact](#model-card-contact)

---

## Model Details

### Model Description

- **Developed by:** Jaeyoon Myoung, Heewon Kwak
- **Shared by:** ofu
- **Model type:** Language model (translation)
- **Language(s) (NLP):** Korean, Vietnamese
- **License:** MIT
- **Parent Model:** facebook/mbart-large-50-many-to-many-mmt

---

## Uses

### Direct Use

This model translates text from Korean to Vietnamese.
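
A minimal inference sketch using the Transformers API the card declares (`library_name: transformers`). The repository id below is a placeholder taken from the parent model; substitute the actual id of this fine-tuned checkpoint. `ko_KR` and `vi_VN` are the mBART-50 language codes for Korean and Vietnamese.

```python
MODEL_ID = "facebook/mbart-large-50-many-to-many-mmt"  # placeholder: replace with the fine-tuned checkpoint id
SRC_LANG = "ko_KR"  # mBART-50 language code for Korean
TGT_LANG = "vi_VN"  # mBART-50 language code for Vietnamese

def translate(text: str) -> str:
    # Imported lazily so the module can be inspected without transformers installed.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(MODEL_ID, src_lang=SRC_LANG)
    model = MBartForConditionalGeneration.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt")
    # Force the decoder to start with the Vietnamese language token.
    generated = model.generate(
        **inputs, forced_bos_token_id=tokenizer.lang_code_to_id[TGT_LANG]
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```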

### Out-of-Scope Use

This model is not suitable for translation tasks involving language pairs other than Korean-to-Vietnamese.

---

## Bias, Risks, and Limitations

The model may reproduce biases present in its training data and may produce inappropriate translations for sensitive topics.

---

## Training Details

### Training Data

The model was trained on a multilingual Korean legal-document translation dataset provided by AI Hub.

### Training Procedure

#### Preprocessing

- Removed unnecessary whitespace, special characters, and line breaks.
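
The exact cleaning rules are not published; a minimal sketch of this kind of normalization (assumed rules, not the authors' script) might look like:

```python
import re

def clean_text(text: str) -> str:
    # Illustrative preprocessing: the actual rules used for training are not published.
    text = re.sub(r"[\u200b\ufeff]", "", text)  # strip zero-width/BOM artifacts
    text = re.sub(r"[\r\n\t]+", " ", text)      # normalize line breaks and tabs to spaces
    text = re.sub(r"\s{2,}", " ", text)         # collapse repeated whitespace
    return text.strip()
```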

### Speeds, Sizes, Times

- **Training Time:** 1 hour 25 minutes (5,100 seconds) on an NVIDIA RTX 4090
- **Throughput:** ~3.51 samples/second
- **Total Training Samples:** 17,922
- **Model Checkpoint Size:** ~2.3 GB
- **Gradient Accumulation Steps:** 4
- **FP16 Mixed Precision Enabled:** Yes

---

## Evaluation

### Testing Data

The evaluation set was partially extracted from Korean labor-law precedents.

### Metrics

- BLEU
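
BLEU combines clipped n-gram precision with a brevity penalty. The reported score was computed with sacrebleu; the core of the metric can be sketched in pure Python (bigram BLEU for brevity, whereas sacrebleu uses n = 1..4 plus its own tokenization, so real scores will differ):

```python
import math
from collections import Counter

def bleu(hypothesis: str, reference: str, max_n: int = 2) -> float:
    """Toy single-pair BLEU: geometric mean of clipped n-gram
    precisions multiplied by a brevity penalty."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        overlap = sum((hyp_ngrams & ref_ngrams).values())  # counts clipped by reference
        precisions.append(overlap / max(sum(hyp_ngrams.values()), 1))
    if min(precisions) == 0:
        return 0.0
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```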

### Results

- **BLEU Score:** 29.69
- **Accuracy:** 95.65%

---

## Environmental Impact

- **Hardware Type:** NVIDIA RTX 4090
- **Power Consumption:** ~450 W
- **Training Time:** 1 hour 25 minutes (~1.42 hours)
- **Electricity Consumption:** ~0.639 kWh
- **Carbon Emission Factor (South Korea):** 0.459 kgCO₂/kWh
- **Estimated Carbon Emissions:** ~0.293 kgCO₂
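
The figures above follow directly from power × time × grid emission factor:

```python
power_kw = 0.450         # ~450 W draw of the RTX 4090, in kW
hours = 1.42             # 1 hour 25 minutes of training
emission_factor = 0.459  # kgCO2 per kWh, South Korean grid

energy_kwh = power_kw * hours                 # 0.450 * 1.42 = 0.639 kWh
emissions_kg = energy_kwh * emission_factor   # 0.639 * 0.459 ≈ 0.293 kgCO2
```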

---

## Technical Specifications

- **Model Architecture:**
  Based on mBART-large-50, a multilingual sequence-to-sequence Transformer designed for translation. The architecture has 12 encoder and 12 decoder layers with a hidden size of 1,024.

- **Software:**
  - sacrebleu for evaluation
  - Hugging Face Transformers library for fine-tuning
  - Python 3.11.9 and PyTorch 2.4.0

- **Hardware:**
  An NVIDIA RTX 4090 with 24 GB of VRAM was used for training and inference.

- **Tokenization and Preprocessing:**
  Tokenization used the SentencePiece model pre-trained with mBART-large-50. Text preprocessing included removing special characters and unnecessary whitespace and normalizing line breaks.

- **Optimizer and Hyperparameters:**
  - Optimizer: AdamW
  - Learning Rate: 1e-4
  - Batch Size: 8 (per device)
  - Gradient Accumulation Steps: 4
  - Label Smoothing Factor: 0.1
  - FP16 Mixed Precision Enabled: Yes
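
These settings correspond to `transformers.Seq2SeqTrainingArguments` fields; a sketch of the mapping (field names are from the Transformers API, the training script itself is not published):

```python
# Listed hyperparameters, keyed by the matching Seq2SeqTrainingArguments fields.
# AdamW is the Trainer's default optimizer, so no explicit setting is needed.
training_kwargs = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 4,
    "label_smoothing_factor": 0.1,
    "fp16": True,
}

# With accumulation, gradients are applied once every 4 forward passes,
# so the effective batch size per optimizer step on one GPU is 8 * 4.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
```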

---

## Citation

No papers or blog posts are currently available for this model.

---

## Model Card Contact

- **Contact Email:** audwodbs492@ofu.co.kr