ofu-ai/mbart-large-50-mmt-ko-vi

+---
+license: mit
+language:
+- ko
+- vi
+metrics:
+- bleu
+base_model:
+- facebook/mbart-large-50-many-to-many-mmt
+pipeline_tag: translation
+library_name: transformers
+tags:
+- mbart
+- mbart-50
+- text2text-generation
+---
+# Model Card for mbart-large-50-mmt-ko-vi
+This model is a fine-tune of `facebook/mbart-large-50-many-to-many-mmt` for Korean-to-Vietnamese translation, trained on multilingual translation data built from Korean legal documents.
+---
+## Table of Contents
+- [Model Card for mbart-large-50-mmt-ko-vi](#model-card-for-mbart-large-50-mmt-ko-vi)
+- [Table of Contents](#table-of-contents)
+- [Model Details](#model-details)
+  - [Model Description](#model-description)
+- [Uses](#uses)
+  - [Direct Use](#direct-use)
+  - [Out-of-Scope Use](#out-of-scope-use)
+- [Bias, Risks, and Limitations](#bias-risks-and-limitations)
+- [Training Details](#training-details)
+  - [Training Data](#training-data)
+  - [Training Procedure](#training-procedure)
+    - [Preprocessing](#preprocessing)
+    - [Speeds, Sizes, Times](#speeds-sizes-times)
+- [Evaluation](#evaluation)
+  - [Testing Data](#testing-data)
+  - [Metrics](#metrics)
+  - [Results](#results)
+- [Environmental Impact](#environmental-impact)
+- [Technical Specifications](#technical-specifications)
+- [Citation](#citation)
+- [Model Card Contact](#model-card-contact)
+---
+## Model Details
+### Model Description
+- **Developed by:** Jaeyoon Myoung, Heewon Kwak
+- **Shared by:** ofu
+- **Model type:** Language model (Translation)
+- **Language(s) (NLP):** Korean, Vietnamese
+- **License:** Apache 2.0
+- **Parent Model:** facebook/mbart-large-50-many-to-many-mmt
+---
+## Uses
+### Direct Use
+This model translates Korean text into Vietnamese.
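A minimal inference sketch follows. It assumes the checkpoint is published under the repository id shown in the page header (`ofu-ai/mbart-large-50-mmt-ko-vi`) and that it keeps the `ko_KR` and `vi_VN` language codes of the parent mBART-50 model. The `translate` helper and its generation settings (`max_new_tokens`, `num_beams`) are illustrative choices, not values taken from the authors' code.

```python
# Assumed repository id (from the page header) and mBART-50 language codes.
MODEL_ID = "ofu-ai/mbart-large-50-mmt-ko-vi"
SRC_LANG = "ko_KR"
TGT_LANG = "vi_VN"


def translate(texts, model_id=MODEL_ID, max_new_tokens=256, num_beams=4):
    """Translate a list of Korean strings into Vietnamese (hypothetical helper)."""
    # Imported lazily so the constants above are usable without the heavy dependency.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(model_id, src_lang=SRC_LANG)
    model = MBartForConditionalGeneration.from_pretrained(model_id)

    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(
        **batch,
        # mBART-50 selects the output language through the first generated token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
        num_beams=num_beams,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["근로자는 계약을 해지할 수 있다."])` returns a list with one Vietnamese string; the first call downloads the checkpoint (about 2.3GB according to this card).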
+### Out-of-Scope Use
+This model is not suitable for language pairs or translation directions other than Korean to Vietnamese.
+---
+## Bias, Risks, and Limitations
+The model may contain biases inherited from the training data and may produce inappropriate translations for sensitive topics.
+---
+## Training Details
+### Training Data
+The model was trained using multilingual translation data of Korean legal documents provided by AI Hub.
+### Training Procedure
+#### Preprocessing
+- Removed unnecessary whitespace, special characters, and line breaks.
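The cleaning step above could look roughly like the sketch below. The card does not list which characters count as "special", so the allowed set here (letters, digits, whitespace, and basic punctuation) is an assumption, as is the Unicode NFC normalization; `clean_text` is a hypothetical helper name.

```python
import re
import unicodedata

# Assumption: "special characters" means anything outside letters, digits,
# whitespace, and basic punctuation. The card does not give the exact set.
_DISALLOWED = re.compile(r"[^\w\s.,;:!?()\[\]\"'%/-]", flags=re.UNICODE)
_WHITESPACE = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Normalize one sentence before tokenization (hypothetical helper)."""
    text = unicodedata.normalize("NFC", text)  # compose Hangul and Vietnamese diacritics
    text = _DISALLOWED.sub(" ", text)          # replace symbols outside the allowed set
    text = _WHITESPACE.sub(" ", text)          # collapse spaces, tabs, and line breaks
    return text.strip()
```

For legal text it may be worth keeping parentheses and article numbers intact, which is why the sketch preserves digits and brackets rather than stripping all punctuation.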
+#### Speeds, Sizes, Times
+- **Training Time:** 1 hour 25 minutes (5,100 seconds) on Nvidia RTX 4090
+- **Throughput:** ~3.51 samples/second
+- **Total Training Samples:** 17,922
+- **Model Checkpoint Size:** Approximately 2.3GB
+- **Gradient Accumulation Steps:** 4
+- **FP16 Mixed Precision Enabled:** Yes
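As a quick consistency check, the reported throughput can be recomputed from the training time and sample count listed above. The variable and function names are illustrative.

```python
# Figures reported in this card.
TRAIN_SECONDS = 1 * 3600 + 25 * 60   # 1 hour 25 minutes = 5,100 seconds
TOTAL_SAMPLES = 17_922


def throughput(samples: int, seconds: float) -> float:
    """Average number of training samples processed per second."""
    return samples / seconds


samples_per_second = throughput(TOTAL_SAMPLES, TRAIN_SECONDS)
print(f"{samples_per_second:.2f} samples/second")
```

The result rounds to the ~3.51 samples/second stated in the list.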
+---
+## Evaluation
+### Testing Data
+The model was evaluated on a subset of data extracted from Korean labor law precedents.
+### Metrics
+- BLEU
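A corpus-level BLEU score can be computed with sacrebleu, which the card lists as the evaluation tool. This is a sketch under assumptions: the card does not state the tokenizer or other sacrebleu settings, so the library defaults are used here, and `corpus_bleu_score` is a hypothetical helper.

```python
def corpus_bleu_score(hypotheses, references):
    """Return corpus-level BLEU for parallel lists of strings (hypothetical helper)."""
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")
    # Imported lazily so input validation works without the dependency installed.
    import sacrebleu

    # sacrebleu expects a list of reference streams; a single stream is used here.
    return sacrebleu.corpus_bleu(hypotheses, [references]).score
```

Scores from different sacrebleu settings are not directly comparable, so the defaults used here may not reproduce the figure reported below.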
+### Results
+- **BLEU Score:** 29.69
+- **Accuracy:** 95.65%
+---
+## Environmental Impact
+- **Hardware Type:** NVIDIA RTX 4090
+- **Power Consumption:** ~450W
+- **Training Time:** 1 hour 25 minutes (1.42 hours)
+- **Electricity Consumption:** ~0.639 kWh
+- **Carbon Emission Factor (South Korea):** 0.459 kgCO₂/kWh
+- **Estimated Carbon Emissions:** ~0.293 kgCO₂
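The estimate above follows from two multiplications: energy is power times time, and emissions are energy times the grid emission factor. The sketch below reproduces the arithmetic with the card's rounded figures; the variable names are illustrative.

```python
# Figures reported in this card.
POWER_KW = 0.450          # ~450 W GPU power draw
TRAIN_HOURS = 1.42        # 1 hour 25 minutes, rounded as in the card
EMISSION_FACTOR = 0.459   # kgCO2 per kWh (South Korea)

energy_kwh = POWER_KW * TRAIN_HOURS            # electricity consumed
emissions_kg = energy_kwh * EMISSION_FACTOR    # estimated carbon emissions

print(f"energy: {energy_kwh:.3f} kWh, emissions: {emissions_kg:.3f} kgCO2")
```

This counts GPU power only; CPU, memory, and cooling overhead are not included in the card's estimate.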
+---
+## Technical Specifications
+- **Model Architecture:**
+  Based on mBART-large-50, a multilingual sequence-to-sequence Transformer designed for translation tasks. The architecture has 12 encoder layers and 12 decoder layers (24 in total) with a hidden size of 1,024.
+- **Software:**
+  - sacrebleu for evaluation
+  - Hugging Face Transformers library for fine-tuning
+  - Python 3.11.9 and PyTorch 2.4.0
+- **Hardware:**
+  NVIDIA RTX 4090 with 24GB VRAM was used for training and inference.
+- **Tokenization and Preprocessing:**
+  Tokenization used the SentencePiece model that ships with mBART-large-50. Text preprocessing consisted of removing special characters and unnecessary whitespace and normalizing line breaks.
+- **Optimizer and Hyperparameters:**
+  - Optimizer: AdamW
+  - Learning Rate: 1e-4
+  - Batch Size: 8 (per device)
+  - Gradient Accumulation Steps: 4
+  - Label Smoothing Factor: 0.1
+  - FP16 Mixed Precision Enabled: Yes
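The hyperparameters above map onto `Seq2SeqTrainingArguments` keyword names in Hugging Face Transformers roughly as follows. This is a sketch, not the authors' training script: the number of epochs, scheduler, and other settings are not given in the card and are left at library defaults, `adamw_torch` is assumed as the AdamW variant, and a single GPU is assumed for the effective batch size.

```python
# Values from the card; keys are Seq2SeqTrainingArguments keyword names.
TRAINING_KWARGS = {
    "optim": "adamw_torch",              # assumption: PyTorch AdamW implementation
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 4,
    "label_smoothing_factor": 0.1,
    "fp16": True,
}


def effective_batch_size(kwargs, num_devices=1):
    """Samples contributing to each optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_args(output_dir):
    """Create training arguments (hypothetical helper; needs transformers installed)."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

With one RTX 4090, `effective_batch_size(TRAINING_KWARGS)` gives 32 samples per optimizer step.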
+---
+## Citation
+Currently, there are no papers or blog posts available for this model.
+---
+## Model Card Contact
+- **Contact Email:** audwodbs492@ofu.co.kr