atsuki-yamaguchi committed on
Commit
9a1e51c
1 Parent(s): d18abd5

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -6,7 +6,7 @@ language:
  base_model: meta-llama/Meta-Llama-3-8B
  library_name: transformers
  ---
- # Llama3 8B for Telugu: 50 target vocabulary size + Mean target vocabulary initialization + T&B2LS/MTP/512 training
+ # Llama3 8B for Telugu: 50 target vocabulary size + Mean target vocabulary initialization + 2x2LS/MTP/512 training
 
  This model is built on top of Llama3 8B adapted for Telugu using 30K target language sentences sampled from CC-100.
 
@@ -14,7 +14,7 @@ This model is built on top of Llama3 8B adapted for Telugu using 30K target lang
 
  * **Vocabulary**: This model has an additional 50 target vocabulary.
  * **Target vocabulary initialization**: The target weights of the embedding and LM head were initialized using Mean initialization.
- * **Training**: This model was additionally pre-trained on 30K target language sentences sampled from CC-100. The training was conducted with the T&B2LS/MTP/512 strategies introduced in the paper.
+ * **Training**: This model was additionally pre-trained on 30K target language sentences sampled from CC-100. The training was conducted with the 2x2LS/MTP/512 strategies introduced in the paper.
 
  ## Model Description
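
Note on "Mean initialization": the README lines in this diff name the technique but do not define it. The sketch below is a rough illustration only, under the assumption that each new target token's vector is the element-wise mean of the source-model embedding rows for the pieces the original tokenizer splits that token into. The function name `mean_init`, the toy embeddings, and the token names are all hypothetical and do not come from the repository or the paper.

```python
from typing import Dict, List, Sequence


def mean_init(
    source_embeddings: Sequence[Sequence[float]],
    new_token_pieces: Dict[str, List[int]],
) -> Dict[str, List[float]]:
    """Return one vector per new token: the element-wise mean of its source rows.

    Assumption (not confirmed by the diff): `new_token_pieces` maps each new
    token to the ids of the source-vocabulary pieces it decomposes into.
    """
    dim = len(source_embeddings[0])
    result: Dict[str, List[float]] = {}
    for token, piece_ids in new_token_pieces.items():
        if not piece_ids:
            raise ValueError(f"no source pieces listed for {token!r}")
        rows = [source_embeddings[i] for i in piece_ids]
        # Average each dimension across the selected source rows.
        result[token] = [sum(row[d] for row in rows) / len(rows) for d in range(dim)]
    return result


# Toy 4-token source vocabulary with 2-dimensional embeddings (hypothetical data).
toy_embeddings = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
toy_pieces = {"new_token_a": [0, 1], "new_token_b": [1, 2, 3]}

new_rows = mean_init(toy_embeddings, toy_pieces)
print(new_rows)
```

Under the same assumption, the same averaging would be applied separately to the input embedding matrix and to the LM head, since the README states that both were initialized this way.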