05/10/2024 16:54:21 - INFO - transformers.tokenization_utils_base - loading file vocab.json from cache at /Users/joshcarp/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B/snapshots/8f445e3628f3500ee69f24e1303c9f10f5342a39/vocab.json
05/10/2024 16:54:21 - INFO - transformers.tokenization_utils_base - loading file merges.txt from cache at /Users/joshcarp/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B/snapshots/8f445e3628f3500ee69f24e1303c9f10f5342a39/merges.txt
05/10/2024 16:54:21 - INFO - transformers.tokenization_utils_base - loading file tokenizer.json from cache at /Users/joshcarp/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B/snapshots/8f445e3628f3500ee69f24e1303c9f10f5342a39/tokenizer.json
05/10/2024 16:54:21 - INFO - transformers.tokenization_utils_base - loading file added_tokens.json from cache at None
05/10/2024 16:54:21 - INFO - transformers.tokenization_utils_base - loading file special_tokens_map.json from cache at None
05/10/2024 16:54:21 - INFO - transformers.tokenization_utils_base - loading file tokenizer_config.json from cache at /Users/joshcarp/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B/snapshots/8f445e3628f3500ee69f24e1303c9f10f5342a39/tokenizer_config.json
05/10/2024 16:54:21 - WARNING - transformers.tokenization_utils_base - Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
05/10/2024 16:54:21 - INFO - llmtuner.data.loader - Loading dataset joshcarp/evy-dataset...
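Outside LLaMA-Factory, these first two steps are just the standard Hugging Face loaders. A minimal sketch, assuming the same model and dataset IDs shown in the log:

```python
from transformers import AutoTokenizer
from datasets import load_dataset

# Pull the Qwen1.5-0.5B tokenizer and the fine-tuning dataset from the Hub
# (both end up cached under ~/.cache/huggingface, as in the log above).
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B")
dataset = load_dataset("joshcarp/evy-dataset")
```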
|
|
|
05/10/2024 16:54:23 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /Users/joshcarp/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B/snapshots/8f445e3628f3500ee69f24e1303c9f10f5342a39/config.json
05/10/2024 16:54:23 - INFO - transformers.configuration_utils - Model config Qwen2Config {
  "_name_or_path": "Qwen/Qwen1.5-0.5B",
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "hidden_act": "silu",
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 2816,
  "max_position_embeddings": 32768,
  "max_window_layers": 21,
  "model_type": "qwen2",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "num_key_value_heads": 16,
  "rms_norm_eps": 1e-06,
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.40.2",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
}
|
|
|
|
|
05/10/2024 16:55:38 - INFO - transformers.modeling_utils - loading weights file model.safetensors from cache at /Users/joshcarp/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B/snapshots/8f445e3628f3500ee69f24e1303c9f10f5342a39/model.safetensors
05/10/2024 16:55:38 - INFO - transformers.modeling_utils - Instantiating Qwen2ForCausalLM model under default dtype torch.float32.
05/10/2024 16:55:38 - INFO - transformers.generation.configuration_utils - Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "use_cache": false
}
05/10/2024 16:55:40 - INFO - transformers.modeling_utils - All model checkpoint weights were used when initializing Qwen2ForCausalLM.
05/10/2024 16:55:40 - INFO - transformers.modeling_utils - All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at Qwen/Qwen1.5-0.5B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
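One detail worth noticing: the checkpoint config advertises bfloat16, but without an explicit torch_dtype transformers instantiates the model under its default torch.float32, which is exactly what the log reports. A minimal sketch of the equivalent call:

```python
import torch
from transformers import AutoModelForCausalLM

# Default load: parameters are materialized in float32 (matches the
# "Instantiating ... under default dtype torch.float32" line above).
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B")

# To keep the checkpoint's bfloat16 instead, pass the dtype explicitly:
# model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B", torch_dtype=torch.bfloat16)
```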
|
|
|
05/10/2024 16:55:40 - INFO - transformers.generation.configuration_utils - loading configuration file generation_config.json from cache at /Users/joshcarp/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B/snapshots/8f445e3628f3500ee69f24e1303c9f10f5342a39/generation_config.json
05/10/2024 16:55:40 - INFO - transformers.generation.configuration_utils - Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "max_new_tokens": 2048
}
|
|
|
|
|
05/10/2024 16:55:40 - INFO - llmtuner.model.utils.checkpointing - Gradient checkpointing enabled.
05/10/2024 16:55:40 - INFO - llmtuner.model.utils.attention - Using torch SDPA for faster training and inference.
05/10/2024 16:55:40 - INFO - llmtuner.model.adapter - Fine-tuning method: LoRA
05/10/2024 16:55:40 - INFO - llmtuner.model.loader - trainable params: 786432 || all params: 464774144 || trainable%: 0.1692
05/10/2024 16:55:40 - INFO - transformers.trainer - You have loaded a model on multiple GPUs. `is_model_parallel` attribute will be force-set to `True` to avoid any unexpected behavior such as device placement mismatching.
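The trainable-parameter count pins down the adapter shape. A minimal peft sketch that reproduces it, assuming rank-8 LoRA on the attention q_proj and v_proj modules: 24 layers × 2 modules × (1024×8 + 8×1024) = 786,432 trainable parameters, about 0.17% of the 464,774,144 total.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B")
model.gradient_checkpointing_enable()  # matches "Gradient checkpointing enabled."

# Assumption: r=8 on q_proj/v_proj, which reproduces the 786,432 trainable
# parameters reported by llmtuner.model.loader above.
# lora_alpha is a guess; it does not affect the parameter count.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# trainable params: 786,432 || all params: 464,774,144 || trainable%: 0.1692
```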
|
|
|
05/10/2024 16:55:41 - INFO - transformers.trainer - ***** Running training *****
05/10/2024 16:55:41 - INFO - transformers.trainer - Num examples = 235
05/10/2024 16:55:41 - INFO - transformers.trainer - Num Epochs = 3
05/10/2024 16:55:41 - INFO - transformers.trainer - Instantaneous batch size per device = 2
05/10/2024 16:55:41 - INFO - transformers.trainer - Total train batch size (w. parallel, distributed & accumulation) = 16
05/10/2024 16:55:41 - INFO - transformers.trainer - Gradient Accumulation steps = 8
05/10/2024 16:55:41 - INFO - transformers.trainer - Total optimization steps = 42
05/10/2024 16:55:41 - INFO - transformers.trainer - Number of trainable parameters = 786,432
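These numbers line up: a per-device batch of 2 with 8 gradient-accumulation steps gives an effective batch of 16; 235 examples make 118 micro-batches per epoch, 118 // 8 = 14 optimizer updates per epoch, and 14 × 3 epochs = 42 total optimization steps.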
|
|
|
05/10/2024 16:56:35 - INFO - llmtuner.extras.callbacks - {'loss': 3.1521, 'learning_rate': 4.8272e-05, 'epoch': 0.34}
05/10/2024 16:58:24 - INFO - llmtuner.extras.callbacks - {'loss': 2.7178, 'learning_rate': 4.3326e-05, 'epoch': 0.68}
05/10/2024 17:00:15 - INFO - llmtuner.extras.callbacks - {'loss': 2.7124, 'learning_rate': 3.5847e-05, 'epoch': 1.02}
05/10/2024 17:01:32 - INFO - llmtuner.extras.callbacks - {'loss': 3.0056, 'learning_rate': 2.6868e-05, 'epoch': 1.36}
05/10/2024 17:03:03 - INFO - llmtuner.extras.callbacks - {'loss': 2.6855, 'learning_rate': 1.7631e-05, 'epoch': 1.69}
05/10/2024 17:06:33 - INFO - llmtuner.extras.callbacks - {'loss': 2.3657, 'learning_rate': 9.4128e-06, 'epoch': 2.03}
05/10/2024 17:08:15 - INFO - llmtuner.extras.callbacks - {'loss': 2.5569, 'learning_rate': 3.3494e-06, 'epoch': 2.37}
05/10/2024 17:10:03 - INFO - llmtuner.extras.callbacks - {'loss': 2.9284, 'learning_rate': 2.7923e-07, 'epoch': 2.71}
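Reading these callback lines: each one fires every 5 optimizer steps (5 × 16 / 235 ≈ 0.34 epochs), the learning rate decays from about 5e-5 toward zero over the 42 steps in a way consistent with a cosine schedule, and the training loss drifts from roughly 3.15 down into the mid-2s, noisy as expected on only 235 examples.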
|
|
|
05/10/2024 17:10:42 - INFO - transformers.trainer -

Training completed. Do not forget to share your model on huggingface.co/models =)

05/10/2024 17:10:42 - INFO - transformers.trainer - Saving model checkpoint to saves/Qwen1.5-0.5B/lora/train_2024-05-10-16-53-38
|
|
|
05/10/2024 17:10:42 - INFO - transformers.configuration_utils - loading configuration file config.json from cache at /Users/joshcarp/.cache/huggingface/hub/models--Qwen--Qwen1.5-0.5B/snapshots/8f445e3628f3500ee69f24e1303c9f10f5342a39/config.json
05/10/2024 17:10:42 - INFO - transformers.configuration_utils - Model config Qwen2Config {
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "hidden_act": "silu",
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 2816,
  "max_position_embeddings": 32768,
  "max_window_layers": 21,
  "model_type": "qwen2",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "num_key_value_heads": 16,
  "rms_norm_eps": 1e-06,
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.40.2",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
}
05/10/2024 17:10:42 - INFO - transformers.tokenization_utils_base - tokenizer config file saved in saves/Qwen1.5-0.5B/lora/train_2024-05-10-16-53-38/tokenizer_config.json
05/10/2024 17:10:42 - INFO - transformers.tokenization_utils_base - Special tokens file saved in saves/Qwen1.5-0.5B/lora/train_2024-05-10-16-53-38/special_tokens_map.json
05/10/2024 17:10:42 - INFO - transformers.modelcard - Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}}
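Once the adapter is saved, it can be attached back onto the base model for inference. A minimal sketch with peft, using the output directory from the log (the prompt is a placeholder):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

adapter_dir = "saves/Qwen1.5-0.5B/lora/train_2024-05-10-16-53-38"

# Load the base model, then apply the freshly trained LoRA adapter on top.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B")
model = PeftModel.from_pretrained(base, adapter_dir)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B")  # tokenizer files were also saved to adapter_dir

prompt = "..."  # an evy prompt would go here
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=2048)  # matches the saved generation config
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```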
|
|
|
|