This repository showcases the fine-tuning of the Llama-3.2-1B model on the SST-2 dataset for binary sentiment classification.
Model Used: meta-llama/Llama-3.2-1B
Pre-trained Parameters: The model comprises approximately 1.03 billion parameters, confirmed through code inspection and consistent with the official documentation.
Fine-tuned Parameters: The parameter count remains unchanged during fine-tuning, as the task updates existing model weights without adding new layers or parameters.
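The parameter count can be checked with a quick inspection; below is a minimal sketch, assuming the `transformers` library and that access to the gated meta-llama/Llama-3.2-1B checkpoint has been granted.

```python
# Sketch: load the base checkpoint and count its parameters.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
total_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total_params:,}")
```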
3. Dataset and Task Details
Dataset: SST-2
The Stanford Sentiment Treebank (SST-2) dataset is widely used for binary sentiment classification.
The dataset consists of sentences labeled with either positive or negative sentiment.
Task Objective: Train the model to classify sentences into the appropriate sentiment category based on contextual cues.
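SST-2 is distributed as part of the GLUE benchmark on the Hugging Face Hub; the exact data-loading code is not shown in this README, but a minimal sketch using the `datasets` library looks like this:

```python
# Sketch: load SST-2 from GLUE and inspect one labeled example.
from datasets import load_dataset

sst2 = load_dataset("glue", "sst2")
print(sst2)  # train / validation / test splits
example = sst2["train"][0]
label = "positive" if example["label"] == 1 else "negative"
print(f'{example["sentence"]!r} -> {label}')
```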
4. Fine-Tuning Approach
Train-Test Split: The dataset was split in an 80:20 ratio using stratified sampling to ensure balanced representation of the sentiment classes.
Tokenization: Input text was tokenized with padding and truncation to a maximum length of 128 tokens.
Model Training: Fine-tuning updated task-specific weights over three epochs with a learning rate of 2e-5 (a minimal sketch of this setup is shown below).
Hardware: Training was performed on GPU-enabled hardware for accelerated computation.
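The description above maps onto a standard `transformers` Trainer workflow. The following is a minimal sketch rather than the exact training script: only the 80:20 stratified split, the 128-token maximum length, the three epochs, and the 2e-5 learning rate come from this README; the batch size, seed, and output path are illustrative assumptions.

```python
# Sketch of the fine-tuning setup described above (hyperparameters from the README,
# everything else assumed for illustration).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token

model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
model.config.pad_token_id = tokenizer.pad_token_id

# 80:20 stratified split over the SST-2 training sentences.
data = load_dataset("glue", "sst2")["train"]
splits = data.train_test_split(test_size=0.2, stratify_by_column="label", seed=42)

def tokenize(batch):
    # Pad and truncate to the 128-token maximum used in this work.
    return tokenizer(batch["sentence"], padding="max_length",
                     truncation=True, max_length=128)

tokenized = splits.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="llama-3.2-1b-sst2-finetuned",  # illustrative output path
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=8,
)

trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"],
                  eval_dataset=tokenized["test"])
trainer.train()
```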
5. Results and Observations
Zero-shot vs. Fine-tuned Performance: In its zero-shot state, the pre-trained Llama model exhibited moderate performance on SST-2; after fine-tuning, it classified sentiment substantially more accurately.
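No exact accuracy figures are reported here. As a hedged sketch, the held-out 20% split from the setup above can be scored as follows for the fine-tuned classifier; how the zero-shot baseline was evaluated is not described in this README.

```python
# Sketch: accuracy on the held-out split, assuming `model` and `tokenized` from
# the fine-tuning sketch above are in memory.
import torch

def held_out_accuracy(model, dataset, batch_size=32):
    model.eval()
    correct = 0
    for start in range(0, len(dataset), batch_size):
        batch = dataset[start:start + batch_size]
        inputs = {k: torch.tensor(batch[k]).to(model.device)
                  for k in ("input_ids", "attention_mask")}
        with torch.no_grad():
            preds = model(**inputs).logits.argmax(dim=-1)
        labels = torch.tensor(batch["label"]).to(model.device)
        correct += (preds == labels).sum().item()
    return correct / len(dataset)

print(f"Held-out accuracy: {held_out_accuracy(model, tokenized['test']):.3f}")
```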
With the fine-tuned model and tokenizer in memory (for example, from the training sketch above), a single sentence can be classified as follows:

```python
text = "An absolutely delightful film."  # illustrative example sentence
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
outputs = model(**inputs)
sentiment = "Positive" if outputs.logits.argmax() == 1 else "Negative"
print(f"Predicted Sentiment: {sentiment}")
```
7. Key Takeaways
Fine-tuning the Llama model for SST-2 significantly enhances its performance on binary sentiment classification tasks.
The parameter count of the model remains constant during fine-tuning, demonstrating that improvements are achieved by optimizing existing weights.
This work highlights the adaptability of Llama for downstream NLP tasks when fine-tuned on task-specific datasets.
8. Acknowledgments
Hugging Face Transformers library for facilitating model fine-tuning.
Stanford Sentiment Treebank for providing a robust dataset for sentiment classification.