Model Card for Model ID: NepaliAI/NFT-6.9k
Model Details
Model Description
The NepaliAI/NFT-6.9k model is based on the BART (Bidirectional and Auto-Regressive Transformers) architecture and starts from the facebook/bart-large-xsum pre-trained checkpoint. It has been fine-tuned on a Nepali health-related question-answering dataset.
Intended Use
The model is designed to generate answers to health-related questions provided by users. The primary language for input and output is Nepali.
Training Data
The model has been fine-tuned on health-related question-answer pairs derived from the NepaliAI/Nepali-Health-QA dataset. Each training example pairs a health-related question with its corresponding answer.
Training Procedure
The model was trained for 5 epochs with the following training parameters:
- Learning Rate: 5e-5
- Batch Size: 2
- Gradient Accumulation Steps: 4
- FP16 (mixed-precision training): Enabled
- Optimizer: AdamW with weight decay
The training loss decreased consistently over the run, which indicates that the model fit the training data.
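The parameters above interact: with a batch size of 2 and 4 gradient accumulation steps, each optimizer update sees 8 examples. The following is a minimal sketch of that arithmetic, not the authors' training script. The `TrainConfig` class and the dataset size of 1000 are hypothetical, and the sketch assumes a single device and that a final partial accumulation group still counts as one update (trainers differ on this).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters listed in this card."""
    learning_rate: float = 5e-5
    per_device_batch_size: int = 2
    gradient_accumulation_steps: int = 4
    num_epochs: int = 5
    fp16: bool = True

    @property
    def effective_batch_size(self) -> int:
        # Examples contributing to one optimizer update (single device assumed).
        return self.per_device_batch_size * self.gradient_accumulation_steps

    def optimizer_steps(self, num_examples: int) -> int:
        # Mini-batches per epoch, then accumulation groups per epoch.
        # Assumption: a trailing partial group still triggers an update.
        batches = math.ceil(num_examples / self.per_device_batch_size)
        steps_per_epoch = math.ceil(batches / self.gradient_accumulation_steps)
        return steps_per_epoch * self.num_epochs


config = TrainConfig()
print("Effective batch size:", config.effective_batch_size)
# 1000 is an illustrative dataset size, not the size of the real training set.
print("Optimizer steps for 1000 examples:", config.optimizer_steps(1000))
```

Accumulating gradients this way keeps per-step memory at the level of a batch of 2 while giving each update the gradient signal of a batch of 8.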
Usage
from transformers import BartTokenizer, BartForConditionalGeneration
# Load the trained model
model = BartForConditionalGeneration.from_pretrained("NepaliAI/NFT-6.9k")
# Load the tokenizer for generating new output
tokenizer = BartTokenizer.from_pretrained("NepaliAI/NFT-6.9k")
# Example text
input_text = "के म मेरो महिनावारीको समयमा स्ट्रेप थ्रोटको लागि डाक्टरले तोकेको औषधि लिन सक्छु?"
# Tokenize the input
inputs = tokenizer(input_text, return_tensors="pt", max_length=128, truncation=True)
# Generate output token IDs (sampling is enabled, so results vary between runs)
generated_ids = model.generate(**inputs, max_length=256, top_p=0.95, top_k=50, do_sample=True, temperature=0.7, num_return_sequences=1, no_repeat_ngram_size=2)
# Decode the generated token IDs into text
generated_response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
# Print the generated response
print("Generated Response:", generated_response)
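Because the usage example above samples (`do_sample=True`), the same question can yield a different answer on each run. If reproducible output matters more than variety, beam search is one possible alternative. The sketch below is an assumption rather than a recommendation from the model authors: the `answer` helper and the `BEAM_KWARGS` values (such as `num_beams=4`) are illustrative and untuned, while `SAMPLING_KWARGS` mirrors the settings already shown.

```python
# Sampling settings copied from the usage example in this card.
SAMPLING_KWARGS = {
    "max_length": 256,
    "do_sample": True,
    "top_p": 0.95,
    "top_k": 50,
    "temperature": 0.7,
    "num_return_sequences": 1,
    "no_repeat_ngram_size": 2,
}

# Assumed alternative: deterministic beam search (illustrative, untuned values).
BEAM_KWARGS = {
    "max_length": 256,
    "do_sample": False,
    "num_beams": 4,
    "early_stopping": True,
    "no_repeat_ngram_size": 2,
}


def answer(question, model, tokenizer, generation_kwargs):
    """Hypothetical helper: tokenize a question, generate, and decode one answer."""
    inputs = tokenizer(question, return_tensors="pt", max_length=128, truncation=True)
    output_ids = model.generate(**inputs, **generation_kwargs)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

With the `model` and `tokenizer` loaded as in the usage example, `answer(input_text, model, tokenizer, BEAM_KWARGS)` would return a single decoded string, and passing `SAMPLING_KWARGS` instead reproduces the original sampling behavior.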
Evaluation
Metrics
No evaluation performed yet.
Limitations
- The model's knowledge is limited to the training data, and it might not generalize well to unseen health-related questions.
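- Response quality may vary with the complexity and phrasing of the input question, and because the usage example samples during generation, repeated runs on the same question can produce different answers.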
Ethical Considerations
Intended Use
The model is intended for informational purposes related to health. Users should be aware that its responses are generated from patterns learned from the training data and are not a substitute for professional medical advice.
Bias
Care should be taken to minimize biases present in the training data. Diverse and representative datasets are crucial to mitigate biases in the model's responses.
Privacy
The model does not store or retain user input. However, users are advised not to input sensitive or personally identifiable information.
Future Directions
- Continuous improvement through hyperparameter tuning and model architecture exploration.
- Expansion of the training dataset to enhance the model's knowledge and performance.
- Adaptation of the model to Nepali datasets in other fields, with health-related datasets as the starting point.
License
This model is made available under the Hugging Face Model Hub's community guidelines and the license specified by the facebook/bart-large-xsum pre-trained model.