---
license: mit
datasets:
- rajpurkar/squad
language:
- en
metrics:
- accuracy
base_model:
- google-bert/bert-base-cased
- distilbert/distilbert-base-uncased
- google/electra-small-discriminator
- distilbert/distilgpt2
- jinmang2/retro-reader
pipeline_tag: question-answering
library_name: transformers
tags:
- question-answering
- SQuAD
- BERT
- DistilBERT
- ELECTRA
- GPT-2
- transformers
- machine-learning
- natural-language-processing
---
# AAI-520 Final Project Models
This repository contains the fine-tuned models developed for the AAI-520 Final Project: SQuAD Q&A ChatBot. All models are fine-tuned on the Stanford Question Answering Dataset (SQuAD) and are designed for question-answering tasks across several architectures.
## Authors

## Table of Contents

- [Introduction](#introduction)
- [Available Models](#available-models)
- [Model Details](#model-details)
- [Usage](#usage)
- [Citations](#citations)
- [License](#license)
- [Acknowledgments](#acknowledgments)
## Introduction
The models in this repository are part of a project aimed at developing a generative-based chatbot capable of engaging in multi-turn conversations, adapting to context, and handling a wide range of topics. By leveraging the SQuAD dataset, these models are fine-tuned to provide accurate and contextually relevant responses to user queries.
## Available Models

The following fine-tuned models are available in this repository:

- BERT-base-cased Models
- DistilBERT-base-uncased Model
- DistilGPT-2 Model
- Retro-Reader Models
- ELECTRA Models (Recommended)

**Note:** We recommend using the ELECTRA models for the best performance.
## Model Details
### 1. BERT-base-cased Model

**Description:** Fine-tuned the pre-trained `bert-base-cased` model on the SQuAD dataset for question-answering tasks (a minimal training sketch follows this section).

**Approach:**
- Initial test: trained on a subset of 1,000 data points to validate the setup.
- Full training: extended training to the entire dataset after successful initial testing.

**Results:**
- Training metrics:
  - Batch size: 8
  - Epochs: 6
- Observations:
  - Model performance improved with more epochs but plateaued after a certain point.
  - The initial test confirmed the feasibility of using BERT for the task.
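The training code itself is not part of this card, but the following is a minimal sketch of how `bert-base-cased` can be fine-tuned on a 1,000-example SQuAD subset with the Hugging Face `Trainer`, using the batch size and epoch count listed above. The preprocessing follows the standard extractive-QA recipe (mapping answer character spans to start/end token labels via the offset mapping); the output directory and learning rate are illustrative assumptions, not values taken from the project.

```python
from datasets import load_dataset
from transformers import (AutoModelForQuestionAnswering, AutoTokenizer,
                          DefaultDataCollator, Trainer, TrainingArguments)

checkpoint = "google-bert/bert-base-cased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForQuestionAnswering.from_pretrained(checkpoint)

squad = load_dataset("rajpurkar/squad")
# Initial test on 1,000 data points; drop `.select(...)` for the full training run.
train_split = squad["train"].select(range(1_000))

def preprocess(examples):
    """Tokenize question/context pairs and convert each answer's character span
    into start/end token indices via the tokenizer's offset mapping."""
    inputs = tokenizer(
        [q.strip() for q in examples["question"]],
        examples["context"],
        max_length=384,
        truncation="only_second",
        return_offsets_mapping=True,
        padding="max_length",
    )
    offset_mapping = inputs.pop("offset_mapping")
    start_positions, end_positions = [], []
    for i, offsets in enumerate(offset_mapping):
        answer = examples["answers"][i]
        start_char = answer["answer_start"][0]
        end_char = start_char + len(answer["text"][0])
        sequence_ids = inputs.sequence_ids(i)

        # Locate the context portion of the tokenized input.
        idx = 0
        while sequence_ids[idx] != 1:
            idx += 1
        context_start = idx
        while idx < len(sequence_ids) and sequence_ids[idx] == 1:
            idx += 1
        context_end = idx - 1

        if offsets[context_start][0] > end_char or offsets[context_end][1] < start_char:
            # Answer was truncated out of the context window; point labels at [CLS].
            start_positions.append(0)
            end_positions.append(0)
        else:
            idx = context_start
            while idx <= context_end and offsets[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)
            idx = context_end
            while idx >= context_start and offsets[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)

    inputs["start_positions"] = start_positions
    inputs["end_positions"] = end_positions
    return inputs

tokenized_train = train_split.map(preprocess, batched=True,
                                  remove_columns=train_split.column_names)

args = TrainingArguments(
    output_dir="fine_tuned_bert_model",  # illustrative output path
    per_device_train_batch_size=8,       # batch size reported above
    num_train_epochs=6,                  # epochs reported above
    learning_rate=3e-5,                  # assumed; not reported on this card
)
trainer = Trainer(model=model, args=args, train_dataset=tokenized_train,
                  data_collator=DefaultDataCollator())
trainer.train()
```

Dropping the `.select(range(1_000))` call reproduces the full-dataset run described above.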
### 2. DistilBERT-base-uncased Model

**Description:** Utilized `distilbert-base-uncased`, a lighter and faster version of BERT, to reduce computational requirements (see the sketch after this section).

**Approach:**
- Trained on 10,000 data points due to resource constraints.
- Adjusted the input formatting and preprocessing steps.

**Results:**
- Challenges:
  - Encountered low accuracy and overall poor performance.
  - Incompatibility with the Gradio frontend hindered deployment.
- Conclusion:
  - The model did not meet the desired performance metrics.
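For reference, swapping in the lighter checkpoint and the 10,000-example subset mentioned above looks roughly like the sketch below; the preprocessing and `Trainer` setup from the BERT sketch carry over unchanged. This is an illustrative setup, not the project's exact script.

```python
from datasets import load_dataset
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

checkpoint = "distilbert/distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)   # uncased: inputs are lowercased
model = AutoModelForQuestionAnswering.from_pretrained(checkpoint)

# 10,000-example subset, matching the resource-constrained run described above.
train_small = load_dataset("rajpurkar/squad", split="train").select(range(10_000))

# DistilBERT's fast tokenizer returns only input_ids and attention_mask (no
# token_type_ids); the offset-mapping preprocessing and Trainer configuration
# from the BERT sketch above apply unchanged.
```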
### 3. DistilGPT-2 Model

**Description:** Experimented with `distilgpt2` to test a generative approach to question answering (a formatting sketch follows this section).

**Approach:**
- Prepared input data by combining context and questions into single prompts.
- Fine-tuned the model with custom tokenization and data collators.

**Results:**
- Evaluation metrics:
  - Obtained an evaluation loss, but F1 and accuracy could not be computed due to memory issues.
- Challenges:
  - Resource limitations prevented extensive evaluation.
  - The model did not perform satisfactorily on the question-answering task.
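The exact prompt template is not documented on this card; the sketch below shows one common way to combine context, question, and answer into a single causal-language-modelling example for `distilgpt2`, using `DataCollatorForLanguageModeling` to produce next-token labels. The subset size, prompt wording, and training hyperparameters here are illustrative assumptions.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

checkpoint = "distilbert/distilgpt2"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Illustrative 1,000-example subset; the card does not state the exact size used.
squad_train = load_dataset("rajpurkar/squad", split="train[:1000]")

def to_prompt(example):
    # Concatenate context, question, and gold answer into one causal-LM string.
    # The "context / question / answer" template is illustrative, not the
    # project's exact format.
    text = (f"context: {example['context']}\n"
            f"question: {example['question']}\n"
            f"answer: {example['answers']['text'][0]}{tokenizer.eos_token}")
    return tokenizer(text, truncation=True, max_length=512)

tokenized = squad_train.map(to_prompt, remove_columns=squad_train.column_names)

# mlm=False -> plain next-token (causal) language-modelling labels.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="distilgpt2-squad",    # illustrative output path
    per_device_train_batch_size=8,
    num_train_epochs=1,               # illustrative; not reported on this card
)
trainer = Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator)
trainer.train()
```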
### 4. Retro-Reader Model

**Description:** Implemented the Retro-Reader model, which is designed for machine reading comprehension tasks (a conceptual sketch follows this section).

**Approach:**
- Trained both the Sketchy Reading and Intensive Reading components.
- Conducted experiments with datasets of 1,000 and 5,000 data points.

**Results:**
- Performance:
  - Achieved low accuracy in both the Sketchy and Intensive modes.
- Conclusion:
  - The model did not yield better results than the previous models.
  - More research and optimization would be required to make it effective.
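The Retro-Reader training and inference code is not reproduced here; the sketch below is only a conceptual illustration of the two-stage idea, a sketchy reader that makes a coarse answerability judgment followed by an intensive reader that extracts a span, built from generic Hugging Face heads rather than the `jinmang2/retro-reader` implementation. The checkpoint paths, label index, and answerability threshold are placeholders.

```python
import torch
from transformers import (AutoModelForQuestionAnswering,
                          AutoModelForSequenceClassification, AutoTokenizer)

# Placeholder checkpoints: any binary answerability classifier and any extractive
# QA model fine-tuned on SQuAD-style data could be substituted here.
SKETCHY_CKPT = "path/to/sketchy-reader"      # sequence classifier: answerable vs. not
INTENSIVE_CKPT = "path/to/intensive-reader"  # span-extraction QA head

tokenizer = AutoTokenizer.from_pretrained(INTENSIVE_CKPT)
sketchy = AutoModelForSequenceClassification.from_pretrained(SKETCHY_CKPT)
intensive = AutoModelForQuestionAnswering.from_pretrained(INTENSIVE_CKPT)

def retro_answer(question: str, context: str, threshold: float = 0.5) -> str:
    inputs = tokenizer(question, context, return_tensors="pt", truncation=True)

    # Stage 1 (sketchy reading): coarse judgment of whether the question is answerable.
    # Assumes label index 1 means "answerable".
    with torch.no_grad():
        answerable_prob = sketchy(**inputs).logits.softmax(-1)[0, 1].item()
    if answerable_prob < threshold:
        return ""  # treat as unanswerable

    # Stage 2 (intensive reading): extract the most likely answer span.
    with torch.no_grad():
        out = intensive(**inputs)
    start = out.start_logits.argmax(-1).item()
    end = out.end_logits.argmax(-1).item()
    return tokenizer.decode(inputs["input_ids"][0][start:end + 1])
```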
### 5. ELECTRA Model

**Description:** Adopted ELECTRA for its efficient pre-training approach and strong performance on language understanding tasks (a subset-selection sketch follows this section).

**Approach:**
- Trained on varying dataset sizes: 1,000, 5,000, 20,000 data points, and the full dataset.
- Utilized the `google/electra-small-discriminator` model.

**Results:**
- Training metrics:
  - Batch size: 8
  - Epochs: 6
- Observations:
  - Performance improved consistently with larger training sets.
  - ELECTRA outperformed the previous models, becoming the preferred choice for deployment.
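The size sweep described above can be set up with `datasets` slicing; the sketch below shows how the progressively larger SQuAD subsets and the `google/electra-small-discriminator` checkpoint are prepared. The preprocessing and `Trainer` configuration are assumed to mirror the BERT sketch earlier on this card.

```python
from datasets import load_dataset
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

checkpoint = "google/electra-small-discriminator"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

squad_train = load_dataset("rajpurkar/squad", split="train")

# Progressively larger training subsets; None means the full (~87.6k-example) set.
for size in (1_000, 5_000, 20_000, None):
    subset = squad_train if size is None else squad_train.select(range(size))
    # Fresh model per run so every comparison starts from the same pre-trained weights.
    model = AutoModelForQuestionAnswering.from_pretrained(checkpoint)
    print(f"training on {len(subset)} examples")
    # Preprocess `subset` and train with the same offset-mapping preprocessing and
    # Trainer setup shown in the BERT sketch above (batch size 8, 6 epochs).
```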
## Usage

### Installation

To use these models, you need the `transformers` library installed (the examples below also require PyTorch):

```bash
pip install transformers
```
### Loading a Model

You can load any of the models with the `from_pretrained` method. Each model is stored in a subfolder of this repository, so pass the repository ID together with the `subfolder` argument:

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

repo_id = "zainnobody/AAI-520-Final-Project-Models"
subfolder = "fine_tuned_electra_model_all"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subfolder)
model = AutoModelForQuestionAnswering.from_pretrained(repo_id, subfolder=subfolder)
```
### Example Usage

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline

repo_id = "zainnobody/AAI-520-Final-Project-Models"
subfolder = "fine_tuned_electra_model_all"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subfolder)
model = AutoModelForQuestionAnswering.from_pretrained(repo_id, subfolder=subfolder)
qa_pipeline = pipeline("question-answering", model=model, tokenizer=tokenizer)

context = "The Stanford Question Answering Dataset is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles."
question = "What does SQuAD stand for?"

result = qa_pipeline(question=question, context=context)
print(f"Answer: {result['answer']}")
```

Output:

```
Answer: Stanford Question Answering Dataset
```
## Citations
- Rajpurkar, P., Zhang, J., Lopyrev, K., & Liang, P. (2016). SQuAD: 100,000+ Questions for Machine Comprehension of Text. arXiv preprint arXiv:1606.05250.
- Clark, K., Luong, M. T., Le, Q. V., & Manning, C. D. (2020). ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. arXiv preprint arXiv:2003.10555.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
- The models are trained and fine-tuned using resources from Hugging Face.
- OpenAI's ChatGPT and GitHub Copilot were used to create, iterate on, and improve code documentation. All outputs were appropriately edited and improved by the authors in the final versions.
For any questions or issues, please feel free to contact the authors or open an issue on the GitHub repository.