	---
	license: apache-2.0
	base_model: facebook/bart-base
	datasets:
	- squad_v2
	- drop
	- mou3az/IT_QA-QG
	language:
	- en
	library_name: peft
	tags:
	- IT purpose
	- General purpose
	- Text2text Generation
	metrics:
	- bertscore
	- accuracy
	- rouge
	---
	# Model Card


	Base Model: facebook/bart-base

	Fine-tuning: PEFT with LoRA

	Datasets: squad_v2, drop, mou3az/IT_QA-QG

	Task: Generating questions from context and answers

	Language: English


	# Loading the model


	```python
	from peft import PeftModel, PeftConfig
	from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

	HUGGING_FACE_USER_NAME = "mou3az"
	model_name = "IT-General_Question-Generation"
	peft_model_id = f"{HUGGING_FACE_USER_NAME}/{model_name}"

	# Read the adapter config to find the base model it was trained on.
	config = PeftConfig.from_pretrained(peft_model_id)
	model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_8bit=False, device_map='auto')
	QG_tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

	# Attach the LoRA adapter weights to the base model.
	QG_model = PeftModel.from_pretrained(model, peft_model_id)
	```


	# At inference time


	```python
	def get_question(context, answer):
	    # Run on whichever device the model was placed on.
	    device = next(QG_model.parameters()).device
	    input_text = f"Given the context '{context}' and the answer '{answer}', what question can be asked?"
	    encoding = QG_tokenizer(input_text, return_tensors="pt").to(device)

	    # Beam search with a bigram repetition block; return the single best sequence.
	    output_tokens = QG_model.generate(**encoding, early_stopping=True, num_beams=5, num_return_sequences=1, no_repeat_ngram_size=2, max_length=100)
	    out = QG_tokenizer.decode(output_tokens[0], skip_special_tokens=True).replace("question:", "").strip()

	    return out
	```
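The prompt template and the output clean-up inside `get_question` are plain string operations, so they can be checked without loading the model. The sketch below factors them out; the helper names `build_prompt` and `clean_question` are hypothetical and not part of the original card.

```python
# Hypothetical helpers (not from the original card) that isolate the string
# handling used by get_question, so it can be tested without the model.
PROMPT_TEMPLATE = (
    "Given the context '{context}' and the answer '{answer}', "
    "what question can be asked?"
)


def build_prompt(context: str, answer: str) -> str:
    """Fill the same prompt template that get_question uses."""
    return PROMPT_TEMPLATE.format(context=context, answer=answer)


def clean_question(decoded: str) -> str:
    """Remove any 'question:' marker and surrounding whitespace from decoded text."""
    return decoded.replace("question:", "").strip()


prompt = build_prompt("BART is a sequence-to-sequence model.", "sequence-to-sequence")
print(prompt)
print(clean_question("question: What kind of model is BART?"))
```

With the model loaded as shown earlier, calling `get_question(context, answer)` returns one generated question as a string.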


	# Training parameters and hyperparameters


	The following were used during training:

	For LoRA:

	r=18

	alpha=8
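To make the two LoRA values concrete, here is a minimal sketch of the standard LoRA arithmetic. It assumes the default scaling of `alpha / r` (not the rank-stabilized variant). The card does not say which BART modules were adapted, so the 768 x 768 projection is only an illustrative assumption (768 is the hidden size of `facebook/bart-base`).

```python
# Sketch of standard LoRA arithmetic for r=18, alpha=8.
# ASSUMPTION: default scaling alpha / r; the adapted modules are not stated
# in the card, so the 768 x 768 projection below is purely illustrative.


def lora_scaling(r: int, alpha: int) -> float:
    """Factor applied to the low-rank update B @ A before it is added to the frozen weight."""
    return alpha / r


def lora_extra_params(r: int, d_in: int, d_out: int) -> int:
    """Trainable parameters added to one d_out x d_in weight: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


R, ALPHA = 18, 8
print(f"scaling = {lora_scaling(R, ALPHA):.3f}")
print(f"extra params for one 768 x 768 projection = {lora_extra_params(R, 768, 768)}")
```

With `alpha` smaller than `r`, the scaling factor is below 1, so the adapter update is damped relative to the raw product of the two low-rank matrices.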


	For training arguments:

	gradient_accumulation_steps=24

	per_device_train_batch_size=8

	per_device_eval_batch_size=8

	max_steps=1000

	warmup_steps=50

	weight_decay=0.05

	learning_rate=3e-3

	lr_scheduler_type="linear"
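The training arguments above imply an effective batch size and a learning-rate curve that are easy to work out by hand. The sketch below does so in plain Python. It assumes a single device (the card does not state how many were used) and the usual linear schedule: a linear ramp over the warmup steps, then linear decay to zero at `max_steps`.

```python
# Sketch of what the listed training arguments imply.
# ASSUMPTIONS: one device; linear warmup followed by linear decay to zero.

PER_DEVICE_TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 24
MAX_STEPS = 1000
WARMUP_STEPS = 50
PEAK_LR = 3e-3


def effective_batch_size(per_device: int, accumulation: int, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer step."""
    return per_device * accumulation * num_devices


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, MAX_STEPS - step)
    return PEAK_LR * remaining / (MAX_STEPS - WARMUP_STEPS)


print("effective batch size:", effective_batch_size(
    PER_DEVICE_TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
for step in (0, 25, 50, 525, 1000):
    print(f"step {step:>4}: lr = {linear_lr(step):.6f}")
```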


	# Training Results


	| Epoch | Optimization Step | Training Loss | Validation Loss |
	|-------|-------------------|---------------|-----------------|
	| 0.0 | 84 | 4.6426 | 4.704238 |
	| 3.0 | 252 | 1.5094 | 1.202135 |
	| 6.0 | 504 | 1.2677 | 1.146177 |
	| 9.0 | 756 | 1.2613 | 1.112074 |
	| 12.0 | 1000 | 1.1958 | 1.109059 |
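One way to read these losses is as perplexity. The sketch below assumes the reported values are mean per-token cross-entropy in nats, which the card does not state explicitly; under that assumption, perplexity is simply the exponential of the loss.

```python
import math

# Validation losses copied from the training results, keyed by optimization step.
# ASSUMPTION: each value is a mean per-token cross-entropy measured in nats.
validation_loss = {
    84: 4.704238,
    252: 1.202135,
    504: 1.146177,
    756: 1.112074,
    1000: 1.109059,
}


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (in nats) to perplexity."""
    return math.exp(loss)


for step, loss in validation_loss.items():
    print(f"step {step:>4}: loss {loss:.4f} -> perplexity {perplexity(loss):.2f}")
```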


	# Performance Metrics on Evaluation Set


	Training Loss: 1.1958

	Evaluation Loss: 1.109059

	Bertscore: 0.8123

	Rouge: 0.532144

	FuzzyWuzzy similarity: 0.74209
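The card does not say how the string-similarity score was computed, or which FuzzyWuzzy scorer was used. As a rough illustration of this kind of metric, the sketch below averages a character-level similarity ratio over pairs of generated and reference questions. It uses `difflib.SequenceMatcher` from the standard library as a stand-in, so its numbers are not expected to reproduce the figure reported above; the function names are hypothetical.

```python
from difflib import SequenceMatcher

# Stand-in for a fuzzy string-similarity metric, using only the standard library.
# ASSUMPTION: this is NOT the scorer used for the reported number; the card
# does not specify it. Function names here are hypothetical.


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def fuzzy_similarity(prediction: str, reference: str) -> float:
    """Similarity ratio in [0, 1] between a generated and a reference question."""
    return SequenceMatcher(None, normalize(prediction), normalize(reference)).ratio()


def mean_similarity(predictions: list[str], references: list[str]) -> float:
    """Average similarity over aligned prediction/reference pairs."""
    if not predictions or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and the same length")
    scores = [fuzzy_similarity(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores)


predictions = ["What kind of model is BART?", "Which library provides LoRA?"]
references = ["What type of model is BART?", "Which library implements LoRA?"]
print(f"mean similarity = {mean_similarity(predictions, references):.3f}")
```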