---
language:
- en
tags:
- generated_from_trainer
- question-answering
- text-generation
model-index:
- name: LaMini-Flan-T5-77M-qa-generation
  results: []
---

# LaMini-Flan-T5-77M-qa-generation

## Model Description

This model is a fine-tuned version of [MBZUAI/LaMini-Flan-T5-77M](https://huggingface.co/MBZUAI/LaMini-Flan-T5-77M) trained to generate question-answer pairs from raw text. It is based on the FLAN-T5 architecture and has been optimized for question-answer generation tasks.

## Key Features

- **Base Model**: MBZUAI/LaMini-Flan-T5-77M
- **Task**: Question and answer pair generation
- **Training Data**: [agentlans/finewebedu-sft](https://huggingface.co/datasets/agentlans/finewebedu-sft)
- **Added Tokens**: `[QUESTION_END]`, `[ANSWER_END]`
- **Evaluation Loss**: 1.3572

## Usage

To use this model for generating question-answer pairs:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "agentlans/LaMini-Flan-T5-77M-qa-generation"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

input_text = "Your input text here..."
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=512)
decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
```

## Output Processing

The model generates output in the following format:

```
Question[QUESTION_END]Answer[ANSWER_END]Question[QUESTION_END]Answer[ANSWER_END]...
```

To parse this output into a structured format:

```python
import re

def clean_text(text):
    # Collapse runs of whitespace and strip leading/trailing spaces.
    return re.sub(r'\s+', ' ', text).strip()

def parse_qa_pairs(input_text):
    # Split on [ANSWER_END]; the capture group keeps the delimiters,
    # so the question-answer blocks sit at the even indices.
    qa_blocks = re.split(r'(\[ANSWER_END\])', input_text)
    pairs = []
    for i in range(0, len(qa_blocks) - 1, 2):
        qa_block = qa_blocks[i]
        parts = qa_block.split('[QUESTION_END]')
        if len(parts) == 2:
            question, answer = map(clean_text, parts)
            if question and answer:
                pairs.append({"question": question, "answer": answer})
    return pairs

qa_pairs = parse_qa_pairs(decoded_output)
```

## Example

Input:

```
The ocean, covering over 70% of our planet's surface, is a vast and mysterious realm teeming with life and beauty. From the vibrant coral reefs that serve as bustling underwater cities to the deep, dark trenches that house some of the most bizarre creatures on Earth, the ocean is a treasure trove of biodiversity. It plays a crucial role in regulating the global climate, absorbing carbon dioxide and producing oxygen through its phytoplankton. Moreover, the ocean's depths remain largely unexplored, holding countless secrets and potential discoveries that could revolutionize our understanding of biology, medicine, and environmental science. As we continue to learn more about this incredible ecosystem, it becomes increasingly clear that protecting our oceans is essential for the health of our planet and future generations.
```

Output:

```python
[
    {
        "question": "What is the ocean's role in regulating the global climate?",
        "answer": "The ocean plays a crucial role in regulating the global climate by absorbing carbon dioxide and producing oxygen through its phytoplankton."
    },
    {
        "question": "What are some of the key discoveries that could revolutionize our understanding of the ocean?",
        "answer": "The ocean's depths remain largely unexplored, holding secrets and potential discoveries that could revolutionize our understanding of biology, medicine, and environmental science."
    },
    {
        "question": "What is the significance of protecting our oceans for future generations?",
        "answer": "Protecting our oceans is essential for the health of our planet and future generations because it is a vital part of our ecosystem and a vital resource for our survival and well-being."
    }
]
```
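Putting the pieces together, a minimal end-to-end sketch might look like the following. It assumes the `model`, `tokenizer`, and `parse_qa_pairs` objects defined in the sections above; the `generate_qa_pairs` helper name is illustrative, not part of the model's API:

```python
def generate_qa_pairs(text, max_length=512):
    """Hypothetical convenience wrapper around the snippets above:
    tokenize, generate, decode, then parse the marker-delimited output."""
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_length=max_length)
    decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return parse_qa_pairs(decoded)

for pair in generate_qa_pairs("The ocean, covering over 70% of our planet's surface, ..."):
    print(pair["question"])
    print(pair["answer"])
```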
## Training Procedure

### Training Hyperparameters

The following hyperparameters were used during training:

- Learning rate: 0.0003
- Train batch size: 16
- Eval batch size: 16
- Seed: 42
- Gradient accumulation steps: 2
- Total train batch size: 32
- Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- LR scheduler type: linear
- LR scheduler warmup steps: 500
- Number of epochs: 10.0

A sketch of how these values map onto a `Seq2SeqTrainingArguments` configuration appears at the end of this card.

### Training Results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.6321        | 1.2361 | 500  | 1.4333          |
| 1.5305        | 2.4722 | 1000 | 1.4013          |
| 1.4754        | 3.7083 | 1500 | 1.3719          |
| 1.4425        | 4.9444 | 2000 | 1.3693          |
| 1.3781        | 6.1805 | 2500 | 1.3647          |
| 1.3687        | 7.4166 | 3000 | 1.3572          |
| 1.3413        | 8.6527 | 3500 | 1.3596          |
| 1.3539        | 9.8888 | 4000 | 1.3594          |

## Limitations

- The model's performance may vary depending on the complexity and domain of the input text.
- The quality of generated questions and answers can be inconsistent across different topics.
- The model may occasionally generate irrelevant or repetitive question-answer pairs.

## Framework Versions

- Transformers 4.44.0
- Pytorch 2.2.2+cu121
- Datasets 2.18.0
- Tokenizers 0.19.1
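For reference, the hyperparameters listed under Training Procedure roughly correspond to the following `Seq2SeqTrainingArguments` configuration. This is a reconstruction from the reported values, not the original training script; `output_dir` is a placeholder and the evaluation settings are assumptions (the 500-step cadence is inferred from the results table):

```python
from transformers import Seq2SeqTrainingArguments

# Reconstructed from the values reported above; not the original script.
training_args = Seq2SeqTrainingArguments(
    output_dir="LaMini-Flan-T5-77M-qa-generation",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,  # effective train batch size of 32
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=10.0,
    eval_strategy="steps",  # assumed: validation loss is reported every 500 steps
    eval_steps=500,
)
```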