raccord/scenAIrio-classification
Model Description
The scenAIrio-classification model classifies segments of a movie script (scenario) into one of three categories: NOTES, DIALOGUE, or SEQUENCE. It uses a BERT transformer architecture to classify each segment from its textual context.
Intended Use
This model is intended for use in applications involving the processing and analysis of movie scripts or scenarios. It can help scriptwriters, editors, and directors to automatically categorize script segments, facilitating easier script breakdowns and edits.
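As an illustration of the workflow described above, here is a minimal sketch of categorizing script segments. It assumes the checkpoint is published under the repository name in this card's title and loads with the Hugging Face `transformers` text-classification pipeline. The helper names (`load_classifier`, `split_segments`, `group_by_label`) and the one-segment-per-line rule are hypothetical, and the label strings returned by the checkpoint may differ from NOTES, DIALOGUE, and SEQUENCE depending on its configuration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Assumption: repository name taken from the title of this card.
MODEL_ID = "raccord/scenAIrio-classification"


def load_classifier():
    """Build a text-classification pipeline (needs `transformers` and a backend such as PyTorch)."""
    from transformers import pipeline  # imported lazily so the helpers below work without it

    return pipeline("text-classification", model=MODEL_ID)


def split_segments(script: str) -> List[str]:
    """Assumption: one segment per non-empty line; adapt this to your script format."""
    return [line.strip() for line in script.splitlines() if line.strip()]


def group_by_label(segments: List[str], classifier: Callable) -> Dict[str, List[str]]:
    """Classify every segment and bucket its text under the predicted label."""
    grouped = defaultdict(list)
    for segment, prediction in zip(segments, classifier(segments)):
        grouped[prediction["label"]].append(segment)
    return dict(grouped)
```

A typical call would be `group_by_label(split_segments(text), load_classifier())`, where `text` holds the contents of a French-language script. Any callable that returns one `{"label": ..., "score": ...}` dictionary per input segment can stand in for the pipeline.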
Training Data
The model was trained on a dataset consisting of annotated movie scripts. Each part of the script was labeled as NOTES, DIALOGUE, or SEQUENCE.
Training Procedure
The model was trained using the following training arguments:
- Output Directory: ./scenAIrio-modal
- Training: Enabled
- Evaluation: Enabled
- Epochs: 3
- Training Batch Size per Device: 16
- Evaluation Batch Size per Device: 32
- Warmup Steps: 100
- Weight Decay: 0.01
- Logging: Every 50 steps to ./multi-class-logs
- Evaluation Strategy: Every 50 steps
- Save Strategy: Save checkpoints every 50 steps
- Best Model Loading: At the end of training, the best performing model is loaded
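The settings listed above could be expressed roughly as follows. This is a sketch that assumes training used the Hugging Face `transformers` `TrainingArguments` class, which the card does not state explicitly. The keyword names are my mapping of the bullet points, and some names differ between library versions.

```python
# Assumption: keyword names follow Hugging Face `transformers.TrainingArguments`.
TRAINING_KWARGS = {
    "output_dir": "./scenAIrio-modal",
    "do_train": True,
    "do_eval": True,
    "num_train_epochs": 3,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 32,
    "warmup_steps": 100,
    "weight_decay": 0.01,
    "logging_dir": "./multi-class-logs",
    "logging_strategy": "steps",
    "logging_steps": 50,
    "eval_strategy": "steps",  # named `evaluation_strategy` in older transformers releases
    "eval_steps": 50,
    "save_strategy": "steps",
    "save_steps": 50,
    "load_best_model_at_end": True,
}


def build_training_arguments():
    """Instantiate TrainingArguments from the settings above (needs `transformers` installed)."""
    from transformers import TrainingArguments  # imported lazily; the dict is usable without it

    return TrainingArguments(**TRAINING_KWARGS)
```

Loading the best model at the end of training requires the evaluation and save strategies to match, which is consistent with both being set to every 50 steps here.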
Model Architecture
The model is based on a BERT transformer, specifically adapted for multi-class classification tasks.
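To make "multi-class classification" concrete, the sketch below shows how a three-way classification head's raw scores (logits) are typically turned into a label and a probability with a softmax. The index-to-label order in `ID2LABEL` is hypothetical; the real mapping lives in the checkpoint's configuration.

```python
import math
from typing import List, Tuple

# Hypothetical index order; check the checkpoint's config for the real mapping.
ID2LABEL = {0: "NOTES", 1: "DIALOGUE", 2: "SEQUENCE"}


def decode_logits(logits: List[float]) -> Tuple[str, float]:
    """Apply a numerically stable softmax and return the top label with its probability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max to avoid overflow
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]
```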
Evaluation Results
Phase | Loss | Accuracy | F1-Score | Precision | Recall |
---|---|---|---|---|---|
Val | 0.21253 | 93.73% | 95.37% | 95.53% | 95.24% |
Train | 0.08378 | 97.94% | 98.47% | 98.56% | 98.39% |
Test | 0.26723 | 91.59% | 93.49% | 93.17% | 93.84% |
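For readers who want to reproduce metrics of this kind on their own annotated scripts, here is a minimal sketch of accuracy, precision, recall, and F1 for the three classes. The card does not state which averaging scheme was used, so the macro-averaging below (an unweighted mean over classes) is an assumption, and `macro_metrics` is a hypothetical helper name.

```python
from typing import Dict, List

LABELS = ["NOTES", "DIALOGUE", "SEQUENCE"]


def macro_metrics(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Compute accuracy plus macro-averaged precision, recall, and F1 over LABELS."""
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in LABELS:
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(LABELS)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

Other averaging schemes (micro or support-weighted) give different numbers on imbalanced data, so the scheme should be fixed before comparing against the table.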
Limitations
- The model is specifically trained on French-language scripts and may not perform well with scripts in other languages.
- Performance can vary significantly depending on the specific characteristics and formatting of the input scripts.
Conclusion
The scenAIrio-classification model categorizes parts of movie scripts as NOTES, DIALOGUE, or SEQUENCE. On the test set it reaches 91.59% accuracy and a 93.49% F1-score, which makes it a practical aid for script breakdowns in film and television work, within the limitations noted above.