---
license: cc-by-nc-4.0
language:
- en
tags:
- bart
- text-summarization
- cnn-dailymail
widget:
- text: |
The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct.
example_title: Generate Summary
metrics:
- rouge
datasets:
- cnn_dailymail
model-index:
- name: BART-Large-CNN-scratch
results:
- task:
type: text-summarization
dataset:
name: CNN/DailyMail
type: cnn_dailymail
metrics:
- name: ROUGE-1
type: rouge
value: 44.07
- name: ROUGE-2
type: rouge
value: 21.06
- name: ROUGE-L
type: rouge
value: 30.65
source:
name: Internal Evaluation
url: https://huggingface.co/facebook/bart-large-cnn
---
# BART-Large-CNN-scratch
BART-Large-CNN-scratch is a retraining of the `facebook/bart-large` architecture: the model was trained from scratch on the CNN/DailyMail dataset with the goal of reproducing the performance of the `facebook/bart-large-cnn` model.
- **Developed by**: phanerozoic
- **Model type**: BartForConditionalGeneration
- **Source model**: `facebook/bart-large`
- **License**: cc-by-nc-4.0
- **Languages**: English
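## How to Use
A minimal usage sketch with the 🤗 Transformers summarization pipeline. The repo id `phanerozoic/BART-Large-CNN-scratch` is assumed from the model name on this card; adjust it to wherever the weights are actually hosted.

```python
from transformers import pipeline

# Repo id assumed from this card's model name; replace if hosted elsewhere.
summarizer = pipeline("summarization", model="phanerozoic/BART-Large-CNN-scratch")

article = (
    "The tower is 324 metres (1,063 ft) tall, about the same height as an "
    "81-storey building, and the tallest structure in Paris. Its base is "
    "square, measuring 125 metres (410 ft) on each side."
)

# max_length/min_length bound the generated summary in tokens.
result = summarizer(article, max_length=128, min_length=30, do_sample=False)
print(result[0]["summary_text"])
```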
## Model Details
BART-Large-CNN-scratch utilizes a transformer-based architecture with a sequence-to-sequence approach, tailored specifically for text summarization tasks. This model builds upon the strengths of the original BART architecture by training from scratch using the CNN/DailyMail dataset.
### Configuration
- **Max input length**: 1024 tokens
- **Max target length**: 128 tokens
- **Learning rate**: 4e-5
- **Batch size**: 32
- **Epochs**: 1
- **Hardware used**: NVIDIA RTX 6000 Ada Lovelace
## Training and Evaluation Data
The model was trained for one epoch on the CNN/DailyMail dataset, a comprehensive collection of news articles paired with human-written summaries. This dataset is widely used as a benchmark for evaluating text summarization models due to its size and the quality of its annotations.
## Training Procedure
Training started from the `facebook/bart-large` architecture and proceeded from scratch on CNN/DailyMail with the following settings:
- **Epochs**: 1
- **Batch size**: 32
- **Learning rate**: 4e-5
- **Training time**: 7 hours
- **Loss**: 0.65
During training, the model was optimized to reduce the loss function, enhancing its ability to generate summaries that are both concise and informative.
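For a rough sense of scale, the settings above imply the following number of optimizer steps per epoch. The train-split size of 287,113 articles is an assumption of this sketch (the standard CNN/DailyMail 3.0.0 split), not a figure stated on this card.

```python
import math

# Back-of-the-envelope optimizer-step count for one epoch.
# Assumption: standard CNN/DailyMail train split of 287,113 articles.
train_examples = 287_113
batch_size = 32
epochs = 1

steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * epochs
print(total_steps)  # → 8973
```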
### Performance
The training process resulted in the following performance metrics:
- **ROUGE-1**: 44.07
- **ROUGE-2**: 21.06
- **ROUGE-L**: 30.65
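ROUGE-1 measures unigram overlap between a candidate summary and a reference. The following is a minimal, dependency-free sketch of the F1 variant; real evaluations use packages such as `rouge_score` or `evaluate`, which add stemming and tokenization rules this sketch omits.

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f1("the tower is 324 metres tall",
                      "the tower is 324 metres tall and square"), 3))  # → 0.857
```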
## Comparing Performance to Base Model
The performance of BART-Large-CNN-scratch is compared against Facebook's fine-tuned `facebook/bart-large-cnn` model:
| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
|--------------------------------|---------|---------|---------|
| Facebook BART-large-cnn | 42.949 | 20.815 | 30.619 |
| BART-Large-CNN-scratch | 44.070 | 21.060 | 30.650 |
### Analysis of Summaries
#### Eiffel Tower Article Summary Comparison
##### Facebook BART-Large-CNN Summary:
"The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world."
##### BART-Large-CNN-scratch Summary:
"The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building. Its base is square, measuring 125 metres (410 ft) on each side. It is the second tallest free-standing structure in France after the Millau Viaduct."
- **Comparison**:
- Both summaries start with identical descriptions of the Eiffel Tower's height and base dimensions.
- The Facebook summary mentions the historical significance of the Eiffel Tower surpassing the Washington Monument.
- The scratch summary includes the detail of the tower being the second tallest free-standing structure in France, providing a different historical context.
  - The scratch summary omits the name of the tower, showing that the replication of Facebook's behavior, while close, is not perfect.
#### Paper Clip Article Summary Comparison
##### Facebook BART-Large-CNN Summary:
"The earliest form of the paper clip dates back to the 13th century. The most widely recognized design is attributed to the Norwegian inventor Johan Vaaler. The design of paper clips has continued to evolve, with various shapes and sizes available on the market. During World War II, paper clips became a symbol of resistance in Norway."
##### BART-Large-CNN-scratch Summary:
"The paper clip dates back to the 13th century, when a device made of a bent metal wire was used to hold sheets of paper together. The most widely recognized design is attributed to the Norwegian inventor Johan Vaaler, who received a patent for his paper clip design in 1899. During World War II, the paper clip became a symbol of resistance in Norway."
- **Comparison**:
- Both summaries start with descriptions of the origins of the paper clip and Johan Vaaler's contributions.
- The Facebook summary briefly mentions the evolution of paper clip designs and their availability in various shapes and sizes.
- The scratch summary includes additional historical details about the use of bent metal wires in the 13th century and Vaaler's patent, providing a richer historical context.
### Implications
1. **Reproducibility**:
- The BART-Large-CNN-scratch model closely reproduces the performance of the Facebook BART-large-cnn model, capturing key historical points and providing concise summaries. However, it shows some differences in detail prioritization, indicating that while the reproduction is effective, it is not exact.
2. **Model Training from Scratch**:
- Training from scratch has proven to be effective, with the BART-Large-CNN-scratch model achieving competitive ROUGE scores. However, the summaries differ in detail compared to the Facebook model, suggesting areas for further fine-tuning.
3. **Practical Applications**:
- Both models are effective for summarizing historical and technical articles. The BART-Large-CNN-scratch model is excellent for concise overviews, while the Facebook model provides more comprehensive context.
### Conclusion
The BART-Large-CNN-scratch model demonstrates strong performance, capturing essential historical points and providing concise summaries. While it does not exactly reproduce the Facebook model's summaries, it achieves similar quality and even exceeds it on ROUGE scores. This makes it a robust tool for text summarization applications.
## Acknowledgments
Special thanks to the developers of the BART architecture and the Hugging Face team. Their tools and frameworks were instrumental in developing and training this model. The NVIDIA RTX 6000 Ada Lovelace GPU provided the computational power needed to achieve these results.