Add table of contents and Finetuning section
README.md CHANGED
@@ -20,6 +20,21 @@ DEtection TRansformer (DETR) model trained end-to-end on COCO 2017 object detect

Disclaimer: The team releasing DETR did not write a model card for this model so this model card has been written by the Hugging Face team.

+
+## Table of Contents
+
+- [Model description](#model-description)
+- [Intended uses & limitations](#intended-uses--limitations)
+- [How to use](#how-to-use)
+- [Training data](#training-data)
+- [Training procedure](#training-procedure)
+- [Preprocessing](#preprocessing)
+- [Training](#training)
+- [Evaluation results](#evaluation-results)
+- [Finetuning](#finetuning)
+- [BibTeX entry and citation info](#bibtex-entry-and-citation-info)
+
## Model description

The DETR model is an encoder-decoder transformer with a convolutional backbone. Two heads are added on top of the decoder outputs in order to perform object detection: a linear layer for the class labels and an MLP (multi-layer perceptron) for the bounding boxes. The model uses so-called object queries to detect objects in an image. Each object query looks for a particular object in the image. For COCO, the number of object queries is set to 100.
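The fixed set of object queries and the two heads show up directly in the model's output shapes. A minimal inference sketch, assuming this card describes the `facebook/detr-resnet-50` checkpoint on the Hub (the test image URL is the standard COCO sample used in the transformers docs):

```python
import torch
import requests
from PIL import Image
from transformers import DetrImageProcessor, DetrForObjectDetection

# Assumed checkpoint; any DETR object-detection checkpoint behaves the same way.
processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One prediction per object query (100 for COCO):
print(outputs.logits.shape)      # [1, 100, 92] - linear class head (91 COCO classes + "no object")
print(outputs.pred_boxes.shape)  # [1, 100, 4]  - MLP box head (normalized center-x, center-y, w, h)
```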
@@ -94,8 +109,20 @@ The model was trained for 300 epochs on 16 V100 GPUs. This takes 3 days, with 4
## Evaluation results

This model achieves an AP (average precision) of **42.0** on COCO 2017 validation. For more details regarding evaluation results, we refer to table 1 of the original paper.
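AP is computed over thresholded, pixel-coordinate detections rather than the raw query outputs. A sketch of that post-processing step, reusing `processor`, `model`, `image`, and `outputs` from the snippet above (a full COCO AP reproduction would additionally require the COCO evaluation tooling over all of val2017):

```python
# Rescale normalized boxes to pixel coordinates and drop low-confidence queries.
target_sizes = torch.tensor([image.size[::-1]])  # PIL size is (w, h); DETR expects (h, w)
results = processor.post_process_object_detection(
    outputs, threshold=0.9, target_sizes=target_sizes
)[0]

for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(f"{model.config.id2label[label.item()]}: {score:.2f} at {[round(c, 1) for c in box.tolist()]}")
```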
+
+## Finetuning
+
+A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with DETR.
+
+- All example notebooks illustrating fine-tuning DetrForObjectDetection and DetrForSegmentation on a custom dataset can be found [here](https://github.com/NielsRogge/Transformers-Tutorials/tree/master/DETR).
+- Scripts for finetuning DetrForObjectDetection with Trainer or Accelerate can be found [here](https://github.com/huggingface/transformers/tree/main/examples/pytorch/object-detection).
+- See also: [Object detection task guide](https://huggingface.co/docs/transformers/main/en/tasks/object_detection).
+
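A pattern shared by the notebooks and scripts listed above is to reload the checkpoint with a new classification head sized for the custom label set; a minimal sketch (the two-class label map is a made-up example):

```python
from transformers import DetrForObjectDetection

# Hypothetical label set - replace with your dataset's categories.
id2label = {0: "cat", 1: "dog"}

model = DetrForObjectDetection.from_pretrained(
    "facebook/detr-resnet-50",
    num_labels=len(id2label),
    id2label=id2label,
    label2id={v: k for k, v in id2label.items()},
    ignore_mismatched_sizes=True,  # re-initialize the class head for the new label count
)
```

From there, training proceeds as in the linked Trainer/Accelerate examples.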
### BibTeX entry and citation info

+
```bibtex
@article{DBLP:journals/corr/abs-2005-12872,
  author = {Nicolas Carion and
|