Add table of contents and Finetuning section
README.md CHANGED
@@ -20,6 +20,21 @@ DEtection TRansformer (DETR) model trained end-to-end on COCO 2017 object detect

Disclaimer: The team releasing DETR did not write a model card for this model so this model card has been written by the Hugging Face team.

+
+## Table of Contents
+
+- [Model description](#model-description)
+- [Intended uses & limitations](#intended-uses--limitations)
+- [How to use](#how-to-use)
+- [Training data](#training-data)
+- [Training procedure](#training-procedure)
+- [Preprocessing](#preprocessing)
+- [Training](#training)
+- [Evaluation results](#evaluation-results)
+- [Finetuning](#finetuning)
+- [BibTeX entry and citation info](#bibtex-entry-and-citation-info)
+
## Model description

The DETR model is an encoder-decoder transformer with a convolutional backbone. Two heads are added on top of the decoder outputs in order to perform object detection: a linear layer for the class labels and an MLP (multi-layer perceptron) for the bounding boxes. The model uses so-called object queries to detect objects in an image. Each object query looks for a particular object in the image. For COCO, the number of object queries is set to 100.
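The fixed set of object queries and the two heads show up directly in the model's output shapes. A minimal inference sketch, assuming this card describes the `facebook/detr-resnet-50` checkpoint on the Hub (the test image URL is the standard COCO sample used in the transformers docs):

```python
import torch
import requests
from PIL import Image
from transformers import DetrImageProcessor, DetrForObjectDetection

# Assumed checkpoint; any DETR object-detection checkpoint behaves the same way.
processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One prediction per object query (100 for COCO):
print(outputs.logits.shape)      # [1, 100, 92] - linear class head (91 COCO classes + "no object")
print(outputs.pred_boxes.shape)  # [1, 100, 4]  - MLP box head (normalized center-x, center-y, w, h)
```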
@@ -94,8 +109,20 @@ The model was trained for 300 epochs on 16 V100 GPUs. This takes 3 days, with 4
## Evaluation results

This model achieves an AP (average precision) of **42.0** on COCO 2017 validation. For more details regarding evaluation results, we refer to table 1 of the original paper.
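AP is computed over thresholded, pixel-coordinate detections rather than the raw query outputs. A sketch of that post-processing step, reusing `processor`, `model`, `image`, and `outputs` from the snippet above (a full COCO AP reproduction would additionally require the COCO evaluation tooling over all of val2017):

```python
# Rescale normalized boxes to pixel coordinates and drop low-confidence queries.
target_sizes = torch.tensor([image.size[::-1]])  # PIL size is (w, h); DETR expects (h, w)
results = processor.post_process_object_detection(
    outputs, threshold=0.9, target_sizes=target_sizes
)[0]

for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(f"{model.config.id2label[label.item()]}: {score:.2f} at {[round(c, 1) for c in box.tolist()]}")
```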
+
+## Finetuning
+
+A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with DETR.
+
+- All example notebooks illustrating fine-tuning DetrForObjectDetection and DetrForSegmentation on a custom dataset can be found [here](https://github.com/NielsRogge/Transformers-Tutorials/tree/master/DETR).
+- Scripts for finetuning DetrForObjectDetection with Trainer or Accelerate can be found [here](https://github.com/huggingface/transformers/tree/main/examples/pytorch/object-detection).
+- See also: [Object detection task guide](https://huggingface.co/docs/transformers/main/en/tasks/object_detection).
+
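A pattern shared by the notebooks and scripts listed above is to reload the checkpoint with a new classification head sized for the custom label set; a minimal sketch (the two-class label map is a made-up example):

```python
from transformers import DetrForObjectDetection

# Hypothetical label set - replace with your dataset's categories.
id2label = {0: "cat", 1: "dog"}

model = DetrForObjectDetection.from_pretrained(
    "facebook/detr-resnet-50",
    num_labels=len(id2label),
    id2label=id2label,
    label2id={v: k for k, v in id2label.items()},
    ignore_mismatched_sizes=True,  # re-initialize the class head for the new label count
)
```

From there, training proceeds as in the linked Trainer/Accelerate examples.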
### BibTeX entry and citation info

+
```bibtex
@article{DBLP:journals/corr/abs-2005-12872,
  author = {Nicolas Carion and
|