Detection Transformers with Assignment

By Jeffrey Ouyang-Zhang, Jang Hyun Cho, Xingyi Zhou, Philipp Krähenbühl

This repository is an official implementation of the paper NMS Strikes Back.

TL;DR. Detection Transformers with Assignment (DETA) re-introduces IoU-based assignment and NMS for transformer-based detectors. DETA trains and tests about as fast as Deformable-DETR and converges much faster (50.2 mAP in 12 epochs on COCO).

Figure: DETR's one-to-one bipartite matching vs. our many-to-one IoU-based assignment.
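The following is a minimal sketch of the core idea, not the repository's actual implementation: instead of a one-to-one bipartite matching, each proposal is assigned to the ground-truth box it overlaps most (so several proposals can share one ground truth), and duplicates are removed at inference with NMS. The function name `iou_assign` and the threshold value are illustrative assumptions.

```python
import torch
from torchvision.ops import box_iou, nms

def iou_assign(proposals, gt_boxes, pos_thresh=0.6):
    """Many-to-one assignment sketch: each proposal is matched to its
    highest-IoU ground-truth box and kept as a positive if that IoU
    exceeds the threshold; all other proposals get label -1 (negative)."""
    iou = box_iou(proposals, gt_boxes)      # [num_proposals, num_gt]
    best_iou, best_gt = iou.max(dim=1)      # best ground truth per proposal
    labels = torch.where(best_iou >= pos_thresh,
                         best_gt,
                         torch.full_like(best_gt, -1))
    return labels

# At inference, duplicate predictions of the same object are suppressed with NMS.
boxes  = torch.tensor([[0., 0., 10., 10.], [1., 1., 11., 11.], [20., 20., 30., 30.]])
scores = torch.tensor([0.9, 0.8, 0.7])
keep = nms(boxes, scores, iou_threshold=0.7)  # indices of boxes kept after NMS
```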

Main Results

| Method | Epochs | COCO val AP | Total train time (8 GPU hours) | Batch infer speed (FPS) | URL |
|---|---|---|---|---|---|
| Two-stage Deformable DETR | 50 | 46.9 | 42.5 | - | see DeformDETR |
| Improved Deformable DETR | 50 | 49.6 | 66.6 | 13.4 | config / log / model |
| DETA | 12 | 50.1 | 16.3 | 12.7 | config / log / model |
| DETA | 24 | 51.1 | 32.5 | 12.7 | config / log / model |
| DETA (Swin-L) | 24 | 62.9 | 100 | 4.2 | config-O365 / model-O365 / config / model |

Note:

  1. Unless otherwise specified, models use a ResNet-50 backbone, and ResNet-50 training is done on 8 Nvidia Quadro RTX 6000 GPUs.
  2. Inference speed is measured on an Nvidia Tesla V100 GPU.
  3. "Batch infer speed" refers to inference with batch size 4 to maximize GPU utilization (see the sketch after this list).
  4. Improved Deformable DETR implements two-stage Deformable DETR with improved hyperparameters (e.g., more queries and more feature levels; see the full list here).
  5. DETA with the Swin-L backbone is pre-trained on Object-365 and fine-tuned on COCO. This model attains 63.5 AP on COCO test-dev. Times refer to fine-tuning (Object-365 pre-training takes 14,000 GPU hours). We additionally provide the Object-365 pre-trained config and model prior to fine-tuning.
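As a rough sketch of how the batch-inference numbers in note 3 could be reproduced, assuming "batch infer speed" means images per second at batch size 4 on a single GPU; `model`, the input resolution, and iteration counts below are placeholders, not the repository's benchmarking script.

```python
import time
import torch

@torch.no_grad()
def measure_fps(model, batch_size=4, iters=50, size=(3, 800, 1333), device="cuda"):
    """Return approximate images/second for batched inference."""
    model.eval().to(device)
    images = torch.randn(batch_size, *size, device=device)
    for _ in range(5):          # warm-up iterations before timing
        model(images)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        model(images)
    torch.cuda.synchronize()    # wait for all GPU work before stopping the clock
    return batch_size * iters / (time.time() - start)
```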

Installation

Please follow the instructions from Deformable-DETR for installation, data preparation, and additional usage examples. Tested with torch 1.6.0 + CUDA 9.2, torch 1.8.0 + CUDA 10.1, and torch 1.11.0 + CUDA 11.3.

Usage

Evaluation

You can evaluate our pretrained DETA models from the table above on the COCO 2017 validation set:

./configs/deta.sh --eval --coco_path ./data/coco --resume <path_to_model>

You can also run distributed evaluation:

GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/deta.sh \
 --eval --coco_path ./data/coco --resume <path_to_model>

You can also run distributed evaluation on our Swin-L model:

GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/deta_swin_ft.sh \
 --eval --coco_path ./data/coco --resume <path_to_model>

Training

Training on single node

Training DETA on 8 GPUs:

GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/deta.sh --coco_path ./data/coco

Training on slurm cluster

If you are using a Slurm cluster, you can simply run the following command to train on 1 node with 8 GPUs:

GPUS_PER_NODE=8 ./tools/run_dist_slurm.sh <partition> deta 8 configs/deta.sh \
 --coco_path ./data/coco

Fine-tune DETA with the Swin-L backbone on 2 nodes, each with 8 GPUs:

GPUS_PER_NODE=8 ./tools/run_dist_slurm.sh <partition> deta 16 configs/deta_swin_ft.sh \
 --coco_path ./data/coco --finetune <path_to_o365_model>

License

This project builds heavily on Deformable-DETR and Detectron2; please refer to their original licenses for details. If you are using the Swin-L backbone, please also see the original Swin license.

Citing DETA

If you find DETA useful in your research, please consider citing:

@article{ouyangzhang2022nms,
 title={NMS Strikes Back},
 author={Ouyang-Zhang, Jeffrey and Cho, Jang Hyun and Zhou, Xingyi and Kr{\"a}henb{\"u}hl, Philipp},
 journal={arXiv preprint arXiv:2212.06137},
 year={2022}
}
