RTCDet:

Effectiveness of pretrained weights

  • IN1K: The backbone (RTCNet) is pretrained on the ImageNet-1K dataset with the classification task.
  • Scratch: The detector is trained on COCO without any pretrained weights for the backbone.

For the small model:

| Model    | Pretrained | Scale | Epoch | APval 0.5:0.95 | APval 0.5 |
|----------|------------|-------|-------|----------------|-----------|
| RTCDet-N | -          | 640   | 500   | 37.0           | 52.9      |
| RTCDet-N | IN1K       | 640   | 500   |                |           |
| RTCDet-L | -          | 640   | 500   | 50.2           | 68.0      |
| RTCDet-L | IN1K       | 640   | 500   |                |           |

Results on COCO-val

  • For the backbone, we use the ImageNet-1K pretrained weights.
  • For training, we train the RTCDet series for 300 epochs on COCO.
  • For data augmentation, we use large scale jitter (LSJ), Mosaic augmentation, and Mixup augmentation, following YOLOv8.
  • For the optimizer, we use AdamW with a weight decay of 0.05 and a base per-image learning rate of 0.001 / 64.
  • For the learning rate scheduler, we use a linear decay scheduler.
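
The optimizer and scheduler settings above can be sketched in a few lines of Python. This is only an illustration, not the code used by `train.py`: it assumes the per-image learning rate is multiplied by the total batch size, and the `final_ratio` parameter (the fraction of the initial rate reached at the last epoch) is a hypothetical value, since the README does not state where the linear decay ends.

```python
# Base per-image learning rate, taken from the settings listed above.
BASE_LR_PER_IMAGE = 0.001 / 64


def scaled_lr(batch_size: int) -> float:
    """Scale the per-image learning rate linearly with the total batch size.

    Assumption: lr = per-image lr * batch size (not confirmed by the README).
    """
    return BASE_LR_PER_IMAGE * batch_size


def linear_decay_factor(epoch: int, max_epoch: int, final_ratio: float = 0.01) -> float:
    """Multiplier that falls linearly from 1.0 at epoch 0 to final_ratio at max_epoch.

    final_ratio is a hypothetical value chosen for illustration.
    """
    progress = min(max(epoch / max_epoch, 0.0), 1.0)
    return 1.0 - progress * (1.0 - final_ratio)


if __name__ == "__main__":
    lr = scaled_lr(128)  # total batch size used in the multi-GPU command
    for epoch in (0, 150, 300):
        print(epoch, lr * linear_decay_factor(epoch, 300))
```

Under these assumptions, a total batch size of 64 gives an initial learning rate of 0.001, and a batch size of 128 gives 0.002.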

Train RTCDet

Single GPU

For example, to train RTCDet-S on COCO:

python train.py --cuda -d coco --root path/to/coco -m rtcdet_s -bs 16 -size 640 --wp_epoch 3 --max_epoch 300 --eval_epoch 10 --no_aug_epoch 20 --ema --fp16 --multi_scale 

Multi GPU

For example, to train RTCDet-S on COCO with 8 GPUs:

python -m torch.distributed.run --nproc_per_node=8 train.py --cuda -dist -d coco --root /data/datasets/ -m rtcdet_s -bs 128 -size 640 --wp_epoch 3 --max_epoch 300  --eval_epoch 10 --no_aug_epoch 20 --ema --fp16 --sybn --multi_scale --save_folder weights/ 
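
The command above launches 8 processes with `-bs 128`. Assuming `-bs` is the total batch size across all GPUs (an assumption, suggested by the "8xb16" batch notation in the results table but not stated in this README), the per-GPU batch size can be checked with a small helper. The function name `per_gpu_batch_size` is hypothetical and not part of the repository.

```python
def per_gpu_batch_size(total_batch_size: int, num_gpus: int) -> int:
    """Split a total batch size evenly across GPUs.

    Assumption: -bs is the total batch size, so each process gets an equal share.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if total_batch_size % num_gpus != 0:
        raise ValueError("total batch size must be divisible by the number of GPUs")
    return total_batch_size // num_gpus


if __name__ == "__main__":
    # 8 GPUs with a total batch size of 128, as in the command above.
    print(per_gpu_batch_size(128, 8))
```

With 8 GPUs and a total batch size of 128, each GPU would process 16 images per step, which matches the "8xb16" notation.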

Test RTCDet

For example, to test RTCDet-S on COCO-val:

python test.py --cuda -d coco --root path/to/coco -m rtcdet_s --weight path/to/RTCDet_s.pth -size 640 -vt 0.4 --show 
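
The `-vt 0.4` option in the command above is assumed here to be a visualization score threshold: detections scoring below it are not drawn. The README does not define the flag, so the sketch below only illustrates that kind of filtering; the `Detection` layout and the `filter_by_score` helper are hypothetical and are not the repository's data structures.

```python
from typing import List, Tuple

# Hypothetical detection layout: ((x1, y1, x2, y2), score, class_id).
Detection = Tuple[Tuple[float, float, float, float], float, int]


def filter_by_score(detections: List[Detection], threshold: float = 0.4) -> List[Detection]:
    """Keep only detections whose confidence score is at least the threshold."""
    return [det for det in detections if det[1] >= threshold]


if __name__ == "__main__":
    sample = [
        ((0.0, 0.0, 10.0, 10.0), 0.90, 0),
        ((5.0, 5.0, 20.0, 20.0), 0.40, 1),
        ((1.0, 1.0, 2.0, 2.0), 0.39, 2),
    ]
    for box, score, class_id in filter_by_score(sample):
        print(class_id, score, box)
```

A higher threshold shows fewer, more confident boxes; a lower one shows more boxes, including likely false positives.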

Evaluate RTCDet

For example, to evaluate RTCDet-S on COCO-val:

python eval.py --cuda -d coco-val --root path/to/coco -m rtcdet_s --weight path/to/RTCDet_s.pth 

Demo

Detect with Image

python demo.py --mode image --path_to_img path/to/image_dirs/ --cuda -m rtcdet_s --weight path/to/weight -size 640 -vt 0.4 --show

Detect with Video

python demo.py --mode video --path_to_vid path/to/video --cuda -m rtcdet_s --weight path/to/weight -size 640 -vt 0.4 --show --gif

Detect with Camera

python demo.py --mode camera --cuda -m rtcdet_s --weight path/to/weight -size 640 -vt 0.4 --show --gif
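
| Model    | Batch | Scale | APval 0.5:0.95 | APval 0.5 | FLOPs (G) | Params (M) | Weight |
|----------|-------|-------|----------------|-----------|-----------|------------|--------|
| RTCDet-N | 8xb16 | 640   | 37.0           | 52.9      | 8.8       | 3.2        |        |
| RTCDet-S | 8xb16 | 640   |                |           |           |            |        |
| RTCDet-M | 8xb16 | 640   |                |           |           |            |        |
| RTCDet-L | 8xb16 | 640   |                |           |           |            |        |
| RTCDet-X | 8xb16 | 640   | 50.7           | 68.3      | 165.7     | 43.7       |        |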