RTCDet:

Effectiveness of pretrained weights

IN1K: The backbone (RTCNet) is pretrained on ImageNet-1K for image classification.
Scratch: The detector is trained on COCO without any pretrained backbone weights.

For the small model:

| Model | Pretrained | Scale | Epoch | AP^val 0.5:0.95 | AP^val 0.5 |
|----------|------------|-------|-------|-----------------|------------|
| RTCDet-N | - | 640 | 500 | 37.0 | 52.9 |
| RTCDet-N | IN1K | 640 | 500 | | |
| RTCDet-L | - | 640 | 500 | 50.2 | 68.0 |
| RTCDet-L | IN1K | 640 | 500 | | |

Results on COCO-val

For the backbone, we use ImageNet-1K pretrained weights.
For training, we train the RTCDet series for 300 epochs on COCO.
For data augmentation, we use large scale jitter (LSJ), Mosaic, and Mixup, following YOLOv8.
For the optimizer, we use AdamW with a weight decay of 0.05 and a base per-image learning rate of 0.001 / 64.
For the learning rate scheduler, we use linear decay.
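
The per-image learning rate and the linear decay described above can be illustrated with a short sketch. It assumes the effective learning rate is the per-image rate multiplied by the total batch size, and that the decay falls linearly to a small final multiplier. The function names and the `final_factor` value of 0.01 are hypothetical and are not taken from this repository.

```python
# Base learning rate per image, as stated in the text above.
BASE_LR_PER_IMAGE = 0.001 / 64


def scaled_lr(total_batch_size: int) -> float:
    """Assumed linear scaling: per-image lr times the total batch size."""
    return BASE_LR_PER_IMAGE * total_batch_size


def linear_decay_factor(epoch: int, max_epoch: int, final_factor: float = 0.01) -> float:
    """Multiplier that falls linearly from 1.0 at epoch 0 to final_factor at max_epoch.

    final_factor is a hypothetical value chosen for illustration.
    """
    progress = min(max(epoch, 0), max_epoch) / max_epoch
    return (1.0 - progress) * (1.0 - final_factor) + final_factor


if __name__ == "__main__":
    lr0 = scaled_lr(128)  # total batch size used in the multi-GPU command
    for epoch in (0, 150, 300):
        print(epoch, lr0 * linear_decay_factor(epoch, max_epoch=300))
```

Under this assumption, a total batch size of 64 gives a starting rate of 0.001 and a total batch size of 128 gives 0.002.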

Train RTCDet

Single GPU

For example, to train RTCDet-S on COCO:

python train.py --cuda -d coco --root path/to/coco -m rtcdet_s -bs 16 -size 640 --wp_epoch 3 --max_epoch 300 --eval_epoch 10 --no_aug_epoch 20 --ema --fp16 --multi_scale

Multi GPU

For example, to train RTCDet-S on COCO:

python -m torch.distributed.run --nproc_per_node=8 train.py --cuda -dist -d coco --root /data/datasets/ -m rtcdet_s -bs 128 -size 640 --wp_epoch 3 --max_epoch 300 --eval_epoch 10 --no_aug_epoch 20 --ema --fp16 --sybn --multi_scale --save_folder weights/

Test RTCDet

For example, to test RTCDet-S on COCO-val:

python test.py --cuda -d coco --root path/to/coco -m rtcdet_s --weight path/to/RTCDet_s.pth -size 640 -vt 0.4 --show
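
The `-vt 0.4` argument is presumably a score threshold for visualization: detections scoring below it are not drawn. The sketch below shows that kind of filtering under this assumption. The `Detection` layout and the `filter_by_score` helper are hypothetical and do not come from `test.py`.

```python
from typing import List, Tuple

# Hypothetical detection layout: ((x1, y1, x2, y2), score, class_id).
Detection = Tuple[Tuple[float, float, float, float], float, int]


def filter_by_score(detections: List[Detection], visual_threshold: float = 0.4) -> List[Detection]:
    """Keep only detections whose confidence score reaches the threshold."""
    return [det for det in detections if det[1] >= visual_threshold]


if __name__ == "__main__":
    dets = [
        ((10.0, 20.0, 110.0, 220.0), 0.92, 0),
        ((30.0, 40.0, 90.0, 100.0), 0.35, 2),
        ((5.0, 5.0, 50.0, 60.0), 0.40, 1),
    ]
    for box, score, cls_id in filter_by_score(dets, visual_threshold=0.4):
        print(cls_id, score, box)
```

Raising the threshold hides more low-confidence boxes and lowering it shows more. Whether the real script uses `>=` or `>` at the boundary is not specified here.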

Evaluate RTCDet

For example, to evaluate RTCDet-S on COCO-val:

python eval.py --cuda -d coco-val --root path/to/coco -m rtcdet_s --weight path/to/RTCDet_s.pth
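
The AP 0.5:0.95 figure reported in this README follows the COCO convention: AP is computed at ten IoU thresholds from 0.50 to 0.95 in steps of 0.05 and then averaged, while AP 0.5 is the value at the single threshold 0.50. The sketch below shows only that averaging step. The helper names are hypothetical, and this is not the code `eval.py` runs.

```python
from typing import Dict, List


def coco_iou_thresholds() -> List[float]:
    """The ten IoU thresholds 0.50, 0.55, ..., 0.95 used by the COCO metric."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def average_ap(ap_per_threshold: Dict[float, float]) -> float:
    """Average the per-threshold AP values over all ten COCO thresholds."""
    thresholds = coco_iou_thresholds()
    return sum(ap_per_threshold[t] for t in thresholds) / len(thresholds)


if __name__ == "__main__":
    # Made-up per-threshold AP values, for illustration only.
    example = {t: 0.7 - 0.6 * (t - 0.5) for t in coco_iou_thresholds()}
    print("AP 0.5      :", example[0.5])
    print("AP 0.5:0.95 :", average_ap(example))
```

Because AP usually drops as the IoU threshold tightens, AP 0.5:0.95 is normally lower than AP 0.5, as it is in the tables in this README.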

Demo

Detect with Image

python demo.py --mode image --path_to_img path/to/image_dirs/ --cuda -m rtcdet_s --weight path/to/weight -size 640 -vt 0.4 --show

Detect with Video

python demo.py --mode video --path_to_vid path/to/video --cuda -m rtcdet_s --weight path/to/weight -size 640 -vt 0.4 --show --gif

Detect with Camera

python demo.py --mode camera --cuda -m rtcdet_s --weight path/to/weight -size 640 -vt 0.4 --show --gif

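| Model | Batch | Scale | AP^val 0.5:0.95 | AP^val 0.5 | FLOPs (G) | Params (M) |
|----------|-------|-------|-----------------|------------|-----------|------------|
| RTCDet-N | 8xb16 | 640 | 37.0 | 52.9 | 8.8 | 3.2 |
| RTCDet-S | 8xb16 | 640 | | | | |
| RTCDet-M | 8xb16 | 640 | | | | |
| RTCDet-L | 8xb16 | 640 | | | | |
| RTCDet-X | 8xb16 | 640 | 50.7 | 68.3 | 165.7 | 43.7 |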
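
The Batch value `8xb16` most likely means 8 GPUs with 16 images per GPU, which matches the multi-GPU command above (`--nproc_per_node=8` with `-bs 128`). That reading is an assumption, as is the idea that `-bs` is the total batch size across all GPUs. The `parse_batch_spec` helper below is hypothetical and only illustrates the arithmetic.

```python
import re
from typing import Tuple


def parse_batch_spec(spec: str) -> Tuple[int, int, int]:
    """Parse a '<num_gpus>xb<per_gpu_batch>' string into (gpus, per_gpu, total)."""
    match = re.fullmatch(r"(\d+)xb(\d+)", spec)
    if match is None:
        raise ValueError(f"unrecognized batch spec: {spec!r}")
    num_gpus, per_gpu = int(match.group(1)), int(match.group(2))
    return num_gpus, per_gpu, num_gpus * per_gpu


if __name__ == "__main__":
    gpus, per_gpu, total = parse_batch_spec("8xb16")
    print(f"{gpus} GPUs x {per_gpu} images = total batch size {total}")
```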
