yjh0410 6a4bb1b9a2 update 1 year ago
..
README.md 8067d085c7 update 1 year ago
build.py 6f9af45292 add RTCDet 1 year ago
loss.py d49192e3e3 use yolov5 style augmentation 1 year ago
matcher.py 6f9af45292 add RTCDet 1 year ago
rtcdet.py 59946710fc use yolov5 style augmentation 1 year ago
rtcdet_backbone.py 6a4bb1b9a2 update 1 year ago
rtcdet_basic.py 1b5e49e543 add pretrained 1 year ago
rtcdet_head.py b1ed050e0e update 1 year ago
rtcdet_neck.py 6f9af45292 add RTCDet 1 year ago
rtcdet_pafpn.py 6f9af45292 add RTCDet 1 year ago
rtcdet_pred.py b1ed050e0e update 1 year ago

README.md

RTCDet:

Effectiveness of the pretrained weight

  • IN1K Cls: We pretrain the backbone (RTCNet) on the ImageNet-1K dataset with the image classification task.
  • IN1K MIM: We pretrain the backbone (RTCNet) on the ImageNet-1K dataset with the masked image modeling task.
  • Scratch: We train the detector on COCO from scratch, without any pretrained weights for the backbone.

For the small model:

| Model    | Pretrained | Scale | APval 0.5:0.95 | APval 0.5 | FLOPs (G) | Params (M) | Weight |
|----------|------------|-------|----------------|-----------|-----------|------------|--------|
| RTCDet-S | Scratch    | 640   |                |           |           |            |        |
| RTCDet-S | IN1K Cls   | 640   |                |           |           |            |        |
| RTCDet-S | IN1K MIM   | 640   |                |           |           |            |        |

For the large model:

| Model    | Pretrained | Scale | APval 0.5:0.95 | APval 0.5 | FLOPs (G) | Params (M) | Weight |
|----------|------------|-------|----------------|-----------|-----------|------------|--------|
| RTCDet-L | Scratch    | 640   |                |           |           |            |        |
| RTCDet-L | IN1K Cls   | 640   |                |           |           |            |        |
| RTCDet-L | IN1K MIM   | 640   |                |           |           |            |        |

Results on the COCO-val

  • For the backbone, we ... (not sure)
  • For training, we train the RTCDet series for 300 epochs on COCO.
  • For data augmentation, we use large scale jitter (LSJ), Mosaic augmentation and Mixup augmentation, following YOLOX.
  • For the optimizer, we use AdamW with a weight decay of 0.05 and a base learning rate of 0.001 / 64 per image.
  • For the learning rate scheduler, we use a linear decay scheduler.
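
The optimizer bullet gives the base learning rate per image, so the effective rate depends on the batch size. The sketch below works through that arithmetic and a linear decay over training. It assumes the per-image rate is multiplied by the total batch size, and it uses a hypothetical `final_ratio` for the end point of the decay; the README states neither, and the warmup epochs used in the training commands are left out.

```python
BASE_LR_PER_IMAGE = 0.001 / 64  # per-image base learning rate from the README


def scaled_lr(batch_size: int) -> float:
    """Assumed linear scaling: per-image rate times the total batch size."""
    return BASE_LR_PER_IMAGE * batch_size


def linear_decay_lr(initial_lr: float, epoch: int, max_epoch: int,
                    final_ratio: float = 0.01) -> float:
    """Interpolate linearly from initial_lr at epoch 0 to
    initial_lr * final_ratio at epoch max_epoch.

    final_ratio is a hypothetical parameter, not taken from the README.
    """
    progress = min(max(epoch / max_epoch, 0.0), 1.0)
    factor = (1.0 - progress) * (1.0 - final_ratio) + final_ratio
    return initial_lr * factor


if __name__ == "__main__":
    lr0 = scaled_lr(128)  # total batch size used in the multi-GPU command
    for epoch in (0, 150, 300):
        print(epoch, linear_decay_lr(lr0, epoch, max_epoch=300))
```

Under this assumption a batch size of 64 recovers the quoted 0.001, and the single-GPU batch size of 16 gives a quarter of that.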

Train RTCDet

Single GPU

For example, to train RTCDet-S on COCO:

python train.py --cuda -d coco --root path/to/coco -m rtcdet_s -bs 16 -size 640 --wp_epoch 3 --max_epoch 300 --eval_epoch 10 --no_aug_epoch 20 --ema --fp16 --multi_scale 

Multi GPU

For example, to train RTCDet-S on COCO with 8 GPUs:

python -m torch.distributed.run --nproc_per_node=8 train.py --cuda -dist -d coco --root /data/datasets/ -m rtcdet_s -bs 128 -size 640 --wp_epoch 3 --max_epoch 300  --eval_epoch 10 --no_aug_epoch 20 --ema --fp16 --sybn --multi_scale --save_folder weights/ 
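
The single-GPU and multi-GPU commands differ only in the launcher, the `-dist` and `--sybn` flags, and the batch size. As a rough sketch, the helper below assembles either command from the number of GPUs, using only flags that appear in this README. The function name `build_train_cmd` and its parameters are hypothetical, and treating `-bs` as the total batch size across all GPUs is an assumption (it is consistent with `-bs 128` on 8 GPUs and the `8xb16` batch setting listed in the results table).

```python
import shlex
from typing import List, Optional


def build_train_cmd(num_gpus: int, per_gpu_batch: int, root: str,
                    model: str = "rtcdet_s",
                    save_folder: Optional[str] = None) -> List[str]:
    """Assemble a train.py command line (hypothetical helper)."""
    distributed = num_gpus > 1
    # Assumption: -bs is the total batch size summed over all GPUs.
    total_batch = num_gpus * per_gpu_batch

    cmd = ["python"]
    if distributed:
        cmd += ["-m", "torch.distributed.run", f"--nproc_per_node={num_gpus}"]
    cmd += ["train.py", "--cuda"]
    if distributed:
        cmd += ["-dist"]
    cmd += ["-d", "coco", "--root", root, "-m", model,
            "-bs", str(total_batch), "-size", "640",
            "--wp_epoch", "3", "--max_epoch", "300",
            "--eval_epoch", "10", "--no_aug_epoch", "20",
            "--ema", "--fp16"]
    if distributed:
        cmd += ["--sybn"]
    cmd += ["--multi_scale"]
    if save_folder is not None:
        cmd += ["--save_folder", save_folder]
    return cmd


if __name__ == "__main__":
    # Print the 8-GPU command; pass the list to subprocess.run to launch it.
    print(shlex.join(build_train_cmd(8, 16, "/data/datasets/",
                                     save_folder="weights/")))
```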

Test RTCDet

For example, to test RTCDet-S on COCO-val:

python test.py --cuda -d coco --root path/to/coco -m rtcdet_s --weight path/to/RTCDet_s.pth -size 640 -vt 0.4 --show 
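
The README does not describe `-vt 0.4`. Assuming it is a confidence threshold applied before detections are drawn, its effect could be sketched as follows. The `Detection` type and `filter_by_score` function are hypothetical and are not taken from the repository, and whether the real comparison is strict or inclusive is not known.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
    score: float                            # confidence in [0, 1]
    label: int                              # class index


def filter_by_score(dets: List[Detection],
                    threshold: float = 0.4) -> List[Detection]:
    """Keep only detections whose score exceeds the threshold."""
    return [d for d in dets if d.score > threshold]


if __name__ == "__main__":
    dets = [
        Detection((10.0, 20.0, 110.0, 220.0), 0.92, 0),
        Detection((50.0, 60.0, 90.0, 120.0), 0.35, 2),
        Detection((200.0, 40.0, 320.0, 180.0), 0.41, 0),
    ]
    for det in filter_by_score(dets):
        print(det)
```

Under that assumption, a lower threshold shows more boxes, including more false positives, while a higher one shows fewer.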

Evaluate RTCDet

For example, to evaluate RTCDet-S on COCO-val:

python eval.py --cuda -d coco-val --root path/to/coco -m rtcdet_s --weight path/to/RTCDet_s.pth 
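
The AP columns in the tables follow the COCO convention: AP 0.5 counts a prediction as correct when its IoU with a ground-truth box is at least 0.5, and AP 0.5:0.95 averages AP over ten IoU thresholds from 0.5 to 0.95 in steps of 0.05. The sketch below only illustrates the IoU computation and the threshold grid. It is not the evaluation code used by `eval.py`, and the function names are hypothetical.

```python
from typing import List, Sequence

# The ten IoU thresholds behind the "0.5:0.95" column.
COCO_IOU_THRESHOLDS: List[float] = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def box_iou(a: Sequence[float], b: Sequence[float]) -> float:
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def thresholds_passed(pred: Sequence[float], gt: Sequence[float]) -> List[float]:
    """Thresholds at which this prediction would count as a match."""
    iou = box_iou(pred, gt)
    return [t for t in COCO_IOU_THRESHOLDS if iou >= t]


if __name__ == "__main__":
    gt = (0.0, 0.0, 100.0, 100.0)
    pred = (10.0, 0.0, 110.0, 100.0)
    print(box_iou(pred, gt), thresholds_passed(pred, gt))
```

A loosely localized box can therefore count toward AP 0.5 while contributing nothing at the stricter thresholds, which is why the 0.5:0.95 figure is normally the lower of the two.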

Demo

Detect with Image

python demo.py --mode image --path_to_img path/to/image_dirs/ --cuda -m rtcdet_s --weight path/to/weight -size 640 -vt 0.4 --show

Detect with Video

python demo.py --mode video --path_to_vid path/to/video --cuda -m rtcdet_s --weight path/to/weight -size 640 -vt 0.4 --show --gif

Detect with Camera

python demo.py --mode camera --cuda -m rtcdet_s --weight path/to/weight -size 640 -vt 0.4 --show --gif
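
| Model    | Batch | Scale | APval 0.5:0.95 | APval 0.5 | FLOPs (G) | Params (M) | Weight |
|----------|-------|-------|----------------|-----------|-----------|------------|--------|
| RTCDet-N | 8xb16 | 640   |                |           |           |            |        |
| RTCDet-S | 8xb16 | 640   |                |           |           |            |        |
| RTCDet-M | 8xb16 | 640   |                |           |           |            |        |
| RTCDet-L | 8xb16 | 640   |                |           |           |            |        |
| RTCDet-X | 8xb16 | 640   |                |           |           |            |        |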