PyTorch_YOLO_Tutorial

YOLO Tutorial

English | 简体中文

Introduction

This repository provides the source code for a tutorial on YOLO. We adopt the core concepts of YOLOv1~v4, YOLOX, and YOLOv7 and make the necessary adjustments. By learning how to construct these well-known YOLO detectors, we hope that newcomers can enter the field of object detection with as little difficulty as possible.

Book: The technical book that accompanies this project's code is under review; please be patient.

Requirements

  • We recommend using Anaconda to create a conda environment:

    conda create -n yolo python=3.6
    
  • Then, activate the environment:

    conda activate yolo
    
  • Install the requirements:

    pip install -r requirements.txt 
    

My environment:

  • PyTorch = 1.9.1
  • Torchvision = 0.10.1

At a minimum, please make sure your PyTorch version is 1.x.
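
To quickly check that your environment matches, you can print the installed versions with the standard PyTorch and Torchvision APIs:

    python -c "import torch, torchvision; print(torch.__version__, torchvision.__version__, torch.cuda.is_available())"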

Experiments

VOC

  • Download VOC.

    cd <PyTorch_YOLO_Tutorial>
    cd dataset/scripts/
    sh VOC2007.sh
    sh VOC2012.sh
    
  • Check VOC

    cd <PyTorch_YOLO_Tutorial>
    python dataset/voc.py
    
  • Train on VOC

For example:

python train.py --cuda -d voc --root path/to/VOCdevkit -m yolov1 -bs 16 --max_epoch 150 --wp_epoch 1 --eval_epoch 10 --fp16 --ema --multi_scale

| Model | Backbone | Scale | IP | Epoch | APval 0.5 | FPS (3090, FP32, bs=1) | Weight |
|---|---|---|---|---|---|---|---|
| YOLOv1 | ResNet-18 | 640 | ✓ | 150 | 76.7 |  | ckpt |
| YOLOv2 | DarkNet-19 | 640 | ✓ | 150 | 79.8 |  | ckpt |
| YOLOv3 | DarkNet-53 | 640 | ✓ | 150 | 82.0 |  | ckpt |
| YOLOv4 | CSPDarkNet-53 | 640 | ✓ | 150 | 83.6 |  | ckpt |
| YOLOX-L | CSPDarkNet-L | 640 | ✓ | 150 | 84.6 |  | ckpt |
| YOLOv7-Large | ELANNet-Large | 640 | ✓ | 150 | 86.0 |  | ckpt |

All models are trained with ImageNet pretrained weights (IP). All FLOPs are measured with a 640x640 input on the VOC2007 test set. The FPS is measured with batch size 1 on a 3090 GPU, from model inference to the NMS operation.
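
The FLOPs and parameter counts reported in the tables can be reproduced with a generic profiler such as thop. The snippet below is only a minimal sketch with a stand-in module; it is not the measurement script this project ships with:

    import torch
    from thop import profile

    # Stand-in module for illustration; in practice, build a detector from this
    # project and switch it to eval mode before profiling.
    model = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1).eval()

    # Profile with a single 640x640 input, matching the measurement setting above.
    x = torch.randn(1, 3, 640, 640)
    macs, params = profile(model, inputs=(x,))
    # thop reports multiply-accumulate operations (MACs); many detection repos
    # quote this number directly as "FLOPs".
    print(f"FLOPs (G): {macs / 1e9:.2f}, Params (M): {params / 1e6:.2f}")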

COCO

  • Download COCO.

    cd <PyTorch_YOLO_Tutorial>
    cd dataset/scripts/
    sh COCO2017.sh
    
  • Check COCO

    cd <PyTorch_YOLO_Tutorial>
    python dataset/coco.py
    
  • Train on COCO

For example:

python train.py --cuda -d coco --root path/to/COCO -m yolov1 -bs 16 --max_epoch 150 --wp_epoch 1 --eval_epoch 10 --fp16 --ema --multi_scale

  • Redesigned YOLOv1~v2:

| Model | Backbone | Scale | Epoch | APval 0.5:0.95 | APval 0.5 | FLOPs (G) | Params (M) | Weight |
|---|---|---|---|---|---|---|---|---|
| YOLOv1 | ResNet-18 | 640 | 150 | 27.9 | 47.5 | 37.8 | 21.3 | ckpt |
| YOLOv2 | DarkNet-19 | 640 | 150 | 32.7 | 50.9 | 53.9 | 30.9 | ckpt |

  • YOLOv3:

| Model | Backbone | Scale | Epoch | APval 0.5:0.95 | APval 0.5 | FLOPs (G) | Params (M) | Weight |
|---|---|---|---|---|---|---|---|---|
| YOLOv3-Tiny | DarkNet-Tiny | 640 | 250 | 25.4 | 43.4 | 7.0 | 2.3 | ckpt |
| YOLOv3 | DarkNet-53 | 640 | 250 | 42.9 | 63.5 | 167.4 | 54.9 | ckpt |

  • YOLOv4:

| Model | Backbone | Scale | Epoch | APval 0.5:0.95 | APval 0.5 | FLOPs (G) | Params (M) | Weight |
|---|---|---|---|---|---|---|---|---|
| YOLOv4-Tiny | CSPDarkNet-Tiny | 640 | 250 | 31.0 | 49.1 | 8.1 | 2.9 | ckpt |
| YOLOv4 | CSPDarkNet-53 | 640 | 250 | 46.6 | 65.8 | 162.7 | 61.5 | ckpt |

  • YOLOv5:

| Model | Backbone | Scale | Epoch | APval 0.5:0.95 | APval 0.5 | FLOPs (G) | Params (M) | Weight |
|---|---|---|---|---|---|---|---|---|
| YOLOv5-N | CSPDarkNet-N | 640 | 250 | 29.8 | 47.1 | 7.7 | 2.4 | ckpt |
| YOLOv5-S | CSPDarkNet-S | 640 | 250 | 37.8 | 56.5 | 27.1 | 9.0 | ckpt |
| YOLOv5-M | CSPDarkNet-M | 640 | 250 | 43.5 | 62.5 | 74.3 | 25.4 | ckpt |
| YOLOv5-L | CSPDarkNet-L | 640 | 250 | 46.7 | 65.5 | 155.6 | 54.2 | ckpt |

*For YOLOv5-M and YOLOv5-L, increasing the batch size may improve performance. Due to my computing resources, I can only set the batch size to 16.*

  • YOLOX:

| Model | Backbone | Scale | Epoch | APval 0.5:0.95 | APval 0.5 | FLOPs (G) | Params (M) | Weight |
|---|---|---|---|---|---|---|---|---|
| YOLOX-N | CSPDarkNet-N | 640 | 300 | 31.1 | 49.5 | 7.5 | 2.3 | ckpt |
| YOLOX-S | CSPDarkNet-S | 640 | 300 | 39.0 | 58.8 | 26.8 | 8.9 | ckpt |
| YOLOX-M | CSPDarkNet-M | 640 | 300 | 44.6 | 63.8 | 74.3 | 25.4 | ckpt |
| YOLOX-L | CSPDarkNet-L | 640 | 300 | 46.9 | 65.9 | 155.4 | 54.2 | ckpt |

*For YOLOX-M and YOLOX-L, increasing the batch size may improve performance. Due to my computing resources, I can only set the batch size to 16.*

  • YOLOv7:

| Model | Backbone | Scale | Epoch | APval 0.5:0.95 | APval 0.5 | FLOPs (G) | Params (M) | Weight |
|---|---|---|---|---|---|---|---|---|
| YOLOv7-Tiny | ELANNet-Tiny | 640 | 300 | 38.0 | 56.8 | 22.6 | 7.9 | ckpt |
| YOLOv7 | ELANNet-Large | 640 | 300 | 48.0 | 67.5 | 144.6 | 44.0 | ckpt |

While YOLOv7 incorporates several technical details, such as anchor boxes, SimOTA, AuxiliaryHead, and RepConv, I found it too challenging to fully reproduce them all. Instead, I created a simpler version of YOLOv7 with an anchor-free structure and SimOTA. As a result, my reproduction performs worse due to the absence of the other techniques. However, since it is only intended as a tutorial, I am not too concerned about this gap.

  • My YOLO:

| Model | Scale | Epoch | APtest 0.5:0.95 | APtest 0.5 | APval 0.5:0.95 | APval 0.5 | FLOPs (G) | Params (M) | Weight |
|---|---|---|---|---|---|---|---|---|---|
| YOLOvx-N | 640 | 300 |  |  |  |  |  |  |  |
| YOLOvx-S | 640 | 300 |  |  |  |  |  |  |  |
| YOLOvx-M | 640 | 300 |  |  |  |  |  |  |  |
| YOLOvx-L | 640 | 300 | 50.2 | 68.6 | 50.0 | 68.4 | 176.6 | 47.6 | ckpt |

  • Redesigned RT-DETR:

| Model | Scale | Epoch | APval 0.5:0.95 | APval 0.5 | FLOPs (G) | Params (M) | Weight |
|---|---|---|---|---|---|---|---|
| RT-DETR-N | 640 | 300 |  |  |  |  |  |
| RT-DETR-S | 640 | 300 |  |  |  |  |  |
| RT-DETR-M | 640 | 300 |  |  |  |  |  |
| RT-DETR-L | 640 | 300 |  |  |  |  |  |

Notes:

  • All models are trained with ImageNet pretrained weights (IP). All FLOPs are measured with a 640x640 input on COCO val2017. The FPS is measured with batch size 1 on a 3090 GPU, from model inference to the NMS operation.

  • *The reproduced YOLOv5 uses a Decoupled Head, which is why its FLOPs and Params are higher than the official YOLOv5's. Due to my limited computing resources, I cannot align the training configuration with the official YOLOv5, so I cannot fully replicate the official performance. The YOLOv5 I reproduce is for learning purposes only.*

  • Due to my limited computing resources, I had to abandon training on other YOLO detectors, including YOLOv7-Huge and YOLOv8-Nano~Large. If you are interested in these models and have trained them using the code from this project, I would greatly appreciate it if you could share the trained weight files with me.

  • Using a larger batch size may improve the performance of large models, such as YOLOv5-L, YOLOX-L and YOLOv7-L. Due to my computing resources, I can only set the batch size to 16.

Train

Single GPU

sh train.sh

You can modify the configuration in train.sh according to your own situation.
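
train.sh itself is the source of truth; it presumably wraps a single train.py call with the flags already documented above, so editing it means changing values such as the dataset, model, and batch size. An illustrative single-GPU configuration (the values are examples, not the script's actual defaults):

    # Example single-GPU training configuration; adjust the dataset root, model, and batch size.
    python train.py \
            --cuda \
            -d coco \
            --root path/to/COCO \
            -m yolov1 \
            -bs 16 \
            --max_epoch 300 \
            --wp_epoch 3 \
            --eval_epoch 10 \
            --fp16 \
            --ema \
            --multi_scale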

You can also add --vis_tgt to visualize the images and targets during training. For example:

python train.py --cuda -d coco --root path/to/coco -m yolov1 --vis_tgt

Multi GPUs

sh train_ddp.sh

You can modify the configuration in train_ddp.sh according to your own situation.
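
train_ddp.sh defines the actual multi-GPU launch. As a rough sketch, it follows the standard PyTorch distributed-launch pattern around the same train.py flags shown above; the number of GPUs, the batch-size semantics under DDP, and any extra distributed flags are set inside train_ddp.sh, so treat the command below as illustrative only:

    # Illustrative 8-GPU launch with PyTorch's standard launcher (torch >= 1.9).
    # Check train_ddp.sh for the exact flags this project actually passes to train.py.
    python -m torch.distributed.run --nproc_per_node=8 \
            train.py \
            --cuda \
            -d coco \
            --root path/to/COCO \
            -m yolov1 \
            -bs 16 \
            --fp16 \
            --ema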

If training is interrupted, you can pass the path of the latest checkpoint to --resume (None by default) to resume training. For example:

python train.py \
        --cuda \
        -d coco \
        -m yolov1 \
        -bs 16 \
        --max_epoch 300 \
        --wp_epoch 3 \
        --eval_epoch 10 \
        --ema \
        --fp16 \
        --resume weights/coco/yolov1/yolov1_epoch_151_39.24.pth

Training will then continue from epoch 151.

Test

python test.py -d coco \
               --cuda \
               -m yolov1 \
               --img_size 640 \
               --weight path/to/weight \
               --root path/to/dataset/ \
               --show

For YOLOv7, which uses RepConv in its PaFPN, you can add --fuse_repconv to fuse the RepConv blocks:

python test.py -d coco \
               --cuda \
               -m yolov7_large \
               --fuse_repconv \
               --img_size 640 \
               --weight path/to/weight \
               --root path/to/dataset/ \
               --show
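
For context, --fuse_repconv performs RepVGG-style re-parameterization: the parallel 3x3 and 1x1 branches (each followed by BatchNorm) are folded into a single 3x3 convolution at inference time. The snippet below is a generic illustration of that algebra, not this project's exact implementation:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def fuse_conv_bn(conv, bn):
        # Fold BatchNorm statistics into a bias-free convolution, returning (weight, bias).
        std = (bn.running_var + bn.eps).sqrt()
        weight = conv.weight * (bn.weight / std).reshape(-1, 1, 1, 1)
        bias = bn.bias - bn.running_mean * bn.weight / std
        return weight, bias

    def fuse_repconv(conv3x3, bn3x3, conv1x1, bn1x1):
        # Merge parallel 3x3 and 1x1 branches into one 3x3 convolution with identical outputs.
        w3, b3 = fuse_conv_bn(conv3x3, bn3x3)
        w1, b1 = fuse_conv_bn(conv1x1, bn1x1)
        w1 = F.pad(w1, [1, 1, 1, 1])  # pad the 1x1 kernel to 3x3 so the kernels can be summed
        fused = nn.Conv2d(conv3x3.in_channels, conv3x3.out_channels, 3, padding=1, bias=True)
        fused.weight.data.copy_(w3 + w1)
        fused.bias.data.copy_(b3 + b1)
        return fused

After fusion, the single 3x3 convolution produces the same outputs as the original multi-branch block, which is why fusing before evaluation or deployment reduces latency without changing accuracy.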

Evaluation

python eval.py -d coco-val \
               --cuda \
               -m yolov1 \
               --img_size 640 \
               --weight path/to/weight \
               --root path/to/dataset/ \
               --show
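
Under the hood, COCO-style evaluation of the APval 0.5:0.95 and APval 0.5 numbers reported above is typically done with pycocotools. The sketch below shows that generic flow with placeholder file names; eval.py wraps this logic for you:

    from pycocotools.coco import COCO
    from pycocotools.cocoeval import COCOeval

    # Placeholder paths: ground-truth annotations and detections dumped in COCO's
    # result format (a list of {image_id, category_id, bbox, score} dicts).
    coco_gt = COCO("path/to/annotations/instances_val2017.json")
    coco_dt = coco_gt.loadRes("path/to/detections.json")

    coco_eval = COCOeval(coco_gt, coco_dt, iouType="bbox")
    coco_eval.evaluate()
    coco_eval.accumulate()
    coco_eval.summarize()   # prints AP @[0.50:0.95], AP @0.50, etc.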

Demo

I have provided some images in data/demo/images/, so you can run the following command to try a demo:

python demo.py --mode image \
               --path_to_img data/demo/images/ \
               --cuda \
               --img_size 640 \
               -m yolov2 \
               --weight path/to/weight \
               --show

If you want to run a demo on a streaming video, set --mode to video and pass the video path via --path_to_vid:

python demo.py --mode video \
               --path_to_vid data/demo/videos/your_video \
               --cuda \
               --img_size 640 \
               -m yolov2 \
               --weight path/to/weight \
               --show \
               --gif

If you want to run detection with your camera, set --mode to camera:

python demo.py --mode camera \
               --cuda \
               --img_size 640 \
               -m yolov2 \
               --weight path/to/weight \
               --show \
               --gif

Detection visualization

  • Detector: YOLOv2

Command:

python demo.py --mode video \
                --path_to_vid ./dataset/demo/videos/000006.mp4 \
               --cuda \
               --img_size 640 \
               -m yolov2 \
               --weight path/to/weight \
               --show \
               --gif

Results:


Tracking

Our project also supports multi-object tracking. Following the "tracking-by-detection" paradigm, we use a YOLO detector from this project as the detector and the simple, efficient ByteTrack as the tracker.
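
Conceptually, the loop in track.py follows the tracking-by-detection pattern: detect objects in every frame, then let the tracker associate the new detections with existing tracks. The sketch below uses placeholder detect() and tracker objects to show the flow; ByteTrack's real update interface and the drawing code live in track.py:

    import cv2

    def run_tracking(video_path, detect, tracker):
        # detect(frame) -> (boxes, scores) and tracker.update(...) are placeholders
        # standing in for this project's detector and the ByteTrack wrapper.
        cap = cv2.VideoCapture(video_path)
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # 1) Per-frame detection: boxes (xyxy) with confidence scores.
            boxes, scores = detect(frame)
            # 2) Association: ByteTrack first matches high-score boxes to existing
            #    tracks, then recovers tracks using the remaining low-score boxes.
            tracks = tracker.update(boxes, scores)
            # 3) Each returned track keeps a persistent ID across frames.
            for t in tracks:
                print(t)
        cap.release()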

  • image tracking

    python track.py --mode image \
                --path_to_img path/to/images/ \
                --cuda \
                -size 640 \
                -dt yolov2 \
                -tk byte_tracker \
                --weight path/to/coco_pretrained/ \
                --show \
                --gif
    
  • video tracking

    python track.py --mode video \
                --path_to_img path/to/video/ \
                --cuda \
                -size 640 \
                -dt yolov2 \
                -tk byte_tracker \
                --weight path/to/coco_pretrained/ \
                --show \
                --gif
    
  • camera tracking

    python track.py --mode camera \
                --cuda \
                -size 640 \
                -dt yolov2 \
                -tk byte_tracker \
                --weight path/to/coco_pretrained/ \
                --show \
                --gif
    

Tracking visualization

  • Detector: YOLOv2
  • Tracker: ByteTracker
  • Device: i5-12500H CPU

Command:

python track.py --mode video \
                --path_to_img ./dataset/demo/videos/000006.mp4 \
                -size 640 \
                -dt yolov2 \
                -tk byte_tracker \
                --weight path/to/coco_pretrained/ \
                --show \
                --gif

Results:


Deployment

  1. ONNX export and an ONNXRuntime
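
The deployment/ folder holds this project's own export and runtime code; the generic PyTorch-to-ONNX-to-ONNXRuntime flow it builds on looks roughly like the sketch below, shown here with a placeholder model rather than the project's exporter:

    import numpy as np
    import torch
    import onnxruntime as ort

    # Placeholder model; in practice, build a detector from this project and load trained weights.
    model = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1).eval()

    # Export to ONNX with a fixed 1x3x640x640 input, matching the image size used above.
    dummy = torch.randn(1, 3, 640, 640)
    torch.onnx.export(model, dummy, "model.onnx",
                      input_names=["images"], output_names=["outputs"],
                      opset_version=11)

    # Run the exported graph with ONNXRuntime on CPU.
    sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
    outputs = sess.run(None, {"images": dummy.numpy().astype(np.float32)})
    print([o.shape for o in outputs])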