We train YOLOv7 and YOLOv7-Tiny for 300 epochs on COCO.
For data augmentation, we use large-scale jitter (LSJ), Mosaic, and MixUp, following the YOLOv5 setting.
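To illustrate the Mosaic idea, here is a minimal NumPy sketch that tiles four images into one canvas. It is a simplification under stated assumptions: the center is fixed rather than random, label/box remapping is omitted, and the function name `mosaic4` and the gray fill value 114 are illustrative choices, not the actual implementation.

```python
import numpy as np

def mosaic4(imgs, out_size=640):
    # imgs: list of four HxWx3 uint8 arrays (assumed layout)
    half = out_size // 2
    # gray canvas; 114 is a common padding value, assumed here
    canvas = np.full((out_size, out_size, 3), 114, dtype=np.uint8)
    # quadrant top-left corners: TL, TR, BL, BR
    corners = [(0, 0), (0, half), (half, 0), (half, half)]
    for img, (y, x) in zip(imgs, corners):
        h, w = img.shape[:2]
        # naive nearest-neighbor resize of each image to one quadrant
        ys = np.arange(half) * h // half
        xs = np.arange(half) * w // half
        canvas[y:y + half, x:x + half] = img[ys[:, None], xs[None, :]]
    return canvas
```

A real Mosaic implementation also shifts and clips each image's bounding boxes into the new canvas coordinates, which this sketch leaves out.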
For the optimizer, we use SGD with momentum 0.937, weight decay 0.0005, and a base learning rate of 0.01.
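The update rule behind these hyperparameters can be written out explicitly. The sketch below implements one step of SGD with momentum and classic L2 weight decay (decay added to the gradient) in plain NumPy-style Python; the function name and the choice of L2-style decay are assumptions for illustration.

```python
def sgd_momentum_step(w, g, v, lr=0.01, momentum=0.937, weight_decay=5e-4):
    """One SGD step with momentum and L2 weight decay (sketch).

    w: parameter, g: gradient, v: velocity (momentum buffer).
    Returns the updated (w, v).
    """
    g = g + weight_decay * w   # classic L2 decay folded into the gradient
    v = momentum * v + g       # velocity accumulates past gradients
    w = w - lr * v
    return w, v
```

In practice, YOLO-style training usually applies weight decay only to convolution weights and exempts biases and BatchNorm parameters, but that per-parameter grouping is omitted here.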
For the learning rate scheduler, we use a linear decay scheduler.
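A linear decay schedule can be expressed as a single interpolation formula. The sketch below follows the common YOLOv5-style form; the final-lr fraction `lrf=0.01` is an assumed value, not one stated in this section.

```python
def linear_lr(epoch, epochs=300, lr0=0.01, lrf=0.01):
    # Linearly interpolate from lr0 (epoch 0) down to lr0 * lrf (last epoch).
    # lrf is the assumed final-lr fraction.
    return lr0 * ((1 - epoch / epochs) * (1 - lrf) + lrf)
```

For example, with the defaults the learning rate starts at 0.01 and decays toward 0.0001 by the end of training.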
For YOLOv7's structure, we use a decoupled head, following the YOLOX setting.
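A decoupled head predicts classification and box regression with separate convolution stacks instead of one shared prediction layer. The PyTorch module below is a minimal sketch of that structure, assuming YOLOX-style branches; the channel sizes, `SiLU` activations, and class/layer names are illustrative choices rather than the exact configuration used here.

```python
import torch
import torch.nn as nn

class DecoupledHead(nn.Module):
    """Sketch of a YOLOX-style decoupled head (assumed layout)."""

    def __init__(self, in_ch=256, num_classes=80):
        super().__init__()
        self.stem = nn.Conv2d(in_ch, in_ch, 1)
        # separate branch for classification
        self.cls_branch = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_ch, num_classes, 1))       # per-class scores
        # separate branch for localization
        self.reg_branch = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU())
        self.reg_pred = nn.Conv2d(in_ch, 4, 1)      # box offsets
        self.obj_pred = nn.Conv2d(in_ch, 1, 1)      # objectness

    def forward(self, x):
        x = self.stem(x)
        cls = self.cls_branch(x)
        reg_feat = self.reg_branch(x)
        return cls, self.reg_pred(reg_feat), self.obj_pred(reg_feat)
```

The key design point is that the classification and regression tasks no longer share the final convolution, which YOLOX reports improves convergence over a coupled head.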
While YOLOv7 incorporates several technical details, such as anchor boxes, SimOTA, an auxiliary head, and RepConv, I found it too challenging to reproduce them all. Instead, I built a simpler version of YOLOv7 using an anchor-free structure with SimOTA. As a result, my reproduction performs worse due to the missing components. However, since it is only intended as a tutorial, I am not too concerned about this gap.