Empirical research on End-to-End FCOS

Inspired by YOLOv10, I recently conducted an empirical study of FCOS to evaluate the end-to-end detection paradigm.

Experiments

COCO

Surprisingly, the three FCOS variants run at almost the same FPS!

For FCOS_RT_R18_3x, we train FCOS-RT-R18-3x with only the one-to-many assigner and evaluate it with NMS.
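As a reminder of what the NMS step does at evaluation time, here is a minimal sketch of greedy NMS in plain Python. It is not the code used in these experiments: the box format `(x1, y1, x2, y2)`, the function names `iou` and `nms`, and the IoU threshold of 0.6 are all assumptions made for illustration.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2). Box format is an assumption."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thresh=0.6):
    """Greedy NMS: visit boxes by descending score and keep a box only if it
    does not overlap any already kept box by more than iou_thresh.
    The threshold value is illustrative, not taken from the experiments."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep


# Two near-duplicate boxes on one object plus one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

With one-to-many training, several predictions typically fire on the same object, which is why a deduplication step of this kind is needed before computing AP.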

For FCOS_RT_R18_3x (O2O), we train FCOS-RT-R18-3x with only the one-to-one assigner and evaluate it without NMS.
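The difference between the two assigners can be sketched with a toy top-k rule. The README does not describe how either assigner is implemented, so everything here is an assumption: the matching-score matrix, the hypothetical helper `assign_topk`, and the tie-breaking rule. One-to-many corresponds to `k > 1` positives per ground-truth box, one-to-one to `k = 1`.

```python
def assign_topk(match_scores, k):
    """Toy label assignment (hypothetical, for illustration only).

    match_scores[g][p] is an assumed matching quality between ground-truth
    box g and prediction p. Each ground truth selects its k best predictions
    as positives. If a prediction is claimed by several ground truths, it is
    given to the one with the highest score.
    Returns a dict mapping prediction index -> ground-truth index.
    """
    assigned = {}  # prediction index -> (gt index, score)
    for g, row in enumerate(match_scores):
        top = sorted(range(len(row)), key=lambda p: row[p], reverse=True)[:k]
        for p in top:
            if p not in assigned or row[p] > assigned[p][1]:
                assigned[p] = (g, row[p])
    return {p: g for p, (g, _) in assigned.items()}


# 2 ground-truth boxes, 4 predictions.
scores = [
    [0.9, 0.8, 0.1, 0.2],
    [0.1, 0.3, 0.7, 0.6],
]
one_to_one = assign_topk(scores, k=1)
one_to_many = assign_topk(scores, k=2)
```

With `k=1` each object supervises a single prediction, so the model is pushed to emit one box per object and NMS becomes unnecessary. With `k=2` each object gets more positive samples, which gives richer supervision but produces duplicate detections.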

For FCOS_E2E_R18_3x, we deploy two parallel detection heads, one trained with the one-to-many assigner (o2m head) and the other with the one-to-one assigner (o2o head). To avoid conflicts between the gradients from the two heads, we stop the gradients of the o2o head from flowing back into the backbone and neck, so that only the gradients of the o2m head update the backbone and neck. This is consistent with the practice of YOLOv10. For evaluation, we remove the o2m head and use only the o2o head, without NMS.
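The gradient truncation can be illustrated with a toy scalar model and hand-written gradients. This is a sketch, not the training code: the model (one shared weight `w` standing in for backbone and neck, one weight each for the o2m head `a` and the o2o head `b`), the squared-error loss, and the function name `grads` are all assumptions. In PyTorch, the same effect is usually obtained by feeding the o2o head a detached copy of the features via `Tensor.detach()`.

```python
def grads(w, a, b, x, t, detach_o2o=True):
    """Gradients of L = (a*f - t)^2 + (b*f - t)^2 with shared feature f = w*x.

    w : shared "backbone/neck" weight (toy stand-in)
    a : o2m head weight, b : o2o head weight
    When detach_o2o is True, the o2o head sees f as a constant, so its loss
    contributes nothing to the gradient of w, but b is still trained.
    """
    f = w * x              # shared feature
    err_m = a * f - t      # o2m head error
    err_o = b * f - t      # o2o head error

    grad_a = 2 * err_m * f        # o2m head is always trained
    grad_b = 2 * err_o * f        # o2o head is always trained
    grad_w = 2 * err_m * a * x    # o2m loss always reaches the shared weight
    if not detach_o2o:
        grad_w += 2 * err_o * b * x  # o2o loss reaches w only without detach
    return {"w": grad_w, "a": grad_a, "b": grad_b}


detached = grads(w=1.0, a=2.0, b=3.0, x=1.0, t=1.0)
coupled = grads(w=1.0, a=2.0, b=3.0, x=1.0, t=1.0, detach_o2o=False)
```

In the detached case the shared weight receives only the o2m term, while the o2o head weight `b` gets the same gradient in both cases. This mirrors the setup described above: the o2o head learns on top of features shaped solely by the o2m head.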

Model	Scale	FPS (FP32, RTX 4060)	AP val (0.5:0.95)	AP val (0.5)	Weight	Logs
FCOS_RT_R18_3x	512,736	56	35.8	53.3	ckpt	log
FCOS_RT_R18_3x (O2O)	512,736	56	30.9	48.8	ckpt	log
FCOS_E2E_R18_3x	512,736	56	34.1	50.6	ckpt	log
