We provide the bash script `main_pretrain.sh` for pretraining. You can modify its hyperparameters to suit your needs. For example:
```Shell
cd Vision-Pretraining-Tutorial/masked_image_modeling/
python main_pretrain.py --cuda \
                        --dataset cifar10 \
                        --model vit_t \
                        --mask_ratio 0.75 \
                        --batch_size 128 \
                        --optimizer adamw \
                        --weight_decay 0.05 \
                        --lr_scheduler cosine \
                        --base_lr 0.00015 \
                        --min_lr 0.0 \
                        --max_epoch 400 \
                        --eval_epoch 20
```
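For reference, `--mask_ratio 0.75` corresponds to MAE-style random patch masking, where 75% of the image patches are hidden from the encoder. Below is a minimal sketch of how such per-sample random masking is commonly implemented; the function name, shapes, and details are illustrative assumptions, not this repository's actual code.

```python
import torch

def random_masking(patch_tokens, mask_ratio=0.75):
    """Illustrative MAE-style random masking (not the repo's exact code).

    patch_tokens: (B, N, D) patch embeddings without the [CLS] token.
    Returns the visible tokens, a binary mask (1 = masked), and the
    indices needed to restore the original patch order.
    """
    B, N, D = patch_tokens.shape
    num_keep = int(N * (1.0 - mask_ratio))  # e.g. keep 25% of patches

    # Random noise per patch decides which patches are kept.
    noise = torch.rand(B, N, device=patch_tokens.device)
    ids_shuffle = torch.argsort(noise, dim=1)       # small noise -> kept
    ids_restore = torch.argsort(ids_shuffle, dim=1)

    # Keep the first `num_keep` patches of the shuffled order.
    ids_keep = ids_shuffle[:, :num_keep]
    visible = torch.gather(patch_tokens, 1,
                           ids_keep.unsqueeze(-1).repeat(1, 1, D))

    # Binary mask in the original patch order: 0 = visible, 1 = masked.
    mask = torch.ones(B, N, device=patch_tokens.device)
    mask[:, :num_keep] = 0
    mask = torch.gather(mask, 1, ids_restore)
    return visible, mask, ids_restore
```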
We provide the bash script `main_finetune.sh` for finetuning. You can modify its hyperparameters to suit your needs. For example:
```Shell
cd Vision-Pretraining-Tutorial/masked_image_modeling/
python main_finetune.py --cuda \
                        --dataset cifar10 \
                        --model vit_t \
                        --batch_size 256 \
                        --optimizer adamw \
                        --weight_decay 0.05 \
                        --base_lr 0.0005 \
                        --min_lr 0.000001 \
                        --max_epoch 100 \
                        --wp_epoch 5 \
                        --eval_epoch 5 \
                        --pretrained path/to/vit_t.pth
```
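The `--pretrained` flag points to the MAE-pretrained checkpoint. As a rough sketch of what loading such a checkpoint typically involves (the checkpoint format and key names below are assumptions, not necessarily what `main_finetune.py` does):

```python
import torch

def load_mae_encoder(model, ckpt_path):
    """Illustrative loading of MAE encoder weights for finetuning (sketch).

    Assumes the checkpoint stores a state dict under 'model'; decoder
    keys are dropped, and missing keys (e.g. the new classifier head)
    are left randomly initialized.
    """
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    state_dict = checkpoint.get("model", checkpoint)

    # Keep only encoder parameters; the MAE decoder is not needed
    # for classification finetuning.
    encoder_sd = {k: v for k, v in state_dict.items()
                  if not k.startswith("decoder")}

    missing, unexpected = model.load_state_dict(encoder_sd, strict=False)
    print("missing keys:", missing)        # typically the classifier head
    print("unexpected keys:", unexpected)  # e.g. mask token, decoder parts
    return model
```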
Evaluate the top-1 and top-5 accuracy of ViT-Tiny on the CIFAR10 dataset:
```Shell
python main_finetune.py --cuda \
                        --dataset cifar10 \
                        -m vit_t \
                        --batch_size 256 \
                        --eval \
                        --resume path/to/vit_t_cifar10.pth
```
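For reference, top-1 and top-5 accuracy can be computed from the classifier logits as in the generic sketch below; this is not the exact evaluation code used by `main_finetune.py`.

```python
import torch

def topk_accuracy(logits, targets, ks=(1, 5)):
    """Fraction of samples whose true label is among the top-k predictions."""
    maxk = max(ks)
    _, pred = logits.topk(maxk, dim=1)          # top-k class indices: (B, maxk)
    correct = pred.eq(targets.unsqueeze(1))     # (B, maxk) boolean matches
    return {k: correct[:, :k].any(dim=1).float().mean().item() for k in ks}

# Example: 4 samples, 10 classes
logits = torch.randn(4, 10)
targets = torch.tensor([3, 7, 0, 2])
print(topk_accuracy(logits, targets))  # e.g. {1: 0.25, 5: 0.5}
```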
Evaluate a ViT-Tiny pretrained with the MAE framework on the CIFAR10 dataset:
```Shell
python main_pretrain.py --cuda \
                        --dataset cifar10 \
                        -m vit_t \
                        --resume path/to/mae_vit_t_cifar10.pth \
                        --eval \
                        --batch_size 1
```
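In MAE, the pretraining objective being evaluated here is a mean-squared error computed only over the masked patches. A minimal sketch of that loss is shown below; the patchification and any per-patch pixel normalization are assumptions, not this repository's exact implementation.

```python
import torch

def mae_reconstruction_loss(pred, target_patches, mask):
    """MSE over masked patches only, as in MAE (illustrative sketch).

    pred, target_patches: (B, N, patch_dim) predicted / ground-truth patch pixels.
    mask: (B, N) with 1 for masked patches, 0 for visible ones.
    """
    loss = (pred - target_patches) ** 2
    loss = loss.mean(dim=-1)                 # per-patch MSE: (B, N)
    return (loss * mask).sum() / mask.sum()  # average over masked patches only
```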
| Method | Model | Epochs | Top-1 Acc. (%) | Weight | MAE weight |
|---|---|---|---|---|---|
| MAE | ViT-T | 100 | 91.2 | ckpt | ckpt |
We thank Kaiming He for his inspiring work on MAE; his research sheds light on the semantic differences between vision and language and offers valuable insights for subsequent vision studies. We are also grateful for the official MAE source code, and we appreciate IcarusWizard's efforts in reproducing the MAE implementation.