How to fix training error launch:90 in YOLOX?

!python tools/train.py -f exps/example/yolox_voc/yolox_voc_s.py -d 1 -b 16 --fp16  -c /content/yolox_s.pth
2022-03-29 19:24:57 | INFO     | apex.amp.frontend:356 - enabled                : True
2022-03-29 19:24:57 | INFO     | apex.amp.frontend:356 - opt_level              : O1
2022-03-29 19:24:57 | INFO     | apex.amp.frontend:356 - cast_model_type        : None
2022-03-29 19:24:57 | INFO     | apex.amp.frontend:356 - patch_torch_functions  : True
2022-03-29 19:24:57 | INFO     | apex.amp.frontend:356 - keep_batchnorm_fp32    : None
2022-03-29 19:24:57 | INFO     | apex.amp.frontend:356 - master_weights         : None
2022-03-29 19:24:57 | INFO     | apex.amp.frontend:356 - loss_scale             : dynamic
2022-03-29 19:24:57 | INFO     | yolox.core.trainer:297 - loading checkpoint for fine tuning
2022-03-29 19:24:57 | WARNING  | yolox.utils.checkpoint:27 - Shape of head.cls_preds.0.weight in checkpoint is torch.Size([80, 128, 1, 1]), while shape of head.cls_preds.0.weight in model is torch.Size([3, 128, 1, 1]).
2022-03-29 19:24:57 | WARNING  | yolox.utils.checkpoint:27 - Shape of head.cls_preds.0.bias in checkpoint is torch.Size([80]), while shape of head.cls_preds.0.bias in model is torch.Size([3]).
2022-03-29 19:24:57 | WARNING  | yolox.utils.checkpoint:27 - Shape of head.cls_preds.1.weight in checkpoint is torch.Size([80, 128, 1, 1]), while shape of head.cls_preds.1.weight in model is torch.Size([3, 128, 1, 1]).
2022-03-29 19:24:57 | WARNING  | yolox.utils.checkpoint:27 - Shape of head.cls_preds.1.bias in checkpoint is torch.Size([80]), while shape of head.cls_preds.1.bias in model is torch.Size([3]).
2022-03-29 19:24:57 | WARNING  | yolox.utils.checkpoint:27 - Shape of head.cls_preds.2.weight in checkpoint is torch.Size([80, 128, 1, 1]), while shape of head.cls_preds.2.weight in model is torch.Size([3, 128, 1, 1]).
2022-03-29 19:24:57 | WARNING  | yolox.utils.checkpoint:27 - Shape of head.cls_preds.2.bias in checkpoint is torch.Size([80]), while shape of head.cls_preds.2.bias in model is torch.Size([3]).
2022-03-29 19:24:57 | ERROR    | yolox.core.launch:90 - An error has been caught in function 'launch', process 'MainProcess' (1549), thread 'MainThread' (140243385931648):
Traceback (most recent call last):

  File "tools/train.py", line 125, in <module>
    args=(exp, args),
          │    └ Namespace(batch_size=16, ckpt='/content/yolox_s.pth', devices=1, dist_backend='nccl', dist_url=None, exp_file='exps/example/y...
          └ ╒══════════════════╤═════════════════════════════════════════════════════════════════════════════════════════════════════════...

> File "/content/apex/YOLOX/yolox/core/launch.py", line 90, in launch
    main_func(*args)
    │          └ (╒══════════════════╤════════════════════════════════════════════════════════════════════════════════════════════════════════...
    └ <function main at 0x7f8cf10d3e60>

  File "tools/train.py", line 104, in main
    trainer.train()
    │       └ <function Trainer.train at 0x7f8bf2234d40>
    └ <yolox.core.trainer.Trainer object at 0x7f8bec4a7a90>

  File "/content/apex/YOLOX/yolox/core/trainer.py", line 69, in train
    self.before_train()
    │    └ <function Trainer.before_train at 0x7f8bec969710>
    └ <yolox.core.trainer.Trainer object at 0x7f8bec4a7a90>

  File "/content/apex/YOLOX/yolox/core/trainer.py", line 150, in before_train
    no_aug=self.no_aug,
           │    └ False
           └ <yolox.core.trainer.Trainer object at 0x7f8bec4a7a90>

  File "exps/example/yolox_voc/yolox_voc_s.py", line 36, in get_data_loader
    max_labels=50,

  File "/content/apex/YOLOX/yolox/data/datasets/voc.py", line 115, in __init__
    os.path.join(rootpath, "ImageSets", "Main", name + ".txt")
    │  │    │    │                              └ 'trainval'
    │  │    │    └ '/content/apex/YOLOX/datasets/VOCdevkit/VOC2007'
    │  │    └ <function join at 0x7f8cf31177a0>
    │  └ <module 'posixpath' from '/usr/lib/python3.7/posixpath.py'>
    └ <module 'os' from '/usr/lib/python3.7/os.py'>

FileNotFoundError: [Errno 2] No such file or directory: '/content/apex/YOLOX/datasets/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt'

The file exists at /content/YOLOX/datasets/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt but not at /content/apex/YOLOX/datasets/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt, which is where the trainer looks for it.

How can I fix this?

Source: https://stackoverflow.com//questions/71667882/how-to-fix-training-error-launch90-in-yolox

Answers

  • Sadra Naddaf commented:

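    This is more of a troubleshooting question than a machine learning one. In any case, if you are using this repository and the content folder holding the .pth checkpoint sits inside your YOLOX folder, run the command shown below. It assumes your terminal's working directory is the YOLOX folder (check with pwd).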

    If you want to train on a custom dataset, follow their guide for custom data. For example, if your data is in COCO format, put it in the ./datasets folder.

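    Now, if the downloaded weights are in the ./content/ folder, the following command starts training from yolox_s.pth on the images inside ./datasets, assuming they are in COCO format (note that the yolox_voc_s.py experiment used here reads VOC-style data from datasets/VOCdevkit, as the traceback in the question shows).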

    python tools/train.py -f exps/example/yolox_voc/yolox_voc_s.py -d 1 -b 16 --fp16 -c content/yolox_s.pth
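
Before launching, it can help to confirm that the relative paths in the command resolve to real files. The following is a minimal sketch and not part of YOLOX: the helper name `missing_paths` is hypothetical, and the listed paths are simply the ones from the command above plus the annotation file named in the traceback. By default it checks against the current working directory. Since the traceback shows the dataset being looked up under /content/apex/YOLOX, passing that directory as `base_dir` checks the location the trainer actually read from.

```python
import os


def missing_paths(paths, base_dir=None):
    """Return the absolute form of every entry in `paths` that does not exist.

    Relative entries are resolved against `base_dir`, which defaults to the
    current working directory (the same rule the shell applies to a path
    such as content/yolox_s.pth). Absolute entries are checked unchanged.
    """
    base_dir = os.getcwd() if base_dir is None else base_dir
    missing = []
    for p in paths:
        # os.path.join ignores base_dir when p is already absolute.
        resolved = os.path.abspath(os.path.join(base_dir, p))
        if not os.path.exists(resolved):
            missing.append(resolved)
    return missing


if __name__ == "__main__":
    # Paths taken from the command and the traceback; adjust to your layout.
    required = [
        "tools/train.py",
        "exps/example/yolox_voc/yolox_voc_s.py",
        "content/yolox_s.pth",
        "datasets/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt",
    ]
    for path in missing_paths(required):
        print("missing:", path)
```

If the script prints nothing, every listed path exists under the directory that was checked.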
    

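    Note: a path that starts with / is resolved from the root of the file system, while a path that starts with ./ (or has no prefix at all) is resolved from the current folder.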
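
The difference between the two kinds of path can be illustrated with Python's standard `posixpath` module. This is only a sketch: the working directory /content/YOLOX is taken from the question, and the helper name `resolve` is hypothetical.

```python
import posixpath


def resolve(path, cwd):
    """Resolve `path` as a POSIX shell would from working directory `cwd`."""
    # posixpath.join discards `cwd` when `path` is already absolute;
    # normpath then removes "." components such as a leading "./".
    return posixpath.normpath(posixpath.join(cwd, path))


if __name__ == "__main__":
    cwd = "/content/YOLOX"  # assumed working directory, from the question
    for p in ("/content/yolox_s.pth", "./content/yolox_s.pth", "content/yolox_s.pth"):
        print(p, "->", resolve(p, cwd))
```

With that working directory, `-c /content/yolox_s.pth` from the question points at the file system root, whereas `-c content/yolox_s.pth` from the answer points at /content/YOLOX/content/yolox_s.pth.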
