Pitfalls When Training with PyTorch 2.0

Overview

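I have been running an experiment recently, and it felt slow under PyTorch 1.8. Having read that PyTorch 2.0 brings a sizable speedup, I decided to run the code on PyTorch 2.0 instead. Along the way I hit a few small errors and a number of warnings. To keep them from interfering with the experimental results, I looked up fixes online and am recording them here.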

Problems

FutureWarning: The module torch.distributed.launch is deprecated and will be removed in future. Use torchrun. Note that --use-env is set by default in torchrun

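The warning is accompanied by the following error, because the launcher passes an argument that train.py's argument parser does not recognize: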

train.py: error: unrecognized arguments: --local-rank=0

Under PyTorch 1.8 I launched training with this command:

python -m  torch.distributed.launch --master_port 9843  --nproc_per_node=2  train.py

Under PyTorch 2.0 the command should be:

python -m  torch.distributed.launch --master_port 9843  --nproc_per_node=2 --use_env train.py

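Adding --use_env is all that is needed.
With --use_env set, PyTorch no longer passes the process's rank on the local machine as a command-line argument (args.local_rank). It writes it to the LOCAL_RANK environment variable instead, so the script has to read the rank from there. Switching to torchrun, as the warning suggests, has the same effect, since torchrun sets --use-env by default.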
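As a minimal sketch of how a training script might pick up the rank after this change: the helper name get_local_rank is hypothetical and not from my train.py, and accepting both --local_rank and --local-rank as fallbacks is an assumption made so the same script still works with launchers that pass the flag.

```python
import argparse
import os


def get_local_rank(argv=None):
    """Return this process's local rank, preferring the LOCAL_RANK env var."""
    parser = argparse.ArgumentParser()
    # Accept both spellings so launchers that still pass the flag keep working.
    parser.add_argument("--local_rank", "--local-rank", type=int, default=0)
    # parse_known_args ignores any other flags the script defines elsewhere.
    args, _ = parser.parse_known_args(argv)
    # With --use_env (or torchrun) the launcher exports LOCAL_RANK instead.
    return int(os.environ.get("LOCAL_RANK", args.local_rank))
```

Inside train.py the result would typically be used to select the GPU for the process, for example with torch.cuda.set_device(get_local_rank()).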

UserWarning: The parameter ‘pretrained’ is deprecated since 0.13 and may be removed in the future, please use ‘weights’ instead.

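This happens because torchvision deprecated the pretrained parameter in version 0.13, as the warning states, so the torchvision release that ships alongside PyTorch 2.0 expects the weights parameter instead.
The warning above is followed by a second one: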

UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet50_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet50_Weights.DEFAULT` to get the most up-to-date weights.

The statement I used under PyTorch 1.8 was:

resnet = models.resnet50(pretrained=True)

The fix is:

resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

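Switching to the new API is enough. IMAGENET1K_V1 is the set of weights that pretrained=True used to load, so the behavior stays the same.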

AttributeError: module ‘numpy’ has no attribute ‘int’.

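This is because np.int was deprecated in NumPy 1.20 and removed entirely in NumPy 1.24, so newer NumPy versions raise an AttributeError. For details, see: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
My original statement was: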

cls_gt = np.zeros((3, 384, 384), dtype=np.int)

Change it to:

cls_gt = np.zeros((3, 384, 384), dtype=np.int_)

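Alternatively, use the builtin int (which np.int was an alias for), or np.int32 or np.int64 when a specific width is required.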

`np.bool` was a deprecated alias for the builtin `bool`. To avoid this error in existing code, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.

np.bool was likewise deprecated in NumPy 1.20 and later removed.
My original statement was:

annotation = annotation.astype(np.bool)

Change it to:

annotation = annotation.astype(np.bool_)
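The two NumPy fixes above can be sketched together as follows. The small annotation array is made-up illustrative data, not from my dataset, and choosing np.int64 over np.int_ is an assumption made to keep the integer width explicit.

```python
import numpy as np

# An explicit width avoids depending on the platform's default integer size.
cls_gt = np.zeros((3, 384, 384), dtype=np.int64)

# Hypothetical label map standing in for the real annotation array.
annotation = np.array([[0, 2], [1, 0]])
# The builtin bool is what the removed np.bool alias used to refer to.
mask = annotation.astype(bool)
```

Both bool and np.bool_ produce the same array dtype here, so either spelling works for astype.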

UserWarning: ComplexHalf support is experimental and many operators don’t support it yet

Besides the warning, an error is raised as well:

RuntimeError: cuFFT only supports dimensions whose sizes are powers of two when computing in half precision

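The cause is that I was training with mixed precision, so the FFT received half-precision input whose dimensions were not powers of two.
The fix is to turn off automatic mixed precision (AMP).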
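Turning AMP off everywhere gives up its speed benefit for the whole model. A possible alternative, which is not what I did and is only a sketch, is to disable autocast just around the FFT and cast the input back to float32. The helper name fft2_full_precision is hypothetical, and the choice of torch.fft.fft2 is an assumption, since the FFT call in my own code is not shown here.

```python
import torch


def fft2_full_precision(x):
    """Run a 2-D FFT in float32 even when called inside an autocast region."""
    # Disable autocast locally so the rest of the model keeps mixed precision.
    with torch.autocast(device_type=x.device.type, enabled=False):
        # Cast explicitly: the input may already be float16 at this point.
        return torch.fft.fft2(x.float())
```

The output is complex64, so any layers that follow may need their inputs cast back to the dtype they expect.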
