复现KM3D:Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training

复现KM3D:Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training

时间:2022年5月27日

1.下载官方代码

到github中,搜索KM3D找到官方代码,下载。网址如下:
https://github.com/Banconxuan/RTM3D

2. 安装环境(install)

查看install.md,按照里面的内容进行环境的安装。
2.1 安装pytorch环境
官方要求安装的pytorch环境太老了。我没有按照官方的。我的环境如下:
Ubuntu 18.04.3 LTS
conda + python 3.7 + cuda10.1
pytorch1.6.0
torchvision0.7.0
显卡:Quadro P4000 8GB

conda create -n km3d python=3.7 -y
conda activate km3d
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch -y

既然选择了较新的环境,意味着后面要修改代码bug,以适应环境。
2.2 Install the requirements
进入km3d路径下,执行如下命令:

pip install -r requirements.txt

2.3 安装DCN v2.
参考知乎《单目3D目标检测RTM3D》作者:Vision 连接:https://zhuanlan.zhihu.com/p/415084925
没有用官方github提供的方法,提供的DCNv2版本太老了,不适用。所以,在github上搜索DCN v2找到适合pytorch的版本并安装。
我的环境pytorch1.6,所以安装的DCNv2要适配pytorch1.6
github搜索DCNv2,找到DCNv2_latest,作者jinfagang,使用pytorch1.6分支,按照说明进行安装即可,连接:https://github.com/jinfagang/DCNv2_latest/tree/pytorch1.6

2.4 安装 iou3d
参考知乎《单目3D目标检测RTM3D》作者:Vision 连接:https://zhuanlan.zhihu.com/p/415084925
进入src\lib\utils\iou3d\src,编辑iou3d.cpp
在所有#include<>后面添加3行代码即可:

#ifndef AT_CHECK
#define AT_CHECK TORCH_CHECK 
#endif

整体就是:

#include <torch/serialize/tensor.h>
#include <torch/extension.h>
#include <vector>
#include <cuda.h>
#include <cuda_runtime_api.h>

#ifndef AT_CHECK
#define AT_CHECK TORCH_CHECK 
#endif

#define CHECK_CUDA(x) AT_CHECK(x.type().is_cuda(), #x, " must be a CUDAtensor ")
#define CHECK_CONTIGUOUS(x) AT_CHECK(x.is_contiguous(), #x, " must be contiguous ")
#define CHECK_INPUT(x) CHECK_CUDA(x);CHECK_CONTIGUOUS(x)

#define DIVUP(m,n) ((m) / (n) + ((m) % (n) > 0))

#define CHECK_ERROR(ans) { gpuAssert((ans), __FILE__, __LINE__); }
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
{
   if (code != cudaSuccess)
   {
      fprintf(stderr,"GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
      if (abort) exit(code);
   }
}

const int THREADS_PER_BLOCK_NMS = sizeof(unsigned long long) * 8;
后面的代码省略,要不然太长了
后面的代码省略,要不然太长了
后面的代码省略,要不然太长了
后面的代码省略,要不然太长了

修改好iou3d.cpp后,执行如下命令,即可成功安装:

cd $KM3D_ROOT/src/lib/utiles/iou3d
python setup.py install

3.处理kitti数据集

kitti数据集按照km3d要求来处理

KM3DNet
├── kitti_format
│   ├── data
│   │   ├── kitti
│   │   |   ├── annotations /kitti_val.json kitti_train.json
│   │   │   ├── calib /000000.txt .....
│   │   │   ├── image(left[0-7480] right[7481-14961] input augmentatiom)
│   │   │   ├── label /000000.txt .....
|   |   |   ├── train.txt val.txt trainval.txt

calib是left image training集的。kitti自带。
image是left image training和right image training在一起的。kitti自带。其中,right image training集图片的名称要修改一下,修改脚本如下:

import os
import cv2
original_path = './image_3/'
process_path = './process_right_image/'
original_name = os.listdir(original_path)
for i, ori_name in enumerate(original_name):
    ori_image = cv2.imread(original_path + ori_name)
    ori_num = ori_name.replace('.png', '')
    ori_num = int(ori_num)
    new_num = ori_num + 7481
    if new_num > 9999:
        new_name = '0' + str(new_num) + '.png'
    else:
        new_name = '00' + str(new_num) + '.png'
    cv2.imwrite(process_path + new_name, ori_image)
    print(i+1)
    print(ori_num)
    print(new_num)
    print(new_name)

label是left image training 的标签。kitti自带。
train.txt val.txt trainval.txt 是KM3D自带。
annotations文件夹要自己新建,然后里面的kitti_val.json和kitti_train.json需要运行代码生成!!!生成方法如下:

python src/tools/kitti.py

如果没有annotations和kitti_val.json文件的话,后面运行程序会报错:

No such file or directory: '../kitti_format/data/kitti/annotations/kitti_val.json'

4.训练

参考官方文档Getting Started部分。
根据自己的显卡调整训练的参数。
ResNet-18骨干网络:

python ./src/main.py --data_dir ./kitti_format --exp_id KM3D_res18 --arch res_18 --batch_size 8 --master_batch_size 16 --lr 1.5625e-5 --gpus 0 --num_epochs 200

执行上述命令,会出现如下错误:
报错1

Traceback (most recent call last):
  File "./src/main.py", line 111, in <module>
    main(opt)
  File "./src/main.py", line 73, in main
    log_dict_train, _ = trainer.train(epoch, train_loader)
  File "/root/RTM3D/src/lib/trains/base_trainer.py", line 162, in train
    return self.run_epoch('train', epoch, data_loader,unlabel_loader1,unlabel_loader2,unlabel_set,iter_num,uncert)
  File "/root/RTM3D/src/lib/trains/base_trainer.py", line 97, in run_epoch
    output, loss, loss_stats = model_with_loss(batch,phase=phase)
  File "/root/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/RTM3D/src/lib/trains/base_trainer.py", line 33, in forward
    loss, loss_stats = self.loss(outputs, batch,phase)
  File "/root/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/RTM3D/src/lib/trains/car_pose.py", line 52, in forward
    coor_loss, prob_loss, box_score = self.position_loss(output, batch,phase)
  File "/root/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/RTM3D/src/lib/models/losses.py", line 332, in forward
    cys = (batch['ind'] / self.opt.output_w).int().float()
RuntimeError: Integer division of tensors using div or / is no longer supported, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.

版本问题,tensor和int之间的除法不能直接用’/‘,用torch.true_divide()代替。修改’/src/lib/models/losses.py’中的第332行,修改后:

cys = torch.true_divide(batch['ind'], self.opt.output_w).int().float()

报错2

Traceback (most recent call last):
  File "./src/main.py", line 111, in <module>
    main(opt)
  File "./src/main.py", line 73, in main
    log_dict_train, _ = trainer.train(epoch, train_loader)
  File "/root/RTM3D/src/lib/trains/base_trainer.py", line 162, in train
    return self.run_epoch('train', epoch, data_loader,unlabel_loader1,unlabel_loader2,unlabel_set,iter_num,uncert)
  File "/root/RTM3D/src/lib/trains/base_trainer.py", line 97, in run_epoch
    output, loss, loss_stats = model_with_loss(batch,phase=phase)
  File "/root/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/RTM3D/src/lib/trains/base_trainer.py", line 33, in forward
    loss, loss_stats = self.loss(outputs, batch,phase)
  File "/root/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/RTM3D/src/lib/trains/car_pose.py", line 52, in forward
    coor_loss, prob_loss, box_score = self.position_loss(output, batch,phase)
  File "/root/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/RTM3D/src/lib/models/losses.py", line 440, in forward
    dim_mask_score_mask = 1 - (dim_mask_score_mask > 0)
  File "/root/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/tensor.py", line 396, in __rsub__
    return _C._VariableFunctions.rsub(self, other)
RuntimeError: Subtraction, the `-` operator, with a bool tensor is not supported. If you are trying to invert a mask, use the `~` or `logical_not()` operator instead.

原因是pytorch最新版本中,不支持bool型与int型减法,如果需要对bool变量进行反转,则使用“~”操作。修改’/src/lib/models/losses.py‘中第440行,修改后

dim_mask_score_mask = ~ (dim_mask_score_mask > 0)

5 Results generation

python ./src/faster.py --demo ./kitti_format/data/kitti/val.txt --data_dir ./kitti_format --calib_dir ./kitti_format/data/kitti/calib/ --load_model ./kitti_format/exp/KM3D_res18/model_last.pth --gpus 0 --arch res_18

报错1:

Traceback (most recent call last):
  File "./src/faster.py", line 55, in <module>
    demo(opt)
  File "./src/faster.py", line 46, in demo
    ret = detector.run(image_name)
  File "/root/RTM3D/src/lib/detectors/base_detector.py", line 139, in run
    output, dets, forward_time = self.process(images,meta,return_time=True)
  File "/root/RTM3D/src/lib/detectors/car_pose.py", line 57, in process
    output['hm'], output['hps'], output['dim'], output['rot'], prob=output['prob'],K=self.opt.K, meta=meta, const=self.const)
  File "/root/RTM3D/src/lib/models/decode.py", line 788, in car_pose_decode_faster
    scores, inds, clses, ys, xs = _topk(heat, K=K)
  File "/root/RTM3D/src/lib/models/decode.py", line 109, in _topk
    topk_ys   = (topk_inds / width).int().float()
RuntimeError: Integer division of tensors using div or / is no longer supported, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.

版本问题,tensor和int之间的除法不能直接用’/’,用torch.true_divide()代替。

topk_ys   = torch.true_divide(topk_inds, width).int().float()

后面还有类似的bug,这里只列举一个。
报错2:
参考https://blog.csdn.net/long_songs/article/details/124736338?ops_request_misc=&request_id=&biz_id=102&utm_term=Overload%20resolution%20failed:%20>%20&utm_medium=distribute.pc_search_result.none-task-blog-2allsobaiduweb~default-0-124736338.nonecase&spm=1018.2226.3001.4187

具体的报错信息忘了截图了,类似于下面的错误

cv2.error: OpenCV(4.5.2) :-1: error: (-5:Bad argument) in function 'pointPolygonTest'
> Overload resolution failed:
>  - Can't parse 'pt'. Sequence item with index 0 has a wrong type
>  - Can't parse 'pt'. Sequence item with index 0 has a wrong type

这种就是因为不是int,在能加int的地方全给 int()上就完事了!。
/src/lib/utils/ddd_utils.py中的draw_bos_3d()函数中cv2.line()函数的参数修改成int()类型。
修改前:

def draw_box_3d(image, corners, c=(0, 0, 255)):
  face_idx = [[0,1,5,4],
              [1,2,6, 5],
              [2,3,7,6],
              [3,0,4,7]]
  for ind_f in range(3, -1, -1):
    f = face_idx[ind_f]
    for j in range(4):
      cv2.line(image, (corners[f[j], 0], corners[f[j], 1]),
               (corners[f[(j+1)%4], 0], corners[f[(j+1)%4], 1]), c, 2, lineType=cv2.LINE_AA)
    if ind_f == 0:
      cv2.line(image, (corners[f[0], 0], corners[f[0], 1]),
               (corners[f[2], 0]),corners[f[2], 1]), c, 1, lineType=cv2.LINE_AA)
      cv2.line(image, (corners[f[1], 0], corners[f[1], 1]),
               (corners[f[3], 0], corners[f[3], 1]), c, 1, lineType=cv2.LINE_AA)
  return image

修改后:

def draw_box_3d(image, corners, c=(0, 0, 255)):
  face_idx = [[0,1,5,4],
              [1,2,6, 5],
              [2,3,7,6],
              [3,0,4,7]]
  for ind_f in range(3, -1, -1):
    f = face_idx[ind_f]
    for j in range(4):
      cv2.line(image, (int(corners[f[j], 0]),int(corners[f[j], 1])),
               (int(corners[f[(j+1)%4], 0]), int(corners[f[(j+1)%4], 1])), c, 2, lineType=cv2.LINE_AA)
    if ind_f == 0:
      cv2.line(image, (int(corners[f[0], 0]), int(corners[f[0], 1])),
               (int(corners[f[2], 0]),int(corners[f[2], 1])), c, 1, lineType=cv2.LINE_AA)
      cv2.line(image, (int(corners[f[1], 0]), int(corners[f[1], 1])),
               (int(corners[f[3], 0]), int(corners[f[3], 1])), c, 1, lineType=cv2.LINE_AA)
  return image

6 Visualization

python ./src/faster.py --vis --demo ./kitti_format/data/kitti/val.txt --data_dir ./kitti_format --calib_dir ./kitti_format/data/kitti/calib/ --load_model ./kitti_format/exp/KM3D_res18/model_last.pth --gpus 0 --arch res_18

报错1:

Traceback (most recent call last):
  File "./src/faster.py", line 55, in <module>
    demo(opt)
  File "./src/faster.py", line 46, in demo
    ret = detector.run(image_name)
  File "/root/RTM3D/src/lib/detectors/base_detector.py", line 163, in run
    self.show_results(debugger, image, results, calib_numpy)
  File "/root/RTM3D/src/lib/detectors/car_pose.py", line 113, in show_results
    debugger.show_all_imgs(pause=self.pause)
  File "/root/RTM3D/src/lib/utils/debugger.py", line 364, in show_all_imgs
    cv2.imshow('{}'.format(i), v)
cv2.error: OpenCV(4.5.5) /io/opencv/modules/highgui/src/window.cpp:1268: error: (-2:Unspecified error) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Cocoa support.
 If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function 'cvShowImage'

解决方法参考:
https://blog.csdn.net/tudou2013goodluck/article/details/108402055?ops_request_misc=&request_id=&biz_id=102&utm_term=If%20you%20are%20on%20Ubuntu%20or%20Debian&utm_medium=distribute.pc_search_result.none-task-blog-2allsobaiduweb~default-1-108402055.nonecase

conda remove opencv
conda install -c menpo opencv

pip install --upgrade pip
pip install opencv-contrib-python

效果图:
复现KM3D:Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training
按键盘上的上下左右键可以切换图片。
注意这张图片中的黑色部分是终端黑窗口,截图的时候截下来了,不是图片显示错误。

7. All-in-one fashion

Generating results in all-in-one fashion. It simultaneously generates main center, 2D bounding box, regressed keypoints, heatmap keypoint, dim, orientation, confidence.
官方给的命令如下:

python ./src/faster.py --demo ./kitti_format/data/kitti/val.txt --calib_dir ./kitti_format/data/kitti/calib/ --load_model ./kitti_format/exp/KM3D_res18/model_last.pth --gpus 0 --arch res_18

执行命令会报错:

[ WARN:0@4.303] global /io/opencv/modules/imgcodecs/src/loadsave.cpp (239) findDecoder imread_('data/kitti/image/000001.png'): can't open/read file: check file path/integrity
Traceback (most recent call last):
  File "./src/faster.py", line 57, in <module>
    demo(opt)
  File "./src/faster.py", line 48, in demo
    ret = detector.run(image_name)
  File "/root/RTM3D/src/lib/detectors/base_detector.py", line 127, in run
    images, meta = self.pre_process(image, scale, meta)
  File "/root/RTM3D/src/lib/detectors/base_detector.py", line 44, in pre_process
    height, width = image.shape[0:2]
AttributeError: 'NoneType' object has no attribute 'shape'

注意报错信息的第一行,无法读取’data/kitti/image/000001.png’,显然路径是有问题的,无法读入图片,这才导致‘None Type’类型。
仔细观察这条命令,会发现这条命令和 上文中Results generation部分的命令只少了data_dir参数:

--data_dir ./kitti_format

个人感觉作者的命令是漏掉了这个参数,补上这个参数就能正常执行了,补上这个参数后路径就变成了./kitti_format/data/kitti/image/000001.png,这样路径才是正确的啊。
正确命令如下,同Results generation部分的命令:

python ./src/faster.py --demo ./kitti_format/data/kitti/val.txt --data_dir ./kitti_format --calib_dir ./kitti_format/data/kitti/calib/ --load_model ./kitti_format/exp/KM3D_res18/model_last.pth --gpus 0 --arch res_18

个人的理解就是这样的,愿意听听大家的理解。

8.Evaluation

 python ./src/tools/kitti-object-eval-python/evaluate.py evaluate --label_path=./kitti_format/data/kitti/label/ --label_split_file ./kitti_format/data/kitti/val.txt --current_class=0,1,2 --coco=False --result_path=./kitti_format/exp/results/data/

结果如下:

Car AP@0.70, 0.70, 0.70:
bbox AP:84.86, 67.33, 57.84
bev  AP:22.03, 15.89, 16.11
3d   AP:11.93, 8.26, 6.63
aos  AP:84.84, 66.52, 57.12

Car AP_R40@0.70, 0.70, 0.70:
bbox AP:86.85, 70.19, 60.34
bev  AP:15.91, 9.66, 8.50
3d   AP:11.40, 5.57, 4.38
aos  AP:86.82, 69.23, 59.24

Car AP@0.70, 0.50, 0.50:
bbox AP:84.86, 67.33, 57.84
bev  AP:48.56, 33.41, 31.73
3d   AP:43.06, 27.55, 26.34
aos  AP:84.84, 66.52, 57.12

Car AP_R40@0.70, 0.50, 0.50:
bbox AP:86.85, 70.19, 60.34
bev  AP:45.02, 30.01, 26.71
3d   AP:42.28, 25.16, 22.66
aos  AP:86.82, 69.23, 59.24

Pedestrian AP@0.50, 0.50, 0.50:
bbox AP:24.48, 27.27, 27.27
bev  AP:3.03, 4.55, 4.55
3d   AP:3.03, 4.55, 4.55
aos  AP:19.60, 22.03, 22.03

Pedestrian AP_R40@0.50, 0.50, 0.50:
bbox AP:19.23, 24.33, 24.33
bev  AP:0.00, 1.93, 1.93
3d   AP:0.00, 1.25, 1.25
aos  AP:14.75, 18.41, 18.41

Pedestrian AP@0.50, 0.25, 0.25:
bbox AP:24.48, 27.27, 27.27
bev  AP:9.09, 13.64, 13.64
3d   AP:9.09, 13.64, 13.64
aos  AP:19.60, 22.03, 22.03

Pedestrian AP_R40@0.50, 0.25, 0.25:
bbox AP:19.23, 24.33, 24.33
bev  AP:2.60, 6.25, 6.25
3d   AP:2.60, 6.25, 6.25
aos  AP:14.75, 18.41, 18.41

Cyclist AP@0.50, 0.50, 0.50:
bbox AP:3.03, 3.03, 3.03
bev  AP:2.27, 2.27, 2.27
3d   AP:2.27, 2.27, 2.27
aos  AP:3.00, 3.00, 3.00

Cyclist AP_R40@0.50, 0.50, 0.50:
bbox AP:0.83, 0.83, 0.83
bev  AP:0.00, 0.00, 0.00
3d   AP:0.00, 0.00, 0.00
aos  AP:0.82, 0.82, 0.82

Cyclist AP@0.50, 0.25, 0.25:
bbox AP:3.03, 3.03, 3.03
bev  AP:2.27, 2.27, 2.27
3d   AP:2.27, 2.27, 2.27
aos  AP:3.00, 3.00, 3.00

Cyclist AP_R40@0.50, 0.25, 0.25:
bbox AP:0.83, 0.83, 0.83
bev  AP:0.00, 0.00, 0.00
3d   AP:0.00, 0.00, 0.00
aos  AP:0.82, 0.82, 0.82

文章出处登录后可见!

已经登录?立即刷新

共计人评分,平均

到目前为止还没有投票!成为第一位评论此文章。

(0)
心中带点小风骚的头像心中带点小风骚普通用户
上一篇 2022年5月30日 上午11:36
下一篇 2022年5月30日

相关推荐