【人脸姿态】2D人脸姿态估计的两种方式:solvePnP与3DMM参数

先看结果:

face

man1

1,solvePNP姿态估计

1.1简介

这里的姿态估计其实就是人脸相对相机的方向估计,估计的要点就是找出2D像素点与3D像素点之间的映射关系。这个映射矩阵是一个平移矩阵和旋转矩阵的组合。我们先给出3D到3D坐标的映射关系,其实就是相机坐标系向世界坐标系的变换关系(称作相机外参),此变换关系就是人脸相对人脸的方向估计。3D变换关系如下:

【人脸姿态】2D人脸姿态估计的两种方式:solvePnP与3DMM参数

可是我们现在不知道对于相机的3D坐标,所以我们需要2D点向相机3D点映射关系(相机内参),关系如下:

【人脸姿态】2D人脸姿态估计的两种方式:solvePnP与3DMM参数

其中f是焦距,c是光学中心(我们先不考虑相机畸变)。组合之后的2d到3d变换关系如下

【人脸姿态】2D人脸姿态估计的两种方式:solvePnP与3DMM参数

展开得到:

【人脸姿态】2D人脸姿态估计的两种方式:solvePnP与3DMM参数

1.2内参标定

内参矩阵 我们可以使用棋盘格来标定相机,过程如下:

import cv2
import numpy as np
import glob

# 设置寻找亚像素角点的参数,采用的停止准则是最大循环次数30和最大误差容限0.001
criteria = (cv2.TERM_CRITERIA_MAX_ITER | cv2.TERM_CRITERIA_EPS, 30, 0.001)

# 获取标定板角点的位置
objp = np.zeros((4 * 6, 3), np.float32)
objp[:, :2] = np.mgrid[0:6, 0:4].T.reshape(-1, 2)  # 将世界坐标系建在标定板上,所有点的Z坐标全部为0,所以只需要赋值x和y

obj_points = []  # 存储3D点
img_points = []  # 存储2D点

images = glob.glob("image4/*.jpg")
i=0;
for fname in images:
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    size = gray.shape[::-1]
    ret, corners = cv2.findChessboardCorners(gray, (6, 4), None)
    #print(corners)

    if ret:

        obj_points.append(objp)

        corners2 = cv2.cornerSubPix(gray, corners, (5, 5), (-1, -1), criteria)  # 在原角点的基础上寻找亚像素角点
        #print(corners2)
        if [corners2]:
            img_points.append(corners2)
        else:
            img_points.append(corners)

        cv2.drawChessboardCorners(img, (6, 4), corners, ret)  # 记住,OpenCV的绘制函数一般无返回值
        i+=1;
        cv2.imwrite('conimg'+str(i)+'.jpg', img)
        cv2.waitKey(1500)

print(len(img_points))
cv2.destroyAllWindows()

# 标定
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(obj_points, img_points, size, None, None)

print("ret:", ret)
print("mtx:\n", mtx) # 内参数矩阵
print("dist:\n", dist)  # 畸变系数   distortion cofficients = (k_1,k_2,p_1,p_2,k_3)
print("rvecs:\n", rvecs)  # 旋转向量  # 外参数
print("tvecs:\n", tvecs ) # 平移向量  # 外参数

print("-----------------------------------------------------")

img = cv2.imread(images[2])
h, w = img.shape[:2]
newcameramtx, roi = cv2.getOptimalNewCameraMatrix(mtx,dist,(w,h),1,(w,h))#显示更大范围的图片(正常重映射之后会删掉一部分图像)
print (newcameramtx)
print("------------------使用undistort函数-------------------")
dst = cv2.undistort(img,mtx,dist,None,newcameramtx)
x,y,w,h = roi
dst1 = dst[y:y+h,x:x+w]
cv2.imwrite('calibresult3.jpg', dst1)
print ("方法一:dst的大小为:", dst1.shape)

1.3 估计方法

姿势估计的欧拉角都是根据旋转矩阵转换而来,而旋转矩阵是根据2d图像到3D图像,或者3d相机坐标系到3D世界坐标系的变换关系使用solvePnP算法求得,2d图像到3D相机坐标系是有相机内参矩阵变换得到,3D相机到3D世界坐标系是的变换矩阵包括R(旋转矩阵)和T(平移矩阵构成),其中的旋转矩阵可转换成欧拉角,也就是水平,垂直,深度轴三个方向的角度,具体步骤:

1)首先定义一个具有n个关键点的3D脸部模型,n可以根据自己对准确度的容忍程度进行定义,以下示例定义6个关键点的3D脸部模型(左眼角,右眼角,鼻尖,左嘴角,右嘴角,下颌);

2)采用人脸检测以及面部关键点检测得到上述3D脸部对应的2D人脸关键点;

3)采用Opencv的solvePnP函数解出旋转向量;

4)将旋转向量转换为欧拉角;

OpenCV中solvePnP 和 solvePnPRansac都可以用来估计Pose。第二个solvePnPRansac利用随机抽样一致算法(Random sample consensus,RANSAC)的思想,虽然计算出的姿态更加精确,但速度慢、没法实现实时系统,所以我们这里只关注第一个solvePnP函数,具体的参数可以参见Opencv文档。

注意点1:solvePnP里有三种解法:P3P, EPnP,迭代法(默认);opencv2里参数分别为CV_P3P,CV_EPNP,CV_ITERATIVE (opencv3里多了DLS和UPnP解法)。

注意点2:solvePnP需要至少3组点:P3P只使用4组点,3组求出多个解,第四组确定最优解;EPnP使用大于等于3组点;迭代法调用cvFindExtrinsicCameraParams2,进而使用SVD分解并调用cvFindHomography,而cvFindHomography需要至少4组点。

具体实现:1

import numpy as np
import math
import cv2 
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("./point68.dat")

# 3D model points.
model_points = np.array([
                            (0.0, 0.0, 0.0),             # Nose tip
                            (0.0, -330.0, -65.0),        # Chin
                            (-225.0, 170.0, -135.0),     # Left eye left corner
                            (225.0, 170.0, -135.0),      # Right eye right corne
                            (-150.0, -150.0, -125.0),    # Left Mouth corner
                            (150.0, -150.0, -125.0)      # Right mouth corner
                         
                        ])
                        



# 从旋转向量转换为欧拉角
def get_euler_angle(rotation_vector):
    # calculate rotation angles
    theta = cv2.norm(rotation_vector, cv2.NORM_L2)
    
    # transformed to quaterniond
    w = math.cos(theta / 2)
    x = math.sin(theta / 2)*rotation_vector[0][0] / theta
    y = math.sin(theta / 2)*rotation_vector[1][0] / theta
    z = math.sin(theta / 2)*rotation_vector[2][0] / theta
    
    ysqr = y * y
    # pitch (x-axis rotation)
    t0 = 2.0 * (w * x + y * z)
    t1 = 1.0 - 2.0 * (x * x + ysqr)
    print('t0:{}, t1:{}'.format(t0, t1))
    pitch = math.atan2(t0, t1)
    
    # yaw (y-axis rotation)
    t2 = 2.0 * (w * y - z * x)
    if t2 > 1.0:
        t2 = 1.0
    if t2 < -1.0:
        t2 = -1.0
    yaw = math.asin(t2)
    
    # roll (z-axis rotation)
    t3 = 2.0 * (w * z + x * y)
    t4 = 1.0 - 2.0 * (ysqr + z * z)
    roll = math.atan2(t3, t4)
    
    print('pitch:{}, yaw:{}, roll:{}'.format(pitch, yaw, roll))
    
    # 单位转换:将弧度转换为度
    Y = int((pitch/math.pi)*180)
    X = int((yaw/math.pi)*180)
    Z = int((roll/math.pi)*180)
    
    return X, Y, Z



cap=cv2.VideoCapture(0)
    
while True:
	# Camera internals
	flag,img = cap.read()
	#img = cv2.imread("000_4.jpg")
	size = img.shape

	focal_length = size[1]
	center = (size[1]/2, size[0]/2)
	camera_matrix = np.array(
							 [[focal_length, 0, center[0]],
							 [0, focal_length, center[1]],
							 [0, 0, 1]], dtype = "double"
							 )

	#print("Camera Matrix :\n {0}".format(camera_matrix)

	# 取灰度
	img_gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)

	# 人脸数rects
	rects = detector(img_gray, 0)
	#for i in range(len(rects)):
	point_list=[]
	if len(rects)>0:
		landmarks = list((p.x, p.y) for p in predictor(img, rects[0]).parts())

		point_list.append(landmarks[30])
		point_list.append(landmarks[8])
		point_list.append(landmarks[36])
		point_list.append(landmarks[45])
		point_list.append(landmarks[48])
		point_list.append(landmarks[54])
		print(point_list)

		for idx, point in enumerate(point_list):
			cv2.circle(img, point, 2, color=(0, 255, 0))
	#        font = cv2.FONT_HERSHEY_SIMPLEX
	#        cv2.putText(img, str(idx + 1), None, font, 0.8, (0, 0, 255), 1, cv2.LINE_AA)


		image_points =  np.array(point_list,dtype="double")



		dist_coeffs = np.zeros((4,1)) # Assuming no lens distortion
		success, rotation_vector, translation_vector = cv2.solvePnP(model_points, image_points, camera_matrix, dist_coeffs, flags=cv2.SOLVEPNP_ITERATIVE)#SOLVEPNP_P3P/SOLVEPNP_ITERATIVE

		#print("Rotation Vector:\n {0}".format(rotation_vector))
		#print("Translation Vector:\n {0}".format(translation_vector))


		X1,Y1,Z1 = get_euler_angle(rotation_vector)

		print("=====================>",X1,Y1,Z1 )

		# Project a 3D point (0, 0, 1000.0) onto the image plane.

		# We use this to draw a line sticking out of the nose

		(nose_end_point2D, jacobian) = cv2.projectPoints(np.array([(0.0, 0.0, 1000.0)]), rotation_vector, translation_vector, camera_matrix, dist_coeffs)


		for p in image_points:
			#cv2.circle(img, (int(p[0]), int(p[1])), 3, (0,0,255), -1)

			p1 = ( int(image_points[0][0]), int(image_points[0][1]))
			p2 = ( int(nose_end_point2D[0][0][0]), int(nose_end_point2D[0][0][1]))
			cv2.line(img, p1, p2, (255,0,0), 2)

	cv2.imshow("src",img)
	cv2.waitKey(50)
	cv2.imwrite("dst.jpg",img)


cap.release()

1.4检测结果:

pnp

2,用3DMM参数人脸姿态估计

2.1参数估计

3DMM参数常用于人脸重建,是由一个固定的标准模型经过线性变换得到特定形状,纹理的人脸模型。任意的人脸模型可以由数据集中的m个人脸模型进行加权组合如下:

【人脸姿态】2D人脸姿态估计的两种方式:solvePnP与3DMM参数

【人脸姿态】2D人脸姿态估计的两种方式:solvePnP与3DMM参数

这里的形状包括手势和表情。

这里我们加载标准模型和CNN模型推理出3DMM参数(估计出的模型要根据标准模型的方差和平均模型估算),这里包括62个3DMM参数(12个姿态向量,40个姿态向量,10个表情向量),然后选用3DMM参数中姿态向量转化提取出旋转向量,再用旋转向量转化为欧拉角就得出姿态。

 # Crop image, forward to get the param
        param_lst = []
        roi_box_lst = []

        crop_policy = kvs.get('crop_policy', 'box')
        for obj in objs:
            if crop_policy == 'box':
                # by face box
                roi_box = parse_roi_box_from_bbox(obj)
            elif crop_policy == 'landmark':
                # by landmarks
                roi_box = parse_roi_box_from_landmark(obj)
            else:
                raise ValueError(f'Unknown crop policy {crop_policy}')

            roi_box_lst.append(roi_box)
            img = crop_img(img_ori, roi_box)
            img = cv2.resize(img, dsize=(self.size, self.size), interpolation=cv2.INTER_LINEAR)
            inp = self.transform(img).unsqueeze(0)

            if self.gpu_mode:
                inp = inp.cuda(device=self.gpu_id)

            if kvs.get('timer_flag', False):
                end = time.time()
                param = self.model(inp)
                elapse = f'Inference: {(time.time() - end) * 1000:.1f}ms'
                print(elapse)
            else:
                param = self.model(inp)

            param = param.squeeze().cpu().numpy().flatten().astype(np.float32)
            param = param * self.param_std + self.param_mean  # re-scale
            # print('output', param)
            param_lst.append(param)

2.2参数转化

得到12pos参数进行分解,转换成欧拉角

def P2sRt(P):
    """ decompositing camera matrix P.
    Args:
        P: (3, 4). Affine Camera Matrix.
    Returns:
        s: scale factor.
        R: (3, 3). rotation matrix.
        t2d: (2,). 2d translation.
    """
    t3d = P[:, 3]
    R1 = P[0:1, :3]
    R2 = P[1:2, :3]
    s = (np.linalg.norm(R1) + np.linalg.norm(R2)) / 2.0
    r1 = R1 / np.linalg.norm(R1)
    r2 = R2 / np.linalg.norm(R2)
    r3 = np.cross(r1, r2)

    R = np.concatenate((r1, r2, r3), 0)
    return s, R, t3d


def matrix2angle(R):
    """ compute three Euler angles from a Rotation Matrix. Ref: http://www.gregslabaugh.net/publications/euler.pdf
    refined by: https://stackoverflow.com/questions/43364900/rotation-matrix-to-euler-angles-with-opencv
    todo: check and debug
     Args:
         R: (3,3). rotation matrix
     Returns:
         x: yaw
         y: pitch
         z: roll
     """
    if R[2, 0] > 0.998:
        z = 0
        x = np.pi / 2
        y = z + atan2(-R[0, 1], -R[0, 2])
    elif R[2, 0] < -0.998:
        z = 0
        x = -np.pi / 2
        y = -z + atan2(R[0, 1], R[0, 2])
    else:
        x = asin(R[2, 0])
        y = atan2(R[2, 1] / cos(x), R[2, 2] / cos(x))
        z = atan2(R[1, 0] / cos(x), R[0, 0] / cos(x))

    return x, y, z


def calc_pose(param):
    P = param[:12].reshape(3, -1)  # camera matrix
    s, R, t3d = P2sRt(P)
    P = np.concatenate((R, t3d.reshape(3, -1)), axis=1)  # without scale
    pose = matrix2angle(R)
    pose = [p * 180 / np.pi for p in pose]

    return P, pose

2.3检测结果

face

版权声明:本文为博主AI小白龙原创文章,版权归属原作者,如果侵权,请联系我们删除!

原文链接:https://blog.csdn.net/qq_34106574/article/details/123285617

共计人评分,平均

到目前为止还没有投票!成为第一位评论此文章。

(0)
青葱年少的头像青葱年少普通用户
上一篇 2022年3月8日 下午8:26
下一篇 2022年3月8日

相关推荐