使用PCA旋转图像

mac2025-03-20 11

1.PCA简介

PCA是一种常用的降维技术，通过低维的特征向量来表示原数据。使用PCA降维以及重建的一般方法：(1) 计算原数据的协方差矩阵(python 可以使用numpy.linalg.cov计算)；(2) 计算协方差矩阵的特征值以及特征向量；(3) 取前d个特征向量组成转换矩阵 $A=[a_1,a_2,...,a_d]$ 。之后使用 $y=A^{T}x$ 进行降维，使用 $\widetilde{x}=Ay=A(A^{T}x)$ 进行重建。示例图如下所示。

2.使用PCA旋转图像

通过PCA可以获得原始数据的特征向量，而最大的特征向量则对应图像内容的主方向，从而使用旋转矩阵[1]可以将图像的内容旋转至与坐标轴平行。与使用PCA降维不同，使用PCA旋转图像使用的不是原图片的像素，而是图像中目标像素的位置。例如下面的图像，首先将图像转换为灰度图像，之后设定判断的阈值，将大于阈值的像素点的位置记录下来。之后的操作将在这些记录的位置上进行。旋转操作的过程：(1)对位置矩阵(n*2维)求协方差矩阵，并求出协方差矩阵的特征向量；(2)构建旋转矩阵；(3) 使用旋转矩阵左乘每个位置(x,y) 得到旋转后的位置 $(\widetilde{x},\widetilde{y})$ ；(4) 将目标移动至图像的中心位置。在具体实现中，麻烦的一点在于图像的(0,0)位置在左上角，但是一般做几何变换操作都习惯使用左下角为坐标轴原点。

3.实现

代码中，首先将图像转为灰度图像，之后按照设定的阈值记录目标所在的位置，接下来就是使用PCA求出特征向量，并使用构建的旋转矩阵求出旋转后像素的新位置。 main函数中在原始图像上绘制了两条线，分别是主方向和与主方向垂直的方向。最终，旋转后的图像，其中的目标将会垂直于坐标轴。

import numpy as np import matplotlib.pyplot as plt def hack_pca(filename): ''' Input: filename -- input image file name/path Output: img -- image without rotation ''' img_r = (plt.imread(filename)).astype(np.float64) # YOUR CODE HERE # begin answer # convert to gray image H, W, _ = np.shape(img_r) img_gray = np.zeros([H, W], dtype=float) for i in range(H): for j in range(W): img_gray[i, j] = 0.3*img_r[i, j, 0] + \ 0.59*img_r[i, j, 1]+0.11*img_r[i, j, 2] # record the place not zero(or some threshold) data_points = [] threshold = 100 for i in range(H): for j in range(W): if img_gray[i, j] > threshold: data_points.append([i, j]) data_points = np.asarray(data_points) cov_mat = np.cov(data_points.T) eig_val, eig_vec = np.linalg.eig(cov_mat) rot_mat = np.array([[eig_vec[0,0], eig_vec[0,1]], [eig_vec[1,0], eig_vec[1,1]]]).T place_mat = np.zeros([H, W, 2], dtype=int) for i in range(H): for j in range(W): place_mat[i, j] = rot_mat.dot(np.array([i, j])) img_parallel=np.zeros([H,W], dtype=float) mean_x=np.mean(place_mat[:,:,0]) mean_y =np.mean(place_mat[:,:,1]) for i in range(H): for j in range(W): place_mat[i,j] = place_mat[i,j] - np.array([mean_x,mean_y])+np.array([H/2, W/2]) for i in range(H): for j in range(W): if 0 <= place_mat[i,j,0] < H and 0<= place_mat[i,j,1] < W: img_parallel[place_mat[i,j,0],place_mat[i,j,1]]=img_gray[i,j] return img_parallel, eig_vec,img_gray # end answer if __name__ == "__main__": img,eig_vec,raw_img = hack_pca("1.gif") plt.figure(figsize=[8,6]) # main direction k1=eig_vec[0,0]/eig_vec[0,1] xs=[i for i in range(0,30)] ys=[k1*e for e in xs] for i in range(len(xs)): raw_img[int(xs[i]), int(ys[i])]=0 # the orthogonal direction of main direction k2=eig_vec[1,0]/eig_vec[1,1] ys=[k2*e for e in xs] for i in range(len(xs)): raw_img[int(xs[i]), int(ys[i])]=255 plt.subplot(121) plt.imshow(raw_img) plt.subplot(122) plt.imshow(img) plt.show()

4.结果

左边的图像是原图像，图像上那条孔洞直线是主方向，与之垂直的是另一个特征向量的方向。

参考： [1]Creating a rotation matrix in NumPy [2]Rotation and Scaling Image Using PCA [3]principal-direction-in-a-binary-image

最新回复(0)