cs131 hw6
Main topics:
- image compression using SVD
- kNN methods for image recognition
- PCA and LDA to improve kNN
References:
- wwdguu/CS131_homework
- Solutions-Stanford-cs131-Computer-Vision-Foundations-and-Application/hw6_release/
1. Image Compression
1.1 Compressing an image with SVD
Function: def compress_image(image, num_values)
Arguments:
- image: (H, W) input image
- num_values: number of singular values to keep
Returns:
- compressed_image: (H, W) the image reconstructed from the truncated SVD
- compressed_size: number of values needed to store the compressed representation
Implementation:
- Get the SVD of the image
- Keep only the top num_values singular values and compute compressed_image
- Compute the compressed size
import numpy as np

def compress_image(image, num_values):
    compressed_image = None
    compressed_size = 0
    # YOUR CODE HERE
    H, W = image.shape
    # 1. Get the SVD of the image
    u, s, vt = np.linalg.svd(image)
    # 2. Keep only the top `num_values` singular values and reconstruct the image
    sigma = np.diag(s[:num_values])  # truncated singular values as a diagonal matrix
    compressed_image = u[:, :num_values].dot(sigma).dot(vt[:num_values, :])  # (H, n) * (n, n) * (n, W)
    # 3. Compute the compressed size: we only store (H, n), the n singular values, and (n, W)
    compressed_size = H * num_values + num_values + num_values * W
    # END YOUR CODE
    assert compressed_image.shape == image.shape, \
        "Compressed image and original image don't have the same shape"
    assert compressed_size > 0, "Don't forget to compute compressed_size"
    return compressed_image, compressed_size
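As a quick usage check, a minimal sketch (the random "image" and the choices of num_values are illustrative placeholders, not part of the assignment) that compares the compressed size against the original H * W values:

import numpy as np

image = np.random.rand(64, 96)  # stands in for a real grayscale image
H, W = image.shape
for n in (1, 5, 20):
    approx, size = compress_image(image, n)
    error = np.linalg.norm(image - approx)  # Frobenius reconstruction error
    ratio = size / (H * W)                  # fraction of the original storage
    print("n=%2d  stored %5d/%d values (%.1f%%), error %.3f"
          % (n, size, H * W, 100 * ratio, error))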
2. k-Nearest Neighbor
Classify with the kNN algorithm, using the raw image features (e.g. pixel values).
Steps:
- compute the L2 distance between every pair of elements from X_train and X_test
- split the training set into 5 folds for cross-validation
- for each fold and for different values of k, predict labels and measure the accuracy
- finally pick the best k and evaluate the accuracy
2.1 compute_distances
Function: def compute_distances(X1, X2)
Arguments:
- X1: (M, D), X2: (N, D)
Implementation:
- the L2 distance is the Euclidean distance; it can be vectorized using (x - y)^2 = x^2 + y^2 - 2xy
def compute_distances(X1, X2):
    M = X1.shape[0]
    N = X2.shape[0]
    assert X1.shape[1] == X2.shape[1]
    dists = np.zeros((M, N))
    # YOUR CODE HERE
    # Vectorized trick: ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y
    x_2 = np.sum(X1**2, axis=1)[:, np.newaxis]  # shape (M, 1); the extra axis enables broadcasting
    y_2 = np.sum(X2**2, axis=1)                 # shape (N,)
    xy = np.dot(X1, X2.T)                       # shape (M, N)
    # Clip tiny negative values caused by floating-point error before the sqrt.
    dists = np.sqrt(np.maximum(x_2 + y_2 - 2 * xy, 0))
    # END YOUR CODE
    assert dists.shape == (M, N), "dists should have shape (M, N), got %s" % dists.shape
    return dists
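A minimal sanity check (the small random inputs are illustrative): compare the vectorized result against a naive double loop.

import numpy as np

X1 = np.random.rand(5, 3)
X2 = np.random.rand(4, 3)
dists = compute_distances(X1, X2)

# Naive reference implementation for comparison.
naive = np.zeros((5, 4))
for i in range(5):
    for j in range(4):
        naive[i, j] = np.linalg.norm(X1[i] - X2[j])
assert np.allclose(dists, naive)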
2.2 predict_labels
Function: def predict_labels(dists, y_train, k=1)
Arguments:
- dists: (num_test, num_train)
Implementation:
- for each test image, find the nearest training images; among the k most similar images, the majority class is the predicted class of the test image
def predict_labels(dists, y_train, k=1):
    num_test, num_train = dists.shape
    y_pred = np.zeros(num_test, dtype=int)  # np.int is deprecated; use the builtin int
    for i in range(num_test):
        # A list of length k storing the labels of the k nearest neighbors to
        # the ith test point.
        closest_y = []
        # Use the distance matrix to find the k nearest neighbors of the ith
        # testing point, and use self.y_train to find the labels of these
        # neighbors. Store these labels in closest_y.
        # Hint: Look up the function numpy.argsort.
        # Now that you have found the labels of the k nearest neighbors, you
        # need to find the most common label in the list closest_y of labels.
        # Store this label in y_pred[i]. Break ties by choosing the smaller
        # label.
        # YOUR CODE HERE
        indices = np.argsort(dists[i])    # training indices sorted by distance
        closest_y = y_train[indices[:k]]  # labels of the k nearest neighbors
        # y_train contains repeated labels; in this assignment there are 800 images in 16 classes.
        # np.bincount counts each label and argmax picks the most frequent one
        # (breaking ties toward the smaller label, as required).
        y_pred[i] = np.bincount(closest_y).argmax()
        # END YOUR CODE
    return y_pred
2.3 cross-validation
We cannot know the best number of neighbors k ahead of time, so we choose it via cross-validation.
Split the whole training set into 5 folds; in each fold:
- 80%: training set
- 20%: validation set
Function: def split_folds(X_train, y_train, num_folds)
def split_folds(X_train, y_train, num_folds):
    """Split up the training data into `num_folds` folds.
    The goal of the functions is to return training sets (features and labels) along with
    corresponding validation sets. In each fold, the validation set will represent (1/num_folds)
    of the data while the training set represent (num_folds-1)/num_folds.
    If num_folds=5, this corresponds to a 80% / 20% split.
    For instance, if X_train = [0, 1, 2, 3, 4, 5], and we want three folds, the output will be:
        X_trains = [[2, 3, 4, 5],
                    [0, 1, 4, 5],
                    [0, 1, 2, 3]]
        X_vals = [[0, 1],
                  [2, 3],
                  [4, 5]]
    Args:
        X_train: numpy array of shape (N, D) containing N examples with D features each
        y_train: numpy array of shape (N,) containing the label of each example
        num_folds: number of folds to split the data into
    Returns:
        X_trains: numpy array of shape (num_folds, train_size * (num_folds-1) / num_folds, D)
        y_trains: numpy array of shape (num_folds, train_size * (num_folds-1) / num_folds)
        X_vals: numpy array of shape (num_folds, train_size / num_folds, D)
        y_vals: numpy array of shape (num_folds, train_size / num_folds)
    """
    assert X_train.shape[0] == y_train.shape[0]
    validation_size = X_train.shape[0] // num_folds
    training_size = X_train.shape[0] - validation_size
    X_trains = np.zeros((num_folds, training_size, X_train.shape[1]))
    y_trains = np.zeros((num_folds, training_size), dtype=int)
    X_vals = np.zeros((num_folds, validation_size, X_train.shape[1]))
    y_vals = np.zeros((num_folds, validation_size), dtype=int)
    # YOUR CODE HERE
    # Hint: You can use the numpy array_split function.
    # For example, with folds [1], [2], [3], [4], [5], the five splits are:
    #   train on [2,3,4,5], validate on [1]
    #   train on [1,3,4,5], validate on [2]
    #   train on [1,2,4,5], validate on [3]
    #   train on [1,2,3,5], validate on [4]
    #   train on [1,2,3,4], validate on [5]
    X_folds = np.array(np.array_split(X_train, num_folds))
    y_folds = np.array(np.array_split(y_train, num_folds))
    for i in range(num_folds):
        mask = np.arange(num_folds) != i  # every fold except fold i
        X_trains[i] = X_folds[mask].reshape((-1, X_train.shape[1]))
        y_trains[i] = y_folds[mask].reshape(-1)
        X_vals[i] = X_folds[i]
        y_vals[i] = y_folds[i]
    # END YOUR CODE
    return X_trains, y_trains, X_vals, y_vals
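The model-selection loop itself is not reproduced here; a minimal sketch of how the pieces fit together (the candidate k values are illustrative, and X_train / y_train are assumed to be loaded as in the notebook):

import numpy as np

num_folds = 5
k_choices = [1, 3, 5, 10, 20]  # illustrative candidates
X_trains, y_trains, X_vals, y_vals = split_folds(X_train, y_train, num_folds)

k_to_accuracy = {}
for k in k_choices:
    accuracies = []
    for i in range(num_folds):
        dists = compute_distances(X_vals[i], X_trains[i])
        y_pred = predict_labels(dists, y_trains[i], k=k)
        accuracies.append(np.mean(y_pred == y_vals[i]))
    k_to_accuracy[k] = np.mean(accuracies)

best_k = max(k_to_accuracy, key=k_to_accuracy.get)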
3. PCA
Summary of the PCA method:
- standardize the data
- compute the eigenvectors and eigenvalues
- sort the eigenvalues in descending order, then select the top k eigenvalues and their corresponding eigenvectors
- build the projection matrix W from the selected k eigenvectors
- project the original dataset X through W onto the k-dimensional subspace Y
The subsections below implement these pieces; a compact end-to-end sketch follows this list.
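A minimal sketch of the whole recipe in plain numpy, independent of the PCA class below (the data and k are placeholders):

import numpy as np

X = np.random.rand(100, 8)  # placeholder data: 100 samples, 8 features
k = 3

X_centered = X - X.mean(axis=0)                    # 1. standardize (center) the data
cov = X_centered.T.dot(X_centered) / (len(X) - 1)  # covariance matrix, shape (D, D)
e_vals, e_vecs = np.linalg.eigh(cov)               # 2. eigenvalues / eigenvectors (ascending)
order = np.argsort(e_vals)[::-1]                   # 3. sort in descending order
W = e_vecs[:, order[:k]]                           # 4. projection matrix from the top-k eigenvectors
Y = X_centered.dot(W)                              # 5. project onto the k-dimensional subspace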
3.1 Eigen decomposition
Class: class PCA(object)
Function: def _eigen_decomp(self, X)
Returns:
- e_vecs: eigenvectors, (D, D)
- e_vals: eigenvalues, (D,)
Implementation:
- perform an eigendecomposition of the feature covariance matrix
def _eigen_decomp(self, X):
    N, D = X.shape
    e_vecs = None
    e_vals = None
    # YOUR CODE HERE
    # Steps:
    # 1. Compute the covariance matrix of X, of shape (D, D)
    #    (X^T X / (N - 1) is the covariance matrix when X is centered)
    covariance = np.matmul(X.T, X) / (N - 1)
    # 2. Compute the eigenvalues and eigenvectors of the covariance matrix
    e_vals, e_vecs = np.linalg.eig(covariance)
    # 3. Sort both of them in decreasing order (ex: 1.0 > 0.5 > 0.0 > -0.2 > -1.2)
    indices = np.argsort(-e_vals)      # indices sorting eigenvalues from largest to smallest
    e_vals = np.real(e_vals[indices])  # keep the real part
    e_vecs = np.real(e_vecs[:, indices])
    # END YOUR CODE
    # Check the output shapes
    assert e_vals.shape == (D,)
    assert e_vecs.shape == (D, D)
    return e_vecs, e_vals
3.2 SVD
Function: def _svd(self, X)
Returns:
- vecs: singular vectors, (D, D)
- vals: singular values, (K,)
Implementation:
- perform an SVD of X
def _svd(self, X):
    vecs = None  # shape (D, D)
    N, D = X.shape
    vals = None  # shape (K,)
    # YOUR CODE HERE
    # Here, compute the SVD of X.
    # Make sure to return vecs as the matrix of vectors where each column is a singular vector.
    # The columns of U are eigenvectors of X X^T, the columns of V are eigenvectors of X^T X,
    # and the singular values are the square roots of their shared nonzero eigenvalues.
    u, s, vt = np.linalg.svd(X)
    vals = s
    vecs = vt.T  # transpose so that each column is a right singular vector
    # END YOUR CODE
    assert vecs.shape == (D, D)
    K = min(N, D)
    assert vals.shape == (K,)
    return vecs, vals
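The two decompositions are consistent: for centered X, the squared singular values divided by N - 1 equal the covariance eigenvalues, and the singular vectors match the eigenvectors up to sign. A quick check, assuming pca is an instance of the PCA class above (the random data is a placeholder):

import numpy as np

X = np.random.rand(50, 6)
X_centered = X - X.mean(axis=0)

e_vecs, e_vals = pca._eigen_decomp(X_centered)
vecs, vals = pca._svd(X_centered)

# Squared singular values / (N - 1) should equal the covariance eigenvalues.
assert np.allclose(vals**2 / (len(X) - 1), e_vals)
# Singular vectors should match the eigenvectors up to sign.
assert np.allclose(np.abs(vecs), np.abs(e_vecs))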
3.3 Dimensionality Reduction
Reduce the dimensionality of the original dataset by projecting it onto the top principal components.
Function: def transform(self, X, n_components)
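The notebook's implementation is not reproduced in these notes; a minimal sketch, assuming fit has already stored the projection matrix self.W_pca and the training mean self.mean (both are used by reconstruct below):

def transform(self, X, n_components):
    N, _ = X.shape
    # Center the data with the mean stored during fit, then project
    # onto the first `n_components` principal components.
    X_proj = (X - self.mean).dot(self.W_pca[:, :n_components])
    assert X_proj.shape == (N, n_components)
    return X_proj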
3.4 Reconstruction error and captured variance
Map the reduced-dimension data back to the original space.
Function: def reconstruct(self, X_proj)
def reconstruct(self, X_proj):
    N, n_components = X_proj.shape
    X = None
    # YOUR CODE HERE
    # Steps:
    # 1. Project back onto the original space of dimension D by zero-padding
    #    the projection out to D components.
    X = np.zeros((N, self.W_pca.shape[1]))
    X[:, :n_components] = X_proj
    # 2. Invert the projection (W_pca is orthogonal, so its inverse is its
    #    transpose) and add back the mean we subtracted in `transform`.
    X = X.dot(np.linalg.inv(self.W_pca)) + self.mean
    # END YOUR CODE
    return X
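To measure the reconstruction error this section's title refers to, one can compare the round trip against the original data. A hedged sketch (pca is assumed to be a fitted instance of the class, X_train as in the earlier sections, and the n_components values are illustrative):

import numpy as np

for n_components in (1, 10, 100):
    X_proj = pca.transform(X_train, n_components)
    X_rec = pca.reconstruct(X_proj)
    error = np.mean((X_train - X_rec) ** 2)  # mean squared reconstruction error
    print("n_components=%3d  MSE=%.4f" % (n_components, error))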
4. Fisherface: Linear Discriminant Analysis
Differences between LDA and PCA:
- LDA needs labels, so it is not fully unsupervised; PCA is completely unsupervised
- PCA preserves maximum variance
- LDA preserves the discrimination between classes
4.1 Dimensionality Reduction via PCA
In this assignment N = 800 and D = 4096. To apply LDA we need D < N, so we first use PCA to reduce the dimensionality, as in the sketch below.
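A sketch of this preprocessing step, assuming the PCA class from section 3 exposes a fit method that stores self.W_pca and self.mean (the particular n_components value here is an illustrative assumption; any value below N works):

# Illustrative: reduce D = 4096 down below N = 800 before running LDA.
n_components = 512  # assumption for illustration
pca = PCA()
pca.fit(X_train)
X_train_pca = pca.transform(X_train, n_components)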
4.2 Scatter matrices
To separate the classes, compute the within-class scatter matrix Sw and the between-class scatter matrix Sb.
Function: def _within_class_scatter(self, X, y)
Implementation:
- for each class, compute its scatter matrix (an un-normalized covariance matrix)
- the result is the sum of the per-class scatter matrices
def _within_class_scatter(self, X, y):
    _, D = X.shape
    assert X.shape[0] == y.shape[0]
    scatter_within = np.zeros((D, D))
    for i in np.unique(y):
        # YOUR CODE HERE
        # Get the scatter matrix for class i, and add it to scatter_within
        X_i = X[y == i]                                # samples belonging to class i
        X_i_centered = X_i - np.mean(X_i, axis=0)      # center around the class mean
        S_i = np.matmul(X_i_centered.T, X_i_centered)  # scatter matrix of class i, shape (D, D)
        scatter_within += S_i
        # END YOUR CODE
    return scatter_within
Function: def _between_class_scatter(self, X, y)
Implementation:
- for each class, add the outer product of (class mean - overall mean) with itself, weighted by the class size
def _between_class_scatter(self, X, y):
    _, D = X.shape
    assert X.shape[0] == y.shape[0]
    scatter_between = np.zeros((D, D))
    mu = X.mean(axis=0)
    for i in np.unique(y):
        # YOUR CODE HERE
        X_i = X[y == i]              # samples of class i
        mu_i = np.mean(X_i, axis=0)  # mean of class i
        N_i = X_i.shape[0]           # number of samples in class i
        # Note: mu_i - mu is 1-D, so np.matmul would compute a scalar dot product;
        # np.outer gives the intended (D, D) rank-one matrix.
        scatter_between += N_i * np.outer(mu_i - mu, mu_i - mu)
        # END YOUR CODE
    return scatter_between
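A useful sanity check: the within-class and between-class scatter matrices sum to the total scatter matrix. A small hedged test (lda is assumed to be an instance of the LDA class; the random data is a placeholder):

import numpy as np

X = np.random.rand(30, 4)
y = np.random.randint(0, 3, size=30)

S_W = lda._within_class_scatter(X, y)
S_B = lda._between_class_scatter(X, y)
X_centered = X - X.mean(axis=0)
S_T = np.matmul(X_centered.T, X_centered)  # total scatter
assert np.allclose(S_W + S_B, S_T)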
4.3 Solving the generalized eigenvalue problem
Fit the training data: after computing the within-class scatter Sw and the between-class scatter Sb, compute the eigenvalues and eigenvectors of the matrix Sw^-1 Sb.
Function: def fit(self, X, y)
import scipy.linalg

def fit(self, X, y):
    N, D = X.shape
    scatter_between = self._between_class_scatter(X, y)
    scatter_within = self._within_class_scatter(X, y)
    e_vecs = None
    # YOUR CODE HERE
    # Solve generalized eigenvalue problem for matrices `scatter_between` and `scatter_within`
    # Use `scipy.linalg.eig` instead of numpy's eigenvalue solver.
    # Don't forget to sort the values and vectors in descending order.
    e_vals, e_vecs = scipy.linalg.eig(np.linalg.inv(scatter_within).dot(scatter_between))
    # END YOUR CODE
    sorting_order = np.argsort(e_vals)[::-1]
    e_vals = e_vals[sorting_order]
    e_vecs = e_vecs[:, sorting_order]  # sort eigenvectors from largest to smallest eigenvalue
    self.W_lda = e_vecs
    # Check that the shape of `self.W_lda` is correct
    assert self.W_lda.shape == (D, D)
    # Each column of `self.W_lda` should have norm 1 (each one is an eigenvector)
    for i in range(D):
        assert np.allclose(np.linalg.norm(self.W_lda[:, i]), 1.0)
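Since Sb has rank at most C - 1 (here C = 16 classes), only the first C - 1 Fisher directions carry discriminative information. Projecting onto them might look like the sketch below (the lda instance and X_train_pca from the section 4.1 sketch are assumptions; .real drops the imaginary part scipy.linalg.eig may return):

# Illustrative: keep only the first C - 1 = 15 informative Fisher directions.
X_train_fisher = X_train_pca.dot(lda.W_lda[:, :15].real)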