Fisher vector feature encoding

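A Fisher vector is an image feature encoding and quantization technique that can be seen as a soft, or probabilistic, version of the popular bag-of-visual-words or VLAD algorithms. Images are modelled using a visual vocabulary, which is estimated with a K-mode Gaussian mixture model (GMM) trained on low-level image features such as SIFT or ORB descriptors. The Fisher vector itself is a concatenation of the gradients of the GMM with respect to its parameters (mixture weights, means, and covariance matrices).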
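
To make the "concatenation of gradients" concrete, here is a minimal NumPy sketch of the idea. It assumes a GMM with diagonal covariances and mixture weights parametrized through a softmax, and it omits the normalization terms a full implementation would typically apply. The function name `fisher_vector_sketch` and all of its arguments are hypothetical; this is not the code behind `skimage.feature.fisher_vector`.

```python
import numpy as np


def fisher_vector_sketch(x, weights, means, sigmas):
    """Unnormalized Fisher-vector statistics for a diagonal-covariance GMM.

    Hypothetical helper for illustration only.
    x: (T, D) descriptors; weights: (K,); means, sigmas: (K, D).
    """
    # Standardized residuals of every descriptor w.r.t. every mode: (T, K, D).
    z = (x[:, None, :] - means[None, :, :]) / sigmas[None, :, :]

    # Log-density of every descriptor under every mode: (T, K).
    log_pdf = (
        -0.5 * (z**2).sum(axis=2)
        - np.log(sigmas).sum(axis=1)
        - 0.5 * x.shape[1] * np.log(2 * np.pi)
    )
    log_joint = np.log(weights) + log_pdf

    # Soft assignments (posteriors). A bag-of-visual-words model would
    # instead assign each descriptor to exactly one visual word.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    gamma = np.exp(log_joint)
    gamma /= gamma.sum(axis=1, keepdims=True)

    # Gradients of the log-likelihood, summed over all descriptors.
    d_weights = (gamma - weights).sum(axis=0)                        # (K,)
    d_means = np.einsum("tk,tkd->kd", gamma, z / sigmas)             # (K, D)
    d_sigmas = np.einsum("tk,tkd->kd", gamma, (z**2 - 1) / sigmas)   # (K, D)

    fv = np.concatenate([d_weights, d_means.ravel(), d_sigmas.ravel()])
    return gamma, fv


# Toy data: T=5 descriptors of dimension D=4 and a K=3 mode mixture.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))
weights = np.full(3, 1 / 3)
means = rng.normal(size=(3, 4))
sigmas = np.ones((3, 4))

gamma, fv = fisher_vector_sketch(x, weights, means, sigmas)
print(gamma.sum(axis=1))  # every row of soft assignments sums to 1
print(fv.shape)           # K + 2 * K * D = 27 entries
```

The length of the result depends only on K and D, not on the number of descriptors per image, which is what allows images with different numbers of keypoints to be fed to the same classifier. With K = 16 modes and, assuming 256-dimensional ORB descriptors, this layout would give 16 + 2 * 16 * 256 = 8208 features per image.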

In this example, we compute Fisher vectors for the digits dataset in scikit-learn and train a classifier on these representations.

Please note that scikit-learn is required to run this example.

              precision    recall  f1-score   support

           0       0.89      0.92      0.90        51
           1       0.67      0.82      0.73        44
           2       0.61      0.55      0.58        40
           3       0.63      0.51      0.56        53
           4       0.75      0.60      0.67        45
           5       0.52      0.70      0.60        40
           6       0.50      0.48      0.49        46
           7       0.48      0.64      0.55        39
           8       0.55      0.50      0.53        42
           9       0.62      0.50      0.56        50

    accuracy                           0.62       450
   macro avg       0.62      0.62      0.62       450
weighted avg       0.63      0.62      0.62       450
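
As a quick sanity check on how the summary rows relate to the per-class rows, the following sketch recomputes the averages for the precision column. It works from the rounded values printed above, so it can only approximate what `classification_report` computes internally from the raw predictions; the variable names are illustrative.

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
rows = [
    (0.89, 0.92, 0.90, 51),
    (0.67, 0.82, 0.73, 44),
    (0.61, 0.55, 0.58, 40),
    (0.63, 0.51, 0.56, 53),
    (0.75, 0.60, 0.67, 45),
    (0.52, 0.70, 0.60, 40),
    (0.50, 0.48, 0.49, 46),
    (0.48, 0.64, 0.55, 39),
    (0.55, 0.50, 0.53, 42),
    (0.62, 0.50, 0.56, 50),
]

# Total number of test samples.
total = sum(s for _, _, _, s in rows)

# "macro avg" is the unweighted mean over the ten classes.
macro_precision = sum(p for p, _, _, _ in rows) / len(rows)

# "weighted avg" weights each class by its support.
weighted_precision = sum(p * s for p, _, _, s in rows) / total

print(total, round(macro_precision, 2), round(weighted_precision, 2))
# prints: 450 0.62 0.63
```

The two averages differ slightly because the classes have unequal support (39 to 53 test samples each).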

from matplotlib import pyplot as plt
import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics import classification_report, ConfusionMatrixDisplay
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

from skimage.transform import resize
from skimage.feature import fisher_vector, ORB, learn_gmm


data = load_digits()
images = data.images
targets = data.target

# Resize images so that ORB detects interest points for all images
images = np.array([resize(image, (80, 80)) for image in images])

# Compute ORB descriptors for each image
descriptors = []
for image in images:
    detector_extractor = ORB(n_keypoints=5, harris_k=0.01)
    detector_extractor.detect_and_extract(image)
    descriptors.append(detector_extractor.descriptors.astype('float32'))

# Split the data into training and testing subsets
train_descriptors, test_descriptors, train_targets, test_targets = train_test_split(
    descriptors, targets
)

# Train a K-mode GMM
k = 16
gmm = learn_gmm(train_descriptors, n_modes=k)

# Compute the Fisher vectors
training_fvs = np.array(
    [fisher_vector(descriptor_mat, gmm) for descriptor_mat in train_descriptors]
)

testing_fvs = np.array(
    [fisher_vector(descriptor_mat, gmm) for descriptor_mat in test_descriptors]
)

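# Train a linear SVM classifier on the training Fisher vectors
svm = LinearSVC().fit(training_fvs, train_targets)

# Predict digit labels for the held-out Fisher vectors
predictions = svm.predict(testing_fvs)

# Report per-class precision, recall, and F1 score
print(classification_report(test_targets, predictions))

# Plot the confusion matrix for the test set
ConfusionMatrixDisplay.from_estimator(
    svm,
    testing_fvs,
    test_targets,
    cmap=plt.cm.Blues,
)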

plt.show()

Total running time of the script: (0 minutes 33.406 seconds)
