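Having recently studied tracking algorithms such as CSK, CN, and KCF, I find that their main pipelines are much the same and can be broken down into roughly the following steps:
In frame I_t, sample around the current position p_t and train a regressor. This regressor can compute the response of a small sampled window.
In frame I_{t+1}, sample around the previous frame's position p_t and use that regressor to evaluate the response of each sample.
The sample with the strongest response is taken as the position p_{t+1} for this frame.
The basic framework is shown in the figure below: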
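To make the three steps concrete, here is a minimal sketch on toy data. It is not the KCF reference code: the random 32x32 patch, the label bandwidth, the regularization value `lambda`, and the simulated shift are all assumptions chosen for illustration, and it uses a single channel with the linear kernel for brevity.

```matlab
% Toy train/detect cycle with a linear-kernel correlation filter.
% All data and parameter values here are illustrative assumptions.
sz = 32;
x = randn(sz, sz);                        % "frame t" patch (hypothetical data)

% Gaussian-shaped regression target whose peak sits at element (1,1)
% and wraps around the borders.
[rs, cs] = ndgrid(0:sz-1, 0:sz-1);
rs = min(rs, sz - rs);
cs = min(cs, sz - cs);
y = exp(-0.5 * (rs.^2 + cs.^2) / 2^2);

% Step 1: train the regressor in the Fourier domain.
lambda = 1e-4;                            % assumed regularization value
xf = fft2(x);
kf = xf .* conj(xf) / numel(xf);          % linear kernel auto-correlation
alphaf = fft2(y) ./ (kf + lambda);

% Step 2: evaluate every cyclic shift of the "frame t+1" patch at once.
z = circshift(x, [3, 5]);                 % target moved 3 rows down, 5 columns right
zf = fft2(z);
kzf = zf .* conj(xf) / numel(zf);
response = real(ifft2(alphaf .* kzf));

% Step 3: the strongest response gives the displacement.
[dr, dc] = find(response == max(response(:)), 1);
disp([dr - 1, dc - 1]);                   % should recover the simulated shift: 3 5
```

Because every cyclic shift is evaluated with a single pair of FFTs, there is no explicit loop over candidate windows.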
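Next, let's go through some of the concrete code in KCF:
1. Image feature extraction: KCF provides two kinds of features (HOG and gray-level)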
if features.hog,
    %HOG features, from Piotr's Toolbox
    x = double(fhog(single(im) / 255, cell_size, features.hog_orientations));
    x(:,:,end) = [];  %remove all-zeros channel ("truncation feature")
end

if features.gray,
    %gray-level (scalar feature)
    x = double(im) / 255;
    x = x - mean(x(:));
end
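The gray-level branch can be tried without Piotr's Toolbox. The sketch below applies it to a made-up 8x8 `uint8` patch (the patch is an assumption, built from `magic` just to have some values) and checks that the resulting feature has zero mean, so the constant brightness offset no longer contributes to the correlation.

```matlab
% Gray-level feature on a hypothetical 8x8 patch.
im = uint8(magic(8) * 3);      % toy image, values stay within 0..255

x = double(im) / 255;          % scale to [0, 1]
x = x - mean(x(:));            % remove the mean (DC component)

fprintf('mean after centering: %g\n', mean(x(:)));
fprintf('feature size: %d x %d\n', size(x, 1), size(x, 2));
```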
2. Three kernel functions are provided: Gaussian, polynomial, and linear
function kf = gaussian_correlation(xf, yf, sigma)
    N = size(xf,1) * size(xf,2);
    xx = xf(:)' * xf(:) / N;  %squared norm of x
    yy = yf(:)' * yf(:) / N;  %squared norm of y

    %cross-correlation term in Fourier domain
    xyf = xf .* conj(yf);
    xy = sum(real(ifft2(xyf)), 3);  %to spatial domain

    %calculate gaussian response for all positions, then go back to the
    %Fourier domain
    kf = fft2(exp(-1 / sigma^2 * max(0, (xx + yy - 2 * xy) / numel(xf))));
end

function kf = polynomial_correlation(xf, yf, a, b)
    xyf = xf .* conj(yf);
    xy = sum(real(ifft2(xyf)), 3);  %to spatial domain

    %calculate polynomial response for all positions, then go back to the
    %Fourier domain
    kf = fft2((xy / numel(xf) + a) .^ b);
end

function kf = linear_correlation(xf, yf)
    kf = sum(xf .* conj(yf), 3) / numel(xf);
end
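To convince myself of what the Gaussian version computes, the sketch below repeats its arithmetic inline for a single-channel 8x8 patch and compares it with a brute-force loop over all cyclic shifts. The random patches and `sigma = 0.5` are assumptions for illustration; for one channel, `numel(xf)` equals `N`.

```matlab
% Fast Gaussian kernel correlation vs. brute force over cyclic shifts.
% Patches and sigma are illustrative assumptions.
x = randn(8, 8);
y = randn(8, 8);
sigma = 0.5;

xf = fft2(x);
yf = fft2(y);
N = numel(x);

xx = real(xf(:)' * xf(:)) / N;            % Parseval: equals sum(x(:).^2)
yy = real(yf(:)' * yf(:)) / N;            % Parseval: equals sum(y(:).^2)
xy = real(ifft2(xf .* conj(yf)));         % correlation for every cyclic shift
k_fast = exp(-1 / sigma^2 * max(0, (xx + yy - 2 * xy) / N));

% Brute force: Gaussian of the squared distance to each shifted copy of y.
k_slow = zeros(8, 8);
for dr = 0:7
    for dc = 0:7
        ys = circshift(y, [dr, dc]);
        k_slow(dr + 1, dc + 1) = exp(-sum((x(:) - ys(:)).^2) / (sigma^2 * N));
    end
end

fprintf('max abs difference: %g\n', max(abs(k_fast(:) - k_slow(:))));
```

The two results should agree up to floating-point error, which shows that the kernel is evaluated for all cyclic shifts at the cost of a few FFTs. The function in the listing additionally returns `fft2` of this map.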
3. Computing the response and predicting the center coordinates
switch kernel.type
case 'gaussian',
    kzf = gaussian_correlation(zf, model_xf, kernel.sigma);
case 'polynomial',
    kzf = polynomial_correlation(zf, model_xf, kernel.poly_a, kernel.poly_b);
case 'linear',
    kzf = linear_correlation(zf, model_xf);
end
response = real(ifft2(model_alphaf .* kzf));  %equation for fast detection

[vert_delta, horiz_delta] = find(response == max(response(:)), 1);
if vert_delta > size(zf,1) / 2,  %wrap around to negative
    vert_delta = vert_delta - size(zf,1);
end
if horiz_delta > size(zf,2) / 2,
    horiz_delta = horiz_delta - size(zf,2);
end
pos = pos + cell_size * [vert_delta - 1, horiz_delta - 1];
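The wrap-around handling is easy to misread, so here is a small worked example with a hand-made response map. The 8x8 size, the peak location, `cell_size = 4`, and the starting `pos` are all assumptions for illustration. Because the response is cyclic, a peak in the last row means "one cell up", not "seven cells down".

```matlab
% Converting a peak index into a signed displacement (toy values).
response = zeros(8, 8);
response(8, 2) = 1;            % hypothetical peak: last row, second column
cell_size = 4;                 % assumed cell size in pixels
pos = [100, 120];              % assumed previous position [row, column]

[vert_delta, horiz_delta] = find(response == max(response(:)), 1);

% Indices past the midpoint represent negative shifts.
if vert_delta > size(response, 1) / 2
    vert_delta = vert_delta - size(response, 1);     % 8 -> 0
end
if horiz_delta > size(response, 2) / 2
    horiz_delta = horiz_delta - size(response, 2);   % 2 stays 2
end

% Subtract 1 because index 1 corresponds to zero displacement.
pos = pos + cell_size * [vert_delta - 1, horiz_delta - 1];
disp(pos);                     % 96 124: one cell up, one cell right
```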