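Upon closer inspection, your code looks correct to me, which makes one wonder whether the original authors had an off-by-one bug. I guess someone ought to look at how OpenCV implements it!
Nonetheless, one suggestion to make it easier to understand is to flip the order of the for loops: go over all sizes first, then loop over the possible locations for each size: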
#include <stdio.h>

int main()
{
    int i, x, y, sizeX, sizeY, width, height, count, c;

    /* All five shape types */
    const int features = 5;
    const int feature[][2] = {{2,1}, {1,2}, {3,1}, {1,3}, {2,2}};
    const int frameSize = 24;

    count = 0;
    /* Each shape */
    for (i = 0; i < features; i++) {
        sizeX = feature[i][0];
        sizeY = feature[i][1];
        printf("%dx%d shapes:\n", sizeX, sizeY);

        /* each size (multiples of basic shapes) */
        for (width = sizeX; width <= frameSize; width+=sizeX) {
            for (height = sizeY; height <= frameSize; height+=sizeY) {
                printf("\tsize: %dx%d => ", width, height);
                c=count;

                /* each possible position given size */
                for (x = 0; x <= frameSize-width; x++) {
                    for (y = 0; y <= frameSize-height; y++) {
                        count++;
                    }
                }
                printf("count: %d\n", count-c);
            }
        }
    }
    printf("%d\n", count);

    return 0;
}
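This gives the same result as before: 162336.
To verify it, I tested the case of a 4x4 window and checked all cases by hand (they are easy to count, since the 1x2/2x1 and 1x3/3x1 shapes are the same shapes rotated by 90 degrees):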
2x1 shapes:
        size: 2x1 => count: 12
        size: 2x2 => count: 9
        size: 2x3 => count: 6
        size: 2x4 => count: 3
        size: 4x1 => count: 4
        size: 4x2 => count: 3
        size: 4x3 => count: 2
        size: 4x4 => count: 1
1x2 shapes:
        size: 1x2 => count: 12          +-----------------------+
        size: 1x4 => count: 4           |     |     |     |     |
        size: 2x2 => count: 9           |     |     |     |     |
        size: 2x4 => count: 3           +-----+-----+-----+-----+
        size: 3x2 => count: 6           |     |     |     |     |
        size: 3x4 => count: 2           |     |     |     |     |
        size: 4x2 => count: 3           +-----+-----+-----+-----+
        size: 4x4 => count: 1           |     |     |     |     |
3x1 shapes:                             |     |     |     |     |
        size: 3x1 => count: 8           +-----+-----+-----+-----+
        size: 3x2 => count: 6           |     |     |     |     |
        size: 3x3 => count: 4           |     |     |     |     |
        size: 3x4 => count: 2           +-----------------------+
1x3 shapes:
        size: 1x3 => count: 8           Total Count = 136
        size: 2x3 => count: 6
        size: 3x3 => count: 4
        size: 4x3 => count: 2
2x2 shapes:
        size: 2x2 => count: 9
        size: 2x4 => count: 3
        size: 4x2 => count: 3
        size: 4x4 => count: 1
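I have been implementing an adaptation of Viola-Jones' face detection algorithm. The technique relies on placing a subframe of 24x24 pixels within an image, and then placing rectangular features inside it at every position and in every size possible.
These features can consist of two, three or four rectangles. The following examples are presented.
They claim the exhaustive set is more than 180k (section 2):
The following statements are not explicitly stated in the paper, so they are assumptions on my part:
Based on these assumptions, I've counted the exhaustive set:
The result is 162,336.
The only way I found to approximate the "over 180,000" Viola & Jones speak of is to drop assumption #4 and introduce bugs in the code. This involves changing four lines respectively to:
The result is then 180,625. (Note that this effectively prevents the features from ever touching the right and/or bottom edge of the subframe.)
Now of course the question is: have they made a mistake in their implementation? Does it make any sense to consider features with a surface of zero? Or am I seeing it the wrong way?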