1 The ExDark dataset

For the low-light data we use ExDark, a dataset shot specifically in low-light environments for low-light object detection. It contains 7,363 images covering 10 lighting conditions ranging from extremely low light to twilight, split into 5,891 training images and 1,472 test images, with 12 object classes.


2 Loading the dataset

2.1 Approach

The ExDark dataset is all images, and a look at the original paper shows that the COCO dataset it uses is all images too, so the original data-loading code should be reusable with a few modifications.

The original code loads annotations from a JSON file, but ExDark does not come with one, so the first step is to write the labels and related information into a JSON file.

It cannot be used as-is, though: the source code relies on a ready-made COCO dataset class, and there is no such class for ExDark, so in effect I have to wrap ExDark in a dataset class of my own. One step at a time.

Wait, a thought: if I put the ExDark annotations into exactly the COCO format, can't I reuse the ready-made COCO dataset class, and even load ExDark directly through the COCO class?
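
If that works, the conversion is just a small script. Below is a minimal sketch, assuming ExDark's per-image .txt annotation files in which each object line starts with the class name followed by the box as [left, top, width, height]; exdark_to_coco and its directory arguments are my own placeholder names, not code from the paper:

import json
from pathlib import Path
from PIL import Image

# The 12 ExDark classes, mapped to COCO-style category IDs 1..12.
CLASSES = ['Bicycle', 'Boat', 'Bottle', 'Bus', 'Car', 'Cat',
           'Chair', 'Cup', 'Dog', 'Motorbike', 'People', 'Table']

def exdark_to_coco(img_dir, ann_dir, out_json):
    images, annotations = [], []
    ann_id = 1
    for img_id, img_path in enumerate(sorted(Path(img_dir).glob('*')), start=1):
        w, h = Image.open(img_path).size
        images.append({'id': img_id, 'file_name': img_path.name,
                       'width': w, 'height': h})
        # ExDark ships one "<image name>.txt" annotation file per image.
        for line in (Path(ann_dir) / (img_path.name + '.txt')).read_text().splitlines():
            parts = line.split()
            if not parts or parts[0] not in CLASSES:
                continue  # skip the "% bbGt version=3" header line
            l, t, bw, bh = map(float, parts[1:5])
            annotations.append({'id': ann_id, 'image_id': img_id,
                                'category_id': CLASSES.index(parts[0]) + 1,
                                'bbox': [l, t, bw, bh], 'area': bw * bh,
                                'iscrowd': 0, 'segmentation': []})
            ann_id += 1
    categories = [{'id': i + 1, 'name': c, 'supercategory': 'none'}
                  for i, c in enumerate(CLASSES)]
    with open(out_json, 'w') as f:
        json.dump({'images': images, 'annotations': annotations,
                   'categories': categories}, f)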

2.2 The COCO JSON format in detail

# Structure of a COCO-style JSON
{
    'info': info,
    'licenses': [license],
    'images': [image],
    'annotations': [annotation],
    'categories': [category]
}

1. images: information about every image in the dataset. Each image is a dict with the following fields:

  • id: the unique ID of the image
  • file_name: the image's file name
  • width: image width in pixels
  • height: image height in pixels
  • license: license information for the image (optional)

2. annotations: the annotations attached to the object instances in the images. Each annotation is a dict with the following fields:

  • id: the unique ID of the annotation
  • image_id: the ID of the image this annotation belongs to
  • category_id: the object's category ID, matching an entry in the categories section
  • segmentation: the object's segmentation mask, usually given as polygon or mask pixel coordinates
  • area: the object's area in pixels
  • bbox: the object's bounding box, in [x, y, width, height] format
  • iscrowd: a flag marking "crowd" regions (e.g., a group of objects annotated as a single object)

3. categories: the object categories. Each category is a dict with the following fields:

  • id: the unique ID of the category
  • name: the category's name
  • supercategory: the category's supercategory, used to group related categories

4. info: general information about the dataset, such as its name, description and version.

5. licenses: licensing information for the dataset, such as license names, IDs and URLs.

A simplified example of a COCO-format JSON:

{
    "images": [
        {
            "id": 1,
            "file_name": "image1.jpg",
            "width": 640,
            "height": 480
        },
        {
            "id": 2,
            "file_name": "image2.jpg",
            "width": 800,
            "height": 600
        }
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "segmentation": [[x1, y1, x2, y2, ...]],
            "area": 1234,
            "bbox": [x, y, width, height],
            "iscrowd": 0
        },
        {
            "id": 2,
            "image_id": 1,
            "category_id": 2,
            "segmentation": [[x1, y1, x2, y2, ...]],
            "area": 567,
            "bbox": [x, y, width, height],
            "iscrowd": 0
        }
    ],
    "categories": [
        {
            "id": 1,
            "name": "person",
            "supercategory": "human"
        },
        {
            "id": 2,
            "name": "car",
            "supercategory": "vehicle"
        }
    ],
    "info": {
        "description": "COCO 2017 dataset",
        "version": "1.0",
        "year": 2017,
        "contributor": "Microsoft COCO group",
        "url": "http://cocodataset.org"
    },
    "licenses": [
        {
            "id": 1,
            "name": "CC BY-SA 2.0",
            "url": "https://creativecommons.org/licenses/by-sa/2.0/"
        }
    ]
}

In the annotations field of the COCO format, segmentation describes the instance segmentation of an object.

Its content can be either a polygon or a binary mask, depending on how the dataset was annotated. In detail:

(1) Polygon representation

In COCO, segmentation is usually a polygon outlining the object's region: a list of coordinate points which, connected in order (clockwise or counter-clockwise), form the object's boundary.

For example, segmentation can look like this:

"segmentation": [[x1, y1, x2, y2, x3, y3, ...]]

Each (x, y) pair is a point on the polygon boundary; the points are listed in clockwise or counter-clockwise order.
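
For reference, a polygon-style segmentation can be rasterized into a binary mask with pycocotools; a small sketch with made-up coordinates and image size:

import pycocotools.mask as mask_utils

# Hypothetical rectangle-shaped polygon in a 640x480 image.
seg = [[10.0, 10.0, 100.0, 10.0, 100.0, 80.0, 10.0, 80.0]]
h, w = 480, 640

rles = mask_utils.frPyObjects(seg, h, w)  # polygon(s) -> RLE(s)
rle = mask_utils.merge(rles)              # merge parts into a single RLE
mask = mask_utils.decode(rle)             # (480, 640) uint8 array of 0/1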

(2) Binary mask representation

In some cases COCO instead uses a binary mask: a 2-d matrix in which every pixel is marked as foreground (the object, usually 1) or background (0).

For example, segmentation can then look like this:

"segmentation": {
    "size": [height, width],
    "counts": "binary_mask_encoded"
}

Here size holds the mask's height and width, and counts holds the binary mask in encoded form.

In an instance-segmentation COCO JSON, annotations → segmentation stores the mask in this counts form using RLE (run-length encoding), so recovering the mask requires an RLE decoding step. pycocotools provides the decoder:

import pycocotools.mask as mask_utils
from pycocotools.coco import COCO

# Load the JSON
coco = COCO(json_path)
image_ids = coco.getImgIds()

# Process image by image
for img_id in image_ids:
    img_info = coco.loadImgs(img_id)[0]
    ann_ids = coco.getAnnIds(imgIds=img_id)
    anns = coco.loadAnns(ann_ids)

    # Process instance by instance
    for ann in anns:
        rle = coco.annToRLE(ann)        # convert the annotation to RLE
        mask = mask_utils.decode(rle)   # decode to a full-size binary mask


3 Problems encountered

3.1 ERROR: Could not build wheels for pycocotools

The full error:

cl : Command line error D8021 : invalid numeric argument '/Wno-cpp'
error: command 'D:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.40.33807\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for pycocotools
Running setup.py clean for pycocotools
Failed to build pycocotools
ERROR: Could not build wheels for pycocotools, which is required to install pyproject.toml-based projects

Solved by following this blog post:

cl: Command line error D8021: invalid numeric argument "/Wno-cpp" (CSDN blog): https://blog.csdn.net/weixin_41010198/article/details/94053130. The gist: install pycocotools from source rather than via pip, i.e. download the source, modify setup.py to drop the offending compiler flags, then build.

3.2 subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The full error:

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\include\crt/host_config.h(153): fatal error C1189: #error:  -- unsupported Microsoft Visual Studio version! Only the versions between 2017 and 2022 (inclusive) are supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
ms_deform_attn_cuda.cu
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "F:\anaconda\anaconda3\envs\DeltaZero\lib\site-packages\torch\utils\cpp_extension.py", line 1900, in _run_ninja_build
subprocess.run(
File "F:\anaconda\anaconda3\envs\DeltaZero\lib\subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

Solved as follows:

  • In the conda environment, edit lib/python3.6/site-packages/torch/utils/cpp_extension.py, changing ['ninja', '-v'] to ['ninja', '--v'] or ['ninja', '--version']

3.3 LINK : fatal error LNK1181

The full error:

LINK : fatal error LNK1181: cannot open input file 'D:\Code\Paper-code\DINO\models\dino\ops\build\temp.win-amd64-cpython-310\Release\Code\Paper-code\DINO\models\dino\ops\src\cuda\ms_deform_attn_cuda.obj'
error: command 'D:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.40.33807\\bin\\HostX86\\x64\\link.exe' failed with exit code 1181

Solved as follows:

Turn ninja off in setup.py. (The original note showed the exact change as a screenshot, with the added lines marked in a red box; the screenshot is not reproduced here, but a sketch of the usual way follows.)
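
A minimal sketch, assuming the standard torch C++/CUDA extension build: passing use_ninja=False to BuildExtension makes setuptools fall back to the plain, non-ninja build. This is the common way to do it, not necessarily the exact lines from the red box:

from setuptools import setup
from torch.utils.cpp_extension import BuildExtension

setup(
    # ... name, ext_modules and the rest of the original setup() unchanged ...
    cmdclass={'build_ext': BuildExtension.with_options(use_ninja=False)},
)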

3.4 -- unsupported Microsoft Visual Studio version

The full error:

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\include\crt/host_config.h(153): fatal error C1189: #error:  -- unsupported Microsoft Visual Studio version! Only the versions between 2017 and 2022 (inclusive) are supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
error: command 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.8\\bin\\nvcc.exe' failed with exit code 2

The cause is a mismatch between the CUDA version and the version of Microsoft's C/C++ compiler. The fix is to edit the source file at the path below, changing the numeric bound in the _MSC_VER check to cover your own VS version.

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\include\crt\host_config.h

For VS2022 (the MSVC 14.40 toolset used here) the value is 1940. The mapping from _MSC_VER to VS versions can be checked with the following program:

#include <iostream>

int main()
{
#if _MSC_VER >= 1940
    std::cout << "Visual Studio 2022" << std::endl;
#elif _MSC_VER >= 1920
    std::cout << "Visual Studio 2019" << std::endl;
#elif _MSC_VER >= 1910
    std::cout << "Visual Studio 2017" << std::endl;
#elif _MSC_VER >= 1900
    std::cout << "Visual Studio 2015" << std::endl;
#elif _MSC_VER >= 1800
    std::cout << "Visual Studio 2013" << std::endl;
#elif _MSC_VER >= 1700
    std::cout << "Visual Studio 2012" << std::endl;
#elif _MSC_VER >= 1600
    std::cout << "Visual Studio 2010" << std::endl;
#else
    std::cout << "Unknown Version" << std::endl;
#endif
    return 0;
}

Modify the source file as follows:

#if defined(_WIN32)

#if _MSC_VER < 1910 || _MSC_VER > 1940

#error -- unsupported Microsoft Visual Studio version! Only the versions between 2017 and 2022 (inclusive) are supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.

#elif _MSC_VER >= 1910 && _MSC_VER <= 1940

#pragma message("support for this version of Microsoft Visual Studio has been deprecated! Only the versions between 2017 and 2022 (inclusive) are supported!")

#endif /* _MSC_VER < 1910 || _MSC_VER > 1940 */

#endif /* _WIN32 */
#endif /* !__NV_NO_HOST_COMPILER_CHECK */

After this, the build finally succeeds.

3.5 TypeError: iteration over a 0-d tensor

During training the following error came up:

File "D:\Code\Paper-code\DINO\models\dino\dn_components.py", line 36, in prepare_for_cdn
known_num = [sum(k) for k in known]
File "D:\Code\Paper-code\DINO\models\dino\dn_components.py", line 36, in <listcomp>
known_num = [sum(k) for k in known]
File "F:\anaconda\anaconda3\envs\DeltaZero\lib\site-packages\torch\_tensor.py", line 916, in __iter__
raise TypeError("iteration over a 0-d tensor")
TypeError: iteration over a 0-d tensor

In short, the code tries to iterate over a 0-d tensor, which is not allowed. The relevant code:

targets, dn_number, label_noise_ratio, box_noise_scale = dn_args
# positive and negative dn queries
dn_number = dn_number * 2
known = [(torch.ones_like(t['labels'])).cuda() for t in targets]
batch_size = len(known)
known_num = [sum(k) for k in known]
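
The failure is easy to reproduce in isolation; a minimal sketch:

import torch

# Python's sum() iterates its argument, and a 0-d tensor cannot be iterated.
t1 = torch.tensor([1])   # 1-d tensor with one element
print(sum(t1))           # works: tensor(1)

t0 = torch.tensor(1)     # 0-d scalar, like t['labels'] below
sum(t0)                  # TypeError: iteration over a 0-d tensor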

Printing the variables targets, dn_number, label_noise_ratio and box_noise_scale gives:

>>> "targets": [
    {
        'image_id': tensor(10, device='cuda:0'),
        'labels': tensor(1, device='cuda:0'),
        'size': tensor([ 800, 1200], device='cuda:0'),
        'orig_size': tensor([576, 864], device='cuda:0')
    },
    {
        'image_id': tensor(10, device='cuda:0'),
        'labels': tensor(11, device='cuda:0'),
        'size': tensor([ 704, 1056], device='cuda:0'),
        'orig_size': tensor([576, 864], device='cuda:0')
    }
]
>>> "dn_number": 100
>>> "label_noise_ratio": 0.5
>>> "box_noise_scale": 1.0

You can see that targets holds two detection targets that come from the same image (both have image_id 10), each as its own entry.

My hunch is that targets should effectively have two levels, [batch_size × objects-per-image]: one entry per image, with each entry's 'labels' a 1-d tensor over all objects in that image. Here each 'labels' is a 0-d scalar instead, so iterating over it inside sum(k) fails.

Why two levels? Presumably because one image can contain more than one object to detect. My modified preprocessing split the multiple objects of a single image into separate "images", which is what breaks things, so the dataset-processing code is where to look.

The JSON below confirms it: during preprocessing, two different objects from the same image were written out as two separate entries.

{
    "image_id": 2,
    "bbox": [136, 190, 79, 109],
    "category": 1,
    "file_name": "00002.png"
},
{
    "image_id": 2,
    "bbox": [219, 172, 63, 131],
    "category": 1,
    "file_name": "00002.png"
}

So the dataset preprocessing code needs to change. Found it: the loading code was based on coco_panoptic.py, but it should have been built on coco.py instead. The problematic loader:

# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
import json
import torch
import numpy as np
from PIL import Image
from pathlib import Path

from coco import make_coco_transforms
from DINO.util.box_ops import masks_to_boxes


class ExDark:
    def __init__(self, img_folder, ann_folder, ann_file, transforms=None, return_masks=False):
        with open(ann_file, 'r') as f:
            self.exdark = json.load(f)

        # Sort the "images" field so that it lines up with "annotations"
        # i.e., in alphabetical order
        # self.exdark['images'] = sorted(self.exdark['images'], key=lambda x: x['id'])
        # sanity check

        self.img_folder = img_folder
        self.ann_folder = ann_folder
        self.ann_file = ann_file
        self.transforms = transforms
        self.return_masks = return_masks

    def __getitem__(self, idx):
        ann_info = self.exdark['annotations'][idx] if "annotations" in self.exdark else self.exdark['images'][idx]
        img_path = Path(self.img_folder) / ("2015_" + ann_info['file_name'])
        ann_path = Path(self.ann_folder) / ann_info['file_name']

        img = Image.open(img_path).convert('RGB')
        w, h = img.size

        target = {'image_id': torch.tensor(ann_info['image_id'])}

        # ExDark has no segmentation masks; the mask-handling lines inherited
        # from the panoptic loader are left commented out.
        # masks = torch.as_tensor(masks, dtype=torch.uint8)
        # if self.return_masks:
        #     target['masks'] = masks
        # target["boxes"] = masks_to_boxes(masks)

        labels = torch.tensor(ann_info['category'], dtype=torch.int64)
        target['labels'] = labels

        target['size'] = torch.as_tensor([int(h), int(w)])
        target['orig_size'] = torch.as_tensor([int(h), int(w)])

        if self.transforms is not None:
            img, target = self.transforms(img, target)

        return img, target

    def __len__(self):
        return len(self.exdark['images'])

    def get_height_and_width(self, idx):
        img_info = self.exdark['images'][idx]
        height = img_info['height']
        width = img_info['width']
        return height, width


def build(image_set, args):
    img_folder_root = Path(args.exdark_path)
    ann_folder_root = Path(args.exdark_path)
    assert img_folder_root.exists(), f'provided ExDark path {img_folder_root} does not exist'
    assert ann_folder_root.exists(), f'provided ExDark path {ann_folder_root} does not exist'
    mode = 'panoptic'
    PATHS = {
        "train": ("train", Path("annotations") / f'{mode}_train.json'),
        "val": ("val", Path("annotations") / f'{mode}_val.json'),
    }

    img_folder, ann_file = PATHS[image_set]
    img_folder_path = img_folder_root / img_folder
    ann_folder = ann_folder_root / f'{mode}_{img_folder}'
    ann_file = ann_folder_root / ann_file

    dataset = ExDark(img_folder_path, ann_folder, ann_file,
                     transforms=make_coco_transforms(image_set), return_masks=args.masks)

    return dataset


if __name__ == '__main__':
    pass

See section 2, "Loading the dataset", for details; a sketch of the per-image grouping the fix needs follows.
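
A minimal sketch of the direction of the fix (my own, modeled on how coco.py builds targets, using the two entries for image 2 from the JSON above): group all annotations of one image before building its target, so 'labels' becomes a 1-d tensor:

import torch
from collections import defaultdict

# The two entries for image_id 2 from the JSON shown earlier.
annotations = [
    {'image_id': 2, 'bbox': [136, 190, 79, 109], 'category': 1, 'file_name': '00002.png'},
    {'image_id': 2, 'bbox': [219, 172, 63, 131], 'category': 1, 'file_name': '00002.png'},
]

# Group annotations by image instead of treating each one as its own image.
anns_per_image = defaultdict(list)
for ann in annotations:
    anns_per_image[ann['image_id']].append(ann)

# Build one target per image: 'labels' now covers every object in the image.
anns = anns_per_image[2]
target = {
    'image_id': torch.tensor(2),
    'labels': torch.tensor([a['category'] for a in anns], dtype=torch.int64),
    'boxes': torch.tensor([a['bbox'] for a in anns], dtype=torch.float32),
}
print(target['labels'])   # tensor([1, 1]) -> 1-d, so sum() can iterate it again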


4 The AFT module's input changes every iteration

Why does the input to the AFT module in the modified DINO differ every time, 817 on the first iteration and 856 on the second?

First: what does this number actually mean?

Printing one batch fetched from the dataset:

samples: {'tensors.shape': torch.Size([2, 3, 820, 826]), 'mask.shape': torch.Size([2, 820, 826])}
target: [{'boxes': tensor([[0.3312, 0.9495, 0.1812, 0.1010]], device='cuda:0'), 'labels': tensor([1], device='cuda:0'), 'image_id': tensor([1029], device='cuda:0'), 'area': tensor([10646.6445], device='cuda:0'), 'iscrowd': tensor([0], device='cuda:0'), 'orig_size': tensor([ 900, 1440], device='cuda:0'), 'size': tensor([704, 826], device='cuda:0')}, {'boxes': tensor([[0.2429, 0.8480, 0.2006, 0.2071],
[0.0666, 0.8188, 0.1060, 0.1850],
[0.6082, 0.4589, 0.5473, 0.9129]], device='cuda:0'), 'labels': tensor([ 2, 2, 10], device='cuda:0'), 'image_id': tensor([1335], device='cuda:0'), 'area': tensor([ 19622.2344, 9264.0713, 235995.3125], device='cuda:0'), 'iscrowd': tensor([0, 0, 0], device='cuda:0'), 'orig_size': tensor([500, 357], device='cuda:0'), 'size': tensor([820, 576], device='cuda:0')}]

In samples, the shape of the first element, tensors, breaks down as:

  • 2: batch_size
  • 3: image channels
  • 820: image height
  • 826: image width

Running the original, unmodified model, the input height and width also change from batch to batch, yet it still trains fine, so the varying length itself may be expected; I still need to pin down why. My current reading is sketched below.
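
A guess at what the varying number is (an assumption, not verified against DINO's exact feature-level layout): a flattened feature-map token count, which follows the randomly resized, padded batch size:

import math

# DINO's training transforms randomly resize each image and pad the batch to
# its max H and W, so the number of flattened feature tokens the transformer
# (and hence the AFT module) sees changes every batch. Strides illustrative.
def seq_len(h, w, strides=(8, 16, 32, 64)):
    return sum(math.ceil(h / s) * math.ceil(w / s) for s in strides)

print(seq_len(820, 826))   # this batch
print(seq_len(704, 832))   # a differently sized batch -> a different length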


5 RuntimeError: Expected weight to be a vector of size equal

The full error:

Traceback (most recent call last):
File "/root/autodl-tmp/DINO/main.py", line 401, in <module>
main(args)
File "/root/autodl-tmp/DINO/main.py", line 286, in main
train_stats = train_one_epoch(
File "/root/autodl-tmp/DINO/engine.py", line 48, in train_one_epoch
outputs = model(samples, targets)
File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/autodl-tmp/DINO/models/dino/dino.py", line 254, in forward
y = self.test2(y)
File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/normalization.py", line 273, in forward
return F.group_norm(
File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/functional.py", line 2528, in group_norm
return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: Expected weight to be a vector of size equal to the number of channels in input, but got weight of shape [256] and input of shape [256, 544, 820]

The weight vector does not match the number of channels that group_norm infers from the input, hence the error.
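
One reading of the message itself (a guess, since I have not isolated the code yet): F.group_norm treats dim 1 of its input as the channel dimension, and the reported input [256, 544, 820] is 3-d, so it sees 544 channels against a weight of size 256, which is consistent with the tensor having lost its batch dimension somewhere. A minimal repro:

import torch
import torch.nn as nn

# GroupNorm reads dim 1 as channels; num_groups=32 is an arbitrary divisor of 256.
gn = nn.GroupNorm(num_groups=32, num_channels=256)

x = torch.randn(256, 544, 820)
# gn(x)                  # RuntimeError: weight [256] vs 544 inferred channels
y = gn(x.unsqueeze(0))   # (1, 256, 544, 820): batch dim restored, works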

5.1 Analysis

What is odd is that both shapes show 256, so the mismatch is confusing. A senior labmate suggested it might be caused by loading a previously trained model.

Background: I trained aft_simple successfully the day before; training again today raises this error, and it persists even though I have since deleted the previously trained models and their folders.

Being rational about it: no panic. This error means something in the code must be wrong; I just don't know how to localize it yet.

There is one copy of the code on the AutoDL platform and one locally, which gets confusing, even though I try to sync every change to both places.

First step: make sure the two copies match. Still failing. Re-clone the code, then; no idea where the problem is.

Tried the fresh clone, which certainly has no saved checkpoints, and it still fails. Where is the problem? It must be in the newly added module.

So, back to checking the AFT code.


6 Results

6.1 AFT_Simple

IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.001
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.003
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.015
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.027
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.041
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.049

IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.001
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.004
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.013
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.021
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.029

IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.174
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.299
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.183
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.049
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.088
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.217
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.259
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.412
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.520
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.089
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.299
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.603

IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.325
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.550
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.338
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.108
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.184
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.389
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.317
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.506
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.573
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.284
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.399
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.639

IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.333
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.575
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.339
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.148
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.201
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.394
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.320
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.509
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.586
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.295
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.456
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.632

IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.385
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.633
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.409
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.146
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.232
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.449
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.345
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.542
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.622
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.338
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.484
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.674

6.2 AFT_Conv

IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.001
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.004
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.012
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.042
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.066
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.002
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.084

IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.001
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.011
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.023
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.034
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.041

IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.001
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.003
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.012
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.027
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.002
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.034