[Paper + Code] PEBAL: Pixel-wise Energy-biased Abstention Learning for Anomaly Segmentation on Complex Urban Driving Scenes


Author: Star

Also posted on CSDN: http://t.csdn.cn/P0YGb
Also posted on cnblogs: https://www.cnblogs.com/StarTwinkle/p/16571290.html

[Preliminary notes; still being updated and extended…]

Github：https://github.com/tianyu0207/PEBAL

Article

Pixel-wise Energy-biased Abstention Learning for Anomaly Segmentation on Complex Urban Driving Scenes


@article{YuanhongChen2022PixelwiseEA,
  title={Pixel-wise Energy-biased Abstention Learning for Anomaly Segmentation on  Complex Urban Driving Scenes},
  author={Yuanhong Chen and Yu Tian and Yuyuan Liu and Guansong Pang and Fengbei Liu and Gustavo Carneiro},
  journal={arXiv: Computer Vision and Pattern Recognition},
  year={2022}
}

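Ten questions about the paper:

Q1 What problem does the paper try to solve?

In complex urban scenes, accurately identify anomalous pixels (OOD, out-of-distribution objects) while keeping the classification of ID (in-distribution) objects accurate.

in-distribution: objects inside the training distribution, i.e. objects known at training time
Out-of-Distribution: anomalous objects that were not seen during training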

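Q2 Is this a new problem?

No. There is prior work on it, for example uncertainty-based methods and reconstruction-based methods.

Q3 What scientific hypothesis does the paper set out to verify?

Abstention learning with an adaptive penalty is useful for pixel-level anomaly detection.

Smoothness and sparsity constraints are useful.

Fine-tuning the model works better than retraining it.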

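Q4 What related work exists? How can it be categorized? Which researchers in this area are worth following?

By method: for example uncertainty-based and reconstruction-based approaches.

By whether anomalous pixels are introduced during training: introducing them generally gives better results, but it also has drawbacks (for example, it is impossible to cover every real-world anomaly, which can be dangerous).

Q5 What is the key to the solution proposed in the paper?

Applying image-level abstention learning to pixel-wise anomaly detection, with an adaptive penalty.

Fine-tuning the model instead of retraining it.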

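Q6 How are the experiments designed?

The overall performance of the model is evaluated on several datasets.

Ablation studies show the effectiveness of AL, of EBM, and of training the two modules jointly.

In Section 4.6 the authors pick objects that would not appear on a road, such as laptops, and train with only a single anomaly class, yet still reach SOTA performance on Fishyscapes. This is taken as evidence of the model's robustness: the OE classes do not need to be chosen carefully, so the method could be used in real-world autonomous driving systems.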

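Q7 Which datasets are used for quantitative evaluation? Is the code open source?

LostAndFound

Fishyscapes

Road Anomaly

The code is open source.

Q8 Do the experiments and results support the hypotheses well?

The ablation studies show that each module is effective and that joint training is effective.

The remaining experiments show that the method as a whole reaches SOTA performance.

Q9 What exactly does the paper contribute?

Abstention learning with an adaptive penalty, the energy-based model, fine-tuning, and the smoothness and sparsity constraints.

Q10 What comes next? What work could be taken further?

Paper reading

Glossary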

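Outlier Exposure (OE): trains the anomaly detector with an auxiliary dataset of outliers.

Energy-Based Model (EBM): a model with a defined energy function E(x, y) that is small when y is the correct output for x and large otherwise. In this paper inliers have low energy and outliers have high energy, and the energy is computed with the logsumexp operator.

LSE: \(\operatorname{logsumexp}(x)_i = \log \sum\limits_j \exp(x_{ij})\). torch.logsumexp is computed in a numerically stabilized way, which avoids overflow or underflow of the exponentials.

ECE: Expected Calibration Error. For details see https://xishansnow.github.io/posts/144efbd1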
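
To make the logsumexp entry in the glossary concrete, here is a minimal sketch in plain Python of the usual max-shift trick. It only illustrates the idea of a stabilized logsumexp; it is not the actual torch.logsumexp implementation, and the function name is my own.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if math.isinf(m):
        # All entries are -inf, or at least one is +inf: the shift is undefined.
        return m
    # Shifting by the maximum keeps every exponent <= 0, so exp() cannot overflow.
    return m + math.log(sum(math.exp(v - m) for v in values))

# A naive math.log(sum(math.exp(v) ...)) fails on both of these inputs:
# exp(1000.0) overflows, and exp(-1000.0) underflows to 0.0 so log(0.0) is invalid.
large = logsumexp([1000.0, 1000.0])    # equals 1000 + log(2)
small = logsumexp([-1000.0, -1000.0])  # equals -1000 + log(2)
```

The shifted form is mathematically identical to the naive one, because log(sum(exp(v))) = m + log(sum(exp(v - m))) for any constant m.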

Abstract

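Background: SOTA anomaly segmentation methods are based on either uncertainty estimation or reconstruction.

uncertainty:

Intuitive.

False positives: normal pixels are detected as anomalous, because some hard ID pixels lie close to the classification boundary.

False negatives: anomalies are not detected and are treated as normal pixels.

reconstruction:

Depends on the segmentation result.

The extra network is hard to train and inefficient.

If a different input causes the segmentation model to change, retraining is required, so applicability is limited.

Proposed method: PEBAL = AL + EBM, with the two modules trained jointly.

AL, Abstention Learning: the model abstains from assigning a pixel to an ID label and marks it as anomalous instead.
EBM, Energy-based Model: anomalous pixels have high energy, normal pixels have low energy.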

Model

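The model outputs a probability for each pixel:

\[ p_{\theta}(y|\text{x})_{\omega} = \frac{\exp(f_{\theta}(y;\text{x})_{\omega})}{\sum_{y' \in \{1,...,Y+1\}} \exp(f_{\theta}(y';\text{x})_{\omega})} \]
Symbol	Meaning
\(\theta\)	Model parameters
\(\omega\)	Index of a pixel in the image lattice \(\Omega\)
\(p_{\theta}(y|\text{x})_{\omega}\)	Probability of labeling pixel \(\omega\) with a label from \(\{1,...,Y+1\}\)
\(f_{\theta}(y;\text{x})_{\omega}\)	Logit of pixel \(\omega\) for class \(y\)^[1]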
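
As a small illustration of the per-pixel probability above, the sketch below applies a softmax to the Y+1 logits of one pixel, with the last entry playing the role of the abstention class. The logit values are made up for the example and the function name is my own.

```python
import math

def pixel_softmax(logits):
    """Softmax over the Y+1 logits of a single pixel (last entry = abstention class)."""
    m = max(logits)                      # shift by the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one pixel: Y = 3 inlier classes plus the abstention class.
pixel_logits = [4.0, 1.0, 0.5, -2.0]
probs = pixel_softmax(pixel_logits)
predicted_class = probs.index(max(probs))  # index 0 has the largest logit
```

In a real segmentation network the same operation runs over the class dimension of an H x W x (Y+1) logit tensor, one pixel at a time.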
Flowchart:

Symbol	Explanation
\(D^{in} = \{(x_i, y_i^{in})\}^{|D^{in}|}\)	Inlier training images and annotations, with \(x \in \mathcal{X} \subset \mathbb{R}^{H \times W \times C}\) and \(y \in \mathcal{Y} \subset [0,1]^{H \times W \times Y}\)
\(D^{out} = \{(x_i, y_i^{out})\}^{|D^{out}|}\)	Outlier training images and annotations, with \(y \in \mathcal{Y} \subset [0,1]^{H \times W \times (Y+1)}\)
PEBAL_Loss

EBM_Loss

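EBM loss: ensures that inliers have low energy and outliers have high energy.

It calibrates the inlier logits (reducing them) so that they share similar values, which also helps PAL learning.^[2]

Energy

Energy: low for inliers, high for outliers. This is achieved by minimizing \(l_{ebm}\).

Working backwards from this, the sum of exponentiated logits is larger for inliers and smaller for outliers. Outliers lean toward abstention (class Y+1), so their sum over the first Y classes is smaller.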
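
The relation between logits and energy can be sketched as follows. I take the energy to be the negative logsumexp over the Y inlier logits, which matches the glossary entry. The squared-hinge form of the loss and the margin values m_in and m_out are my own assumptions for illustration, borrowed from the usual energy-based OOD fine-tuning recipe. These notes only state the goal of the loss, not its formula.

```python
import math

def free_energy(inlier_logits):
    """E = -log sum_y exp(f(y)) over the Y inlier logits of one pixel."""
    m = max(inlier_logits)
    return -(m + math.log(sum(math.exp(v - m) for v in inlier_logits)))

def ebm_hinge_loss(inlier_energies, outlier_energies, m_in, m_out):
    """Assumed squared hinge: inlier energy below m_in, outlier energy above m_out."""
    loss_in = sum(max(0.0, e - m_in) ** 2 for e in inlier_energies) / len(inlier_energies)
    loss_out = sum(max(0.0, m_out - e) ** 2 for e in outlier_energies) / len(outlier_energies)
    return loss_in + loss_out

# Hypothetical logits over Y = 3 inlier classes.
e_inlier = free_energy([10.0, 2.0, 1.0])   # confident pixel: large exp-sum, low energy
e_outlier = free_energy([0.5, 0.3, 0.2])   # flat logits: small exp-sum, higher energy

# Hypothetical margins: no penalty when the two groups are already separated.
separated = ebm_hinge_loss([e_inlier], [e_outlier], m_in=-8.0, m_out=-3.0)
# Swapping the roles violates both margins, so the loss becomes positive.
violated = ebm_hinge_loss([e_outlier], [e_inlier], m_in=-8.0, m_out=-3.0)
```

With these numbers the inlier energy is roughly -10 and the outlier energy roughly -1.4, which agrees with the "larger exponential sum means lower energy" reasoning above.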

Adaptive penalty for abstaining: the penalty coefficient is high for inliers and low for outliers.

PAL_Loss

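PAL (pixel-wise anomaly abstention loss):

Abstains from classifying outliers.

Calibrates the inlier class logits.

Minimizing \(l_{pal}\) means maximizing the argument of the log.

The first term is the pixel's logit for each class.

In the second term, the numerator is the pixel's logit for class Y+1 (abstaining from prediction).

For a class c, the two terms are the logit of class c and the logit of class Y+1; the question is which of them should be maximized.

Normal pixels get a high penalty coefficient, which encourages a valid prediction.

Anomalous pixels get a low penalty coefficient, which encourages abstention (class Y+1).

This loss is adapted from the Gambler's loss.

Reference link: https://www.cnblogs.com/CZiFan/p/12676577.html

Reference paper: Deep Gamblers: Learning to Abstain with Portfolio Theory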

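Comparison of the quantities for inliers and outliers:

Pixel	\(\sum\limits_{t \in \{1,...,Y\}} \exp(f_{\theta})\) (greater than 1)	\(E_{\theta}\) (the energy is negative)	\(a_{\omega}\) (follows from the negative energy)	\(\frac{1}{a_{\omega}}\) (\(a_{\omega}\) is guaranteed non-negative)	Result
inlier	large	small	large	small	encourages normal classification
outlier	small	large	small	large	encourages abstention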
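
The comparison above can be reproduced numerically. The sketch below is my reading rather than the authors' code. It assumes the penalty is a = (-E)^2, which is non-negative and grows as the energy becomes more negative. It also assumes the loss takes the Gambler's-loss form -log(p_target + p_abstain / a) on softmax outputs. All logit values and function names are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def free_energy(inlier_logits):
    """E = -logsumexp over the Y inlier logits of one pixel."""
    m = max(inlier_logits)
    return -(m + math.log(sum(math.exp(v - m) for v in inlier_logits)))

def abstention_penalty(logits, eps=1e-12):
    """Assumed form a = (-E)^2 from the first Y logits; eps avoids dividing by zero."""
    return max(free_energy(logits[:-1]) ** 2, eps)

def pal_pixel_loss(logits, target):
    """Gambler's-style loss -log(p_target + p_abstain / a) for one pixel."""
    probs = softmax(logits)  # Y inlier classes followed by the abstention class
    return -math.log(probs[target] + probs[-1] / abstention_penalty(logits))

# Hypothetical logits: Y = 3 inlier classes followed by the abstention class.
inlier_pixel = [10.0, 2.0, 1.0, -1.0]
outlier_pixel = [0.5, 0.3, 0.2, 3.0]

a_inlier = abstention_penalty(inlier_pixel)    # large penalty: abstaining earns little
a_outlier = abstention_penalty(outlier_pixel)  # small penalty: abstaining earns more
```

For the inlier pixel the abstention probability is divided by a value near 100, so the only way to lower the loss is to predict the correct class. For the outlier pixel the divisor is close to 2, so probability mass on class Y+1 is rewarded.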
reg_Loss

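Smoothness and sparsity.

First term (smoothness): the energies of neighboring pixels should not differ too much.

Second term (sparsity): a regularizer. Anomalous pixels have high energy, which can make them differ too much from the surrounding pixels.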
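
A minimal sketch of what such a regularizer could look like on a small energy map. The L1 form of both terms and the weights beta_smooth and beta_sparse are assumptions I chose for illustration. These notes do not give the exact formula, so this should not be read as the paper's definition.

```python
def regularization_loss(energy_map, beta_smooth, beta_sparse):
    """Assumed L1 smoothness over right/down neighbors plus L1 sparsity on the energy."""
    height, width = len(energy_map), len(energy_map[0])
    smooth = 0.0
    for r in range(height):
        for c in range(width):
            if c + 1 < width:    # horizontal neighbor
                smooth += abs(energy_map[r][c] - energy_map[r][c + 1])
            if r + 1 < height:   # vertical neighbor
                smooth += abs(energy_map[r][c] - energy_map[r + 1][c])
    sparse = sum(abs(e) for row in energy_map for e in row)
    return beta_smooth * smooth + beta_sparse * sparse

# Hypothetical 2 x 2 energy maps: a flat one and one with a single high-energy pixel.
flat_map = [[0.0, 0.0], [0.0, 0.0]]
spike_map = [[0.0, 0.0], [0.0, 2.0]]

flat_loss = regularization_loss(flat_map, beta_smooth=1.0, beta_sparse=0.5)
spike_loss = regularization_loss(spike_map, beta_smooth=1.0, beta_sparse=0.5)
```

The isolated spike is penalized twice, once for differing from its two neighbors and once for its magnitude, which is how such a term discourages scattered high-energy pixels.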

Training

(1) Set up \(D^{in}\) and \(D^{out}\)

(2) Fine-tune

Only the classification module is fine-tuned.

Inference

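Compute the per-pixel free-energy score with Eq. (4) of the paper.

Obtain the inlier segmentation result from the model in Eq. (1).

Apply a Gaussian smoothing kernel to produce the final energy map.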
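
The last inference step can be sketched in plain Python as a convolution of the energy map with a normalized Gaussian kernel. The kernel size, sigma, and the replicate-border handling are my own choices for illustration; the repository may use different settings.

```python
import math

def gaussian_kernel(size, sigma):
    """Normalized size x size Gaussian kernel (size is assumed to be odd)."""
    half = size // 2
    weights = [[math.exp(-((i - half) ** 2 + (j - half) ** 2) / (2 * sigma ** 2))
                for j in range(size)] for i in range(size)]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

def smooth_energy_map(energy_map, size=3, sigma=1.0):
    """Convolve the map with a Gaussian kernel; borders reuse the nearest pixel."""
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    h, w = len(energy_map), len(energy_map[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(size):
                for j in range(size):
                    rr = min(max(r + i - half, 0), h - 1)  # clamp row index at the border
                    cc = min(max(c + j - half, 0), w - 1)  # clamp column index at the border
                    acc += kernel[i][j] * energy_map[rr][cc]
            out[r][c] = acc
    return out

# Hypothetical 3 x 3 energy map with one isolated high-energy pixel in the center.
raw = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
smoothed = smooth_energy_map(raw)
```

Smoothing spreads the isolated spike over its neighbors, which suppresses single-pixel noise in the final anomaly map.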

Code

Environment setup

Check your CUDA version, then install the matching PyTorch build:
pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
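Dataset preparation

cityscapes

Register an account on the official website.

Download the zip archives with the download script.

Process them with the cityscapes scripts: https://github.com/mcordts/cityscapesScripts
The preprocessed labelTrainIds images are required.

The directory layout at this point:

Copy the required files into the target folder and rename them: rename 's/_labelTrainIds//' *.png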
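
On systems where the Perl rename command is not available, the same renaming can be sketched in Python. The function name and the demo file name are hypothetical; the demo runs in a temporary directory so that nothing real is touched.

```python
import tempfile
from pathlib import Path

def strip_label_suffix(folder, suffix="_labelTrainIds"):
    """Remove the first occurrence of suffix from every .png name in folder."""
    renamed = []
    for path in sorted(Path(folder).glob("*.png")):
        if suffix not in path.name:
            continue
        target = path.with_name(path.name.replace(suffix, "", 1))
        path.rename(target)  # note: on POSIX this silently replaces an existing target
        renamed.append(target)
    return renamed

# Demo on a throwaway directory with a Cityscapes-style (hypothetical) file name.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "aachen_000000_000019_gtFine_labelTrainIds.png").touch()
    (Path(tmp) / "already_clean.png").touch()
    demo_names = [p.name for p in strip_label_suffix(tmp)]
    remaining = sorted(p.name for p in Path(tmp).iterdir())
```

Like the rename one-liner, this only touches .png files in the given folder and does not descend into subfolders.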

The directory layout at this point:

fishyscapes

You can alternatively download both preprocessed fishyscapes & cityscapes datasets here (token from synboost GitHub).

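I downloaded the fishyscapes dataset from the link provided by the authors, but have not produced results with it yet.

coco

Download the data from the official website and unzip it.

Process it with the preprocessing code.

The directory layout at this point:

Getting started

Log in to wandb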
(pebal) huan@2678:~/zyx/PEBAL$ wandb login
wandb: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
wandb: You can find your API key in your browser here: https://wandb.ai/authorize
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit: 
wandb: Appending key for api.wandb.ai to your netrc file: /home/huan/.netrc
(pebal) huan@2678:~/zyx/PEBAL$ 
Set the API key in config.py

Inference

Download the pretrained model and upload it to the server.

please download our checkpoint from here and specify the checkpoint path ("ckpts/pebal_weight_path") in config file.

Run inference:

python code/test.py

Training

Backbone download: https://github.com/NVIDIA/semantic-segmentation/tree/sdcnet

Backbone pretrained weights download: https://uni-wuppertal.sciebo.de/s/kCgnr0LQuTbrArA/download

Within the backbone repository, the weights file goes in the "/pretrained_models" folder.

For the PEBAL project as a whole, put it in the "ckpts/pretrained_ckpts" directory.

Start training: python code/main.py

NameError: name 'numpy' is not defined
-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/home/huan/zyx/PEBAL/code/main.py", line 106, in main
    trainer.train(model=model, epoch=curr_epoch, train_sampler=train_sampler, train_loader=train_loader,
  File "/home/huan/zyx/PEBAL/code/engine/trainer.py", line 59, in train
    current_lr = self.lr_scheduler.get_lr(cur_iter=curr_idx)
  File "/home/huan/zyx/PEBAL/code/engine/lr_policy.py", line 42, in get_lr
    return numpy.real(numpy.clip(curr_lr, a_min=self.end_lr, a_max=self.start_lr))
NameError: name 'numpy' is not defined
Adding import numpy to the file in question (code/engine/lr_policy.py in the traceback above) fixes this.

ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])
Traceback (most recent call last):
  File "main.py", line 151, in <module>
    main(-1, 1, config=config, args=args)
  File "main.py", line 106, in main
    trainer.train(model=model, epoch=curr_epoch, train_sampler=train_sampler, train_loader=train_loader,
  File "/home/huan/zyx/PEBAL/code/engine/trainer.py", line 42, in train
    logits = model(imgs)
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 167, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 177, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/_utils.py", line 429, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/huan/zyx/PEBAL/code/model/network.py", line 13, in forward
    return self.branch1(data, output_anomaly=output_anomaly, Vision=Vision)
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/huan/zyx/PEBAL/code/model/wide_network.py", line 146, in forward
    x = self.aspp(x)
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/huan/zyx/PEBAL/code/model/wide_network.py", line 58, in forward
    img_features = self.img_conv(img_features)
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 135, in forward
    return F.batch_norm(
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/nn/functional.py", line 2147, in batch_norm
    _verify_batch_size(input.size())
  File "/home/huan/anaconda3/envs/pebal/lib/python3.8/site-packages/torch/nn/functional.py", line 2114, in _verify_batch_size
    raise ValueError("Expected more than 1 value per channel when training, got input size {}".format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])
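2. The cause, according to what I found online: the model uses batch normalization, and during training the current batch happened to contain only one sample. BatchNorm needs more than one value per channel to compute the mean, hence the error.

3. Fix: set drop_last=True in torch.utils.data.DataLoader (or in your own subclass of DataLoader) so that a final batch smaller than batch_size is discarded.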
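
To see why the shape in the error message is a problem, the sketch below counts how many values BatchNorm would average per channel, and mimics which batch sizes a loader yields with and without drop_last. Both helpers are plain-Python stand-ins written for this note, not PyTorch APIs.

```python
from math import prod

def values_per_channel(shape):
    """Values available per channel for an (N, C, *spatial) input in training mode."""
    batch, _channels, *spatial = shape
    return batch * prod(spatial)

def batch_sizes(num_samples, batch_size, drop_last):
    """Batch sizes a loader would yield, mimicking the effect of drop_last."""
    full, rest = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if rest and not drop_last:
        sizes.append(rest)  # the incomplete final batch is kept
    return sizes

failing = values_per_channel((1, 256, 1, 1))  # the shape from the traceback
passing = values_per_channel((8, 256, 1, 1))  # same layer with a batch of 8

kept = batch_sizes(9, 4, drop_last=False)     # ends with a single-sample batch
dropped = batch_sizes(9, 4, drop_last=True)   # the single-sample batch is discarded
```

Because the feature map at that layer is already pooled to 1 x 1, the batch dimension is the only source of statistics, so a batch of one sample leaves a single value per channel.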

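Looking at the source code, I found that this is already set. The batch size is computed as: batch_size = config.batch_size // engine.world_size

I then remembered that, to fit into GPU memory, I had set batchsize = 4 in config; world_size is also 4 here, which yields a batch size of 1.

But this does not seem to be the cause, because distributed is not set to True. Puzzlingly, img has a batch of 8 outside the model and becomes 1 once inside. One possible explanation, judging from the data_parallel.py frames in the traceback, is that torch.nn.DataParallel splits the input batch across the visible GPUs, so each replica can end up with a single sample.

While debugging, I could not step into the model: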
Frame skipped from debugging during step-in.
Note: may have been skipped because of "justMyCode" option (default == true). Try setting "justMyCode": false in the debug configuration (e.g., launch.json).
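The cause: justMyCode was enabled in my own launch.json debug configuration, so third-party library code was not debugged and the forward pass could not be stepped into. I turned it off ("justMyCode": false) and debugged again.

Open questions

https://github.com/tianyu0207/PEBAL/issues/9 : I do not understand the final explanation of PAL there. Does the "extra channel" refer to outliers, and what does "add back" mean?

[1] What is the value range of the logits?

[2] Why does calibrating the logits amount to reducing them? What does "share similar values at the same time" mean?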


Source: cnblogs

Original link: https://www.cnblogs.com/StarTwinkle/p/16571290.html


