Introduction to Computer Vision (From Algorithms to Practical Applications)

Lesson 19 notes - Image segmentation and semantic segmentation

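I.

1. Image segmentation divides into semantic segmentation and instance segmentation.

2. Semantic segmentation: classify every pixel in the image, for example deciding whether a pixel belongs to sky, pedestrian, or vehicle. It is a classification task.

3. Instance segmentation: besides classifying pixels, it must also tell different objects apart, for example vehicle A versus vehicle B. It is comparable to a detection task, but harder.

4. Evaluating segmentation: IoU = intersection / union, i.e. the area where prediction and ground truth overlap divided by the area covered by either.

5. Practical applications of image segmentation: remote sensing image analysis, medical image analysis (lung nodule detection), image matting, scene understanding (drivable area estimation), semantic maps (3D reconstruction).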
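The IoU metric from item 4 can be sketched for a single class as below. This is a minimal illustration on binary masks stored as plain Python lists; the function name `iou` and the convention for two empty masks are assumptions, not something specified in the lecture.

```python
def iou(pred_mask, true_mask):
    """IoU between two binary masks given as equally sized 2D lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


pred = [[1, 1, 0],
        [0, 1, 0]]
true = [[1, 0, 0],
        [0, 1, 1]]
print(iou(pred, true))  # 2 overlapping pixels out of 4 covered pixels
```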

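II.

6. Traditional methods for semantic segmentation: graph models (graph cut), superpixel segmentation, edge detection.

Deep learning methods: the mapping function F is usually a fully convolutional network (FCN). Compared with a classification network, an FCN drops the global pooling layer and replaces every fully connected layer with a convolutional layer. The goal is dense prediction: a classification network outputs one class probability distribution for the whole image, whereas a segmentation network classifies every pixel.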
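To illustrate why swapping fully connected layers for convolutions yields a dense prediction, the sketch below applies one linear classifier at every pixel, which is what a 1x1 convolution does. The function name `conv1x1` and all numbers are made up for illustration; a real FCN stacks many learned layers.

```python
def conv1x1(feature_map, weights, bias):
    """Apply the same linear classifier at every pixel.

    feature_map: H x W x C nested lists; weights: K x C; bias: length K.
    Returns H x W x K class scores, i.e. a dense prediction.
    """
    return [
        [
            [sum(w * x for w, x in zip(w_k, pixel)) + b_k
             for w_k, b_k in zip(weights, bias)]
            for pixel in row
        ]
        for row in feature_map
    ]


# Toy 1 x 2 feature map with C = 2 channels and K = 2 classes (made-up values).
features = [[[1.0, 0.0], [0.0, 2.0]]]
weights = [[1.0, -1.0], [-1.0, 1.0]]
bias = [0.0, 0.5]

scores = conv1x1(features, weights, bias)
# Pick the highest-scoring class at each pixel to get a label map.
labels = [[max(range(len(s)), key=s.__getitem__) for s in row] for row in scores]
print(labels)
```

Because the weights do not depend on the pixel position, the same function works for any input height and width.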

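7. General structure of an FCN.

FCN details:

1) Why use a normalization (batch norm) layer: it makes the network's nonlinearity more pronounced, reduces the influence of noise to some extent, and helps prevent overfitting.

2) Why downsample through pooling or strided convolution: it enlarges the receptive field and reduces the amount of computation.

3) The role of softmax: it normalizes the output of the last convolutional layer into a probability distribution over classes at each pixel.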
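The softmax step in item 3) can be sketched as follows, applied independently at each pixel of an H x W x K score map. The helper names are hypothetical and the scores are made up.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one pixel's class scores."""
    m = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pixelwise_softmax(score_map):
    """Turn H x W x K scores into H x W x K class probabilities."""
    return [[softmax(pixel) for pixel in row] for row in score_map]


score_map = [[[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
probs = pixelwise_softmax(score_map)
print(probs[0][1])  # equal scores give a uniform distribution
```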

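8. FCN variants

1) FCN variant 1: deconvolution

Padding is also inserted inside the input (between its elements) before convolving, so the output feature map has a higher resolution than the input and keeps finer detail. Bilinear interpolation is a special case.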
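A one-dimensional sketch of this idea: zeros are inserted between input samples, the ends are padded, and an ordinary convolution follows. With the fixed triangular kernel [0.5, 1, 0.5] and stride 2 the result is linear interpolation, the 1D analogue of the bilinear special case. The function name is hypothetical, and the padding convention is an assumption that may differ from what deep learning frameworks use.

```python
def transposed_conv1d(x, kernel, stride):
    """1D transposed convolution as zero insertion + padding + convolution."""
    # Insert (stride - 1) zeros between neighbouring input samples.
    stuffed = []
    for i, value in enumerate(x):
        stuffed.append(value)
        if i < len(x) - 1:
            stuffed.extend([0.0] * (stride - 1))
    # Pad both ends so the kernel can be centred on the first and last sample.
    pad = len(kernel) // 2
    padded = [0.0] * pad + stuffed + [0.0] * pad
    # Ordinary stride-1 sliding window. The kernel used below is symmetric,
    # so no kernel flip is needed (an assumption of this sketch).
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(padded) - len(kernel) + 1)
    ]


upsampled = transposed_conv1d([1.0, 3.0, 5.0], [0.5, 1.0, 0.5], stride=2)
print(upsampled)  # original samples with interpolated midpoints in between
```

In a learned deconvolution layer the kernel values are trained rather than fixed, which is why fixed interpolation is only a special case.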

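U-Net: a U-shaped network. Downsampling with feature extraction on the way down, hybrid fusion at every level, then upsampling on the way up. Shallow feature maps carry more spatial position information, while deep feature maps, after feature extraction and fusion, carry rich semantic information; fusing the two lets the restored output keep more detail.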
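The fusion of shallow and deep features can be sketched as upsampling the deep map and concatenating it channel-wise with the shallow map. Nearest-neighbour upsampling and the helper names are assumptions made to keep the sketch small; the convolutions that normally follow the concatenation are omitted.

```python
def upsample_nearest_2x(feature_map):
    """Nearest-neighbour 2x upsampling of an H x W x C feature map."""
    out = []
    for row in feature_map:
        wide = [pixel for pixel in row for _ in range(2)]  # repeat columns
        out.extend([wide, list(wide)])                     # repeat rows
    return out


def fuse_skip(shallow, deep):
    """Upsample the deep map to the shallow map's size and concatenate channels."""
    up = upsample_nearest_2x(deep)
    return [
        [s + d for s, d in zip(s_row, d_row)]  # list + list = channel concat
        for s_row, d_row in zip(shallow, up)
    ]


shallow = [[[1], [2]], [[3], [4]]]  # 2 x 2 map, 1 channel (spatial detail)
deep = [[[9, 8]]]                   # 1 x 1 map, 2 channels (semantics)
fused = fuse_skip(shallow, deep)
print(fused)
```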

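2) FCN variant 2: context sharing

Bringing in context also helps a pixel-level classification task, which explains why segmentation networks need a larger receptive field.

3) FCN variant 3: full-resolution residual network (FRRN)

A two-stream network. It maintains a residual stream (which keeps the resolution unchanged) and a pooling stream (which lowers the resolution to enlarge the receptive field and handles feature extraction and fusion through FRRU units with U-shaped sampling).

Deep features in the pooling stream carry rich semantic information, but they lose much spatial position information because of downsampling.

Fusing them with the features of the residual stream gives a feature representation that is more sensitive to spatial position.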
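The interaction between the two streams can be sketched in one dimension as below. This is only a rough illustration: the function `frru_step` is hypothetical, and a plain average stands in for the learned convolutions that a real FRRU applies after concatenating the two streams.

```python
def max_pool1d(x, factor):
    """Non-overlapping max pooling of a 1D signal."""
    return [max(x[i:i + factor]) for i in range(0, len(x), factor)]


def upsample1d(x, factor):
    """Nearest-neighbour upsampling of a 1D signal."""
    return [v for v in x for _ in range(factor)]


def frru_step(residual, pooled, factor):
    # 1) Bring the full-resolution stream down to the pooling stream's resolution.
    residual_down = max_pool1d(residual, factor)
    # 2) Fuse both streams. ASSUMPTION: an average replaces the learned
    #    concatenate-and-convolve step of a real unit.
    pooled_new = [(r + p) / 2 for r, p in zip(residual_down, pooled)]
    # 3) Send the fused features back up and add them to the residual stream,
    #    which therefore never changes resolution.
    residual_new = [z + u for z, u in zip(residual, upsample1d(pooled_new, factor))]
    return residual_new, pooled_new


residual_out, pooled_out = frru_step([1.0, 2.0, 3.0, 4.0], [0.0, 2.0], factor=2)
print(residual_out, pooled_out)
```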

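III.

10. Instance segmentation

Instance segmentation has to separate every individual object. Convolution, however, is translation invariant (the same pattern activates the convolutional layer wherever it appears), which suits semantic segmentation well, and a convolutional network shares one set of filters across the whole image. So how is the localization problem solved?

Approach 1:

Detect first, then separate foreground from background inside each bbox: Mask R-CNN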
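The "detect, then segment inside the box" idea can be sketched as pasting one foreground mask per detected box into a full-size instance map. The data layout and the function `paste_instances` are assumptions for illustration; an actual Mask R-CNN predicts a fixed-size mask per box and resizes it to the box, which is skipped here.

```python
def paste_instances(height, width, detections):
    """Build an instance-id map from (box, mask) pairs.

    box is (top, left, bottom, right) with exclusive bottom/right;
    mask is a 2D 0/1 list of the same size as the box (1 = foreground).
    """
    canvas = [[0] * width for _ in range(height)]  # 0 = background
    for instance_id, (box, mask) in enumerate(detections, start=1):
        top, left, bottom, right = box
        for y in range(top, bottom):
            for x in range(left, right):
                if mask[y - top][x - left]:
                    canvas[y][x] = instance_id
    return canvas


# Two made-up detections on a 3 x 4 image.
detections = [
    ((0, 0, 2, 2), [[1, 1], [1, 0]]),
    ((1, 2, 3, 4), [[0, 1], [1, 1]]),
]
instance_map = paste_instances(3, 4, detections)
for row in instance_map:
    print(row)
```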

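Approach 2:

Detect the edges of each object: Deep Watershed Transform

RGB + sem. seg. -> direction net -> watershed transform net

The direction net regresses, for each pixel, the vector toward its nearest boundary; the supervision signal is a 2-channel vector field.

The watershed transform net predicts an energy value for each pixel: the closer a pixel is to a boundary, the lower its energy, and background pixels have energy 0.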
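The two supervision targets described above can be sketched from a ground-truth instance map as follows. This is a brute-force illustration under assumptions: the nearest pixel outside the instance stands in for the nearest boundary, the raw distance is used as the energy (no quantization into levels), and the vector points toward the boundary as the notes describe. The function name is hypothetical.

```python
import math


def watershed_targets(instance_map):
    """Per-pixel targets from an instance-id map (0 = background).

    Returns (energy, direction): energy is the distance to the nearest pixel
    outside the instance; direction is the unit vector (dy, dx) pointing to it.
    """
    h, w = len(instance_map), len(instance_map[0])
    coords = [(y, x) for y in range(h) for x in range(w)]
    energy = [[0.0] * w for _ in range(h)]
    direction = [[(0.0, 0.0)] * w for _ in range(h)]
    for y, x in coords:
        inst = instance_map[y][x]
        if inst == 0:
            continue  # background keeps energy 0 and a zero vector
        outside = [(qy, qx) for qy, qx in coords if instance_map[qy][qx] != inst]
        if not outside:
            continue  # degenerate case: the instance fills the whole image
        qy, qx = min(outside, key=lambda q: math.hypot(q[0] - y, q[1] - x))
        dist = math.hypot(qy - y, qx - x)
        energy[y][x] = dist
        direction[y][x] = ((qy - y) / dist, (qx - x) / dist)
    return energy, direction


energy, direction = watershed_targets([[0, 1, 1, 1, 0]])
print(energy)
```

The search is quadratic in the number of pixels, so it is only suitable for tiny examples like this one.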

Approach 3:

Feature decoupling -> instance-sensitive FCN


auto_孤竹孙 · 2019-07-07 · Introduction to Image Segmentation and Semantic Segmentation

Segmentation includes semantic segmentation and instance segmentation.

Semantic segmentation requires telling which category each pixel falls into, while instance segmentation requires separating each object.

We can use IoU to evaluate performance on a single object and mean IoU to evaluate overall performance on the entire image.
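Mean IoU can be sketched with a confusion matrix over class label maps, as below. The function name and the choice to skip classes that appear in neither map are assumptions (one common convention among several).

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes from two label maps (2D lists of class ids)."""
    # confusion[t][p] counts pixels with true class t predicted as class p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            confusion[t][p] += 1
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                 # true c, predicted otherwise
        fp = sum(row[c] for row in confusion) - tp  # predicted c, true otherwise
        union = tp + fp + fn
        if union:  # assumed convention: skip classes absent from both maps
            ious.append(tp / union)
    return sum(ious) / len(ious)


pred = [[0, 0, 1],
        [1, 1, 1]]
target = [[0, 1, 1],
          [1, 1, 0]]
miou = mean_iou(pred, target, num_classes=2)
print(miou)  # class 0 has IoU 1/3, class 1 has IoU 3/5
```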

Segmentation is widely used in areas such as remote sensing, medical image analysis, scene understanding, and SLAM.

Traditional methods include graph cut, superpixel segmentation, and edge detection.

Modern methods: find a mapping between an input X and P(i, j), where (i, j) indexes each pixel. The output is a tensor of shape [w, h, k], where k is the number of classes.

This mapping is normally realized with a fully convolutional network, also known as an FCN.

But how do we find the best mapping?

1. A large number of training samples;

2. A reasonable loss;

3. An optimization method.

As it is fundamentally a classification problem, we should use cross-entropy loss to measure the mismatch between label and output. The loss is first computed per pixel, so it is a matrix rather than a single scalar, with one scalar loss value per pixel.
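A minimal sketch of the per-pixel cross-entropy described here: it returns the loss matrix and also its mean, since averaging over pixels is a common way to reduce the matrix to one number for optimization. The function name and the toy probabilities are made up.

```python
import math


def pixelwise_cross_entropy(probs, labels):
    """Per-pixel cross-entropy map and its mean.

    probs: H x W x K class probabilities (e.g. softmax outputs);
    labels: H x W integer class ids.
    """
    loss_map = [
        [-math.log(pixel[label]) for pixel, label in zip(p_row, l_row)]
        for p_row, l_row in zip(probs, labels)
    ]
    flat = [v for row in loss_map for v in row]
    return loss_map, sum(flat) / len(flat)


probs = [[[0.5, 0.5], [0.25, 0.75]]]
labels = [[0, 1]]
loss_map, mean_loss = pixelwise_cross_entropy(probs, labels)
print(loss_map, mean_loss)
```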

Some things to note:

1. Why batch norm?

Increase nonlinearity, improve noise resistance, prevent overfitting.

2. Why pooling for downsampling?

Enlarge the receptive field, reduce computation.

3. Softmax in the last layer.
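The effect of downsampling on the receptive field (point 2) can be checked with the standard recurrence for stacked layers, sketched below. The layer configurations are made-up examples, not the lecture's network.

```python
def receptive_field(layers):
    """Receptive field of a stack of conv/pool layers given (kernel, stride) pairs."""
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


no_pool = [(3, 1), (3, 1), (3, 1)]                    # three 3x3 convs
with_pool = [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]  # same convs + two 2x2 pools
print(receptive_field(no_pool), receptive_field(with_pool))
```

With the same three convolutions, adding two stride-2 pooling layers more than doubles the receptive field in this example.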

1. Deconv: apply padding, then apply a convolution.

2. Context sharing: bring in help from other levels of the image.

Example: Pyramid Scene Parsing Network, also known as PSPNet (best performance on Cityscapes).

3. Full-resolution residual network.
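The context sharing idea behind PSPNet (item 2 above) can be sketched as pyramid pooling: the feature map is average-pooled at several grid sizes, each result is upsampled back, and everything is stacked as extra channels. The sketch assumes a single-channel map, nearest-neighbour upsampling, and made-up bin sizes, and it leaves out the convolutions used in the actual network.

```python
def average_pool_to_grid(feature, bins):
    """Average-pool an H x W map into a bins x bins grid (H, W divisible by bins)."""
    h, w = len(feature), len(feature[0])
    bh, bw = h // bins, w // bins
    return [
        [
            sum(feature[y][x]
                for y in range(i * bh, (i + 1) * bh)
                for x in range(j * bw, (j + 1) * bw)) / (bh * bw)
            for j in range(bins)
        ]
        for i in range(bins)
    ]


def pyramid_pooling(feature, bin_sizes):
    """Stack the input with upsampled context maps, one channel per pyramid level."""
    h, w = len(feature), len(feature[0])
    levels = [average_pool_to_grid(feature, b) for b in bin_sizes]
    return [
        [
            [feature[y][x]]
            + [lvl[y * len(lvl) // h][x * len(lvl) // w] for lvl in levels]
            for x in range(w)
        ]
        for y in range(h)
    ]


feature = [[1.0, 3.0],
           [5.0, 7.0]]
stacked = pyramid_pooling(feature, bin_sizes=[1, 2])
print(stacked[0][0])  # [local value, global average, 2x2-level value]
```

Every pixel now sees the global average next to its own value, which is the kind of context the notes say helps pixel-level classification.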

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Instance Segmentation Problem:

Approaches:

1. Mask R-CNN

2. Detect the edges of each object -> Deep Watershed Transform.

The Direction Net outputs, for each pixel inside an object, the direction toward the nearest boundary. Each direction can be represented as a two-channel vector field.

3. Instance-sensitive fully convolutional networks.


帝福尼•拉曼 · 2018-01-29 · Introduction to Image Segmentation and Semantic Segmentation

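Image segmentation - 成硕

1. Overview

Problem: semantic image segmentation and instance image segmentation.

Essence: a pixel-level classification task.

Evaluation metric: IoU

2. Applications

Remote sensing, autonomous driving, medicine, sports, semantic maps, 3D reconstruction

3. Example

IoU values: road surface > cars > road signs

4. Solving image segmentation with deep learning

1) Traditional algorithms: poor detail

2) DP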


andy · 2018-01-27 · Introduction to Image Segmentation and Semantic Segmentation
