Preface
The image-processing topics covered earlier were mainly groundwork for this article's subject: image segmentation. Image segmentation is very widely used today; although AI offers approaches such as Faster R-CNN, no one can yet guarantee which algorithm is best, so both are worth learning. This article mainly references the OpenCV source code [3].
Connected-Component Labeling
Overview
The classic connected-component method first binarizes the image and then assigns every white pixel in the same region a common label; it comes in 4-connectivity and 8-connectivity variants. Here 8-connectivity is used, scanning from left to right and top to bottom, so each pixel only needs to check its four already-visited neighbors (the left pixel plus the three in the row above) and take the smallest label among them; if none of them is labeled, the center pixel receives a new label. The result of the first pass is shown below, where every white pixel has been labeled.
Source: [4]
A second pass then traverses the image again to merge labels. Because labels are handed out incrementally during the first pass, the still-unlabeled pixels below and to the right of the current one may also be white; comparing labels and merging them in the second pass yields true 8-connectivity. The result is shown in Figure 1, and the colored version in Figure 2.
Figure 1. Source: [4].
Figure 2. Source: [4].
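The two labeling passes described above can be sketched as follows. This is a minimal standalone illustration rather than the project's code; the LabelComponents function, its signature, and the 0/1 vector image format are assumptions made for the sketch:

```cpp
#include <cassert>
#include <vector>

// Minimal two-pass 8-connectivity labeling on a binary (0/1) image.
// Pass 1: take the smallest label among the already-visited neighbors
// (left + the three in the row above); otherwise assign a new label,
// and record label equivalences whenever labeled neighbors meet.
// Pass 2: replace every provisional label with its equivalence root.
std::vector<int> LabelComponents(const std::vector<int>& img, int width, int height)
{
    std::vector<int> labels(img.size(), 0);
    std::vector<int> parent(1, 0); // parent[label], for equivalence resolution
    int nextLabel = 1;

    auto findRoot = [&](int l) {
        while (parent[l] != l) l = parent[l];
        return l;
    };

    for (int row = 0; row < height; row++)
    {
        for (int col = 0; col < width; col++)
        {
            if (img[row * width + col] == 0) continue;

            const int dr[] = { -1, -1, -1, 0 };
            const int dc[] = { -1,  0,  1, -1 };

            // smallest label among the visited neighbors
            int best = 0;
            for (int k = 0; k < 4; k++)
            {
                int r = row + dr[k], c = col + dc[k];
                if (r < 0 || c < 0 || c >= width) continue;
                int l = labels[r * width + c];
                if (l > 0 && (best == 0 || l < best)) best = l;
            }

            if (best == 0)
            {
                parent.push_back(nextLabel); // new label is its own root
                labels[row * width + col] = nextLabel++;
            }
            else
            {
                labels[row * width + col] = best;
                // record equivalences with every labeled neighbor
                for (int k = 0; k < 4; k++)
                {
                    int r = row + dr[k], c = col + dc[k];
                    if (r < 0 || c < 0 || c >= width) continue;
                    int l = labels[r * width + c];
                    if (l > 0) parent[findRoot(l)] = findRoot(best);
                }
            }
        }
    }

    // second pass: collapse each label to its root
    for (int& l : labels)
        if (l > 0) l = findRoot(l);

    return labels;
}
```

The second pass is what turns the provisional labels of the first pass into a single label per component, exactly the merge step the figures illustrate.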
Connected Components for Image Segmentation
The method used here instead first computes the difference between every pixel and its four neighbors, sorts those edges from smallest to largest difference, and then labels and merges connections in that order. The labeling also differs in that each node carries an extra rank variable recording its priority (the rank of earlier-processed pixels is usually larger); when merging, the two ranks are compared and the label with the larger rank is kept.
Code
GraphNode: a struct recording one pixel's rank (priority), linked index, and current size.
Find: updates the linked index and returns the root index.
Merge: merges two pixels (regions).
NumSets: the number of regions.
GetSize: returns a region's size.

GraphTree.h
#pragma once
#ifndef GRAPH_TREE_H
#define GRAPH_TREE_H
#include "general.h"

typedef struct
{
    UINT32 rank;
    UINT32 next;
    UINT32 size;
} GraphNode;

class GraphTree
{
public:
    GraphTree(C_UINT32 nodeSize);
    ~GraphTree();

    UINT32 Find(C_UINT32 index);
    void Merge(C_UINT32 index1, C_UINT32 index2);
    UINT32 NumSets() const { return _nodeSize; };
    UINT32 GetSize(C_UINT32 index) const { return _graphNodes[index].size; };

private:
    UINT32 _nodeSize;
    GraphNode* _graphNodes;
};
#endif
GraphTree.cpp
#include "GraphTree.h"

GraphTree::GraphTree(C_UINT32 nodeSize)
{
    _nodeSize = nodeSize;
    _graphNodes = new GraphNode[nodeSize];

    for (UINT32 index = 0; index < nodeSize; index++)
    {
        GraphNode* graphNode = &_graphNodes[index];
        graphNode->next = index;
        graphNode->rank = 0;
        graphNode->size = 1;
    }
}

UINT32 GraphTree::Find(C_UINT32 index)
{
    UINT32 next = index;

    while (next != _graphNodes[next].next)
    {
        next = _graphNodes[next].next;
    }

    _graphNodes[index].next = next;
    return next;
}

void GraphTree::Merge(C_UINT32 index1, C_UINT32 index2)
{
    GraphNode& graphNode1 = _graphNodes[index1];
    GraphNode& graphNode2 = _graphNodes[index2];

    if (graphNode1.rank > graphNode2.rank)
    {
        graphNode2.next = index1;
        graphNode1.size += graphNode2.size;
    }
    else
    {
        graphNode1.next = index2;
        graphNode2.size += graphNode1.size;

        if (graphNode1.rank == graphNode2.rank)
        {
            graphNode2.rank++;
        }
    }

    _nodeSize--;
}

GraphTree::~GraphTree()
{
    delete[] _graphNodes;
}
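A quick usage sketch of GraphTree: four single-pixel nodes are merged step by step while NumSets and GetSize track the region count and sizes. The class is repeated here in condensed form, with UINT32/C_UINT32 assumed to be the unsigned-int typedefs from general.h (not shown in the article), so the sketch compiles on its own:

```cpp
#include <cassert>

// Assumed typedefs from general.h.
typedef unsigned int UINT32;
typedef const unsigned int C_UINT32;

// GraphTree condensed from the listing above so this sketch is standalone.
struct GraphNode { UINT32 rank, next, size; };

class GraphTree
{
public:
    GraphTree(C_UINT32 nodeSize) : _nodeSize(nodeSize), _graphNodes(new GraphNode[nodeSize])
    {
        for (UINT32 i = 0; i < nodeSize; i++)
            _graphNodes[i] = { 0, i, 1 }; // rank 0, self-linked, size 1
    }
    ~GraphTree() { delete[] _graphNodes; }

    UINT32 Find(C_UINT32 index)
    {
        UINT32 next = index;
        while (next != _graphNodes[next].next) next = _graphNodes[next].next;
        _graphNodes[index].next = next; // shorten the path for the query node
        return next;
    }

    void Merge(C_UINT32 index1, C_UINT32 index2)
    {
        GraphNode& n1 = _graphNodes[index1];
        GraphNode& n2 = _graphNodes[index2];
        if (n1.rank > n2.rank)
        {
            n2.next = index1;       // attach the lower-rank root under the higher
            n1.size += n2.size;
        }
        else
        {
            n1.next = index2;
            n2.size += n1.size;
            if (n1.rank == n2.rank) n2.rank++;
        }
        _nodeSize--;                // one fewer region after every merge
    }

    UINT32 NumSets() const { return _nodeSize; }
    UINT32 GetSize(C_UINT32 index) const { return _graphNodes[index].size; }

private:
    UINT32 _nodeSize;
    GraphNode* _graphNodes;
};
```

Note that Merge expects root indices (as obtained from Find); passing non-root indices would corrupt the size and count bookkeeping, which is why the segmentation code always calls Find first.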
Region Threshold
The paper explains the evolution of the threshold in detail. Following [2], the formula used here is: threshold constant / region size + pixel difference. Intuitively, think of the gap between the current edge and the remaining sorted edges growing ever larger: as more pixels are merged the region grows, so the merge condition becomes correspondingly stricter (since the edges are sorted, the early ones have a large influence and the later ones very little).
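The Threshold helper that appears in the code below is not listed in the article. Going by the formula above and the paper's τ(C) = k/|C|, it presumably divides the threshold constant by the region size; the name and signature here are inferred from the call sites, so treat this as an assumption:

```cpp
#include <cassert>

typedef const float C_FLOAT; // assumed typedef from general.h

// Presumed definition of the Threshold helper called in SegmentGraph:
// the paper's tau(C) = k / |C|, i.e. the allowance added on top of the
// current edge weight shrinks as the region grows.
float Threshold(C_FLOAT threshold, C_FLOAT size)
{
    return threshold / size;
}
```

With this definition a fresh single-pixel region gets the full constant as its allowance, and the allowance shrinks as the region grows, which is exactly the "stricter merging for larger regions" behavior described above.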
Algorithm Flow
1. Build the connectivity data: visit each pixel and record the pixel differences to its four neighbors.
2. Sort the edge data by pixel difference, from small to large.
3. Traverse the edges, comparing each pixel difference against the thresholds; merge and update the threshold when the condition holds.
4. Traverse the edges again and merge away regions smaller than the minimum-size parameter.
5. Traverse the image, look up each pixel's component index, and assign it the corresponding color.

Code
SegmentImage: produces the final component data.
SegmentGraph: sorts the edges and merges qualifying regions (a merge happens when the current edge weight is smaller than both the center component's threshold and the linked component's threshold).
Visualization: renders the component data as colors.

void Segment::SegmentImage(C_UCHAE* src, UCHAE* pur,
    C_UINT32 width, C_UINT32 height,
    C_FLOAT sigma, C_FLOAT threshold, C_UINT32 minSize,
    GraphTree* graphTree)
{
    assert(src != nullptr && pur != nullptr);
    assert(width > 0 && height > 0);
    assert(graphTree != nullptr);
    assert(graphTree->NumSets() == width * height);

    C_UINT32 size = static_cast<UINT32>(ceil(4 * sigma)) + 1;
    MNDT::BlurGauss24bit(src, pur, width, height, size, sigma);

    Image purImage(pur, width, height, MNDT::ImageType::BGR_24BIT);
    Edge* edges = new Edge[width * height * 4];
    UINT32 edgeSize = 0;

    // 1. get neighbors
    for (UINT32 row = 0; row < height; row++)
    {
        for (UINT32 col = 0; col < width; col++)
        {
            UINT32 nowIndex = row * width + col;
            Pixel nowPix = purImage.GetPixel(row, col);
            Edge* edge = nullptr;

            // top
            if (row > 0)
            {
                edge = &edges[edgeSize];
                edge->centerIndex = nowIndex;
                edge->linkIndex = (row - 1) * width + col;
                edge->weights = Diff(nowPix, purImage.GetPixel(row - 1, col));
                edgeSize++;
            }

            // top left
            if (row > 0 && col > 0)
            {
                edge = &edges[edgeSize];
                edge->centerIndex = nowIndex;
                edge->linkIndex = (row - 1) * width + col - 1;
                edge->weights = Diff(nowPix, purImage.GetPixel(row - 1, col - 1));
                ++edgeSize;
            }

            // top right
            if (row > 0 && col < width - 1)
            {
                edge = &edges[edgeSize];
                edge->centerIndex = nowIndex;
                edge->linkIndex = (row - 1) * width + col + 1;
                edge->weights = Diff(nowPix, purImage.GetPixel(row - 1, col + 1));
                ++edgeSize;
            }

            // left
            if (col > 0)
            {
                edge = &edges[edgeSize];
                edge->centerIndex = nowIndex;
                edge->linkIndex = row * width + col - 1;
                edge->weights = Diff(nowPix, purImage.GetPixel(row, col - 1));
                ++edgeSize;
            }
        }
    }

    // steps 2 and 3: connected components
    SegmentGraph(graphTree, edges, edgeSize, threshold);

    // 4. merge region if the region is smaller than minSize params
    for (UINT32 index = 0; index < edgeSize; index++)
    {
        const Edge& edge = edges[index];
        C_UINT32 centerIndex = graphTree->Find(edge.centerIndex);
        C_UINT32 linkIndex = graphTree->Find(edge.linkIndex);

        if (centerIndex != linkIndex
            && (graphTree->GetSize(centerIndex) < minSize || graphTree->GetSize(linkIndex) < minSize))
        {
            graphTree->Merge(centerIndex, linkIndex);
        }
    }

    delete[] edges;
    edges = nullptr;
}

void Segment::SegmentGraph(GraphTree* graphTree, Edge* edges, C_UINT32 edgeSize, C_FLOAT threshold)
{
    // 2. sort
    std::sort(edges, edges + edgeSize,
        [](const Edge& edge1, const Edge& edge2) { return edge1.weights < edge2.weights; });

    double* thresholds = new double[graphTree->NumSets()];

    for (UINT32 index = 0; index < graphTree->NumSets(); index++)
    {
        thresholds[index] = Threshold(threshold, 1);
    }

    // 3. merge region
    for (UINT32 index = 0; index < edgeSize; index++)
    {
        const Edge& edge = edges[index];
        UINT32 centerIndex = graphTree->Find(edge.centerIndex);
        C_UINT32 linkIndex = graphTree->Find(edge.linkIndex);

        if (centerIndex != linkIndex
            && edge.weights < thresholds[centerIndex]
            && edge.weights < thresholds[linkIndex])
        {
            graphTree->Merge(centerIndex, linkIndex);
            centerIndex = graphTree->Find(centerIndex);
            thresholds[centerIndex] = edge.weights + Threshold(threshold, static_cast<float>(graphTree->GetSize(centerIndex)));
        }
    }

    delete[] thresholds;
    thresholds = nullptr;
}

void Segment::Visualization(Image& image, GraphTree* graphTree)
{
    C_UINT32 size = image.Height() * image.Width();
    Pixel* pixs = new Pixel[size];

    for (UINT32 index = 0; index < size; index++)
    {
        pixs[index] = GetColor();
    }

    for (UINT32 row = 0; row < image.Height(); row++)
    {
        for (UINT32 col = 0; col < image.Width(); col++)
        {
            UINT32 index = graphTree->Find(row * image.Width() + col);
            image.SetPixel(row, col, pixs[index]);
        }
    }

    delete[] pixs;
    pixs = nullptr;
}
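Similarly, the Diff function used when building the edges is not listed. In the paper the weight of an edge is the Euclidean distance between the two pixels' color values, so a plausible version looks like this (the Pixel field names are an assumption based on the BGR_24BIT image type):

```cpp
#include <cassert>
#include <cmath>

// Assumed layout of the project's Pixel type for a 24-bit BGR image.
struct Pixel { unsigned char B, G, R; };

// Presumed Diff: Euclidean distance in BGR color space, i.e. the edge
// weight w(e) between two neighboring pixels.
float Diff(const Pixel& p1, const Pixel& p2)
{
    const float b = static_cast<float>(p1.B) - p2.B;
    const float g = static_cast<float>(p1.G) - p2.G;
    const float r = static_cast<float>(p1.R) - p2.R;
    return std::sqrt(b * b + g * g + r * r);
}
```

Any monotonic color-difference measure would slot in here; the Euclidean distance is the natural reading of the weights produced by the edge-building loop above.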
Results
C# GUI source code
C++ function source code
Conclusion
The paper explains everything in great detail, and the implementation is not difficult. For me the hardest part to understand was the unimproved threshold, mostly because my English is weak; I really need to strengthen it to read papers faster in the future. Next time I will cover Selective Search for Object Recognition, a paper that evolved from this one. These articles now focus on explaining the algorithm, so some small details may be skipped; if you are interested, the full source code is on Github. Questions and corrections are welcome.
References
[1] P. Felzenszwalb, D. Huttenlocher, "Efficient Graph-Based Image Segmentation", International Journal of Computer Vision, Vol. 59, No. 2, September 2004.
[2] TTransposition (2014). 图像分割—基于图的图像分割(Graph-Based Image Segmentation), from: https://blog.csdn.net/ttransposition/article/details/38024557 (2018.11.29)
[3] TTransposition (2014). 图像分割—基于图的图像分割(OpenCV源码注解), from: https://blog.csdn.net/ttransposition/article/details/38024605 (2018.11.29)
[4] Wikipedia (2017). Connected-component labeling (连通分量标记), from: https://zh.wikipedia.org/wiki/%E8%BF%9E%E9%80%9A%E5%88%86%E9%87%8F%E6%A0%87%E8%AE%B0 (2018.11.29)