QUICK REVIEW

[Paper Review] PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation

Yang Zhang, Zixiang Zhou|arXiv (Cornell University)|Mar 31, 2020

Advanced Neural Network Applications|References: 39|Cited by: 31

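ONE-SENTENCE SUMMARY

PolarNet combines a polar bird's-eye-view (BEV) representation, a ring-convolution CNN, and a learnable per-grid-cell feature extractor to reach higher mIoU at lower computational cost than prior state-of-the-art methods for online LiDAR point cloud segmentation.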

ABSTRACT

The need for fine-grained perception in autonomous driving systems has resulted in recently increased research on online semantic segmentation of single-scan LiDAR. Despite the emerging datasets and technological advancements, it remains challenging due to three reasons: (1) the need for near-real-time latency with limited hardware; (2) uneven or even long-tailed distribution of LiDAR points across space; and (3) an increasing number of extremely fine-grained semantic classes. In an attempt to jointly tackle all the aforementioned challenges, we propose a new LiDAR-specific, nearest-neighbor-free segmentation algorithm - PolarNet. Instead of using common spherical or bird's-eye-view projection, our polar bird's-eye-view representation balances the points across grid cells in a polar coordinate system, indirectly aligning a segmentation network's attention with the long-tailed distribution of the points along the radial axis. We find that our encoding scheme greatly increases the mIoU in three drastically different segmentation datasets of real urban LiDAR single scans while retaining near real-time throughput.

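RESEARCH MOTIVATION AND GOALS

Address the challenges of online fine-grained LiDAR segmentation: near-real-time latency requirements, uneven point distribution, and a large number of fine-grained classes.
Propose a LiDAR-specific input representation that balances the distribution of points across grid cells.
Develop a learnable per-grid-cell feature extractor and a ring-convolution CNN that operate on the polar BEV grid.
Demonstrate improved performance at reduced computational cost on multiple urban LiDAR datasets.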

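PROPOSED METHOD

Quantize LiDAR points into a polar BEV grid so that the number of points per cell is better balanced along the radial and angular axes.
Use a learnable simplified PointNet to produce a fixed-length local feature for each grid cell.
Append a 2D CNN with ring convolutions, which wrap around along the angular axis to preserve the connectivity of the polar grid, so that the whole pipeline can be processed end to end.
Decode the polar-grid predictions back into per-point labels in Cartesian space for evaluation.
Train end to end with a voxel-wise segmentation loss on SemanticKITTI, A2D2, and Paris-Lille-3D.
Compare against SqueezeSeg, SqueezeSegv2, PointNet, RangeNet++, and Unet variants built on Cartesian BEV and spherical projections.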
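
The first four steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (polar_cell, pool_cells, ring_pad, decode_labels), the grid sizes, and the zero padding along the radial axis are hypothetical choices, and the per-point feature vectors are placeholders standing in for the output of the learned PointNet-style network.

```python
import math


def polar_cell(x, y, n_rings, n_sectors, r_max):
    """Return the (ring, sector) cell of a point, or None if it lies beyond r_max."""
    r = math.hypot(x, y)
    if r >= r_max:
        return None
    theta = math.atan2(y, x) % (2 * math.pi)  # angle mapped to [0, 2*pi)
    ring = min(int(r / r_max * n_rings), n_rings - 1)
    sector = int(theta / (2 * math.pi) * n_sectors) % n_sectors
    return ring, sector


def pool_cells(points, feats, n_rings, n_sectors, r_max):
    """Max-pool per-point feature vectors inside each occupied polar cell."""
    cells, assignment = {}, []
    for (x, y, _z), f in zip(points, feats):
        idx = polar_cell(x, y, n_rings, n_sectors, r_max)
        assignment.append(idx)
        if idx is None:
            continue  # out-of-range points contribute to no cell
        if idx in cells:
            cells[idx] = [max(a, b) for a, b in zip(cells[idx], f)]
        else:
            cells[idx] = list(f)
    return cells, assignment


def ring_pad(grid, pad):
    """Pad a [ring][sector] grid before a convolution.

    The sector (angular) axis wraps around; the ring (radial) axis is
    zero-padded, which is an assumption of this sketch.
    """
    if pad < 1:
        raise ValueError("pad must be at least 1")
    wrapped = [row[-pad:] + row + row[:pad] for row in grid]
    width = len(wrapped[0])
    zeros = [[0.0] * width for _ in range(pad)]
    return zeros + wrapped + [[0.0] * width for _ in range(pad)]


def decode_labels(cell_labels, assignment, ignore=-1):
    """Give every point the label predicted for the cell it fell into."""
    return [cell_labels.get(idx, ignore) for idx in assignment]


points = [(1.0, 0.1, 0.0), (2.0, 0.2, 0.5), (-3.0, -0.1, 0.2), (60.0, 0.0, 0.0)]
feats = [[0.2, 0.9], [0.7, 0.1], [0.4, 0.4], [1.0, 1.0]]  # placeholder features
cells, assignment = pool_cells(points, feats, n_rings=4, n_sectors=8, r_max=50.0)
print(cells)                       # {(0, 0): [0.7, 0.9], (0, 4): [0.4, 0.4]}
print(ring_pad([[1, 2, 3, 4]], 1)[1])  # [4, 1, 2, 3, 4, 1]
print(decode_labels({(0, 0): 3, (0, 4): 7}, assignment))  # [3, 3, 7, -1]
```

The wrap-around in ring_pad is what keeps the first and last angular sectors adjacent, so a convolution kernel sliding over the padded grid sees the polar grid as a closed ring rather than a strip with two artificial edges.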

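EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: Compared with Cartesian BEV or spherical projection, does the polar BEV grid improve the balance of the point distribution and the per-voxel label purity?
RQ2: Does the learnable per-grid-cell representation combined with ring convolution improve segmentation accuracy (mIoU) while reducing MACs and latency?
RQ3: Are the performance gains consistent across different LiDAR datasets (SemanticKITTI, A2D2, Paris-Lille-3D)?
RQ4: How does PolarNet compare with state-of-the-art online LiDAR segmentation methods in accuracy and efficiency?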

KEY FINDINGS

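SemanticKITTI results; the columns after mIoU (car through traffic-sign) are per-class IoU.
Model	FPS	Latency	MACs	Params	Acc	mIoU	car	bicycle	motorcycle	truck	other-vehicle	person	bicyclist	motorcyclist	road	parking	sidewalk	other-ground	building	fence	vegetation	trunk	terrain	pole	traffic-sign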
PointNet	11.5	0.087s	141B	3.5M	-	14.6%	46.3%	1.3%	0.3%	0.1%	0.8%	0.2%	0.2%	0.0%	61.6%	15.8%	35.7%	1.4%	41.4%	12.9%	31.0%	4.6%	17.6%	2.4%	3.7%
SqueezeSeg	49.2	0.031s	13B	0.9M	-	29.5%	68.8%	16.0%	4.1%	3.3%	3.6%	12.9%	13.1%	0.9%	85.4%	26.9%	54.3%	4.5%	57.4%	29.0%	60.0%	24.3%	53.7%	17.5%	24.5%
SqueezeSegv2	36.7	0.036s	14B	0.9M	-	39.7%	81.8%	18.5%	17.9%	13.4%	14.0%	20.1%	25.1%	3.9%	88.6%	45.8%	67.6%	17.7%	73.7%	41.1%	71.8%	35.8%	60.2%	20.2%	36.3%
DarkNet53	12.7	0.087s	378B	50M	87.8%	49.9%	86.4%	24.5%	32.7%	25.5%	22.6%	36.2%	33.6%	4.7%	91.8%	64.8%	74.6%	27.9%	84.1%	55.0%	78.3%	50.1%	64.0%	38.9%	52.2%
RangeNet++	-	-	378B	50M	89.0%	52.2%	91.4%	25.7%	34.4%	25.7%	23.0%	38.3%	38.8%	4.8%	91.8%	65.0%	75.2%	27.8%	87.4%	58.6%	80.5%	55.1%	64.6%	47.9%	55.9%
RandLA	-	-	-	1.2M	-	53.9%	94.2%	26.0%	25.8%	40.1%	38.9%	49.2%	48.2%	7.2%	90.7%	60.3%	73.7%	20.4%	86.9%	56.3%	81.4%	66.8%	49.2%	47.7%	38.1%
Unet w/ Cartesian BEV	-	0.028s	60B	14M	83.5%	20.3%	27.0%	7.3%	20.3%	66.0%	1.9%	25.2%	54.7%	6.5%	12.7%	0.0%	20.3%	26.8%	21.4%	42.5%	0.0%	9.5%	0.0%
PolarNet	16.2	0.062s	135B	14M	90.0%	54.3%	93.8%	40.3%	30.1%	22.9%	28.5%	43.2%	40.2%	5.6%	90.8%	61.7%	74.4%	21.7%	90.0%	61.3%	84.0%	65.5%	67.8%	51.8%	57.5%
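
As a consistency check on the table, mIoU is the unweighted mean of the per-class IoU values, where each class IoU is TP / (TP + FP + FN). The short sketch below recomputes the mIoU of the PolarNet row from its 19 per-class entries. The helper names are illustrative only.

```python
def iou(tp, fp, fn):
    """Intersection over union for one class from point-level counts."""
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


def mean_iou(per_class_iou):
    """Unweighted mean over classes."""
    return sum(per_class_iou) / len(per_class_iou)


# Per-class IoU values (in %) copied from the PolarNet row of the table.
polarnet_iou = [93.8, 40.3, 30.1, 22.9, 28.5, 43.2, 40.2, 5.6, 90.8, 61.7,
                74.4, 21.7, 90.0, 61.3, 84.0, 65.5, 67.8, 51.8, 57.5]

miou = mean_iou(polarnet_iou)
print(f"{len(polarnet_iou)} classes, mIoU = {miou:.1f}%")  # 19 classes, mIoU = 54.3%
```

The recomputed value agrees with the 54.3% listed in the mIoU column for PolarNet.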

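PolarNet outperforms state-of-the-art methods on SemanticKITTI, A2D2, and Paris-Lille-3D while using roughly one third of the parameters and MACs of DarkNet53 and RangeNet++ (14M vs. 50M parameters and 135B vs. 378B MACs in the table above).
On SemanticKITTI, PolarNet reaches 54.3% mIoU and 90.0% accuracy with 14M parameters and 135B MACs.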
On A2D2, PolarNet reaches 23.9% mIoU and improves the per-class IoU of several classes, at 60B MACs and 0.031s latency.
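On Paris-Lille-3D, PolarNet reaches 43.7% mIoU and 87.5% accuracy, outperforming DarkNet53 and the Cartesian BEV baseline.
The per-cell point distribution of the polar BEV grid is more balanced than that of the Cartesian BEV grid (polar: 0.7±1.4 vs. Cartesian: 0.7±3.2 points per grid cell).
IoU improves for most classes, especially those that are irregularly distributed or far from the sensor, which is attributed to the polar representation and the ring-convolution design.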
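
The balance comparison in the findings above can be illustrated with a simple occupancy count. The sketch below runs on synthetic points, not on LiDAR data, and does not follow the authors' measurement protocol: the sampling law (point density falling with range), the grid size, and all function names are assumptions. It shows why both grids can report the same mean (same number of points and cells) while their standard deviations differ.

```python
import math
import random
import statistics
from collections import Counter


def polar_id(x, y, n, r_max):
    """Flat cell index on an n x n (ring x sector) polar grid of radius r_max."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)
    ring = min(int(r / r_max * n), n - 1)
    sector = int(theta / (2 * math.pi) * n) % n
    return ring * n + sector


def cartesian_id(x, y, n, r_max):
    """Flat cell index on an n x n Cartesian grid covering [-r_max, r_max]^2."""
    col = min(int((x + r_max) / (2 * r_max) * n), n - 1)
    row = min(int((y + r_max) / (2 * r_max) * n), n - 1)
    return row * n + col


def occupancy_stats(cell_ids, n_cells):
    """Mean and population standard deviation of points per cell (empty cells included)."""
    counts = Counter(cell_ids)
    per_cell = [counts.get(c, 0) for c in range(n_cells)]
    return statistics.mean(per_cell), statistics.pstdev(per_cell)


# Synthetic cloud (assumption): dense near the sensor, sparse far away.
rng = random.Random(0)
n, r_max, n_points = 32, 50.0, 20000
points = []
for _ in range(n_points):
    r = r_max * rng.random() ** 2
    a = rng.uniform(0.0, 2 * math.pi)
    points.append((r * math.cos(a), r * math.sin(a)))

polar_mean, polar_std = occupancy_stats([polar_id(x, y, n, r_max) for x, y in points], n * n)
cart_mean, cart_std = occupancy_stats([cartesian_id(x, y, n, r_max) for x, y in points], n * n)
print(f"polar:     {polar_mean:.2f} +/- {polar_std:.2f}")
print(f"cartesian: {cart_mean:.2f} +/- {cart_std:.2f}")
```

On this synthetic cloud the two means coincide by construction, and the polar grid shows the smaller standard deviation. The actual values depend on the assumed sampling law and should not be compared with the figures reported above.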

This review was generated by AI and reviewed by a human editor.