QUICK REVIEW

[Paper Review] Auto4D: Learning to Label 4D Objects from Sequential Point Clouds

Bin Yang, Min Bai | arXiv (Cornell University) | Jan 17, 2021

Advanced Neural Network Applications | References: 42 | Cited by: 25

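One-Sentence Summary

Auto4D labels 4D objects automatically by decomposing the problem into estimating a fixed object size and refining the motion path over the full trajectory, reducing human annotation effort by about 25%.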

ABSTRACT

In the past few years we have seen great advances in object perception (particularly in 4D space-time dimensions) thanks to deep learning methods. However, they typically rely on large amounts of high-quality labels to achieve good performance, which often require time-consuming and expensive work by human annotators. To address this we propose an automatic annotation pipeline that generates accurate object trajectories in 3D space (i.e., 4D labels) from LiDAR point clouds. The key idea is to decompose the 4D object label into two parts: the object size in 3D that's fixed through time for rigid objects, and the motion path describing the evolution of the object's pose through time. Instead of generating a series of labels in one shot, we adopt an iterative refinement process where online generated object detections are tracked through time as the initialization. Given the cheap but noisy input, our model produces higher quality 4D labels by re-estimating the object size and smoothing the motion path, where the improvement is achieved by exploiting aggregated observations and motion cues over the entire trajectory. We validate the proposed method on a large-scale driving dataset and show a 25% reduction of human annotation efforts. We also showcase the benefits of our approach in the annotator-in-the-loop setting.
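The decomposition described in the abstract can be pictured as a small data structure: one size shared by every frame, plus a list of per-frame poses. The sketch below is illustrative only; the class names `Pose` and `Label4D`, the field layout, and the sample values are assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Pose:
    """Hypothetical per-frame pose of the box center."""
    t: float    # timestamp in seconds
    x: float    # center x in metres
    y: float    # center y in metres
    z: float    # center z in metres
    yaw: float  # heading in radians


@dataclass
class Label4D:
    """One rigid object's 4D label: a single size plus a motion path."""
    size: Tuple[float, float, float]  # (length, width, height), fixed over time
    path: List[Pose]                  # one pose per frame

    def boxes(self) -> List[Tuple[float, ...]]:
        """Expand into per-frame boxes (x, y, z, length, width, height, yaw)."""
        length, width, height = self.size
        return [(p.x, p.y, p.z, length, width, height, p.yaw) for p in self.path]


# Made-up example: a car moving 1 m along x between two frames.
label = Label4D(
    size=(4.6, 1.9, 1.5),
    path=[Pose(0.0, 10.0, 2.0, 0.0, 0.0), Pose(0.1, 11.0, 2.0, 0.0, 0.0)],
)
frames = label.boxes()
```

Storing the size once means that any re-estimate of it updates every frame's box consistently, while the path can be smoothed independently.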

Motivation and Objectives

Motivate automatic 4D labeling to reduce human annotation cost for autonomous driving datasets.
Propose a two-branch model that separately estimates a constant 3D size and refines the motion trajectory over time.
Leverage full trajectory observations to improve 3D bounding box accuracy (IoU) and trajectory smoothness.
Evaluate on a high-quality Car4D dataset to demonstrate substantial improvements over baselines.

Proposed Method

Run an online detector followed by a discrete tracker to obtain noisy initial 4D object trajectories.
Object size branch aggregates multi-frame observations to predict a single constant size for each object and refines boxes with a corner-align strategy.
Motion path branch uses a spatial-temporal encoder–decoder to refine the pose trajectory using 4D point clouds and motion cues.
Train branches sequentially with IoU-based loss; during inference apply size refinement across the trajectory and then sliding-window path refinement.
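The summary names a corner-align strategy without describing it. One plausible reading, sketched below in Python, is that when a box receives a new size, one of its corners stays fixed rather than its center. Everything specific here is an assumption for illustration: the anchor is taken to be the corner nearest the sensor, the sensor sits at the origin, boxes are bird's-eye-view tuples `(cx, cy, length, width, yaw)`, and the function names `box_corners`, `corner_align`, and `center_align` are hypothetical.

```python
import math

# Corner order: sign of the offset along the box's (length, width) axes.
_SIGNS = ((1, 1), (1, -1), (-1, -1), (-1, 1))


def box_corners(cx, cy, length, width, yaw):
    """Return the four bird's-eye-view corners of a box, in _SIGNS order."""
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for sx, sy in _SIGNS:
        dx, dy = sx * length / 2, sy * width / 2
        corners.append((cx + c * dx - s * dy, cy + s * dx + c * dy))
    return corners


def corner_align(box, new_length, new_width, sensor=(0.0, 0.0)):
    """Resize a box while keeping its corner nearest the sensor fixed (assumed anchor)."""
    cx, cy, length, width, yaw = box
    corners = box_corners(cx, cy, length, width, yaw)
    idx = min(range(4), key=lambda i: math.dist(corners[i], sensor))
    (sx, sy), (ax, ay) = _SIGNS[idx], corners[idx]
    c, s = math.cos(yaw), math.sin(yaw)
    # Step from the anchor corner back to the center of the resized box.
    dx, dy = -sx * new_length / 2, -sy * new_width / 2
    return (ax + c * dx - s * dy, ay + s * dx + c * dy, new_length, new_width, yaw)


def center_align(box, new_length, new_width):
    """Resize a box while keeping its center fixed."""
    cx, cy, _, _, yaw = box
    return (cx, cy, new_length, new_width, yaw)


# Made-up example: a 4 m x 2 m box whose length is re-estimated as 5 m.
noisy = (10.0, 5.0, 4.0, 2.0, 0.0)
refined = corner_align(noisy, new_length=5.0, new_width=2.0)
```

In this example the anchor corner is (8, 4), so the resized box's center moves away from the sensor along the box's length. `center_align` would instead leave the center at (10, 5) and move all four corners, including the one assumed to be best observed.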

Experimental Results

Research Questions

RQ1: Can leveraging observations over the entire object trajectory yield more accurate constant-size estimates for 3D bounding boxes?
RQ2: Does incorporating a motion-path refinement over the full trajectory improve 4D labeling precision beyond static size estimation?
RQ3: How much can automatic 4D labeling reduce the need for human correction in high-quality driving datasets?
RQ4: Is the corner-align strategy for size refinement superior to center-aligned approaches under LiDAR sparsity and occlusions?

Key Findings

Method	IoU≥0.5	IoU≥0.6	IoU≥0.7	IoU≥0.8	IoU≥0.9
Online detector + discrete tracker	98.8%	97.5%	94.0%	82.2%	40.6%
Offline detector + discrete tracker	99.0%	97.9%	94.7%	83.3%	41.5%
Offline detector + disc. & cont. tracker	99.5%	98.4%	95.0%	82.9%	41.3%
Auto4D (size)	98.9%	97.7%	94.8%	85.4%	49.0%
Auto4D (size + path)	99.0%	98.0%	95.6%	87.9%	55.3%

On the Car4D test set, Auto4D raises the share of boxes with IoU ≥ 0.9 by 8.4 percentage points with the size branch (40.6% to 49.0%) and by a further 6.3 points with the path branch (49.0% to 55.3%).
Overall, Auto4D yields about a 25% reduction in the human effort needed to correct poorly localized boxes (IoU < 0.9) compared to the online detector + discrete tracker baseline.
The size branch with corner-align refinement significantly outperforms center-align and random baselines in producing precise size estimates.
The annotator-in-the-loop experiment demonstrates that minimal human corrections can further improve labeling accuracy without retraining the model.
Static objects benefit more from the size branch, while moving objects gain more from the motion path refinement.
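The headline numbers can be checked against the IoU ≥ 0.9 column of the table. The short Python calculation below assumes that correction effort is proportional to the share of boxes falling below IoU 0.9; that proportionality is an interpretation, not something the summary states, and the variable names are illustrative.

```python
# Share of boxes with IoU >= 0.9, copied from the results table (percent).
iou90 = {
    "online detector + discrete tracker": 40.6,
    "Auto4D (size)": 49.0,
    "Auto4D (size + path)": 55.3,
}

baseline = iou90["online detector + discrete tracker"]
size_only = iou90["Auto4D (size)"]
full = iou90["Auto4D (size + path)"]

# Gains in percentage points contributed by each branch.
size_gain = round(size_only - baseline, 1)
path_gain = round(full - size_only, 1)

# Boxes still needing correction (IoU < 0.9) before and after Auto4D.
to_fix_before = 100.0 - baseline
to_fix_after = 100.0 - full

# Relative reduction in boxes to correct (assumed proxy for human effort).
reduction = (to_fix_before - to_fix_after) / to_fix_before

print(f"size branch: +{size_gain} points, path branch: +{path_gain} points")
print(f"boxes to correct: {to_fix_before:.1f}% -> {to_fix_after:.1f}%")
print(f"relative reduction: {reduction:.1%}")
```

Under that assumption the share of boxes to correct falls from 59.4% to 44.7%, a relative reduction of roughly 24.7%, which matches the reported figure of about 25%.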


This review was generated by AI and checked by a human editor.