High performance point-Voxel feature set abstraction with mamba for 3D object detection.

Bibliographic Details
Title: High performance point-Voxel feature set abstraction with mamba for 3D object detection.
Authors: Ren, Junfeng1 (AUTHOR), Wen, Changji1,2 (AUTHOR) changjiw@jlau.edu.cn, Zhang, Long1 (AUTHOR), Su, Hengqiang1 (AUTHOR), Yang, Ce1,3 (AUTHOR), Lv, Yanfeng4 (AUTHOR), Yang, Ning2 (AUTHOR), Qin, Xiwen1,5 (AUTHOR) qinxiwen@ccut.edu.cn
Source: Expert Systems with Applications. Aug2025, Vol. 286, pN.PAG-N.PAG. 1p.
Subjects: Object recognition (Computer vision), Convolutional neural networks, Feature extraction, Point cloud, LIDAR
Abstract:
• HP-PV-RCNN for 3D object detection on LiDAR point clouds.
• Backbone based on Mamba2 and linear angular attention used to extract 3D features.
• The point head based on Kolmogorov-Arnold networks extracts point features.
• Fuzzy NMS with an adaptive threshold removes redundant detection boxes.
• HP-PV-RCNN has achieved good results on the KITTI, nuScenes, and Waymo Open datasets.

In the field of autonomous driving, two-stage three-dimensional object detection approaches have seen significant advancements. However, challenges persist in terms of detection accuracy, which can have a profound impact on the safety of autonomous vehicles. This study examined four critical issues that impair the accuracy and efficiency of the model: limited receptive fields, slow acquisition of global features from voxels, challenges in capturing keypoint features, and uncertainties associated with network post-processing. To address these challenges, we propose four novel techniques: (1) a non-empty voxel feature extraction method that utilises linear angular attention to broaden the receptive field; (2) an efficient voxel feature extraction and downsampling approach based on Mamba2, designed to accelerate the acquisition of global voxel features; (3) a keypoint extraction strategy that employs the Kolmogorov-Arnold Network (KAN) to extract features of keypoints selected via segmented farthest point sampling (S-FPS); (4) a fuzzy non-maximum suppression (Fuzzy-NMS) method that refines suppression thresholds during the post-processing phase. By integrating these techniques, we introduce a High-Performance Point-Voxel Region Convolutional Neural Network (HP-PV-RCNN) algorithm specifically tailored for precise 3D object detection. We validated the effectiveness of the HP-PV-RCNN algorithm through comprehensive experiments using the KITTI, nuScenes, and Waymo Open datasets. Specifically, our proposed network attained average precisions of 83.73 % for vehicles, 76.32 % for bicycles, and 53.52 % for pedestrians in the medium-difficulty category of the KITTI dataset. The code and model are available at https://github.com/jlauwcj/HP-PV-RCNN. [ABSTRACT FROM AUTHOR]
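For readers unfamiliar with the keypoint sampling step named in the abstract, the following is a minimal sketch of plain farthest point sampling (FPS) in pure Python. It is not the authors' implementation: the abstract does not describe how the segmented variant (S-FPS) partitions the point cloud, so only the basic greedy selection is shown. The function name `farthest_point_sampling` and the parameters `k` and `start` are hypothetical names chosen for illustration.

```python
import math


def farthest_point_sampling(points, k, start=0):
    """Greedily pick k point indices so that each new pick is the point
    farthest from all points chosen so far (plain FPS, not S-FPS)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    chosen = [start]
    # min_dist[i] holds the distance from point i to its nearest chosen point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        # The next keypoint is the one with the largest nearest-chosen distance.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        # Refresh the nearest-chosen distances against the newly added point.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return chosen


# Toy point cloud (illustrative data only, not from the paper).
pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 5.0, 0.0)]
print(farthest_point_sampling(pts, 3))  # [0, 2, 3]
```

Starting from index 0, the sketch first picks the point 10 units away, then the point at (0, 5, 0), and skips the near-duplicate at (1, 0, 0). This spread-out coverage is the reason FPS-style sampling is commonly used to choose keypoints from LiDAR point clouds.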
Copyright of Expert Systems with Applications is the property of Pergamon Press - An Imprint of Elsevier Science and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Engineering Source