End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection

Publication: Contribution to journal › Conference article › Research › peer-reviewed

Standard

End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection. / Qian, Rui; Garg, Divyansh; Wang, Yan; You, Yurong; Belongie, Serge; Hariharan, Bharath; Campbell, Mark; Weinberger, Kilian Q.; Chao, Wei Lun.

In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2020, pp. 5880-5889.

Publication: Contribution to journal › Conference article › Research › peer-reviewed

Harvard

Qian, R, Garg, D, Wang, Y, You, Y, Belongie, S, Hariharan, B, Campbell, M, Weinberger, KQ & Chao, WL 2020, 'End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection', Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 5880-5889. https://doi.org/10.1109/CVPR42600.2020.00592

APA

Qian, R., Garg, D., Wang, Y., You, Y., Belongie, S., Hariharan, B., Campbell, M., Weinberger, K. Q., & Chao, W. L. (2020). End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 5880-5889. [9157553]. https://doi.org/10.1109/CVPR42600.2020.00592

Vancouver

Qian R, Garg D, Wang Y, You Y, Belongie S, Hariharan B et al. End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2020;5880-5889. 9157553. https://doi.org/10.1109/CVPR42600.2020.00592

Author

Qian, Rui ; Garg, Divyansh ; Wang, Yan ; You, Yurong ; Belongie, Serge ; Hariharan, Bharath ; Campbell, Mark ; Weinberger, Kilian Q. ; Chao, Wei Lun. / End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2020 ; pp. 5880-5889.

BibTeX

@inproceedings{ceb53ef854e7481a847a9f49742c63f9,
title = "End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection",
abstract = "Reliable and accurate 3D object detection is a necessity for safe autonomous driving. Although LiDAR sensors can provide accurate 3D point cloud estimates of the environment, they are also prohibitively expensive for many settings. Recently, the introduction of pseudo-LiDAR (PL) has led to a drastic reduction in the accuracy gap between methods based on LiDAR sensors and those based on cheap stereo cameras. PL combines state-of-the-art deep neural networks for 3D depth estimation with those for 3D object detection by converting 2D depth map outputs to 3D point cloud inputs. However, so far these two networks have to be trained separately. In this paper, we introduce a new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end. The resulting framework is compatible with most state-of-the-art networks for both tasks and in combination with PointRCNN improves over PL consistently across all benchmarks -- yielding the highest entry on the KITTI image-based 3D object detection leaderboard at the time of submission. Our code will be made available at https://github.com/mileyan/pseudo-LiDAR_e2e.",
author = "Rui Qian and Divyansh Garg and Yan Wang and Yurong You and Serge Belongie and Bharath Hariharan and Mark Campbell and Weinberger, {Kilian Q.} and Chao, {Wei Lun}",
note = "Funding Information: This research is supported by grants from the National Science Foundation NSF (III-1618134, III-1526012, IIS-1149882, IIS-1724282, and TRIPODS-1740822), the Office of Naval Research DOD (N00014-17-1-2175), the Bill and Melinda Gates Foundation, and the Cornell Center for Materials Research with funding from the NSF MRSEC program (DMR-1719875). We are thankful for generous support by Zillow and SAP America Inc. Publisher Copyright: {\textcopyright} 2020 IEEE.; 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020 ; Conference date: 14-06-2020 Through 19-06-2020",
year = "2020",
doi = "10.1109/CVPR42600.2020.00592",
language = "English",
pages = "5880--5889",
journal = "IEEE Conference on Computer Vision and Pattern Recognition. Proceedings",
issn = "1063-6919",
publisher = "Institute of Electrical and Electronics Engineers",

}
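
The abstract above hinges on one mechanical step: turning a 2D depth map into a 3D point cloud ("pseudo-LiDAR") that a LiDAR-style detector can consume. Below is a minimal sketch of that back-projection under the standard pinhole camera model; the function name and the toy intrinsics are illustrative choices, not taken from the paper's repository.

import numpy as np

def depth_to_pseudo_lidar(depth, fu, fv, cu, cv):
    # Back-project a dense depth map into camera-frame 3D points.
    # depth: (H, W) array of metric depths, e.g. from a stereo network.
    # fu, fv: focal lengths in pixels; cu, cv: principal point.
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))   # per-pixel image coordinates
    z = depth                                        # depth is the z coordinate
    x = (u - cu) * z / fu                            # invert u = fu * x / z + cu
    y = (v - cv) * z / fv                            # invert v = fv * y / z + cv
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)  # (H*W, 3) point cloud

# Toy usage: a flat 4x4 depth map 10 m away, KITTI-like focal length.
points = depth_to_pseudo_lidar(np.full((4, 4), 10.0), fu=700.0, fv=700.0, cu=2.0, cv=2.0)
print(points.shape)  # (16, 3)

Each pixel becomes one 3D point, so the downstream detector sees the same data layout as a real LiDAR sweep.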

RIS

TY - GEN

T1 - End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection

AU - Qian, Rui

AU - Garg, Divyansh

AU - Wang, Yan

AU - You, Yurong

AU - Belongie, Serge

AU - Hariharan, Bharath

AU - Campbell, Mark

AU - Weinberger, Kilian Q.

AU - Chao, Wei Lun

N1 - Funding Information: This research is supported by grants from the National Science Foundation NSF (III-1618134, III-1526012, IIS-1149882, IIS-1724282, and TRIPODS-1740822), the Office of Naval Research DOD (N00014-17-1-2175), the Bill and Melinda Gates Foundation, and the Cornell Center for Materials Research with funding from the NSF MRSEC program (DMR-1719875). We are thankful for generous support by Zillow and SAP America Inc. Publisher Copyright: © 2020 IEEE.

PY - 2020

Y1 - 2020

N2 - Reliable and accurate 3D object detection is a necessity for safe autonomous driving. Although LiDAR sensors can provide accurate 3D point cloud estimates of the environment, they are also prohibitively expensive for many settings. Recently, the introduction of pseudo-LiDAR (PL) has led to a drastic reduction in the accuracy gap between methods based on LiDAR sensors and those based on cheap stereo cameras. PL combines state-of-the-art deep neural networks for 3D depth estimation with those for 3D object detection by converting 2D depth map outputs to 3D point cloud inputs. However, so far these two networks have to be trained separately. In this paper, we introduce a new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end. The resulting framework is compatible with most state-of-the-art networks for both tasks and in combination with PointRCNN improves over PL consistently across all benchmarks -- yielding the highest entry on the KITTI image-based 3D object detection leaderboard at the time of submission. Our code will be made available at https://github.com/mileyan/pseudo-LiDAR_e2e.

AB - Reliable and accurate 3D object detection is a necessity for safe autonomous driving. Although LiDAR sensors can provide accurate 3D point cloud estimates of the environment, they are also prohibitively expensive for many settings. Recently, the introduction of pseudo-LiDAR (PL) has led to a drastic reduction in the accuracy gap between methods based on LiDAR sensors and those based on cheap stereo cameras. PL combines state-of-the-art deep neural networks for 3D depth estimation with those for 3D object detection by converting 2D depth map outputs to 3D point cloud inputs. However, so far these two networks have to be trained separately. In this paper, we introduce a new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end. The resulting framework is compatible with most state-of-the-art networks for both tasks and in combination with PointRCNN improves over PL consistently across all benchmarks -- yielding the highest entry on the KITTI image-based 3D object detection leaderboard at the time of submission. Our code will be made available at https://github.com/mileyan/pseudo-LiDAR_e2e.

U2 - 10.1109/CVPR42600.2020.00592

DO - 10.1109/CVPR42600.2020.00592

M3 - Conference article

AN - SCOPUS:85089602386

SP - 5880

EP - 5889

JO - IEEE Conference on Computer Vision and Pattern Recognition. Proceedings

JF - IEEE Conference on Computer Vision and Pattern Recognition. Proceedings

SN - 1063-6919

M1 - 9157553

T2 - 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020

Y2 - 14 June 2020 through 19 June 2020

ER -
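
For the end-to-end part, the abstract says only that differentiable Change of Representation (CoR) modules let the detection loss reach the depth estimator. The sticking point is that hard voxelization (counting points per voxel) has no gradient with respect to point positions. Below is a conceptual stand-in, assuming a soft RBF-weighted occupancy in the spirit of the paper's quantization-based CoR; the function name and the sigma value are assumptions, not the authors' implementation.

import torch

def soft_voxel_occupancy(points, centers, sigma=0.5):
    # Differentiable replacement for hard per-voxel point counts.
    # points:  (N, 3) pseudo-LiDAR points, requires_grad for end-to-end training.
    # centers: (M, 3) voxel-center coordinates.
    d2 = torch.cdist(points, centers).pow(2)   # (N, M) squared point-voxel distances
    w = torch.exp(-d2 / (2 * sigma ** 2))      # RBF weight per point-voxel pair
    return w.sum(dim=0)                        # (M,) soft occupancy per voxel

points = torch.randn(100, 3, requires_grad=True)   # stand-in for back-projected points
centers = torch.randn(8, 3)                        # stand-in voxel grid
occ = soft_voxel_occupancy(points, centers)
occ.sum().backward()                               # a detection loss would sit here
print(points.grad.shape)  # torch.Size([100, 3]): gradients reach the 3D points

Because the gradient flows into the point coordinates, and the points are a differentiable function of the estimated depth map, the detection loss can update the depth network directly; this is what separates the framework from the original two-stage pseudo-LiDAR pipeline, whose two networks were trained separately.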
