摘要

Three-dimensional object detection aims to produce a three-dimensional bounding box of an object at its full extent. Nowadays, three-dimensional object detection is mainly based on red green blue-depth (RGB-D) images. However, it remains an open problem because of the difficulty in labeling for three-dimensional training data. In this article, we present a novel three-dimensional object detection method based on two-dimensional object detection, which only takes a set of RGB images as input. First, aiming at the requirement of three-dimensional object detection and the low location accuracy of You Only Look Once, a modified two-dimensional object detection method based on You Only Look Once is proposed. Then, using a set of images from different visual angles, three-dimensional geometric data are reconstructed. In addition, making use of the modified You Only Look Once method, the two-dimensional object bounding boxes of the forward and side views are obtained. Finally, according to the transformation between the two-dimensional pixel coordinate and the three-dimensional space coordinate, the two-dimensional object bounding box is mapped onto the reconstructed three-dimensional scene to form the three-dimensional object box. Because this method only needs the collection of two-dimensional images to train the modified You Only Look Once model, it has a wide range of applications. The experimental results show that the modified You Only Look Once model can improve the location accuracy, and our algorithm can effectively realize the three-dimensional object detection without depth images.