Estimation of 3D Bounding Box for Image Object

Sunghoon Jung and Minhwan Kim

Dept. of Computer Engineering, Pusan National University, Jangjeon-dong, Geumjeong-gu, Busan, Korea
{shjung,mhkim}@pusan.ac.kr
http://vision.ce.pusan.ac.kr

Abstract. In this paper, we propose a method to automatically estimate a 3D bounding box for an image object. The 3D object corresponding to the image object is assumed to be on the ground, and the image object is acquired using a single calibrated camera. We use the fact that when an image object is back-projected onto the ground, a part of its lower boundary is correctly back-projected onto the 3D object boundary touching the ground. The estimated 3D bounding box can be effectively used in autonomous obstacle avoidance, 3D object packing, and smart surveillance of moving objects. The usefulness of the proposed method is demonstrated with experimental results on real objects.

Keywords: 3D bounding box, object approximation

1 Introduction

3D object reconstruction from images is an attractive and challenging research area in computer vision. In some applications, however, an approximation of a 3D object, such as a 3D bounding box, provides sufficient information for achieving their goals. A 3D bounding box enclosing a given 3D object is useful for describing its size and location.

Meanwhile, 3D object reconstruction from a single image is generally under-constrained because only one view of the 3D object is available from the image. To compensate for the lack of constraints, previously proposed methods [1,2] restricted their focus to objects of specific shapes.

In this paper, we propose a method to automatically estimate a 3D bounding box from a single image without user interaction. We assume the general situation in which an object stands on flat ground and its image is acquired by a calibrated camera installed at an elevated position. We also assume that the object is symmetric, as is the case for most man-made objects.
A key fact in our estimation method is that when an image object is back-projected onto the flat ground, a part of its lower boundary is correctly back-projected onto the 3D object boundary touching the ground. In other words, the 3D locations of image points that correspond to 3D points on the ground can be recovered if the image points are acquired by a calibrated camera.
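The back-projection step can be illustrated with a minimal sketch. It assumes an ideal pinhole camera without lens distortion, a ground plane at Z = 0 in world coordinates, and a calibration given as a focal length `f` in pixels, a principal point `(cx, cy)`, a world-to-camera rotation `R`, and a camera center `C`. The function name and this parameterization are illustrative assumptions and are not taken from the paper.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Sequence[float]


def back_project_to_ground(
    u: float, v: float,
    f: float, cx: float, cy: float,
    R: Sequence[Vec3], C: Vec3,
) -> Optional[Tuple[float, float]]:
    """Intersect the viewing ray of pixel (u, v) with the ground plane Z = 0.

    Assumptions (not from the paper): pinhole model, square pixels,
    R maps world coordinates to camera coordinates, C is the camera
    center in world coordinates.
    """
    # Ray direction in camera coordinates, i.e. K^-1 * (u, v, 1).
    d_cam = ((u - cx) / f, (v - cy) / f, 1.0)
    # Rotate the ray into world coordinates: d_world = R^T * d_cam.
    d = [sum(R[k][i] * d_cam[k] for k in range(3)) for i in range(3)]
    # A ray parallel to the ground never reaches it.
    if abs(d[2]) < 1e-12:
        return None
    # Solve C_z + s * d_z = 0 for the ray parameter s.
    s = -C[2] / d[2]
    # The intersection must lie in front of the camera.
    if s <= 0:
        return None
    return (C[0] + s * d[0], C[1] + s * d[1])
```

Applying this function to every boundary pixel of an image object would give a ground-plane shape like the one the paper calls the silhouette. Only the pixels that really image ground-contact points are recovered at their true 3D locations; the others land farther from the camera than the object itself.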
2 Estimation of 3D Bounding Box

Making a 3D bounding box for a given image object starts with back-projecting the image object onto the ground. In this paper, the back-projected 2D shape on the ground is called the silhouette of the image object, as shown in Fig. 1. We can see that the lower boundaries of the silhouettes tend to correspond to parts of the real 3D object, while the upper boundaries correspond to parts that do not lie on the ground. In fact, the lower boundaries of the silhouettes coincide with the base of the real 3D object.

Fig. 1. Examples of image objects and their silhouette images

To get the minimum-area bounding rectangle enclosing the base of a 3D object, we apply the rotating caliper method [3] to the lower boundary of the silhouette with a minor modification. The rotating caliper method uses two sets of parallel lines that are perpendicular to each other. To apply it to the lower boundary of the silhouette, we use a single pair of perpendicular lines, which we call a perpendicular line set. As proved in [4], a side of a minimum bounding rectangle must be collinear with a side of the convex hull of the shape. Based on this theorem, we rotate the perpendicular line set while keeping one of its lines collinear with a side of the convex hull of the silhouette and the other line tangent to the convex hull. This perpendicular line set can then become two edges of the base rectangle that encloses the base of the 3D object.

The rotating perpendicular line set method should be applied only to the convex hull segments that lie on the ground. The on-ground convex hull segments can be determined using two support lines. Consider a vertical pole standing on the rotating perpendicular line set on the ground. A support line is defined as the silhouette of such a vertical pole that is tangent to the silhouette of the image object. We can determine two support lines: one tangent to the left part of the image object silhouette and the other tangent to the right part.
The vertical poles corresponding to these support lines should then be the diagonally opposite vertical edges of the 3D bounding box. Consider the point where a support line meets the convex hull of the image object silhouette, which we call a tangent point. Such a tangent point is necessarily one of the convex hull points and limits the range of the on-ground segments. After the range of on-ground convex hull segments is determined, the midpoint of the two tangent points is selected as the center of rotational symmetry. The full on-ground convex hull is hypothetically constructed by rotating the on-ground segments about this center. Then the base rectangle is determined using the original rotating caliper method.
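The last two steps can be sketched as follows, under stated assumptions. The sketch takes the on-ground hull points and the two tangent points as given, and it interprets "rotating about the center of symmetry" as a 180-degree rotation (point reflection), which is an assumption. For brevity it finds the minimum-area rectangle by testing every hull edge, relying on the theorem from [4], instead of implementing the linear-time rotating caliper sweep of [3]. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    upper: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def complete_by_point_symmetry(
    on_ground: Sequence[Point], tangent_a: Point, tangent_b: Point
) -> List[Point]:
    """Mirror the on-ground points through the midpoint of the tangent points."""
    mx = (tangent_a[0] + tangent_b[0]) / 2.0
    my = (tangent_a[1] + tangent_b[1]) / 2.0
    mirrored = [(2.0 * mx - x, 2.0 * my - y) for x, y in on_ground]
    return convex_hull(list(on_ground) + mirrored)


def min_area_rectangle(hull: Sequence[Point]) -> Tuple[float, List[Point]]:
    """Try each hull edge as a rectangle side; needs at least 3 hull points."""
    best_area = math.inf
    best_corners: List[Point] = []
    n = len(hull)
    for i in range(n):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % n]
        length = math.hypot(x1 - x0, y1 - y0)
        ux, uy = (x1 - x0) / length, (y1 - y0) / length
        # Coordinates of the hull points along the edge and along its normal.
        us = [x * ux + y * uy for x, y in hull]
        vs = [-x * uy + y * ux for x, y in hull]
        area = (max(us) - min(us)) * (max(vs) - min(vs))
        if area < best_area:
            extents = [(min(us), min(vs)), (max(us), min(vs)),
                       (max(us), max(vs)), (min(us), max(vs))]
            # Map the rectangle corners back to ground coordinates.
            best_corners = [(u * ux - v * uy, u * uy + v * ux) for u, v in extents]
            best_area = area
    return best_area, best_corners
```

For example, the two visible sides of a 4 by 2 box base, given as the points (0, 0), (4, 0), (4, 2) with tangent points (0, 0) and (4, 2), are completed to the full rectangle, and the resulting base rectangle has area 8. The edge-by-edge search costs quadratic time in the number of hull points, which is usually acceptable for the small hulls involved here.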
Once the base rectangle is determined, the 3D bounding box can be easily constructed by determining the height of the box. The initial height is set high enough to ensure that neither of the two rear lines of the upper rectangle intersects the object. The height of the bounding box is then decreased until one of the two rear lines of the upper rectangle touches the object in image space.

3 Experimental Results

The proposed method is verified on various kinds of objects from multiple viewpoints. Fig. 2 shows examples of 3D bounding boxes for several kinds of objects, together with the boxes superimposed on the real images. The image objects are extracted from the images manually. We can see that the bounding boxes are suitable approximations of the 3D objects.

Fig. 2. Experimental results for various objects

4 Conclusions

In this paper, we presented a method for estimating a 3D bounding box of an object in a single image, which works without restrictions on the object's shape or user interaction. We used the fact that a lower part of an image object is correctly back-projected to the location of the corresponding 3D object boundary touching the ground. A modified version of the rotating caliper method was applied to estimate the base rectangle of the bounding box. Through experiments, we found that the proposed method is useful for approximating and localizing various kinds of real objects on the ground.

References

1. Z. Li, J. Liu, X. Tang: A Closed-form Solution to 3D Reconstruction of Piecewise Planar Objects from Single Images. In: IEEE Conf. on Computer Vision and Pattern Recognition, vol.1, pp.1–6 (2007)
2. S. Mohan, S. Murali: Automated 3D Modeling and Rendering from Single View Images. In: Proc. International Conf. on Computational Intelligence and Multimedia Application, vol.4, pp.476–480 (2007)
3. G.T. Toussaint: Solving Geometric Problems with the Rotating Calipers. In: Proc. IEEE MELECON’83, pp.10–02 (1983)
4. H. Freeman, R. Shapira: Determining the Minimum-area Enclosing Rectangle for an Arbitrary Closed Curve. In: Communications of the ACM, vol.18, no.7, pp.409–413 (1975)