Full Camera Calibration from a Single View of Planar Scene

Yisong Chen¹, Horace Ip², Zhangjin Huang¹, and Guoping Wang¹
¹ Key Laboratory of Machine Perception (Ministry of Education), Peking University
² Department of Computer Science, City University of Hong Kong

G. Bebis et al. (Eds.): ISVC 2008, Part I, LNCS 5358, pp. 815–824, 2008. © Springer-Verlag Berlin Heidelberg 2008

Abstract. We present a novel algorithm that applies conics to realize reliable camera calibration. In particular, we show that a single view of two coplanar circles is sufficiently powerful to give a fully automatic calibration framework that estimates both intrinsic and extrinsic parameters. This method stems from previous work on conic based calibration and calibration-free scene analysis. It eliminates many a priori constraints such as a known principal point, restrictive calibration patterns, or multiple views. Calibration is achieved statistically by identifying multiple orthogonal directions and optimizing a probability function by maximum likelihood estimation. Orthogonal vanishing points, which are the basic geometric primitives used in calibration, are identified based on the fact that they represent conjugate directions with respect to an arbitrary circle under perspective transformation. Experimental results from synthetic and real scenes demonstrate the effectiveness, accuracy, and broad applicability of the approach.

1 Introduction

As an essential step for extracting metric 3D information from 2D images, camera calibration remains an active research topic in most computer vision applications [9]. Much work has been devoted to camera calibration. Existing methods can be classified into two categories: calibration pattern based algorithms (using 1D, 2D or 3D patterns), and multiple view based self-calibration approaches [15,19].

Conics and quadrics are widely accepted as among the most fundamental patterns in computer vision due to their elegant properties, such as a simple and compact algebraic expression, invariance under projective transformation, and robustness to image noise. Conics have long been employed to help perform camera calibration and pose estimation [6]. The strategy of using spheres as calibration patterns has also drawn increasing attention in recent years [1].

Vanishing points and vanishing lines also play important roles in much calibration and scene analysis work [12,14]. Under the assumption of zero skew and unit aspect ratio, all intrinsic parameters can be solved from the vanishing points of three mutually orthogonal directions in a single image [3]. Multiple patterns or views can be employed to perform calibration in cases where not all three vanishing points are available from a single view.

Although recent research has produced fruitful results, most work suffers from the problems of multiple views, restricted patterns or incompleteness of solutions [5,13,18]. Two major obstacles are the mandatory requirements of multiple views and
non-planar scene structures. In this paper, we make an attempt to calibrate the camera from a single image of a planar scene. In our approach, coplanar circles are adopted as the basic calibration patterns and vanishing points are used in a new way.

Conic based planar rectification [11] and conic based pose estimation [4] are the two previous approaches most closely related to our work. The work of [4] estimates the focal length and the camera pose from two coplanar circles in the image. However, this method implicitly assumes that the principal point is known beforehand and cannot handle a non-unit aspect ratio. The work of [11] makes accurate Euclidean measurements from coplanar circles in a calibration-free manner. Nevertheless, the analysis is limited to the target plane and cannot be extended to applications that need the camera parameters.

In our work, we propose a full calibration scheme which statistically estimates the focal length, the principal point, the aspect ratio, as well as the extrinsic parameters. In particular, we show that the circle is a powerful conic in that a single view of two coplanar circles provides adequate information for metric calibration. A coarse pipeline of our algorithm is as follows. First, the calibration-free planar rectification reported in [11] is performed and extended to recover the vanishing line, the centers of the circles and many orthogonal vanishing point pairs. Second, under different guesses of the principal point, the distribution of the focal length is computed from all orthogonal vanishing point pairs. Third, based on the previously computed focal length distribution, a statistical optimization routine is designed to estimate the focal length, the principal point and the aspect ratio simultaneously. Fourth, conic based pose estimation [4,6] is employed to compute the extrinsic parameters.
Finally, the calibration result is validated by comparison with the ground truth for synthetic scenes or by augmented reality tests for real scenes.

The major advantage of our work is that the algorithm uses only a single image of a simple planar scene to achieve a full solution of camera calibration. The scene requirement is therefore low in comparison with previous methods, and the approach works for many scenes which previous methods fail to treat.

The rest of the paper is organized as follows. Section 2 briefly reviews and extends the previous work on coplanar circle based scene analysis. Section 3 elaborates a feasible scheme which statistically estimates the focal length, the principal point and the aspect ratio simultaneously. Some discussion is also provided in that section. Section 4 presents the experimental results on both synthetic and real scenes. Finally, concluding remarks are given in Section 5.

2 Preliminaries

We present the calibration algorithm step by step under the practical assumption of zero skew. Our algorithm solves for the camera projection matrix $P = K[R \mid t]$, where $K$ is the zero-skew calibration matrix containing the 4 intrinsic parameters (focal length $f$, aspect ratio $\alpha$, and principal point $(u, v)$) as defined in equation (1), and the metric matrix $[R \mid t]$ fully encodes the 6 extrinsic parameters.

$$K = \begin{pmatrix} \alpha f & 0 & u \\ 0 & f & v \\ 0 & 0 & 1 \end{pmatrix} \tag{1}$$
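As a minimal sketch of how equation (1) enters the projection $P = K[R \mid t]$, the following Python fragment builds $K$ and projects a 3D point. The numeric values and the helper names (`calibration_matrix`, `rotation_about_x`, `projection_matrix`, `project`) are illustrative assumptions and do not come from the paper.

```python
# Sketch of P = K [R | t] with the zero-skew K of equation (1).
# All numeric values are illustrative assumptions, not taken from the paper.
import math

def calibration_matrix(f, alpha, u, v):
    """Zero-skew K: focal length f, aspect ratio alpha, principal point (u, v)."""
    return [[alpha * f, 0.0, u],
            [0.0, f, v],
            [0.0, 0.0, 1.0]]

def rotation_about_x(angle):
    """Rotation matrix for a rotation of `angle` radians about the x-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0],
            [0.0, c, -s],
            [0.0, s, c]]

def projection_matrix(K, R, t):
    """Return the 3x4 matrix P = K [R | t]."""
    Rt = [R[i] + [t[i]] for i in range(3)]
    return [[sum(K[i][k] * Rt[k][j] for k in range(3)) for j in range(4)]
            for i in range(3)]

def project(P, X):
    """Project a 3D point X = (x, y, z) to pixel coordinates."""
    Xh = list(X) + [1.0]
    x, y, w = (sum(P[i][j] * Xh[j] for j in range(4)) for i in range(3))
    return x / w, y / w

K = calibration_matrix(f=800.0, alpha=1.1, u=320.0, v=240.0)
P = projection_matrix(K, rotation_about_x(0.0), [0.0, 0.0, 5.0])
origin_pixel = project(P, (0.0, 0.0, 0.0))
```

With the identity rotation used here, the world origin lies on the optical axis, so it projects onto the principal point $(u, v)$.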
We first briefly introduce some related work. Throughout the discussion we adopt the homogeneous representation which is standard in algebraic projective geometry [17].

In [11] it is suggested that under perspective transformation the images of the two circular points on a plane, $I = (1, i, 0)^T$ and $J = (1, -i, 0)^T$, can be computed by solving for the intersection of the images of two coplanar circles, which have the following forms in homogeneous representation:

$$\begin{aligned} a_1 x^2 + b_1 xy + c_1 y^2 + d_1 xw + e_1 yw + f_1 w^2 &= 0 \\ a_2 x^2 + b_2 xy + c_2 y^2 + d_2 xw + e_2 yw + f_2 w^2 &= 0 \end{aligned} \tag{2}$$

By solving equation (2), the images of the two circular points, $I'$ and $J'$, can be computed as

$$I' = (x_0, y_0, 1)^T, \qquad J' = (\bar{x}_0, \bar{y}_0, 1)^T \tag{3}$$

where $(x_0, y_0)$ and its complex conjugate $(\bar{x}_0, \bar{y}_0)$ are the roots of equation (2) corresponding to the circular points. Afterwards the vanishing line is computed as the cross product of the two imaged circular points:

$$l'_\infty = I' \times J' = (y_0 - \bar{y}_0,\ \bar{x}_0 - x_0,\ x_0 \bar{y}_0 - \bar{x}_0 y_0)^T \tag{4}$$

Notice that the vanishing line is a real line although the entries of $I'$ and $J'$ are always complex: every component in equation (4) is purely imaginary, so dividing by a common complex scale factor leaves real coordinates.

In [4] an algorithm is presented to estimate the focal length and the camera pose from two coplanar circles. Unfortunately, the principal point has to be known a priori and the aspect ratio is fixed at 1.0 for this algorithm to work. Attempting to use this constraint alone to estimate the principal point and the focal length at the same time results in large errors, especially in the presence of a non-unit aspect ratio. It is also mentioned in [10] that small changes in the estimated principal point may severely degrade the quality of reconstruction. Therefore, an alternative algorithm is desired that gives a reliable estimate of the principal point as well as the aspect ratio.

By integrating and extending the ideas of the above approaches, we propose a calibration scheme which simultaneously estimates the focal length, the principal point, and the aspect ratio. The algorithm is outlined in Section 3.
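The vanishing-line computation of equations (3) and (4) can be checked numerically with a short sketch. To stay self-contained, the fragment below does not intersect two imaged circles; instead it assumes a hypothetical plane-to-image homography `H` and synthesizes the imaged circular points directly from it. The helper names are illustrative as well.

```python
# Sketch: vanishing line from the imaged circular points, equations (3)-(4).
# H is a hypothetical plane-to-image homography used only to synthesize data.

def cross(a, b):
    """Cross product of two homogeneous 3-vectors (works for complex entries)."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def apply(H, p):
    """Map a homogeneous point p through the 3x3 matrix H."""
    return [sum(H[i][j] * p[j] for j in range(3)) for i in range(3)]

def dehomogenize(p):
    """Scale a homogeneous point so that its last coordinate is 1."""
    return [p[0] / p[2], p[1] / p[2], 1.0]

def vanishing_line(I_img, J_img):
    """Equation (4), rescaled by a common complex factor to real entries.

    Returns the real line coordinates and the largest leftover imaginary part.
    """
    raw = cross(I_img, J_img)
    pivot = max(raw, key=abs)
    scaled = [c / pivot for c in raw]
    return [c.real for c in scaled], max(abs(c.imag) for c in scaled)

H = [[1.0, 0.2, 3.0],
     [0.1, 0.9, 2.0],
     [0.001, 0.002, 1.0]]

I_img = dehomogenize(apply(H, [1, 1j, 0]))    # (x0, y0, 1)
J_img = dehomogenize(apply(H, [1, -1j, 0]))   # complex conjugate of I_img
line, imag_residual = vanishing_line(I_img, J_img)

# The points at infinity (1,0,0) and (0,1,0) map to the first two columns of H,
# so both images must lie on the recovered vanishing line.
h1 = [row[0] for row in H]
h2 = [row[1] for row in H]
incidence = [sum(l * p for l, p in zip(line, h)) for h in (h1, h2)]
```

In this sketch `imag_residual` and both entries of `incidence` should vanish up to rounding, which illustrates why the line in equation (4) is real even though its defining points are complex.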
3 Statistical Camera Calibration

In this section, we present a step-by-step framework that builds on conjugate direction computation and fully calibrates the camera. We start with some preparatory theory and give an orthogonal direction identification algorithm based on coplanar circles [17].

3.1 Orthogonal Vanishing Point Pairs Identification

To make full use of the geometric cues in the image we turn to the following fact: the line at infinity, $l_\infty$, is the polar line of the circle center, $o_i$, of an arbitrary circle
$C$ on the plane. In other words, $o_i$ and $l_\infty$ satisfy the pole-polar relation described in equation (5).

$$l_\infty = (l_1, l_2, l_3)^T = C o_i = \begin{bmatrix} a & b/2 & d/2 \\ b/2 & c & e/2 \\ d/2 & e/2 & f \end{bmatrix} \begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix} \tag{5}$$

A corollary of the above fact is that two orthogonal directions are conjugate to each other with respect to any circle on the plane. Note that a planar direction can be represented by the corresponding point at infinity. Accordingly, given a circle $C$ and the line at infinity $l_\infty$ on the plane, we can freely choose one point at infinity, $v$, on $l_\infty$ and determine another point at infinity, $v'$, in the direction orthogonal to $v$ using the conjugate property of the two directions. The calculation is formulated with the following equations in homogeneous representation.

$$l = C v, \qquad v' = l \times l_\infty \tag{6}$$

That is, the orthogonal direction of a given point at infinity can be computed as the intersection of its polar line, $l$, and the line at infinity, $l_\infty$. All the above computations are based on the pole-polar relationship, which is invariant under projective transformation. Consequently, the process carries over directly to a perspective view, where it determines as many conjugate vanishing point pairs as we want. An illustration of these computations is given in Figure 1.

Fig. 1. Vanishing line and orthogonal point pair computation under original and perspective view. v1 and v2 are orthogonal vanishing points.

With an image of two coplanar circles, the vanishing line can be computed using equations (2)-(4). Then many orthogonal directions can be computed using equation (6). This paves the way for our statistical calibration framework, which is detailed in the next section.

3.2 Statistical Calibration by Maximum Likelihood Estimate

For convenience we first consider the camera model with unit aspect ratio.
In the Cartesian image coordinate system, if the position of the principal point $p(x_0, y_0)$ is given, then for each freely chosen vanishing point $v(x, y)$ a corresponding vanishing point $v'(x', y')$, which represents the direction orthogonal to $v$, can be identified on the vanishing line using equation (6). Moreover, because $v$ and $v'$ are orthogonal, there is
a unique focal length corresponding to the specified $p$, $v$, and $v'$, which can be computed from the following equation [3].

$$f = \sqrt{-(x - x_0)(x' - x_0) - (y - y_0)(y' - y_0)} \tag{7}$$

Different orthogonal vanishing point pairs lead to different estimated focal lengths. Therefore, for each guessed principal point $p$ and a set of orthogonal vanishing point pairs $V = \{\{v_1, v'_1\}, \{v_2, v'_2\}, \ldots, \{v_n, v'_n\}\}$, we can estimate a corresponding set of focal lengths $F = \{f_1, f_2, \ldots, f_n\}$. Our basic idea is to employ the set $F$, which contains a large number of estimated $f$ values, to put a statistical constraint on the principal point.

We expect that, if the principal point is correctly estimated, the $f$ values in $F$ form a densely distributed cluster. On the contrary, if the guessed principal point is far from the correct position, the focal lengths computed by equation (7) are likely to be widely spread. Therefore, the distribution of the entries of $F$ provides a confidence measure for the guess of the principal point $(x_0, y_0)$. Naturally, the variance of the distribution, $D(F)$, is a good candidate to measure such confidence and evaluate the goodness of the guess. In other words, although the probability density function $P(x_0, y_0)$ is hidden from us, it can be measured through the observable spread $D(F)$ of the focal lengths. A smaller $D(F)$ corresponds to higher confidence in $(x_0, y_0)$. Under this formulation we can use $D(F)$ to characterize the probability density function of the principal point and perform calibration through maximum likelihood estimation. Note that $F$ is determined by $(x_0, y_0)$ and should more strictly be written as $F(x_0, y_0)$.
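The behaviour of $D(F)$ can be illustrated on synthetic data. The sketch below is not the authors' implementation: it assumes a unit-aspect-ratio camera, a hypothetical tilted plane and circle, and helper names chosen for illustration. It images the circle, generates orthogonal vanishing point pairs with equation (6), converts each pair to a focal length with equation (7), and evaluates the population variance for two principal point guesses. Pairs whose radicand in equation (7) is not positive are skipped, which is also an assumption of this sketch.

```python
# Sketch of the statistic D(F) of Section 3.2 on synthetic data (unit aspect ratio).
# The camera, pose, circle and angles below are illustrative assumptions.
import math
import statistics

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def matvec(M, p):
    return [sum(M[i][j] * p[j] for j in range(3)) for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(M):
    return [list(col) for col in zip(*M)]

def cofactor(M):
    """Cofactor matrix, equal to det(M) * inverse(M)^T (the scale is irrelevant here)."""
    return [cross(M[1], M[2]), cross(M[2], M[0]), cross(M[0], M[1])]

def orthogonal_partner(conic, v, l_inf):
    """Equation (6): l = C v, then v' = l x l_inf."""
    return cross(matvec(conic, v), l_inf)

def focal_length(v, v_prime, p):
    """Equation (7); returns None when the radicand is not positive."""
    x, y = v[0] / v[2], v[1] / v[2]
    xp, yp = v_prime[0] / v_prime[2], v_prime[1] / v_prime[2]
    radicand = -(x - p[0]) * (xp - p[0]) - (y - p[1]) * (yp - p[1])
    return math.sqrt(radicand) if radicand > 0 else None

def focal_variance(pairs, p):
    """D(F): population variance of the focal lengths for a guessed principal point p."""
    values = [f for f in (focal_length(v, w, p) for v, w in pairs) if f is not None]
    return statistics.pvariance(values)

# Synthetic view: f = 800, principal point (320, 240), plane tilted by 60 degrees.
f_true, p_true = 800.0, (320.0, 240.0)
K = [[f_true, 0.0, p_true[0]], [0.0, f_true, p_true[1]], [0.0, 0.0, 1.0]]
tilt = math.radians(60.0)
r1, r2, t = [1.0, 0.0, 0.0], [0.0, math.cos(tilt), math.sin(tilt)], [0.0, 0.0, 5.0]
H = matmul(K, transpose([r1, r2, t]))        # plane-to-image homography K [r1 r2 t]

Hc = cofactor(H)                              # proportional to H^{-T}
circle = [[1.0, 0.0, -0.5],                   # centre (0.5, -0.3), radius 1
          [0.0, 1.0, 0.3],
          [-0.5, 0.3, 0.34 - 1.0]]
conic_img = matmul(matmul(Hc, circle), transpose(Hc))   # imaged circle, up to scale
l_inf_img = matvec(Hc, [0.0, 0.0, 1.0])                 # vanishing line, up to scale

pairs = []
for deg in range(10, 90, 10):
    a = math.radians(deg)
    v = matvec(H, [math.cos(a), math.sin(a), 0.0])      # freely chosen vanishing point
    pairs.append((v, orthogonal_partner(conic_img, v, l_inf_img)))

variance_at_truth = focal_variance(pairs, p_true)
variance_off_centre = focal_variance(pairs, (370.0, 270.0))
```

In this noise-free setting every pair reproduces the same focal length at the true principal point, so `variance_at_truth` is zero up to rounding, while the displaced guess spreads the estimates and `variance_off_centre` becomes large. With real, noisy measurements the minimum would be shallower, which is why the variance is used as a cost rather than as an exact test.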
We take $D(F)$ as the cost function and solve the following optimization problem:

$$\min_{(x_0, y_0)} D(F(x_0, y_0)) \tag{8}$$

Under this formulation, every guess of the principal point yields a confidence value and a corresponding focal length. An optimization routine is required to seek the minimum of equation (8), which corresponds to the maximum likelihood estimate of the intrinsic parameters $(x_0, y_0, f)$. In our study the above statistical function is not easily differentiated analytically, so a derivative-free optimizer is preferred. The downhill simplex method is a good candidate for this type of optimization [16]. In addition, experiments show that many local minima exist in the solution space. We address this problem by employing multiple initial points: the optimization is repeated several times from randomly chosen starting points, and the best result is adopted as the final solution. This strategy improves reliability and robustness. After the principal point $(u, v)$ and the focal length $f$ are determined, the conic based pose estimation algorithm in [4,8] is employed to calculate the extrinsic parameters. This completes a full single-view calibration framework.

3.3 Taking Aspect Ratio into Account

With the above calibration algorithm in place, adding an extra intrinsic parameter, i.e., the scale factor $\alpha$, becomes straightforward. All we need to do is introduce $\alpha$ as an additional unknown variable in the optimization routine. During optimization, for each guessed value of the scale factor, the image is first scaled horizontally to give a corrected