Absolute Localization with the Calibrated SYCLOP Sensor

 

Cyril CAUCHOIS, Eric BRASSART, Laurent DELAHOCHE, Cyril DROCOURT

 

CREA (Center of Robotic, Electrotechnic and Automatic)

7, rue du moulin neuf

80000 Amiens – FRANCE

Phone : (33) 322-827-669

FAX : (33) 322-454-647

E-mail : Cyril.Cauchois@crea.u-picardie.fr , Eric.Brassart@crea.u-picardie.fr

 

 

Abstract

 

In this paper we propose, using the model of the SYCLOP sensor (Conical SYstem for LOcalization and Perception), an original localization method for an artificial indoor environment described by a very precise 3D map. Until now, localization methods based on conical omnidirectional vision sensors have used the radial segments produced by the vertical beacons of the environment. The main motivation of the work presented here is to show that conical vision can be used with primitives other than radial straight lines. This is why we have developed a localization method based on points in the image and their 3D correspondents. After briefly recalling the geometry of the SYCLOP sensor, we explain how, with the help of a purpose-built calibration pattern, we calibrate the sensor with a method derived from classical hard calibration techniques. This calibration gives us an accurate model of how SYCLOP's omnidirectional images are formed. The last part of the paper presents the localization method and the results obtained during an experimental phase.

 

 

1 Introduction

 

Classical vision-based applications in mobile robotics mostly use pinhole-type matrix cameras, whose field of view is limited by the lens. With more sophisticated vision systems, a larger field of view can nevertheless be obtained by using several cameras pointing in different directions (Ishiguro et al., 1993 [9]). Other applications use a single rotating camera in order to sweep a large space. In both cases, matching algorithms applied to successive pictures are required to rebuild a panoramic picture. There is another way to get wide-angle pictures: omnidirectional vision systems. Although this notion has existed for many years, it was not until the beginning of the 90's that their use intensified in robotic applications [13] [14] [15]. The reader can find a general presentation of omnidirectional vision in the article of Shree K. Nayar [11].

These types of sensors acquire scenes observed over 360°. This approach can be very useful in applications such as:

·        telemonitoring,

·        telepresence,

·        teleconference,

·        telesurveillance,

·        acquisition of an environment model for evaluating the displacements of an autonomous vehicle,

·        tracking.

 

There are two main classes of omnidirectional vision systems. The first uses a mirror and a camera and is called a 'catadioptric system' (see applications in [1] [7] [12] [13] [14] [15]). The second consists of a classical camera fitted with a fish-eye lens and is called a 'dioptric system' (see Cao et al., 1986 [10]).

We focus on the first class of these sensors.

The first patent for a system using a catadioptric mechanism was registered in 1970 by D.W. Rees [12]. In spite of this first patent, these systems long remained in the shadows, and it was only at the beginning of the 90's that they really emerged, with their use in robotic applications by Y. Yagi [1] [13]. More recently, these systems have expanded considerably in multimedia applications with the growth of the Internet, as well as in teleconferencing. In our laboratory, we have been developing robotics applications based on omnidirectional sensors since the beginning of the 90's [7] [14]. Our system, named SYCLOP (Conical SYstem for LOcalization and Perception), consists of a vertically oriented camera surmounted by a conical mirror.

This type of mirror has the advantage of providing an omnidirectional picture of the environment in which vertical beacons appear essentially as radial projections. Most authors using omnidirectional vision with this type of mirror rely solely on these radial beacons for localization, segmentation, etc. This is because straight lines other than vertical ones do not project according to a simple mathematical model, and because these sensors were not originally intended to detect beacons other than vertical ones.

Moreover, according to Nayar, these systems have the major drawback of not possessing a single viewpoint. In fact, they possess a circle of viewpoints, and therefore generate blurred pictures [15].

In section 2 of this paper, we present an original solution for determining a mathematical model of the SYCLOP sensor. This model is based on the notion of a virtual point. We consider that, locally, the conical mirror behaves like a planar mirror. The projection of a real point of the environment after conical reflection is then equivalent to the projection of a virtual point onto the image plane by a pinhole camera. We then show that distortions must be taken into account to obtain a more accurate model of SYCLOP [7].

In section 3, we describe the protocol used to calibrate the system as a whole. This protocol efficiently meets the constraints of marking, positioning and extracting the patterns that carry the 3D calibration points. We then present the SYCLOP simulator built on the resulting model. In section 4, we explain the localization method and show the results obtained during an experimental phase.

We conclude with a discussion of the perspectives offered by this new method.

 

Figure 1: SYCLOP sensor

 

2 SYCLOP Model

 

2.1 Problem description

 

Our sensor, like the one used by Yagi in [1], is made of a conical mirror and a CCD camera with an 8.5 mm lens (see Figure 1). At present, panoramic vision only allows us to detect the vertical elements over the full 2π radian range, because their 2D projections form a set of radial straight lines converging at the center of the cone. To extend detection to other lines, we have to calibrate the sensor. First, we have to determine a mathematical model of the transformation: an object of the world is reflected by the conical mirror and projected onto the image plane. Figure 2 shows the sensor geometry.

 

Figure 2 : SYCLOP geometry.

 

The transformation contains a conical reflection, which we compute using the notion of a virtual point. As shown in Figure 3, the point V is obtained by projecting the point P perpendicularly to the straight line Dt on the reflector surface. The point V is then projected onto the CCD matrix of the camera.

 

Figure 3: The virtual point V (cross-section in the plane containing P).

 

For more details about the method of calculation, the reader can refer to [7].
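
The virtual point construction can be illustrated with a small sketch. It assumes, for illustration only, that we work in the plane containing P and the cone axis, that the cone apex is at the origin with the axis along +z, and that Dt is the mirror generatrix in that plane; under these assumptions V is simply the mirror image of P with respect to Dt. The function name and the frame are hypothetical and are not taken from [7].

```python
import math

def virtual_point(r, z, cone_radius, cone_height):
    """Mirror the point (r, z) across the cone generatrix in the meridian plane.

    Assumed frame (not from the paper): apex at the origin, cone axis along +z,
    generatrix running from (0, 0) to (cone_radius, cone_height).
    """
    norm = math.hypot(cone_radius, cone_height)
    dr, dz = cone_radius / norm, cone_height / norm   # unit vector along Dt
    along = r * dr + z * dz                           # component of P along Dt
    # Reflection across a line through the origin: V = 2 (P . d) d - P
    return 2.0 * along * dr - r, 2.0 * along * dz - z
```

For a cone whose generatrix makes 45° with the axis (R = H), a point located at the apex height, two units away from the axis, is mirrored onto the axis, two units above the apex.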

 

The camera model we chose is a pinhole one. In general, observed pixel locations are not equal to the locations resulting from a simple projection onto the image plane. Because of acquisition noise, spatial digitization noise, point extraction errors and various kinds of deformation, the image is distorted. Geometrical distortion, which results from several types of imperfections in the design and assembly of the lenses composing the camera's optical system, affects the position of image points in the image plane. There are two kinds of distortion: radial and tangential [2]. Each kind is modeled by an infinite series. To determine which type of distortion and how many distortion coefficients to take into account, we conducted a study similar to the one made by Florou and Mohr [6]. Like Tsai [3] and Beyer [4] in the case of monocular camera calibration, we found that a single radial distortion coefficient is sufficient.
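
As an illustration of a pinhole projection with a single radial distortion coefficient, the following minimal sketch keeps only the first term of the radial series. The parameter names (alpha_u, alpha_v, u0, v0, k1) and the choice of applying the distortion to normalized coordinates are assumptions made for this example; the convention actually used for SYCLOP (pixel versus normalized units, direct versus inverse correction) may differ.

```python
def project_pinhole(point_cam, alpha_u, alpha_v, u0, v0, k1):
    """Project a 3D point given in the camera frame (z > 0) to pixel coordinates.

    Hypothetical parameters: alpha_u, alpha_v are the focal lengths in pixels,
    (u0, v0) is the principal point, k1 is the single radial coefficient.
    """
    x, y, z = point_cam
    xn, yn = x / z, y / z              # ideal normalized image coordinates
    r2 = xn * xn + yn * yn             # squared distance to the optical axis
    factor = 1.0 + k1 * r2             # first term of the radial series only
    u = u0 + alpha_u * xn * factor
    v = v0 + alpha_v * yn * factor
    return u, v
```

With k1 = 0 the function reduces to the ideal pinhole model, which makes the effect of the distortion term easy to isolate.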

 

2.2 The complete model

 

The calculated model is based on:

§           techniques of classic calibration (hard calibration) [3] [4] [5] for the intrinsic and extrinsic camera parameters determination,

§           the notion of virtual point for the determination of points reflected on the conical mirror.

The different stages of the transformation from a real point P to its projected point P' consist of:

§           a change from the world coordinate system to the cone coordinate system,

§           a conic reflection,

§           a change from the cone coordinate system to the camera coordinate system,

§           a perspective projection (with only one distortion coefficient).

 

The final model is given by the following system of equations:

 

 

with  

and                                                                              (1)

 

,      and   ,

R is the radius of the conical reflector and H its height.

 

 

3 Calibration

 

3.1 The calibration pattern

 

In order to determine the set of parameters characterizing the model, we built a dedicated calibration target that is positioned directly on the SYCLOP sensor. This target is a hollow cube whose four interior vertical faces carry a pattern of repeated squares (Figure 4).

 

Figure 4: The new calibration system.

 

The geometric patterns are on the interior sides of the cube. To detect the orientation of the calibration pattern in the image, one of its sides carries a special mark (see the reference mark in Figure 5).

 

Figure 5: Image of the calibration pattern.

 

 

3.2 Parameters determination

 

Figure 5 is the reference picture for the rest of this section. Only its central part is used in the computation.

In this image, we apply a Canny-Deriche edge detector followed by a simplified Hough transform. In this way, we can easily estimate all the radial straight lines (Figure 6).

 

Figure 6: All radial straight lines.
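
One way to simplify a Hough transform for radial lines is sketched below. It assumes that the image center through which all radial lines pass is already known, so that each edge point votes for a single parameter, its polar angle around that center. This is only one possible reading of the simplified transform; the function name, the bin count and the vote threshold are hypothetical.

```python
import math

def radial_line_angles(edge_points, center, n_bins=720, min_votes=20):
    """Return the angles (in degrees) of radial lines supported by edge points.

    Each edge point votes for the bin of its polar angle around `center`;
    local maxima with at least `min_votes` votes are reported at bin centers.
    """
    cx, cy = center
    votes = [0] * n_bins
    for x, y in edge_points:
        angle = math.degrees(math.atan2(y - cy, x - cx)) % 360.0
        votes[int(angle / 360.0 * n_bins) % n_bins] += 1
    peaks = []
    for i, count in enumerate(votes):
        left, right = votes[i - 1], votes[(i + 1) % n_bins]  # circular neighbors
        if count >= min_votes and count >= left and count > right:
            peaks.append((i + 0.5) * 360.0 / n_bins)
    return peaks
```

Because every line is constrained to pass through the center, the accumulator is one-dimensional, which is much cheaper than the usual two-parameter Hough space.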

 

Then all the points belonging to the radial straight lines, which correspond to the vertical straight lines of the calibration pattern, are eliminated. All other points are kept (Figure 7).

 

Figure 7: The binarised image after the radial straight lines extraction.

 

The remaining points correspond to the horizontal straight lines of the calibration pattern, which project into the picture as curves. We approximate each of these curves by an ellipse:

$$\left(\frac{x_i - c_x}{R_x}\right)^2 + \left(\frac{y_i - c_y}{R_y}\right)^2 = 1 \qquad (2)$$

where $x_i$ and $y_i$ are the coordinates of the pixels that make up the ellipse portion, $c_x$ and $c_y$ are the coordinates of its center, and $R_x$ and $R_y$ are its radii.

A least-squares solution solves this system easily. Figure 8 shows all the ellipses extracted from the image.

Figure 8: All the ellipses extracted from the image.
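
The least-squares fit can be sketched as follows, assuming an axis-aligned ellipse with center (cx, cy) and radii Rx and Ry. Expanding the ellipse equation and dividing by the coefficient of x² gives x² + b y² + c x + d y + e = 0, which is linear in the unknowns (b, c, d, e). The function names and this particular normalization are choices made for the example, not necessarily those of the original implementation.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_axis_aligned_ellipse(points):
    """Least-squares fit of x^2 + b y^2 + c x + d y + e = 0.

    Returns (cx, cy, Rx, Ry) of the corresponding axis-aligned ellipse.
    """
    rows = [[y * y, x, y, 1.0] for x, y in points]
    rhs = [-x * x for x, _ in points]
    # Normal equations (A^T A) p = A^T rhs for the unknowns p = (b, c, d, e)
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    atb = [sum(r[i] * t for r, t in zip(rows, rhs)) for i in range(4)]
    b, c, d, e = solve_linear(ata, atb)
    cx, cy = -c / 2.0, -d / (2.0 * b)
    rx2 = c * c / 4.0 + d * d / (4.0 * b) - e   # Rx^2 after completing squares
    return cx, cy, math.sqrt(rx2), math.sqrt(rx2 / b)
```

The fit only needs a portion of the ellipse, which matches the situation in the image where each horizontal line of the pattern yields an arc rather than a closed curve.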

 

Then we compute the intersections between the radial straight lines and the ellipses (Figure 9), which gives the set of 2D calibration points with sub-pixel accuracy (Figure 10).

Figure 9: Radial straight lines and ellipses composing the calibration pattern.

Figure 10: The set of calibration points obtained from the intersections.
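
The intersection step can be sketched as below, under the assumption that a radial line is a ray starting at a known point and that the ellipse is axis-aligned with parameters (cx, cy, Rx, Ry). Substituting the ray into the ellipse equation gives a quadratic in the ray parameter t; the function name and argument layout are hypothetical.

```python
import math

def radial_ellipse_intersection(origin, theta_deg, ellipse):
    """Intersect the ray origin + t (cos theta, sin theta), t > 0, with an ellipse.

    `ellipse` is (cx, cy, Rx, Ry). Returns the nearest intersection along the
    ray, or None if the ray misses the ellipse.
    """
    ox, oy = origin
    cx, cy, rx, ry = ellipse
    dx = math.cos(math.radians(theta_deg))
    dy = math.sin(math.radians(theta_deg))
    # Scale so that the ellipse becomes the unit circle centered at the origin
    px, py = (ox - cx) / rx, (oy - cy) / ry
    qx, qy = dx / rx, dy / ry
    a = qx * qx + qy * qy
    b = 2.0 * (px * qx + py * qy)
    c = px * px + py * py - 1.0
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    forward = [t for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)) if t > 0.0]
    if not forward:
        return None
    t = min(forward)
    return ox + t * dx, oy + t * dy
```

Since both the line and the ellipse come from fits over many pixels, the intersection is not restricted to integer pixel positions, which is consistent with the sub-pixel accuracy mentioned above.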

 

 

The set of 2D and 3D calibration points gives an overdetermined system. To solve it, we use the Levenberg-Marquardt non-linear minimization method [8]. The results obtained for one calibration are summarized in the following table:

 

1020.79    1016.43    384.56    287.18    -2.31e-7

-0.221    2.060    0.000    0.715    0.021    214.67

0.396    -0.556    5.644    1.218    0.160    -2.417

Table 1: Parameter values obtained from Figure 5.

 

We acquired a large number of images of the calibration pattern placed at the top (the base) of the conical mirror, in order to compute the mean of each parameter, in accordance with Puget and Skordas [5]. The results are summarized in Table 2.

 

1015.75    1011.16    384.06    287.79    -2.292e-7

0.200    0.503    0.000    1.814    -0.866    210.88

Table 2: Mean results.

 

 

[Image panels: Example 1 and Example 2, each shown as real image, simulated image and superimposition]

Table 3: Comparison between real and synthetic images.

 

 

3.3 The simulator

 

With the mathematical model of the SYCLOP sensor and the calibration results (Table 2), we have implemented a SYCLOP simulator in C. Given a 3D map of the environment, the simulator computes synthetic omnidirectional images close to real ones. Table 3 shows some examples produced in an indoor environment. The synthetic images match the real ones closely.

 

4 Localization

 

In short, the localization method we propose consists in establishing a correspondence between twelve reliable points extracted from the image and their 3D correspondents. It is then possible to estimate the 3D position of the sensor (and thus of the mobile robot) in its environment.

 

4.1 The environment

 

Our simulator computes synthetic images from a “flat” model (textures are not supported). Consequently, we chose to create an artificial environment comprising many polygons, which act as texture.

The working environment (below) measures 2 m by 3.5 m. It is composed of 5 blocks, 1.25 m high and of different widths. Every block is covered with a set of patterns that allows strong contour detection.

 

Figure 11: a 3D representation of our artificial environment.

 

As can be seen in Figure 11, each block is covered with different patterns. These patterns are black on a white background so that contours can be extracted easily.

First of all, we wanted to make sure that the 3D map was sufficiently accurate. We therefore acquired an image in the environment, taking great care to record the precise position of the robot.

With the simulator, we computed a picture at this position and superimposed the synthetic picture on the real one. The result is more than convincing: as can be seen in Figure 12, the projection of the 3D map is very close to the real picture. We can therefore move on to the next stage, which is to attempt to localize the sensor absolutely in its environment.

 

Figure 12: zoom on a part of the picture.

 

4.2 The method

 

The aim of the method is to localize the sensor with only twelve known points. Several stages are necessary:

 

4.2.1 Stage 1: The image base

 

The first stage consists in calibrating the sensor to obtain an accurate model of projection onto the image plane. With the calibrated model and a 3D map of the environment, we can then build, offline, an image base of the twelve localization points. We chose twelve points in order to get a slightly overdetermined system; in theory, only six points are necessary. These points were chosen because they are easy to extract. For each position, seven hundred and twenty pictures are computed, which gives one evaluation every 1/2 degree in orientation.
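
The offline construction of the image base can be sketched as a double loop over candidate positions and orientations. The callable `project` stands for the calibrated sensor model and is hypothetical, as are the function name and the dictionary layout; only the figure of 720 orientations per position (one every 1/2 degree) comes from the text.

```python
def build_image_base(positions, points_3d, project, n_orientations=720):
    """Precompute the 2D localization points for every candidate pose.

    `project(point_3d, x, y, heading_deg)` is a hypothetical callable wrapping
    the calibrated sensor model; it returns pixel coordinates (u, v).
    """
    step = 360.0 / n_orientations            # 0.5 degree for 720 pictures
    base = {}
    for x, y in positions:
        for k in range(n_orientations):
            heading = k * step
            # One entry per pose: the projected localization points, in order
            base[(x, y, heading)] = [project(p, x, y, heading) for p in points_3d]
    return base
```

Keeping the projected points in the same order as the 3D points means that a match in the base directly provides the 2D to 3D correspondences.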

 

4.2.2 Stage 2: The matching

 

Then, after a picture has been acquired, the twelve localization points are extracted. Next, we search the image base for the best match. This search uses the Hausdorff distance as the criterion to minimize [16]. The Hausdorff distance measures the extent to which each point of a model set lies near some point of an image set and vice versa. Thus this distance can be used to determine the degree of resemblance between two objects that are superimposed on one another.

Given two finite sets $A = \{a_1, \dots, a_p\}$ and $B = \{b_1, \dots, b_q\}$, the Hausdorff distance is defined as

$$H(A, B) = \max\bigl(h(A, B),\, h(B, A)\bigr) \qquad (3)$$

where

$$h(A, B) = \max_{a \in A} \min_{b \in B} \lVert a - b \rVert \qquad (4)$$

and $\lVert \cdot \rVert$ is some underlying norm on the points of A and B (e.g., the L2 or Euclidean norm).

The function $h(A, B)$ is called the directed Hausdorff distance from A to B. It identifies the point $a \in A$ that is the farthest from any point of B, and measures the distance from a to its nearest neighbor in B (using the given norm $\lVert \cdot \rVert$). That is, $h(A, B)$ in effect ranks each point of A based on its distance to the nearest point of B, and then uses the largest ranked such point as the distance (the most mismatched point of A). Intuitively, if $h(A, B) = d$, then each point of A must be within distance d of some point of B, and there also is some point of A that is exactly distance d from the nearest point of B (the most mismatched point).

The Hausdorff distance, $H(A, B)$, is the maximum of $h(A, B)$ and $h(B, A)$. Thus it measures the degree of mismatch between two sets, by measuring the distance of the point of A that is the farthest from any point of B and vice versa. Intuitively, if the Hausdorff distance is d, then every point of A must be within a distance d of some point of B and vice versa. Thus the notion of resemblance encoded by this distance is that each member of A be near some member of B and vice versa.
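
For small point sets such as the twelve localization points, the definition can be evaluated directly. The sketch below uses the Euclidean norm: the directed distance is the largest, over the points of A, of the distance to the nearest point of B, and the Hausdorff distance is the larger of the two directed distances. The function names are chosen for this example.

```python
import math

def directed_hausdorff(a_points, b_points):
    """h(A, B): largest distance from a point of A to its nearest point of B."""
    return max(min(math.dist(a, b) for b in b_points) for a in a_points)

def hausdorff(a_points, b_points):
    """H(A, B): the larger of the two directed distances."""
    return max(directed_hausdorff(a_points, b_points),
               directed_hausdorff(b_points, a_points))
```

With A = {(0, 0), (1, 0)} and B = {(0, 0), (4, 0)}, the directed distance from A to B is 1 while the one from B to A is 3, so the Hausdorff distance is 3: a single badly matched point is enough to penalize a candidate picture of the base.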

 

Figure 13: The Hausdorff distance between a picture and the image base.

 

 

Let us take A to be the set of points of the model and B the set of points of the picture. Since A and B have the same number of points (twelve), this stage also gives us the correspondences between 2D and 3D points.

We examined how the Hausdorff distance evolves over all the pictures of the base (Figure 13). The solution (the best match) lies at the bottom of a valley. We therefore decided to start by dividing the environment into windows 50 cm on a side.

Once this stage is finished, we have an estimate of the position and orientation of the sensor in its environment. These values are used in the refinement stage.

 

4.2.3 Stage 3: The refinement

 

We now know the correspondence between 2D and 3D points, and we have an estimate of the sensor position. During this stage, we refine this estimate by minimizing the model (1) while estimating only the rotation around the cone axis. As can be seen in Figure 13, the Hausdorff distance can be considered locally linear, so the refinement can be carried out by dichotomy.

We start by defining a window 100 mm on a side around the estimated position, which we sample with a step of 25 mm. Around the best match, we define a window 50 mm on a side, which we sample with a step of 12.5 mm. The best correspondence gives the estimate of the position (Tx, Ty) and orientation (Rz) of the sensor.
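
The two-level sampling can be sketched as a coarse-to-fine grid search. The callable `cost(x, y)` is hypothetical: it stands for the matching criterion evaluated at a candidate position (for instance the best Hausdorff distance over the orientations at that position). Only the window sizes and steps (100 mm with 25 mm, then 50 mm with 12.5 mm) come from the text.

```python
def refine_position(cost, x0, y0, stages=((100.0, 25.0), (50.0, 12.5))):
    """Coarse-to-fine grid search around (x0, y0).

    `cost(x, y)` is a hypothetical callable returning the matching criterion
    at a candidate position; each stage is (window side, step) in millimeters.
    """
    best_x, best_y = x0, y0
    for side, step in stages:
        n = int(round(side / step))          # number of steps across the window
        candidates = [(best_x - side / 2.0 + i * step,
                       best_y - side / 2.0 + j * step)
                      for i in range(n + 1) for j in range(n + 1)]
        # Keep the sample with the lowest cost as the center of the next stage
        best_x, best_y = min(candidates, key=lambda p: cost(p[0], p[1]))
    return best_x, best_y
```

Each stage evaluates 25 candidates, so the position is resolved to 12.5 mm with 50 evaluations instead of the 81 needed by a single 12.5 mm grid over the whole 100 mm window.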

 

4.2.4 Stage 4: The spatial localization

 

This fourth and last stage localizes the SYCLOP sensor spatially in its environment. The final values of the third stage serve as the initialization for the minimization of the model (1). In this way, we can estimate the rigid motion between the world (the environment) and the sensor (the robot). This minimization is carried out with the Levenberg-Marquardt algorithm [8].

 

4.3 Experimentation

 

For this experiment, we chose twelve points located on three different blocks (four on each). The robot was instructed to travel 1 meter along a straight line and to acquire an image every 5 cm. In this way we acquired 21 pictures of the environment. The results of this experiment are summarized in Table 4.

 

 

| n° | Theoretical Tx (mm) | Theoretical Ty (mm) | Computed Tx (mm) | Computed Ty (mm) | Computed Tz (mm) | Computed Rx (degree) | Computed Ry (degree) | Computed Rz (degree) |
|---|---|---|---|---|---|---|---|---|
| 1 | 953.51 | 901.37 | 966.00 | 909.63 | 513.93 | -0.13 | -0.31 | 106.23 |
| 2 | 1001.18 | 916.44 | 1012.22 | 922.31 | 513.30 | -0.08 | -0.31 | 106.20 |
| 3 | 1048.86 | 931.50 | 1056.68 | 946.06 | 512.98 | -0.26 | -0.27 | 106.29 |
| 4 | 1096.53 | 946.57 | 1109.13 | 954.28 | 512.89 | -0.18 | -0.21 | 106.18 |
| 5 | 1144.21 | 961.64 | 1152.62 | 968.71 | 513.44 | -0.09 | -0.10 | 106.22 |
| 6 | 1191.88 | 976.71 | 1198.89 | 969.71 | 512.77 | 0.00 | 0.00 | 106.30 |
| 7 | 1239.56 | 991.78 | 1247.37 | 995.53 | 513.68 | 0.00 | -0.04 | 106.27 |
| 8 | 1287.24 | 1006.85 | 1299.29 | 1011.00 | 512.78 | -0.10 | -0.05 | 106.17 |
| 9 | 1334.91 | 1021.91 | 1341.02 | 1024.86 | 512.07 | -0.02 | -0.05 | 106.11 |
| 10 | 1382.59 | 1036.98 | 1389.99 | 1040.63 | 512.32 | 0.02 | -0.12 | 106.02 |
| 11 | 1430.26 | 1052.05 | 1431.74 | 1051.99 | 512.35 | -0.03 | -0.05 | 106.08 |
| 12 | 1477.94 | 1067.12 | 1469.02 | 1064.08 | 512.46 | 0.15 | -0.05 | 105.88 |
| 13 | 1525.61 | 1082.19 | 1515.12 | 1074.48 | 511.92 | 0.22 | -0.12 | 105.99 |
| 14 | 1573.29 | 1097.26 | 1566.37 | 1096.09 | 510.98 | 0.08 | -0.17 | 105.94 |
| 15 | 1620.96 | 1112.33 | 1615.84 | 1108.20 | 510.75 | 0.19 | -0.23 | 105.83 |
| 16 | 1668.64 | 1127.39 | 1662.74 | 1128.07 | 510.87 | -0.04 | -0.17 | 105.84 |
| 17 | 1716.31 | 1142.46 | 1708.42 | 1138.84 | 510.10 | -0.02 | -0.17 | 106.03 |
| 18 | 1763.99 | 1157.53 | 1760.09 | 1152.72 | 510.34 | 0.02 | -0.28 | 105.85 |
| 19 | 1811.67 | 1172.60 | 1799.82 | 1164.13 | 509.39 | -0.07 | -0.22 | 105.83 |
| 20 | 1859.34 | 1187.67 | 1856.98 | 1180.32 | 509.20 | -0.07 | -0.24 | 106.15 |
| 21 | 1907.02 | 1202.74 | 1906.93 | 1193.61 | 509.15 | 0.02 | -0.12 | 105.92 |

Table 4: Spatial localization results (millimeters and degrees).

 

Figure 14: 2D projection of the localization results.

 

These results are very encouraging, since the maximum error with respect to the theoretical position is only about 17 mm (acquisition n°3 in Figure 14). The orientation is also very well estimated, since it deviates by at most 0.3° from the theoretical orientation of 106°.

An interesting remark can be made about the Tz value. Over its 1 meter trajectory, the estimated height of the robot decreased by nearly 5 mm. In other words, the floor of the environment slopes downward relative to the blocks.
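
The maximum position error can be checked directly from the values of Table 4. The short script below computes, for each acquisition, the Euclidean distance in the (Tx, Ty) plane between the theoretical and the computed position; the variable and function names are chosen for this example.

```python
import math

# (theoretical Tx, theoretical Ty, computed Tx, computed Ty) in mm, from Table 4
ROWS = [
    (953.51, 901.37, 966.00, 909.63), (1001.18, 916.44, 1012.22, 922.31),
    (1048.86, 931.50, 1056.68, 946.06), (1096.53, 946.57, 1109.13, 954.28),
    (1144.21, 961.64, 1152.62, 968.71), (1191.88, 976.71, 1198.89, 969.71),
    (1239.56, 991.78, 1247.37, 995.53), (1287.24, 1006.85, 1299.29, 1011.00),
    (1334.91, 1021.91, 1341.02, 1024.86), (1382.59, 1036.98, 1389.99, 1040.63),
    (1430.26, 1052.05, 1431.74, 1051.99), (1477.94, 1067.12, 1469.02, 1064.08),
    (1525.61, 1082.19, 1515.12, 1074.48), (1573.29, 1097.26, 1566.37, 1096.09),
    (1620.96, 1112.33, 1615.84, 1108.20), (1668.64, 1127.39, 1662.74, 1128.07),
    (1716.31, 1142.46, 1708.42, 1138.84), (1763.99, 1157.53, 1760.09, 1152.72),
    (1811.67, 1172.60, 1799.82, 1164.13), (1859.34, 1187.67, 1856.98, 1180.32),
    (1907.02, 1202.74, 1906.93, 1193.61),
]

def position_errors(rows):
    """Planar distance between theoretical and computed positions, in mm."""
    return [math.hypot(cx - tx, cy - ty) for tx, ty, cx, cy in rows]

errors = position_errors(ROWS)
# 1-based number of the acquisition with the largest planar error
worst = max(range(len(errors)), key=errors.__getitem__) + 1
```

The largest value is reached at acquisition 3, where the offsets of 7.82 mm in Tx and 14.56 mm in Ty give an error of about 16.5 mm.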

 

 

5 Conclusions and perspectives

 

In this paper, we have presented an original method of spatial localization based on the use of an omnidirectional vision sensor.

After recalling the mathematical model that best characterizes image formation with our omnidirectional vision sensor, we showed that a suitable calibration pattern makes it possible to calibrate this sensor easily, quickly and accurately. With a mathematical model and a set of values characterizing the SYCLOP sensor, we developed a simulator of omnidirectional pictures. The images computed with this simulator are very close to real ones. With these images, we build an image base that allows us to estimate the position of the sensor in its environment. Finally, we estimate the rigid motion between the environment coordinate system and the sensor coordinate system, which gives a spatial localization of the robot. The results are accurate in both position and orientation: in our experiment, the position error did not exceed about 17 mm and the orientation error about 0.3°.

Future work will focus on extending the image base. We plan to use a cylindrical projection of the omnidirectional images in order to obtain panoramic images. In this way, we hope to keep estimating the robot position and then automatically extract the set of calibration points usable at this position.

 

Acknowledgement

 

This work was supported in part by “Région Picardie” under the project “Pôle DIVA” (SAAC project 00-2 - “Système Actif d'Aide à la Conduite”).

 

References

 

[1]    Y. Yagi, S. Kawato and S. Tsuji, “Real-time omnidirectional image sensor (COPIS) for vision-guided navigation”, IEEE Transactions on Robotics and Automation, Vol. 10, N°1, 1994.

[2]    C.C. Slama, editor. “Manual of Photogrammetry, fourth edition”. American Society of Photogrammetry and Remote Sensing, Falls Church, Virginia, USA, 1980.

[3]    R.Y. Tsai. “A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses”. IEEE Journal of Robotics and Automation, 3(4):323-344, August 1987.

[4]    H.A. Beyer. “Accurate Calibration of CCD-Cameras”, International Conference on Computer Vision and Pattern Recognition, Urbana-Champaign, Illinois, USA, pp. 96-101, 1992.

[5]    P. Puget and Th. Skordas. “An Optimal Solution for Mobile Camera Calibration”. In Proceedings of the European Conference on Computer Vision, Antibes, France, April 1990, pp. 187-198.

[6]    G. Florou and R. Mohr. “What Accuracy for 3D Measurements with Cameras”. Internal Report, GRAVIR, Grenoble, France, December 21, 1995.

[7]    E. Brassart, L. Delahoche, C. Cauchois, C. Drocourt, C. Pegard, M. Mouaddib, “Experimental results got with the omnidirectional vision sensor : SYCLOP”, Proc. of the IEEE Workshop on Omnidirectional Vision (OMNIVIS'00), Hilton Head Island, South Carolina, USA, June 12, 2000, pp. 145-152.

[8]    “Numerical Recipes in C: The Art of Scientific Computing - Second Edition”, Cambridge University Press, ISBN 0-521-43108-5.

[9]    H. Ishiguro, S. Tsuji, “Applying Panoramic Sensing to Autonomous Map Making a Mobile Robot”, in Proc. '93 International Conference on Advanced Robotics, November 1993, pp. 127-132.

[10]  Zuoliang L. Cao, Sung J. Oh, Ernest L. Hall. “Omnidirectional dynamic vision positioning for a mobile robot”, Journal of Robotic Systems, 3(1), 1986, pp. 5-17.

[11]  Shree K. Nayar. “Omnidirectional vision”, The Eighth International Symposium of Robotics Research, October 3-7, 1997.

[12]  D.W. Rees, “Panoramic television viewing system”, United States Patent N° 3,505,465, Apr. 1970.

[13]  Y. Yagi, S. Kawato, “Panorama Scene Analysis with Conic Projection”, IEEE International Workshop on Intelligent Robots and Systems, IROS’90.

[14]  C. Pégard, M. Mouaddib, “A mobile robot using panoramic view”, IEEE International Conference on Robotics and Automation, Minneapolis, Minnesota, pp. 89-94, April 1996.

[15]  S. Baker and S. K. Nayar, “A Theory of Catadioptric Image Formation”, Proceedings of the 6th International Conference on Computer Vision, Bombay, January 1998, pp. 35-42.

[16]  D. P. Huttenlocher, G. A. Klanderman, W. J. Rucklidge, “Comparing Images Using the Hausdorff Distance”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(9):850-863, 1993.