SVM Based Traffic Sign Classification Using Legendre Moments

Hasan Fleyeh (hfl@du.se), Mark Dougherty (mdo@du.se)
Computer Science Department, Dalarna University, Borlänge, Sweden

Abstract
This paper presents a novel approach to recognising traffic signs using Support Vector Machines (SVMs) and Legendre moments. Images of traffic signs are collected by a digital camera mounted in a vehicle. They are colour segmented, and all objects which represent signs are extracted and normalised to 36x36 pixel images. Legendre moments are computed for 350 images of sign borders and 250 images of speed-limit signs, and the SVM classifier is trained with these features. Two stages of SVM are trained: the first stage determines the class of the sign from the shape of its border, and the second determines the pictogram of the sign. Training and testing of both SVM classifiers are done offline using still images. In the online mode, the system loads the SVM training model and performs recognition.

Keywords: Traffic signs, Legendre moments, SVM, Classification.

1. Introduction
The main goal of automatic sign recognition is to extract traffic signs from images of complex scenes under uncontrollable illumination. Traffic signs define a visual language that can be interpreted by drivers. They represent the current traffic situation on the road, show the dangers and difficulties facing drivers, give them warnings, and help them with their navigation by providing useful information that makes driving safe and convenient [1, 2].

Traffic signs are designed using special shapes and colours, very different from the natural environment, which make them easily recognisable by drivers. They are designed, manufactured and installed according to stringent regulations [3]. To be distinguishable from natural and/or man-made backgrounds, they are designed in fixed 2-D shapes such as triangles, circles, octagons, or rectangles [4]. A sign can have three colours: one each for the border, the background and the pictogram.
The tint of the paint which covers the sign should correspond to a specific wavelength in the visible spectrum [3, 5]. The signs are placed in well-defined locations with respect to the road, so that the driver can, more or less, anticipate where they will appear [6]. They may contain a pictogram, a string of characters or both [5]. Traffic signs use fixed text fonts and character heights. They can appear in different conditions, including partly occluded, distorted, damaged and clustered in a group of more than one sign [4, 5].

Because of the complex environment of roads and the scenes around them, the detection and recognition of traffic signs face a number of difficulties. The colour of the sign fades with time as a result of long exposure to sunlight and the reaction of the paint with the air [1, 7]. Visibility is affected by weather conditions such as fog, rain, clouds and snow [1]. The colour information is very sensitive to variations in the light conditions such as shadows, clouds, and the sun [1, 7, 8]. It can be affected by the illuminant colour (daylight), illumination geometry, and viewing geometry [9]. Objects similar in colour to the traffic signs, such as buildings or vehicles, may be present in the scene under consideration. Signs may be found disoriented, damaged or occluded. If the image is acquired from a moving car, it often suffers from motion blur and car vibration.

This paper presents a new traffic sign recognition system which uses Legendre moments as features and SVMs as shape and pictogram classifiers. Two different SVM classifiers are trained with Legendre moments computed for 350 and 250 images, respectively, to recognise and classify the signs.

This paper is organised as follows. In section 2, related work is presented. Section 3 describes the Swedish traffic signs and section 4 presents an overview of the system. Sections 5 and 6 present the experimental results and conclusions.

2. Related Work
Research in traffic sign recognition is growing rapidly because of the real need for such systems in future vehicles. De la Escalera et al. [10] used neural networks for classification of the traffic signs; the network follows the Adaptive Resonance Theory paradigm ART1. Fang et al. [1] carried out classification by the conceptual component module, in which an ART2 network with a configurable long-term memory is used to achieve classification. Lafuente-Arroyo et al. [11] developed a system in which candidate signs are extracted by thresholding hue and saturation. Candidate objects are classified using an SVM which is trained on the distances from the external contour of the object to its bounding box. Gil-Jiménez et al. [12] designed a traffic sign classification system using a series of comparisons between the FFT of the signature of the candidate object and the FFT of the signature of the reference shape of the traffic sign.

3. Swedish Road Signs
In contrast to many other European countries, Swedish road signs are characterised by a yellow background colour for Warning and Prohibitory signs. Swedish traffic signs can be categorised into four types:
• Warning signs: red rimmed yellow triangles with symbols or letter messages.
• Prohibitory signs: red rimmed yellow or blue circles with different symbols or messages. An octagon is used for the stop sign.
• Mandatory signs: circles with blue filling colour and white symbols or arrows.
• Indicatory and supplementary signs: rectangles with different background colours such as yellow, green, or blue, with white or black symbols or messages.

Figure 1 shows the basic grouping of Swedish traffic signs based on the colour of the border. They can be divided, mainly, into two major groups, red and blue. This paper concentrates on the recognition of the signs in the red border group, followed by the recognition of Speed-Limit signs. Speed-Limit signs are part of the prohibitory signs. There are five standard Speed-Limit signs, namely 30, 50, 70, 90, and 110 km/h. However, there are many other special Speed-Limit signs such as 5, 10, 15, 20, and 40 km/h. These special Speed-Limit signs are only occasionally found in use, and it is therefore hard to collect sufficient examples for training and testing the system.

Figure 1: Main colour-shape combinations of Swedish road signs.

4. System Overview
The proposed system, shown in figure 2, consists of a number of units which work together to perform the recognition of the traffic signs. These units are as follows:

A. The Camera: Data acquisition of images for training, testing and real-time applications is carried out by a camera mounted on a moving vehicle. More than 3400 images, collected in different light conditions, are used for training and testing of the algorithms.

B. Colour Segmentation: Colour segmentation is an important step which eliminates all background objects and unimportant information in the image. It generates a binary image containing the road signs and any other objects similar to the road sign in colour. Colour segmentation is carried out by a shadow and highlight invariant algorithm [13].

C. Shape Analysis: The output of the former unit is a binary image with a number of objects which could be traffic signs. This unit is designed to work in two modes. In the offline mode it is invoked to create or update the image database. Objects in the segmented image are extracted using the connected components labelling algorithm, size normalised to 36x36 pixels and saved in the database for the training of the classifier.
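The offline extraction step can be illustrated with a minimal sketch. It assumes 4-connectivity for the labelling and nearest-neighbour sampling over the bounding box for the size normalisation; the paper specifies neither, and the function names `label_components` and `normalise` are hypothetical.

```python
from collections import deque

def label_components(img):
    """Return the connected foreground regions of a binary image.

    img is a list of rows of 0/1 values. Each region is returned as a
    list of (row, col) pixels. 4-connectivity is an assumption.
    """
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not img[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and img[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components

def normalise(pixels, n=36):
    """Rescale the bounding box of one region to an n x n binary image."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    y_min, x_min = min(ys), min(xs)
    height = max(ys) - y_min + 1
    width = max(xs) - x_min + 1
    foreground = set(pixels)
    out = [[0] * n for _ in range(n)]
    for yo in range(n):
        for xo in range(n):
            # Nearest-neighbour lookup of the source pixel (assumption).
            src = (y_min + (yo * height) // n, x_min + (xo * width) // n)
            if src in foreground:
                out[yo][xo] = 1
    return out

# Tiny hypothetical segmented image with two separate objects.
demo = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 1],
]
regions = label_components(demo)
patches = [normalise(region, n=36) for region in regions]
```

Sampling each output pixel from the source image, rather than pushing source pixels forward, avoids holes when a small object is enlarged; this is a design choice of the sketch, not something stated in the paper.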
The following set of equations is used for normalisation:

$$x' = \frac{N\,(x - x_{\min})}{x_{\max} - x_{\min}} \tag{1}$$

$$y' = \frac{N\,(y - y_{\min})}{y_{\max} - y_{\min}} \tag{2}$$

where $x_{\min}$, $x_{\max}$, $y_{\min}$, $y_{\max}$ are the coordinates of the vertices of the rectangle containing the sign before normalisation, with sides parallel to the vertical and horizontal axes, and $(x', y')$ are the coordinates of a generic point in the new $N \times N$ matrix corresponding to the $(x, y)$ coordinates of the pixel in the original matrix. In the online mode, the same process is followed, but the images are forwarded to the feature extraction unit instead.

D. Training Database: The training database consists of 600 binary images of size 36x36 pixels, used for training of the SVM by calculating Legendre moments. The database comprises 350 images of border shapes and 250 images of Speed-Limit signs. It is created by the method described in the preceding step. Figures 3 and 4 show part of the database of the borders and pictograms used for recognition of the speed limit signs.

Figure 3: Part of the Training Set for the Border Recognition. (Classes shown: STOP, RC, TRI, RCB, RCX, NOEN.)
Classes shown in Figure 4: SP30, SP50, SP70, SP90, SP110.

E. Feature Extraction: Legendre moments are used in this work as features. The set of Legendre moments was proposed by Teague [14] as a set of orthogonal moments for image analysis. Legendre moments are used in different applications such as pattern recognition, image indexing and face recognition. The kernels of Legendre moments are products of Legendre polynomials defined along rectangular image coordinate axes inside a unit circle. Legendre moments of order $(m + n)$ are defined as [15]

$$L_{mn} = \frac{(2m+1)(2n+1)}{4} \int_{-1}^{1}\int_{-1}^{1} P_m(x)\, P_n(y)\, f(x, y)\, dx\, dy \tag{3}$$

where $m, n = 1, 2, 3, \ldots, \infty$ and $x, y \in [-1, 1]$. The $n$th order Legendre polynomials are defined as

$$P_n(x) = \sum_{k=0}^{n} (-1)^{(n-k)/2}\, \frac{1}{2^n}\, \frac{(n+k)!\; x^k}{\left(\frac{n-k}{2}\right)!\, \left(\frac{n+k}{2}\right)!\; k!} \tag{4}$$

where $|x| \le 1$ and the sum runs only over those $k$ for which $(n - k)$ is even.
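The series in equation (4) can be checked numerically with a short sketch. The function names are hypothetical, and the three-term (Bonnet) recurrence used for comparison is a standard alternative way of evaluating Legendre polynomials, not a method taken from the paper.

```python
from math import factorial

def legendre_series(n, x):
    """Evaluate P_n(x) with the finite series of equation (4)."""
    total = 0.0
    # Only terms with (n - k) even contribute, so k has the parity of n.
    for k in range(n % 2, n + 1, 2):
        half_diff = (n - k) // 2
        half_sum = (n + k) // 2
        coeff = (-1) ** half_diff * factorial(n + k) / (
            2 ** n * factorial(half_diff) * factorial(half_sum) * factorial(k)
        )
        total += coeff * x ** k
    return total

def legendre_recurrence(n, x):
    """Evaluate P_n(x) with (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    if n == 0:
        return 1.0
    p_prev, p_curr = 1.0, x
    for k in range(1, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - k * p_prev) / (k + 1)
    return p_curr

# P_2(x) = (3 x^2 - 1) / 2, so P_2(0.5) = -0.125
p2_at_half = legendre_series(2, 0.5)
```

The recurrence avoids large factorials and is usually preferred when many orders are needed at the same point, whereas the series form maps directly onto equation (4).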
The above series expansion of the Legendre polynomial can be obtained from the equation

$$P_n(x) = \frac{1}{2^n\, n!} \left(\frac{d}{dx}\right)^{n} \left[\left(x^2 - 1\right)^n\right] \tag{5}$$

The set of Legendre polynomials $P_n(x)$ forms a complete orthogonal basis set on the interval [-1, 1], and the Legendre moments $L_{mn}$ generalise the geometric moments $m_{pq}$ in the sense that the monomial $x^p y^q$ is replaced by the orthogonal polynomial $P_m(x) P_n(y)$ of the same order.

As mentioned above, Legendre polynomials are defined on the interval [-1, 1]. An $N \times N$ pixel image with intensity $f(i, j)$, where $0 \le i, j \le N - 1$, should therefore be scaled to fit the region $-1 \le x, y \le 1$. The discrete version of the Legendre moments can be given as [16]

$$L_{mn} = \frac{(2m+1)(2n+1)}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} P_m(x_i)\, P_n(y_j)\, f(i, j) \tag{6}$$

where $x_i$ and $y_j$ denote the normalised pixel coordinates in the range [-1, 1], given by

$$x_i = \frac{2i}{N-1} - 1, \qquad y_j = \frac{2j}{N-1} - 1 \tag{7}$$

Figure 4: Part of the Training Set for the Interior of the Sign.

To calculate the Legendre moments for digital binary images, the following steps are invoked:
1. Find the centre of mass $(x_{cen}, y_{cen})$ of the object under consideration.
2. Find the minimum bounding circle and calculate its radius, denoted $r_{\min}$, from
$$r_{\min} = \sqrt{(i - x_{cen})^2 + (j - y_{cen})^2} \tag{8}$$
where $0 \le i, j \le N - 1$ and $(i, j)$ is the position of the current pixel.
3. Normalise the coordinates of the image such that $-1 \le x_i, y_j \le 1$ as follows:
$$x_i = \frac{j - y_{cen}}{r_{\min}}, \qquad y_j = \frac{x_{cen} - i}{r_{\min}} \tag{9}$$
4. Calculate the Legendre moments using equation (6).

F. SVM Classifier: SVM is a new kind of pattern classification and regression technique based on statistical learning theory, first proposed by Vapnik in 1992 [17]. The SVM learns a separating hyperplane that maximises the margin and produces good generalisation ability.
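To make the notions of separating hyperplane and margin concrete, the following minimal sketch evaluates a linear decision rule for a hand-picked weight vector. The values of `w`, `b` and the sample points are hypothetical and are not trained from the paper's data. The margin formula assumes the usual canonical scaling, in which the closest samples on either side give a decision value of exactly +1 or -1.

```python
from math import sqrt

def decision_value(w, b, x):
    """Signed score w.x + b of sample x relative to the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def margin_width(w):
    """Distance between the planes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / sqrt(sum(wi * wi for wi in w))

# Hypothetical hyperplane x1 + x2 - 3 = 0 and four 2-D samples.
w, b = (1.0, 1.0), -3.0
positives = [(2.0, 2.0), (3.0, 2.5)]
negatives = [(1.0, 1.0), (0.5, 1.0)]
labels = [classify(w, b, x) for x in positives + negatives]
width = margin_width(w)
```

Because the margin is inversely proportional to the norm of the weight vector, maximising the margin is equivalent to minimising that norm, which is the form an SVM training problem usually takes.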
Because of its good generalisation performance on a wide range of real-life data, and because the approach is well motivated theoretically, it has been used in a wide range of applications. In a binary classification problem the training data is given as a data set $S$ of points $\mathbf{x}_i \in \mathbb{R}^d$ with labels $y_i \in \{-1, +1\}$, $i = 1, \ldots, l$, where $l$ is the number of training examples and $d$ is the dimension of the problem. When training an SVM, the goal is to construct a separating hyperplane as the decision plane, which separates the positive (+1) and the negative (-1) classes with the largest margin, as shown in figure 5.

Figure 5: Linear separating hyperplane of two classes. (Labels in the figure: w, C1, C2, Margin.)

Linear classification is performed by using a linear function of the input vector. This function is given by

$$f(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b = \sum_{i=1}^{d} w_i x_i + b \tag{10}$$

where $x_i$ is the $i$th attribute value of an input vector $\mathbf{x}$, $w_i$ is the weight value for the attribute $x_i$ and $b$ is the bias. The hyperplane can be defined as

$$\mathbf{w} \cdot \mathbf{x} + b = 0, \qquad \mathbf{w} \in \mathbb{R}^d,\; b \in \mathbb{R} \tag{11}$$

The optimal hyperplane can then be found by maximising the margin, which leads to the following optimisation problem:

$$\min_{\mathbf{w}}\; \tau(\mathbf{w}) = \frac{\|\mathbf{w}\|^2}{2} \tag{12}$$

subject to the constraints $y_i\left((\mathbf{x}_i \cdot \mathbf{w}) + b\right) \ge 1$ for $i = 1, \ldots, l$. Introducing Lagrange multipliers $\alpha_i \ge 0$ gives the Lagrangian

$$L(\mathbf{w}, b, \alpha) = \frac{\|\mathbf{w}\|^2}{2} - \sum_{i=1}^{l} \alpha_i \left( y_i \left( (\mathbf{x}_i \cdot \mathbf{w}) + b \right) - 1 \right) \tag{13}$$

Setting the derivative of $L$ with respect to $\mathbf{w}$ to zero gives $\mathbf{w} = \sum_{i=1}^{l} y_i \alpha_i \mathbf{x}_i$, so the classification of a new pattern $\mathbf{x}$ can be obtained by evaluating the decision function

$$f(\mathbf{x}) = \mathrm{sign}(\mathbf{w} \cdot \mathbf{x} + b) = \mathrm{sign}\left( \sum_{i=1}^{l} y_i \alpha_i\, (\mathbf{x} \cdot \mathbf{x}_i) + b \right) \tag{14}$$

5. Experiments and Results
Sign recognition is mainly carried out in three major steps: colour segmentation, detection and classification. Figure 6 depicts results from both colour segmentation and detection. Colour segmentation is carried out by an algorithm which is invariant to shadows and highlights; it is robust to a wide range of light conditions and has been tested on hundreds of images.
Sign detection is based on calculating four shape measures: rectangularity, triangularity, ellipticity, and octagonality. They are invariant to in-plane transformations such as rotation, scaling, and translation. This invariance is necessary for traffic sign recognition applications, as the signs can appear rotated, in different places in the image or in different sizes. Once such an object is detected, it is normalised, and its Legendre moments are computed and forwarded to the SVM classifier.

Figure 6: Results of Segmentation and Detection in different conditions.

Classification is achieved in two stages. In the first stage, the red border is recognised, followed by the recognition of the interior part of the sign, or the pictogram. Shapes of red rimmed traffic signs can be divided into seven categories. Because Legendre moments are rotation invariant by definition, it is impossible to discriminate "upward" and "downward" triangles. Therefore, the number of classes is reduced to six by merging the two triangle classes. Table I depicts the classification rate of the SVM for different traffic signs.

Table I: Classification rate of red rimmed signs and speed limit signs.
Sign    Rate %    Sign     Rate %
NOE     100       SL30     100
STP     100       SL50     100
RC      100       SL70     93
TRI     100       SL90     100
RCB     93        SL110    93
RCX     100

The other experiment tests the order of the Legendre moments against the SVM classification rate. As illustrated in figure 7, the classification rate is almost constant for all orders of Legendre moments above the 6th. Choosing this order can reduce the amount of calculation and make the system faster.

Figure 7: Effect of Legendre moments order on classification rate when C is constant. (C-SVM, linear kernel; training and testing curves; horizontal axis: order of Legendre moments, 3 to 10; vertical axis: classification accuracy %, 90 to 102.)

Table II depicts the results of classification when different SVM types and different kernels are used. According to this table, the linear kernel with C-SVM gives the best classification results. Speed-Limit signs are used for this experiment because their high mutual similarity makes them hard to tell apart.

Table II: Classification rates of border shapes and Speed-Limit signs using different kernels and SVM types (C-SVM and Nu-SVM). Columns: Kernel; Shapes Train %, Shapes Test %; Speed-Limits Train %, Speed-Limits Test %.
100 98.9 100 98.7 Linear 94.7 94.4 97.7 97.3 Polyn. RBF 100 98.9 98.8 97.3 Sigm. Linear 100 100 98.9 98.9 99.4 98.8 97.3 97.3 Polyn. 100 98.9 100 97.3 RBF 100 98.9 98.8 97.3 Sigm. 100 98.9 98.8 97.3

6. Conclusions
This paper presents a new method to classify traffic signs. It is based on using Legendre moments as invariant features. Legendre moments are invariant to rotation by definition. Furthermore, a method to make Legendre moments invariant to scaling and translation is shown in this paper. Invariance is an important property when dealing with images that have undergone different transformations in the image plane, which is very likely to happen with traffic signs. A two-stage SVM classifier is used for the classification of sign border shapes and sign interiors, respectively. The method shows high robustness and a high classification rate. For future work, more features or feature fusion will be tested. Orthogonal Fourier-Mellin descriptors are planned to be tested in the future.

References
[1] C. Fang, C. Fuh, S. Chen, and P. Yen, "A road sign recognition system based on dynamic visual model," presented at The 2003 IEEE Computer Society Conf. Computer Vision and Pattern Recognition, Madison, Wisconsin, 2003.
[2] C. Fang, S. Chen, and C. Fuh, "Road-sign detection and tracking," IEEE Trans. on Vehicular Technology, vol. 52, pp. 1329-1341, 2003.
[3] S. Vitabile and F. Sorbello, "Pictogram road signs detection and understanding in outdoor scenes," presented at Conf. Enhanced and Synthetic Vision, Orlando, Florida, 1998.
[4] P. Parodi and G. Piccioli, "A feature-based recognition scheme for traffic scenes," presented at Intelligent Vehicles '95 Symposium, Detroit, USA, 1995.
[5] S. Vitabile, A. Gentile, and F. Sorbello, "A neural network based automatic road sign recognizer," presented at The 2002 Inter. Joint Conf. on Neural Networks, Honolulu, HI, USA, 2002.
[6] M. Lalonde and Y. Li, "Road sign recognition. Technical report, Centre de recherche informatique de Montréal, Survey of the state of Art for sub-Project 2.4, CRIM/IIT," 1995.
[7] J. Miura, T. Kanda, and Y. Shirai, "An active vision system for real-time traffic sign recognition," presented at 2000 IEEE Intelligent Transportation Systems, Dearborn, MI, USA, 2000.
[8] S. Vitabile, G. Pollaccia, G. Pilato, and F. Sorbello, "Road sign recognition using a dynamic pixel aggregation technique in the HSV color space," presented at 11th Inter. Conf. Image Analysis and Processing, Palermo, Italy, 2001.
[9] S. Buluswar and B. Draper, "Color recognition in outdoor images," presented at Inter. Conf. Computer Vision, Bombay, India, 1998.
[10] A. de la Escalera, J. Armingol, and M. Mata, "Traffic sign recognition and analysis for intelligent vehicles," Image and Vision Comput., vol. 21, pp. 247-258, 2003.
[11] S. Lafuente-Arroyo, P. Gil-Jiménez, R. Maldonado-Bascón, F. López-Ferreras, and S. Maldonado-Bascón, "Traffic sign shape classification evaluation I: SVM using distance to borders," presented at IEEE Intelligent Vehicles Symposium, Las Vegas, USA, 2005.
[12] P. Gil-Jiménez, S. Lafuente-Arroyo, H. Gomez-Moreno, F. López-Ferreras, and S. Maldonado-Bascón, "Traffic sign shape classification evaluation II: FFT applied to the signature of blobs," presented at IEEE Intelligent Vehicles Symposium, Las Vegas, USA, 2005.
[13] H. Fleyeh, "Shadow and highlight invariant colour segmentation algorithm for traffic signs," presented at 2006 IEEE Conf. on Cybernetics and Intelligent Systems, Bangkok, Thailand, 2006.
[14] M. Teague, "Image analysis via the general theory of moments," J. Opt. Soc. Am., vol. 70, pp. 920-930, 1980.
[15] P. Yap and R. Paramesran, "An efficient method for the computation of Legendre moments," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, pp. 1996-2002, 2005.
[16] C. Chong, P. Raveendran, and R. Mukundan, "Translation and scale invariants of Legendre moments," Pattern Recog., vol. 37, pp. 119-129, 2004.
[17] C. Cortes and V. Vapnik, "Support vector networks," Machine Learning, vol. 20, pp. 273-297, 1995.

Figure 2: Block Diagram of the Proposed System.