2DPCA fractal features and genetic algorithm for efficient face representation and recognition

In this article, we present an automatic face recognition system. We show that fractal features obtained from Iterated Function System allow a successful face recognition and outperform the classical approaches. We propose a new fractal feature extraction algorithm based on genetic algorithms to speed up the feature extraction step. In order to capture the more important information that is contained in a face with a few fractal features, we use a bi-dimensional principal component analysis. We have shown with experimental results using two databases as to how the optimal recognition ratio and the recognition time make our system an effective tool for automatic face recognition.


I. Introduction
The human face is a very rich source of information that can be used to identify persons. This ability of recognition allows us to distinguish persons despite the facial resemblance between them. Nowadays, many researchers try to benefit from computer applications, which become widely used in face automatic recognition.
After more than 30 years of research, we can classify the different existing face recognition systems into three main approaches.
• Local approaches which are based on the fact that the face contains parts that have a high discriminating power such as eyes, nose, mouth... To recognize a person, we use either the blocks containing these regions or the geometric relationships between them [1,2]. Representative works include hidden Markov model [3], elastic bunch graph matching algorithm [4]... • There are global approaches which treat the face as a whole object and use all the information included in it. Many methods have been proposed that include the use of Eigenfaces [5], discrete cosine transform, and Gabor Wavelets [6]... These methods suffer from the size of the feature vector provided to the classifier. For this reason, many linear and nonlinear methods for vector size reduction are applied (PCA, LDA, ICA, ...).
• Hybrid approaches: The principle of these approaches is to imitate the human visual system, which uses both local and global features to recognize persons. The combination of these two methods has only one interest: to take advantage of the combined benefits of both approaches [7,8].
Despite the number of researchers and the proposed methods, several factors can significantly affect face recognition performances, such as the pose, the presence/absence of structural components, facial expressions, occlusion, and illumination variations.
In order to encounter these factors and ensure a high recognition rate and a fast recognition time, we have used, in this article, the fractal representation which exploits the inter-image resemblance [9]. There are few articles that are related to this topic [face recognition using Iterated Function System (IFS) theory] [10][11][12][13][14]. A description of some of these studies and their differences from the proposed method can be found in Section 6.
The proposed system contains the following steps: • Normalization of the original image.
• Feature extraction using fractal encoding of the normalized image and genetic algorithm.
• Application of the bi-dimensional principal component analysis (2DPCA) technique on the fractal code to reduce the feature vector dimension.
• Classification using Multi layer perceptron.
The idea proposed in this article has two major advantages compared with the other approaches: • Reduced size of the fractal code represents the feature vector. Since it has a reduced dimension, the recognition can be ensured with satisfactory time. We have proposed a new fractal algorithm based on genetic algorithm to ensure a low time for feature extraction step.
• High fidelity compared with the original image. The fractal code represents discriminant features of the original image. These features are invariant overlooked lighting, rotation, and translation of the face and scaling, because the IFS theory takes into account these variations.
We proposed to apply a 2DPCA to represent face by a few fractal features having a high discriminatory power.
This article is organized as follows: Basic notions concerning IFS, fractal coding theory and the new fractal algorithm based on genetic algorithm are provided in Section 2. Fractal features are presented in section 3. The most discriminating fractal parameters extracted using 2DPCA are described in Section 4. Section 5 provides face recognition system based on neural networks, the experimental results and Comparison between the two types of features obtained using IFS and PCA-IFS, respectively. A comparison with other approaches is also done in section 6. Conclusion and future works are presented in Section 7.

A. IFS theory
The IFS theory is proposed by Barnsley, who suggested that, instead of storing all the pixels of the still image, we can keep only a collection of global contracting transformations such as rotation and contrast scaling [15].
Image fractal encoding is well known in the literature. It has been widely used for image compression [9,16]. In this article, we have used it for classification purpose.
Firstly, the coding involves the partitioning of the image into ranges R i , which do not intersect and can have fixed size or not (quadtree partitioning), and domain D i which can intersect. Secondly, we have searched the best range/ domain matching by applying a transformation W i to each domain D i (see Figure 1). This is possible because fractal coding is based on the self-similarity of the face, which means that regions can be the transformed versions of some others like shown in Figure 2.
Therefore, to code an image, we need to determine a set of R i , D i , and W i . To achieve an excellent coding phase, we should make a good choice of transformation W i between both R i and D i . Then, we have to find the perfect adjustment of the contrast S i and the lighting O i for each W i using the method of least square [9].

B. The proposed algorithm
The major problem of standard fractal coding is time consumption compared with other methods of image coding. The time is essentially spent on the search of the similar domain block. We present in this article, a new genetic algorithm for image coding, that speeds up this method. In the next, we have detailed our algorithm: the representation of the fitness function, the Genetic operators and some other improvements to the simple genetic algorithms.
There are many algorithms of optimization used for different domains. We have chosen genetic algorithm [17][18][19] to accelerate our fractal image coding algorithm. We have given details of genetic characteristics in the following section.

1) Chromosome attributes
According to the regions parameter coding, a chromosome is constituted by N genes, where N is the number of regions not yet coded.
The gene is composed of three parameters (X Dom , Y Dom ), that represent the domain block coordinates and the rotation W i . These three parameters are integers.

2) Genetic operators
The crossover and mutation operators ensure the production of offspring. These genetic operators must be With these basic components, a genetic algorithm works as follows: The first procedure is to generate the first population represented with string codification (chromosome) that represents possible solution to the problem. Each individual is evaluated, and according to its fitness, an associated probability to be selected for reproduction is assigned.
• The crossover operator combines two individuals (the parents) of the current generation whose chromosomes have not given selected solution to produce two offspring individuals. According to our chromosome specification, a new scheme of the crossover operator is proposed. The offspring coordinates and the isometric flip are selected randomly from the parents as presented in Figure 4.

3) Fitness measure
The fitness function assigns to each individual in the population a numeric value, that determines its quality as a potential solution. The fitness denotes the individual's ability to survive and to produce offspring.
In our case, the fitness is the number of regions that can be coded with root mean square error (RMSE)less than a fixed value. The RMSE is the distance between the region and the domain block is determined by its coordinates (X Dom , X Dom ) and transformed with corresponding contrast S and the lighting O.   The RMSE parameter is given in the following: || is the two norm function, D i is domain elements, R j denotes the range elements, and values of contrast S and lighting O are obtained when minimizing the RMSE criterion (they are the two arguments that minimize the RMSE).

4) Genetic coding algorithm
Genetic algorithms have been used previously to find solutions to the minimization problems related to the fractal inverse problem [18]. Here, we describe the Genetic Algorithm that we have used to speed up the coding algorithm. This algorithm is used for all decomposition schemes. In spite of the range block size and position, the domain block is always double the size of the range one.

III. Fractal features extraction
After fractal coding, where each domain is compared with all regions of the image, we obtain a set of transformations which can approximate the face image. Each transformation is represented by parameters of contrast S i , brightness O i , spatial coordinates of Range/Domain, and rotation W i (seven parameters). The size of the obtained feature matrix is equal to 7× the number of transformations necessary to code all regions. So reducing the size of the information is necessary for minimizing the recognition time. An immediate reduction of the feature vector consists of replacing the coordinates of the regions and domains by two normalized distances: • x: the distance between the Domain D i and the region R i according to the abscissas, • y: the distance between the Domain D i and the region R i as the ordinates.
The size of the new matrix is then equal to 5× the number of transformations.
Despite all the reductions of the fractal vector, it remains quite large. Thus, we proposed to use a twodimensional PCA to extract the most discriminating features.

IV. The discriminating parameters of fractal features
The 2DPCA is a method of data analysis, based on finding a new reference on which we represent the information while keeping only discriminating data [20]. As opposed to conventional PCA, 2DPCA is based on matrices rather than vectors. Consequently, the covariance matrix can be constructed directly using original matrix of features. So, when using 2DPCA, it is easier to evaluate the covariance matrix, and less time is required to determine the corresponding eigenvectors.
The idea consists of projecting each feature matrix X (n × m) through a linear transformation. The first step is to calculate the covariance matrix G t of fractal features which is obtained from the images of the training database as follows: where M is the number of images in the database, X j represents the fractal matrix obtained from the image number j of the training database, andX is the average of all fractal matrices associated to the images from the training database.
The next step is to choose d eigenvalues associated with eigenvectors obtained from the previously calculated covariance matrix. These eigenvalues determine the new reference that minimizes the criterion J (R) defined by: The third step is to extract the main features of X as follows: is the fractal feature matrix produced after applying 2DPCA.
To project the matrix in the new base, we have selected the eigenvectors associated with the largest eigenvalues. The biggest shortcoming of 2DPCA is the choice of the number of retained eigenvalues. To solve this problem, researchers have adopted different solutions, either heuristically [21] or graphically according to the shape of eigenvalues [22]. In this article, we used a graphically method to select the most important eigenvectors.

A. Overview of the used face databases
To highlight the performances of the proposed system, we have carried out the first experiment on the Yale database [23], with the aim of pinpointing the behavior of our approach under changing face expressions and poses. This base contains 165 images of 15 individuals. In this experiment, 30% of all image samples per class are chosen randomly and are used for training, and the remaining images for test. The proposed approach has also been applied on the ORL database [24], which contains 10 different images of each of the 40 distinct individuals. For this database also, 30% image samples per class are chosen randomly, and are used for training and the remaining images for test. In the ORL database, images are taken at different lighting conditions, facial expressions, and orientations which allows testing the behavior of our approach under these changes.

B. The classification system
The face recognition was ensured by a multilayer perceptron architecture. The training of weights is assured by the algorithm of retro-propagation. This architecture is the most used one because it can reduce miss-classification among the neighborhood classes.

C. Face recognition using fractal features
In order to have fractal feature vectors with the same length, the size of the face must be normalized (32 × 32). The normalized image is coded by 64 transformations using fractal code. Consequently, we obtained 320 fractal features as each transformation is coded on 5 parameters, as already explained in Section 3. Table 1 shows the performance of our system using fractal features for the two databases. Each of the recognition rate has been tested on five random combination of the face samples. According to these observations, fractal features gave very good results.

D. Face recognition using 2DPCA-IFS features
Given the large number of the used parameters (320 parameters), our idea was to apply a bi-dimensional PCA on the fractal features for the two already-mentioned databases. After a 2DPCA was applied on all features matrices, we were able to achieve the following results found in Table 2.
We study here the relation between the number of transformation used as features and the recognition rate. We have found that recognition rate grows until transformation number reaches a value about 5, and then it remains almost constant as transformation number continues to grow. Consequently, the compromise recognition rate/transformation (the optimal situation is given with the minimum number of transformation and a satisfactory recognition rate) number is solved for five eigenvectors for the two databases. This can be explained according to the variation of the eigenvalues. We can keep until the fifth eigenvalue and the remaining eigenvalues can be neglected as shown in Figure 6. Here, we preserve all the five parameters for each transformation, for the two databases.
From the previous analysis, we can notice that the best choice to keep is five transformations where each one is coded by five parameters to ensure a good recognition phase.

E. Comparison between IFS and 2DPCA-IFS features
In Table 3, we present the recognition rate obtained, when using all fractal features, and those reduced by the 2DPCA with the optimal configuration (five transformations where each one is coded by five parameters). While comparing the two methods, we can deduce those recognition rates that were found using the two databases which did not decrease significantly. The major advantage of 2DPCA-IFS method is that the number of parameters decreases from 320 parameters with IFS to only 25 parameters with 2DPCA-IFS, which can reduce the recognition time while keeping a very satisfactory recognition rate.

VI. Comparison with other approaches
Here we compare our results with earlier results published in [10][11][12]. ORL database is used for comparison. The FND method [10] consists of three steps: • A standard fractal coding giving a code for each image in the database.
• Each image I is decoded with each code in the database to generate the output image called the attractor.
• A classification step for each test image I using the minimization of FND distance d FN (fractal neighbor distance) defined as the distance between the test image and the attractor image: where f j is the jth fractal code in the database, f j (I ) is the decoded image using the code f j .
The LR-SNN-T method [11] is based on an equalization of the original image and a normalization of its dimension using a bicubic interpolation. The feature vector is represented by the whole image after processing. The classification is achieved by a multilayer perceptron.
Finally, the X method [12] consists on extracting parts of face, containing the most discriminating information like eyes, nose... Then applying a standard fractal coding on each detected part, the classification is also ensured by a multilayer perceptron.
Although both approaches, the FND and X, use the IFS coding, they are completely different from our approach. This difference is summarized in Table 4.
The performance comparison between all approaches is shown in Table 5 and Figure 7.
We conclude that • Fractal features are much more powerful than others and are good means to characterize faces.
• Five eigenvalues are sufficient to code faces. A little improvement is observed when more than five eigenvalues are used.
• The robustness of our 2DPCA-IFS approach is that it gives the best time recognition, and thanks to the use of genetic algorithm and 2DPCA technique,    which keeps a high recognition rate proving its applicability for real time system.

VII. Conclusion
A hybrid approach is introduced in which, through the 2DPCA, the most discriminating genetic fractal features are extracted and used as the input of a neural network. The performance of our method is both due to the fidelity of fractal coding for representing images, the genetic algorithm to speed up the features extraction step, and the 2DPCA which highlights all discriminating features.
Compared with other approaches, the proposed recognition method has achieved high recognition rate and low recognition time for the two databases.