Department of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
Received: 26-Aug-2022, Manuscript No. GRCS-22-73053; Editor assigned: 29-Aug-2022, Pre QC No. GRCS-22-73053 (PQ); Reviewed: 12-Sep-2022, QC No. GRCS-22-73053; Revised: 19-Sep-2022, Manuscript No. GRCS-22-73053 (R); Published: 26-Sep-2022, DOI: 10.4172/2229-371X.10.3.001.
Visit for more related articles at Journal of Global Research in Computer Sciences
Face recognition systems have become widely used by organizations and individuals for purposes such as secure authentication and criminal investigation. The COVID-19 pandemic affected these systems because it obliged people to wear masks most of the time, which reduced recognition accuracy. In response, different masked face recognition techniques based on deep learning have appeared. This paper introduces the main concepts of deep learning and face recognition to give the reader a comprehensive overview, presents the most recent research in this field, and proposes integrating masked face recognition with deep learning based on the Convolutional Neural Network (CNN) to obtain precise results. In conclusion, many researchers have confirmed that integrating these two fields gives satisfactory outcomes. In future work, more deep learning-based masked face recognition methods will be explored.
Deep learning; Machine learning; Masked face recognition; Face recognition; Convolutional Neural Network (CNN)
Face recognition technology identifies and verifies the identity of a selected face in a photo, in a video, or in real time. It serves purposes such as authentication, face payment, criminal identification, security, and unlocking a phone. Applying deep learning, which is a part of machine learning, to this technology has become a very active research area. Over the past three years the world has lived through the COVID-19 pandemic, which obliged everyone to wear masks to limit the spread of the virus. Many scientists believe that the virus will evolve and that new strains may appear, so mask wearing may continue for a long time. Therefore, many computer scientists are currently extending face recognition technology to Masked Face Recognition. Deep learning based on a Convolutional Neural Network (CNN) is very useful for recognizing faces, but it is limited when the face wears a mask because most of the face is covered. The vast majority of people currently wear masks most of the time, which has consequences such as an increase in the crime rate. Researchers have explored different methods to identify masked faces. Some focus on the eye, forehead, or eyebrow areas. Others remove the mask from the face so that masked and unmasked faces can be compared, or fine-tune existing face recognition applications to identify masked faces.
This paper first introduces the main concept of deep learning and its major types in section II, and then explains face recognition technology with its main approaches and tasks in section III. Section IV describes the integration of face recognition and deep learning that assists in identifying masked faces, and section V discusses and analyzes current research. Finally, section VI offers the conclusion and future work.
Artificial Intelligence (AI), Machine Learning (ML), Deep Learning (DL), and neural networks are closely related terms. Deep learning is part of machine learning, and both fall under artificial intelligence; the differences between them are explained later in the discussion and analysis section. As for the relation between deep learning and neural networks, the term deep learning applies when the neural network contains more than three layers, that is, more than one hidden layer in addition to the input and output layers, and is trained with a set of learning algorithms. In short, deep learning is based on a neural network with a depth of hidden layers, which is sometimes called a deep neural network. Figure 1 depicts the four popular types of learning techniques, which can be summarized as follows:
Supervised learning, also called discriminative learning, trains a model on a labelled dataset that contains input and output values. It finds the mapping function from the input to the output value, so that predictions rely on analyzing previously learned data. It is divided into two categories: classification, which has a discrete output such as a predicted category, and regression, which has a continuous output such as a predicted number. An example of supervised learning is the Convolutional Neural Network (CNN), the most popular model for images because it extracts essential patterns from images without human intervention. It has multiple convolution, pooling, and fully connected layers to gain high accuracy. Another example is the Recurrent Neural Network (RNN), which relies on internal memory to remember important information about the received input in order to predict precisely what will happen next.
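To make the convolution and pooling layers of a CNN more concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a trainable network: the toy image and the edge-detecting kernel are hand-picked for the example, whereas a real CNN learns its kernel values from labelled data.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, the core of a convolution layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Element-wise ReLU activation: negative responses become zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; windows that do not fit are dropped."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


# Toy 5x5 image (assumption): dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked kernel (assumption) that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

features = max_pool(relu(conv2d(image, kernel)))
print(features)
```

The pooled feature map keeps a strong response only where the vertical edge lies, which is the sense in which convolution layers extract patterns from an image without hand-written rules.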
Unsupervised learning, also called generative learning, trains a model on unstructured, unlabeled data that has only input values; the model has to define its own learning to obtain output values without the help of any previously learned data. It is divided into three categories: clustering, which groups the data based on differences and similarities; association, which finds collections of elements in the dataset that occur together; and dimensionality reduction, which reduces the dimensions of an existing dataset to prepare it for processing with a supervised algorithm. An example of unsupervised learning is the Generative Adversarial Network (GAN), which generates new samples from the original dataset through self-learning of patterns or regularities in the input data.
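The clustering category can be illustrated with k-means, a standard clustering algorithm that is used here only as an example and is not named in the text above. The sketch below groups unlabeled 2D points by similarity; the sample points, the number of clusters, and the iteration count are assumptions chosen for the illustration.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: alternate between assigning points and moving centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    clusters = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            tuple(sum(xs) / len(c) for xs in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Unlabeled toy data (assumption): two visibly separate groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(points, k=2)
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

No labels are supplied at any point; the grouping emerges only from the distances between the inputs.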
Semi supervised learning or hybrid
It combines unlabeled and labeled data with unsupervised and supervised learning approaches: unsupervised learning groups similar data into clusters, and the model is then trained on these clusters to label the unlabeled data. An example of semi-supervised learning is CNN+GAN, which integrates an unsupervised model followed by a supervised model.
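One simple way to picture how a few labeled samples can label the remaining data is pseudo-labeling by nearest class centroid. This is a swapped-in toy technique for illustration, not the CNN+GAN combination mentioned above; the feature vectors and the class names `masked` and `unmasked` are hypothetical.

```python
def centroid(points):
    """Mean point of a list of equally sized feature vectors."""
    return tuple(sum(xs) / len(points) for xs in zip(*points))


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pseudo_label(labeled, unlabeled):
    """Give each unlabeled point the label of the nearest labeled-class centroid."""
    centroids = {label: centroid(pts) for label, pts in labeled.items()}
    return [
        (p, min(centroids, key=lambda label: dist2(p, centroids[label])))
        for p in unlabeled
    ]


# Hypothetical 2D features: a small labeled set and a larger unlabeled set.
labeled = {
    "masked": [(0.0, 0.0), (1.0, 1.0)],
    "unmasked": [(10.0, 10.0), (9.0, 9.0)],
}
unlabeled = [(0.5, 1.0), (9.0, 10.0), (2.0, 0.0)]

for point, label in pseudo_label(labeled, unlabeled):
    print(point, label)
```

In practice the pseudo-labeled points would then be added to the training set of a supervised model, ideally only when the assignment is confident.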
Reinforcement learning consists of an agent, environment, state, action, reward, and policy. It can be defined as training an agent to perform a specific function by interacting with a dynamic environment that describes its state and provides a list of permissible actions. The agent behaves by trial and error under its current policy and is rewarded so that its performance is maximized. Games are an example of reinforcement learning.
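The roles of agent, environment, state, action, reward, and policy can be seen in a very small game. The sketch below uses tabular Q-learning, a standard reinforcement learning algorithm chosen here for illustration; the five-cell corridor, the reward of 1 at the goal, and all hyperparameters are assumptions.

```python
import random

N_STATES = 5        # corridor cells 0..4; cell 4 is the goal (assumption)
ACTIONS = (-1, +1)  # action index 0 moves left, index 1 moves right


def step(state, action_index):
    """Environment: apply the action and return (next_state, reward)."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2,
          max_steps=200, seed=0):
    """Agent: learn action values by trial and error (tabular Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Policy: epsilon-greedy with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(2) if q[state][a] == best])
            next_state, reward = step(state, action)
            # Move the estimate toward reward plus discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if state == N_STATES - 1:
                break
    return q


q_table = train()
# Greedy policy for every non-goal cell.
policy = ["right" if right > left else "left" for left, right in q_table[:-1]]
print(policy)
```

The agent is never told that moving right is correct; it discovers this because actions that lead toward the goal accumulate higher estimated reward.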
Face recognition is a technique used to identify or verify faces found in a video frame or image against an existing database of faces. Typically, it combines pattern recognition and computer vision, which analyze images and data to extract facial features that assist in the face matching operation. The face is not the ideal choice compared with other biometric measurements such as the fingerprint or iris for achieving precise results, because of limitations such as disguise or lighting, but it is still the most familiar biometric feature and has been preferred and used for years. The following part illustrates the two main approaches, as shown in Figure 2, and the face recognition tasks:
Face recognition approaches
Classical and modern approaches are the two main approaches in face recognition technology, and they are explained briefly in the following:
The classical approach is 2D face recognition that uses video or images from a camera, surveillance system, CCTV, or other source. It works by detecting the face in a video or image, segmenting the detected region, aligning it to a determined structure, and correcting it for lighting changes. After that, it extracts features from the aligned and corrected image and obtains the final identity through a suitable classification that relies on the calculated features. Based on the classification and extraction types, it is divided into three main approaches, described briefly below:
The holistic approach, also called subspace-based techniques, assumes that any collection of facial images contains redundancies that can be removed through tensor decomposition. This approach provides a set of basis vectors that span a low-dimensional space while preserving the original groups of images. It has two types, linear and non-linear techniques, both of which rely on subspace representation.
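A linear subspace technique can be sketched with principal component analysis (PCA), the idea behind the classical eigenfaces method. PCA is used here as one representative linear technique and is an assumption about which decomposition to illustrate. The "images" below are hypothetical two-pixel vectors so that the result is easy to check by hand; the leading basis vector is found with power iteration.

```python
import math
import random


def top_principal_component(samples, iters=100, seed=0):
    """Return the mean and the leading eigenvector of the sample covariance.

    Assumes the samples are not all identical (non-zero covariance).
    """
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    centered = [[s[j] - mean[j] for j in range(d)] for s in samples]
    cov = [
        [sum(row[a] * row[b] for row in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeated multiplication converges to the top eigenvector.
    rng = random.Random(seed)
    v = [rng.random() for _ in range(d)]
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def project(sample, mean, component):
    """Coordinate of a sample along the basis vector (1D subspace)."""
    return sum((x - m) * c for x, m, c in zip(sample, mean, component))


# Hypothetical two-pixel "face images": the second pixel is always twice
# the first, so one basis vector captures all of the variation.
faces = [[1, 2], [2, 4], [3, 6], [4, 8]]
mean, component = top_principal_component(faces)
codes = [project(f, mean, component) for f in faces]
print(component, codes)
```

Each two-value image is represented by a single coordinate without losing information, which is the redundancy removal that subspace methods exploit on real face images at a much larger scale.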
The local approach deals with a few facial traits and is more focused on facial expressions, pose, and occlusions. In essence, its basic goal is to find landmark features. It has two types: the key point-based technique, which detects landmarks placed on the image, and the local appearance-based technique, which extracts local features from small regions of a divided image.
The hybrid approach combines local and holistic features to obtain the benefits of both types and to raise face recognition performance.
Several new and modern approaches and theories have been defined recently; this section focuses specifically on the one relevant to this paper, which is the deep learning approach.
Deep learning approach:
Using deep learning in face recognition enhances accuracy, raises performance, and achieves great results owing to its ability to classify a high number of unlabeled faces. The deep learning approach has surpassed all types of classical approaches by using CNNs trained in a supervised way. Many different methods use CNNs to recognize faces, and they can be categorized according to the most famous architecture types, such as AlexNet, VGGNet, GoogleNet, LeNet, and ResNet.
Face recognition tasks:
Face matching, face similarity, and face transformation are the basic tasks used to recognize faces. In a nutshell:
• Face matching: The process of comparing two or more faces, which may be unfamiliar to the observer, to make sure they belong to the same person. This task includes the face tracking, face clustering, face verification, and face identification processes. It is an important task in face recognition, with beneficial applications in places that need supervision, such as airports and banks [8,9].
• Face similarity: The matched face output by the previous process is used in this stage, which measures the similarity between the original face and the matched face based on various features such as the distance between the eyes, skin texture, skin colour, pose, and expression.
• Face transformation: The process of transforming the face into a digital form that is conducive to performing specific operations, which help in analysing the signals of the face features to extract useful information.
The following section is the most essential part of this paper, as it demonstrates the significance of deep learning and its usage in recognizing masked faces.
Integrating face recognition and deep learning techniques has many benefits, such as improving security in commercial operations or public spaces, so most researchers direct their attention to this area. This section describes the integration of these two fields, which helps to obtain successful and accurate outcomes in detecting and recognizing masked faces by using feature extraction and CNNs. Different datasets can be used to train the model, such as Masked Faces in Real World for Face Recognition (MFR2), which contains 279 images of 53 identities, and the Masked Face Detection Dataset (MFDD), which contains 24771 images. As seen in Figure 3, three main processes make up masked face recognition based on deep learning:
Masked face detection
Multiple deep learning techniques can be used in this stage to detect faces wearing masks, such as R-CNN or Faster R-CNN, the latter based on the observation that it obtains precise outcomes in less time, which raises performance.
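Detectors in the R-CNN family produce many candidate boxes with confidence scores, and overlapping candidates for the same face are usually merged in a post-processing step called non-maximum suppression (NMS), which relies on intersection over union (IoU). The sketch below shows only that step; the boxes, the scores, and the 0.5 threshold are assumptions, and no detector is actually run.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Keep the best-scoring box of every group of overlapping boxes."""
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        # Keep a box only if it does not overlap a better box already kept.
        if all(iou(box, kept_box) < iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Hypothetical detector output: two overlapping candidates for one face
# and one candidate for a second face.
detections = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 11, 11), 0.8),
    ((20, 20, 30, 30), 0.7),
]
faces = non_max_suppression(detections)
print(faces)
```

After suppression one box per face remains, and each remaining box can be passed to the feature extraction stage.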
In the feature extraction process, various methods such as ResNet, AlexNet, and context-aware CNN can be used to extract features such as the eyes with the forehead and the eyebrows from masked faces.
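Before a network extracts features from the eye and forehead area, that area has to be isolated from the detected face. A minimal sketch of such a crop follows, assuming the image is a 2D list of pixel values and that the unmasked region is the top half of the face box; both the hypothetical `keep_fraction` parameter and its value of 0.5 are assumptions, not values taken from the reviewed studies.

```python
def crop_upper_face(image, box, keep_fraction=0.5):
    """Return the top part of a face box, where the eyes and forehead lie.

    image: 2D list of pixel values indexed as image[row][column].
    box:   (x1, y1, x2, y2) with x2 and y2 exclusive.
    """
    x1, y1, x2, y2 = box
    cut = y1 + int((y2 - y1) * keep_fraction)  # last row kept is cut - 1
    return [row[x1:x2] for row in image[y1:cut]]


# Toy 6x6 image (assumption) whose pixel value encodes its position:
# value = 10 * row + column.
image = [[10 * r + c for c in range(6)] for r in range(6)]
face_box = (1, 1, 5, 5)

upper = crop_upper_face(image, face_box)
print(upper)
```

The cropped region would then be resized and fed to a feature extractor such as the networks listed above.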
Masked face recognition
This is the final process in the deep learning-based masked face recognition approach. It recognizes the masked face and can work with Domain Constrained Ranking (DCR), FaceNet, or other methods for identity matching.
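Embedding-based models such as FaceNet map each face to a numeric vector, so that identity matching reduces to comparing vectors. The sketch below illustrates verification (are two faces the same person?) and identification (who in the gallery is this?) with cosine similarity. The three-dimensional embeddings, the gallery names, and the 0.8 threshold are hypothetical; real embeddings come from a trained network and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(embedding_a, embedding_b, threshold=0.8):
    """1:1 matching: do the two embeddings belong to the same person?"""
    return cosine_similarity(embedding_a, embedding_b) >= threshold


def identify(probe, gallery, threshold=0.8):
    """1:N matching: return the best gallery identity, or None if unknown."""
    name, score = max(
        ((n, cosine_similarity(probe, e)) for n, e in gallery.items()),
        key=lambda pair: pair[1],
    )
    return name if score >= threshold else None


# Hypothetical enrolled identities and probe embeddings.
gallery = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
masked_probe = [0.9, 0.1, 0.0]
stranger = [0.0, 0.0, 1.0]

print(identify(masked_probe, gallery))
print(identify(stranger, gallery))
print(verify(masked_probe, gallery["bob"]))
```

The threshold trades false accepts against false rejects, so in practice it is tuned on a validation set rather than fixed in advance.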
Masked Face Recognition (MFR) technology based on deep learning draws on two primary fields. The first is computer vision, which allows computers to derive meaningful information from real-world images by applying specific processing and analysis operations in order to reach decisions or actions. The second is deep learning, which is part of machine learning and concerns machines learning by themselves without human intervention, relying on models such as the Convolutional Neural Network (CNN), which is loosely inspired by the human brain. Both fields fall under Artificial Intelligence (AI), which focuses on making machines perform tasks intelligently like humans and improve themselves by using collected data.
Because people wear masks most of the time as a result of the COVID-19 pandemic, multiple researchers have integrated face recognition and deep learning to detect and recognize masked faces accurately and with high performance. Alzu'bi, et al. reviewed the most recent algorithms, architectures, models, and approaches used in the MFR pipeline, as well as the common evaluation metrics and benchmarking datasets that could be used.
Hariri introduced a new method that removes masks from faces and focuses on the eye and forehead area, extracting its features by applying three pre-trained CNNs: ResNet-50, VGG-16, and AlexNet. He then applied the bag-of-features paradigm and used a multilayer perceptron (MLP) for the classification process. He used the Simulated Masked Face Recognition Dataset (SMFRD) and the Real-World Masked Face Recognition Dataset (RMFRD), which resulted in high recognition performance.
In another study, researchers proposed a new method that combines the Local Binary Pattern (LBP) with deep learning by using a RetinaFace detector that handles different face scales. They extract LBP features of the eye, forehead, and eyebrow area from the masked face and compare them with the RetinaFace features to recognize the masked face. They used the Essex dataset and introduced a new dataset named COMASK20 consisting of 300 subjects, achieving F1-scores of 98% and 87% respectively, which shows the suitability and effectiveness of the proposed method.
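The Local Binary Pattern itself is simple to compute. The sketch below implements the basic 3x3 operator and the histogram that serves as a texture feature; it is a generic illustration and does not reproduce the exact LBP variant or the combination with RetinaFace used in the study above. The toy image, the neighbour ordering, and the bit order are assumptions.

```python
# Neighbour offsets (row, column), clockwise from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """8-bit LBP code of the pixel at (r, c), which must not lie on the border."""
    centre = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        # A neighbour at least as bright as the centre sets its bit.
        if image[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Toy grey-level patch (assumption): bright corners around a mid-grey centre.
patch = [
    [9, 1, 9],
    [1, 5, 1],
    [9, 1, 9],
]
print(lbp_code(patch, 1, 1))
print(sum(lbp_histogram(patch)))
```

Histograms computed over the eye, forehead, and eyebrow region can then be compared between a masked probe face and enrolled faces.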
In a simpler setting, researchers presented how to detect faces in images or videos and indicate whether each face wears a mask. The studies differ only in the type of technique used. One group used machine learning tools such as Keras, Scikit-Learn, TensorFlow, and OpenCV; they tuned the CNN parameters to optimal values and applied the model to a with/without mask dataset from Kaggle.com consisting of 3832 images, which yielded accurate results. The other group used deep learning techniques and a COVID-19 face mask detection dataset consisting of 1376 images, which achieved efficient and accurate results [12,13].
Researchers provided an experimental evaluation of three recent lightweight CNN face recognition models: VarGFaceNet, MobileFaceNet, and ShuffleFaceNet. They studied these models to address masked face recognition efficiently and accurately with two approaches: fine-tuning the models on masked faces and using periocular images. They used the CASIAWebFace dataset and the RMFR-CEN database, together with well-established face benchmarks, to compare the two approaches, and concluded that the fine-tuned models performed better than those using the periocular area.
Finally, researchers proposed a new MFR method that consists of three CNNs: the SSD model, the Hourglass Network, and FaceNet. The proposed method showed effective results.
Table 1 summarizes all the previous works presented in this paper, analyzed according to the following criteria: the model used, the feature extraction area, the dataset, and the accuracy.
| Model used | Feature extraction area | Dataset | Accuracy |
|---|---|---|---|
| ResNet-50, VGG-16, and AlexNet | The eye with the forehead area | RMFRD dataset, SMFRD dataset | 91.3% |
| Deep ensemble model and Local Binary Patterns (LBP) | The eye with the forehead and eyebrow area | Essex dataset, COMASK20 dataset | 98% |
| CNN | Face mask or unmask | Kaggle.com | - |
| CNN | Face mask or unmask | COVID-19 face mask detection dataset | 98% |
| VarGFaceNet, MobileFaceNet, and ShuffleFaceNet | Periocular | CASIAWebFace dataset | 94% |
| SSD, Hourglass Network, and FaceNet | Eyebrow area | - | - |
Table 1. Summary of the reviewed masked face recognition studies by model used, feature extraction area, dataset, and accuracy.
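The accuracy and F1-score figures quoted for these studies are standard classification metrics. As a reminder of how they are defined, the sketch below computes them for a binary mask or no-mask classifier. The label vectors are made up for illustration and are not data from any of the reviewed studies, whose exact evaluation protocols may differ.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1-score for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: 1 = wearing a mask, 0 = not wearing a mask.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Accuracy counts all correct decisions, whereas the F1-score balances precision and recall and is therefore more informative when the classes are imbalanced.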
Although the number of people infected with Coronavirus has declined and governments now allow masks to be removed, proposing new methods to identify masked faces remains important in order to be fully prepared should such crises recur. Accordingly, this paper presented the basic concepts, approaches, and tasks of deep learning and face recognition, analysed and discussed current research in this domain, and presented the main processes that contribute to building a deep learning-based masked face recognition model. New approaches will be covered and discussed in future work.
I would like to thank my supervisor, Dr. Mutasem Jarrah, for his great efforts and for his advice and recommendations in writing this paper.