AdaFace is a state-of-the-art face recognition model, introduced at CVPR (Conference on Computer Vision and Pattern Recognition) in 2022. Its main contribution is introducing image quality as a new variable in an adaptive margin loss function: the authors argue that image quality should be taken into account when computing the loss of each sample during training.
Motivated by the presence of unidentifiable face images in both public and private datasets, the goal is to design a loss function that assigns different importance to samples based on their difficulty relative to image quality: challenging samples are emphasized for high-quality images, while easier samples are prioritized for low-quality ones.
The reason for weighting samples by image quality is that emphasizing difficult samples alone would invariably give excessive weight to unidentifiable images: such images always fall into the difficult group, since the network can do no better than random guessing on them.
Consequently, by prioritizing samples based on image quality, the network can learn from genuinely challenging pairs instead of being confused by impossible, low-quality samples.
Having recognized the importance of distinguishing images by quality, the next obstacle is finding an accurate and computationally efficient way to estimate the quality of a given image.
Image quality encompasses a range of attributes such as brightness, contrast, and sharpness. While algorithms like SER-FIQ and BRISQUE exist for quality estimation, their computational demands make them impractical to run during training. As a solution, the authors of AdaFace use the feature norm as a substitute measure for image quality. This choice is supported by the observation that, in models trained with a margin-based softmax loss, the feature norm correlates with image quality: in the paper, the correlation between the feature norm and the IQ score reaches 0.5235 (on a scale of -1 to 1) at the final epoch, which the authors take as sufficient evidence for employing the feature norm as a proxy for image quality.
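As a rough sketch of how a raw feature norm can be turned into a bounded quality proxy, the paper standardizes the norm with batch statistics and clips it to [-1, 1]. The function below follows that idea; the argument names (`mu_z`, `sigma_z` for the running mean/std of feature norms, `h` for the concentration hyperparameter) are my own labels for the quantities described in the paper:

```python
import numpy as np

def normalized_feature_norm(z, mu_z, sigma_z, h=0.33, eps=1e-3):
    """Map raw feature norms to a quality proxy in [-1, 1].

    z       : (batch, dim) array of feature embeddings
    mu_z    : running mean of feature norms (scalar)
    sigma_z : running std of feature norms (scalar)
    h       : concentration hyperparameter (0.33 in the paper)
    """
    norms = np.linalg.norm(z, axis=1)              # ||z_i|| per sample
    z_hat = (norms - mu_z) / (sigma_z / h + eps)   # standardize, spread by h
    return np.clip(z_hat, -1.0, 1.0)               # bound the proxy
```

In practice `mu_z` and `sigma_z` would be exponential moving averages maintained across training batches, so that the proxy is stable even for small batch sizes.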
Finally, the researchers demonstrate that when performing backpropagation, an adaptive angular margin has the ability to adjust the significance of a particular sample in relation to the rest. Essentially, the angular margin introduces an extra component in the gradient that amplifies the signal based on the difficulty level associated with the sample.
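The adaptive margin itself can be sketched as follows, using the formulation reported in the paper: an angular margin g_angle = -m * z_hat and an additive margin g_add = m * z_hat + m applied to the ground-truth logit, where z_hat is the normalized feature norm, m the base margin, and s the scale. The function and parameter names here are illustrative, not the authors' code:

```python
import numpy as np

def adaface_logit(cos_theta, z_hat, m=0.4, s=64.0):
    """Apply the AdaFace adaptive margin to the target-class logit.

    cos_theta : cosine similarity to the ground-truth class center
    z_hat     : normalized feature norm in [-1, 1] (quality proxy)
    m, s      : base margin and scale hyperparameters
    """
    g_angle = -m * z_hat                 # angular margin, flips sign with quality
    g_add = m * z_hat + m                # additive margin, grows with quality
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    return s * (np.cos(theta + g_angle) - g_add)
```

For z_hat = -1 (low quality) this reduces to s * cos(theta + m), an ArcFace-style positive angular margin; for z_hat = +1 (high quality) it becomes s * (cos(theta - m) - 2m). The sign flip of the angular margin is what rescales the gradient, shifting emphasis between easy and hard samples depending on image quality.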
In order to compare against state-of-the-art techniques, the authors assessed the performance of ResNet100 trained with AdaFace loss on nine datasets, as shown in the tables below.
AdaFace achieves results comparable to competing methods on high-quality datasets, which the authors attribute to the emphasis on challenging samples during training. On mixed-quality datasets, i.e. IJB-B and IJB-C, AdaFace exhibits an average error reduction of 10% relative to the second-best performing method, highlighting the effectiveness of the feature norm as a measure of image quality.
Finally, on low-quality datasets the gap between AdaFace and the baseline methods widens further, with performance improvements of more than 2-3%.
AdaFace has emerged as a leading state-of-the-art algorithm in face recognition, primarily due to its innovative use of image quality to control the gradient scale assigned to each sample during training. The algorithm performs strongly across dataset types, but truly excels on mixed-quality and mixed-domain datasets.