Beyond FID: Human Perceptual Judgments Reveal Systematic Blind Spots in GAN Face Evaluation
Date
2026
Publisher
The Eurographics Association
Abstract
Generative Adversarial Networks (GANs) can synthesize highly realistic facial images from random noise vectors. The Fréchet Inception Distance (FID) is widely used as a standard metric to automatically evaluate the quality of GAN-generated images. However, it remains unclear to what extent this statistical measure reflects human perceptual judgments, which ultimately define image realism in practical applications. To address this, we conducted a psychophysical study in which participants (n = 20) performed a two-alternative forced-choice task, classifying real photographs and GAN-generated images as real or fake. We show that while FID provides a reliable global ordering of image quality, it systematically fails for localized semantic artifacts (e.g., eyewear and skin texture) that disproportionately affect human realness judgments. This demonstrates that FID and human perception are not merely noisy versions of the same signal. Rather, they diverge systematically on these artifacts.
Description
@inproceedings{10.2312:egs.20261007,
booktitle = {Eurographics 2026 - Short Papers},
editor = {Musialski, Przemyslaw and Lim, Isaak},
title = {{Beyond FID: Human Perceptual Judgments Reveal Systematic Blind Spots in GAN Face Evaluation}},
author = {Nierula, Birgit and Melnik, Anna and Stephani, Tilman and Bosse, Sebastian and Barthel, Florian and Brama, Aileen and Hilsmann, Anna and Eisert, Peter and Nikulin, Vadim V. and Gaebler, Michael and Klotzsche, Felix and Chen, Yonghao},
year = {2026},
publisher = {The Eurographics Association},
ISSN = {2309-5059},
ISBN = {978-3-03868-299-8},
DOI = {10.2312/egs.20261007}
}
