Beyond FID: Human Perceptual Judgments Reveal Systematic Blind Spots in GAN Face Evaluation

Nierula, Birgit; Melnik, Anna; Barthel, Florian; Brama, Aileen; Hilsmann, Anna; Eisert, Peter; Nikulin, Vadim V.; Gaebler, Michael; Klotzsche, Felix; Chen, Yonghao; Stephani, Tilman; Bosse, Sebastian

dc.contributor.author	Nierula, Birgit
dc.contributor.author	Melnik, Anna
dc.contributor.author	Barthel, Florian
dc.contributor.author	Brama, Aileen
dc.contributor.author	Hilsmann, Anna
dc.contributor.author	Eisert, Peter
dc.contributor.author	Nikulin, Vadim V.
dc.contributor.author	Gaebler, Michael
dc.contributor.author	Klotzsche, Felix
dc.contributor.author	Chen, Yonghao
dc.contributor.author	Stephani, Tilman
dc.contributor.author	Bosse, Sebastian
dc.contributor.editor	Musialski, Przemyslaw
dc.contributor.editor	Lim, Isaak
dc.date.accessioned	2026-04-20T08:01:34Z
dc.date.available	2026-04-20T08:01:34Z
dc.date.issued	2026
dc.description.abstract	Generative Adversarial Networks (GANs) can synthesize highly realistic facial images from random noise vectors. The Fréchet Inception Distance (FID) is widely used as a standard metric for automatically evaluating the quality of GAN-generated images. However, it remains unclear to what extent this statistical measure reflects human perceptual judgments, which ultimately define image realism in practical applications. To address this, we conducted a psychophysical study in which participants (n = 20) performed a two-alternative forced-choice task, classifying actual photographs and GAN-generated images as real or fake. We show that while FID provides a reliable global ordering of image quality, it systematically fails for localized semantic artifacts (e.g., eyewear and skin texture) that disproportionately affect human realism judgments. This demonstrates that FID and human perception are not merely noisy versions of the same signal: FID has systematic blind spots for exactly those artifacts.
dc.description.sectionheaders	Faces, Characters & Human Modeling
dc.description.seriesinformation	Eurographics 2026 - Short Papers
dc.identifier.doi	10.2312/egs.20261007
dc.identifier.isbn	978-3-03868-299-8
dc.identifier.issn	2309-5059
dc.identifier.uri	https://diglib.eg.org/handle/10.2312/egs20261007
dc.publisher	The Eurographics Association
dc.rights	CC-BY-4.0
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Models of computation
dc.subject	Interactive computation
dc.subject	Computer Graphics
dc.title	Beyond FID: Human Perceptual Judgments Reveal Systematic Blind Spots in GAN Face Evaluation

Files

Original bundle

Name: egs20261007.pdf
Size: 1.43 MB
Format: Adobe Portable Document Format

Collections

EG 2026 - Short Papers