Authors: Dickson, Anthony; Knott, Alistair; Zollmann, Stefanie
Editors: Lee, Sung-Hee; Zollmann, Stefanie; Okabe, Makoto; Wünsche, Burkhard
Date: 2021-10-14
Year: 2021
ISBN: 978-3-03868-162-5
DOI: https://doi.org/10.2312/pg.20211394
Handle: https://diglib.eg.org:443/handle/10.2312/pg20211394

Abstract: The capture and creation of 3D content from a device equipped with just a single RGB camera has a wide range of applications, from 3D photographs and panoramas to 3D video. Many of these methods rely on depth estimation models, mainly neural networks, to provide the necessary 3D data. However, the metrics used to evaluate these models can be difficult to interpret and to relate to the quality of the 3D/VR content derived from them. In this work, we explore the relationship between the widely used depth estimation metrics, image similarity metrics applied to synthesised novel viewpoints, and user perception of quality and similarity on these novel viewpoints. Our results indicate that the standard metrics are indeed a good indicator of 3D quality, and that they correlate with human judgements and with other metrics that are designed to follow human judgements.

Subjects: General and reference; Evaluation; Computing methodologies; Computer graphics; Neural networks; Computer vision
Title: User-centred Depth Estimation Benchmarking for VR Content Creation from Single Images
Pages: 71-72