Title: Multi-Modal Perception for Selective Rendering
Authors: Harvey, Carlo; Debattista, Kurt; Bashford-Rogers, Thomas; Chalmers, Alan
Editors: Chen, Min and Zhang, Hao (Richard)
Date: 2017-03-13
Year: 2017
ISSN: 1467-8659
DOI: https://doi.org/10.1111/cgf.12793
URL: https://diglib.eg.org:443/handle/10.1111/cgf12793

Abstract: A major challenge in generating high-fidelity virtual environments (VEs) is to be able to provide realism at interactive rates. The high-fidelity simulation of light and sound is still unachievable in real time as such physical accuracy is very computationally demanding. Only recently has visual perception been used in high-fidelity rendering to improve performance by a series of novel exploitations; to render parts of the scene that are not currently being attended to by the viewer at a much lower quality without the difference being perceived. This paper investigates the effect spatialized directional sound has on the visual attention of a user towards rendered images. These perceptual artefacts are utilized in selective rendering pipelines via the use of multi-modal maps. The multi-modal maps are tested through psychophysical experiments to examine their applicability to selective rendering algorithms, with a series of fixed cost rendering functions, and are found to perform significantly better than only using image saliency maps that are naively applied to multi-modal VEs.

Keywords: multi-modal; cross-modal; saliency; sound; graphics; selective rendering

Categories:
I.3.3 [Computer Graphics]: Picture/Image Generation - Viewing Algorithms
I.4.8 [Computer Graphics]: Image Processing and Computer Vision - Scene Analysis - Object Recognition
I.4.8 [Computer Graphics]: Image Processing and Computer Vision - Scene Analysis - Tracking