Title: Improving SIMD Efficiency for Parallel Monte Carlo Light Transport on the GPU
Author: Antwerpen, Dietger van
Editors: Carsten Dachsbacher, William Mark, and Jacopo Pantaleoni
Year: 2011
Date: 2016-02-18
ISBN: 978-1-4503-0896-0
ISSN: 2079-8687
DOI: 10.1145/2018323.2018330
URL: https://doi.org/10.1145/2018323.2018330
Pages: 41-50

Abstract: Monte Carlo Light Transport algorithms such as Path Tracing (PT), Bi-Directional Path Tracing (BDPT) and Metropolis Light Transport (MLT) make use of random walks to sample light transport paths. When parallelizing these algorithms on the GPU, the stochastic termination of random walks results in an uneven workload between samples, which reduces SIMD efficiency. In this paper we propose to combine stream compaction and sample regeneration to keep SIMD efficiency high during random walk construction, in spite of stochastic termination. Furthermore, for BDPT and MLT, we propose to evaluate all bidirectional connections of a sample in parallel in order to balance the workload between GPU threads and improve SIMD efficiency during sample evaluation. We present efficient parallel GPU-only implementations for PT, BDPT, and MLT in CUDA. We show that our GPU implementations outperform similar CPU implementations by an order of magnitude.

Categories: I.3.7 [Computer Graphics]: Three Dimensional Graphics and Realism; Ray Tracing
Keywords: Monte Carlo Light Transport; Path Tracing; GPU