SampleMono: Multi-Frame Spatiotemporal Extrapolation of 1-spp Path-Traced Sequences via Transfer Learning
| dc.contributor.author | Derin, Mehmet Oguz | en_US |
| dc.contributor.editor | Christie, Marc | en_US |
| dc.contributor.editor | Han, Ping-Hsuan | en_US |
| dc.contributor.editor | Lin, Shih-Syun | en_US |
| dc.contributor.editor | Pietroni, Nico | en_US |
| dc.contributor.editor | Schneider, Teseo | en_US |
| dc.contributor.editor | Tsai, Hsin-Ruey | en_US |
| dc.contributor.editor | Wang, Yu-Shuen | en_US |
| dc.contributor.editor | Zhang, Eugene | en_US |
| dc.date.accessioned | 2025-10-07T06:05:17Z | |
| dc.date.available | 2025-10-07T06:05:17Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | Path-traced sequences at one sample per pixel (1-spp) are attractive for interactive previews but remain severely noisy, particularly under caustics, indirect lighting, and volumetric media. We present SampleMono, a novel approach that performs multi-frame spatiotemporal extrapolation of low-resolution, low-sample Monte Carlo sequences without requiring auxiliary buffers or scene-specific information. We transfer and prune a pre-trained video generation backbone and fine-tune it on SampleMono GYM, a synthetic Monte Carlo dataset, to generate four clean high-resolution frames from a longer window of noisy inputs, thereby decoupling the render and presentation timelines. Our experiments demonstrate that, by combining a frozen VAE encoder-decoder with training a video generation model pruned to two transformer layers, our pipeline provides both spatial upsampling and temporal extrapolation. Given a sequence of 16 RGB frames at 256×144 resolution, spaced 50 milliseconds apart and carrying severe Monte Carlo noise, it generates the subsequent four RGB frames at 1280×720 resolution, spaced 12.5 milliseconds apart, with substantially reduced noise at varying quality, while fitting within a VRAM budget of 5 GB. We plan to publish the code for the data GYM, model pruning, pipeline training, and rendering. | en_US |
| dc.description.sectionheaders | Posters and Demos | |
| dc.description.seriesinformation | Pacific Graphics Conference Papers, Posters, and Demos | |
| dc.identifier.doi | 10.2312/pg.20251307 | |
| dc.identifier.isbn | 978-3-03868-295-0 | |
| dc.identifier.pages | 2 pages | |
| dc.identifier.uri | https://doi.org/10.2312/pg.20251307 | |
| dc.identifier.uri | https://diglib.eg.org/handle/10.2312/pg20251307 | |
| dc.publisher | The Eurographics Association | en_US |
| dc.rights | Attribution 4.0 International License | |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
| dc.subject | CCS Concepts: Computing methodologies → Transfer learning; Ray tracing; Tracking | |
| dc.subject | Computing methodologies → Transfer learning | |
| dc.subject | Ray tracing | |
| dc.subject | Tracking | |
| dc.title | SampleMono: Multi-Frame Spatiotemporal Extrapolation of 1-spp Path-Traced Sequences via Transfer Learning | en_US |
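
The input and output figures quoted in the abstract imply a few simple ratios between the render and presentation timelines. The sketch below only restates that arithmetic. The constant and function names are hypothetical and do not come from the authors' code, and treating each frame as covering one full frame interval is an assumption made for illustration.

```python
from fractions import Fraction

# Figures quoted in the abstract (frame count, frame spacing in ms, resolution).
IN_FRAMES, IN_DT_MS, IN_RES = 16, Fraction(50), (256, 144)
OUT_FRAMES, OUT_DT_MS, OUT_RES = 4, Fraction(25, 2), (1280, 720)


def summarize():
    """Derive the ratios implied by the abstract's numbers (illustrative only)."""
    scale_x = Fraction(OUT_RES[0], IN_RES[0])
    scale_y = Fraction(OUT_RES[1], IN_RES[1])
    return {
        "spatial_scale": (scale_x, scale_y),       # per-axis upsampling factor
        "pixel_ratio": scale_x * scale_y,          # output pixels per input pixel
        "input_fps": 1000 / IN_DT_MS,              # render-side frame rate
        "output_fps": 1000 / OUT_DT_MS,            # presentation-side frame rate
        "temporal_ratio": IN_DT_MS / OUT_DT_MS,    # output frames per input interval
        # Assumption: each frame accounts for one full interval of its timeline.
        "input_window_ms": IN_FRAMES * IN_DT_MS,
        "output_span_ms": OUT_FRAMES * OUT_DT_MS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(name, value)
```

Under that assumption, the four generated frames together span 50 milliseconds, the same as one input frame interval, which is consistent with the abstract's statement that the render and presentation timelines are decoupled.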