Controllable Cinemagraph Generation from A Still Image

Abstract
We present a training-free framework for controllable cinemagraph generation, enabling the creation of hybrid visuals that combine still imagery with subtle, localized motion. Unlike prior approaches, which offer limited spatial and temporal control, our method lets users explicitly specify static regions through masks and modulate motion intensity across different areas of the scene. Building on text-guided image-to-video diffusion models, we introduce a soft latent blending strategy that uses the user-specified mask to generate foreground motion seamlessly while keeping the background frozen. In addition, we propose a new temporal spacing representation, compatible with the positional embedding space, that enables fine-grained adjustment of motion characteristics, such as speed and amplitude, within a single video. To avoid the motion collapse and unnatural dynamics caused by strong constraints on the first and last frames (i.e., enforcing identical frames), we introduce a two-stage generation strategy that first generates unconstrained motion and then softly enforces a seamless loop back to the initial frame. Our approach produces high-quality, user-controllable cinemagraphs with precise spatial and temporal fidelity, significantly expanding creative flexibility compared to existing methods.
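The mask-based blending described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `feather_mask` and `blend_latents`, the box-blur feathering, the tensor shapes, and the use of NumPy arrays in place of real diffusion latents are all assumptions made for illustration. It shows only the general idea of mixing generated latents with the still image's latent through a softened mask, so that static regions stay frozen and the transition at the mask boundary is gradual.

```python
import numpy as np


def feather_mask(mask: np.ndarray, passes: int = 2) -> np.ndarray:
    """Soften a binary mask (1 = moving, 0 = static) with repeated 3x3 box blurs.

    Hypothetical helper: the source does not say how the mask is softened.
    """
    soft = mask.astype(np.float64)
    h, w = soft.shape
    for _ in range(passes):
        padded = np.pad(soft, 1, mode="edge")
        # Average the nine shifted copies of the padded mask (a 3x3 box filter).
        soft = sum(
            padded[dy:dy + h, dx:dx + w]
            for dy in range(3)
            for dx in range(3)
        ) / 9.0
    return soft


def blend_latents(generated: np.ndarray, still: np.ndarray, soft_mask: np.ndarray) -> np.ndarray:
    """Blend per-frame latents with the still-image latent.

    generated: (T, C, H, W) latents for T frames (assumed layout)
    still:     (C, H, W) latent of the input image
    soft_mask: (H, W) values in [0, 1]; 1 keeps generated motion, 0 keeps the still
    """
    m = soft_mask[None, None, :, :]
    return m * generated + (1.0 - m) * still[None, :, :, :]


# Toy data standing in for real latents (assumption: random values, tiny sizes).
rng = np.random.default_rng(0)
generated = rng.normal(size=(4, 2, 8, 8))
still = rng.normal(size=(2, 8, 8))

mask = np.zeros((8, 8))
mask[:, 4:] = 1.0  # right half of the frame is allowed to move

soft = feather_mask(mask)
blended = blend_latents(generated, still, soft)
```

In a diffusion sampler, a blend of this kind would typically be applied at denoising steps rather than once on final latents; how and when the paper applies it is not stated in the abstract.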
Description

        
@inproceedings{10.2312:egs.20261010,
  booktitle = {Eurographics 2026 - Short Papers},
  editor    = {Musialski, Przemyslaw and Lim, Isaak},
  title     = {{Controllable Cinemagraph Generation from A Still Image}},
  author    = {Le, Van Thanh and Ito, Daichi and Mahapatra, Aniruddha and Mai, Long and Singh, Krishna Kumar and Kulkarni, Kuldeep and Liu, Feng and Fu, Yun and Yoon, Jae Shin},
  year      = {2026},
  publisher = {The Eurographics Association},
  ISSN      = {2309-5059},
  ISBN      = {978-3-03868-299-8},
  DOI       = {10.2312/egs.20261010}
}
Citation