Story2Board: A Training-Free Approach for Expressive Visual Storytelling

dc.contributor.author: Dinkevich, David
dc.contributor.author: Levy, Matan
dc.contributor.author: Avrahami, Omri
dc.contributor.author: Samuel, Dvir
dc.contributor.author: Lischinski, Dani
dc.contributor.editor: Masia, Belen
dc.contributor.editor: Thies, Justus
dc.date.accessioned: 2026-04-17T08:01:29Z
dc.date.available: 2026-04-17T08:01:29Z
dc.date.issued: 2026
dc.description.abstract: We present Story2Board, a training-free framework for expressive storyboard generation from natural language. Existing methods narrowly focus on subject identity, overlooking key aspects of visual storytelling such as spatial composition, background evolution, and narrative pacing. To address this, we introduce a lightweight consistency framework composed of two components: Latent Panel Anchoring, which preserves a shared character reference across panels, and Reciprocal Attention Value Mixing, which softly blends visual features between token pairs with strong reciprocal attention. Together, these mechanisms enhance coherence without architectural changes or fine-tuning, enabling state-of-the-art diffusion models to generate visually diverse yet consistent storyboards. To structure generation, we use an off-the-shelf language model to convert free-form stories into grounded panel-level prompts. To evaluate, we propose the Rich Storyboard Benchmark, a suite of open-domain narratives designed to assess layout diversity and background-grounded storytelling, in addition to consistency. We also introduce a new Scene Diversity metric that quantifies spatial and pose variation across storyboards. Our qualitative and quantitative results, as well as a user study, show that Story2Board produces more dynamic, coherent, and narratively engaging storyboards than existing baselines. Project page: https://daviddinkevich.github.io/Story2Board/
dc.description.number: 2
dc.description.sectionheaders: Temporal Vision: Video Generation, Pose, and Narrative
dc.description.seriesinformation: Computer Graphics Forum
dc.description.volume: 45
dc.identifier.doi: 10.1111/cgf.70319
dc.identifier.issn: 1467-8659
dc.identifier.pages: 26 pages
dc.identifier.uri: https://diglib.eg.org/handle/10.1111/cgf70319
dc.identifier.uri: https://doi.org/10.1111/cgf.70319
dc.publisher: The Eurographics Association and John Wiley & Sons Ltd.
dc.rights: CC-BY-4.0
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: CCS Concepts: Imaging/Video → Neural Image/Video Synthesis; Methods/Applications → Artificial Intelligence/Machine Learning;
dc.subject: CCS Concepts
dc.subject: Imaging/Video → Neural Image/Video Synthesis
dc.subject: Methods/Applications → Artificial Intelligence/Machine Learning
dc.title: Story2Board: A Training-Free Approach for Expressive Visual Storytelling
Files
Original bundle
Name: cgf70319.pdf
Size: 71.39 MB
Format: Adobe Portable Document Format