Story2Board: A Training-Free Approach for Expressive Visual Storytelling

Dinkevich, David; Levy, Matan; Avrahami, Omri; Samuel, Dvir; Lischinski, Dani

dc.contributor.author	Dinkevich, David
dc.contributor.author	Levy, Matan
dc.contributor.author	Avrahami, Omri
dc.contributor.author	Samuel, Dvir
dc.contributor.author	Lischinski, Dani
dc.contributor.editor	Masia, Belen
dc.contributor.editor	Thies, Justus
dc.date.accessioned	2026-04-17T08:01:29Z
dc.date.available	2026-04-17T08:01:29Z
dc.date.issued	2026
dc.description.abstract	We present Story2Board, a training-free framework for expressive storyboard generation from natural language. Existing methods narrowly focus on subject identity, overlooking key aspects of visual storytelling such as spatial composition, background evolution, and narrative pacing. To address this, we introduce a lightweight consistency framework composed of two components: Latent Panel Anchoring, which preserves a shared character reference across panels, and Reciprocal Attention Value Mixing, which softly blends visual features between token pairs with strong reciprocal attention. Together, these mechanisms enhance coherence without architectural changes or fine-tuning, enabling state-of-the-art diffusion models to generate visually diverse yet consistent storyboards. To structure generation, we use an off-the-shelf language model to convert free-form stories into grounded panel-level prompts. To evaluate, we propose the Rich Storyboard Benchmark, a suite of open-domain narratives designed to assess layout diversity and background-grounded storytelling, in addition to consistency. We also introduce a new Scene Diversity metric that quantifies spatial and pose variation across storyboards. Our qualitative and quantitative results, as well as a user study, show that Story2Board produces more dynamic, coherent, and narratively engaging storyboards than existing baselines. Project page: https://daviddinkevich.github.io/Story2Board/
dc.description.number	2
dc.description.sectionheaders	Temporal Vision: Video Generation, Pose, and Narrative
dc.description.seriesinformation	Computer Graphics Forum
dc.description.volume	45
dc.identifier.doi	10.1111/cgf.70319
dc.identifier.issn	1467-8659
dc.identifier.pages	26 pages
dc.identifier.uri	https://diglib.eg.org/handle/10.1111/cgf70319
dc.identifier.uri	https://doi.org/10.1111/cgf.70319
dc.publisher	The Eurographics Association and John Wiley & Sons Ltd.
dc.rights	CC-BY-4.0
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	CCS Concepts: Imaging/Video → Neural Image/Video Synthesis; Methods/Applications → Artificial Intelligence/Machine Learning;
dc.title	Story2Board: A Training-Free Approach for Expressive Visual Storytelling

Files

Original bundle

Name: cgf70319.pdf
Size: 71.39 MB
Format: Adobe Portable Document Format

Collections

45-Issue 2
EG 2026 - Full Papers - CGF 45-Issue 2