DataWink: Reusing and Adapting SVG-based Visualizations with LMMs

Liwenhan Xie (HKUST) · Yanna Lin (HKUST) · Can Liu (Nanyang Tech) · Huamin Qu (HKUST) · Xinhuan Shu (Newcastle)
[Teaser image]

    Motivation

    Turning data into expressive, personalized visualizations can feel out of reach for anyone without design expertise or technical know-how. But what if we start from a high-fidelity example? Can we leverage the power of Large Multimodal Models (LMMs/MLLMs) to convert a single instance into a reusable template and then gradually refine the design?

    Why Is a General-Purpose LMM Not a Panacea?

    While many of us can easily grasp the underlying data mapping of a well-designed visualization, this is not intuitive for LMMs, despite their strong potential in visual understanding and coding [1,2]. Essentially, LMMs are not trained on many high-quality information visualizations, which blend data-driven elements with stylistic nuances. They often struggle to navigate raw SVG files that involve complex data transformations and intricate graphical dependencies.


    Here is an example of using Gemini-2.5-Flash to replicate Krisztina Szűcs's work featuring beautiful window shadows. There are several issues: (1) The bar shadows share the same polygon shape instead of varying with the window heights. (2) The bar shadows do not preserve the original transparent style. (3) The window bars unnecessarily share the same width as the reference, and thus do not fill the entire width of the window, leaving a strange gap on the right side. (4) The chart origin shifts down slightly.

    [Side-by-side comparison: the original visualization by Krisztina Szűcs (reference) and the replication generated by Gemini-2.5-Flash]

    Here are links to the chat history and the generated code.

    Our Work: DataWink

    We investigate how to facilitate the reuse and adaptation of high-fidelity visualization designs, which often involve intricate graphical dependencies. We propose a pipeline that transforms SVG-based visualizations into reusable templates. These templates preserve both structural and stylistic elements while supporting dynamic adaptation to new datasets and granular refinement of the data encoding scheme. Building on the layered representation produced along the reverse-engineering pipeline, we introduce an interactive authoring tool with dynamic interfaces for flexible design iteration.
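
    To make this concrete, a reusable template can be viewed as a parameterized function from data to SVG: stylistic constants extracted from the example stay fixed, while data-driven attributes are recomputed per datum. Below is a minimal TypeScript sketch of that contract; the WindowTemplate type and its fields are hypothetical illustrations, not DataWink's actual code.

// Minimal sketch of a template as a data -> SVG function (hypothetical
// types and fields; not DataWink's actual implementation).
interface WindowTemplate {
  chartWidth: number;     // layout constant taken from the example
  chartHeight: number;
  barFill: string;        // stylistic constants preserved verbatim
  shadowOpacity: number;  // e.g. the transparent shadow style
}

interface Datum { label: string; value: number; }

function render(t: WindowTemplate, data: Datum[]): string {
  // Bar width is derived from the new dataset, so bars always fill the
  // full chart width (avoiding the gap issue in the VLM replication).
  const barWidth = t.chartWidth / data.length;
  const max = Math.max(...data.map(d => d.value));
  const marks = data.map((d, i) => {
    const h = (d.value / max) * t.chartHeight;
    const x = i * barWidth;
    const y = t.chartHeight - h;
    // The shadow polygon is re-derived from each bar's own height rather
    // than copied from the reference, so shadows vary with the windows.
    const dx = barWidth * 0.4;
    const shadow = `M ${x},${y} L ${x + dx},${y - dx} L ${x + barWidth + dx},${y - dx} L ${x + barWidth},${y} Z`;
    return `<path d="${shadow}" fill="black" opacity="${t.shadowOpacity}"/>` +
           `<rect x="${x}" y="${y}" width="${barWidth}" height="${h}" fill="${t.barFill}"/>`;
  });
  return `<svg width="${t.chartWidth}" height="${t.chartHeight}">${marks.join("")}</svg>`;
}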


    The core idea of our approach is to reduce the complexity of the inference task by breaking the SVG down into different layers and then generating intermediate representations as context for the LMMs.
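
    As a rough illustration of this layering, one can split an SVG's elements into data-encoding and decorative groups and summarize each group as a compact intermediate representation that serves as the LMM's context. The sketch below assumes a browser DOMParser; the classification heuristic and the LayerIR fields are our simplifications, not DataWink's actual pipeline.

// Sketch: decompose an SVG into layers and emit a compact intermediate
// representation as LMM context (illustrative heuristic, not DataWink's).
interface LayerIR {
  role: "data" | "decoration";
  tag: string;
  count: number;
  sharedStyle: Record<string, string>;
}

function decompose(svgText: string): LayerIR[] {
  const doc = new DOMParser().parseFromString(svgText, "image/svg+xml");
  // Group leaf elements by tag name.
  const byTag = new Map<string, Element[]>();
  for (const el of Array.from(doc.querySelectorAll("rect, path, circle, text"))) {
    const list = byTag.get(el.tagName) ?? [];
    list.push(el);
    byTag.set(el.tagName, list);
  }
  return Array.from(byTag, ([tag, els]) => ({
    // Heuristic: repeated sibling shapes are likely data-bound marks;
    // singletons are likely decoration (frames, titles, legends).
    role: (els.length > 2 ? "data" : "decoration") as LayerIR["role"],
    tag,
    count: els.length,
    sharedStyle: { fill: els[0].getAttribute("fill") ?? "" },
  }));
}

    Serializing this representation (e.g., JSON.stringify(decompose(svgText))) yields a description far smaller than the raw SVG, letting the LMM reason about the encoding scheme without navigating every path coordinate.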


    Design Implications for GenAI-powered Content Creation

    • Blurring the line between "content creator" and "tool developer". Anyone can create tools to iterate on their own digital content, but a core challenge is not only surfacing unfamiliar internal capabilities, but also overcoming a fundamental literacy gap. DataWink addresses this by translating a user's descriptive goals into technical operations and then externalizing those operations as new, understandable UI widgets, effectively teaching the user the "grammar" of the design system in a just-in-time manner (see the sketch after this list).
    • Reverse engineering GenAI output back to a familiar and malleable action space for effective user refinement. GenAI often produces "write-once" artifacts: complete outputs that are nonetheless difficult to refine or debug. DataWink's pipeline counters this by deconstructing the visual artifact into a semantically rich, layered template. This transforms a static design into a live, understandable system, empowering users to move from being passive recipients of a generated result to active participants who can confidently modify it.
    • Templating with common design patterns and dynamic interfaces for flexible re-design. DataWink reframes the template not as a static scaffold, but as a live, extensible program. Through conversational interaction, the LMM can dynamically modify the template's underlying code, transforming it from a restrictive constraint into a springboard for creative exploration.
    [Illustration of these design implications, generated with OpenAI]
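
    As a concrete (hypothetical) sketch of the externalized-widget idea from the first bullet: the LMM can answer a descriptive request with a machine-readable widget specification, which the tool renders as a live control bound to a template parameter. The WidgetSpec schema below is our illustration; DataWink's actual format may differ.

// Hypothetical widget spec emitted by the LMM and mounted by the tool
// (illustrative schema; not DataWink's actual format).
interface WidgetSpec {
  kind: "slider" | "colorPicker";
  label: string;   // user-facing name, e.g. "Shadow opacity"
  binds: string;   // template parameter to update, e.g. "shadowOpacity"
  min?: number;
  max?: number;
  step?: number;
}

function mountWidget(spec: WidgetSpec, onChange: (param: string, value: string) => void): HTMLElement {
  const input = document.createElement("input");
  input.type = spec.kind === "slider" ? "range" : "color";
  if (spec.min !== undefined) input.min = String(spec.min);
  if (spec.max !== undefined) input.max = String(spec.max);
  if (spec.step !== undefined) input.step = String(spec.step);
  // Each edit updates the bound template parameter and re-renders the chart.
  input.addEventListener("input", () => onChange(spec.binds, input.value));
  const wrapper = document.createElement("label");
  wrapper.textContent = spec.label;
  wrapper.appendChild(input);
  return wrapper;
}

    For instance, asking to "soften the shadows" could yield { kind: "slider", label: "Shadow opacity", binds: "shadowOpacity", min: 0, max: 1, step: 0.05 }, giving the user a persistent, reusable control rather than a one-off edit.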

    Open Questions

    • For effective human steering, how can we communicate the LMM's knowledge of the schema and context without overwhelming users? Can LMMs self-steer?
    • For expressiveness, how can we decode imprecise visual mappings in personal visualizations such as free-hand drawings?
    • For higher agency, how can we design dynamic interfaces that let users flexibly refine a design, from high-level intentions down to low-level implementations?

    Citation


@article{xie2025datawink,
  title     = {{DataWink}: Reusing and Adapting {SVG}-based Visualizations with LMMs},
  author    = {Liwenhan Xie and Yanna Lin and Can Liu and Huamin Qu and Xinhuan Shu},
  journal   = {IEEE Transactions on Visualization and Computer Graphics (Proc. VIS)},
  year      = {2025},
  publisher = {IEEE},
  note      = {Just Accepted}
}

    Updated on July 18, 2025.