DataWink: Reusing and Adapting SVG-based Visualizations with LMMs

Liwenhan Xie (HKUST) · Yanna Lin (HKUST) · Can Liu (Nanyang Tech) · Huamin Qu (HKUST) · Xinhuan Shu (Newcastle)
[Teaser image]

    Motivation

    Turning data into expressive, personalized visualizations can feel out of reach for anyone without design expertise or technical know-how. But what if we start from a high-fidelity example? Can we leverage the power of Large Multimodal Models (LMMs/MLLMs) to convert a single instance into a reusable template and then gradually refine the design?

    Why Is a General-Purpose LMM Not a Panacea?

    While many of us can easily grasp the underlying data mapping of a well-designed visualization, this is not intuitive for LMMs, despite their strong potential in visual understanding and coding [1,2]. Essentially, LMMs are not trained on many high-quality information visualizations, which blend data-driven elements with stylistic nuances. They often struggle to navigate raw SVG files that involve complex data transformations and intricate graphical dependencies.


    Here is an example of using Gemini-2.5-Flash to replicate Krisztina Szűcs's work featuring beautiful window shadows. There are several issues: (1) The bar shadows share the same polygon shape instead of varying with the window heights. (2) The bar shadows do not preserve the original transparent style. (3) The window bars unnecessarily share the same width as the reference, and thus do not fill the entire width of the window, leaving a strange gap on the right side. (4) The chart origin shifts down slightly.

    [Side-by-side comparison: the original visualization by Krisztina Szűcs (reference) and the replication generated by Gemini-2.5-Flash]

    Here are links to the chat history and the generated code.

    Our Work: DataWink

    We investigate how to facilitate the reuse and adaptation of high-fidelity visualization designs, which often involve intricate graphical dependencies. We propose a pipeline that transforms SVG-based visualizations into reusable templates. These templates preserve both structural and stylistic elements while supporting dynamic adaptation to new datasets and granular refinement of the data encoding scheme. Building on the layered representation produced along the reverse-engineering pipeline, we introduce an interactive authoring tool with dynamic interfaces for flexible design iteration.
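
    To make this concrete, a reusable template can be viewed as a parameterized function from data to SVG: stylistic constants extracted from the example stay fixed, while data-driven attributes are recomputed per datum. Below is a minimal TypeScript sketch of that contract; the WindowTemplate type and its fields are hypothetical illustrations, not DataWink's actual code.

// Minimal sketch of a template as a data -> SVG function (hypothetical
// types and fields; not DataWink's actual implementation).
interface WindowTemplate {
  chartWidth: number;     // layout constant taken from the example
  chartHeight: number;
  barFill: string;        // stylistic constants preserved verbatim
  shadowOpacity: number;  // e.g. the transparent shadow style
}

interface Datum { label: string; value: number; }

function render(t: WindowTemplate, data: Datum[]): string {
  // Bar width is derived from the new dataset, so bars always fill the
  // full chart width (avoiding the gap issue in the VLM replication).
  const barWidth = t.chartWidth / data.length;
  const max = Math.max(...data.map(d => d.value));
  const marks = data.map((d, i) => {
    const h = (d.value / max) * t.chartHeight;
    const x = i * barWidth;
    const y = t.chartHeight - h;
    // The shadow polygon is re-derived from each bar's own height rather
    // than copied from the reference, so shadows vary with the windows.
    const dx = barWidth * 0.4;
    const shadow = `M ${x},${y} L ${x + dx},${y - dx} L ${x + barWidth + dx},${y - dx} L ${x + barWidth},${y} Z`;
    return `<path d="${shadow}" fill="black" opacity="${t.shadowOpacity}"/>` +
           `<rect x="${x}" y="${y}" width="${barWidth}" height="${h}" fill="${t.barFill}"/>`;
  });
  return `<svg width="${t.chartWidth}" height="${t.chartHeight}">${marks.join("")}</svg>`;
}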


    The core idea of our approach is to reduce the complexity of the inference task by breaking the SVG down into different layers and then generating intermediate representations as context for the LMMs.
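
    As a rough illustration of this layering, one can split an SVG's elements into data-encoding and decorative groups and summarize each group as a compact intermediate representation that serves as the LMM's context. The sketch below assumes a browser DOMParser; the classification heuristic and the LayerIR fields are our simplifications, not DataWink's actual pipeline.

// Sketch: decompose an SVG into layers and emit a compact intermediate
// representation as LMM context (illustrative heuristic, not DataWink's).
interface LayerIR {
  role: "data" | "decoration";
  tag: string;
  count: number;
  sharedStyle: Record<string, string>;
}

function decompose(svgText: string): LayerIR[] {
  const doc = new DOMParser().parseFromString(svgText, "image/svg+xml");
  // Group leaf elements by tag name.
  const byTag = new Map<string, Element[]>();
  for (const el of Array.from(doc.querySelectorAll("rect, path, circle, text"))) {
    const list = byTag.get(el.tagName) ?? [];
    list.push(el);
    byTag.set(el.tagName, list);
  }
  return Array.from(byTag, ([tag, els]) => ({
    // Heuristic: repeated sibling shapes are likely data-bound marks;
    // singletons are likely decoration (frames, titles, legends).
    role: (els.length > 2 ? "data" : "decoration") as LayerIR["role"],
    tag,
    count: els.length,
    sharedStyle: { fill: els[0].getAttribute("fill") ?? "" },
  }));
}

    Serializing this representation (e.g., JSON.stringify(decompose(svgText))) yields a description far smaller than the raw SVG, letting the LMM reason about the encoding scheme without navigating every path coordinate.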


    Design Implications for GenAI-powered Content Creation

    • Blurring the line between "content creator" and "tool developer". Anyone can create tools to iterate on their own digital content, but a core challenge is not only surfacing unfamiliar internal capabilities, but also overcoming a fundamental literacy gap. DataWink addresses this by translating a user's descriptive goals into technical operations and then externalizing those operations as new, understandable UI widgets, effectively teaching the user the "grammar" of the design system in a just-in-time manner (see the sketch after this list).
    • Reverse engineering GenAI output back to a familiar and malleable action space for effective user refinement. GenAI often produces "write-once" artifacts: complete outputs that are nonetheless difficult to refine or debug. DataWink's pipeline counters this by deconstructing the visual artifact into a semantically rich, layered template. This transforms a static design into a live, understandable system, empowering users to move from being passive recipients of a generated result to active participants who can confidently modify it.
    • Templating with common design patterns and dynamic interfaces for flexible re-design. DataWink reframes the template not as a static scaffold, but as a live, extensible program. Through conversational interaction, the LMM can dynamically modify the template's underlying code, transforming it from a restrictive constraint into a springboard for creative exploration.
    [Illustration of these design implications, generated with OpenAI]
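
    As a concrete (hypothetical) sketch of the externalized-widget idea from the first bullet: the LMM can answer a descriptive request with a machine-readable widget specification, which the tool renders as a live control bound to a template parameter. The WidgetSpec schema below is our illustration; DataWink's actual format may differ.

// Hypothetical widget spec emitted by the LMM and mounted by the tool
// (illustrative schema; not DataWink's actual format).
interface WidgetSpec {
  kind: "slider" | "colorPicker";
  label: string;   // user-facing name, e.g. "Shadow opacity"
  binds: string;   // template parameter to update, e.g. "shadowOpacity"
  min?: number;
  max?: number;
  step?: number;
}

function mountWidget(spec: WidgetSpec, onChange: (param: string, value: string) => void): HTMLElement {
  const input = document.createElement("input");
  input.type = spec.kind === "slider" ? "range" : "color";
  if (spec.min !== undefined) input.min = String(spec.min);
  if (spec.max !== undefined) input.max = String(spec.max);
  if (spec.step !== undefined) input.step = String(spec.step);
  // Each edit updates the bound template parameter and re-renders the chart.
  input.addEventListener("input", () => onChange(spec.binds, input.value));
  const wrapper = document.createElement("label");
  wrapper.textContent = spec.label;
  wrapper.appendChild(input);
  return wrapper;
}

    For instance, asking to "soften the shadows" could yield { kind: "slider", label: "Shadow opacity", binds: "shadowOpacity", min: 0, max: 1, step: 0.05 }, giving the user a persistent, reusable control rather than a one-off edit.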

    Open Questions

    • For effective human steering, how can we communicate the LMM's knowledge of the schema and context without overwhelming users? Can LMMs self-steer?
    • For expressiveness, how can we decode imprecise visual mappings in personal visualizations such as free-hand drawings?
    • For higher agency, how can we design dynamic interfaces that let users flexibly refine a design, from high-level intentions down to low-level implementations?

    Citation


@article{xie2025datawink,
  title     = {{DataWink}: Reusing and Adapting {SVG}-based Visualizations with LMMs},
  author    = {Liwenhan Xie and Yanna Lin and Can Liu and Huamin Qu and Xinhuan Shu},
  journal   = {IEEE Transactions on Visualization and Computer Graphics (Proc. VIS)},
  year      = {2025},
  publisher = {IEEE},
  note      = {Just Accepted}
}

    Updated on July 18, 2025.