StrokeFusion: Vector Sketch Generation via Joint Stroke-UDF Encoding and Latent Sequence Diffusion

Kavli Affiliate: Yi Zhou

| Summary:

In sketch generation, models trained on raster images often produce
non-stroke artifacts, while models trained on vector data typically lack a
holistic understanding of sketches, which compromises recognizability.
Moreover, existing methods struggle to extract common features from similar
elements (e.g., eyes of animals) appearing at varying positions across
sketches. To address these challenges, we propose StrokeFusion, a two-stage
framework for vector sketch generation. It contains a dual-modal sketch feature
learning network that maps strokes into a high-quality latent space. This
network decomposes sketches into normalized strokes and jointly encodes stroke
sequences with Unsigned Distance Function (UDF) maps, representing sketches as
sets of stroke feature vectors. Building upon this representation, our
framework exploits a stroke-level latent diffusion model that simultaneously
adjusts stroke position, scale, and trajectory during generation. This enables
high-fidelity sketch generation while supporting stroke interpolation editing.
Extensive experiments on the QuickDraw dataset demonstrate that our framework
outperforms state-of-the-art techniques, validating its effectiveness in
preserving structural integrity and semantic features. Code and models will be
made publicly available upon publication.
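
To make the "normalized strokes" and "UDF maps" mentioned in the summary more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The abstract does not specify the normalization scheme, the grid resolution, or how the UDF is rasterized, so the function names `normalize_stroke` and `stroke_udf`, the [-1, 1] coordinate range, and the default grid size are all assumptions made for illustration.

```python
import numpy as np


def normalize_stroke(points):
    """Center a stroke on its bounding box and scale it into [-1, 1].

    Hypothetical helper: returns the normalized points together with the
    center and scale that were removed, so position and size can be
    handled separately from the stroke's shape.
    """
    pts = np.asarray(points, dtype=float)
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    center = (lo + hi) / 2.0
    # Half of the longest bounding-box side; guarded against zero for dots.
    scale = max((hi - lo).max() / 2.0, 1e-8)
    return (pts - center) / scale, center, scale


def stroke_udf(points, size=32):
    """Unsigned distance from every grid cell in [-1, 1]^2 to a polyline."""
    pts = np.asarray(points, dtype=float)
    axis = np.linspace(-1.0, 1.0, size)
    gx, gy = np.meshgrid(axis, axis)  # row index follows y, column follows x
    grid = np.stack([gx.ravel(), gy.ravel()], axis=1)  # (P, 2)

    if len(pts) == 1:
        # A single point has no segments; use plain point distance.
        return np.linalg.norm(grid - pts[0], axis=1).reshape(size, size)

    a, b = pts[:-1], pts[1:]  # segment endpoints, each (S, 2)
    ab = b - a
    denom = np.maximum((ab ** 2).sum(axis=1), 1e-12)
    # Projection parameter of each grid point onto each segment, clamped
    # so the closest point stays between the segment endpoints.
    t = ((grid[:, None, :] - a[None]) * ab[None]).sum(axis=2) / denom
    t = np.clip(t, 0.0, 1.0)
    closest = a[None] + t[..., None] * ab[None]  # (P, S, 2)
    dist = np.linalg.norm(grid[:, None, :] - closest, axis=2)
    # Distance to the stroke is the minimum over all of its segments.
    return dist.min(axis=1).reshape(size, size)


# Example: a horizontal stroke drawn in pixel coordinates.
unit, center, scale = normalize_stroke([(10, 10), (30, 10)])
udf = stroke_udf(unit, size=33)
```

In this sketch the removed `center` and `scale` are kept alongside the normalized shape, loosely mirroring the summary's statement that stroke position, scale, and trajectory are adjusted during generation. How the paper actually encodes these quantities is not stated in the abstract.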

| Search Query: ArXiv Query: search_query=au:"Yi Zhou"&id_list=&start=0&max_results=3
