Abstract
Why Destylization?
Destylization vs. Stylization

(a) Stylization-based data generation pipeline. (b) Destylization-based data generation pipeline (ours). Our method enables authentic supervision with high-quality, style-faithful data, in contrast to stylization-based pipelines, which rely on pseudo-supervision that is often artifact-prone and style-unfaithful.
Destylization
DST: Text-Guided Destylization

(a) Destylization dataset construction: we use high-resolution images from HQ-50K and FFHQ as content images, covering six categories: humans, animals, plants, objects, scenes, and architecture. These images are stylized by four models, and captions are generated using InternVL2.5-7B, yielding stylized-content-caption triplets. (b) The architecture of the DST model.
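The triplet construction in (a) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: `Triplet`, `build_triplets`, and the toy stylizers and captioner are hypothetical names, the four stylization models and InternVL2.5-7B are replaced by plain callables, and the sketch assumes one caption is generated per content image and shared across its stylized variants.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Triplet:
    stylized: str  # identifier of the stylized image
    content: str   # identifier of the original content image
    caption: str   # text description paired with the images


def build_triplets(
    content_images: Iterable[str],
    stylizers: Dict[str, Callable[[str], str]],
    captioner: Callable[[str], str],
) -> List[Triplet]:
    """Pair every content image with each stylizer's output and one caption."""
    triplets = []
    for content in content_images:
        # Assumption: the caption describes the content image and is reused
        # for every stylized variant of it.
        caption = captioner(content)
        for stylize in stylizers.values():
            triplets.append(Triplet(stylize(content), content, caption))
    return triplets


# Toy stand-ins (hypothetical) so the sketch runs without any model weights.
toy_stylizers = {f"model_{i}": (lambda p, i=i: f"{p}@style{i}") for i in range(4)}
toy_triplets = build_triplets(
    ["cat.png", "oak.png"], toy_stylizers, lambda p: f"a photo of {p}"
)
```

With four stylizers, each content image contributes four triplets, so the two toy images above produce eight.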

(a) Style image collection and (b) text-guided destylization pipeline.
DST-Filter
Multi-Stage Evaluation Pipeline

The DST-Filter pipeline. DST-Filter assesses each <style, destylized> pair on two aspects, content preservation and style discrepancy, using GPT-4o with region-level and attribute-level Chain-of-Thought reasoning.
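The two-aspect check described above can be sketched as a simple filtering pass. This is a minimal illustration under stated assumptions, not the actual DST-Filter: `Verdict`, `Judge`, and `filter_pairs` are hypothetical names, the GPT-4o judge is replaced by an arbitrary callable, and the sketch assumes each aspect reduces to a pass/fail decision and that a pair is kept only when both pass.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Verdict:
    content_preserved: bool  # destylized image keeps the style image's content
    style_removed: bool      # destylized image differs enough in style


# Hypothetical judge signature. In the described pipeline this role is played
# by GPT-4o with region-level and attribute-level Chain-of-Thought reasoning.
Judge = Callable[[str, str], Verdict]


def filter_pairs(
    pairs: Iterable[Tuple[str, str]], judge: Judge
) -> List[Tuple[str, str]]:
    """Keep only <style, destylized> pairs that pass both checks."""
    kept = []
    for style_img, destylized_img in pairs:
        verdict = judge(style_img, destylized_img)
        # Assumption: both aspects must pass for the pair to be kept.
        if verdict.content_preserved and verdict.style_removed:
            kept.append((style_img, destylized_img))
    return kept


# Toy verdicts (hypothetical) standing in for real judge outputs.
toy_verdicts = {
    ("s1", "d1"): Verdict(True, True),
    ("s2", "d2"): Verdict(True, False),  # style leaked into the output
    ("s3", "d3"): Verdict(False, True),  # content changed
}
kept_pairs = filter_pairs(toy_verdicts.keys(), lambda s, d: toy_verdicts[(s, d)])
```

Passing the judge in as a callable keeps the filtering logic independent of how the verdicts are produced, so a cached or offline judge can be swapped in for testing.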
Dataset Overview
DST-100K Dataset Statistics

Overview of the DST-100K dataset.
Quantitative Results
Quantitative comparison of different style transfer methods

Quantitative comparison of different image editing methods

Qualitative Comparison
Qualitative comparison with different style transfer methods

Qualitative comparison with different image editing models


More Results
Diverse Style Transfer Results

Our method produces stylized results across a broad range of style categories, including 2D styles such as flat design, PS1 game style, cartoon, line art, illustration, and classic artworks, as well as 3D styles such as origami art, 3D voxel art, and 3D low-poly rendering.