
SSD-1B

SSD-1B is a distilled version of Stable Diffusion XL (SDXL) that is 50% smaller and offers a 60% speedup while maintaining high-quality text-to-image generation.

AI Research

2M+ Downloads


Tech Stack

pytorch, sdxl, distillation

About This Project

The Segmind Stable Diffusion Model (SSD-1B) is a distilled version of Stable Diffusion XL (SDXL) that is 50% smaller and offers a 60% speedup while maintaining high-quality text-to-image generation. It was trained on diverse datasets, including GRIT and scraped Midjourney data, so that it can create a wide range of visual content from textual prompts.

This model employs a knowledge distillation strategy, where it leverages the teachings of several expert models in succession, including SDXL, ZavyChromaXL, and JuggernautXL, to combine their strengths and produce impressive visual outputs.
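As a rough illustration of what a distillation objective with a layer-level term can look like, the sketch below combines a task loss, an output-matching loss, and a per-layer feature-matching loss. It is a framework-free toy using plain Python lists, not the training code behind SSD-1B: the function names, the use of mean squared error for every term, and the weights `w_out` and `w_feat` are all assumptions made for this example. In a progressive setup, the same loss would be applied stage by stage, with each teacher in turn.

```python
def mse(a, b):
    """Mean squared error between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        return 0.0
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(student_out, teacher_out, target,
                      student_feats, teacher_feats,
                      w_out=1.0, w_feat=1.0):
    """Toy distillation objective (hypothetical weights and names).

    task     : student prediction vs. ground-truth target
    out_kd   : student prediction vs. teacher prediction
    feat_kd  : sum of per-layer feature mismatches (the layer-level term)
    """
    task = mse(student_out, target)
    out_kd = mse(student_out, teacher_out)
    feat_kd = sum(mse(s, t) for s, t in zip(student_feats, teacher_feats))
    return task + w_out * out_kd + w_feat * feat_kd
```

The feature term only makes sense for layers that exist in both networks, so a pruned student would be compared against the teacher at the layers it keeps.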

Special thanks to the HF team 🤗, especially Sayak, Patrick, and Poli, for their collaboration and guidance on this work.

Image Comparison (SDXL-1.0 vs SSD-1B)

[Image: sample outputs, SDXL-1.0 vs SSD-1B]

Speed Comparison

We have observed that SSD-1B is up to 60% faster than the base SDXL model. Below is a comparison on an A100 80GB.

[Image: speed comparison on an A100 80GB]

Below are the speedup metrics on an RTX 4090 GPU.

[Image: speedup metrics on an RTX 4090]
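A figure such as "60% faster" can be read in more than one way, so it helps to fix a definition when reproducing these measurements. The sketch below is one possible way to do so: it times a callable and reports how much more work per second the candidate does relative to the baseline. The helper names are hypothetical, and the throughput-based definition is an assumption, since the page does not say how the percentage was computed.

```python
import time


def mean_latency(fn, runs=5, warmup=1):
    """Average wall-clock seconds per call of fn, after warm-up calls.

    For GPU workloads the callable should block until the work has
    finished (for example by synchronizing the device), otherwise the
    timing only covers kernel launch.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


def percent_faster(baseline_seconds, candidate_seconds):
    """Throughput gain of the candidate over the baseline, in percent."""
    return (baseline_seconds / candidate_seconds - 1.0) * 100.0
```

Under this definition, a baseline that takes 1.6 s per image and a candidate that takes 1.0 s per image correspond to a gain of about 60%.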

Key Features

  • Text-to-Image Generation: The model excels at generating images from text prompts, enabling a wide range of creative applications.

  • Distilled for Speed: Designed for efficiency, this model offers up to a 60% speedup over base SDXL, making it a practical choice for real-time applications and other scenarios where rapid image generation is essential.

  • Diverse Training Data: Trained on diverse datasets, the model can handle a variety of textual prompts and generate corresponding images effectively.

  • Knowledge Distillation: By distilling knowledge from multiple expert models, the Segmind Stable Diffusion Model combines their strengths and minimizes their limitations, resulting in improved performance.
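For readers who want to try text-to-image generation with the model, the following is a minimal sketch using the Hugging Face `diffusers` library. It assumes that `torch` and `diffusers` are installed, that a CUDA GPU is available, and that the weights are published on the Hugging Face Hub under the id `segmind/SSD-1B` with an fp16 variant; none of these details are stated on this page. The wrapper function `generate_image` is a hypothetical name introduced for this example.

```python
def generate_image(prompt, negative_prompt=None, width=1024, height=1024,
                   model_id="segmind/SSD-1B"):
    """Generate one image from a text prompt (sketch, assumes a CUDA GPU)."""
    # Heavy dependencies are imported lazily so that merely defining
    # this function does not require them.
    import torch
    from diffusers import StableDiffusionXLPipeline

    # Assumption: the Hub repository provides fp16 safetensors weights.
    pipe = StableDiffusionXLPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        use_safetensors=True,
        variant="fp16",
    )
    pipe.to("cuda")

    result = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        width=width,
        height=height,
    )
    return result.images[0]  # a PIL image
```

A call such as `generate_image("an astronaut riding a green horse").save("out.png")` would download the weights on first use and write the result to disk. Loading the pipeline once and reusing it is preferable when generating many images.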

Model Architecture

SSD-1B is a 1.3B-parameter model with several layers removed from the base SDXL model.

[Image: SSD-1B model architecture]

Multi-Resolution Support

[Image: sample outputs at multiple resolutions]

SSD-1B supports the following output resolutions:

  • 1024 x 1024 (1:1 Square)

  • 1152 x 896 (9:7)

  • 896 x 1152 (7:9)

  • 1216 x 832 (19:13)

  • 832 x 1216 (13:19)

  • 1344 x 768 (7:4 Horizontal)

  • 768 x 1344 (4:7 Vertical)

  • 1536 x 640 (12:5 Horizontal)

  • 640 x 1536 (5:12 Vertical)
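When a request arrives with an arbitrary size, one simple approach is to snap it to the supported resolution with the nearest aspect ratio. The sketch below does this for the list above; the helper name `closest_resolution` and the choice to compare ratios on a log scale (so that wide and tall shapes are treated symmetrically) are assumptions of this example, not part of the model.

```python
import math

# (width, height) pairs taken from the list of supported resolutions.
SUPPORTED_RESOLUTIONS = [
    (1024, 1024),
    (1152, 896), (896, 1152),
    (1216, 832), (832, 1216),
    (1344, 768), (768, 1344),
    (1536, 640), (640, 1536),
]


def closest_resolution(width, height):
    """Return the supported (width, height) whose aspect ratio is nearest."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)
    return min(
        SUPPORTED_RESOLUTIONS,
        key=lambda wh: abs(math.log(wh[0] / wh[1]) - target),
    )
```

For instance, a 1920 x 1080 request (16:9) maps to 1344 x 768, the closest listed landscape shape, and any square request maps to 1024 x 1024.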

Citation

@misc{gupta2024progressive,
      title={Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss},
      author={Yatharth Gupta and Vishnu V. Jaddipal and Harish Prabhala and Sayak Paul and Patrick Von Platen},
      year={2024},
      eprint={2401.02677},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}