How to choose a custom Stable Diffusion model

Where can I find the models?

Several sites offer free model downloads. If you are not sure where to start, I recommend Hugging Face.

Versions of models

For those new to Stable Diffusion, choosing the right model version can be a bit overwhelming. Currently, two versions are particularly popular:

  • Stable Diffusion v1-5: This version has a wide array of custom models available.
  • Stable Diffusion XL (SDXL): As a newer generation model, it supports 1024×1024 image generation by default.

There are other versions as well, such as Stable Diffusion v2 and v2-1, along with older models preceding v1-5. For detailed information on these older models, you can visit Hugging Face’s Stable Diffusion v1-5 page.

If you’re just starting out, Stable Diffusion v1-5 is an excellent choice due to its extensive support and range of custom models. Meanwhile, SDXL represents the cutting edge with its higher default resolution capabilities.

Types of models

When downloading models for Stable Diffusion, you will typically encounter the following types on model distribution websites:

  1. Main Model: This is the core model for Stable Diffusion. The download page may state which Stable Diffusion version the model is compatible with, but it most likely won’t say outright that the file is a main model. You need to judge from the file size (typically 2 GB to 7 GB) and from the absence of “ControlNet” or “LoRA” in the model description (see below).
  2. Inpainting Model: This is a special type of model that replaces or fills in part of an existing image using the Inpainting feature of Stable Diffusion. The model description should say that it’s for Inpainting.
  3. ControlNet Model: This specialized model works in tandem with an extension software package known as ControlNet, akin to a plugin in conventional software. ControlNet allows for more precise control over the generated images, enhancing the customization capabilities of Stable Diffusion.
  4. LoRA Model: LoRA, short for Low-Rank Adaptation, acts as a helper model to adjust the main model’s behavior. It is specifically trained with custom data to tailor image generation more closely to those data, offering a way to personalize the output images.
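
The rules of thumb above can be written down as a small helper. This is only a rough sketch of the heuristic, not a reliable detector: the function name guess_model_type and the keyword checks are my own assumptions, and the 2 GB to 7 GB range is the typical main-model size mentioned above.

```python
import re

GIB = 1024 ** 3  # bytes in one gibibyte


def guess_model_type(size_bytes: int, description: str) -> str:
    """Guess a model's type from its page description and file size.

    Hypothetical helper: keywords in the description win over file size,
    because inpainting models are about as large as main models.
    """
    text = description.lower()
    if "controlnet" in text:
        return "controlnet"
    # Match "lora" as a whole word so that words like "exploration" don't count.
    if re.search(r"\blora\b", text):
        return "lora"
    if "inpaint" in text:
        return "inpainting"
    # No telltale keyword: fall back to the typical main-model size range.
    if 2 * GIB <= size_bytes <= 7 * GIB:
        return "main"
    return "unknown"
```

If the helper returns "unknown", read the model description again rather than trusting the guess.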

Model file format

There are two different file formats in terms of organization:

  • Diffusers format
  • Single file format

For the Diffusers format, you should see the following directories on the download website:

  1. feature_extractor
  2. safety_checker
  3. scheduler
  4. text_encoder
  5. tokenizer
  6. unet
  7. vae

Each of these directories should contain model file(s) and/or configuration files. This layout is meant for the Hugging Face diffusers Python package. Unless you plan to run Stable Diffusion through that package, this format is not for you.
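
If you have already downloaded a model folder and want to confirm it follows this layout, a quick check like the one below can help. It is a minimal sketch: the function name missing_diffusers_subdirs is hypothetical, and it only tests for the seven directories listed above. Some repositories may leave out a directory such as safety_checker, so treat a missing entry as a hint rather than an error.

```python
from pathlib import Path

# The directory names listed above for the Diffusers-style layout.
EXPECTED_SUBDIRS = [
    "feature_extractor",
    "safety_checker",
    "scheduler",
    "text_encoder",
    "tokenizer",
    "unet",
    "vae",
]


def missing_diffusers_subdirs(model_dir: str) -> list[str]:
    """Return the expected subdirectories that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]
```

An empty list means every expected directory is present.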

The single file format packs everything the model needs into one file, and it is the format you need for using a custom model in most cases. Below are some examples of model file names:

  • analogMadness_v60.safetensors
  • dreamshaper_8.safetensors
  • realisticVisionV51_v51VAE.safetensors

Further clarifications for a single file format

For the single file format, the model creator may offer multiple variants. For safety and performance, consider the following recommendations:

  1. Be cautious with .ckpt files: Do NOT download or use a model file that ends with .ckpt unless you fully trust the source and the model creator. A .ckpt file is a Python pickle, which can run arbitrary code when it is loaded, so it can contain malware. Use a model file that ends with .safetensors instead.
  2. Choosing between “Baked VAE” and “no VAE”: If some files are labeled “with VAE” or “Baked VAE” and others are labeled “no VAE”, pick the one that includes the VAE.
  3. fp16 vs. fp32: If one file says “fp16” and another says “fp32”, or does not include the word “fp” at all, pick the “fp16” one. An fp16 file is roughly half the size of its fp32 counterpart.
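
To make the selection rules concrete, here is a small sketch that applies them to a list of file names. The function name pick_variant and the example file names in the usage note are made up for illustration, and real download pages label variants in many different ways, so the keyword matching here is only an assumption about typical naming.

```python
import re
from typing import Optional


def pick_variant(filenames: list[str]) -> Optional[str]:
    """Pick a preferred file: .safetensors only, then with VAE, then fp16."""
    # Rule 1: never consider anything that is not a .safetensors file.
    candidates = [f for f in filenames if f.lower().endswith(".safetensors")]
    if not candidates:
        return None

    def score(filename: str) -> tuple[bool, bool]:
        name = filename.lower()
        # Rule 2: prefer names that mention a VAE, unless marked "no VAE".
        has_vae = "vae" in name and not re.search(r"no[-_ ]?vae", name)
        # Rule 3: prefer fp16 over fp32 or an unspecified precision.
        is_fp16 = "fp16" in name
        return (has_vae, is_fp16)

    # On a tie, max() keeps the first candidate in the original order.
    return max(candidates, key=score)
```

For example, given the hypothetical names model_fp32_noVAE.safetensors, model_fp16_bakedVAE.safetensors and model_fp16_bakedVAE.ckpt, the helper returns model_fp16_bakedVAE.safetensors. If only .ckpt files are offered, it returns None, which is a cue to look for another source.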

Where to put the model after you download

If you are using Automatic1111, put the downloaded model file in the models/Stable-diffusion directory inside your Automatic1111 installation directory (e.g. the stable-diffusion-webui directory). This directory should already exist after a successful installation of Automatic1111.
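
If you prefer to script this step, the copy can be done in a few lines. This is a sketch under the assumptions above: install_model is a hypothetical helper name, and webui_dir is whatever directory your Automatic1111 installation lives in.

```python
import shutil
from pathlib import Path


def install_model(model_file: str, webui_dir: str) -> Path:
    """Copy a downloaded model file into models/Stable-diffusion."""
    target_dir = Path(webui_dir) / "models" / "Stable-diffusion"
    # The directory should already exist after installation; if it does not,
    # webui_dir is probably not an Automatic1111 installation directory.
    if not target_dir.is_dir():
        raise FileNotFoundError(f"{target_dir} does not exist")
    # copy2 keeps the original file name and returns the destination path.
    return Path(shutil.copy2(model_file, target_dir))
```

The function raises an error instead of creating the directory, so a mistyped installation path fails loudly rather than leaving the model somewhere the web UI will not look.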
