Complete guide to samplers in Stable Diffusion
Dive into the world of Stable Diffusion samplers and unlock the potential of image generation.
Introduction
As we saw in the article How Stable Diffusion works, when we ask Stable Diffusion to generate an image, the first thing it does is create an image full of noise; the sampling process then removes that noise over the series of steps we have specified. It is something like starting with a block of white marble and chiseling away at it for several days until you get Michelangelo's David.
Several algorithms come into play in this process. The one known as the sampler is in charge of obtaining a sample from the model being used in Stable Diffusion, applying the noise estimated by the noise predictor, and subtracting it from the image it is cleaning, polishing the marble a little further in each step.
This algorithm handles the how, while the algorithm known as the noise scheduler handles the how much.
If the noise reduction were linear, our image would change by the same amount in each step, producing abrupt changes. A noise scheduler with a decreasing slope can remove large amounts of noise in the early steps for faster progress, and then remove less and less noise to fine-tune the small details of the image.
Following the marble analogy, in the beginning it will probably be more useful to give it good hits and remove large chunks to advance quickly, while towards the end we will have to go very slowly to fine tune the details and not make an arm fall off.
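To make the idea concrete, here is a minimal Python sketch comparing a linear noise schedule with a decreasing-slope one. The sigma range (10.0 down to 0.1) and the rho exponent are illustrative choices, not the values Stable Diffusion actually uses:

```python
# Sketch: linear vs. decreasing noise schedules over 10 steps.
# Illustrative only -- real schedulers use carefully tuned formulas.

def linear_schedule(sigma_max, sigma_min, n):
    """Noise level drops by the same amount every step."""
    step = (sigma_max - sigma_min) / (n - 1)
    return [sigma_max - i * step for i in range(n)]

def decreasing_schedule(sigma_max, sigma_min, n, rho=3.0):
    """Big noise reductions first, small refinements at the end."""
    return [sigma_min + (sigma_max - sigma_min) * ((1 - i / (n - 1)) ** rho)
            for i in range(n)]

lin = linear_schedule(10.0, 0.1, 10)
dec = decreasing_schedule(10.0, 0.1, 10)
# Noise removed in the first step vs. the last step:
print(lin[0] - lin[1], lin[-2] - lin[-1])   # identical amounts every step
print(dec[0] - dec[1], dec[-2] - dec[-1])   # large chunk first, tiny polish last
```

With the decreasing schedule the first step removes a large chunk of noise while the last step barely touches the image: the chisel-then-polish behavior described above.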
A key aspect of the process is convergence. When a sampling algorithm reaches a point where more steps will not improve the result, the image is said to have converged.
Some algorithms converge very quickly and are ideal for testing ideas. Others take longer or require a greater number of steps but usually offer more quality. Others never converge because they have no such limit, which makes them more creative.
With this article you will understand the nomenclature as well as the uses of the different methods without going into too much technical detail.
The image used in the demonstrations has been generated with the following parameters:
- Checkpoint: dreamshaper_631BakedVae.safetensors
- Positive prompt: ultra realistic 8k cg, picture-perfect black sports car, desert wasteland road, car drifting, tires churns up the dry earth beneath creating a magnificent sand dust cloud that billows outwards, nuclear mushroom cloud in the background far away, sunset, masterpiece, professional artwork, ultra high resolution, cinematic lighting, cinematic bloom, natural light
- Negative prompt: paintings, cartoon, anime, sketches, lowres, sun
- Width/Height: 512/512
- CFG Scale: 7
- Seed: 1954306091
Samplers available
Depending on the software used you will find a varied list of possibilities. In this case we are going to analyze the samplers available in Automatic1111.
It is difficult to classify them into groups, although there are clearly two main approaches:
- Probabilistic models, such as DDPM, DDIM, PLMS and the DPM family of models. These generative models are able to generate an output based on the probability distribution estimated by the model. It would be like using a camera to photograph a landscape.
- Numerical methods, such as Euler, Heun and LMS. In each step, the solution to a particular mathematical problem is sought and estimated bit by bit. In this case it would be like using paint and a canvas to create the landscape, adding new details in each step.
DDPM
DDPM (paper) (Denoising Diffusion Probabilistic Models) is one of the first samplers available in Stable Diffusion. It is based on explicit probabilistic models to remove noise from an image. It requires a large number of steps to achieve a decent result.
It is no longer available in Automatic1111.
DDIM
DDIM (paper) (Denoising Diffusion Implicit Models) works similarly to DDPM, using implicit probabilistic models instead. This difference produces better results in a much smaller number of steps, making it a faster sampler with little loss of quality.
As can be seen in the dust cloud, better results are obtained with a high number of steps (100+). There are better alternatives, as we will see below.
PLMS
PLMS (paper) (Pseudo Linear Multi-Step) is an improvement over DDIM. Using a 50-step process it is possible to achieve higher quality than a 1000-step process in DDIM. Fascinating, isn't it? Well, read on, this is nothing.
In the case of PLMS we cannot use very few steps because it is not able to clean up the noise, but between 50 and 100 steps it is already able to provide good results.
Euler
Euler is possibly the simplest method. Based on ordinary differential equations (ODEs), this numerical method removes noise linearly in each step. Due to its simplicity it may not be as accurate as we would like, but it is one of the fastest.
Euler is so fast that it is able to deliver good results even in 10 steps. Its strong point is between 30 and 50 steps.
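Here is a minimal sketch of an Euler denoising step, assuming a toy denoiser that predicts the clean image perfectly (in Stable Diffusion this prediction comes from the U-Net); the sigma values are made up:

```python
import numpy as np

# Sketch of Euler denoising steps with a perfect toy "denoiser".
# A real sampler would call the diffusion model here instead.

def denoiser(x, sigma, clean):
    # Hypothetical stand-in for the model: returns the clean image.
    return clean

def euler_step(x, sigma, sigma_next, clean):
    d = (x - denoiser(x, sigma, clean)) / sigma   # estimated derivative (direction)
    return x + (sigma_next - sigma) * d           # linear step to the next noise level

rng = np.random.default_rng(0)
clean = rng.standard_normal(4)
x = clean + 10.0 * rng.standard_normal(4)   # start: clean image + heavy noise

for sigma, sigma_next in [(10.0, 5.0), (5.0, 1.0), (1.0, 0.0)]:
    x = euler_step(x, sigma, sigma_next, clean)

print(np.allclose(x, clean))  # True: with a perfect denoiser, Euler lands on the clean image
```

In practice the denoiser's prediction is imperfect and changes at every noise level, which is why the number and spacing of steps matter so much.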
Heun
Heun is the perfectionist brother of Euler. While Euler only performs a linear approximation, Heun performs two calculations in each step, making it a second-order sampler: it first makes a prediction with a linear approximation and then applies a correction to refine it. This improvement in accuracy offers higher quality in exchange for a small drawback: it takes twice as long.
Karl Heun developed this numerical method more than a century ago!
At 10 steps it still shows some noise, but it disappears with a few more. As you can see, it offers high quality at 30 steps, although at 50 it offers a little more detail. At 100 steps the image hardly changes and it is not worth growing old waiting for the result.
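The predict-then-correct idea can be sketched on a toy ODE (dy/dt = -y, whose exact solution is exp(-t)), far simpler than the diffusion ODE but enough to show why Heun is more accurate than Euler at twice the cost per step:

```python
import math

# Sketch: Euler (first order) vs. Heun (predict + correct, second order)
# on the toy ODE dy/dt = -y, integrated from t=0 to t=1.

def f(t, y):
    return -y

def euler(y, t, h):
    return y + h * f(t, y)

def heun(y, t, h):
    y_pred = y + h * f(t, y)                         # predictor: plain Euler step
    return y + h * (f(t, y) + f(t + h, y_pred)) / 2  # corrector: average the two slopes

h, steps = 0.1, 10
ye = yh = 1.0
for i in range(steps):
    t = i * h
    ye = euler(ye, t, h)
    yh = heun(yh, t, h)

exact = math.exp(-1.0)
print(abs(ye - exact) > abs(yh - exact))  # True: Heun is closer to the exact solution
```

Note that each Heun step evaluates f twice, mirroring why the Heun sampler calls the model twice per step.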
LMS
LMS, or Linear Multi-Step method, is the cousin of PLMS that uses a numerical rather than a probabilistic approach (PLMS - P = LMS).
Moreover, unlike Euler and Heun, it uses information from previous steps to reduce noise in each new step. It offers better accuracy in exchange for higher computational requirements (it is slower).
Using few steps we get a sampler capable of generating psychedelic images imitating the effect of drugs. Jokes aside, it is a sampler that is not worth it, because despite being fast it needs around 100 steps to offer something decent.
Family of DPM models
DPM (Diffusion Probabilistic Models) are probabilistic models that offer improvements over DDPM, hence the similar name. There is no implementation available in Automatic1111 either, because improved versions exist, as we will see below.
DPM2 is an improvement over DPM. You could say it is version 2.
With 10 steps you already get an impressive quality (don't try 5 steps, you won't like the result). Around 30 to 50 steps is the ideal point. More steps are usually not worth it.
On the other hand we have DPM++, which is also an improvement over DPM. It uses a hybrid approach, combining deterministic and probabilistic methods for sampling and subsequent noise reduction. There is no basic implementation of this sampler in Automatic1111, but it appears combined with other methods. We will see this in the next section.
Thus, two versions with improvements were born from DPM: DPM2 and DPM++.
Faster DPM models (DPM-Solver and UniPC)
Diffusion Probabilistic Models (DPM) are, as the name suggests, probabilistic. In each step, equations are not solved by deterministic numerical methods as in the case of Euler, Heun or LMS; instead, the problem is approached by approximation, trying to sample as accurately as possible.
Within these models there is a piece called the solver, an algorithm that plays an important role in calculating and approximating a probability distribution during sampling. And this is where a new technique known as DPM-Solver comes in, shortening the duration of each step.
In other words, models like DPM fast (paper) or DPM++ 2S/DPM++ 2M (paper) implement a faster solver that saves time in the sampling process.
It is fast (though not that fast either), but with few steps it is unusable. Interestingly, it offers a different result from the rest of the samplers, and the cinematic effect seems more pronounced.
In the case of DPM++ 2S/DPM++ 2M, the number 2 means that they are second order. That is, they use both a predictor and a corrector to approximate the result accurately.
The S stands for Single step: a single calculation is performed in each step, so it is faster.
In contrast, the letter M stands for Multi step, an approach in which multiple calculations are performed in each step, taking into account information obtained in previous steps. This translates into more accurate, higher-quality convergence at the cost of taking longer.
In both modalities this solver is faster than the default DPM model solver.
There is no Automatic1111 implementation of plain DPM++ 2S, only its A, Karras and SDE variants (more on that later). So let's see some samples of DPM++ 2M.
Little to say about this all-rounder sampler. It offers impressive results in 30 steps and if you give it some more time it can be squeezed even more.
As for UniPC (paper), it is a solver that consists of two parts: a unified predictor (UniP) and a unified corrector (UniC). This method can be applied to any DPM model and focuses on delivering the maximum possible sampling quality in the smallest number of steps. Remember when PLMS brought down to 50 steps what DDIM did in 1000? Well, in some cases UniPC is able to generate quality images in as few as 5 or 10 steps.
Thus, UniPC can be integrated into DPM models, both Single step and Multi step, making it comparable to DPM++ 2S or DPM++ 2M, with the particularity of offering better results when the number of steps is very low.
Even the UniC corrector alone can be integrated into these sampling algorithms to achieve higher efficiency (e.g. DPM++ 2S + UniC).
In this example 10 steps is not enough to generate an image without noise, but with 15 or 20 you will get it. At 30 steps it is magnificent and there is no need to go any further, although there is still some room for improvement.
More accurate DPM models (Adaptive)
The DPM adaptive model is an extension of the DPM model that adapts the step size according to the difficulty of the problem it is trying to solve. In other words, it is as if the specified number of steps were ignored and the algorithm were free to sample as efficiently as needed until the best possible convergence is achieved. It generates higher-quality images at the expense of taking as long as it needs (it is the slowest sampler).
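The adaptive idea can be sketched with a toy ODE: estimate the error of each step (here by comparing one full Euler step against two half steps) and let the step size grow or shrink accordingly. This is a generic adaptive integrator, not the actual DPM adaptive algorithm:

```python
import math

# Sketch of adaptive step sizing on the toy ODE dy/dt = -y, from t=0 to t=1.
# The integrator takes as many steps as it needs, not a fixed count.

def f(t, y):
    return -y

t, y, h, tol = 0.0, 1.0, 0.5, 1e-4
steps = 0
while t < 1.0:
    h = min(h, 1.0 - t)
    full = y + h * f(t, y)                  # one Euler step of size h
    half = y + h / 2 * f(t, y)              # two Euler steps of size h/2
    half = half + h / 2 * f(t + h / 2, half)
    err = abs(full - half)                  # error estimate from the difference
    if err < tol:
        t, y = t + h, half                  # accept the more accurate result
        h *= 1.5                            # easy region: try bigger steps
    else:
        h *= 0.5                            # hard region: retry with a smaller step
    steps += 1

print(abs(y - math.exp(-1.0)) < 1e-2)  # True: accurate without hand-picking a step count
```

The step count `steps` is chosen by the algorithm itself, which is why this kind of sampler ignores the step slider and simply takes as long as it needs.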
In this case it has taken three or four times as long as other samplers, but the result is amazing. The image composition is different from all other samplers and is more similar to DPM fast.
Other features
Only one sampling algorithm can be chosen: either Euler or DPM can be used, but not both at the same time. Variants and extra features, on the other hand, can be combined.
For example, we can use the sampler named DPM2 A Karras. Let's see what these new terms mean.
Ancestral variants
When a sampler contains the letter A, it usually means that it belongs to the category of ancestral variants. These variants add fresh random noise in each new step. It is as if, after cleaning up the noise in one step, some noise were added back.
Samplers with this feature never converge because of the random noise added in each step. If there is always noise to remove, you can always go one step further.
This makes them more creative samplers. An extra step does not necessarily increase the quality, but rather gives another similar result.
If you try to reproduce an image generated with Stable Diffusion and you don't succeed even though you are using the same seed and the same parameters, it may be because you are using an ancestral sampler. This is normal! The noise that is re-added in each step is random and different implementations or versions of the sampler will almost certainly generate different results.
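Here is a sketch of one ancestral Euler step, following the k-diffusion formulation: part of the noise budget is removed deterministically and the rest is re-injected as fresh random noise. The denoised prediction and sigma values are toy placeholders:

```python
import numpy as np

# Sketch of an ancestral Euler step: step down to a lower noise level,
# then add fresh random noise back in. `denoised` would come from the model.

def euler_ancestral_step(x, sigma, sigma_next, denoised, rng):
    # Split sigma_next into a deterministic part and a random part
    sigma_up = min(sigma_next,
                   (sigma_next**2 * (sigma**2 - sigma_next**2) / sigma**2) ** 0.5)
    sigma_down = (sigma_next**2 - sigma_up**2) ** 0.5
    d = (x - denoised) / sigma                 # derivative, as in plain Euler
    x = x + (sigma_down - sigma) * d           # deterministic Euler step
    return x + sigma_up * rng.standard_normal(x.shape)  # re-add random noise

x0 = np.ones(4) * 5.0
denoised = np.zeros(4)
a = euler_ancestral_step(x0, 10.0, 5.0, denoised, np.random.default_rng(1))
b = euler_ancestral_step(x0, 10.0, 5.0, denoised, np.random.default_rng(2))
print(np.allclose(a, b))  # False: different random noise, different result
```

This is why two runs of an ancestral sampler only match when the random noise source is seeded and implemented identically.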
Some examples of ancestral samplers are Euler A, DPM2 A or DPM++ 2S A.
Euler A gives a great result in 25-30 steps while also being very fast. At 50 steps the quality is worse, and then at 100 steps it is better again; it is a lottery. Moreover, you can see how the image composition is constantly changing due to the random noise introduced in each step. Far from being a drawback, this is perhaps its greatest advantage.
Karras Variants
Variants with the word Karras (or K) (paper) refer to work led by Nvidia engineer Tero Karras. This work introduces a series of improvements in some samplers, achieving improved efficiency in both the quality of the output and the computation required for sampling.
Some samplers using these changes are: LMS Karras, DPM2 Karras, DPM2 A Karras, DPM++ 2S A Karras, DPM++ 2M Karras and DPM++ SDE Karras.
Like DPM++ 2M, this sampler offers very good results between 30 and 50 steps, but the Karras version has the advantage of producing better results in a reduced number of steps, as can be seen in the following example:
If you use a high number of steps you will have a hard time seeing the difference.
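The Karras schedule itself is easy to sketch: sigmas are spaced uniformly in sigma^(1/rho), with rho = 7 as the paper's default, which concentrates most of the steps at low noise levels. The sigma range here is illustrative, not the one Stable Diffusion uses:

```python
# Sketch of the Karras et al. noise schedule (rho-spaced sigmas), the key
# change behind the "Karras" sampler variants. Sigma range is illustrative.

def karras_sigmas(n, sigma_min=0.1, sigma_max=10.0, rho=7.0):
    ramp = [i / (n - 1) for i in range(n)]
    min_inv_rho = sigma_min ** (1 / rho)
    max_inv_rho = sigma_max ** (1 / rho)
    return [(max_inv_rho + t * (min_inv_rho - max_inv_rho)) ** rho for t in ramp]

sigmas = karras_sigmas(10)
print(round(sigmas[0], 4), round(sigmas[-1], 4))  # 10.0 0.1 (endpoints of the range)
# Most of the schedule sits at low noise, polishing fine details:
print(sum(s < 1.0 for s in sigmas))
```

Compared with a linear schedule over the same range, far more steps land near sigma_min, which is where the small details get refined.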
Stochastic variants
The SDE (paper) variants use stochastic differential equations. Without going into further detail, this type of differential equation makes it possible to model the noise in a more sophisticated and accurate way, using information from previous steps, which in principle generates higher-quality images in exchange for being slower. Being stochastic, these samplers never converge, so more steps do not mean higher quality but rather more variations, just like with ancestral samplers.
At the date of publication of this article we have DPM++ SDE, DPM++ 2M SDE, DPM++ SDE Karras and DPM++ 2M SDE Karras.
Stochastic samplers are slow but offer incredible results even with 10 steps. Their results are also more varied and creative. As they never converge they are an alternative to ancestral samplers.
What is the best sampler in Stable Diffusion?
Is a Ferrari or a Jeep better? Well it depends on whether you're going off-road, doesn't it?
Depending on what you need it's better to use one type of sampler or another. With the above information I hope it will be easy to choose, but here are some hints.
Image quality
If you are looking for quality, it is a good idea to pursue convergence; that is the point at which you get the highest quality. If you don't want to sacrifice too much generation speed, forget about samplers like DDIM that need hundreds of steps to converge. Heun and LMS Karras offer good results, but it is better to use DPM++ 2M or its Karras version.
You can also try DPM adaptive if you are not in a hurry, or UniPC if you are.
With these samplers mentioned above you will get good results in 20-30 steps although it doesn't hurt to try a few extra steps.
Generation speed
If you are testing prompts you don't want to spend a long time waiting for results. In this case, where you are not looking for maximum quality but want to test changes quickly, I recommend using DPM++ 2M or UniPC with a small number of steps.
With just 10-15 steps you will get a very decent image.
If you don't care about reproducibility you also have Euler A, a fast, good-quality ancestral sampler. My favorite sampler!
Creativity and flexibility
This section is reserved exclusively for ancestral and stochastic samplers. They don't offer bad quality nor are they slow, they are just different.
The problem or advantage (depending on how you look at it) of these samplers is that if you have an image generated in 40 steps, generating it in 50 steps can make the image better or worse. You will have to test continuously. And this lottery makes these samplers more creative, since you can always change the number of steps to obtain small variations.
Of course, here Euler A and DPM++ SDE Karras stand out. Try generating images in 15 steps, 20 steps, 25 steps... and see how the result changes.
You can support me so that I can dedicate even more time to writing articles and have resources to create new projects. Thank you!