Stable Diffusion 3 Interface

Welcome to this in-depth guide to the Stable Diffusion web interface, where we’ll walk through the parameters and features that let you generate images from text prompts. By the end of this article, you will be well equipped to use Stable Diffusion through the interface described here.

Settings and Saving Images

Before we move on to generating images, let’s look at the settings for saving them. By default, the options “Always save generated images” and “Generated Grids” are turned on. Other saving options include ‘auto save every X minutes’, the ‘thumbnail cache’, ‘auto download every Y images’ and ‘auto save every Z images’. Because these options tend to clutter your disk space, we suggest disabling them and using the ‘save’ button to keep only the images you like.

Model Selector and Prompt Field

In the upper right, there is the model selector, where you can choose from several pre-trained models. For now, we will keep the default model. Below the model selector is the prompt field, where you enter the text description of the image you want to generate.

Generating the First Image

So, without further delay, let’s get our feet wet by creating an image from a bare-bones prompt. In the prompt field, enter “a woman” and press “Generate”. After a few seconds you will see a simple black-and-white pencil drawing of a woman. It is not exactly an impressive picture, but we can adjust the prompt to get a more realistic image.

Seed Inference and Step Guidance

Stable Diffusion starts from random noise, determined by a seed value, and guides it step by step toward an image that matches the prompt. We are free to specify the number of steps, and the model works its way toward the image we want over those iterations. The underlying seed inference is not displayed directly, but its working is emulated in the interface, so essentially every step of the process can be observed.
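The idea of a seeded starting point that is refined over a chosen number of steps can be sketched with a toy example. This is not the real diffusion algorithm: `make_noise`, `toy_refine`, the `target` values and the 0.2 step fraction are all hypothetical stand-ins, chosen only to show that the same seed reproduces the same starting noise and that more steps bring the result closer to the goal.

```python
import random


def make_noise(seed, size=4):
    """Return a reproducible list of pseudo-random 'pixels' for a given seed."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(size)]


def toy_refine(seed, target, steps):
    """Toy stand-in for denoising (NOT the real algorithm).

    Starts from seeded noise and removes a fixed fraction of the
    remaining difference to `target` on every step.
    """
    image = make_noise(seed, len(target))
    for _ in range(steps):
        image = [px + 0.2 * (t - px) for px, t in zip(image, target)]
    return image


def error(image, target):
    """Total absolute difference between the current image and the target."""
    return sum(abs(px - t) for px, t in zip(image, target))


target = [0.0, 0.5, 1.0, 0.5]  # hypothetical "desired image"
few = toy_refine(seed=42, target=target, steps=5)
many = toy_refine(seed=42, target=target, steps=30)

print(make_noise(42) == make_noise(42))  # same seed, same starting noise
print(error(few, target) > error(many, target))  # more steps, smaller error
```

In the real interface the refinement is done by the model rather than by a fixed formula, but the roles of the seed and the step count are similar: the seed fixes where the process starts, and the step count fixes how long it runs.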

Negative Prompts and Specific Styles

To generate a color photo of a woman, we need to add a negative prompt that rules out undesired styles such as grayscale, drawing or painting. We can also refine the prompt to get a person of a certain age, with or without a certain hair color, or with any other characteristic.
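As a small illustration, the prompt and the negative prompt can be thought of as two comma-separated lists of terms. The helper below is a hypothetical sketch, not part of Stable Diffusion or its interface, and the age and hair color are made-up example details; it only assembles the two strings you would paste into the prompt and negative prompt fields.

```python
def build_prompts(subject, details=(), unwanted=()):
    """Join the subject and extra details into a prompt, and the
    unwanted styles into a negative prompt (both comma-separated)."""
    prompt = ", ".join([subject, *details])
    negative_prompt = ", ".join(unwanted)
    return prompt, negative_prompt


prompt, negative_prompt = build_prompts(
    "photo of a woman",
    details=["30 years old", "brown hair"],        # example characteristics
    unwanted=["grayscale", "drawing", "painting"],  # styles to rule out
)

print(prompt)           # photo of a woman, 30 years old, brown hair
print(negative_prompt)  # grayscale, drawing, painting
```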

Batch Count and Batch Size

Among the more advanced features are batch count and batch size, which let us create multiple images in one go instead of one at a time. Batch count sets how many batches are run one after another, while batch size sets how many images are generated together in each batch. Generating many images at once, particularly with a large batch size, burdens the VRAM, so use it with care.
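The arithmetic behind the two settings can be sketched as follows, assuming batch count means the number of batches run one after another and batch size means the number of images produced together in each batch. The function names are hypothetical and only illustrate the trade-off: the same total can be reached with a small batch size (lighter on VRAM, more sequential runs) or a large one.

```python
def total_images(batch_count, batch_size):
    """Total number of images produced by one press of Generate."""
    return batch_count * batch_size


def plans_for(total):
    """List every (batch_count, batch_size) pair that yields exactly
    `total` images, ordered from the smallest batch size upward."""
    return [(total // size, size) for size in range(1, total + 1) if total % size == 0]


print(total_images(batch_count=4, batch_size=2))  # 8
print(plans_for(8))  # [(8, 1), (4, 2), (2, 4), (1, 8)]
```

If VRAM is limited, the plans near the start of the list (small batch size, higher batch count) are the safer choice under this assumption.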

Variations and CFG Scale

Once we have an image we find satisfactory, we can generate variations of it by adjusting the CFG scale, which indicates how closely the image should adhere to the given prompt. Low values give the model more creative freedom, while high values make the image follow the prompt more strictly.
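What the CFG scale does can be sketched with the usual classifier-free guidance formula: the model makes one prediction without the prompt and one with it, and the scale controls how far the result is pushed from the first toward (and beyond) the second. The vectors below are made-up numbers standing in for model predictions, and `apply_cfg` is a hypothetical name for illustration.

```python
def apply_cfg(uncond, cond, cfg_scale):
    """Classifier-free guidance: start from the prompt-free prediction and
    move toward the prompted one, scaled by cfg_scale."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 0.0, 0.0]  # made-up prediction without the prompt
cond = [1.0, -2.0, 0.5]   # made-up prediction with the prompt

print(apply_cfg(uncond, cond, 1.0))  # [1.0, -2.0, 0.5]
print(apply_cfg(uncond, cond, 7.0))  # [7.0, -14.0, 3.5]
```

With a scale of 1 the result equals the prompted prediction; larger values exaggerate the difference the prompt makes, which is why high settings follow the prompt more strictly.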

Sampling Method and Steps

Another way to create variations is to change the sampling method or the number of sampling steps. There are numerous samplers available, each with its own advantages and disadvantages. We will discuss these in a separate article devoted to the topic.

Restore Faces and Other Options

A particularly useful feature is the “Restore Faces” option, which is essential for photorealistic images because it improves the quality of faces. We can also try other, more advanced options, such as tiling and the high-resolution fix, which will be discussed in following articles.

Width and Height

The image size defaults to 512×512, which is the size the model was trained on. We can try other sizes, for instance 512×768 to get a 2:3 aspect ratio. Keep in mind that increasing the size may produce inconsistent results.
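The relationship between width, height and aspect ratio is simple arithmetic, sketched below. The helper names are hypothetical, and rounding the height to a multiple of 8 is an assumption about what the interface accepts, so adjust `multiple` if your sliders use a different step.

```python
from math import gcd


def aspect_ratio(width, height):
    """Reduce a pixel size to its simplest width:height ratio."""
    d = gcd(width, height)
    return width // d, height // d


def height_for(width, ratio_w, ratio_h, multiple=8):
    """Height that gives the requested ratio at this width, rounded to the
    nearest multiple (assumed to be 8 here)."""
    exact = width * ratio_h / ratio_w
    return int(round(exact / multiple)) * multiple


print(aspect_ratio(512, 512))    # (1, 1)
print(aspect_ratio(768, 768))    # (1, 1)
print(height_for(512, 2, 3))     # 768
print(aspect_ratio(512, 768))    # (2, 3)
```

Note that a square size such as 768×768 is still a 1:1 image; a portrait 2:3 image at the default width of 512 is 512×768.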

 
