Getting Started with Stable Diffusion (2026)

Step-by-step tutorial on installing Stable Diffusion WebUI on Windows, writing prompts that work, and using negative prompts, img2img, inpainting, ControlNet, and LoRA models, all for free.

What Is Stable Diffusion?

Stable Diffusion is an open-source AI image generation model that runs locally on your computer. Unlike Midjourney or DALL-E (which run on cloud servers and charge monthly fees), Stable Diffusion is free to use once you have the hardware to run it. You own every image you create. Because everything runs on your own machine, there are no filters, no censorship, and no limit on how many images you generate.

As of 2026, Stable Diffusion 3.5 and SDXL produce images that rival or exceed commercial alternatives. Combined with tools like ControlNet and LoRA models, the possibilities are essentially unlimited.

Hardware Requirements

You do not need a supercomputer, but you do need a dedicated GPU:

Minimum specs: NVIDIA GPU with 4GB+ VRAM (GTX 1060 6GB or newer), 8GB system RAM, 10GB free storage (models are 2-7GB each), Windows 10/11 (Linux works too).

Recommended specs: NVIDIA GPU with 8GB+ VRAM (RTX 3060 12GB, RTX 4060, RTX 4070), 16GB system RAM, 25GB+ SSD storage (for multiple models + outputs).

No NVIDIA GPU? You can use CPU-only mode (very slow), Google Colab (free but time-limited), or cloud GPU rentals starting at $0.20/hour.
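
If you are not sure how much VRAM your card has, a short script can check it against the tiers above. This is a minimal sketch, assuming the NVIDIA driver is installed and `nvidia-smi` is on your PATH; the function names and tier labels are made up for this example.

```python
import subprocess

MIN_VRAM_GB = 4          # minimum from the specs above
RECOMMENDED_VRAM_GB = 8  # recommended from the specs above


def parse_gpu_lines(csv_text):
    """Parse 'name, memory_in_MiB' lines into (name, vram_gb) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, _, mem_mib = line.rpartition(",")
        # Round to whole GB because drivers often report slightly under nominal size.
        gpus.append((name.strip(), round(int(mem_mib.strip()) / 1024)))
    return gpus


def classify_vram(vram_gb):
    """Map a VRAM size to the tiers used in this guide."""
    if vram_gb >= RECOMMENDED_VRAM_GB:
        return "recommended"
    if vram_gb >= MIN_VRAM_GB:
        return "minimum"
    return "below minimum"


def query_gpus():
    """Ask nvidia-smi for GPU names and total memory in MiB."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_lines(out)


if __name__ == "__main__":
    try:
        for name, vram_gb in query_gpus():
            print(f"{name}: {vram_gb} GB -> {classify_vram(vram_gb)}")
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("nvidia-smi not available; see the no-GPU options above")
```

If the script reports "below minimum" or cannot find `nvidia-smi`, the CPU-only, Colab, or cloud rental routes are the ones to look at.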

Installing AUTOMATIC1111 WebUI

AUTOMATIC1111's Stable Diffusion WebUI is the most popular interface. Here is the fastest way to install it:

  • Install Python 3.10.6 from python.org (check "Add Python to PATH" in the installer)
  • Install Git from git-scm.com
  • Create a folder called sd-webui somewhere convenient
  • Open Command Prompt in that folder and run these commands, one per line:
      git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
      cd stable-diffusion-webui
      webui-user.bat
  • Wait while the first run downloads about 5-8GB of dependencies (one-time only)
  • A browser window opens automatically when ready

That is it. The WebUI runs locally at http://127.0.0.1:7860.
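
If the launch script fails, the usual culprits are the wrong Python version or Git missing from PATH. The following is a small sketch for checking both before you start; the helper names are invented for this example, and it only confirms the prerequisites listed above, not the WebUI itself.

```python
import shutil
import sys


def python_ok(version_info):
    """True for Python 3.10.x, the series this guide installs."""
    return tuple(version_info[:2]) == (3, 10)


def find_tool(name):
    """Return the full path of an executable on PATH, or None if missing."""
    return shutil.which(name)


if __name__ == "__main__":
    print("Python 3.10.x:", "yes" if python_ok(sys.version_info) else "no")
    print("git on PATH:", find_tool("git") or "not found")
```

Run it with the same `python` command that Command Prompt finds, so you are testing the interpreter the WebUI will actually use.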

    Your First Image: Prompt Basics

    Stable Diffusion uses natural language prompts. Here is the formula for great results:

    Structure: (subject), (quality tags), (style), (composition), (lighting)

    Example prompt showing good structure:
    a majestic owl perched on ancient oak branch, intricate feather details, sharp focus, fantasy art style, dappled sunlight through leaves, golden hour lighting, highly detailed, 8k resolution

    Key principles:

    • Put the most important elements first (order matters!)

    • Be descriptive, not vague ("golden retriever" is better than "dog")

    • Add artist names for style direction

    • Include quality boosters: highly detailed, sharp focus, 8k
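
If you reuse the same structure often, it can help to assemble prompts from parts so the ordering stays consistent. Here is a minimal sketch of that idea; `build_prompt` is a hypothetical helper, not part of the WebUI, and the example values come from the owl prompt above.

```python
def build_prompt(subject, quality=(), style=(), composition=(), lighting=()):
    """Join prompt parts in the order subject, quality, style, composition, lighting."""
    parts = [subject, *quality, *style, *composition, *lighting]
    # Skip empty entries so optional sections can be left out.
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    subject="a majestic owl perched on ancient oak branch",
    quality=["intricate feather details", "sharp focus"],
    style=["fantasy art style"],
    lighting=["dappled sunlight through leaves", "golden hour lighting"],
)
print(prompt)
```

Because the subject always comes first, the most important element stays at the front of the prompt, in line with the first principle above.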

    Negative Prompts: What to Exclude

    Negative prompts tell the AI what you do NOT want. A solid default negative prompt includes:

    low quality, blurry, distorted anatomy, extra limbs, deformed hands, watermark, signature, text, ugly, duplicate, mutation, bad proportions, cropped, out of frame, oversaturated, underexposed

    When to customize negatives:

    • Portraits: add cross-eyed, asymmetric face, double chin

    • Architecture: add crooked lines, perspective error

    • Anime: add 3d render, realistic photo (to keep anime style clean)
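
The default-plus-extras pattern above can be kept in a small lookup so you do not retype it for every image. This is only a sketch; the category keys and the `negative_prompt` helper are names chosen for this example, and the terms are copied from the lists above.

```python
DEFAULT_NEGATIVE = [
    "low quality", "blurry", "distorted anatomy", "extra limbs",
    "deformed hands", "watermark", "signature", "text", "ugly",
    "duplicate", "mutation", "bad proportions", "cropped",
    "out of frame", "oversaturated", "underexposed",
]

EXTRA_NEGATIVE = {
    "portrait": ["cross-eyed", "asymmetric face", "double chin"],
    "architecture": ["crooked lines", "perspective error"],
    "anime": ["3d render", "realistic photo"],
}


def negative_prompt(category=None):
    """Return the default negatives plus any extras for the given category."""
    terms = DEFAULT_NEGATIVE + EXTRA_NEGATIVE.get(category, [])
    # dict.fromkeys drops duplicates while keeping the original order.
    return ", ".join(dict.fromkeys(terms))


print(negative_prompt("portrait"))
```

Paste the printed string into the negative prompt box, or extend `EXTRA_NEGATIVE` with your own subject types.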

    Img2img: Transforming Existing Images

    img2img lets you transform an existing image while preserving its composition:
  • Go to the img2img tab
  • Upload any image
  • Set denoising strength (the key parameter):
      - 0.25-0.35: subtle changes, preserves the original closely
      - 0.45-0.55: balanced transformation
      - 0.65-0.75: dramatic changes, loose composition
      - 0.85+: essentially ignores the input, generates freely
  • Write your prompt describing the target result
  • Generate

    Use cases: colorize sketches, change art styles, upscale low-res photos, convert photos to paintings, fix composition issues.
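
If you would rather script img2img than click through the tab, the WebUI can expose an HTTP API when it is started with the `--api` flag. The sketch below builds a request for the `/sdapi/v1/img2img` endpoint. The field names reflect the AUTOMATIC1111 API as commonly documented, so verify them against the `/docs` page of your own running instance; the preset names and helper functions are assumptions made for this example.

```python
import base64
import json
import urllib.request

# Hypothetical preset names; values are midpoints of the ranges listed above.
DENOISE_PRESETS = {
    "subtle": 0.30,
    "balanced": 0.50,
    "dramatic": 0.70,
    "free": 0.85,
}


def img2img_payload(image_bytes, prompt, preset="balanced", negative_prompt=""):
    """Build the JSON body for an img2img request."""
    return {
        # The API expects input images as base64-encoded strings.
        "init_images": [base64.b64encode(image_bytes).decode("ascii")],
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "denoising_strength": DENOISE_PRESETS[preset],
    }


def run_img2img(payload, base_url="http://127.0.0.1:7860"):
    """POST the payload and return the generated images as raw bytes."""
    req = urllib.request.Request(
        f"{base_url}/sdapi/v1/img2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return [base64.b64decode(img) for img in body["images"]]
```

A typical call reads a file with `open("sketch.png", "rb").read()`, passes the bytes to `img2img_payload`, and writes each item returned by `run_img2img` to a `.png` file.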

    Inpainting: Selective Editing

    Inpainting lets you modify only parts of an image:

  • Generate or upload an image
  • Click "Send to inpaint" below the image
  • Use the brush tool to paint over the area you want to change
  • Write a prompt describing what should go in that area
  • Generate; only the masked region changes

    Pro applications: change clothing or accessories on people, replace backgrounds, fix distorted faces or hands, add or remove objects, extend images beyond their borders (outpainting).
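
To see why only the painted area changes, it helps to think of the mask as a per-pixel switch between the original and the newly generated image. The toy example below illustrates that idea with plain lists of grayscale values. It is a conceptual sketch only: the real WebUI works on full images, softens the mask edge, and does much of the work in latent space.

```python
def composite(original, generated, mask):
    """Take pixels from `generated` where mask is 1, otherwise from `original`."""
    return [
        [g if m else o for o, g, m in zip(orow, grow, mrow)]
        for orow, grow, mrow in zip(original, generated, mask)
    ]


original = [[10, 10, 10], [10, 10, 10]]   # the image you started with
generated = [[99, 99, 99], [99, 99, 99]]  # what the model produced
mask = [[0, 1, 0], [0, 1, 1]]             # 1 = area you painted over

result = composite(original, generated, mask)
print(result)  # [[10, 99, 10], [10, 99, 99]]
```

Everything outside the mask is carried over untouched, which is why a tight mask gives you precise edits.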

    ControlNet: Precision Control

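    ControlNet gives you precise control over pose, composition, edges, depth, and more. In AUTOMATIC1111 it is provided by a separate extension (sd-webui-controlnet) that you install from the Extensions tab, and each control type uses its own model file.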

    Popular ControlNet models:

    • Canny / Lineart - Preserve edges and outlines

    • Depth - Maintain 3D spatial relationships

    • Pose - Lock human body positions from a reference

    • Seg - Control semantic regions (sky, person, building)

    • Tile - Upscale and add detail to small images


    Workflow: upload a reference image, select the ControlNet type, adjust its strength (0.5-1.0), and generate.
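
Edge-based control types such as Canny do not feed your reference image to the model directly. A preprocessor first reduces it to an edge map, and the model is guided by that map. The sketch below shows the idea with a deliberately simple gradient threshold on a tiny grayscale grid. It is a stand-in for illustration, not the real Canny algorithm, and `edge_map` is a made-up helper.

```python
def edge_map(gray, threshold=50):
    """Mark pixels whose jump from the left or upper neighbor exceeds threshold.

    A simple gradient threshold, NOT the real Canny algorithm.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx = abs(gray[y][x] - gray[y][x - 1]) if x > 0 else 0
            dy = abs(gray[y][x] - gray[y - 1][x]) if y > 0 else 0
            if max(dx, dy) > threshold:
                edges[y][x] = 255  # white pixel = edge
    return edges


# Dark left half, bright right half: one vertical edge at column 2.
image = [
    [0, 0, 200, 200],
    [0, 0, 200, 200],
    [0, 0, 200, 200],
]
edges = edge_map(image)
for row in edges:
    print(row)
```

The output keeps the outline and discards color and texture, which is why an edge-guided generation can follow the shape of the reference while the prompt decides everything else.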

    LoRA Models: Fine-Tuned Styles

    LoRA (Low-Rank Adaptation) models are small files (10-200MB) that add specific styles, characters, or concepts to a base model.

    Where to find LoRAs: Civitai.com (largest library, 100K+ models), Hugging Face (open-source community).

    Popular categories: Character LoRAs for specific people or anime characters, Style LoRAs for photography styles or artistic mediums, Concept LoRAs for clothing items or aesthetic themes.

    Usage: download the .safetensors file, place it in the models/Lora/ folder, refresh the LoRA list in the WebUI, click the LoRA to add a <lora:name:weight> tag to your prompt, and set the weight (0.5-1.0 typical).
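
In the AUTOMATIC1111 WebUI, a LoRA is applied through a tag in the prompt of the form `<lora:name:weight>`, where the name is the file name without its extension. If you script your prompts, a helper like the one below can build those tags. The file name `watercolor_style` is a hypothetical example, and the helper functions are not part of the WebUI.

```python
def lora_tag(filename, weight=0.8):
    """Return the prompt tag for a LoRA file, e.g. <lora:name:0.8>."""
    name = filename.removesuffix(".safetensors")  # requires Python 3.9+
    return f"<lora:{name}:{weight:g}>"


def with_loras(prompt, loras):
    """Append one tag per (filename, weight) pair to the prompt."""
    tags = " ".join(lora_tag(name, weight) for name, weight in loras)
    return f"{prompt} {tags}".strip()


print(with_loras("a lighthouse at dusk", [("watercolor_style.safetensors", 0.7)]))
```

Lower the weight if the LoRA overpowers the rest of the prompt, and raise it if the style barely shows.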

    Conclusion

    Stable Diffusion has a steeper learning curve than Midjourney, but the payoff is complete creative freedom with no subscription fees. Start with the basics (prompt writing, txt2img), then progressively explore img2img, inpainting, ControlNet, and LoRAs. With a week of regular practice, you can expect to produce images that match or exceed commercial AI art tools.
