Can NVIDIA GeForce RTX 5070 run FLUX.1 Dev?

12B parameter Image Gen model on 12GB GDDR7

Yes — runs at 4-bit quantization
~6.1-8.4 img/min
Speed: Moderate, usable for interactive image generation
Quality: Good, with slight degradation on complex compositions
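To make the throughput figure easier to reason about, the sketch below converts the quoted 6.1-8.4 img/min range into seconds per image and into wall-clock time for a batch. The batch size of 20 is only an illustrative assumption, and the helper names are not from any library.

```python
def seconds_per_image(images_per_minute):
    """Convert a throughput in images/minute to seconds per image."""
    return 60 / images_per_minute

def batch_minutes(n_images, images_per_minute):
    """Minutes needed to generate n_images at a steady throughput."""
    return n_images / images_per_minute

# Range quoted on this page for the RTX 5070 at 4-bit
low, high = 6.1, 8.4

print(f"{seconds_per_image(high):.1f}-{seconds_per_image(low):.1f} s per image")
# Hypothetical batch of 20 images
print(f"20 images: {batch_minutes(20, high):.1f}-{batch_minutes(20, low):.1f} min")
```

In other words, the quoted range works out to roughly 7 to 10 seconds per image.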

VRAM Requirements

FLUX.1 Dev is a 12B parameter model. At full precision (FP16), it requires 32GB of VRAM. Your NVIDIA GeForce RTX 5070 has 12GB, so you'll need to quantize it to 4-bit (Q4) to fit.

FP16 (Full Precision): 32GB (need 20GB more)
Maximum quality, no quantization

Q8 (8-bit): 16GB (need 4GB more)
Near-lossless, ~50% size reduction

Q4 (4-bit): 10GB (2GB free)
Good quality, ~75% size reduction
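The "need X more" and "free" figures above are simple differences against the card's 12GB. A minimal sketch of that arithmetic follows; the requirement values are copied from this page, while `weights_only_gb` is an added, assumed rule of thumb (parameters x bits / 8) that counts raw weight storage only.

```python
GPU_VRAM_GB = 12  # RTX 5070, as listed on this page

# VRAM requirement per precision, as listed on this page
REQUIRED_GB = {"FP16": 32, "Q8": 16, "Q4": 10}

def headroom_gb(required_gb, available_gb=GPU_VRAM_GB):
    """Positive result = VRAM left over; negative = shortfall."""
    return available_gb - required_gb

def weights_only_gb(params_billion, bits_per_weight):
    """Raw weight storage in GB (assumption: 1 GB = 1e9 bytes, weights only)."""
    return params_billion * bits_per_weight / 8

for name, need in REQUIRED_GB.items():
    spare = headroom_gb(need)
    status = f"{spare}GB free" if spare >= 0 else f"need {-spare}GB more"
    print(f"{name}: {need}GB ({status})")

# Weights alone for a 12B model, for comparison with the table
print("FP16 weights only:", weights_only_gb(12, 16), "GB")
print("4-bit weights only:", weights_only_gb(12, 4), "GB")
```

The weights-only numbers come out lower than the listed requirements. The difference presumably covers the other pipeline components and working memory, though the page does not break this down.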

Your GPU VRAM: 12GB GDDR7 at 672 GB/s bandwidth
Recommended system RAM: 32GB DDR5 (at least 2x GPU VRAM, so that model data that does not fit in VRAM can overflow to system memory)

What This Means in Practice

At 4-bit precision, FLUX.1 Dev fits in VRAM, but generation will be slower, and with only about 2GB of headroom you may need to limit resolution or batch size. Image quality is generally preserved at Q4, though very complex compositions may show minor artifacts.

How to Set It Up

Step 1: Install ComfyUI

git clone https://github.com/comfyanonymous/ComfyUI.git && cd ComfyUI && pip install -r requirements.txt

ComfyUI is the recommended UI for Stable Diffusion and FLUX models.

Step 2: Download the model

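Download the FLUX.1 Dev weights from Hugging Face and place them in the matching subfolder under ComfyUI/models/. The model is approximately 32GB at full precision, so on a 12GB card download a quantized version rather than the full-precision files.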
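The download step can also be scripted. The sketch below is one possible approach using `hf_hub_download` from the `huggingface_hub` package; it makes several assumptions that should be checked against your setup. The subfolder names reflect the layout of recent ComfyUI versions (older ones use `unet` for diffusion weights). The repo ID and filename in the usage comment refer to the official full-precision release, which is gated and needs an accepted license plus a Hugging Face login. A quantized variant would need a different repo and filename, which this page does not name.

```python
from pathlib import Path

# Assumed ComfyUI layout; verify against your installation.
SUBFOLDERS = {"diffusion_model": "diffusion_models", "vae": "vae"}

def target_dir(comfy_root, kind):
    """Return the ComfyUI/models/<subfolder> path for a given file kind."""
    return Path(comfy_root) / "models" / SUBFOLDERS[kind]

def download(repo_id, filename, comfy_root, kind):
    """Fetch one file from the Hugging Face Hub into the ComfyUI models tree."""
    # Imported here so the path helper works without the package installed.
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub
    dest = target_dir(comfy_root, kind)
    dest.mkdir(parents=True, exist_ok=True)
    return hf_hub_download(repo_id=repo_id, filename=filename, local_dir=dest)

# Example call (not executed here; downloads a very large file):
# download("black-forest-labs/FLUX.1-dev", "flux1-dev.safetensors",
#          "ComfyUI", "diffusion_model")
```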

Step 3: Launch and generate

python main.py

Open http://localhost:8188 in your browser. On a 12GB card, load a 4-bit quantized version such as NF4; the FP8 version needs about 16GB.
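If the page does not load, it can help to check whether the server is answering at all. The following is a minimal sketch using only the Python standard library; `server_is_up` is a hypothetical helper name, and the default URL is the one given above.

```python
import urllib.request

def server_is_up(url="http://localhost:8188", timeout=2.0):
    """Return True if the URL answers with HTTP 200, False on any connection error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError, HTTPError and socket timeouts all derive from OSError
        return False

print("ComfyUI reachable:", server_is_up())
```

A `False` result while `python main.py` is still running usually means the server has not finished starting or is listening on a different port.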

NVIDIA GeForce RTX 5070 Specs

VRAM: 12GB GDDR7
Memory Bandwidth: 672 GB/s
TDP: 250W
CUDA Cores: 6,144
Street Price: ~$620
AI Rating: 6/10

About FLUX.1 Dev

State-of-the-art image generation model. Runs comfortably at FP8 with 16GB of VRAM.

Category: Image Gen · Parameters: 12B · CUDA required: Recommended