
NVIDIA GeForce RTX 4080
~$850 used · $1,199 MSRP
The original RTX 4080 launched at $1,199 but has dropped significantly on the used market since the SUPER variant replaced it. With 16GB GDDR6X and strong 4K gaming performance, it remains a capable card. For AI, the 16GB VRAM handles 14B models at Q4. A solid used-market pick if found under $850.
Best For: Used-market 4K gaming with 16GB VRAM
Verdict: Superseded by the SUPER — only buy used at a significant discount.
AI: 7/10
Gaming: 9/10
Specifications
VRAM: 16GB GDDR6X
Memory Bandwidth: 717 GB/s
CUDA Cores: 9,728
Boost Clock: 2505 MHz
TDP: 320W
Power Connector: 1x 16-pin
Length: 304mm
Form Factor: Triple Slot
Release Year: 2022
AI Capabilities
Capable: 16GB VRAM
Runs most popular models with quantization. The minimum for serious AI work.
Can run (Q4 quantized)
Llama 3.1 8B, Qwen 2.5 14B, Mistral 7B, FLUX.1 Dev, Stable Diffusion XL, Stable Diffusion 3.5 Large, HunyuanVideo, CogVideoX-5B, Mochi 1, LTX Video, Stable Video Diffusion, Wan Video 14B, Codestral 22B, AlphaFold 2, ESMFold (ESM-2 15B), ESM-2 3B, scGPT, RFdiffusion, Fine-tune Llama 8B, Train SDXL LoRA, Train FLUX LoRA
Tight fit (may need CPU offload)
Qwen 2.5 32B (20GB Q4), Qwen 2.5 Coder 32B (20GB Q4), LLaVA 1.6 34B (20GB Q4)
Recommended system RAM for AI: 32GB+ (2x GPU VRAM for model overflow)
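The "can run" vs. "tight fit" buckets above can be sketched as a simple VRAM headroom check. This is a hypothetical heuristic, not the site's actual rule; the 10% headroom for KV cache/activations and the 1.5x offload ceiling are assumptions. Model sizes are quantized-file sizes in GB (e.g. the page lists Qwen 2.5 32B at 20GB in Q4).

```python
# Hypothetical fit heuristic for a 16GB card; thresholds are assumptions.
VRAM_GB = 16

def fit_category(model_gb: float, vram_gb: float = VRAM_GB) -> str:
    # Leave ~10% of VRAM free for KV cache and activations (assumed).
    if model_gb <= vram_gb * 0.9:
        return "fits"
    # Moderately oversized models can still run with partial CPU offload.
    if model_gb <= vram_gb * 1.5:
        return "tight (CPU offload)"
    return "won't run well"

print(fit_category(8.0))   # a 14B model at Q4 (~8GB file)
print(fit_category(20.0))  # Qwen 2.5 32B at Q4, per the page
```

With these thresholds, a ~8GB Q4 file lands in "fits" and the 20GB Q4 quants land in "tight (CPU offload)", matching the page's buckets.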
Performance Estimates
Estimated tokens/sec for LLM inference, derived from the card's 717 GB/s memory bandwidth — these are not hardware benchmarks.
Llama 3.1 8B (FP16): ~25-30 tok/s (Usable)
Qwen 2.5 32B (Offload): ~1-3 tok/s (Very slow)
Qwen 2.5 14B (Q8): ~30-37 tok/s (Fast)
Mistral 7B (FP16): ~28-35 tok/s (Usable)
Codestral 22B (Q4): ~34-42 tok/s (Fast)
Qwen 2.5 Coder 32B (Offload): ~1-3 tok/s (Very slow)
Pros
+ Strong 4K gaming
+ 16GB VRAM
+ Available cheaper used since the SUPER launch
Cons
- Overpriced at MSRP
- Superseded by the 4080 SUPER
- 16-pin power connector
Will It Run?
Llama 3.1 8B (8B): FP16
Qwen 2.5 32B (32B): Offload
Qwen 2.5 14B (14B): Q8
Mistral 7B (7B): FP16
FLUX.1 Dev (12B): Q8
Stable Diffusion XL (6.6B): FP16
Stable Diffusion 3.5 Large (8B): Q8
HunyuanVideo (13B): Q4