
Best GPU for AlphaFold & Protein Structure Prediction

A bioinformatician's guide to buying the right GPU for structural biology, single-cell analysis, and computational protein design.

The Short Answer

AlphaFold 2 / ColabFold: 16GB for most proteins. 24GB for long sequences and multimers. The model weights are small — it's the attention matrices that eat VRAM.

ESMFold: 16GB at FP16 for most proteins. 32GB+ for the full ESM-2 15B at FP16 (about 30GB of weights alone). Much faster than AlphaFold: seconds vs minutes per prediction.

scGPT (single-cell): 8GB for small experiments. 12-16GB for 50K cells. 24GB+ for atlas-scale datasets.

Best all-around: RTX 3090 (24GB, ~$900 used) or RTX 4090 (24GB, ~$2,100). Either handles every tool on this page without VRAM anxiety.

Why Most GPU Guides Are Useless for Scientists

Every GPU buying guide ranks cards by gaming FPS and AI chatbot performance. Neither is relevant if you're running AlphaFold on a protein library, analyzing 200K single-cell transcriptomes, or designing binders with RFdiffusion.

Scientific computing GPU requirements are fundamentally different from consumer AI:

  1. VRAM usage scales with input size, not model size. AlphaFold's weights are ~200MB, but predicting the structure of a 1,500-residue protein needs 16GB+ of VRAM for the attention matrices. A 300-residue protein needs 4GB. No consumer AI guide tells you this.
  2. CUDA is not optional. Nearly every tool in computational biology (AlphaFold, ESM, PyTorch Geometric, RAPIDS for single-cell) assumes NVIDIA CUDA. AMD ROCm support is patchy or untested for most scientific packages. Intel GPUs don't exist in this ecosystem.
  3. System RAM matters as much as VRAM. AlphaFold's MSA databases are huge. Single-cell AnnData objects live in CPU memory. Genomics pipelines before the GPU step are RAM-hungry. Budget 64-128GB system RAM for serious work.
  4. Sustained throughput matters more than peak speed. Scientific jobs run for hours. Thermal throttling on a cheap cooler kills throughput. Data center cards with blower coolers need proper airflow or aftermarket cooling.

GPU Requirements by Tool

AlphaFold 2 / ColabFold

Protein structure prediction from sequence + MSA

AlphaFold 2's VRAM usage is dominated by the Evoformer's attention mechanism, whose activations grow roughly quadratically with sequence length and increase further with MSA depth. The model weights themselves are tiny (~200MB). This means VRAM requirements depend entirely on what you're predicting:

Input | VRAM | Time
Short protein (<300 residues) | ~4-6GB | 2-5 min
Medium protein (300-800 residues) | ~8-12GB | 5-15 min
Long protein (800-1500 residues) | ~12-16GB | 15-45 min
Very long / multimer | ~16-24GB+ | 30-120 min
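
As a rough planning aid, the table above can be encoded as a lookup. This is only a sketch: the function name `alphafold_vram_estimate` is hypothetical, the tiers are this guide's approximate figures rather than measurements, and lengths that sit exactly on a boundary are assigned to the larger tier to stay conservative.

```python
# VRAM tiers copied from the table above. These are the guide's rough
# figures, not values reported by AlphaFold itself.
ALPHAFOLD_VRAM_TIERS = [
    (300, "4-6GB"),     # short protein
    (800, "8-12GB"),    # medium protein
    (1500, "12-16GB"),  # long protein
]


def alphafold_vram_estimate(n_residues: int) -> str:
    """Return the guide's approximate VRAM range for one prediction."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    for upper, vram in ALPHAFOLD_VRAM_TIERS:
        # Strict '<' pushes boundary lengths into the larger tier.
        if n_residues < upper:
            return vram
    return "16-24GB+"  # very long chains and multimers


for length in (250, 600, 1200, 2000):
    print(length, alphafold_vram_estimate(length))
```

For a multimer, pass the total residue count across all chains, since the chains are predicted together.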

ColabFold tip: ColabFold uses MMseqs2 instead of the full BFD/Uniclust databases. This eliminates the 2.5TB database download and reduces CPU RAM needs from 128GB to 16-32GB. Quality is nearly identical for most proteins. If you're setting up locally for the first time, start with ColabFold.

Batch processing: If you're predicting structures for hundreds of proteins (e.g., from a proteomics experiment), VRAM determines your maximum protein size but not throughput. The GPU processes one protein at a time. Throughput comes from having a fast CPU for MSA search and fast NVMe storage for the databases.
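
Since VRAM caps the largest protein you can predict, it can help to sort a library by length before starting a batch. The sketch below uses only the standard library. The helper names and the `max_residues` cutoff are illustrative assumptions; pick the cutoff from the VRAM table for your own card.

```python
from typing import Dict, Iterable, List, Tuple


def read_fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each FASTA record ID (first word of the header) to its length."""
    lengths: Dict[str, int] = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            current = header[0] if header else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def split_by_max_length(
    lengths: Dict[str, int], max_residues: int
) -> Tuple[List[str], List[str]]:
    """Return (fits, too_long) record IDs, each sorted shortest first."""
    ordered = sorted(lengths, key=lengths.get)
    fits = [name for name in ordered if lengths[name] <= max_residues]
    too_long = [name for name in ordered if lengths[name] > max_residues]
    return fits, too_long


# Tiny in-memory example; in practice pass an open file handle instead.
demo = [">a example", "MKT", "AY", ">b", "M" * 10]
fits, too_long = split_by_max_length(read_fasta_lengths(demo), max_residues=8)
print(fits, too_long)
```

Running the short proteins first means a batch produces most of its results before it reaches any sequence that might run out of memory.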

ESMFold & ESM-2

Protein language models + single-sequence structure prediction

ESM-2 is Meta's family of protein language models. The key difference from AlphaFold: ESMFold predicts structure from a single sequence — no MSA needed. This makes it dramatically faster (seconds vs minutes) at the cost of slightly lower accuracy.

Model | VRAM (FP16) | Use Case
ESM-2 3B | 6GB | Embeddings, annotation
ESM-2 15B | 30GB | Best embeddings
ESMFold | 16GB* | Structure prediction

*ESMFold at FP16 uses ~16GB for most proteins. The released ESMFold model is built on the ESM-2 3B language model rather than the 15B one, so it does not need the 30GB that ESM-2 15B requires.
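
The weight figures for the ESM-2 models follow from parameter count times bytes per parameter. Here is a minimal sketch of that arithmetic, assuming decimal gigabytes and counting weights only. Activations add more on top and grow with sequence length. The function name is hypothetical.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}


def weights_gb(n_params: float, dtype: str = "fp16") -> float:
    """Approximate size of the model weights in decimal GB (1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


print(weights_gb(3e9))    # ESM-2 3B at FP16
print(weights_gb(15e9))   # ESM-2 15B at FP16
```

This gives 6GB for the 3B model and 30GB for the 15B model, matching the table.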

When to use ESMFold vs AlphaFold: Use ESMFold when you need speed — screening hundreds of sequences, exploring mutant libraries, or getting quick structural hypotheses. Use AlphaFold when accuracy matters most — publication figures, drug target analysis, or complex multimer modeling. Many labs use ESMFold as a first pass and AlphaFold for the hits.

ESM-2 embeddings: Beyond structure prediction, ESM-2 embeddings are the new standard for protein representation. Use them for function prediction, variant effect scoring (think EVE/ESM-1v), binding site identification, and protein family classification. The 3B model runs on any 8GB+ GPU. The 15B model is better but needs 32GB+ for FP16.

scGPT (Single-Cell Analysis)

Foundation model for single-cell RNA-seq

scGPT brings the foundation model paradigm to single-cell biology. Instead of building analysis pipelines from scratch (Scanpy + scVI + CellTypist + ...), you fine-tune a single pre-trained model for your specific task: cell type annotation, perturbation prediction, gene network inference, or batch integration.

VRAM scaling: The model itself is only 50M parameters (~200MB). VRAM is dominated by the gene expression matrix during training and fine-tuning:

Dataset Size | VRAM | Use Case
5K cells | ~4GB | Small experiment
10K-30K cells | ~6-8GB | Typical study
50K-100K cells | ~12-16GB | Large experiment
100K-500K cells | ~16-24GB | Atlas-scale

Practical tip: If your dataset exceeds your VRAM, use gradient checkpointing (trades speed for memory) or subsample during training. For inference only (applying a fine-tuned model to new data), VRAM requirements are much lower — the model processes cells in batches.
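
Batched inference amounts to walking the cell matrix in fixed-size row ranges so that only one chunk is on the GPU at a time. The generator below is a library-agnostic sketch, not scGPT's own API. `batch_slices` is a hypothetical helper, and the batch size is something you would tune to your card.

```python
from typing import Iterator, Tuple


def batch_slices(n_cells: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) row ranges that cover n_cells in order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_cells, batch_size):
        yield start, min(start + batch_size, n_cells)


# Example: 10 cells in batches of 4 gives three ranges, the last one partial.
for start, stop in batch_slices(10, 4):
    print(start, stop)
```

With an AnnData object you could pass its cell count as `n_cells` and slice rows with each `(start, stop)` pair before moving that chunk to the GPU.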

RFdiffusion (Protein Design)

Generative protein design via diffusion

RFdiffusion from the Baker Lab generates novel protein structures through a diffusion process. Use cases include de novo binder design, motif scaffolding, and symmetric assembly generation. VRAM requirements depend on protein size:

  • Small designs (<200 residues): 8GB
  • Medium designs (200-500 residues): 10-12GB
  • Complex multi-chain assemblies: 16GB+

Unlike AlphaFold, RFdiffusion is generative — you typically run hundreds of design trajectories and filter the results. Throughput matters. A faster GPU (RTX 4090 vs 3090) generates more designs per hour, even though both have enough VRAM.

GPU Compatibility Matrix for Scientific Computing

Which GPUs can run which tools. "Full" means large workloads fit. "Small" means only small inputs.

Our Recommendations

Budget (<$400)

NVIDIA Tesla P40

$300

24GB VRAM gets you started with AlphaFold on medium proteins, ESM-2 3B for embeddings, and small scGPT experiments. The cheapest path into GPU-accelerated structural biology. Look for used cards to stretch the budget further.

Best Value (Our Pick)

NVIDIA Tesla P40

$300

24GB VRAM handles everything: AlphaFold on any single protein, ESMFold at FP16, scGPT on 100K+ cells, and RFdiffusion for complex designs. This is the GPU most computational biologists should buy. The 346 GB/s bandwidth means predictions finish fast, not just fit.

Performance (No Compromises)

NVIDIA GeForce RTX 4090

$1400

Same 24GB VRAM but with 1008 GB/s bandwidth and newer architecture. If you're processing hundreds of proteins per day or generating thousands of RFdiffusion designs, the throughput difference justifies the premium over the mid-range pick.

Research Lab (Multi-GPU / Training)

NVIDIA A100 80GB

$8000

80GB of HBM fits ESM-2 15B at FP16, Llama 70B at Q8, and massive single-cell datasets. For labs doing serious training or processing very large protein libraries. NVLink support for multi-GPU scaling. No display output, so you need a separate card for your monitor.


Beyond the GPU: Building a Complete Bioinformatics Workstation

The GPU gets the headlines, but scientific computing workstations have specific needs for every component:

CPU: Many Cores for Preprocessing

AlphaFold's MSA search (jackhmmer/hhblits) is CPU-bound and benefits from many threads. Single-cell preprocessing (doublet detection, normalization, HVG selection) runs on CPU before the GPU step. Aim for 12-16 cores. AMD Ryzen 9 9950X (16C/32T) or Intel i9-14900K are solid choices.

RAM: 64GB Minimum, 128GB for Genomics

AlphaFold's database files need significant RAM for fast access. Single-cell AnnData objects with 100K+ cells eat 32-64GB easily. FASTQ processing and genome alignment are RAM-hungry. 64GB DDR5 is the floor. 128GB if you work with large genomics datasets.

Storage: Fast NVMe + Bulk Storage

AlphaFold's reduced databases are ~500GB (full BFD: 2.5TB). Genomics data (BAM/FASTQ) accumulates fast. Use a 1-2TB NVMe for active work and databases. Add a large HDD or second NVMe for raw data storage. Read speed matters for database lookups.

PSU: Don't Skimp

Scientific workloads run both CPU and GPU at sustained high load. Budget PSU wattage for worst case: GPU TDP + CPU TDP + 200W headroom. An RTX 4090 (450W) + Ryzen 9 9950X (170W) comes to 450 + 170 + 200 = 820W, so an 850W PSU is the minimum. Transient power spikes can trip undersized PSUs.
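
The sizing rule above is simple enough to script when comparing builds. This is a sketch of that rule only. Rounding up to the next 50W step is an added assumption about how PSUs are commonly sold, and the function name is hypothetical.

```python
def recommended_psu_watts(
    gpu_tdp: int, cpu_tdp: int, headroom: int = 200, step: int = 50
) -> int:
    """GPU TDP + CPU TDP + headroom, rounded up to a multiple of `step`."""
    total = gpu_tdp + cpu_tdp + headroom
    # Ceiling division without floats: -(-a // b) == ceil(a / b) for ints.
    return -(-total // step) * step


# RTX 4090 (450W) with a 170W CPU, the example from the text.
print(recommended_psu_watts(450, 170))
```

For the 450W + 170W example this returns 850.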

Software Stack Setup

The scientific computing GPU stack in 2026:

# Base
Ubuntu 22.04 LTS (or 24.04)
NVIDIA Driver 550+ (check nvidia-smi)
CUDA 12.x (via nvidia-cuda-toolkit)

# Package management
Conda / Mamba (miniforge recommended)

# Scientific stack
PyTorch 2.x (with CUDA)
JAX + jaxlib (for AlphaFold)
fair-esm (ESM-2 / ESMFold)
colabfold (easier than full AlphaFold)
scanpy + anndata (single-cell)

Key tip: Use separate Conda environments for each tool. AlphaFold (JAX), ESM (PyTorch), and scGPT (PyTorch) can have conflicting CUDA dependencies. Separate environments prevent version hell.
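
Before installing any of the tools, it is worth confirming that the driver sees the card and reports the VRAM you expect. The sketch below wraps `nvidia-smi` and its CSV query output. The helper names are hypothetical, and it assumes `nvidia-smi` is on your PATH.

```python
import subprocess
from typing import List, Tuple

# Ask nvidia-smi for one CSV row per GPU: name and total memory in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text: str) -> List[Tuple[str, int]]:
    """Parse 'name, memory_mib' rows into (name, MiB) tuples."""
    gpus = []
    for line in text.strip().splitlines():
        # Split on the last comma so any comma in the name is preserved.
        name, mem_mib = line.rsplit(",", 1)
        gpus.append((name.strip(), int(mem_mib.strip())))
    return gpus


def list_gpus() -> List[Tuple[str, int]]:
    """Run nvidia-smi and return the GPUs it reports."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)
```

Calling `list_gpus()` on a working install returns one tuple per card. If it raises instead, the driver is usually missing or not loaded.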