Glossary
Structural biology terms
Amino acid - The basic building block of proteins; 20 standard amino acids form peptide chains through covalent peptide bonds.
Polypeptide / Chain - A single continuous sequence of amino acids connected by peptide bonds. In PDB files, each chain is labelled with a unique identifier (e.g. chain A, chain B).
Protein - A functional biological molecule composed of one or more polypeptide chains, folded into a specific three-dimensional structure.
Residue - A single amino acid once it has been incorporated into a polypeptide chain.
Fold - The overall 3D arrangement of secondary structure elements (helices, sheets, loops) that defines a protein’s topology and shape.
Domain - A compact, independently folding unit within a protein, often associated with a specific function. A protein may have one or multiple domains.
Subunit - A single polypeptide chain within a larger multi-chain assembly. For example, each chain in a dimer or tetramer is a subunit.
Complex - An assembly of two or more protein subunits, or of proteins with other molecules (e.g. DNA, RNA, ligands), that interact to perform a function.
Oligomer - A complex of multiple subunits, which may be identical or different.
Homo-oligomer - A complex composed of identical subunits (e.g. a homodimer or homotetramer).
Hetero-oligomer - A complex composed of different subunits (e.g. a heterodimer).
Quaternary structure - The arrangement and interactions of multiple subunits within a protein complex.
Ligand - A small molecule, ion, or cofactor that binds to a protein and may regulate its function or stabilise its structure.
Cofactor / Prosthetic group - A non-protein chemical compound (e.g. metal ion, heme, or [Fe-S] cluster) essential for the protein’s biological activity.
Binding site - A region on a protein surface where a ligand, cofactor, or another protein binds specifically.
Conformation - The spatial arrangement of atoms in a protein at a given time; proteins may adopt multiple conformations during function.
Interface - The region where two protein subunits (or proteins) make contact within a complex.
Model - A computationally predicted or experimentally determined representation of a protein’s 3D structure.
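As a hedged illustration of the chain identifiers mentioned under Polypeptide / Chain, the sketch below counts residues per chain in PDB-format text. It relies on the fixed-column layout of ATOM records (chain identifier in column 22, residue number in columns 23-26, insertion code in column 27). The function name residues_per_chain is hypothetical, and HETATM records (ligands, waters, modified residues) are deliberately skipped, which is a simplifying assumption.

```python
def residues_per_chain(pdb_text):
    """Count distinct residues per chain from the ATOM records of PDB-format text.

    Assumes the fixed-column PDB layout; HETATM records are ignored.
    """
    residues = {}
    for line in pdb_text.splitlines():
        # Only ATOM records with enough columns to hold a residue number.
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        chain_id = line[21]                        # column 22
        residue_key = (line[22:26], line[26:27])   # columns 23-26 and 27
        residues.setdefault(chain_id, set()).add(residue_key)
    return {chain: len(keys) for chain, keys in residues.items()}
```

Calling residues_per_chain on the text of a PDB file returns a dictionary keyed by chain identifier, so a homodimer would typically show two chains with the same residue count.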
Databases
PDB (Protein Data Bank) - A public database of experimentally determined 3D structures of proteins and nucleic acids.
UniProt - A comprehensive database of protein sequence and functional information.
Computational prediction terms
Predicted model - A computationally generated structure (e.g. from AlphaFold or Rosetta), not determined experimentally.
Template - A known structure used as a reference or starting point for modelling a related protein.
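RMSD (Root Mean Square Deviation) - The square root of the mean squared distance between corresponding atoms (usually backbone or Cα atoms) of two superimposed protein structures, used to assess structural similarity. It is reported in ångströms; lower values indicate more similar structures.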
pLDDT (predicted Local Distance Difference Test) - A per-residue confidence score (0 to 100) provided by AlphaFold for a predicted protein structure; higher values indicate a more reliable local prediction.
pTM (predicted Template Modelling score) - A global confidence score (0 to 1) that estimates the TM-score between the predicted and the true structure. It reflects the reliability of the overall fold and, for complexes predicted with AlphaFold-Multimer, of the assembly as a whole.
ipTM (interface predicted Template Modelling score) - A confidence score (0 to 1) provided by AlphaFold-Multimer that specifically assesses the quality of the predicted interfaces between subunits in a protein complex, i.e. how reliably the subunits are positioned relative to one another.
PAE (Predicted Aligned Error) - A residue-by-residue matrix provided by AlphaFold that estimates, in ångströms, the expected positional error of one residue when the predicted and true structures are aligned on another residue. Low values between two domains or chains indicate confidence in their relative positions.
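The RMSD, pLDDT, and PAE entries above can be made concrete with a minimal sketch using only the Python standard library. All three function names are hypothetical. The rmsd function assumes the two structures are already superimposed and that atoms are listed in matching order. The pLDDT bands use the commonly quoted cut-offs of 90, 70, and 50, with the handling of exact boundary values being an assumption. The inter-chain PAE average is one simple way to summarise a PAE matrix, not a score that AlphaFold itself reports.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples.

    Assumes the structures are already superimposed and atoms correspond by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


def plddt_band(score):
    """Map a pLDDT value (0 to 100) to a qualitative confidence band."""
    if score >= 90:
        return "very high"
    if score >= 70:
        return "high"
    if score >= 50:
        return "low"
    return "very low"


def mean_interchain_pae(pae, chain_ids):
    """Average PAE over residue pairs that belong to different chains.

    pae is a square list of lists; chain_ids gives one chain label per residue.
    """
    n = len(chain_ids)
    values = [
        pae[i][j]
        for i in range(n)
        for j in range(n)
        if chain_ids[i] != chain_ids[j]
    ]
    if not values:
        raise ValueError("no inter-chain residue pairs found")
    return sum(values) / len(values)
```

In practice the coordinates would come from matched atoms of two aligned models, and the PAE matrix from the JSON file written alongside a prediction; a low inter-chain average suggests the relative placement of the chains is predicted with confidence.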