Classification: An In-Depth Guide
What Are Protein Domains?
Protein domains are distinct structural units within proteins.
Each domain folds independently into a stable 3D shape, often associated with a specific
function.
Proteins can have one or multiple domains, much like a multi-tool has different functional
parts.
Domains help proteins perform complex tasks by combining different functional modules.
CATH Classification System: Organizing Protein Domains by Structure
The CATH system groups protein domains based on their 3D structure and evolutionary
relationships. It is hierarchical, going from general to specific:
Level | Description | Detail
Class | Composition of secondary structures | Alpha helices, beta strands, or mixed
Architecture | Overall arrangement of secondary structures | How helices and strands are arranged in space
Topology (Fold) | The precise fold or connectivity | How the protein chain moves from start to end
Homologous Superfamily | Evolutionary relationship | Domains sharing common ancestry and similar sequence
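The four levels above are commonly written as a dotted numeric code, one number per level (Class.Architecture.Topology.Homologous superfamily). The sketch below shows one way to split such a code and to check how deep two domains agree in the hierarchy. The function names and the example codes are illustrative assumptions; check any real identifier against the CATH database before relying on it.

```python
from typing import NamedTuple


class CathCode(NamedTuple):
    """The four numeric levels of a CATH-style code, general to specific."""
    class_: int
    architecture: int
    topology: int
    superfamily: int


def parse_cath_code(code: str) -> CathCode:
    """Split a dotted 'C.A.T.H' string into its four hierarchy levels."""
    parts = code.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    return CathCode(*(int(p) for p in parts))


def shared_levels(a: CathCode, b: CathCode) -> int:
    """Count how many leading levels two codes have in common (0 to 4)."""
    depth = 0
    for x, y in zip(a, b):
        if x != y:
            break
        depth += 1
    return depth


# Illustrative codes only: both start with 1.10, then diverge at topology.
first = parse_cath_code("1.10.490.10")
second = parse_cath_code("1.10.220.10")
print(first)
print(shared_levels(first, second))  # same class and architecture only
```

Because the hierarchy runs from general to specific, two domains that differ at one level are not compared at the levels below it, which is why the comparison stops at the first mismatch.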
1. Class: Based on Secondary Structure Content
Mainly Alpha: Proteins dominated by alpha helices, which are spiral-shaped structures
stabilized by hydrogen bonds.
Mainly Beta: Proteins dominated by beta strands, which are extended chains forming sheets
stabilized by hydrogen bonding between strands.
Mixed Alpha-Beta: Proteins with both alpha helices and beta strands in significant amounts,
without a clear majority.
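As a rough illustration of the class level, the sketch below assigns one of the three classes from a per-residue secondary structure string (H for helix, E for strand, any other character for coil). The 60% majority threshold and the function name are hypothetical choices made for this example; they are not CATH's actual assignment criteria.

```python
def assign_class(ss: str, majority: float = 0.6) -> str:
    """Toy class assignment from a per-residue secondary structure string.

    'H' counts as helix, 'E' as strand; everything else is ignored.
    The majority threshold is a hypothetical value, not a CATH rule.
    """
    helix = ss.count("H")
    strand = ss.count("E")
    structured = helix + strand
    if structured == 0:
        raise ValueError("no helix or strand residues to compare")
    helix_share = helix / structured
    if helix_share >= majority:
        return "Mainly Alpha"
    if helix_share <= 1 - majority:
        return "Mainly Beta"
    return "Mixed Alpha-Beta"


print(assign_class("CHHHHHHHHCCHHHHHHC"))  # helices only
print(assign_class("CEEEEECCEEEEECEEEE"))  # strands only
print(assign_class("HHHHHHCCEEEEECEEEE"))  # comparable amounts of both
```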
2. Architecture: Arrangement of Secondary Structures
This level looks at the 3D arrangement of secondary structures without considering the
connectivity.
For example, two domains can have the same secondary structure composition but differ in
how their helices and strands are packed or oriented.
3. Topology (Fold): Connectivity and Path of the Polypeptide Chain
Topology specifies how the secondary structures are connected in the protein chain, from
N-terminus to C-terminus.
This defines the fold: the overall 3D shape together with the order in which strands and
helices are connected along the chain.
Domains with the same fold often, but not always, share evolutionary origins; homology is
asserted only at the next level, the homologous superfamily.
Different folds have different functional and evolutionary implications.
Example: Within the Mainly Alpha class and orthogonal bundle architecture, you might find
distinct folds like the Annexin fold, Globin-like fold, or DNA-binding fold.
4. Homologous Superfamily
Domains that share the same fold and show evidence of common ancestry, such as significant
sequence similarity, are clustered into homologous superfamilies, reflecting evolutionary
relatedness.
These groups often have conserved functions and structures.
Mainly Alpha Class: Proteins Dominated by Alpha Helices
Alpha helices have about 3.6 amino acids per turn and are stabilized by hydrogen bonds
between the backbone C=O of one residue and the backbone N-H four residues further along.
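The 3.6 residues per turn figure translates directly into helix geometry. The sketch below works through the arithmetic, assuming the commonly quoted textbook rise of about 1.5 Å per residue; that value and the function name are not from the text above.

```python
RESIDUES_PER_TURN = 3.6       # figure given in the text
RISE_PER_RESIDUE_A = 1.5      # assumed textbook value, in angstroms

# Each residue rotates the chain by a fixed angle around the helix axis.
ROTATION_PER_RESIDUE_DEG = 360 / RESIDUES_PER_TURN   # about 100 degrees

# One full turn advances the helix by the pitch.
PITCH_A = RESIDUES_PER_TURN * RISE_PER_RESIDUE_A     # about 5.4 angstroms


def helix_geometry(n_residues: int) -> tuple[float, float]:
    """Return (number of turns, length in angstroms) for an ideal helix."""
    turns = n_residues / RESIDUES_PER_TURN
    length = n_residues * RISE_PER_RESIDUE_A
    return turns, length


turns, length = helix_geometry(18)
print(f"{turns:.1f} turns, {length:.1f} angstroms long")
```

Real helices deviate from these ideal values, so the numbers are best read as estimates.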
Coiled-coils (Leucine Zippers):
- Consist of two or more alpha helices wound around each other; a leucine zipper has two.
- Have a repeating seven-residue pattern (the heptad repeat) in which leucine (a hydrophobic
amino acid) faces inward, stabilizing the structure by forming a hydrophobic core.
- Charged amino acids flanking the hydrophobic core stabilize the structure through ionic
interactions between the helices.
- Outside surfaces are polar, allowing interaction with water.
- Example: The GCN4 transcription factor uses a coiled-coil for dimerization.
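The seven-residue repeat is conventionally labelled a to g, with hydrophobic residues at positions a and d forming the core; in leucine zippers the leucine typically sits at d. The sketch below labels each residue of a sequence with its heptad position and collects the residues at a chosen position. The sequence is a made-up idealized repeat, not a real protein, and the helper names are hypothetical.

```python
HEPTAD = "abcdefg"


def heptad_positions(sequence: str, start: str = "a") -> list[tuple[str, str]]:
    """Pair each residue with its heptad label, given the label of residue 1."""
    offset = HEPTAD.index(start)
    return [(res, HEPTAD[(i + offset) % 7]) for i, res in enumerate(sequence)]


def residues_at(sequence: str, position: str, start: str = "a") -> str:
    """Collect the residues that fall on one heptad position."""
    return "".join(
        res for res, pos in heptad_positions(sequence, start) if pos == position
    )


# Hypothetical idealized sequence: V at a, L at d, charged E and K at e and g.
toy = "VKQLEDK" * 4

print(residues_at(toy, "d"))  # LLLL
print(residues_at(toy, "a"))  # VVVV
print(residues_at(toy, "e"), residues_at(toy, "g"))  # EEEE KKKK
```

For a real sequence the register (which residue is position a) is not known in advance, so the start argument would have to come from a structure or a prediction tool.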