Mutation Impact & Drug Discovery
A Detailed Study Guide
1. Comparative Modeling (Homology Modeling)
Comparative modeling is a technique used to predict the 3D structure of a protein whose
structure is unknown by using the known structure of a related protein (called the
template).
This method relies on the principle that proteins with similar sequences have similar
structures.
The Four Main Steps:
1. Fold Recognition: Search protein structure databases (like the Protein Data Bank, PDB) to
find a template protein with a known 3D structure that is likely to share the fold of your
target sequence.
2. Alignment: Align your target protein sequence with the sequence of the chosen template,
ensuring amino acids correspond as closely as possible.
3. Model Building: Using the aligned template structure, place the amino acids of your target
sequence onto the template’s backbone, effectively 'copying' the fold but using the target
sequence.
4. Model Refinement and Validation: Evaluate model quality using various metrics (e.g., RMSD
against a reference structure, when one is available). If the model looks poor, refine the
alignment or search for better templates and repeat.
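RMSD, the metric named in step 4, can be sketched in a few lines. This is a minimal illustration rather than a full validation tool: it assumes the two structures are already superposed and that atoms are paired by index, and the coordinates and the names `model` and `reference` are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) points.

    Assumption: both structures are already superposed and the i-th point
    in one list corresponds to the i-th point in the other.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical C-alpha coordinates in angstroms (illustrative values only)
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 2.0)]

print(round(rmsd(model, reference), 3))  # prints 1.291
```

A lower RMSD means the model sits closer to the reference. In practice the two structures would first need to be optimally superposed, which this sketch deliberately leaves out.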
Model Accuracy and Limitations:
Structure tends to be more conserved than sequence, meaning proteins can have very
different sequences but similar structures.
When sequence similarity between target and template is high (>50%), models are
generally quite accurate; below 30%, accuracy decreases significantly.
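The thresholds above can be applied to an existing pairwise alignment. The sketch below assumes that similarity is measured as percent identity over aligned, non-gap columns (one of several conventions in use), and the sequences and function names are hypothetical. The 30-50% band is not characterised in the text, so it is labeled only as intermediate.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percent of non-gap aligned columns where the two residues are identical.

    Assumption: the denominator is the number of columns with a residue in
    both sequences; other conventions divide by the full alignment length.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    aligned = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == gap or b == gap:
            continue  # gap columns are excluded from the count
        aligned += 1
        if a == b:
            matches += 1
    return 100.0 * matches / aligned if aligned else 0.0


def expected_accuracy(identity):
    """Map percent identity to the qualitative bands given in the text."""
    if identity > 50:
        return "generally quite accurate"
    if identity < 30:
        return "accuracy decreases significantly"
    return "intermediate (not characterised in the text)"


# Hypothetical pre-aligned sequences; '-' marks a gap
target = "MKT-AYIAKQR"
template = "MKSGAYLAK-R"

identity = percent_identity(target, template)
print(f"{identity:.1f}% identity: {expected_accuracy(identity)}")
# prints: 77.8% identity: generally quite accurate
```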
Errors often come from:
1. Using a poor or incorrect template.
2. Misalignments leading to misplaced residues.
3. Regions in the target sequence not present in the template (gaps), which must be modeled
ab initio or left uncertain.
4. Minor structural differences causing distortions, especially in loops and surface regions.
5. Incorrect packing of side chains, since even small sequence differences can alter side
chain interactions.
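The gap problem described above can be made concrete by scanning an alignment for target residues that have no template counterpart. This is a minimal sketch under stated assumptions: the alignment is already given as two equal-length strings with '-' as the gap character, and the function name and sequences are hypothetical.

```python
def uncovered_target_regions(aligned_target, aligned_template, gap="-"):
    """Return 1-based, inclusive (start, end) ranges of target residues
    that are aligned to gaps in the template.

    These are the stretches that cannot be copied from the template and
    would have to be modeled ab initio or left uncertain.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    regions = []
    target_pos = 0   # index of the current residue in the ungapped target
    start = None     # start of the uncovered region currently open, if any
    for t, s in zip(aligned_target, aligned_template):
        if t == gap:
            continue  # column has no target residue; position does not advance
        target_pos += 1
        if s == gap:
            if start is None:
                start = target_pos
        elif start is not None:
            regions.append((start, target_pos - 1))
            start = None
    if start is not None:
        regions.append((start, target_pos))
    return regions


# Hypothetical alignment: the target has a three-residue insertion and a
# single-residue insertion relative to the template
target = "MKTGGSAYIAKQR"
template = "MKT---AYLA-QR"

print(uncovered_target_regions(target, template))  # prints [(4, 6), (11, 11)]
```

Columns where the target itself has a gap are skipped without closing an open region, so uncovered target residues that are contiguous in the target sequence are reported as a single range.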
The protein core is usually conserved and modeled more accurately than loops or surface
regions. The active site (if conserved) can often be predicted reasonably well.
Added Value of Modeling: While the template provides a fold, modeling the target sequence
onto it allows for better prediction of specific features, like unique side chains or loops,
giving more accurate insights into active sites or binding pockets.
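Added value is roughly: Model Accuracy minus Template:Target Similarity, i.e., how much
closer the model is to the true target structure than the unmodified template is.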
Limitations: Comparative modeling can only build structures on folds that are already
known; novel folds absent from structure databases cannot be reliably modeled.
Prediction quality depends heavily on the availability of suitable templates.