6BBB0333
L19 & 20 – Novel Protein Design
Construction of Novel Proteins
1. Modification of existing proteins
- Site-directed mutagenesis (SDM)
- Chemical modification
- The hope is that these methods will change the structure of the protein
2. Utilise framework of known protein
- e.g. Ig fold and SDM -> new binding sites
o Introduce metal binding into the framework of an Ig domain
o Put histidines on the loops connecting the beta strands in the Ig fold
o This introduces a new biological function - in this case, the binding of a
metal
- i.e. introduce a new biological function into a known framework
3. Design of novel proteins by analogy to known structures
- Homology modelling
4. In vitro random mutagenesis and selection
- Mutagenesis in vitro
- Perhaps will yield new fold
5. In silico mutagenesis and energy calculation
- Mutagenesis in silico
- Perhaps will yield new fold
6. Entirely novel design
- Something entirely new not seen in nature
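A minimal sketch of approach 5 (in silico mutagenesis and energy calculation), under loud assumptions: the "energy" here is a made-up placeholder that only counts mismatches against a target hydrophobic (H) / polar (P) pattern, not a real force field, and the names toy_energy, mutate_in_silico, the hydrophobic residue set, the start sequence and the pattern are all hypothetical. It shows the general loop only: propose a point mutation, score it, keep it if the score does not get worse.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: a crude hydrophobic/polar split, chosen only for illustration.
HYDROPHOBIC = set("AVILMFWC")


def toy_energy(seq: str, pattern: str) -> int:
    """Placeholder 'energy': number of positions whose hydrophobic/polar
    class disagrees with the target pattern ('H' or 'P'). Lower is better."""
    return sum(
        1
        for res, want in zip(seq, pattern)
        if (res in HYDROPHOBIC) != (want == "H")
    )


def mutate_in_silico(seq: str, pattern: str, steps: int = 2000, seed: int = 0):
    """Greedy search: try random point mutations, keep those that do not
    raise the toy energy. Returns (final sequence, final energy)."""
    if len(seq) != len(pattern):
        raise ValueError("sequence and pattern must have the same length")
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    best, best_energy = seq, toy_energy(seq, pattern)
    for _ in range(steps):
        pos = rng.randrange(len(best))
        candidate = best[:pos] + rng.choice(AMINO_ACIDS) + best[pos + 1:]
        energy = toy_energy(candidate, pattern)
        if energy <= best_energy:
            best, best_energy = candidate, energy
    return best, best_energy


# Hypothetical starting sequence and target pattern (not from any real protein).
start = "GSGSGSGSGSGSGS"
pattern = "HPPHHPPHPPHHPP"
designed, energy = mutate_in_silico(start, pattern)
print(designed, energy)
```

In a real study the placeholder score would be replaced by a physical or statistical energy function, and the designed sequence would still need experimental testing to see what fold it actually adopts.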
Motivation: not only to generate useful molecules, but also to learn about
protein folding, the “inverse folding problem” and also the evolution of new
protein folds
- Development of proteins with practical application – industry and medicine
- Helps understanding of inverse folding
Inverse Folding
- Can you invent a sequence to fold in a defined way?
o Usually we start with the primary sequence and try to understand how it
folds into a tertiary structure
o Inverse folding = opposite
What kind of primary sequence do we need in order to
achieve a specific fold?
- Remember: CONTEXT of sequence important
o The primary sequence of a protein encodes its tertiary structure
o But a stretch of sequence does not always fold the same way…must consider
the context (in what context i.e. environment is that sequence folding?)
- OVERALL sequence -> unique fold
Same sequence of amino-acids can fold differently, depending on the context
Example:
- An 11-amino-acid sequence (dubbed the 'chameleon' sequence) folds as an
α-helix when placed at one position but as part of a β-sheet when placed at
another position in the primary sequence of the IgG-binding domain of protein G (GB1).
- i.e. the same sequence forms an alpha helix or a beta strand depending on where in the
tertiary structure of this IgG-binding domain it sits
Paracelsus challenge
- Aim: To transform the conformation of one globular protein into that of another
by changing no more than 50% of the sequence.
- I.e. can you design a protein that keeps at least 50% sequence identity to a protein
with a different fold?
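The 50% criterion can be checked with a simple percent-identity calculation. This is a minimal sketch, assuming the two sequences are already aligned to the same length and that identity is counted over the full alignment length (other definitions divide by the shorter sequence instead). The function names and the toy sequences are hypothetical and are not the real protein G or Janus sequences.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions with identical residues.
    Gap characters ('-') never count as a match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


def meets_paracelsus_criterion(original: str, redesigned: str) -> bool:
    """True if no more than 50% of the aligned positions were changed."""
    return percent_identity(original, redesigned) >= 50.0


# Hypothetical toy sequences, already aligned (illustration only).
original = "ACDEFGHIKL"
redesigned = "ACDEFLMNPQ"
print(percent_identity(original, redesigned))
print(meets_paracelsus_criterion(original, redesigned))
```

Identity only says how much of the sequence was kept. Whether the redesigned protein really has a different fold has to be shown by determining its structure.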
Method
- Started with protein G (has alpha-beta fold)
- Changed it into a 4-helix bundle fold (the designed protein is called Janus)
- Protein G and Janus have 50% sequence identity, but different folds
- This 4-helix bundle fold is similar to that of another protein that exists in nature, called
ROP
Janus vs ROP
- Janus has a ROP-like 4-helix bundle fold, not the fold of protein G
- So despite having 50% identity, protein G and Janus have different folds
- 2 different 3D structures!
- However, Janus does not exist in nature – it is artificial
Homologous sequences usually imply similar tertiary structures
- The above is artificial…this is what we really know:
- If 2 proteins have over 30% sequence identity in primary structure, it
is likely they will fold into similar tertiary structures
- Janus clearly contradicts this “rule” – but Janus is a synthetic protein
o So can this phenomenon only be engineered? Or does this type of
transformation ever occur in nature?
o YES!...
Homologous proteins with 40% sequence identity but different folds
- Pfl6 and Xfaso1
- Similar in primary sequence, but different tertiary structure
- Their biological functions are similar
- Both proteins are Cro family repressors (so both must derive from a
common ancestor)
- They control the genetic switch that determines whether a phage enters the
lytic or the lysogenic cycle after infection
o Phages are viruses that infect bacteria
o Lytic: the phage quickly takes over the host cell, makes many copies of
itself, breaks open (kills) the cell and infects other cells
o Lysogenic: the phage integrates into the bacterial DNA and stays hidden,
without making new phage particles
- Biological function is similar, sequence identity is also high, but the tertiary fold is
different
- i.e. they are functionally related, so their structures must have diverged
from that of the common ancestor.
Bottom line: proteins with similar primary sequences can have different folds
Further evidence for this as follows…