CONCEPTS OF PROTEIN
TECHNOLOGY AND APPLICATIONS
TABLE OF CONTENTS
Introduction
Definition of proteomics
Why proteomics?
Proteomics as part of systems biology
The different faces of proteomics
Identification of proteins: principles
Workflows for ‘shotgun’ proteomics
Sample preparation
Reverse Phase HPLC in proteomics
Standard conditions for RP-HPLC
Stationary phase
Mobile phase
Organic solvent
pH of mobile phase
Agents for ion-pairing
Microcolumn Reverse Phase Chromatography
2D-LC
Mass spectrometry in proteomics: instruments
Some essential terms
Ionization
Electrospray ionization (ESI)
MALDI
Mass analyzers
Magnetic sector analyzers
Quadrupoles
3D ion trap
2D ion trap (linear ion trap)
Fourier-Transform ion cyclotron resonance (FT-ICR)
The orbitrap or electrostatic trap
Time of flight (TOF)
Detectors
Mass spectrometry in proteomics: tandem mass spectrometry – MSMS
Ion activation
Collision induced dissociation (CID)
Other methods for ion activation
Hybrid instruments
Triple-quadrupole
Q-TOF
Q-Trap
TOF-TOF
QExactive: Quad + CID + CTRAP + Orbitrap
Fragmentation spectra
Nomenclature (Roepstorff-Fohlman)
Fragmentation
Formation of ions, characteristic for lateral side chains
Sequence determination
Influence of collision method on the fragmentation pattern
De novo sequencing
‘Top-down’ strategy
Protein identification
Peptide Mass Fingerprint (PMF)
Peptide Fragment Fingerprint (PFF)
Mass spectrometry in proteomics – LC-MS
Mass spectrometry in proteomics – applications
Protein identification
Detection and characterization of mutations
Verification of structure and purity of proteins and peptides
Non-covalent protein complexes and 3D structure information
Quantification
Quantification methods on the basis of isotopes
Absolute quantification after addition of isotopic peptides (AQUA)
Stable isotope labeling by amino acids in cell culture (SILAC)
Isotope-coded protein labels (ICPL)
Isobaric tags for relative and absolute quantification (iTRAQ/TMT)
Label-free quantitative methods
EmPAI
Spectral counting
Overview of methods
Exercises – Xaveer Van Ostade
Search engines
Two-dimensional polyacrylamide gel electrophoresis (2D-PAGE)
PollEv quizzes
INTRODUCTION
• Read the article and understand it (mostly the proteomics and technology part)
o During the examination → questions about this article!
o You can print out the article and bring it!
§ You can highlight and underline, but NO writing!
• Faculty Research Day → students are required to attend
o Poster session with a poster of every lab
• Examination: approximately 6 questions
o 2 from Prof. Lermyte, may include an exercise (10 points)
o 3 from Prof. Van Ostade, usually including an exercise (20 points)
§ Exercises are usually about sequencing of peptides
o 1 or 2 about the article (10 points)
• PollEv questions in class: the examination questions will be more extensive
DEFINITION OF PROTEOMICS
• Determination of the complete set of proteins that is present in a system, under
specific circumstances:
(Exception: sometimes we just look at 1 protein, e.g. for post-translational
modification)
o System:
§ Protein complex
§ Subcellular compartment
(nucleus, mitochondrion, lysosome, …)
§ Cell (usually this)
§ Tissue
§ Organism (yeast, Drosophila, …)
o Circumstances:
§ Treatment
§ Time after treatment!
§ Condition of the cell (age, normal, infected, tumor…)
§ …
WHY PROTEOMICS?
• 1.1: Compared to genomics and transcriptomics, proteomics is “the real thing”
o Proteins are the working motor of the cell
o You need to assemble the whole machine to get a working cell
o Assembly is what we do with proteomics
• 1.2: Genome sequencing
o Estimates: humans approximately 20,000-40,000 genes, yeast 6,000, Drosophila 13,000,
Caenorhabditis 18,000, plant 26,000
§ We don’t have many more genes, yet we are a lot more complex, so the
number of genes doesn’t say everything about the complexity of an
organism
§ Genes are still difficult to predict: verification of the gene product by
proteomic analysis remains necessary
→ “Proteogenomics”
• 2: mRNA vs protein profiling
o No direct correlation: microarrays/RNAseq are insufficient to measure protein
expression
o Large study: comparison of the amount of RNA (x-axis) vs the amount of protein
(y-axis)
§ A lot of RNA is not converted into protein
§ So a high amount of RNA doesn’t necessarily mean a lot of protein
• 3: More proteins than genes (6-8 proteins per gene) → 1 gene can produce many proteins
o Post-translational modifications (PTMs)
o Alternative splicing → isoforms
§ Transcription → splicing → translation
§ Normal splicing and subsequent translation result in a certain
protein
§ With alternative splicing, parts of the protein may be
missing, or a frameshift may occur so that parts of the protein are
completely different
• 4: Protein interaction networks
o Most proteins interact with each other → proteomics is “the real thing”
§ Genes and mRNAs hardly interact with each other
o Higher order of complexity without a drastic increase in the number of
components. About 78% of yeast proteins are part of a complex
o Most cellular processes are regulated by protein complexes instead of
individual proteins
o Functional proteomics: definition of a protein as an element in an
interaction network (‘contextual function’), rather than ascribing it to one
function
• 5: Cellular localization
o Depending on the biological state of the cell, a protein can be localized in
one or several cellular locations (nucleus, cytosol, plasma membrane,
mitochondria, ER…)
§ Example: a STAT protein is localized at the cell membrane but translocates to
the nucleus and binds to promoter sequences to activate the expression
of certain genes
§ So depending on the time after IFN activation you will get a different
result
o Different binding partners in different locations → 1 protein can have
several functions, depending on its localization in the cell
• None of these features can be predicted by genome sequencing → proteomics is needed
PROTEOMICS AS PART OF SYSTEMS BIOLOGY
• In order to understand the dynamic complexity of an organism, an integrated
image of all aspects of proteins needs to be developed (so far only the average of
all possible states is measured):
o mRNA and protein profiles and how these change over time, e.g. during
development or changing conditions (e.g. pathological)
o Knowledge of the state and properties of all proteins:
§ Posttranslational modifications
§ Cellular localization
§ Binding of ‘metabolomic’ ligands: e.g. haem ring, metal ions,
glucose, ATP, ADP, GTP, GDP…
§ Alternative splicing
§ Proteolytic degradation. Hence, the synthesis, localization and
activity status of a protease are regulating factors
§ Oligomeric state and participation in complexes
§ Structure, conformation and allosteric mechanisms
o All protein-protein interactions in space and time in one cell
• Together with genomic & metabolomic data (in space & time) → systems biology
THE DIFFERENT FACES OF PROTEOMICS
• Proteomics sensu stricto
o Large-scale identification and characterization of proteins, including their
posttranslational modifications (‘shotgun’ proteomics)
• Differential Proteomics
o Large scale comparison of protein expression levels
• Cell-mapping proteomics
o Protein-protein interaction studies
IDENTIFICATION OF PROTEINS: PRINCIPLES
• Extract proteins from the sample, tissue, …
• Optionally, separate the extracted proteins
• Then digest the proteins with a protease (often trypsin) → this yields a large
number of peptides
o Digestion with trypsin gives about 40 peptides per protein
• These need to be separated by peptide liquid chromatography
• Electrospray ionization: ionizes the peptides (positive charge) and brings them into
the gas phase
• Ion mobility: separation of peptides based on their shape (optional)
• Mass analyzer
o Many different types; which one is used depends on what you want to study
• Data analysis
o Identification of proteins, and beyond that a functional analysis → does
the protein play a role in a process in the cell?
• Sample preparation
o 1: Break up tissues or cells, extract the protein fraction
o 2: Modification of proteins for further analysis (denaturation, reduction,
etc.), depending on the subsequent methods for separation/purification
and identification
§ Sometimes you need native proteins; then you can’t use trypsin
o 3: Important variables that determine the success of separation /
purification and identification:
§ Method of cell lysis, type of detergent
§ pH
§ Temperature
§ Proteolytic degradation (addition of protease inhibitors)
o 4: In many cases: trial and error, so sometimes…
o Protein fractionation: gel electrophoresis
§ You can do this to pre-fractionate the sample
§ This avoids loading thousands of peptides onto the chromatography column
at once, in which case they would not be separated
§ Cut pieces out of the gel and digest the proteins inside the
pieces → introduce them into the mass analyzer to identify the proteins
o Prepare proteins for MS analysis: protease treatment (can be done before
or after protein separation)
§ Proteases hydrolyze (specific) peptide bonds:
> Trypsin: C-terminal side of Lys and Arg residues
- Trypsin is used in the vast majority of cases
> Chymotrypsin: C-terminal side of large hydrophobic residues
(Tyr, Phe, Trp)
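The cleavage rules above can be tried out with a small in silico digest. This is a minimal sketch, not a replacement for a real digestion tool: the function name `digest` and the test sequence are made up for illustration, and only the simple rules from these notes are applied (the commonly used exception that trypsin cuts poorly before proline is ignored, as are missed cleavages).

```python
import re

# Residues after which each protease cuts, as listed in the notes.
CLEAVAGE_RESIDUES = {
    "trypsin": "KR",        # Lys, Arg
    "chymotrypsin": "YFW",  # Tyr, Phe, Trp
}

def digest(sequence: str, protease: str = "trypsin") -> list[str]:
    """Cut a protein sequence C-terminal to the protease's target residues."""
    residues = CLEAVAGE_RESIDUES[protease]
    # Zero-width lookbehind: split right after any target residue.
    pieces = re.split(f"(?<=[{residues}])", sequence)
    # A cut at the very end of the sequence leaves an empty string; drop it.
    return [piece for piece in pieces if piece]

# Hypothetical sequence, for illustration only.
protein = "MAGKLTRPEFWYK"
print(digest(protein))                  # ['MAGK', 'LTR', 'PEFWYK']
print(digest(protein, "chymotrypsin"))  # ['MAGKLTRPEF', 'W', 'Y', 'K']
```

Joining the peptides back together gives the original sequence, which is a quick check that no residues were lost in the digest.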
• Protein/peptide separation
o Protein enrichment needed: the dynamic range of protein abundance in a cell is very
wide: between 1,000,000 and 10 copies/cell → extremely difficult to detect
low-abundance proteins
o X-axis: the number of cells needed to obtain a given weight of protein
o Some proteins have a very low or very high abundance in cells
o A certain amount (mol) of protein in a tube corresponds
to a certain weight
o With a Coomassie-stained gel we can only visualize proteins of
medium or very high abundance; the low-abundance ones can’t be visualized
o With silver stain you can visualize low-abundance proteins, but only when a large
amount is loaded! So they are very hard to see → we have to concentrate/enrich
them from a sample to see them not only in the gel, but especially in the
mass spectrometer
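The link between copies per cell, amount (mol) and weight can be worked out with Avogadro's number. The sketch below is an illustration under assumed values: the target amount of 1 fmol and the molecular weight of 50 kDa are not from the notes, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def cells_needed(target_mol: float, copies_per_cell: float) -> float:
    """Number of cells that together contain target_mol of one protein."""
    return target_mol * AVOGADRO / copies_per_cell

def mass_in_grams(amount_mol: float, molecular_weight_da: float) -> float:
    """Weight of a molar amount of protein (1 Da corresponds to 1 g/mol)."""
    return amount_mol * molecular_weight_da

# Assumed values, for illustration only: 1 fmol of a 50 kDa protein.
target = 1e-15
for copies in (1_000_000, 10):
    print(f"{copies:>9} copies/cell -> {cells_needed(target, copies):.2e} cells")
print(f"1 fmol of a 50 kDa protein weighs {mass_in_grams(target, 50_000):.1e} g")
```

Under these assumptions, about 600 cells suffice for a protein present at 1,000,000 copies/cell, whereas roughly 60 million cells are needed at 10 copies/cell, which is why low-abundance proteins have to be enriched.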
o Plasma: protein concentrations span 11 orders of magnitude!
§ High-abundance proteins mask low-abundance proteins: the signal is
saturated by the high-abundance proteins
§ It is like looking for a needle in a haystack: difficult, but proteomics makes
finding it easier (like a metal detector), provided you use a
good methodology/strategy
o Increasing separation capacity (peak capacity)
§ Use sequential separation techniques (“dimensions”), where
each dimension is a technique (chromatography or gel electrophoresis)
based on a different physicochemical characteristic of the proteins or
peptides (= orthogonal separation)
§ The separation capacities of the dimensions multiply
§ The number of peaks (that don’t overlap!) = max number of
components that can be separated by this type of
chromatography = peak capacity
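The multiplication of separation capacities can be illustrated with a few lines of arithmetic. This is a sketch with hypothetical numbers: the capacities of 20 and 100 and the sample size of 10,000 proteins are assumptions, and the product is a theoretical maximum that is only approached when the dimensions are truly orthogonal.

```python
from math import prod

def combined_peak_capacity(capacities):
    """Theoretical peak capacity of sequential orthogonal dimensions."""
    return prod(capacities)

# Hypothetical values, for illustration only.
first_dimension = 20    # e.g. number of fractions from a first separation
second_dimension = 100  # e.g. peak capacity of a second separation
total = combined_peak_capacity([first_dimension, second_dimension])
print(total)  # 2000

# Rough sample complexity: an assumed 10,000 proteins, each giving about
# 40 tryptic peptides (the figure quoted earlier in these notes).
expected_peptides = 10_000 * 40
print(expected_peptides > total)  # True
```

Even the combined capacity in this example is far below the number of peptides in a complex digest, which is why the mass spectrometer has to act as a further separation step.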
o Peptide separation: usually chromatography
§ Separation of biomolecules on the basis of their distribution over a
mobile and a stationary phase
o High pressure liquid chromatography (HPLC): principle
§ You need: solvent, pump, injector (brings the sample into the system), column
(for separation) and detector (usually a UV detector); the sample can then
go to the mass spectrometer instead of to waste
§ Migration of proteins in the solvent (mobile phase) through a column
that is packed with beads (stationary phase)
§ Proteins partition between a layer bound to the stationary phase and proteins
in solution in the mobile phase (depending on their affinity for the stationary phase)
10
TECHNOLOGY AND APPLICATIONS
CONTENT TABLE
Introduction ................................................................................................................................. 4
Definition proteomics ...................................................................................................................... 5
Why proteomics? ............................................................................................................................ 5
Proteomics as part of systems biology .............................................................................................. 7
The different faces of proteomics ..................................................................................................... 7
Identification of proteins: principles ................................................................................................. 8
Workflows for ‘shotgun’ proteomics ................................................................................................12
Sample preparation ..................................................................................................................... 13
Reverse Phase HPLC in proteomics .............................................................................................. 26
Standard conditions for RP-HPLC ....................................................................................................27
Stationary phase ........................................................................................................................ 27
Mobile phase ............................................................................................................................. 28
Organic solvent ..................................................................................................................... 28
pH of mobile phase ................................................................................................................ 30
Agents for ion-pairing ............................................................................................................. 30
Microcolumn Reverse Phase Chromatography.................................................................................32
2D-LC ............................................................................................................................................34
Mass spectrometry in proteomics: instruments ............................................................................ 38
Some essential terms .....................................................................................................................38
Ionization .......................................................................................................................................41
Electrospray ionization (ESI) ....................................................................................................... 42
MALDI ............................................................................................................................................46
Mass analyzers...............................................................................................................................49
Magnetic sector analyzers .......................................................................................................... 49
Quadrupoles ............................................................................................................................. 50
3D ion trap ................................................................................................................................. 51
2D ion trap (linear ion trap) ......................................................................................................... 54
Fourier-Transform ion cyclotron resonance (FT-ICR) .................................................................... 54
The orbitrap or electrostatic trap................................................................................................. 56
Time of flight (TOF) ..................................................................................................................... 57
Detectors .......................................................................................................................................61
Mass spectrometry in proteomics: tandem mass spectrometry – MSMS ........................................ 64
1
, Ion activation .................................................................................................................................67
Collision induced dissociation (CID) ........................................................................................... 67
Other methods for ion activation................................................................................................. 68
Hybrid instruments .........................................................................................................................69
Triple-quadrupole ...................................................................................................................... 69
Q-TOF ....................................................................................................................................... 70
Q-Trap ....................................................................................................................................... 70
TOF-TOF .................................................................................................................................... 71
QExactive: Quad + CID + CTRAP + Orbitrap.................................................................................. 72
Fragmentation spectra ...................................................................................................................73
Nomenclatuur (Roepstorff-Hohlmann)........................................................................................ 74
Fragmentation ........................................................................................................................... 75
Formation of ions, characteristic for lateral side chains ............................................................... 78
Sequence determination ............................................................................................................ 79
Influence of collision method on the fragmentation pattern ......................................................... 81
De novo sequencing .......................................................................................................................81
‘Top-down’ strategy ........................................................................................................................82
Protein identification ......................................................................................................................82
Peptide Mass Fingerprint (PMF)................................................................................................... 82
Peptide Fragment Fingerprint (PFF) ............................................................................................. 84
Mass spectrometry in proteomics – LC-MS ................................................................................... 86
Mass spectrometry in proteomics – applications .......................................................................... 90
Protein identification ......................................................................................................................90
Detection and characterization of mutations ...................................................................................90
Verification of structure and purity of proteins and peptides ..............................................................91
Non-covalent protein complexes and 3D structure information ........................................................92
Quantification ................................................................................................................................93
Quantification methods on the basis of isotopes ......................................................................... 94
Absolute quantification after addition of isotopic peptides (AQUA) .......................................... 94
Stable isotope labeling by amino acids in cell culture (SILAC) .................................................. 96
Isotope-coded protein labels (ICPL) ........................................................................................ 99
Isobaric tags for relative and absolute quantification (iTRAQ/TMT) ......................................... 100
Label-free quantitative methods ............................................................................................... 105
EmPAI ................................................................................................................................. 105
Spectral counting ................................................................................................................ 105
Overview of methods .................................................................................................................... 108
Exercises – Xaveer Van Ostade ................................................................................................... 110
Search engines .......................................................................................................................... 116
Two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) ................................................ 125
PollEv quizes ............................................................................................................................. 137
2
,3
,INTRODUCTION
• Read article & understand (mostly proteomics and technology part)
o During examination à questions about this article!
o You can print out the article and bring it!
§ You can highlight and underlune, but NO writing!
• Faculty Research Day à students are required to attend
o Poster session with a poster of every lab
• Examination: approximately 6 questions
o 2 of Prof. Lermyte, may include an exercise (10 points)
o 3 of Prof. Van Ostade, usually includes an exercise (20 points)
§ Exercises usually about sequencing about peptides
o 1 or 2 about article (10 points)
• PollEv questions in class: on examination questions will be more extended
4
,DEFINITION PROTEOMICS
• Determination of the complete set of proteins that is present in a system, under
specific circumstances:
(Exception: sometimes we just look at 1 protein, e.g. for post-translational
modification)
o System:
§ Protein complex
§ Subcellular compartment
(nucleus, mitochondrium, lysosome, …)
§ Cell (usually this)
§ Tissue
§ Organism (yeast, drosophila, …)
o Circumstances:
§ Treatment
§ Time after treatment!
§ Condition of the cell (age, normal, infected, tumor…)
§ …
WHY PROTEOMICS?
• 1.1: Compared to genomics and transcriptomics, proteomics is “the real thing”
o Proteins are the working motor of the cell
o You need to assemble the whole machine to get a working cell
o Assembly is what we do with proteomics
• 1.2: Genome sequencing
o Estimation: humans ± 20-40.000 genes, yeast 6000, Drosophila 13000,
Caenorhabditis 18.000, plant 26.000
§ We don’t have many more genes but we are a lot more complex, so
amount of genes doesn’t say everything about the complexity of an
organism
§ Still difficult to predict genes: verification of gene product by
proteomic analysis is still necessary
à “Proteogenomics”
5
,• 2: mRNA vs Protein profiling
o No direct correlation:
microarrays/RNAseq are
insufficient to measure protein
expression
o Large study: comparison
amount RNA (x-axis) vs protein
(y-axis)
§ Lots of RNA that is not converted into protein
§ So high amount of RNA doesn’t necessarily mean lots of protein
• 3: More (6-8) proteins/gene à 1 gene can produce many proteins
o Post-translational modifications (PTMs)
o Alternative splicing à isoforms
§ Transcription à splicing à translation
§ Normal splicing and subsequent translation results in a certain
protein
§ Because of alternative splicing some parts of the protein may be
missing or there is a frameshift and there are completely different
parts of proteins
• 4: Protein interaction networks
o Most proteins interact with each other à proteomics is “the real thing”
§ Genes and mRNA’s interact hardly with each other
o Higher order of complexity without drastic increase of number of
components. About 78% of yeast proteins is involved in complex
o Most cellular processes are regulated by protein complexes instead of
individual proteins
o Functional proteomics: definition of protein as an element in an
interaction network (‘contextual function’), rather than ascribing it to one
function
6
• 5: Cellular localization
o Depending on the biological state of the cell, a protein can be localized in
one or several cellular locations (nucleus, cytosol, plasma membrane,
mitochondria, ER…)
§ A STAT protein is localized at the cell membrane but translocates to the
nucleus and binds to promoter sequences to activate expression
of certain genes
§ So depending on the time after IFN activation you will get a different
result
o Different binding partners in different locations → 1 protein can have
several functions, depending on its localization in the cell
• None of these features can be predicted by genome sequencing → proteomics
PROTEOMICS AS PART OF SYSTEMS BIOLOGY
• In order to understand the dynamic complexity of an organism, an integrated
image of all aspects of proteins needs to be developed (so far only the average of
all possible states is measured):
o mRNA and protein profiles and how these change over time, e.g. during
development or changing conditions (e.g. pathological)
o Knowledge of the state and properties of all proteins:
§ Posttranslational modifications
§ Cellular localization
§ Binding of ‘metabolomic’ ligands: e.g. haem ring, metal ions,
glucose, ATP, ADP, GTP, GDP…
§ Alternative splicing
§ Proteolytic degradation. Hence, the synthesis, localization and
activity status of a protease are regulating factors
§ Oligomeric state and participation in complexes
§ Structure, conformation and allosteric mechanisms
o All protein-protein interactions in space and time in one cell
• Together with genomic & metabolomic data (in space & time) → systems biology
THE DIFFERENT FACES OF PROTEOMICS
• Proteomics sensu stricto
o Large scale identification and characterization of proteins, including their
posttranslational modifications (‘shotgun’ proteomics)
• Differential Proteomics
o Large scale comparison of protein expression levels
• Cell-mapping proteomics
o Protein-protein interaction studies
IDENTIFICATION OF PROTEINS: PRINCIPLES
• Extract proteins from sample, tissue, …
• The extracted proteins can be separated (optional)
• Then digest the proteins with a protease (often trypsin) → this yields a large
number of peptides
o Digestion with trypsin gives about 40 peptides per protein
• These need to be separated with peptide liquid chromatography
• Electrospray ionization: ionizes the peptides (positive charge) and brings them into
the gas phase
• Ion mobility: separation of peptides based on their shape (optional)
• Mass analyzer
o Many different types, usage depends on what you want to study
• Data analysis
o Identification of proteins, followed where possible by a functional analysis → does
the protein play a role in any process in the cell?
• Sample preparation
o 1: Break up tissues or cells, extract protein fraction
o 2: Modification of proteins for further analysis (denaturation, reduction
etc.). Depends on the downstream methods for separation/purification
and identification
§ Sometimes you need native proteins, then you can’t use trypsin
o 3: Important variables that determine the success of separation /
purification and identification:
§ Method of cell lysis, type of detergent
§ pH
§ T°
§ Proteolytic degradation (addition of protease inhibitors)
o 4: In many cases: trial and error, so sometimes…
o Protein fractionation: gel electrophoresis
§ This can be used to pre-fractionate the sample
§ Useful so that you don’t load thousands of peptides onto the chromatography
column at once, because then they won’t be separated
§ Cut out pieces of the gel and digest the proteins inside the
pieces → introduce them into the mass analyzer to identify the proteins
o Prepare proteins for MS analysis: protease treatment (can be done before
or after protein separation)
§ Proteases hydrolyze (specific) peptide bonds:
> Trypsin: C-terminal of Lys and Arg residues
- Trypsin is used in the vast majority of cases
> Chymotrypsin: C-terminal of large hydrophobic residues
(Tyr, Phe, Trp)
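The cleavage rules above can be sketched as a small in-silico digestion. This is a minimal illustration, not a validated tool: the function name `digest`, the toy sequence, and the optional `skip_before_proline` switch (a commonly applied refinement that is not part of these notes) are assumptions made for the example.

```python
def digest(sequence, residues="KR", skip_before_proline=False):
    """Cleave C-terminal of every residue listed in `residues`.

    Returns the resulting peptides in order. Missed cleavages are not modelled.
    """
    peptides = []
    start = 0
    for i, aa in enumerate(sequence):
        if aa not in residues:
            continue
        next_is_pro = i + 1 < len(sequence) and sequence[i + 1] == "P"
        if skip_before_proline and next_is_pro:
            # Assumption beyond the notes: no cleavage when Pro follows.
            continue
        peptides.append(sequence[start:i + 1])
        start = i + 1
    if start < len(sequence):
        # Whatever remains after the last cleavage site is the C-terminal peptide.
        peptides.append(sequence[start:])
    return peptides


TRYPSIN = "KR"        # Lys, Arg
CHYMOTRYPSIN = "YFW"  # Tyr, Phe, Trp

toy_protein = "MKWVTFRSLLKPAYGR"  # hypothetical sequence, for illustration only
print("trypsin:     ", digest(toy_protein, TRYPSIN))
print("chymotrypsin:", digest(toy_protein, CHYMOTRYPSIN))
```

Running the same sequence through both enzymes shows that the choice of protease determines which peptides reach the mass spectrometer.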
• Protein/peptide separation
o Protein enrichment needed: the dynamic range of protein abundance in a cell is very
wide: between 1,000,000 and 10 copies/cell → extremely difficult to detect
low-abundance proteins
o X-axis: how many cells are needed to obtain a given amount (weight)
o Some proteins are present at very low abundance in cells, others at very high abundance
o A certain amount (mol) of protein in a tube corresponds to
a certain weight
o A Coomassie-stained gel only visualizes proteins of medium or
high abundance; the low-abundance ones can’t be visualized
o Silver stain can visualize low-abundance proteins, but only when a large
amount is loaded, so they are very hard to see → we have to concentrate/enrich
them from a sample to see them not only in the gel, but especially in the
mass spectrometer
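The link between copies per cell, amount (mol) and weight can be made concrete with a short calculation. This is a sketch under stated assumptions: the molar mass of 50,000 g/mol and the 1 ng target are hypothetical illustration values, not figures from these notes; only the 1,000,000 and 10 copies/cell limits come from the text.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def protein_mass_per_cell(copies_per_cell, molar_mass_g_per_mol):
    """Mass in grams of one protein species inside a single cell."""
    moles = copies_per_cell / AVOGADRO
    return moles * molar_mass_g_per_mol


def cells_needed(target_mass_g, copies_per_cell, molar_mass_g_per_mol):
    """Number of cells required to collect `target_mass_g` of the protein."""
    return target_mass_g / protein_mass_per_cell(copies_per_cell, molar_mass_g_per_mol)


def orders_of_magnitude(high, low):
    """Dynamic range expressed as a power of ten."""
    return math.log10(high / low)


MOLAR_MASS = 50_000  # g/mol, hypothetical protein
TARGET = 1e-9        # 1 ng, hypothetical amount wanted for analysis

for copies in (1_000_000, 10):
    n = cells_needed(TARGET, copies, MOLAR_MASS)
    print(f"{copies:>9} copies/cell -> {n:.2e} cells for 1 ng")

print("dynamic range:", orders_of_magnitude(1_000_000, 10), "orders of magnitude")
```

The number of cells needed scales inversely with the copy number, which is why low-abundance proteins have to be enriched before they can be detected.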
o Plasma: protein concentrations span 11 orders of
magnitude!
§ High-abundance proteins mask low-abundance proteins: the signal is
saturated by the high-abundance proteins
§ It is like looking for a needle in a haystack: difficult, but proteomics
makes it easier to find (like a metal detector), provided you use a
good methodology/strategy
o Increasing separation capacity (peak capacity)
§ Use sequential separation techniques (“dimensions”), where
each dimension is a technique based on a different
physicochemical characteristic of the proteins or peptides
(= orthogonal separation)
> E.g. several dimensions of chromatography or gel electrophoresis
§ The total separation capacity is the product of the separation
capacities of the individual dimensions
§ Peak capacity = the number of peaks (that don’t overlap!) = max number of
components that can be separated by this type of
chromatography
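The multiplication rule can be sketched in a few lines. The capacities used here (20 fractions in a first dimension, 100 peaks in a second) are hypothetical values chosen for illustration, and the product should be read as a theoretical upper limit that assumes the dimensions are fully orthogonal.

```python
import math


def combined_peak_capacity(capacities):
    """Theoretical peak capacity of sequential, fully orthogonal dimensions."""
    if any(c <= 0 for c in capacities):
        raise ValueError("each dimension needs a positive peak capacity")
    return math.prod(capacities)


# Hypothetical two-dimensional setup: 20 fractions, then 100 peaks per fraction.
first_dimension = 20
second_dimension = 100
print(combined_peak_capacity([first_dimension, second_dimension]))
```

Adding a dimension multiplies rather than adds capacity, which is the reason sequential separations are used for complex samples.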
o Peptide separation: usually chromatography
§ Separation of biomolecules on the basis of their distribution over a
mobile and a stationary phase
o High pressure liquid chromatography (HPLC): principle
§ You need: solvent, pump, injector (introduces the sample into the system), column
(for separation), detector (usually a UV detector); after the detector the sample can
go to the mass spectrometer instead of to waste
§ Migration of proteins in solvent (mobile phase) through a column
that is packed with beads (stationary phase)
§ Proteins are distributed between a layer that sticks to the stationary phase and
proteins in solution in the mobile phase (depending on their affinity for the stationary phase)
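The distribution between the two phases can be put into numbers with the standard textbook retention factor, which these notes do not state explicitly. This is a minimal sketch: the function names, the dead time of 2 minutes and the amounts are assumptions for illustration.

```python
def retention_factor(amount_stationary, amount_mobile):
    """k = amount of analyte on the stationary phase / amount in the mobile phase."""
    return amount_stationary / amount_mobile


def fraction_in_mobile_phase(k):
    """Share of the analyte that is moving with the solvent at any moment."""
    return 1 / (1 + k)


def retention_time(dead_time_min, k):
    """t_R = t_0 * (1 + k), where t_0 is the time an unretained compound needs."""
    return dead_time_min * (1 + k)


DEAD_TIME = 2.0  # minutes, hypothetical column dead time

# Hypothetical analytes: weak and strong affinity for the stationary phase.
for name, on_beads, in_solvent in (("weak binder", 1, 1), ("strong binder", 9, 1)):
    k = retention_factor(on_beads, in_solvent)
    print(name, "k =", k, "t_R =", retention_time(DEAD_TIME, k), "min")
```

A higher affinity for the stationary phase gives a larger k, so the compound spends less time in the mobile phase and elutes later.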