Molecular Biology of the Cell
Lecture 1 (06/09/2022) – Replication, sequencing & PCR
The central dogma → replication-transcription-translation
DNA → RNA → protein → metabolite → phenotype
DNA replication → semi-conservative
➔ Every new double stranded molecule consists of one old and one new strand
➔ Old strand → template for new strand
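As a small illustration of the old strand serving as template, the sketch below derives the new strand by base pairing. The function name and the example sequence are hypothetical and only serve the illustration.

```python
# The old strand as template: the new strand is its reverse complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def new_strand(template_5_to_3):
    """Return the newly synthesized strand, written 5'->3'."""
    return template_5_to_3.translate(COMPLEMENT)[::-1]

template = "ATGCGT"   # hypothetical template strand, written 5'->3'
print(new_strand(template))
```

Applying the function twice gives back the original template, which mirrors the fact that each strand can act as template for the other.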
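DNA synthesis → formation of phosphodiester bonds, driven by hydrolysis of the matching dNTP molecule (release of PPi)
- Direction of synthesis is 5’ → 3’, because each new nucleotide is added to the free 3’-OH group of the growing strand
DNA polymerase → cannot start a strand from scratch; it extends the 3’-OH end of a primer that is base-paired to the template (a double stranded “primer”)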
Replisome → molecular machine, consists of multiple proteins, all involved in
replicating the DNA strands
Mistakes in DNA synthesis → can be corrected through proof-reading activity of the DNA polymerase complex
➔ Frequency of mistakes:
o without proof-reading → 1 error per 10^5 NTs
o with proof-reading → 1 error per 10^7 NTs
o with strand-directed mismatch repair → 1 error per 10^10 NTs
Polymerase activities → 5’→3’ polymerization and 3’→5’ exonucleolytic proofreading
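To get a feel for the error frequencies listed above, the sketch below converts each one into an expected number of errors per round of replication. The genome size (3.2 × 10^9 nucleotides, roughly the haploid human genome) is an assumption added for illustration and is not taken from these notes.

```python
# Expected replication errors per genome copy at each fidelity level.
# GENOME_SIZE is an illustrative assumption (approx. haploid human genome).
GENOME_SIZE = 3.2e9  # nucleotides

ERROR_RATES = {
    "without proof-reading": 1e-5,
    "with proof-reading": 1e-7,
    "with strand-directed mismatch repair": 1e-10,
}

def expected_errors(genome_size, error_rate):
    """Expected number of misincorporated nucleotides in one replication."""
    return genome_size * error_rate

for step, rate in ERROR_RATES.items():
    print(f"{step}: about {expected_errors(GENOME_SIZE, rate):g} errors")
```

Under these assumptions only the combination of proof-reading and mismatch repair brings the expected number of errors below one per genome copy.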
Chemical changes in DNA bases can cause mutations
- Depurination → loss of purine base (A or G)
- Deamination → loss of amino group (C→U)
➔ If not repaired, these changes lead to point mutations (deamination) or deletions (depurination) after replication
DNA sequencing → determine the order of the bases A, C, G, and T
PCR → Polymerase Chain Reaction → DNA amplification
Dideoxy sequencing (Sanger) → DNA synthesis while incorporating chain terminators in separate reactions
➔ Chain terminators (dideoxynucleotides) → lack the 3’-OH necessary for strand extension
- With a high concentration of all four dNTPs and a low concentration of one ddNTP (ddATP, ddCTP, ddGTP or ddTTP) in each of the separate reactions, DNA synthesis will continue, but occasionally synthesis will stop when a dideoxynucleotide is incorporated
- With four separate reactions and a labelled primer, fragments of different lengths are generated → separated side by side by gel electrophoresis; every visible fragment (band) represents a termination in DNA synthesis
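The four-reaction principle can be sketched as a toy simulation. This is an idealised model, not a description of a real protocol: it assumes every possible terminated fragment is present in each reaction, and fragment lengths are counted without the primer. All function names are hypothetical.

```python
# Toy model of dideoxy (Sanger) sequencing. Assumption: each reaction
# contains every possible terminated fragment (an idealised gel).

def termination_lengths(new_strand, dd_base):
    """Lengths of fragments that end with the given dideoxynucleotide."""
    return [i + 1 for i, base in enumerate(new_strand) if base == dd_base]

def run_four_reactions(new_strand):
    """One reaction per ddNTP; returns {base: [fragment lengths]}."""
    return {base: termination_lengths(new_strand, base) for base in "ACGT"}

def read_gel(lanes):
    """Read the gel from the shortest to the longest fragment."""
    bands = sorted((length, base) for base, lengths in lanes.items()
                   for length in lengths)
    return "".join(base for _, base in bands)

synthesized = "GATTACA"            # hypothetical newly synthesized strand
lanes = run_four_reactions(synthesized)
print(lanes)
print(read_gel(lanes))
```

Reading the bands from short to long across the four lanes recovers the sequence of the newly synthesized strand, which is complementary to the template.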
Genome sequencing → BAC libraries and “shotgun” fragment sequencing
➔ Assembly by comparison of overlapping sequences
Shotgun sequencing
Contigs → assembly of smaller, overlapping DNA sequences into one continuous sequence
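Assembly by overlap can be illustrated with a greedy merge of reads. This is a minimal sketch under strong assumptions (error-free reads, all from the same strand, no repeats); real assemblers use far more elaborate methods. The reads and the `min_len` threshold are made up for the example.

```python
# Greedy overlap assembly of shotgun reads into a contig (toy sketch).
# Assumptions: error-free reads, all from the same strand, no repeats.

def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0

def assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:          # nothing overlaps any more: gaps remain
            break
        merged = contigs[i] + contigs[j][size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)]
        contigs.append(merged)
    return contigs

reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]   # hypothetical reads
print(assemble(reads))
```

If some reads share no overlap, the function returns several contigs, which corresponds to gaps in the assembly.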
The “innovation” of massively parallel sequencing → reactions take place in picolitre volumes on microscopic beads
Sequencing technique → pyrosequencing
➔ Incorporation of a dNTP molecule releases PPi, which is converted to ATP. Every ATP molecule is consumed by luciferase to yield 1 flash of light
Same principle with fluorescence →
Ion torrent sequencing → incorporation of a nucleotide results in release
of a proton. Resulting pH change can be measured on a chip
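Both pyrosequencing and Ion Torrent sequencing read a signal per incorporated nucleotide. The sketch below models this as a "flowgram", assuming that the four dNTPs are supplied one at a time in a fixed, repeating order and that the signal (light or pH change) is proportional to the number of bases incorporated. The flow order and function names are illustrative assumptions.

```python
# Toy flowgram for flow-based sequencing (pyrosequencing / Ion Torrent).
# Assumption: dNTPs are flowed one at a time in a fixed, repeating order
# and the signal is proportional to the number of bases incorporated.
FLOW_ORDER = "TACG"   # illustrative flow order

def flowgram(new_strand, flow_order=FLOW_ORDER):
    """Return [(flowed base, signal)] for the strand being synthesized."""
    if set(new_strand) - set(flow_order):
        raise ValueError("strand contains a base that is never flowed")
    signals, pos, flow = [], 0, 0
    while pos < len(new_strand):
        base = flow_order[flow % len(flow_order)]
        count = 0
        while pos < len(new_strand) and new_strand[pos] == base:
            count += 1      # one PPi (or one proton) per incorporation
            pos += 1
        signals.append((base, count))
        flow += 1
    return signals

def base_call(signals):
    """Turn the flowgram back into a sequence."""
    return "".join(base * count for base, count in signals)

signals = flowgram("TTAGGGC")
print(signals)
print(base_call(signals))
```

A run of identical bases gives one stronger signal instead of several separate ones, which is why homopolymer stretches are a typical source of errors in these techniques.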
Massively parallel sequencing → differences between techniques
- Amount of DNA sequence data produced (throughput)
- Read length
- Error rate
Whole genome sequencing is used for reconstruction of genomes of extinct species such as cave bear, mammoth and Neanderthal
Polymerase Chain Reaction (PCR) → amplification of 1 specific DNA fragment
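The power of the amplification comes from its exponential nature: in the ideal case every cycle doubles the number of copies of the target fragment. The sketch below works this out; the `efficiency` parameter is an illustrative assumption (1.0 means perfect doubling).

```python
# Idealised PCR amplification: the number of copies doubles every cycle.
# The efficiency parameter is an illustrative assumption (1.0 = perfect).

def copies_after(cycles, start_copies=1, efficiency=1.0):
    """Copies of the target fragment after the given number of cycles."""
    return start_copies * (1 + efficiency) ** cycles

for n in (10, 20, 30):
    print(n, "cycles:", f"{copies_after(n):.3g}", "copies")
```

Starting from a single molecule, 30 ideal cycles already give about 10^9 copies, which explains the sensitivity of the method.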
PCR → cloning genomic fragments
- Knowledge required (for primer design) → homologous DNA sequence and/or amino acid sequence of the protein
- Advantages → sensitive & fast
- Restrictions → fragment length (< 20 kb) & need for homologous DNA sequences
RT-PCR → cloning of specific cDNA fragments (mRNA is first reverse-transcribed into cDNA; RT-qPCR is the quantitative variant)
Techniques for detecting DNA polymorphisms
- PCR-based → VNTR (Variable Number Tandem Repeats)
- PCR and sequencing-based/melting curve → SNP (Single Nucleotide Polymorphism)
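PCR-based VNTR detection relies on the fact that alleles with a different number of repeats give PCR products of different length. The sketch below illustrates this with made-up flanking sequences and a made-up repeat motif; none of these sequences come from the notes.

```python
# Toy VNTR genotyping: alleles differ in the number of tandem repeats,
# so the PCR products differ in length. All sequences are hypothetical.
import re

def repeat_count(sequence, motif):
    """Largest number of consecutive copies of motif in the sequence."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)

FLANK_5, FLANK_3, MOTIF = "ACGTTC", "GGATCA", "CAG"
allele_1 = FLANK_5 + MOTIF * 4 + FLANK_3
allele_2 = FLANK_5 + MOTIF * 7 + FLANK_3

for name, allele in (("allele 1", allele_1), ("allele 2", allele_2)):
    print(name, repeat_count(allele, MOTIF), "repeats,", len(allele), "bp")
```

On a gel the two products would run as two bands whose length difference equals the number of extra repeats times the motif length.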
Lecture 2 (06/09/2022) – Genomes, omics & bioinformatics
Bioinformatics → how to handle the enormous amount of information produced by DNA sequencing, RNA
expression, protein patterns, metabolite contents, digitalized phenotypes
There is no correlation between genome size and organismal complexity → e.g. an amoeba has a much larger genome than a human
Only 2% of the human genome codes for proteins
Repeated DNA sequences → important for regulation of gene
expression and maintaining DNA structure
Genomics
➔ DNA sequence analysis, genome fragments
➔ DNA databases (NCBI, GenBank, EMBL)
➔ DNA markers (SNPs etc.)
Utilization of databases:
- Compare unknown fragments
- Identification of gene fragments (annotation)
- Chromosome structure/synteny
- Intron/exon boundaries
- Phylogeny
Chromosomes contain many duplicated segments
→ intra and inter-chromosomal duplications
Genome annotation → process of identifying the
locations of genes and all of the coding regions in
a genome and determining what those genes do.
Once a genome is sequenced, it needs to be
annotated to make sense of it
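One of the simplest steps in locating coding regions is scanning for open reading frames (ORFs). The sketch below is a strongly reduced version of that idea and rests on several assumptions: forward strand only, no introns, ATG as the only start codon and the three standard stop codons. Real annotation pipelines combine many more kinds of evidence.

```python
# Minimal ORF scan as a first, very simplified annotation step.
# Assumptions: forward strand only, no introns, ATG start, standard stops.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) of ORFs; 0-based, end exclusive.

    min_codons counts the start and stop codon as well.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if codon == "ATG" and start is None:
                start = i
            elif codon in STOP_CODONS and start is not None:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs

dna = "CCATGAAATTTGGGTAACC"   # hypothetical sequence
print(find_orfs(dna))
```

Each reported ORF is only a candidate gene; determining what the gene does requires comparison with database sequences or experiments.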
Exon length is conserved between human, fly and worm → suggests a functional constraint imposed by the splicing machinery
Intron length is much more variable in human, peaking at 87 bp, but with a long tail extending to 3,300 bp...
➔ Suggests that exons (not introns) may be limited in size for the mRNA to be spliced correctly
Alternative splicing → splicing at sites other than the predicted (normal) splice sites
- 35% of the genes have alternative splicing
- 70% of these events lie in the coding sequence → alters the protein
- 20% add a terminal exon
Differential splicing → one gene produces different mRNAs that
code for different proteins
Synteny → preserved order of genes between related organisms
- Since the order of genes mostly has a neutral effect in eukaryotes, an organism will usually have no ill effects from having genes re-arranged
- The order of genes is generally preserved best between closely related species → conservation of the order of a cluster of genes between more distant species suggests a functional relation
Synteny helps you formulate a hypothesis for gene function
Natural selection → acts on changes in DNA that do or do not affect the encoded protein
- Ka = non-synonymous substitution rate → base change leads to a different amino acid
- Ks = synonymous substitution rate → base change leads to the same amino acid
o Ka/Ks < 1 → purifying (negative) selection; Ka/Ks ≈ 1 → no selection (neutral); Ka/Ks > 1 → positive selection
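The difference between synonymous and non-synonymous changes can be made concrete with the standard genetic code. The sketch below only counts differing codons between two aligned sequences and gives the usual textbook reading of a Ka/Ks value; it does not compute Ka and Ks themselves, since real estimators normalise per site and correct for multiple hits. The example sequences and the `tolerance` cut-off are assumptions.

```python
# Classify codon differences and interpret a Ka/Ks ratio (toy sketch).
# Assumptions: standard genetic code; aligned, gap-free coding sequences;
# simple counting of differences, without the per-site normalisation and
# multiple-hit corrections that real Ka/Ks estimators apply.
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def count_changes(seq_a, seq_b):
    """Count codons that differ synonymously and non-synonymously."""
    syn = nonsyn = 0
    for i in range(0, min(len(seq_a), len(seq_b)) - 2, 3):
        a, b = seq_a[i:i + 3], seq_b[i:i + 3]
        if a == b:
            continue
        if CODON_TABLE[a] == CODON_TABLE[b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn

def interpret(ka_ks, tolerance=0.1):
    """Textbook reading of the ratio; tolerance is an arbitrary choice."""
    if ka_ks < 1 - tolerance:
        return "purifying selection"
    if ka_ks > 1 + tolerance:
        return "positive selection"
    return "neutral"

print(count_changes("ATGCTTGAA", "ATGCTCGAT"))
print(interpret(0.2))
```

In the example the change CTT → CTC keeps leucine (synonymous), while GAA → GAT replaces glutamate with aspartate (non-synonymous).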