The genetic code and transcription I (KCSPK 13)
• This chapter focuses on the first phases of gene expression
and addresses two main questions:
- How is genetic information encoded?
- How is the information encoded by DNA transferred to
RNA?
The central dogma of molecular biology
A big picture
Characteristics of the genetic code
• Linear, ribonucleotide bases – RNA
• Three ribonucleotides – triplet codon
• Unambiguous – one codon corresponds to one amino acid
• Degenerate/redundant – 18 AAs out of 20 have more than
one codon
• Punctuation marks – start and stop codons
• Commaless – no internal “punctuation” between codons
• Non-overlapping – any single base is part of only one triplet
• Colinear – the order of codons in the mRNA matches the order of
amino acids in the protein
• Universal – used by almost all viruses, prokaryotes,
archaea and eukaryotes
AA – abbreviation for amino acid
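
The punctuation, commaless and non-overlapping properties listed above can be illustrated with a short Python sketch. The function name `read_codons` and the example mRNA are hypothetical and chosen only for illustration; the stop codons used (UAA, UAG, UGA) are those of the standard code.

```python
# Stop codons of the standard genetic code.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def read_codons(mrna):
    """Read an mRNA (written 5' to 3') as non-overlapping triplets from the first AUG."""
    start = mrna.find("AUG")          # the start codon sets the reading frame
    if start == -1:
        return []                     # no start codon: nothing to read
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]         # each base belongs to exactly one triplet
        codons.append(codon)
        if codon in STOP_CODONS:      # a stop codon ends the message
            break
    return codons


example = "GGAUGUCUUGGUAAGC"          # hypothetical sequence, not from the slides
print(read_codons(example))           # ['AUG', 'UCU', 'UGG', 'UAA']
```

Stepping through the sequence three bases at a time, with no skipped or shared bases, is what "commaless" and "non-overlapping" mean in practice.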
Developing and deciphering the genetic code
• DNA direct interaction with ribosomes?
- DNA in nucleus – ribosomes in cytoplasm
- mRNA discovered by Jacob & Monod
• 4 letters specify 20 words?
• Size of code words: 2, 3 or 4? (20 amino acids; Brenner)
- 2-letter words: not enough codons for 20 AAs
Early days (late 1950s)
DNA encodes proteins directly
First hypothesis
• Ribosomes were known
• Information in DNA was transferred to the RNA of the
ribosome → protein synthesis in the cytoplasm
• But evidence showed the presence of an unstable
intermediate (mRNA)
• And RNA of the ribosome (rRNA) was extremely stable
• How could only four bases specify 20 amino acids?
• Overlapping of DNA codes?
• Evidence did not support overlapping
Know the stop codons
AUG = start codon
Example
• Serine has 6 codons: AGU, AGC, UCU, UCC, UCA, UCG
• Tryptophan has only one codon: UGG
Note that triplets are written 5’ – 3’
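
The serine and tryptophan examples, and the "18 of 20" degeneracy figure, can be checked against the standard codon table. The sketch below is a minimal illustration, assuming the standard genetic code written as a compact 64-letter string (base order U, C, A, G; `*` marks a stop codon). The variable names are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each triplet (written 5' to 3') to its one-letter amino acid.
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Count codons per amino acid, leaving out the stop codons.
codons_per_aa = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

serine = sorted(c for c, aa in CODON_TABLE.items() if aa == "S")
tryptophan = [c for c, aa in CODON_TABLE.items() if aa == "W"]
stops = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
degenerate = sum(1 for n in codons_per_aa.values() if n > 1)

print("Serine:", serine)                 # six codons
print("Tryptophan:", tryptophan)         # ['UGG']
print("Stop codons:", stops)             # ['UAA', 'UAG', 'UGA']
print("AAs with more than one codon:", degenerate, "of", len(codons_per_aa))
```

Only methionine (AUG) and tryptophan (UGG) have a single codon, which is where the figure of 18 degenerate amino acids comes from.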
Next steps (early 1960s)
The genetic code is a triplet (Sydney Brenner)
• 4² = 4×4 = 16; 4³ = 4×4×4 = 64; 4⁴ = 4×4×4×4 = 256
• Experimental work provided solid evidence for a triplet
code (Crick, Brenner et al.)
[Figure: Frameshift mutations for beginners – one nucleotide, before mutation]
• Deletion and insertion mutations in the rII locus of phage T4
• Mutagen - the acridine dye proflavine intercalates into DNA,
causing insertions or deletions (indels) of one or more nucleotides
during replication - FRAMESHIFT MUTATIONS
• Wild-type T4 - lysis and plaque formation (infection) in
strains B and K12 of E. coli
• T4 with rII mutations - no infection in strain K12 and
infection in strain B
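
The arithmetic behind the choice of code-word size can be sketched in a few lines of Python. This is only an illustration of the counting argument; the names `code_words` and `minimum_length` are hypothetical.

```python
BASES = 4          # A, U, G, C
AMINO_ACIDS = 20   # amino acids that must be specified


def code_words(length):
    """Number of distinct code words of a given length over four bases."""
    return BASES ** length


for length in (2, 3, 4):
    n = code_words(length)
    verdict = "enough" if n >= AMINO_ACIDS else "not enough"
    print(f"{length}-letter code: {n} words, {verdict} for {AMINO_ACIDS} AAs")

# Shortest word length that can specify all 20 amino acids.
minimum_length = next(k for k in range(1, 5) if code_words(k) >= AMINO_ACIDS)
print("Minimum code-word length:", minimum_length)
```

A doublet code falls short (16 words), while a triplet code is the shortest that suffices (64 words), leaving room for degeneracy and stop codons.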
Deciphering the genetic code
E. coli K12, phage T4 and proflavine
• Frameshift mutations – first proof for triplet nature of code
[Figure: two nucleotides, before mutation]
• Crick et al. - E. coli K12, phage T4 and proflavine
• Gain/loss of 1 or 2 nucleotides caused frameshift mutation
(+/-; ++/--)
• Mutations of 3 nucleotides restored reading frame (+++/---)
[Figure, after mutation: insertion of one nucleotide in T4 vs. insertion of one triplet in T4. Label: T4 does not replicate in E. coli K12]
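
The logic of the +, ++ and +++ insertions can be mimicked with a toy sequence. The sketch below is an illustration only: the repeating `CAT` sequence and the inserted `G` bases are hypothetical and are not the actual rII sequence, and the helper names `codons` and `insert` are made up for this example.

```python
def codons(seq):
    """Split a sequence into non-overlapping triplets, dropping any leftover bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def insert(seq, pos, bases):
    """Return seq with extra bases inserted at position pos."""
    return seq[:pos] + bases + seq[pos:]


wild_type = "CATCATCATCATCAT"            # hypothetical message
plus_one = insert(wild_type, 3, "G")     # (+)   frame shifted
plus_two = insert(wild_type, 3, "GG")    # (++)  frame still shifted
plus_three = insert(wild_type, 3, "GGG") # (+++) frame restored

print(codons(wild_type))   # ['CAT', 'CAT', 'CAT', 'CAT', 'CAT']
print(codons(plus_one))    # ['CAT', 'GCA', 'TCA', 'TCA', 'TCA']
print(codons(plus_two))    # ['CAT', 'GGC', 'ATC', 'ATC', 'ATC']
print(codons(plus_three))  # ['CAT', 'GGG', 'CAT', 'CAT', 'CAT', 'CAT']
```

With one or two extra bases every downstream triplet is altered; with three, a single extra triplet appears and the downstream triplets read as in the wild type, which is the pattern expected of a triplet code.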