Exam: 8 open questions
Ch1: Variability & its measurement
Introduction & variability
Classical studies of variability (pre genomics era)
Types of variability
Discrete variability => clearly distinct classes
- Discrete polymorphism
- E.g. eye colour, blood group, coat colour in mice
- Allowed Mendel to discover genes as particles
- Genes do not blend/mix: they stay intact
Quantitative variability => continuous variation
- E.g. human height, fitness, crop yield
- Often normal or Gaussian distributed
- Can continuous variation agree with genes as discrete particles?
↓
Modern evolutionary synthesis (1920s)
- Union of Mendel + Darwin, by Fisher, Haldane & Wright
- Normal distribution with many genes & environmental effects
- Complex traits affected by many genes
- Phenotypes can be continuous even with discrete genes
Both types have Mendelian basis
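A minimal sketch of why discrete genes can still give continuous, roughly normal phenotypes. It assumes a purely additive model with equal effects per locus plus Gaussian environmental noise; the function name and all parameter values (50 loci, p = 0.5, environmental SD = 1) are illustrative assumptions, not values from the course.

```python
import random
import statistics

def simulate_phenotypes(n_individuals=2000, n_loci=50, p=0.5, env_sd=1.0, seed=1):
    """Additive model: phenotype = number of '+' alleles + environmental noise."""
    rng = random.Random(seed)
    phenotypes = []
    for _ in range(n_individuals):
        # Diploid individual: 2 allele copies per locus, each '+' with probability p.
        genetic = sum(rng.random() < p for _ in range(2 * n_loci))
        environment = rng.gauss(0.0, env_sd)
        phenotypes.append(genetic + environment)
    return phenotypes

values = simulate_phenotypes()
# Expected mean is 2 * n_loci * p; a histogram of `values` looks bell-shaped.
print(round(statistics.mean(values), 2), round(statistics.stdev(values), 2))
```

With only 1 locus and no environmental noise the same model gives 3 discrete classes; adding loci and noise fills in the gaps between classes.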
Normal distribution
Phenotypic measurement of continuous variation
- Mean (µ)
- Standard deviation (σ)
- Coefficient of variation (CV) = σ/ µ
- Complex traits: no longer show typical Mendelian ratios (e.g. not 3:1)
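The three summary statistics above can be computed as follows. The height values are made up for illustration, and the sample standard deviation (n - 1 in the denominator) is used as an assumption about which estimator is meant.

```python
import statistics

# Hypothetical measurements of a continuous trait (human height in cm).
heights_cm = [172.0, 165.5, 180.2, 158.9, 175.4, 169.8]

mean = statistics.mean(heights_cm)
sd = statistics.stdev(heights_cm)   # sample SD; statistics.pstdev gives the population SD
cv = sd / mean                      # coefficient of variation, unitless

print(f"mean = {mean:.1f} cm, sd = {sd:.2f} cm, CV = {cv:.3f}")
```

Because CV is unitless, it allows comparison of variability between traits measured on different scales.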
Other types of traits:
Cryptic discrete polymorphisms => variability that cannot be seen from outside (discrete)
- E.g. blood group, chromosomal rearrangements, neutral markers (SNP)
- Conditionally expressed variation: e.g. diabetes type II polymorphisms only affect
phenotype when food is abundant
- Effect of mutations can be hidden by other genes
- Expression can depend on: background genotype (1) or on environment (2)
Categorical or meristic traits => trait values with limited number of categories (quant)
- E.g. litter size, number of seeds
- Countable traits (so not body weight: can be measured, but not counted)
Threshold traits => special type of categorical traits (quant)
- E.g. many complex diseases
- Phenotype discrete (0/1), genotype continuous (genetic liability)
- Many genotypes → 2 phenotypes
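A rough sketch of the threshold idea: liability is continuous, but only individuals above a threshold show the phenotype. The standard normal liability and the threshold of 1.645 (giving about 5% affected) are assumptions chosen for illustration.

```python
import random

def threshold_phenotypes(n=10000, threshold=1.645, seed=2):
    """Return 0/1 phenotypes from a continuous, normally distributed liability."""
    rng = random.Random(seed)
    liabilities = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Many different liabilities collapse into just 2 phenotype classes.
    return [1 if x > threshold else 0 for x in liabilities]

affected = threshold_phenotypes()
prevalence = sum(affected) / len(affected)
print(prevalence)
```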
Genetic limits of classical studies
Do not look at genes, so cannot answer key questions of population genetics:
How much variation in average gene?
What causes & maintains this variation?
What is genetic architecture of complex traits?
Molecular studies of variability (looks at genes: genomics era)
With DNA chips or DNA sequencers
Measuring variability at gene level:
Genotype frequency (fxx)
Allele frequency (p)
Proportion of polymorphic loci (P)
- #loci with more than 1 allele/total #loci
- Needs a cut-off, e.g. locus counts as polymorphic if most common allele has p ≤ 0.99
- Not very useful, relatively arbitrary
Number of distinct alleles (A)
- Depends on sample size (larger sample → more alleles)
- Ignores allele frequency
Gene diversity (H) => probability that 2 randomly chosen alleles differ
- Expected heterozygosity
- Preferred measure
- H = 1 - Σ p_i² (p_i = frequency of allele i, summed over all alleles at the locus)
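A small sketch of the gene-level measures, using the standard expected-heterozygosity definition H = 1 - Σ p_i². The sample of allele copies and the function names are hypothetical.

```python
from collections import Counter

def allele_frequencies(alleles):
    """Map each allele to its frequency in a list of sampled allele copies."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}

def gene_diversity(alleles):
    """H = 1 - sum(p_i^2): probability that 2 randomly chosen alleles differ."""
    freqs = allele_frequencies(alleles)
    return 1.0 - sum(p * p for p in freqs.values())

# Hypothetical sample of 10 allele copies at 1 locus.
sample = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
freqs = allele_frequencies(sample)
n_alleles = len(freqs)          # number of distinct alleles (A)
h = gene_diversity(sample)      # gene diversity (H)
print(n_alleles, round(h, 2))
```

Unlike the number of distinct alleles, H barely changes when a rare allele is added to the sample, which is one reason it is the preferred measure.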
Measuring DNA sequence variability at nucleotide level: (very important for this course)
Nucleotide diversity (π) => probability that 2 randomly chosen sequences differ at a randomly chosen site
- Averages over all nucleotides (sites) in sample
- Analogous to H, but then for nucleotides instead of loci
- Sample size = k
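Watterson’s theta (ϴW)
- ϴW = Sk / ak
- Sk = #polymorphic (segregating) sites in sample of k sequences
- ak = correction for sample size = 1 + 1/2 + 1/3 + ... + 1/(k-1)
- Large samples: ak ≈ ln(k) + 0.577
- On average ϴW = π (same sample size k; expected under neutrality & constant population size)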
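A sketch of both nucleotide-level measures on a toy alignment. It assumes aligned sequences of equal length without gaps and expresses both estimates per site (so ϴW is divided by the sequence length); the 4 sequences and the function names are made up.

```python
from itertools import combinations

def segregating_sites(seqs):
    """S_k: number of alignment columns with more than 1 nucleotide."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))

def nucleotide_diversity(seqs):
    """pi: mean proportion of differing sites over all pairs of sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)

def watterson_theta(seqs):
    """Per-site Watterson estimator: S_k / (a_k * L)."""
    k = len(seqs)
    a_k = sum(1.0 / i for i in range(1, k))   # 1 + 1/2 + ... + 1/(k-1)
    return segregating_sites(seqs) / (a_k * len(seqs[0]))

# Hypothetical sample: k = 4 sequences of 10 sites.
sample = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
s_k = segregating_sites(sample)
pi_hat = nucleotide_diversity(sample)
theta_w = watterson_theta(sample)
print(s_k, round(pi_hat, 4), round(theta_w, 4))
```

π weights each polymorphic site by its allele frequencies, whereas ϴW only counts whether a site is polymorphic, so the two can differ in a real sample even though they estimate the same quantity.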
Genetic variation is abundant
Lessons from sequence information: SNPs
SNP diversity has + relationship with population size
- Due to drift: weaker in large populations, so more variation is retained (coalescent theory)
SNP => single nucleotide polymorphism
- Very common → genetic variation is abundant
Many SNPs are silent
- Silent = no change in amino acid
- Most observed SNPs are non-coding or synonymous, which suggests non-silent mutations are mostly deleterious
- Non-silent SNPs have apparently been removed by natural selection
InDels are common in non-coding regions
- Deleterious in coding regions
Hardy Weinberg
Haplotypes
Each diploid individual carries 2 haplotypes
- 1 from mother, 1 from father
#haplotypes not good diversity measure, because it depends on:
- Length of haplotypes
- Sample size
- Recombination generates new haplotypes
1 gene → 1 phenotype?
Simple Mendelian traits are exception, not rule
Complex relationship, not 1:1
- 1 trait influenced by many genes
- Pleiotropy => single gene affects >1 trait
- Epistasis => interactions between genes
- Environmental influences
Cause & maintenance of variability
Hardy Weinberg equilibrium (HWE)
Only when no forces act on population
- No selection
- No mutation
- No migration
- No random events (large population → no drift)
- Random mating
HW law
- Allele & genotype frequencies remain constant
- Variability maintained forever
- Serves as null hypothesis
HW law: maintaining variation
1 locus, 2 alleles (A & B)
- Frequency of A = p
- Frequency of B = q
- p + q = 1
- p = fAA + ½ * fAB (always valid)
- p = √fAA (from fAA = p²) only if you know HW is true, so it cannot be used to test for HWE
Random mating
- Next generation: p’ = p² + ½ * 2pq = p & q’ = q (allele frequencies unchanged)
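A short numerical check that 1 round of random mating leaves the allele frequency unchanged. The starting value p = 0.3 and the function names are arbitrary choices for illustration.

```python
def hw_genotype_frequencies(p):
    """Genotype frequencies after random mating: p^2, 2pq, q^2."""
    q = 1.0 - p
    return {"AA": p * p, "AB": 2 * p * q, "BB": q * q}

def allele_frequency(genotype_freqs):
    """Allele counting: p = fAA + 1/2 * fAB (valid with or without HWE)."""
    return genotype_freqs["AA"] + 0.5 * genotype_freqs["AB"]

p = 0.3
offspring = hw_genotype_frequencies(p)
p_next = allele_frequency(offspring)
print(offspring, p_next)
```

Repeating the 2 steps any number of times gives the same p, which is what "variability maintained forever" means in the absence of other forces.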
Testing for deviations from HWE
Genotype   Observed   Expected under HWE (n = 100)
AA         36         25 (p² * n)
AB         28         50 (2pq * n)
BB         36         25 (q² * n)
p = (36 + ½ * 28) / 100 = 0.5, q = 0.5
Test for significance:
- Goodness of fit test
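- Chi-square distribution with df = 1 (1 locus, 2 alleles: 3 genotype classes - 1 - 1 estimated allele frequency)
- O & E in χ² = Σ (O - E)² / E are #individuals (not frequencies)
- P > 0.05 → consistent with HWE, P < 0.05 → deviation from HWE (this P is the P-value, not the allele frequency)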
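A sketch of the goodness-of-fit test applied to the counts above (36 AA, 28 AB, 36 BB). It uses only the standard library; the P-value formula erfc(√(χ²/2)) is specific to df = 1, and the function name is an assumption of this example.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for HWE at 1 locus with 2 alleles."""
    n = n_aa + n_ab + n_bb
    p = (n_aa + 0.5 * n_ab) / n          # allele counting, not sqrt(fAA)
    q = 1.0 - p
    observed = [n_aa, n_ab, n_bb]
    expected = [p * p * n, 2 * p * q * n, q * q * n]   # numbers of individuals
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with df = 1.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, expected, chi2, p_value

p, expected, chi2, p_value = hwe_chi_square(36, 28, 36)
print(p, expected, round(chi2, 2), p_value)
```

Here χ² = 19.36, well above the 5% critical value of 3.84 for df = 1, so HWE is rejected: there are too few heterozygotes. The sketch does not handle the case where an expected count is 0.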
Use of HW law
- Testing whether gene is under selection
- Testing whether mating is non-random
- Calculate fraction of carriers of disease in population
- Calculate probability that individual will suffer from disease
- Calculate change of allele frequency due to selection
- Validity of quantitative genetics
HWE & recessive diseases: cystic fibrosis
Recessive diseases => only homozygous individuals affected
- q² = affected
- 2pq = carrier
Recessive diseases linked to inbreeding & lack of migration
- More homozygotes → more individuals affected by recessive disease
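A sketch of how the carrier fraction follows from disease incidence, assuming HWE holds. The incidence of 1 in 2500 is an illustrative figure of the order often quoted for cystic fibrosis, not a value given in these notes.

```python
import math

def carrier_frequency(incidence):
    """From q^2 = incidence of a recessive disease, return (q, 2pq) under HWE."""
    q = math.sqrt(incidence)   # only valid because HWE is assumed
    p = 1.0 - q
    return q, 2.0 * p * q

q, carriers = carrier_frequency(1 / 2500)
print(q, carriers, 1 / carriers)
```

With these numbers about 1 in 25 individuals is a carrier, although only 1 in 2500 is affected: most copies of a rare recessive allele sit in heterozygotes.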
Non-random mating & other deviations from HWE
Assortative mating