Introduction to bioinformatics
Lecture 4 – 17 January 2024
Transcriptomics // RNA-sequencing
Gene expression basics
Information stored in DNA (genetic information)
Proteins are the machines that do stuff in the cells
mRNA levels reflect gene expression
The DNA in all the cells of your body is nearly identical, but cells look different because the
proteins inside each cell are different.
So the DNA (template) is the same, but the proteins differ because different pieces of DNA
(genes) are expressed. Different genes are transcribed, giving different kinds of mRNA.
Measure which mRNAs are present and how much of each
Gene expression determines how a cell behaves
Regulated by growth factors
Receptors are activated by growth factors
Activated transcription factors determine which genes are on and off, and so which proteins are made
Cancer: mutation in signaling → increased signaling → increased expression → increased
growth / uncontrolled cell growth
How many genes does a human approximately have?
C: 20,000
Measuring gene expression means measuring about 20,000 genes
Not all genes are expressed
Eukaryotic transcription
Genes consist of multiple exons
DNA consists of exons and introns; only the exons remain in the mRNA
Genes can have multiple isoforms
One gene can form different proteins. They can be similar, but also very different.
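As a rough illustration of how one gene can give several isoforms, the sketch below joins the exons of a made-up gene and leaves out the introns. The sequences, the exon names and the `splice` helper are all hypothetical, and skipping an exon is only one of the ways isoforms can arise. Sequences are written with DNA letters for simplicity.

```python
# Hypothetical toy gene: names and sequences are invented for illustration.
gene = [
    ("exon1", "ATGGCC"),
    ("intron1", "GTAAGT"),
    ("exon2", "TTTGGA"),
    ("intron2", "GTCCAG"),
    ("exon3", "CCCTAA"),
]


def splice(gene, skipped_exons=()):
    """Join the exons in order, dropping introns and any skipped exons."""
    return "".join(
        seq
        for name, seq in gene
        if name.startswith("exon") and name not in skipped_exons
    )


# Isoform A keeps every exon; isoform B skips exon2.
isoform_a = splice(gene)
isoform_b = splice(gene, skipped_exons={"exon2"})

print(isoform_a)
print(isoform_b)
```

Both isoforms come from the same gene, but the resulting mRNA (and so the protein) differs.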
Measuring gene expression: sequencing, mapping and quantifying mRNA
It’s much harder to measure proteins
Methods to measure gene expression
- Low throughput (1-10 genes)
Cheap, measures just one mRNA at a time
o qPCR
e.g. the COVID PCR test
o mRNA FISH
Method to visualize mRNA molecules
You can see the mRNA without breaking down the cell, so you learn not only which
mRNA is present but also where. Higher throughput than a couple of years ago
- High throughput (almost all genes = 20,000)
o Microarrays
Measure gene expression on arrays (spots on a glass plate). Each
gene sticks to a different spot
o (Single-cell) RNA-seq
Measures RNA in a single cell, but is also possible for many cells at once
RNA-seq workflow and data (pre)-processing
Sequencing:
- mRNA is fragmented and reverse transcribed into cDNA
- then, the cDNA is sequenced
Mapping and quantification:
- map the reads to genes
- count how many reads map to each gene
o output: count matrix (genes x samples)
o more counts → higher expression
RNA is cut up into small pieces
Small DNA molecules (cDNA) complementary to the RNA are made
DNA is easy to sequence, which makes the RNA easy to measure
Count how often each DNA fragment is found
The more copies of the DNA, the more RNA there was, and the more counts
The more counts, the more DNA copies, and so the higher the expression
Count matrix: genes × samples
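The counting step can be sketched in a few lines. This is a minimal example that assumes mapping has already been done, so each read is represented only by the name of the gene it mapped to. The sample names, gene names and read lists are invented for illustration.

```python
from collections import Counter

# Hypothetical mapping result: for each sample, the gene each read mapped to.
mapped_reads = {
    "sample_X": ["geneA", "geneA", "geneB", "geneA", "geneC"],
    "sample_Y": ["geneB", "geneB", "geneA", "geneC", "geneC", "geneC"],
}

samples = sorted(mapped_reads)
genes = sorted({g for reads in mapped_reads.values() for g in reads})

# Count how many reads map to each gene, per sample.
counts = {s: Counter(mapped_reads[s]) for s in samples}

# Count matrix: one row per gene, one column per sample.
count_matrix = [[counts[s][g] for s in samples] for g in genes]

print("gene", *samples, sep="\t")
for g, row in zip(genes, count_matrix):
    print(g, *row, sep="\t")
```

Each row is a gene and each column a sample, so a larger number in a cell means more reads for that gene in that sample.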
Biases in RNA-seq: which effects determine how many reads I count for gene A in
sample X?
- Expression of gene A in sample X (This is what we want to measure!)
- Gene length. A longer gene gives more fragments, so more chance to sequence it
- Position in the genome. Some parts of the genome are easier to sequence
- Library size (more reads from a sample → more reads per gene)
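Two of the biases above, library size and gene length, can be corrected with simple arithmetic. The sketch below shows two commonly used normalizations, counts per million (CPM) and transcripts per million (TPM). It is not necessarily the method used in this course, and the counts, gene lengths and function names are made up for illustration. Real pipelines usually use transcript or effective lengths rather than a single gene length.

```python
# Hypothetical raw counts (genes x samples) and gene lengths in base pairs.
counts = {
    "geneA": {"sample_X": 100, "sample_Y": 200},
    "geneB": {"sample_X": 300, "sample_Y": 600},
}
gene_length_bp = {"geneA": 1000, "geneB": 3000}
samples = ["sample_X", "sample_Y"]


def cpm(counts, samples):
    """Counts per million: divide each count by the sample's library size."""
    lib = {s: sum(row[s] for row in counts.values()) for s in samples}
    return {
        g: {s: row[s] / lib[s] * 1e6 for s in samples}
        for g, row in counts.items()
    }


def tpm(counts, gene_length_bp, samples):
    """Transcripts per million: correct for gene length, then library size."""
    # Reads per kilobase of gene.
    rate = {
        g: {s: row[s] / (gene_length_bp[g] / 1000) for s in samples}
        for g, row in counts.items()
    }
    total = {s: sum(r[s] for r in rate.values()) for s in samples}
    return {
        g: {s: r[s] / total[s] * 1e6 for s in samples}
        for g, r in rate.items()
    }


cpm_values = cpm(counts, samples)
tpm_values = tpm(counts, gene_length_bp, samples)
print(cpm_values)
print(tpm_values)
```

In this toy data sample_Y has twice the library size of sample_X, so the raw counts double but the CPM values are the same in both samples. geneB gets three times as many reads as geneA only because it is three times longer, and after TPM both genes have the same value.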