Introduction to bioinformatics
Lecture 2 – 10 January 2024
Protein structures & sequence profiles
Typical BLAST output
Colors indicate the alignment score: high scores are red, low scores are black.
The query sequence is shown with its length (4000).
Matching sequences (proteins) are drawn below it.
The hits are sorted from top to bottom by decreasing alignment score, so the best alignment is on top.
Protein homologs are evolutionarily related. The proteins look similar because they started from the
same point (a common ancestor).
Possible output in more detail
Zoomed in
It’s a protein sequence, because there are more than 4 different letters (all sequences start with M).
Columns that are similar are highlighted, so the best-matching columns stand out.
We want to recognize the conserved areas / similar parts (marked with *).
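A minimal sketch of how the * marks under an alignment could be computed. It assumes that * means "every sequence has the same residue in this column", which is the common convention but is not stated in the lecture. The function name and the example sequences are made up for illustration.

```python
def conservation_line(aligned_seqs):
    """Return a string with '*' under fully conserved columns, ' ' elsewhere.

    Assumption: a column counts as conserved only if all sequences share
    the same residue and none of them has a gap ('-') there.
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    marks = []
    for column in zip(*aligned_seqs):  # walk through the alignment column by column
        residues = set(column)
        conserved = len(residues) == 1 and "-" not in residues
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Hypothetical toy alignment (not from the lecture slides).
alignment = [
    "MKTAYIAK",
    "MKSAYLAK",
    "MRTAY-AK",
]

for seq in alignment:
    print(seq)
print(conservation_line(alignment))
```

Real alignment viewers usually also mark columns that are similar but not identical; this sketch only handles the fully conserved case.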
This is a protein family that is found in different species.
Some regions are highly variable, others are conserved.
Regions that are conserved are conserved for a reason: function.
Do all sequences have a common ancestor? What happens in between?
- Mutations happen randomly
- Over a long time, both branches get about the same number of mutations
- Natural selection: organisms with a harmful mutation died
- Must every surviving mutation be "legal"?
Running BLAST gives information about evolutionary relationships.
Alignments are summarized, because they are big.
Profiles are used to summarize an alignment. A PSSM (position-specific scoring matrix) is a type of profile.
Going down (rows) are the amino acids.
To the right (columns) are the sequence positions, with a score for each amino acid at each position.
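A minimal sketch of how a PSSM could be built from an alignment, to make the rows-and-columns layout concrete. The lecture does not say how the scores are calculated; this sketch assumes log-odds scores with a uniform background frequency and a pseudocount of 1. Tools such as PSI-BLAST use more refined weighting. All names and the toy alignment are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned_seqs, pseudocount=1.0):
    """Build a PSSM as {amino_acid: [score at position 0, 1, ...]}.

    Assumptions (not from the lecture): uniform background frequency
    (1/20 per amino acid) and a simple additive pseudocount.
    Gaps and unknown letters are ignored when counting.
    """
    background = 1.0 / len(AMINO_ACIDS)
    n_positions = len(aligned_seqs[0])
    pssm = {aa: [] for aa in AMINO_ACIDS}

    for pos in range(n_positions):
        counts = Counter(
            seq[pos] for seq in aligned_seqs if seq[pos] in AMINO_ACIDS
        )
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            # Positive score: more frequent here than expected by chance.
            pssm[aa].append(math.log2(freq / background))
    return pssm


def score_sequence(pssm, seq):
    """Sum the position-specific scores of an ungapped sequence."""
    return sum(pssm[aa][i] for i, aa in enumerate(seq))


# Hypothetical toy alignment (not from the lecture slides).
toy_alignment = ["MKT", "MKS", "MRT"]
toy_pssm = build_pssm(toy_alignment)

# One row per amino acid, one column per sequence position.
for aa in "MKRW":
    print(aa, [round(s, 2) for s in toy_pssm[aa]])
```

A sequence that matches the conserved positions gets a higher total from `score_sequence` than one that does not, which is what makes a profile useful for finding more distant family members.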
…