Bioinformatics Final Exam
questions and answers
Bioinformatics - answer a field in which
computational resources are used to store, retrieve
and analyse biological data in order to deepen one's
understanding of it
Computational biology - answer it can include
molecular systems, but it is most often applied to
biological data at the level of individual organisms
and populations, and often involves applying
mathematical models to such populations
What is a bit? - answer The unit of information: one
bit is the answer to one yes/no question
the number of bits required to determine the value of
something out of N possible values is log2(N)
for example, if we have 8 different information
states we need log2(8) = 3 bits of information to
identify each of the 8 states (i.e. we need to ask 3
yes/no questions to uniquely identify each of the 8 states)
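The log2 rule on this card can be checked with a short Python sketch. The function name `bits_needed` is made up for illustration, and rounding up for state counts that are not powers of two is an assumption the card does not state.

```python
def bits_needed(n_states: int) -> int:
    """Smallest number of yes/no questions that can tell n_states values apart."""
    if n_states < 1:
        raise ValueError("need at least one state")
    # Integer form of ceil(log2(n_states)); avoids floating-point rounding.
    return (n_states - 1).bit_length()

print(bits_needed(8))  # 3
print(bits_needed(4))  # 2
```

For 8 states this gives the 3 questions quoted on the card.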
Bits: DNA - answer normally we have 4 bases:
log2(4) = 2 bits of information (2 questions asked)
to identify whether it is an A/C/G/T
the number of bits is how many yes/no questions one
needs to ask to unambiguously determine the identity
of the symbol
We can have more than 4 letters in practice (when
we have methylated C / base J / etc.)
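As a rough illustration of the 2 bits per base on this card, the sketch below writes each base as two binary digits. The particular code table (A=00, C=01, G=10, T=11) and the function name `encode_dna` are arbitrary choices made for this example, not a standard taken from the notes.

```python
# Hypothetical code table: each base is pinned down by two yes/no answers (bits).
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}

def encode_dna(seq: str) -> str:
    """Return the 2-bit-per-base encoding of a DNA string as a string of 0s and 1s."""
    return "".join(BASE_TO_BITS[base] for base in seq.upper())

encoded = encode_dna("GATTACA")
print(encoded)       # 10001111000100
print(len(encoded))  # 14 bits for 7 bases
```

A table with more than 4 letters (for example one including methylated C) would need more than 2 bits per symbol.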
How many bits are in a byte? - answer 8 bits in a
byte
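How much information is in DNA? - answer 1 bp = 3
bits (8 states, which leaves room for more than the
4 standard bases)
human genome has approx. 3 billion bp
3 billion bp x 3 bits = 9 billion bits = about 1.1
billion bytes
therefore about 1 GB of information in the human
genome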
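The estimate on this card is plain arithmetic, sketched below in Python. The helper name `genome_size_bytes` is made up for illustration, and 1 GB is assumed to mean 10^9 bytes. The 2-bit line is added only for comparison with the 4-base case.

```python
def genome_size_bytes(n_bp: int, bits_per_bp: int) -> float:
    """Information content in bytes for n_bp positions at bits_per_bp bits each."""
    return n_bp * bits_per_bp / 8  # 8 bits in a byte

HUMAN_GENOME_BP = 3_000_000_000  # approximate figure from the notes

print(genome_size_bytes(HUMAN_GENOME_BP, 3) / 1e9)  # 1.125 (GB, at 3 bits per bp)
print(genome_size_bytes(HUMAN_GENOME_BP, 2) / 1e9)  # 0.75 (GB, at 2 bits per bp)
```

Either way the result is on the order of 1 GB.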
Application: Cellular level - answer
•Lipid modifications
•DNA
•RNA
•Proteins and modifications
Application: Organism level - answer
-Changes in molecular components
-Foetal development and tissue differentiation
-See what's happening in different cells
-In different tissues
-Different physiological states = sick vs healthy
individual (differences in chemistry)
Application: Tree of life - answer
-Over evolutionary periods
-Genomes and genome evolution in certain branches
-Universal common ancestor
-Engulfing of a proteobacterium / cyanobacterium,
which developed into the mitochondrion / chloroplast
(endosymbiont theory)
Why do we use bioinformatics to study
macromolecules? - answer a living system is very
complex
we need to study all the genes and proteins in a
living system at once
this is too difficult without bioinformatics
Using bioinformatics to look at proteins - answer
Protein activity contributes to a specific cellular
phenotype
we can look at the response of a gene (which
produces a protein) to different environments
analyse protein structure to get insight into
structural characteristics
Why are there more proteins than genes in the
genome? - answer proteins can be modified
(methylation / splicing / covalent modification),
which results in different isoforms of the protein
encoded by the same gene
mRNA (which is translated into a protein) can also be
modified by alternative splicing to code for
different proteins
Moore's Law - answer the number of transistors on
an integrated circuit doubles roughly every two years
(by comparison, the number of nt in databases doubles
every 18 months)
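To compare the two doubling times on this card, a small Python sketch can compute the growth factor after a given period. The function name `growth_factor` and the 6-year horizon are illustrative assumptions, and steady exponential growth is assumed throughout.

```python
def growth_factor(elapsed_months: float, doubling_months: float) -> float:
    """Factor by which a quantity grows if it doubles every doubling_months."""
    return 2 ** (elapsed_months / doubling_months)

# After 6 years (72 months):
print(growth_factor(72, 24))  # 8.0  (transistors, doubling every two years)
print(growth_factor(72, 18))  # 16.0 (database nt, doubling every 18 months)
```

Under these assumptions the sequence data grows twice as much as the transistor count over the same 6 years.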
Primary database: definition - answer archival
databases
contain raw sequence data deposited by
researchers
lots of redundancy and duplication of data
Primary database: Nucleotide sequence databases
- answer International Nucleotide Sequence
Database Collaboration
- GenBank (NCBI)
- European Nucleotide Archive (ENA) (EMBL-EBI)