
Large scale analysis of biomedical data summary

Pages: 50
Uploaded on: 28-10-2025
Written in: 2024/2025

Summary of all the PowerPoints together with own notes


LARGE SCALE ANALYSIS OF
BIOMEDICAL DATA
1. LESSON 1


1.1 DEFINE GOALS AND OBJECTIVES
Data science process
- Collection
- Cleaning
- Exploratory data analysis
- Model building
- Model deployment


GOALS
e.g.
- Development and implementation of protein-based assay to diagnose a specific cancer type
- Evaluation of a novel drug to treat colon cancer
- Prediction of cardiovascular disease based on blood counts and genomic data


(DATA-MINING) OBJECTIVE
Focus on the data mining objective: identify a specific goal or question that you want to answer
using data mining techniques.
e.g.
- What proteins are differentially expressed in healthy vs diseased tissue
- What regulatory pathways are affected upon drug treatment in cell lines
- Does treatment A result in more pronounced tumor shrinkage in mice compared to conventional
therapy
- Comparative analysis of the blood cell count in 2 patient groups
SMART ->
- Specific = who and what
- Measurable = by how much
- Achievable = how
- Relevant = why
- Time-bound = when

Know what kind of research you will perform: exploratory vs descriptive
- Exploratory -> look for differences between e.g. healthy and unhealthy tissue
- Descriptive -> you know specifically what you want to check
- Then you can try to get your data




1.2 EXPERIMENTAL/STUDY DESIGN: SAMPLE NUMBERS, CONFOUNDING,…

1. How will your study be designed?
- Garbage in = garbage out
- (start with which kind of data you need, and what kind of samples and how many you need)
- Data type, samples, experiment

2. What datatype is needed to answer the question?
- Clinical data
- Imaging data
- Transcriptomics, genomics data
- Flow cytometry
- Proteomics

3. Will you generate your own data or repurpose published data?
- Sometimes a database is already available and you don't need to go to the lab
- There is a lot online -> test your hypotheses = repurpose data

4. What sample will be used?
Prospective
A prospective study watches for outcomes, such as the development of a disease, during the study
period and relates this to other factors such as suspected risk or protection factor(s). The study
usually involves taking a cohort of subjects and watching them over a long period. The outcome of
interest should be common; otherwise, the number of outcomes observed will be too small to be
statistically meaningful (indistinguishable from those that may have arisen by chance). All efforts
should be made to avoid sources of bias such as the loss of individuals to follow up during the study.
Prospective studies usually have fewer potential sources of bias and confounding than retrospective
studies. Prospective investigation is required to make precise estimates of either the incidence of an
outcome or the relative risk of an outcome based on exposure.

Retrospective
A retrospective study looks backwards and examines exposures to suspected risk or protection factors
in relation to an outcome that is established at the start of the study. Many valuable case-control
studies were retrospective investigations. Most sources of error due to confounding and bias are more
common in retrospective studies than in prospective studies. For this reason, retrospective
investigations are often criticized. If the outcome of interest is uncommon, however, the size of
prospective investigation required to estimate relative risk is often too large to be feasible. In
retrospective studies the odds ratio provides an estimate of relative risk. You should take special care
to avoid sources of bias and confounding in retrospective studies.
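
The relation between odds ratio and relative risk can be illustrated with a small sketch. The 2x2 counts below are made up for illustration and do not come from the course material; the function names are hypothetical as well. The sketch shows that the odds ratio is close to the relative risk when the outcome is rare, and drifts away from it when the outcome is common.

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed.

    a: exposed with outcome, b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds of the outcome in the exposed divided by odds in the unexposed."""
    return (a * d) / (b * c)


# Hypothetical counts (assumption, for illustration only)
rare = (10, 990, 5, 995)       # outcome in about 1% of subjects
common = (400, 600, 200, 800)  # outcome in 20-40% of subjects

rr_rare, or_rare = relative_risk(*rare), odds_ratio(*rare)
rr_common, or_common = relative_risk(*common), odds_ratio(*common)

print(f"rare outcome:   RR={rr_rare:.2f}  OR={or_rare:.2f}")
print(f"common outcome: RR={rr_common:.2f}  OR={or_common:.2f}")
```

In a case-control design the ratio of cases to controls is set by the investigator, so the risks themselves cannot be estimated; that is why the odds ratio is used there.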

Case-control studies
Case-control studies are usually but not exclusively retrospective. The following notes relate
case-control to cohort studies:
- outcome is measured before exposure/biomarker test
- controls are selected on the basis of not having the outcome
- good for rare outcomes
- quicker to complete
- prone to selection bias
- prone to recall/retrospective bias (participants do not remember previous events)
Cohort studies
Cohort studies are usually but not exclusively prospective. The following notes relate cohort to
case-control studies:
- outcome is measured after exposure/biomarker test
- yields true incidence rates and relative risks
- may uncover unanticipated associations with outcome
- best for common outcomes
- takes a long time to complete
- prone to attrition bias (unequal loss of participants)
- prone to the bias of change in methods over time


WHAT IS EXPERIMENT DESIGN? -> HOW DO YOU ORGANISE YOUR EXPERIMENT AND GENERATE
THE DATA TO LEARN ABOUT AN A PRIORI DEFINED HYPOTHESIS OR ANSWER THE BIOLOGICAL
QUESTION OF INTEREST

Define the factors you are interested in
- different treatments or combinations to test
- concentrations of compound
Be aware of confounding factors: they influence the result but you are not interested in them
= the inability to distinguish the effect of one factor from the effect of another: interesting vs nuisance
You can have biological effects and technical effects
- batch (tissue from a different city = different moment, hospital, machine)
- plate effect
Example: imagine a study investigating the effect of a new drug on blood pressure. If a
confounding variable (like age) is not controlled for, older patients, who may have higher blood
pressure and be more likely to be prescribed the drug, might skew the results, making it appear that
the drug is more effective or less effective than it actually is.
Examples -> see slide 25
- layout of 96-well plate
- organisation of mice in cages
- batches of materials used
Batches:
How do you know whether you have batches, e.g. in an RNA-seq experiment?
- Were all RNA isolations performed on the same day?
- Were all library preparations performed on the same day?
- Did the same person perform the RNA isolation/library preparation for all samples?
- Did you use the same reagents for all samples?
- Did you perform the RNA isolation/library preparation in the same location?
- If any of the answers is no -> batches

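The batch checklist can be turned into a quick check on the sample metadata. This is a minimal sketch: the metadata fields (`isolation_day`, `library_day`, `operator`, `reagent_lot`, `location`), the sample values, and the function name are all hypothetical. Any field that takes more than one value across samples corresponds to a "no" in the checklist and therefore defines batches.

```python
def find_batch_variables(samples, keys):
    """Return the metadata keys that take more than one value across samples."""
    batch_vars = {}
    for key in keys:
        values = sorted({sample[key] for sample in samples})
        if len(values) > 1:
            batch_vars[key] = values
    return batch_vars


# Hypothetical sample sheet (assumption, for illustration only)
samples = [
    {"sample": "S1", "isolation_day": "day1", "library_day": "day3",
     "operator": "A", "reagent_lot": "lot1", "location": "lab1"},
    {"sample": "S2", "isolation_day": "day1", "library_day": "day3",
     "operator": "A", "reagent_lot": "lot1", "location": "lab1"},
    {"sample": "S3", "isolation_day": "day2", "library_day": "day3",
     "operator": "B", "reagent_lot": "lot1", "location": "lab1"},
]
checklist = ["isolation_day", "library_day", "operator", "reagent_lot", "location"]

batches = find_batch_variables(samples, checklist)
for key, values in batches.items():
    print(f"{key}: {values} -> batches")
```
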
How bad is confounding?
- Complete confounding = impossible to fix after the experiment
This occurs when the association between the exposure and the outcome is completely distorted
by the confounder. In other words, if the confounder is not controlled for, there appears to be a
strong relationship between the exposure and the outcome that is in reality entirely due to the
confounder.
- With incomplete confounding -> try to work around it in the analysis; statistical power often suffers
This occurs when the confounder distorts the association between the exposure and the outcome,
but not completely. The confounder has an influence, but part of the association is still not
explained by the confounder.

- Some confounding is worse than others; it depends on the effect of the confounding factor

How to find confounding
- Be wary of unexpectedly good separation between groups
- Make plots to visualize all factors in the experiment
- Slide 30 -> design 2 is better -> for design 1: if you see a difference, you don't know whether it is
because of the treatment or the line
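
Besides plotting, a simple cross-tabulation of the factor of interest against a nuisance factor can reveal confounding. The sketch below is an illustration under assumptions: the two example designs, the field names `treatment` and `batch`, and the function names are hypothetical and are not the designs shown on the slide. If every batch contains only one treatment level, the treatment effect cannot be separated from the batch effect.

```python
from collections import Counter


def crosstab(samples, factor, nuisance):
    """Count samples for every (factor level, nuisance level) combination."""
    return Counter((s[factor], s[nuisance]) for s in samples)


def is_completely_confounded(samples, factor, nuisance):
    """True if each nuisance level contains only one level of the factor."""
    levels_per_nuisance = {}
    for s in samples:
        levels_per_nuisance.setdefault(s[nuisance], set()).add(s[factor])
    n_factor_levels = len({s[factor] for s in samples})
    return n_factor_levels > 1 and all(
        len(levels) == 1 for levels in levels_per_nuisance.values()
    )


def make_design(pairs):
    """Build a list of sample records from (treatment, batch) pairs."""
    return [{"treatment": t, "batch": b} for t, b in pairs]


# Hypothetical designs (assumption, for illustration only)
confounded = make_design(
    [("control", "B1")] * 4 + [("drug", "B2")] * 4
)
balanced = make_design(
    [("control", "B1"), ("drug", "B1")] * 2 + [("control", "B2"), ("drug", "B2")] * 2
)

for name, design in [("confounded", confounded), ("balanced", balanced)]:
    print(name, dict(crosstab(design, "treatment", "batch")),
          is_completely_confounded(design, "treatment", "batch"))
```
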
Solution
- It is easiest to avoid confounding
o Exclude nuisance factors if possible, balance biological factors if possible + randomise if
possible
o Full randomization = a lot of work, but often an acceptable compromise exists




To avoid confounding in mouse experiments
- Ensure the animals in each condition are all the same sex, age, litter, batch
- If not possible -> ensure the animals are split equally between conditions
Avoid batch effects
- Design the experiment in a way that avoids batches, if possible
- If unable to avoid batches
o Do not confound your experiment by batch
o Do split replicates of the different sample groups across batches -> the more replicates the
better (the effect of the treatment must not be confused with the batch)
o Do include batch information in your experimental metadata; this helps later in the analysis.
During the analysis, we can regress out the variation due to batch, if it is not confounded and we
have that information, so that it doesn't affect our result
o Important = randomization
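
What "regressing out" the batch means can be sketched with a deliberately simple correction: subtract each batch mean and add back the overall mean. This is an assumption-laden simplification, not the method from the course: it matches a linear model with batch as a covariate only for a balanced design, and the measurements and function names are made up. The second example shows why complete confounding cannot be fixed: removing the batch also removes the treatment effect.

```python
from statistics import mean


def remove_batch_effect(values, batches):
    """Centre every batch on the grand mean (simplified batch correction)."""
    grand_mean = mean(values)
    batch_means = {
        b: mean([v for v, vb in zip(values, batches) if vb == b])
        for b in set(batches)
    }
    return [v - batch_means[b] + grand_mean for v, b in zip(values, batches)]


def group_difference(values, conditions, treated="drug", reference="control"):
    """Mean of the treated group minus mean of the reference group."""
    treated_vals = [v for v, c in zip(values, conditions) if c == treated]
    reference_vals = [v for v, c in zip(values, conditions) if c == reference]
    return mean(treated_vals) - mean(reference_vals)


# Hypothetical data: true treatment effect +2, batch B2 shifted by +5
conditions = ["control", "drug", "control", "drug"]
values_balanced = [10, 12, 15, 17]
batches_balanced = ["B1", "B1", "B2", "B2"]  # each batch holds both conditions

corrected = remove_batch_effect(values_balanced, batches_balanced)
effect_balanced = group_difference(corrected, conditions)

# Same effects, but every control is in B1 and every treated sample in B2
conditions_conf = ["control", "control", "drug", "drug"]
values_conf = [10, 10, 17, 17]
batches_conf = ["B1", "B1", "B2", "B2"]

corrected_conf = remove_batch_effect(values_conf, batches_conf)
effect_confounded = group_difference(corrected_conf, conditions_conf)

print("balanced design, effect after correction:", effect_balanced)
print("confounded design, effect after correction:", effect_confounded)
```
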
Three important things to avoid batch effects
- No shuffling of an effect leads to uncorrectable confounding
- Randomization
- Blocking of an effect -> balanced design

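Randomization and blocking can be combined when assigning samples to batches. The following is a minimal sketch, assuming hypothetical mouse IDs, two conditions and two batches; the function name and the fixed seed are illustrative choices. Each condition is shuffled and then dealt evenly over the batches, so every batch ends up with the same number of samples per condition.

```python
import random


def blocked_randomization(samples_by_condition, n_batches, seed=0):
    """Spread each condition evenly over batches, in random order."""
    rng = random.Random(seed)  # fixed seed so the layout can be reproduced
    batches = [[] for _ in range(n_batches)]
    for condition, samples in samples_by_condition.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)  # randomization within the condition
        for i, sample in enumerate(shuffled):
            # dealing round-robin gives the balanced (blocked) design
            batches[i % n_batches].append((sample, condition))
    for batch in batches:
        rng.shuffle(batch)  # randomise processing order within the batch
    return batches


# Hypothetical animals (assumption, for illustration only)
mice = {
    "control": ["m1", "m2", "m3", "m4"],
    "drug": ["m5", "m6", "m7", "m8"],
}
layout = blocked_randomization(mice, n_batches=2)
for number, batch in enumerate(layout, start=1):
    print(f"batch {number}: {batch}")
```

Recording the resulting layout in the experimental metadata keeps the batch information available for the analysis.
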
Decision about type and number of replicates

Genuine replication vs pseudoreplication
- Genuine replicates -> increase sample size

Seller: laurinevandewiele92