Summary

Summary notes: Large Scale Analysis of Biomedical Data

Pages
38
Uploaded on
20-12-2025
Written in
2025/2026

Summary notes of the course 'Large Scale Analysis of Biomedical Data', taught by professors De Preter, Everaert, Gabriels, Bouwmeester, Colpaert and Rashidian, among others. Based on the PowerPoint slides and the lessons.


LARGE SCALE ANALYSIS OF BIOMEDICAL DATA
Data-mining workflow
Machine learning and data visualisation
Research data management
Real world data
Generative AI
Encryption





DATA-MINING WORKFLOW




FROM QUESTION TO DATA
DEFINE PROJECT AIM – QUESTION – HYPOTHESIS – OBJECTIVES
• Goal: broad – long-term outcome – vision – impact
  → Broad and visionary
• Aim: purpose – overall objective – research aim
  → Focused and general
• Research question: central scientific question
  → Precise and interrogative
• Hypothesis: testable statement predicting a relationship between variables
  → Predictive and testable
• Objectives: specific measurable steps
  → Concrete and actionable
  o Define which proteins are differentially expressed in healthy vs. diseased tissues
  o Identify the regulatory pathways that are affected upon drug treatment in cell lines
  o Determine whether treatment A results in more pronounced tumour shrinkage in mice compared to conventional therapies
  o Compare the blood cell counts in patient group 1 versus patient group 2
! Explorative research: tentative – little is known yet
  vs. descriptive research: conclusive – explore and explain a situation


EXPERIMENTAL AND STUDY DESIGN
= how do you organise your experiment and generate the data to learn about an
a priori defined hypothesis or answer the biological question of interest





FACTORS OF INTEREST
• What experiments will you set up – what samples/material will you analyse – sample collection
  e.g. concentrations of compound
• Prospective vs. retrospective
  → Prospective: watches for the outcome and relates it to other factors
    o Take a cohort of subjects
    o Watch over a long period
    o Minimise bias and loss to follow-up
    ! mostly cohort studies
    - Outcome is measured after exposure/test
    - Yields true incidence and relative risks
    - May uncover unanticipated associations
    - Best for common outcomes
    - Takes a long time to complete
    - Prone to attrition bias
    - Prone to the bias of change in methods over time
  → Retrospective: looks backwards and examines exposure to risk or protection factors
    o Minimise bias and confounding
    ! mostly case-control studies
    - Outcome is measured before exposure/test
    - Controls: selected on not having the outcome
    - Good for rare outcomes
    - Quicker to complete
    - Prone to selection bias
    - Prone to recall/retrospective bias
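The difference in what the two designs can estimate can be illustrated with a small sketch. A cohort follows exposed and unexposed subjects, so incidence and the relative risk can be computed directly. A case-control study fixes the number of cases and controls in advance, so it is usually summarised with an odds ratio instead. The function names and all counts below are hypothetical and only serve as an illustration.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Cohort design: incidence in the exposed divided by incidence in the unexposed."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    return incidence_exposed / incidence_unexposed


def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Case-control design: odds of exposure among cases divided by odds among controls."""
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed subjects develop the outcome.
rr = relative_risk(30, 100, 10, 100)

# Hypothetical case-control study: 40 of 50 cases and 20 of 50 controls were exposed.
or_ = odds_ratio(40, 10, 20, 30)

print(f"relative risk: {rr:.2f}")
print(f"odds ratio:    {or_:.2f}")
```

In the case-control example the incidence cannot be recovered, because the ratio of cases to controls was chosen by the investigator.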



CONFOUNDING
= factors that influence the result but are not of interest themselves
e.g. layout of 96-well plate – organisation of mice in cages – batches of materials used

• Batches: experiments performed on different days – by different people – with different reagents – at different locations
  ! not all batch effects are confounding: some are only random noise
• Confounding: inability to distinguish the effect of one factor (interesting) from the effect of another (confounding)

• Severity
  o Complete confounding: impossible to fix after the experiment
  o Incomplete confounding: can be worked around in the analysis – but statistical power suffers
    ! dependent on the effect of the confounding factor
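As a rough illustration of the severity levels, the sketch below cross-tabulates a factor of interest against a batch factor and labels the design. The function name `confounding_status`, the labels it returns and the sample sheets are hypothetical. The criterion used for "balanced" (every condition appears equally often in every batch) is an assumption of this sketch, not a definition from the course.

```python
from collections import Counter


def confounding_status(conditions, batches):
    """Label the overlap between a factor of interest and a nuisance (batch) factor."""
    counts = Counter(zip(conditions, batches))
    condition_levels = sorted(set(conditions))
    batch_levels = sorted(set(batches))

    # Which batches does each condition occur in?
    batches_per_condition = [
        {b for b in batch_levels if counts[(c, b)] > 0} for c in condition_levels
    ]

    # Complete confounding: no two conditions ever share a batch,
    # so condition and batch effects cannot be separated.
    no_shared_batch = all(
        first.isdisjoint(second)
        for i, first in enumerate(batches_per_condition)
        for second in batches_per_condition[i + 1:]
    )
    if len(condition_levels) > 1 and no_shared_batch:
        return "complete"

    # Balanced: within each batch, every condition has the same number of samples.
    balanced = all(
        len({counts[(c, b)] for c in condition_levels}) == 1 for b in batch_levels
    )
    return "balanced" if balanced else "incomplete"


# Hypothetical sample sheets: condition and processing day per sample.
print(confounding_status(["ctrl", "ctrl", "trt", "trt"], ["day1", "day1", "day2", "day2"]))
print(confounding_status(["ctrl", "ctrl", "ctrl", "trt"], ["day1", "day1", "day2", "day2"]))
print(confounding_status(["ctrl", "ctrl", "trt", "trt"], ["day1", "day2", "day1", "day2"]))
```

The first sheet processes all controls on one day and all treated samples on another, which is the situation that cannot be fixed after the experiment.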





• Detection
  o Possible sign: unexpectedly good separation between groups
  o Visualise the factors in the experiment: replicates next to each other (instead of underneath)

• Solution
  o Avoid confounding during the planning phase
    - Exclude nuisance factors if possible
    - Balance biological factors if possible
    - Randomise if possible and relevant
  o Include batch information in the experimental metadata
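The planning-phase advice (balance, randomise, record the batch) could be sketched as follows. This is one possible scheme, assuming a simple design with one factor of interest: the samples of each condition are shuffled and then dealt round-robin over the batches, and the batch is stored with every sample. The function name `assign_batches`, the sample IDs and the seed are hypothetical.

```python
import random
from collections import Counter


def assign_batches(samples_by_condition, n_batches, seed=0):
    """Randomise samples within each condition and spread them evenly over batches."""
    rng = random.Random(seed)  # fixed seed so the design can be reproduced
    records = []
    for condition, samples in samples_by_condition.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)  # randomise the order within the condition
        for i, sample in enumerate(shuffled):
            # Dealing round-robin keeps every condition balanced across batches.
            records.append(
                {"sample": sample, "condition": condition, "batch": i % n_batches + 1}
            )
    return records


# Hypothetical experiment: 4 control and 4 treated samples processed in 2 batches.
design = assign_batches(
    {"control": ["c1", "c2", "c3", "c4"], "treated": ["t1", "t2", "t3", "t4"]},
    n_batches=2,
)
per_batch = Counter((r["condition"], r["batch"]) for r in design)

for record in design:
    print(record)  # each row doubles as a metadata entry that includes the batch
```

Keeping the returned records alongside the data means the batch can later be included as a covariate in the analysis.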

SAMPLE NUMBERS – REPLICATES
• Replicates
  → Types
    o Genuine replicate: increases sample size N
      Biological replicate: often but not always equivalent to a genuine replicate
      = use different biological samples of the same condition to measure the biological variation between samples
    o Pseudoreplicate: does not increase sample size N
      Technical replicate: often but not always equivalent to a pseudoreplicate
      = use the same biological sample to repeat the technical or experimental steps in order to accurately measure technical variation and remove it during analysis
      ! happens when observations share some important factor
      e.g. same batch of reagents – treatment x all from the same litter – …

    ! depends on the research question: genuine replication on one level becomes pseudoreplication on a higher level
      e.g. learning about a lung cancer cell line: each replicate within the cell line increases N
      vs. learning about lung cancer: each replicate within a particular cell line is a pseudoreplicate
  → Effect: pseudoreplicates don’t contain the same amount of information as genuine replicates
    = treating them as genuine falsely shrinks uncertainty estimates and results in p-values that are too low (falsely significant)
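The shrinking uncertainty can be made concrete with a small sketch. It compares the standard error of the mean when every technical measurement is counted as an independent observation with the standard error obtained after first averaging the technical replicates per biological sample. The measurements and sample names are invented for illustration, and averaging per sample is only one simple way to handle technical replicates.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error(values):
    """Standard error of the mean: sample standard deviation divided by sqrt(N)."""
    return stdev(values) / sqrt(len(values))


# Hypothetical data: 3 biological samples, each measured 3 times (technical replicates).
measurements = {
    "mouse_1": [10.1, 10.0, 9.9],
    "mouse_2": [12.1, 12.0, 11.9],
    "mouse_3": [14.1, 14.0, 13.9],
}

# Pseudoreplication: all 9 measurements are treated as independent (N = 9).
pooled = [value for replicates in measurements.values() for value in replicates]
se_pseudo = standard_error(pooled)

# Genuine replication: average the technical replicates first (N = 3).
per_sample_means = [mean(replicates) for replicates in measurements.values()]
se_genuine = standard_error(per_sample_means)

print(f"SE with pseudoreplicates counted as N = 9: {se_pseudo:.3f}")
print(f"SE with biological samples as N = 3:       {se_genuine:.3f}")
```

With these numbers the pooled standard error comes out clearly smaller, although no additional biological samples were measured.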





Seller: emmapot (Universiteit Gent)