
Summary of Data Science in Biomedicine

Pages: 41
Uploaded on: 08-10-2022
Written in: 2022/2023

Summary for the course Data Science in Biomedicine. Every topic required for the exam is summarized.

Summary Data Science in Biomedicine

Lecture 1: Introduction to Data Science in Biomedicine
Date: 26-09-2022

- Patient data collection -> Biomedical data: Electronic Health Record (EHR) and
Omics -> Personalised health data analysis: large-volume data, data
management and high-performance computing

- Translate large data sets into something you can understand and discuss

There are many types of (big) data available:
- Numerical
- Textual
- Categorical
- Imaging
- Clinical
- Demographic
- Psychosocial
- Lifestyle
- Environmental
- Genomic
- DNA
- Genes
- Proteins
- RNA
- SNPs
- ncRNA
- Splice variants
- RNA expression levels

Next Generation Sequencing (NGS)
- One Illumina NovaSeq 6000 run reads about 6,000,000,000,000 bases (6,000,000,000 kb,
6,000,000 Mb, 6,000 Gb, 6 Tb) in ~44 hr (computers and software are necessary)
- Bioinformatics pipelines, e.g. for analyzing NGS data:
Reference mapping -> transcript assembly, comparison, merging -> detection of
differentially expressed genes/transcripts (understand the input and output of the programs,
know your statistics, modify the graphical output).
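
The figures quoted for one sequencing run can be checked with a few lines of arithmetic. The sketch below only uses the two numbers given in the notes (total bases and approximate run time); the per-second throughput is derived from them and is an average, not a specification of the instrument.

```python
# Back-of-the-envelope check of the sequencing figures quoted in the notes.
TOTAL_BASES = 6_000_000_000_000   # bases per run, from the notes
RUN_HOURS = 44                    # approximate run time, from the notes

# Size of each unit expressed in bases.
units = {"kb": 10**3, "Mb": 10**6, "Gb": 10**9, "Tb": 10**12}
converted = {name: TOTAL_BASES / size for name, size in units.items()}

# Average number of bases read per second over the whole run.
bases_per_second = TOTAL_BASES / (RUN_HOURS * 3600)

for name, value in converted.items():
    print(f"{value:,.0f} {name}")
print(f"~{bases_per_second:,.0f} bases per second on average")
```

Numbers of this size are the reason the analysis cannot be done by hand or in a spreadsheet.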

Using R or Python
R -> Retrieve data from a database, apply statistical analyses and visualize the results
Python -> If the data is in the wrong format, write a small Python script to convert it
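
As an illustration of such a small conversion script, the sketch below turns semicolon-separated text with decimal commas into tab-separated text with decimal points. The input format, the function name `semicolon_to_tsv` and the sample data are assumptions made for this example; they do not come from the lecture.

```python
import csv
import io

def semicolon_to_tsv(text):
    """Convert semicolon-separated text with decimal commas to tab-separated text.

    Assumption: commas occur only as decimal marks, never inside text fields.
    """
    reader = csv.reader(io.StringIO(text), delimiter=";")
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        # Strip stray whitespace and switch decimal commas to decimal points.
        writer.writerow([cell.strip().replace(",", ".") for cell in row])
    return out.getvalue()

# Hypothetical input, as it might be exported from a spreadsheet.
raw = "sample;expression\nA;1,5\nB;2,25\n"
converted = semicolon_to_tsv(raw)
print(converted)
```

For real files the same function could be applied to the contents read from disk, after which the result can be loaded into R or pandas.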

R vs Python
- R is dedicated to statistics
- R is very popular in research
- Many good libraries for R: genomics, GWAS, proteomics, transcriptomics,
metabolomics etc.
- R is often seen as a statistical scripting tool rather than a general-purpose programming language
- Python is easier and much better at handling text files and text-based data files
- R and Python are generally slower than C++
- Although loads of people use R, its use is expected to decline, so why still learn it?

R
- open-source package for statistics
- most popular statistics program in bioinformatics
- also popular: pandas (the Python data analysis library)
- MATLAB

R vs Excel
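- In Excel you can load data by opening a file or by copy-pasting a data table
- You can edit this data by hand in Excel
- In R you do NOT edit the data by hand in the same way; changes are made through the script, so the source file stays unchanged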

R Graphics
- a popular graphics library is ggplot2 (ports also exist for Python, e.g. plotnine)
- you can also log-transform the data with log(my_data)
- How to plot multiple classes: multiple_classes <- c("N", "O", "P") and
my_multi_subset <- subset(my_annotated_subset, classID %in% multiple_classes)
- c() combines values into a vector
- to show additional dimensions of the data, graphs are often plotted in a matrix (a grid of panels)
- You work with: a script, data sets, text output and graphical output
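
For comparison, the class subsetting and log transform described above could be sketched in plain Python as follows. The rows in `my_annotated_subset`, the column name `value` and the extra class "Q" are hypothetical; only the names `multiple_classes`, `my_multi_subset` and `classID` are taken from the R snippet.

```python
import math

# Hypothetical annotated data: one dict per row, with a class label and a value.
my_annotated_subset = [
    {"classID": "N", "value": 10.0},
    {"classID": "O", "value": 100.0},
    {"classID": "Q", "value": 5.0},
    {"classID": "P", "value": 1000.0},
]

multiple_classes = {"N", "O", "P"}

# Counterpart of subset(my_annotated_subset, classID %in% multiple_classes):
# keep only the rows whose class label is in the chosen set.
my_multi_subset = [
    row for row in my_annotated_subset if row["classID"] in multiple_classes
]

# Counterpart of log(my_data); like R's log(), math.log() is the natural logarithm.
log_values = [math.log(row["value"]) for row in my_multi_subset]

for row, log_value in zip(my_multi_subset, log_values):
    print(row["classID"], round(log_value, 3))
```

In practice a library such as pandas would do the same with less code; the list comprehension is used here only to keep the sketch dependency-free.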

Lecture 2: Data Science in Biomedicine, Basic Statistics 1
Date: 27-09-2022

What is statistics?
- Why do we need statistics?
- when is there a difference?
- p-value?
- impact of risk
- identify problems
- where does the data come from?
- which data and conclusions are trustworthy
- properties?
- Reliable p-value

Measurements
- experiments -> variation
- variation between persons, equipment and time of day
- define the experiments properly
- what is the main source of variation?
- after standardization, do we always get exactly the same value?
- measurements show variation!

P-value
- a p-value is the probability of obtaining a result at least as extreme as the observed one, assuming the null hypothesis is true
- commonly used cutoff: 0.05
- in a plot of the distribution, the x-axis is the set of possible results
- the y-axis is the probability density
- same statistics, same p-value, different “impact of risk”
- you can calculate p-values, but a p-value never tells you whether a result is good or bad
- especially in biomedical sciences this can be an ethical discussion: the risk of
treating/not treating patients, and up to which age you should treat a patient
- 0.05 is a good starting point, but always evaluate this assumption
- p-value below the cutoff = if your null hypothesis is indeed correct
and there is no difference between the groups, the result that you
obtained is very rare. You would expect to obtain such a result fewer than 1
in 20 times if you collected samples over and over again.
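
The "fewer than 1 in 20 times" statement can be illustrated with a small simulation. The sketch below repeatedly draws samples for which the null hypothesis is true and counts how often the p-value falls below 0.05. To stay within the Python standard library it uses a two-sided z-test with known standard deviation rather than a t-test; the sample size, number of repetitions and random seed are arbitrary choices made for this example.

```python
import random
from statistics import NormalDist, mean

def two_sided_p(sample, sigma=1.0):
    """Two-sided z-test p-value for H0: population mean is 0, with known sigma."""
    z = mean(sample) * len(sample) ** 0.5 / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(n_experiments=4000, n_samples=20, alpha=0.05, seed=1):
    """Fraction of simulated experiments with p < alpha when H0 is true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # The data really come from a distribution with mean 0, so H0 holds.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        if two_sided_p(sample) < alpha:
            hits += 1
    return hits / n_experiments

rate = false_positive_rate()
print(f"fraction of experiments with p < 0.05: {rate:.3f}")
```

The printed fraction should lie close to 0.05: even when there is no effect at all, about one experiment in twenty crosses the cutoff by chance.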

Generating data
- a statistician wants: a well-designed study, trustworthy data and many
replicates
- a statistician knows how to: analyze data and calculate p-values
- a statistician does not know: the detailed theoretical background, the impact of risk
(threshold) and potential pitfalls.

Some basic statistics in this course
- t-test
- linear regression
- permutation testing
- FDR testing
- Fisher’s exact test
- Chi-squared test
- Pearson vs Spearman correlation
- PCA

T-statistic
- Compares two data sets and tells you whether their means differ from each other
- e.g. compare two groups, one treated with a drug, the other with a placebo
- Statisticians mentioned, with their years of birth:
- Pearson 1857
- Fisher 1890
- Neyman 1894 (random stats)
- Bayes 1702 (probability stats)
- A t-test is a statistical test that is used to compare the means of two
groups. It is often used in hypothesis testing to determine whether a process
or treatment actually has an effect on the population of interest, or whether
two groups are different from one another

Types of t-test
1. Independent samples: compares the means of two independent groups
2. Paired samples: compares means from the same group (e.g. at different time
points)
3. One-sample: tests the mean of a single group against a known mean (a standard or
reference)
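
The three variants differ only in how the t-statistic is formed. The sketch below computes the statistic for each type using the Python standard library; the function names and all data values are made up for illustration. The independent-samples version shown is Student's t-test with pooled variance, which assumes equal variances in both groups. Turning a t-statistic into a p-value additionally needs the t-distribution, which is left out here.

```python
from statistics import mean, stdev

def one_sample_t(sample, reference_mean):
    """t-statistic for a single group against a known reference mean."""
    n = len(sample)
    return (mean(sample) - reference_mean) / (stdev(sample) / n ** 0.5)

def paired_t(before, after):
    """t-statistic for paired samples: a one-sample test on the differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return one_sample_t(diffs, 0.0)

def independent_t(group_a, group_b):
    """Student's t-statistic with pooled variance (assumes equal variances)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

# Hypothetical measurements, for illustration only.
drug = [5.1, 4.8, 5.6, 5.3, 4.9]
placebo = [4.2, 4.5, 4.1, 4.7, 4.4]
before = [3.0, 3.4, 2.9, 3.8]
after = [3.5, 3.9, 3.1, 4.4]

print("independent:", independent_t(drug, placebo))
print("paired:", paired_t(before, after))
print("one-sample:", one_sample_t(drug, 5.0))
```

Note that the paired test reduces to a one-sample test on the pairwise differences, which is why its null hypothesis is stated in terms of the mean difference.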

Paired data
- group of mice (8) before and after albumin treatment
- the null hypothesis is that the mean pairwise difference between the two measurements is
zero (H0: μd = 0)
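
One way to test H0: μd = 0 for paired data without any distribution tables is a sign-flip permutation test (permutation testing is also on the course list). Under the null hypothesis each before/after difference is equally likely to be positive or negative, so all sign patterns can be enumerated. The sketch below is that technique, not a paired t-test, and the eight before/after values are invented for illustration; they are not the data from the lecture.

```python
from itertools import product

def paired_sign_flip_p(before, after):
    """Exact two-sided sign-flip permutation p-value for H0: mean difference is 0.

    Works exactly for integer data; with floats a small tolerance in the
    comparison may be needed.
    """
    diffs = [a - b for a, b in zip(after, before)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    # Under H0 every difference could equally well have had the opposite sign.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            extreme += 1
    return extreme / total

# Hypothetical measurements for 8 mice (arbitrary integer units).
before = [40, 38, 45, 41, 39, 44, 42, 40]
after = [43, 41, 44, 45, 42, 47, 46, 41]

p_value = paired_sign_flip_p(before, after)
print(f"sign-flip permutation p-value: {p_value:.4f}")
```

With 8 pairs there are only 2^8 = 256 sign patterns, so the smallest possible two-sided p-value is 2/256, which also shows why more replicates give a test more resolution.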
Uploaded by: willemdevries99 (Rijksuniversiteit Groningen)