Summary

Summary of the statistics part of 'Onderzoeksmethoden voor Informatica'

Rating: -
Sold: 2
Pages: 14
Uploaded on: 26-05-2021
Written in: 2020/2021

Summary of the statistics part of 'Onderzoeksmethoden voor Informatica' at Universiteit Utrecht, based on the lectures and the free book 'OpenIntro Statistics'.












Preview of the content

Observations, variables and data matrices
Statistics is the science of developing and studying methods for collecting, analyzing, interpreting and
presenting empirical data. It addresses questions that involve uncertainty and uses numerical evidence to draw valid
conclusions.


Data can be represented as a data matrix, in which every row is a case/observational unit. The columns represent
variables.
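
As a rough illustration of the data matrix idea, the sketch below stores cases as rows and variables as columns. The variable names and values are made up for the example and do not come from the course material.

```python
# Hypothetical data matrix: each row is one case, each column one variable.
columns = ["name", "age", "study_hours", "program"]
data_matrix = [
    ["Anna", 21, 12.5, "Informatica"],
    ["Bram", 23, 8.0, "Informatica"],
    ["Chris", 20, 15.0, "Wiskunde"],
]

def get_case(matrix, row_index):
    """Return one case (row) as a dict mapping variable name to value."""
    return dict(zip(columns, matrix[row_index]))

def get_variable(matrix, name):
    """Return all observed values of one variable (column)."""
    col = columns.index(name)
    return [row[col] for row in matrix]

print(get_case(data_matrix, 0))
print(get_variable(data_matrix, "age"))
```

Reading along a row gives everything known about one case; reading down a column gives all observations of one variable.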


There are multiple types of variables:




Relationships among variables
When two variables show some connection with one another, they are called associated (or dependent) variables.
If two variables are not associated, i.e. there is no evident connection between the two, then they are said to be
independent.
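
One possible way to quantify association between two numerical variables is the Pearson correlation coefficient. The sketch below computes it by hand on invented data. Note the assumption: correlation only captures linear association, so a value near zero suggests, but does not prove, that two variables are independent.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical observations for five students.
study_hours = [2, 4, 6, 8, 10]
exam_grade = [5.0, 6.0, 6.5, 8.0, 9.0]
shoe_size = [42, 38, 44, 38, 42]

r_grade = pearson_r(study_hours, exam_grade)  # strong positive association
r_shoe = pearson_r(study_hours, shoe_size)    # no linear association

print(f"study hours vs. exam grade: r = {r_grade:.2f}")
print(f"study hours vs. shoe size:  r = {r_shoe:.2f}")
```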




Sampling principles
To get the most accurate observation, you would like to include the whole target population (e.g. the whole Dutch
population), which is called a census (Dutch: volkstelling). In almost all cases this is not feasible, so you will instead
take a sample (Dutch: steekproef), which is a subset of all cases.




Descriptive to inferential statistics
When you summarize results obtained from a sample, you have descriptive statistics. When you generalize those results
to draw a conclusion about the whole population, that is an inference.


For your inference to be valid, the sample needs to be representative of the entire population.
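
The step from a sample statistic to a statement about the population can be sketched with a small simulation. Everything here is an assumption made for illustration: the population is simulated (heights in cm), and in practice the population mean would be unknown unless you ran a census.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: heights (cm) of 100,000 people.
population = [rng.gauss(175, 10) for _ in range(100_000)]

# A census would use every case; a sample uses a random subset.
sample = rng.sample(population, 500)

# Descriptive statistic: it describes only the 500 sampled cases.
sample_mean = statistics.mean(sample)

# Normally unknown; computed here only because the population is simulated.
population_mean = statistics.mean(population)

print(f"sample mean:     {sample_mean:.1f}")
print(f"population mean: {population_mean:.1f}")
```

Using the sample mean as an estimate of the population mean is the inference, and it is only reasonable because the sample was drawn at random.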

Obtaining good samples
Almost all statistical methods are based on the notion of implied randomness. The most commonly used random sampling
techniques are simple, stratified and cluster sampling.


Simple Random Sample: Randomly select cases from the population, where there is no implied connection between
points that are selected.

Stratified Sample: Strata are groups of similar observations (e.g. grouped by sex or income level). We take a simple
random sample from each stratum.

Cluster Sample: Clusters are usually not very different from one another. We take a simple random sample of
clusters, and then sample all observations in that cluster. Clusters can be provinces, cities, schools etc.

Multistage Sample: Like a cluster sample, but instead of keeping all observations in each cluster, we collect a
random sample within each selected cluster.
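
The four techniques above can be sketched as follows. This is a minimal illustration on an invented population of students. The field names (`school`, `sex`), the helper `group_by` and all sample sizes are assumptions, not part of the course material.

```python
import random
from collections import defaultdict

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 1,000 students spread evenly over 20 schools.
population = [
    {"id": i, "school": i % 20, "sex": rng.choice(["F", "M"])}
    for i in range(1000)
]

def group_by(cases, key):
    """Group cases by the value of one variable."""
    groups = defaultdict(list)
    for case in cases:
        groups[case[key]].append(case)
    return groups

def simple_random_sample(cases, n):
    """Every case has the same chance of being selected."""
    return rng.sample(cases, n)

def stratified_sample(cases, key, n_per_stratum):
    """Take a simple random sample from every stratum."""
    sample = []
    for stratum in group_by(cases, key).values():
        sample.extend(rng.sample(stratum, n_per_stratum))
    return sample

def cluster_sample(cases, key, n_clusters):
    """Randomly select clusters and keep all observations in them."""
    clusters = group_by(cases, key)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [case for c in chosen for case in clusters[c]]

def multistage_sample(cases, key, n_clusters, n_per_cluster):
    """Randomly select clusters, then sample within each selected cluster."""
    clusters = group_by(cases, key)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for c in chosen:
        sample.extend(rng.sample(clusters[c], n_per_cluster))
    return sample

print(len(simple_random_sample(population, 50)))
print(len(stratified_sample(population, "sex", 25)))
print(len(cluster_sample(population, "school", 3)))
print(len(multistage_sample(population, "school", 3, 10)))
```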



Sampling bias
If a sample is biased, that means it is not representative: the sample will not yield an accurate prediction. Bias may have
different causes:


Non-response: Occurs when only a small fraction of the randomly sampled people choose to respond to a survey, so the
respondents may no longer be representative.
Voluntary response: Occurs when the sample consists of people who volunteer to respond because they have strong
opinions on the issue.
Convenience sample: Occurs when individuals who are easily accessible are more likely to be included in the sample.


Large samples are preferable, but even when the sample size is huge, if the sample is biased, the sample will not yield an
accurate prediction.
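
The point that size does not fix bias can be illustrated with a small simulation. All numbers below are assumptions chosen for the example: a simulated population with roughly 40% support for some proposal, and a voluntary response mechanism in which supporters respond far more often than opponents.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical population of 100,000 people; 1 = supports a proposal, 0 = does not.
# Assumption: true support is about 40%.
population = [1 if rng.random() < 0.40 else 0 for _ in range(100_000)]

def voluntary_response_sample(cases):
    """Assumed mechanism: supporters respond with probability 0.5, opponents with 0.1."""
    return [x for x in cases if rng.random() < (0.5 if x == 1 else 0.1)]

biased_large = voluntary_response_sample(population)  # huge but biased
random_small = rng.sample(population, 500)            # small but random

true_support = statistics.mean(population)
biased_estimate = statistics.mean(biased_large)
random_estimate = statistics.mean(random_small)

print(f"true support:            {true_support:.3f}")
print(f"biased sample (n={len(biased_large)}): {biased_estimate:.3f}")
print(f"random sample (n={len(random_small)}):   {random_estimate:.3f}")
```

Under these assumptions the much larger voluntary response sample overestimates support by a wide margin, while the small random sample lands close to the true value.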




Observational studies vs. experiments
Researchers perform an observational study when they collect data in a way that does not directly interfere with how the
data arise. Observational studies can provide evidence of a naturally occurring association between variables, but they
cannot by themselves show a causal connection, as association does not imply causation.


To investigate the possibility of a causal connection, researchers conduct an experiment. In an experiment, researchers
assign treatments to cases. Usually there is both an explanatory and a response variable. When the assignment includes
randomization, e.g. using a coin flip, it is called a randomized experiment.
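
Random assignment by coin flip can be sketched as below. The subject labels and the two group names ("treatment" and "control") are assumptions for illustration. A real study might also balance group sizes, which this sketch does not attempt.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is reproducible

def randomize_by_coin_flip(subjects):
    """Assign each subject to a group based on a simulated fair coin flip."""
    groups = {"treatment": [], "control": []}
    for subject in subjects:
        group = "treatment" if rng.random() < 0.5 else "control"
        groups[group].append(subject)
    return groups

# Hypothetical subjects.
subjects = [f"subject_{i}" for i in range(20)]
groups = randomize_by_coin_flip(subjects)

print("treatment:", len(groups["treatment"]))
print("control:  ", len(groups["control"]))
```

Because the coin, not the researcher, decides who receives the treatment, uncontrolled variables tend to be spread over both groups.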



Principles of experimental design
Randomized experiments are generally built on four principles:


Control: Researchers assign treatments to cases, and they do their best to control any other differences in the
groups.
Randomize: Randomly assign subjects to treatments, and randomly sample from the populations whenever possible,
to account for variables that cannot be controlled.
Replicate: The more cases observed, the more accurately an estimation of the effect of the explanatory variable on

Seller: pactasuntservanda, Universiteit Utrecht