Summary

Summary of the book, The Analysis of Biological Data Chapter 7-14

Uploaded on

23-10-2021

Written in

2021/2022

Summary of the book 'The Analysis of Biological Data', Chapters 7-14. Basic statistics explained very clearly, especially for biologists.

Linked book

Michael C. Whitlock, Dolph Schluter, The Analysis of Biological Data

Edition: 2020
ISBN: 9781319325343
Printing: Unknown

Written for

Institution: Universiteit Leiden (UL)
Study: Molecular Biology
Course: Statistics for biologists II

Document information

Whole book summarized?: No
Which parts of the book are summarized?: 7-14
Uploaded on: 23 October 2021
Number of pages: 16
Written in: 2021/2022
Type: Summary

Topics

stockholm university
basic statistics
easy
statistics for biologists ii
the analysis of biological data
michael c whitlock and dolph schluter

Preview of the content

Summary of the book, The Analysis of Biological Data
Chapter 7-14
Authors: Whitlock – Schluter
Second edition

Chapter 7, Analysing proportions

In this chapter, we’ll describe how best to estimate a population proportion using a random
sample, including how to calculate its confidence interval.

Consider a measurement made on individuals that divides them into two mutually exclusive
groups, such as success or failure, alive or dead, left-handed or right-handed, or diabetic or
nondiabetic. In the population, a fixed proportion p of individuals fall into one of the two
groups (call it “success”) and the remaining individuals fall into the other group (call it
“failure”).

If we take a random sample of n individuals from this population, the sampling distribution
for the number of individuals falling into the success category is described by the binomial
distribution. The term “binomial” reveals its meaning: there are only two (bi-) possible
outcomes, and both are named (-nomial) categories.

The binomial distribution provides the probability distribution for the number of
“successes” in a fixed number of independent trials, when the probability of success is the
same in each trial.

The binomial distribution assumes that
- The number of trials (n) is fixed,
- Separate trials are independent, and
- The probability of success (p) is the same in every trial.

For example, suppose that p = 0.25 of the individuals in the population are successes and
1 − p = 0.75 of the individuals are failures.
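
The binomial probabilities described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's own code: p = 0.25 comes from the text, while the sample size n = 5 and the function name binomial_pmf are assumptions chosen for the example.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials,
    each with the same success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n = 5      # assumed sample size, for illustration only
p = 0.25   # proportion of successes used in the text

# Sampling distribution for the number of successes in the sample
distribution = [binomial_pmf(x, n, p) for x in range(n + 1)]
for x, prob in enumerate(distribution):
    print(f"Pr[{x} successes] = {prob:.4f}")
```

The probabilities over all possible outcomes (0 to n successes) add up to one, which is a quick sanity check on any implementation.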

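The standard error of a sample proportion is estimated as the square root of p̂(1 − p̂)/n,
where p̂ is the proportion of successes in the sample. The sample size (n) is in the
denominator of the standard error equation, so the standard
error decreases as the sample size increases. Larger samples yield more precise estimates.
The improvement in precision as sample size increases is called the law of large numbers.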
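
To make the effect of sample size concrete, the sketch below evaluates the usual standard error of a proportion, the square root of p̂(1 − p̂)/n, for several sample sizes. The value p̂ = 0.25 and the sample sizes are assumptions picked for illustration.

```python
from math import sqrt


def se_proportion(p_hat: float, n: int) -> float:
    """Estimated standard error of a sample proportion."""
    return sqrt(p_hat * (1 - p_hat) / n)


p_hat = 0.25  # assumed sample proportion, for illustration only
for n in (10, 100, 1000):
    # The standard error shrinks as n grows
    print(f"n = {n:4d}  SE = {se_proportion(p_hat, n):.4f}")
```

Because n sits under a square root, a tenfold increase in sample size reduces the standard error by a factor of about 3.2, not 10.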

The binomial test is used when a variable in a population has two possible states (i.e.,
“success” and “failure”), and we wish to test whether the relative frequency of successes in
the population (p) matches a null expectation (p0).
The hypothesis statements look like this:
- H0: The relative frequency of successes in the population is p0.
- HA: The relative frequency of successes in the population is not p0.
The null expectation (p0) can be any specific proportion between zero and one, inclusive.

The binomial test uses data to test whether a population proportion (p) matches a null
expectation (p0) for the proportion.
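
A two-sided binomial test can be sketched as follows. The numbers (14 successes in 18 trials, p0 = 0.5) are hypothetical, and the P-value here is computed by doubling the smaller tail probability, which is one common convention; statistical packages may use a slightly different rule for two-sided P-values.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_test_two_sided(x: int, n: int, p0: float) -> float:
    """Two-sided P-value: twice the smaller tail probability under H0,
    capped at 1 (an assumed convention, see the note above)."""
    lower_tail = sum(binomial_pmf(k, n, p0) for k in range(0, x + 1))
    upper_tail = sum(binomial_pmf(k, n, p0) for k in range(x, n + 1))
    return min(1.0, 2 * min(lower_tail, upper_tail))


# Hypothetical data: 14 successes out of 18 trials, null expectation p0 = 0.5
p_value = binomial_test_two_sided(14, 18, 0.5)
print(f"P = {p_value:.4f}")
```

With these made-up numbers the P-value falls below 0.05, so H0 would be rejected at the conventional significance level.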

The standard deviation of the sampling distribution for an estimate is known as the standard
error of that estimate.

A confidence interval is the range of most-plausible values of the parameter we are trying to
estimate, based on the data. The 95% confidence interval of a proportion will enclose the
true value of the proportion 95% of the time that it is calculated from new data.

The most commonly used method to determine a confidence interval for a proportion is
called the Wald method.
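
A minimal sketch of the Wald interval is given below: the sample proportion plus or minus a normal critical value times the standard error. The counts (25 successes in 100 trials) are hypothetical. Note that the Wald interval is known to be unreliable when n is small or the proportion is close to 0 or 1.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(x: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wald confidence interval for a proportion: p_hat +/- z * SE."""
    p_hat = x / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    # z is about 1.96 for a 95% interval
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p_hat - z * se, p_hat + z * se


# Hypothetical sample: 25 successes in 100 trials
low, high = wald_interval(25, 100)
print(f"95% CI: {low:.3f} to {high:.3f}")
```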

Chapter 8, Fitting probability models to frequency data

A goodness-of-fit test is a method for comparing an observed frequency distribution with
the frequency distribution that would be expected under a simple probability model
governing the occurrence of different outcomes.

Under the proportional model, each day of the week should have the same probability of a
birth, that is, 1/7 (see Example 8.1). This is the simplest possible model, so it’s our null
hypothesis:

H0: The probability of birth is the same on every day of the week.
HA: The probability of birth is not the same on every day of the week.

The χ2 statistic measures the discrepancy between observed frequencies from the data and
expected frequencies from the null hypothesis. It’s important to notice that the χ2
calculations use the absolute frequencies (i.e., counts) for the observed and expected
frequencies, not proportions or relative frequencies. Using proportions in the calculation of
χ2 will give the wrong answer.
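
The χ2 statistic is the sum over categories of (observed − expected)² / expected. The sketch below uses made-up counts for the seven days of the week (not the data from Example 8.1) and also shows how the result changes if proportions are used by mistake.

```python
def chi_square_statistic(observed: list[float], expected: list[float]) -> float:
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts per weekday (Sunday to Saturday), total = 70
observed = [5, 9, 12, 10, 11, 15, 8]
total = sum(observed)

# Proportional model: every day has probability 1/7
expected = [total / 7] * 7

chi2 = chi_square_statistic(observed, expected)
print(f"chi-square from counts: {chi2:.2f}")

# Wrong: the same calculation on proportions gives a different (too small) value
wrong = chi_square_statistic([o / total for o in observed], [1 / 7] * 7)
print(f"chi-square from proportions (incorrect): {wrong:.4f}")
```

The value computed from proportions is smaller by a factor equal to the total sample size, which is why counts must be used.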

The number of degrees of freedom of a χ2 statistic specifies which χ2 distribution to use as
the null distribution.

A critical value is the value of a test statistic that marks the boundary of a specified area in
the tail (or tails) of the sampling distribution under H0. With 6 degrees of freedom and a
significance level of 0.05, the critical value is 12.59. Because our observed χ2 value
(15.05) is greater than 12.59 (i.e., further out in the right tail of the distribution), χ2 values
of 15.05 or greater occur less than 5% of the time under the null hypothesis.
Therefore, our P-value must be less than 0.05, P = Pr[χ²₆ ≥ 15.05] < 0.05, so we reject the null
hypothesis.
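
The tail probabilities in this argument can be checked numerically. The sketch below uses a closed-form expression for the right tail of the χ2 distribution that is valid only for an even number of degrees of freedom (which covers df = 6); the helper name chi2_sf_even_df is an assumption for this example. In practice a statistics library would normally be used instead.

```python
from math import exp, factorial


def chi2_sf_even_df(x: float, df: int) -> float:
    """Pr[chi-square with df degrees of freedom >= x].
    Closed form that holds only when df is a positive even integer."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut requires a positive even df")
    half = x / 2
    return exp(-half) * sum(half**i / factorial(i) for i in range(df // 2))


# Area to the right of the critical value 12.59 (should be close to 0.05)
print(f"Pr[chi2_6 >= 12.59] = {chi2_sf_even_df(12.59, 6):.4f}")

# P-value for the observed statistic 15.05
p_value = chi2_sf_even_df(15.05, 6)
print(f"Pr[chi2_6 >= 15.05] = {p_value:.4f}")
```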

Assumptions of the χ2 goodness-of-fit test:
- Individuals in the data set are a random sample from the whole population (this
assumption applies to every test).
- None of the categories should have an expected frequency less than 1.
- No more than 20% of the categories should have expected frequencies less than 5.

If one of these conditions is not met, then we have two options. One option, if possible, is
to combine some of the categories having small expected frequencies to yield fewer
categories having larger expected frequencies (remember to change the degrees of freedom
accordingly).
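
The two conditions on expected frequencies are easy to check programmatically. The function name and the example values below are hypothetical; the random-sampling assumption cannot be checked this way.

```python
def expected_frequencies_ok(expected: list[float]) -> bool:
    """Check the two expected-frequency rules for the chi-square test:
    no expected frequency below 1, and at most 20% of them below 5."""
    n_below_1 = sum(e < 1 for e in expected)
    n_below_5 = sum(e < 5 for e in expected)
    # 5 * n_below_5 <= len(expected) is the same as n_below_5 / len <= 20%
    return n_below_1 == 0 and 5 * n_below_5 <= len(expected)


print(expected_frequencies_ok([10.0] * 7))                  # all large
print(expected_frequencies_ok([0.5, 10.0, 10.0]))           # one below 1
print(expected_frequencies_ok([4.0, 4.0, 10.0, 10.0, 10.0]))  # 40% below 5

# Combining the two small categories into one larger category
combined = [4.0 + 4.0, 10.0, 10.0, 10.0]
print(expected_frequencies_ok(combined))
```

After combining categories, the number of categories drops from five to four, so the degrees of freedom must be reduced as well.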

The Poisson distribution describes the number of successes in blocks of time or space, when
successes happen independently of each other and occur with equal probability at every
instant in time or point in space. Rejecting a null hypothesis of a Poisson distribution of
successes implies that successes are not independent or that the probability of a success
occurring is not constant over time or space.
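
Under the Poisson distribution, the probability of X successes in a block is e^(−μ) μ^X / X!, where μ is the mean number of successes per block. The sketch below turns these probabilities into expected frequencies; the mean of 2.0 and the 100 blocks are assumptions for illustration.

```python
from math import exp, factorial


def poisson_pmf(x: int, mu: float) -> float:
    """Probability of exactly x successes in a block when the mean is mu."""
    return exp(-mu) * mu**x / factorial(x)


mu = 2.0        # assumed mean number of successes per block
n_blocks = 100  # assumed number of blocks observed

# Expected number of blocks containing 0, 1, 2, ... successes
for x in range(6):
    expected = n_blocks * poisson_pmf(x, mu)
    print(f"{x} successes: expected in {expected:.1f} blocks")
```

These expected frequencies are what the observed frequencies would be compared against in a goodness-of-fit test of the Poisson model.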

One unusual property of the Poisson distribution is that the variance in the number of
successes per block of time (the square of the standard deviation) is equal to the mean (μ).
In an observed frequency distribution, if the variance is greater than the mean, then the
distribution is clumped. If the variance is less than the mean, successes are more evenly
distributed than expected by the Poisson distribution.
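
A quick way to apply this idea is to compute the variance-to-mean ratio of the observed counts: a ratio near 1 is consistent with a Poisson distribution, above 1 suggests clumping, and below 1 suggests a more even spread. Both data sets below are invented for illustration.

```python
from statistics import mean, variance


def variance_to_mean_ratio(counts: list[int]) -> float:
    """Sample variance of the counts divided by their mean."""
    return variance(counts) / mean(counts)


# Hypothetical counts of successes per block
clumped = [0, 0, 0, 0, 10, 0, 0, 0, 0, 10]  # successes concentrated in few blocks
even = [2, 2, 2, 2, 2, 2, 2, 2, 1, 3]       # successes spread evenly

print(f"clumped: {variance_to_mean_ratio(clumped):.2f}")
print(f"even:    {variance_to_mean_ratio(even):.2f}")
```

Both data sets have the same mean (2 per block), so the difference in the ratio comes entirely from how the successes are spread across blocks.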

Chapter 9, Contingency analysis (associations between categorical variables)

Contingency analysis estimates and tests for an association between two or more
categorical variables. It allows us to determine whether, and to what degree, the
variables are associated. In other words, a contingency
analysis helps us to decide whether the proportion of individuals falling into different
categories of a response variable is the same for all groups.

The odds ratio measures the magnitude of association between two categorical variables
when each variable has only two categories. One of the variables is the response variable—
let’s call its two categories “success” and “failure,” where success just refers to the focal
outcome of interest. The other variable is the explanatory variable, whose two categories
identify the two groups whose probability of success is being compared. The odds ratio
compares the proportion of successes and failures between the two groups.

The odds of success are the probability of success divided by the probability of failure.
The odds ratio is the odds of success in one group divided by the odds of success in a second
group.
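
These two definitions translate directly into code. The sketch below uses a hypothetical 2 × 2 table (30 successes and 70 failures in the first group, 15 successes and 85 failures in the second); the function names are assumptions for this example.

```python
def odds(p: float) -> float:
    """Odds of success: probability of success divided by probability of failure."""
    return p / (1 - p)


def odds_ratio(p_group1: float, p_group2: float) -> float:
    """Odds of success in group 1 divided by odds of success in group 2."""
    return odds(p_group1) / odds(p_group2)


# Hypothetical counts: (successes, failures) in each group
successes_1, failures_1 = 30, 70
successes_2, failures_2 = 15, 85

p1 = successes_1 / (successes_1 + failures_1)  # proportion of successes, group 1
p2 = successes_2 / (successes_2 + failures_2)  # proportion of successes, group 2

result = odds_ratio(p1, p2)
print(f"odds in group 1: {odds(p1):.3f}")
print(f"odds in group 2: {odds(p2):.3f}")
print(f"odds ratio:      {result:.3f}")
```
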
If the odds ratio is equal to one, then the odds of success in the response variable are
independent of treatment; the odds of success are the same for both groups. If the odds
ratio is greater than one, then the event has higher odds in the first group than in the
second group.

Seller

biovandijk, University of Gothenburg
