Summary

Summary DSCI Tutorial 1 - tutorial_inference1_solution (2022)

Uploaded on: 11-04-2022

Written in: 2021/2022

Solutions for tutorial 11 inference2


Written for

Institution: University of British Columbia (UBC)
Program: Data Science
Course: DSCI100 (DSCI100)


Document information

Uploaded on: 11 April 2022
Number of pages: 7
Written in: 2021/2022
Type: Summary

Topics

dsci100
ubc
tutorial 11 solutions
inference 1 solutions
tutorial 11

Content preview

Tutorial 11 - Introduction to Statistical Inference
Lecture and Tutorial Learning Goals:
After completing this week's lecture and tutorial work, you will be able to:

Describe real-world examples of questions that can be answered with statistical inference methods.
Name common population parameters (e.g., mean, proportion, median, variance, standard deviation) that are often estimated using sample data, and
use computation to estimate these.
Define the following statistical sampling terms (population, sample, population parameter, point estimate, sampling distribution).
Explain the difference between a population parameter and sample point estimate.
Use computation to draw random samples from a finite population.
Use computation to create a sampling distribution from a finite population.
Describe how sample size influences the sampling distribution.
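
The last three goals can be illustrated with a small sketch. It assumes a made-up population (toy_pop) and a hypothetical helper, sample_means, neither of which is part of the tutorial; only rep_sample_n from the infer package is taken from the material below. The sketch draws 1000 samples at two sample sizes and compares the spread of the resulting sample means.

```r
library(tidyverse)
library(infer)

set.seed(1)  # arbitrary seed, used only for this illustration
# Made-up finite population (an assumption, not the tutorial's data)
toy_pop <- tibble(value = rnorm(n = 10000, mean = 70, sd = 8))

# Hypothetical helper: draw `reps` samples of the given size and
# return one sample mean per replicate.
sample_means <- function(pop, size, reps = 1000) {
  pop %>%
    rep_sample_n(size = size, reps = reps) %>%
    group_by(replicate) %>%
    summarise(sample_mean = mean(value), .groups = "drop") %>%
    mutate(sample_size = size)
}

# Sampling distributions of the mean for two sample sizes
sampling_dists <- bind_rows(
  sample_means(toy_pop, size = 5),
  sample_means(toy_pop, size = 50)
)

# Center and spread of each sampling distribution
spread_by_size <- sampling_dists %>%
  group_by(sample_size) %>%
  summarise(center = mean(sample_mean), spread = sd(sample_mean))
spread_by_size
```

Both sampling distributions should be centered near the population mean, while the one built from the larger samples should be noticeably narrower.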

In [ ]:

### Run this cell before continuing.
library(tidyverse)
library(repr)
library(digest)
library(infer)
options(repr.matrix.max.rows = 6)
source('tests.R')
source('cleanup.R')

Virtual sampling simulation
In this tutorial you will study samples and sample means generated from different distributions. In real life, we rarely, if ever, have measurements for our
entire population. Here, however, we will make simulated datasets so we can understand the behaviour of sample means.

Suppose we had the data science final grades for a large population of students.

In [ ]:

# run this cell to simulate a finite population
set.seed(20201) # DO NOT CHANGE
students_pop <- tibble(grade = (rnorm(mean = 70, sd = 8, n = 10000)))
students_pop

Question 1.0
{points: 1}

Visualize the distribution of the population (students_pop) that was just created by plotting a histogram, passing binwidth = 1 as an
argument to geom_histogram. Name the plot pop_dist and give the x-axis a descriptive label.

In [ ]:
options(repr.plot.width = 8, repr.plot.height = 6)
# ... <- ggplot(..., ...) +
# geom_...(...) +
# ... +
# ggtitle("Population distribution")

### BEGIN SOLUTION
pop_dist <- ggplot(students_pop, aes(x = grade)) +
    geom_histogram(binwidth = 1) +
    xlab("Grades") +
    ggtitle("Population distribution") +
    theme(text = element_text(size = 20))
### END SOLUTION
pop_dist

In [ ]:

test_1.0()

Question 1.1
{points: 3}

Describe in words the distribution above, comment on the shape, center and how spread out the distribution is.

BEGIN SOLUTION
The distribution is bell-shaped and symmetric, with a single large peak in the middle, centered at about 70%. Students' scores ranged from just over 40%
to just under 100%, but most students scored between about 60% and 80%.

END SOLUTION
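
The verbal description can be cross-checked numerically. The sketch below rebuilds the population with the same seed and rnorm call as above; the name shape_summary and its column names are made up for this illustration. For a normal distribution with mean 70 and standard deviation 8, roughly 79% of values would be expected between 60 and 80, so the computed proportion should land near that figure.

```r
library(tidyverse)

# Rebuild the population exactly as in the tutorial cell
set.seed(20201)
students_pop <- tibble(grade = rnorm(mean = 70, sd = 8, n = 10000))

# shape_summary and its column names are illustrative, not from the tutorial
shape_summary <- students_pop %>%
  summarise(lowest = min(grade),
            highest = max(grade),
            center = median(grade),
            prop_60_to_80 = mean(grade >= 60 & grade <= 80))
shape_summary
```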

Question 1.2
{points: 1}

Use summarise to calculate the following population parameters from the students_pop population:

mean (use the mean function)
median (use the median function)
standard deviation (use the sd function)

Name this data frame pop_parameters; it should have the column names pop_mean, pop_med and pop_sd.

In [ ]:

### BEGIN SOLUTION
pop_parameters <- students_pop %>%
    summarise(pop_mean = mean(grade),
              pop_med = median(grade),
              pop_sd = sd(grade))
### END SOLUTION
pop_parameters

In [ ]:

test_1.2()
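
One detail worth knowing about the solution above: R's sd function uses the sample formula, which divides by n - 1, even when it is applied to a whole population. The sketch below compares it with the divide-by-n formula; the variable names are made up for this illustration. With 10000 values the two results differ only by a factor of sqrt((n - 1) / n), which is negligible here.

```r
# Rebuild the population grades with the same seed and call as the tutorial
set.seed(20201)
grade <- rnorm(mean = 70, sd = 8, n = 10000)
n <- length(grade)

# sd() divides the sum of squared deviations by n - 1
sd_sample_formula <- sd(grade)

# Population formula: divide by n instead
sd_population_formula <- sqrt(mean((grade - mean(grade))^2))

# The ratio of the two equals sqrt((n - 1) / n)
ratio <- sd_population_formula / sd_sample_formula
c(sd_sample_formula, sd_population_formula, ratio)
```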

Question 1.2.1
{points: 1}

Draw one random sample of 5 students from our population of students (students_pop). Use summarize to calculate the mean, median, and
standard deviation for these 5 students.

Name this data frame ests_5, which should have the column names mean_5, med_5 and sd_5. Use the seed 4321.

In [ ]:

set.seed(4321) # DO NOT CHANGE!
### BEGIN SOLUTION
ests_5 <- students_pop %>%
    rep_sample_n(size = 5) %>%
    summarize(mean_5 = mean(grade),
              med_5 = median(grade),
              sd_5 = sd(grade))
### END SOLUTION
ests_5

In [ ]:

test_1.2.1()
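
A note on the shape of ests_5: rep_sample_n returns a data frame grouped by a replicate column, so summarizing it keeps replicate alongside the three requested columns. The sketch below reruns the solution and inspects the column names. Whether the course's tests expect the extra column is not visible in this preview, and the select step producing ests_5_only is an optional illustration, not part of the solution.

```r
library(tidyverse)
library(infer)

# Rebuild the population exactly as in the tutorial cell
set.seed(20201)
students_pop <- tibble(grade = rnorm(mean = 70, sd = 8, n = 10000))

set.seed(4321)
ests_5 <- students_pop %>%
  rep_sample_n(size = 5) %>%
  summarize(mean_5 = mean(grade),
            med_5 = median(grade),
            sd_5 = sd(grade))

# The grouping column added by rep_sample_n survives the summary
names(ests_5)

# Optional: drop it to keep only the three estimates (illustrative name)
ests_5_only <- ests_5 %>% select(-replicate)
names(ests_5_only)
```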

Question 1.2.2 Multiple Choice:
{points: 1}

Which of the following is the point estimate for the average final grade for the population of data science students (rounded to two decimal places)?

A. 70.03

B. 69.76

C. 73.52

D. 8.05

Assign your answer to an object called answer1.2.2. Your answer should be a single character surrounded by quotes.
