descriptive question - Answers asks about summarized characteristics of a data set without
interpretation (reports facts)
exploratory question - Answers asks if there are patterns, trends, or relationships within a single data set
(often used to propose hypotheses for future studies)
predictive question - Answers asks about predicting measurements or labels for individuals and focuses
on what things predict a specific outcome but not what causes the outcome
inferential question - Answers looks for patterns, trends, or relationships in a single data set AND asks
for quantification of how applicable the findings are to a wider population
causal question - Answers asks about whether or not changing one factor will lead to a change in another
factor, on average, in a wider population
mechanistic question - Answers asks about the underlying mechanism of the observed patterns, trends,
or relationships
read_csv("filepath") - Answers reads comma-separated values files
filter(tbl, condition) - Answers obtains a subset of rows with desired values from a data frame
select(tbl, columns_as_arguments) - Answers extracts specific columns from a data frame
arrange(tbl, columns_as_arguments) - Answers orders the rows of a data frame by the values of one or
more columns (ascending by default; wrap a column in desc() for descending order)
slice(tbl, row_range) - Answers keeps only the rows in a specific range
mutate(tbl, column_name = ...) - Answers modifies (use existing column name) or creates (use a new
column name) a column in a data set
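The wrangling functions above can be chained on a small table to see what each one does. This is only a sketch: the `cities` data and its column names are invented for illustration, and it assumes the dplyr and tibble packages are installed.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
cities <- tibble::tibble(
  city = c("A", "B", "C", "D"),
  population = c(500, 1200, 300, 800),
  area = c(50, 100, 20, 40)
)

# filter(): keep only the rows where the condition is TRUE
big <- filter(cities, population > 400)

# select(): keep only the named columns
narrow <- select(big, city, population)

# arrange(): sort rows; ascending by default, desc() for descending
sorted <- arrange(narrow, desc(population))

# slice(): keep rows by position (here, the first two)
top_two <- slice(sorted, 1:2)

# mutate(): a new column name creates a column, an existing name overwrites it
with_density <- mutate(cities, density = population / area)
```

Here `top_two` holds the two most populous cities among those that passed the filter, and `with_density` is the full table with one extra column.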
ggplot(tbl, mapping) - Answers creates a plot for data in a tidy data set (layers such as geom_point() are
added with +)
geom_bar() - Answers adds a bar graph layer to a plot
geom_point() - Answers adds a scatterplot layer to a plot
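As a rough illustration of how `ggplot()` combines with the two geoms, the sketch below builds one scatterplot and one bar graph. The `measurements` data frame and its columns are made up for this example, and ggplot2 is assumed to be installed.

```r
library(ggplot2)

# Hypothetical toy data, invented for illustration only
measurements <- data.frame(
  height = c(150, 160, 170, 180),
  weight = c(50, 58, 69, 80),
  group = c("x", "y", "x", "x")
)

# Scatterplot: aes() maps columns to the x and y axes, geom_point() draws the points
scatter <- ggplot(measurements, aes(x = height, y = weight)) +
  geom_point()

# Bar graph: with only x mapped, geom_bar() counts the rows in each group
bars <- ggplot(measurements, aes(x = group)) +
  geom_bar()
```

Printing `scatter` or `bars` at the console (or in a notebook cell) draws the plot.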
read_csv("filepath", skip = n) - Answers tells R how many lines to skip before it should start reading in
the data (n = number of lines skipped)
read_tsv("filepath") - Answers reads tab-separated values files
read_delim("filepath", delim = "", col_names = FALSE) - Answers the generic version of read_csv and
read_tsv that can read either format (delim gives the separator character; col_names = FALSE tells R the
file has no header row)
download.file(url, "filepath") - Answers downloads data from a url into the specified filepath
read_excel("filepath") - Answers reads data stored in Excel files
read_csv2("filepath") - Answers reads semicolon-separated files (with commas used as decimal marks)
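The reading functions above differ mainly in the separator they expect. The sketch below writes a few tiny files to a temporary folder and reads each one back; the file contents are invented for illustration, and readr is assumed to be installed.

```r
library(readr)

# Hypothetical comma-separated file with one line of notes before the header
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Source: made-up example", "name,score", "ann,1.5", "bob,2.5"), csv_path)

# Hypothetical tab-separated file
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("name\tscore", "ann\t1.5", "bob\t2.5"), tsv_path)

# Hypothetical pipe-separated file with no header row
pipe_path <- tempfile(fileext = ".txt")
writeLines(c("ann|1.5", "bob|2.5"), pipe_path)

# Hypothetical semicolon-separated file with decimal commas
semi_path <- tempfile(fileext = ".csv")
writeLines(c("name;score", "ann;1,5", "bob;2,5"), semi_path)

# skip = 1 ignores the notes line so the header is read correctly
from_csv <- read_csv(csv_path, skip = 1)

from_tsv <- read_tsv(tsv_path)

# Without a header, readr names the columns X1, X2, ...
from_delim <- read_delim(pipe_path, delim = "|", col_names = FALSE)

# read_csv2 treats ";" as the separator and "," as the decimal mark
from_csv2 <- read_csv2(semi_path)
```

All four results are tibbles with two rows; only the way the raw text was split into columns differs.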
dbConnect(RSQLite::SQLite(), "filepath") - Answers opens a communication channel that R can use to
send SQL commands to the database
dbListTables(connection) - Answers lists all of the tables in a database (connection is the object returned
by dbConnect)
tbl(connection, "table") - Answers allows us to work with data stored in databases as if they were regular
data frames
show_query(tbl(connection, "table")) - Answers looks at the SQL commands sent to the database from the
tbl command
collect(database_table) - Answers retrieves the data from a database table reference into R as a tibble
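The database functions above are usually used together in one short workflow. The sketch below is one possible version under stated assumptions: the SQLite file, the `scores` table, and its contents are invented for illustration, `dbWriteTable()` and `dbDisconnect()` are extra DBI calls used only to set up and clean up the example, and the DBI, RSQLite, dplyr, and dbplyr packages are assumed to be installed.

```r
library(DBI)
library(dplyr)

# Hypothetical SQLite file created in a temporary location for this sketch
db_path <- tempfile(fileext = ".db")
con <- dbConnect(RSQLite::SQLite(), db_path)

# Add a small made-up table so there is something to query
dbWriteTable(
  con, "scores",
  data.frame(name = c("ann", "bob", "cy"), score = c(1.5, 2.5, 3.5))
)

# List the tables available through the connection
tables <- dbListTables(con)

# Reference the table; the data stays in the database for now
scores_db <- tbl(con, "scores")
high_db <- filter(scores_db, score > 2)

# Print the SQL that dplyr would send to the database
show_query(high_db)

# Run the query and bring the result into R as a tibble
high <- collect(high_db)

dbDisconnect(con)
```

Until `collect()` is called, `high_db` is only a description of a query, which is why `show_query()` can print SQL for it.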
min(column) - Answers returns the minimum value in a numeric column
write_csv(tbl, "filepath") - Answers writes a data frame to a csv file at the specified filepath
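A quick way to check `min()` and `write_csv()` is to summarize a column, write the table out, and read it back. The `results` data frame and the temporary file path are made up for this sketch, and readr is assumed to be installed.

```r
library(readr)

# Hypothetical toy data, invented for illustration only
results <- data.frame(name = c("ann", "bob"), score = c(1.5, 2.5))

# min() on a numeric column returns its smallest value
lowest <- min(results$score)

# Write the data frame to a temporary csv file, then read it back
out_path <- tempfile(fileext = ".csv")
write_csv(results, out_path)
round_trip <- read_csv(out_path)
```

`round_trip` should contain the same names and scores as `results`, now stored as a tibble.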
read_html("url") - Answers tells R which page to scrape by providing the website's URL, directly
downloading the source code for the page at the specified URL
tidy data - Answers a data format that is suitable for analysis
data frame - Answers a table-like structure for storing observations, variables, and their values in R
variable - Answers a characteristic, number, or quantity that can be measured
observation - Answers all of the measurements for a given entity
value - Answers a single measurement of a single variable for a given entity
vectors - Answers objects that can contain one or more elements (elements must all be of the same data
type)
list - Answers objects that have multiple, ordered elements (elements do not all have to be of the same
data type)
data frame rules - Answers 1. each element must be either a vector or a list
2. each element (vector or list) must have the same length
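The difference between vectors, lists, and data frames, and the two data frame rules, can be checked directly in base R. The values below are invented for illustration.

```r
# A vector: every element has the same data type
heights <- c(150, 160, 170)

# Mixing types inside c() coerces everything to one common type (character here)
mixed_vector <- c(1, "a", TRUE)

# A list: elements keep their own data types
mixed_list <- list(1, "a", TRUE)

# A data frame is a list whose elements (columns) all have the same length
people <- data.frame(name = c("ann", "bob", "cy"), height = heights)

# Columns of length 3 and 2 break rule 2, so construction fails
bad <- tryCatch(
  data.frame(a = 1:3, b = 1:2),
  error = function(e) "error"
)
```

`is.list(people)` returns `TRUE`, which reflects rule 1: the columns of a data frame are stored as the elements of a list.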
tidy data rules - Answers 1. each row is a single observation
2. each column is a single variable