STUDY GUIDE 100% SOLVED
data warehouses - ANS-where data are recorded and stored electronically
big data - ANS-data sets so large that traditional methods of storage and analysis are inadequate
transactional data - ANS-data collected to record a company's transactions
data mining - ANS-the process of using transactional data to make other decisions and predictions,
sometimes called predictive analytics
business analytics - ANS-describes any use of statistical analysis to drive business decisions from data
observations - ANS-information collected about some subject; the context of the data answers who, what,
when, where, (if possible) why, and how. sometimes called data values
case - ANS-an individual about whom (or which) we record some characteristics; each case corresponds to
one row of a data table, sometimes called a record
respondents - ANS-individuals who answer a survey
subjects - ANS-people in an experiment, sometimes called participants
experimental units - ANS-the animals, plants, websites, or inanimate objects studied in an experiment
variables - ANS-the characteristics recorded about each individual or case, identify what has been
measured
metadata - ANS-contains information about how, when, where, and possibly why the data were
collected; who each case represents; and the definitions of all the variables
relational database - ANS-when two or more separate data tables are linked together so that
information can be merged across them. each data table included in the database is a relation because it
is about a specific set of cases with information about each of these cases for all the variables
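
To illustrate how two data tables can be linked, here is a minimal sketch using plain Python dictionaries. The table names, column names, and values (customers, orders, customer_id) are hypothetical and invented for this example; a real relational database would do the merge with a query language rather than by hand.

```python
# Hypothetical example tables; every name and value here is invented for illustration.
customers = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
]
orders = [
    {"order_id": 101, "customer_id": 1, "amount": 25.0},
    {"order_id": 102, "customer_id": 2, "amount": 40.0},
    {"order_id": 103, "customer_id": 1, "amount": 15.0},
]

# Index the customers table by its identifier so each order can look up its customer.
customers_by_id = {row["customer_id"]: row for row in customers}

# Merge: attach the customer name to every order that shares the same customer_id.
merged = [
    {**order, "name": customers_by_id[order["customer_id"]]["name"]}
    for order in orders
]

for row in merged:
    print(row)
```

The shared customer_id column is what links the tables: it has no units and is not analyzed itself, but it lets information be merged across the two tables.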
qualitative variable - ANS-when a variable names categories and answers questions about how cases fall
into those categories, also called categorical variable.
quantitative variable - ANS-when a variable has measured numerical values with units and the variable
tells us about the quantity of what is measured
units - ANS-how each value has been measured, the corresponding scale of measurement, how much of
something we have, how far apart two values are
identifier variable - ANS-a unique identifier assigned to each individual or item in a group. identifier
variables have no units, are useful for combining data from different sources, and are not variables to
be analyzed. Ex: social security number, student ID number, tracking number, transaction number
nominal variables - ANS-categorical variables used only to name categories that don't have order
ordinal values - ANS-data values that can be put in a meaningful order. Ex: employees can be ranked
according to the number of months employed
time series data - ANS-variables that are measured at regular intervals over time, typical measuring
points are months, quarters, or years
cross-sectional data - ANS-when several variables are all measured at the same time point
frequency table - ANS-organizes data by recording totals and category names, the names of the
categories label each row, report counts or percentages or both
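
As a rough illustration of a frequency table, the sketch below tallies a small categorical variable and reports both counts and percentages. The category names and data are hypothetical, and `collections.Counter` is just one convenient way to do the counting.

```python
from collections import Counter

# Hypothetical categorical observations (invented for illustration).
responses = ["search", "direct", "search", "social", "search", "direct"]

counts = Counter(responses)
total = sum(counts.values())

# One row per category: category name, count, and percentage of all cases.
frequency_table = [
    (category, count, 100 * count / total)
    for category, count in counts.most_common()
]

for category, count, percent in frequency_table:
    print(f"{category:<8}{count:>4}{percent:>8.1f}%")
```

Replacing the count column with the percentage column is all that separates a bar chart from a relative frequency bar chart of the same data.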
three rules of data analysis - ANS-make a picture, make a picture, make a picture. pictures reveal things
that can't be seen in a table of numbers, show important features and patterns in the data, and provide
an excellent means for reporting findings to others
area principle - ANS-the area occupied by a part of the graph should correspond to the magnitude of the
value it represents
bar chart - ANS-displays the distribution of a categorical variable, showing the counts for each category
next to each other for easy comparison
relative frequency bar chart - ANS-a bar chart in which the counts are replaced with percentages - looks
the same as the original bar chart but shows the proportion of cases in each category rather than counts
pie chart - ANS-shows the whole group of cases as a circle sliced into pieces with sizes proportional to
the fraction of the whole in each category
categorical data condition - ANS-the data are counts or percentages of individuals in categories
contingency table - ANS-shows how individuals are distributed along each variable depending on the
value of the other variables
marginal distribution - ANS-the distribution of one variable alone in a contingency table, given by the
row totals or column totals that appear in the margins of the table
cell - ANS-any intersection of a row and column of a contingency table
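
The contingency table, marginal distribution, and cell entries can be illustrated together with a short sketch. The two variables (region and channel) and all values are hypothetical; the code counts cases per cell and then totals each variable separately to get the margins.

```python
from collections import Counter

# Hypothetical cases, each recorded as (region, channel); values are invented.
cases = [
    ("East", "online"), ("East", "store"), ("East", "online"),
    ("West", "store"), ("West", "store"), ("West", "online"),
]

# Each cell counts the cases at one row/column intersection of the table.
cells = Counter(cases)

# Marginal distributions: the totals for each variable considered on its own.
row_totals = Counter(region for region, _ in cases)
column_totals = Counter(channel for _, channel in cases)

print("cell (East, online):", cells[("East", "online")])
print("row totals:", dict(row_totals))
print("column totals:", dict(column_totals))
```

Here the rows are regions and the columns are channels; summing the cells across a row or down a column reproduces the corresponding marginal total.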