The term data is (singular/plural) _____.
Plural
A data set is made up of _____ that contain information on a specific entity.
Records
Each record is made of _____ that contain measurements of known types.
Fields
A data table is made up of rows containing _____ and columns containing _____.
Observations, variables
We say that data are tidy if each variable corresponds to a _____, each row to an _____,
and each cell to a _____.
column, observation, single value
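As a minimal sketch of the tidy layout described above, the base R snippet below rearranges a small made-up table so that each variable is a column and each row is one observation. The store names, days, and sales figures are hypothetical and serve only as an illustration.

```r
# Hypothetical wide table: one column per day, so "day" is hidden in the column names.
wide <- data.frame(
  store     = c("A", "B"),
  sales_mon = c(100, 80),
  sales_tue = c(120, 90)
)

# Tidy version: each variable is a column, each row is one store-day observation,
# and each cell holds a single value.
tidy <- data.frame(
  store = rep(wide$store, times = 2),
  day   = rep(c("mon", "tue"), each = nrow(wide)),
  sales = c(wide$sales_mon, wide$sales_tue)
)

print(tidy)
```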
A quick-serve restaurant chain records sales, staffing and customer traffic every day for
each store. You recognize this as a _____ data set where the unit of observation is the
store-day.
Panel
We distinguish 4 stages of data analysis and refer to them compactly as _____ (in all
caps).
ATAC
Name the stages of ATAC.
acquisition, transformation, analysis, communication
The second stage involves, among other things, making sure the data are _____ (as the
Posit folks would say).
Tidy
In the third stage, the workhorse will be the _____.
CEF
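Assuming CEF here stands for the conditional expectation function, its sample analogue for a discrete conditioning variable is simply the mean of the outcome within each group. The sketch below uses made-up education and wage values, not data from any real source.

```r
# Made-up data: wage observed for workers with different years of education.
educ <- c(12, 12, 12, 16, 16, 16)
wage <- c(10, 12, 14, 18, 20, 22)

# Sample analogue of the CEF E[wage | educ]: the mean of wage within each educ value.
cef <- tapply(wage, educ, mean)
print(cef)
```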
A variable will not have _____ if it does not measure what it is supposed to.
Validity
How to handle missing data depends on whether they are missing _____.
Endogenously
A national company has developed a new product and is offering it for sale at a discount
to introduce it to the market. Randomly surveying customers who purchased the product
in the initial discount period (would/would not) _____ generate a sample representing
the population of typical customers.
Would not
It is advisable to _____ the acquisition, transformation and analysis tasks.
Separate
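One way to keep the stages separate is to give each its own function, so that the raw data are never modified in place and each step can be rerun independently. This is only a sketch: the function names and the small table are hypothetical.

```r
# Hypothetical stage functions; names and contents are illustrative only.
acquire_data <- function() {
  # Acquisition: return the raw data untouched (a made-up table here).
  data.frame(id = 1:4, wage = c(500, 650, NA, 800))
}

transform_data <- function(raw) {
  # Transformation: drop records with missing wage and add a derived column.
  clean <- raw[!is.na(raw$wage), ]
  clean$lwage <- log(clean$wage)
  clean
}

analyze_data <- function(clean) {
  # Analysis: compute a summary from the cleaned data only.
  mean(clean$lwage)
}

raw    <- acquire_data()
clean  <- transform_data(raw)
result <- analyze_data(clean)
print(result)
```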
One reason reproducibility matters is to protect and support your _____ self.
Future
Another reason reproducibility matters is to guard against _____ and _____.
error, fraud
One important component is describing the exact _____ of your raw input data.
Source
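One possible way to pin down the exact raw input is to store a checksum of the file alongside a description of where it came from. The sketch below uses `tools::md5sum()` from base R on a temporary file. The file contents and the source description are placeholders, not a real data source.

```r
# Write a small made-up raw file to a temporary location (stand-in for real raw data).
raw_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, wage = c(500, 650, 800)), raw_path, row.names = FALSE)

# Record provenance: a source description, the retrieval date, and the MD5 checksum.
provenance <- list(
  source    = "hypothetical source description",
  retrieved = Sys.Date(),
  md5       = unname(tools::md5sum(raw_path))
)
str(provenance)
```

If the checksum of the file changes later, the raw input is no longer the one the analysis was built on.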
You should view a reproducible analysis as a _____ that you should be able to produce
again and again.
Product
A _____ is a representation of the data structure comprising all of the attributes of the
data and their types.
data schema
This representation of the data structure identifies the _____ to which each observation
pertains.
unit of record
This representation of the data also makes clear what are the _____ that identify an
observation.
key variables
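The ideas of a schema and key variables can be sketched in base R: list each attribute with its type, then check that the key variables identify every observation uniquely. The store-day table below is made up for illustration.

```r
# Made-up store-day panel: two stores observed on three days.
panel <- expand.grid(store = c("A", "B"), day = 1:3, stringsAsFactors = FALSE)
panel$sales <- c(100, 80, 120, 90, 110, 95)

# A simple schema: each attribute name paired with its type.
schema <- sapply(panel, class)
print(schema)

# Key check: the key variables should identify each observation uniquely.
key_vars  <- c("store", "day")
is_unique <- anyDuplicated(panel[, key_vars]) == 0
print(is_unique)
```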
A terabyte is equal to a _____ bytes.
1 trillion
R stores real numbers as a _____ data type and allocates _____ bytes of data to each
number.
numeric, 8
A megabyte can store _____ num values, while a terabyte can store roughly a _____
times that.
131072, 1 million
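The arithmetic behind these answers can be checked directly. The sketch assumes the binary convention (1 megabyte = 2^20 bytes), which is what produces 131072. Under the decimal convention a terabyte is 10^12 bytes, and the ratio is exactly 1 million.

```r
bytes_per_num <- 8      # R stores each numeric (double) value in 8 bytes
megabyte      <- 2^20   # binary megabyte: 1,048,576 bytes
terabyte      <- 2^40   # binary terabyte

nums_per_mb <- megabyte / bytes_per_num
ratio       <- terabyte / megabyte

print(nums_per_mb)  # 131072
print(ratio)        # 1048576, roughly 1 million

# An actual numeric vector of that length takes the data bytes plus a small header.
x <- numeric(nums_per_mb)
print(object.size(x))
```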
First, use the library() and data() functions to load the wooldridge package and card
data set.
library(wooldridge)
data(card)
Card obtained the data from the _____.
NLSYM
The source of Card's data is a survey that began in _____ with _____ young men age
14-24.
1966, 5525
The same young men were surveyed again in selected years through _____, effectively
creating a _____ data set where the unit of observation is the person-_____.
1981, panel, year
The survey was not a random sample of the US population because men from
neighborhoods with a high concentration of _____ residents were over-sampled.
Black
Card's analysis is based on the 1976 survey when the youngest respondents are
_____. By 1976, attrition had reduced the sample size to _____ observations. After
filtering the sample on observations with valid education and wage data, Card is left with
an analysis sample of _____ young men.
24, 3694, 3010
The key variable in the data set is _____.
id
The wage variable is measured in _____. The lwage variable is the _____
transformation of wage.
cents, log
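The relationship between a wage column and its log transformation can be sketched on stand-in values. The numbers below are made up rather than taken from the card data, and the commented line shows the analogous check one might run once the package is loaded.

```r
# Made-up wages in cents (stand-in values, not taken from the card data).
toy <- data.frame(wage = c(548, 481, 721))
toy$lwage <- log(toy$wage)   # natural log transformation

# Exponentiating lwage should recover wage up to floating-point error.
max_gap <- max(abs(exp(toy$lwage) - toy$wage))
print(max_gap)

# With the wooldridge package loaded, the analogous check would be:
# max(abs(card$lwage - log(card$wage)))
```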
The variable exper measures labor-market experience as _____.
age - educ - 6
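The experience formula above is plain column arithmetic. The sketch below applies it to made-up ages and years of schooling. Reading the constant 6 as the age at which schooling starts is an interpretation, not something stated in the text.

```r
# Made-up ages and schooling (stand-in values, not taken from the card data).
toy <- data.frame(age = c(29, 27, 34), educ = c(7, 12, 12))

# Experience computed as age minus years of education minus 6.
toy$exper <- toy$age - toy$educ - 6
print(toy)
```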
The str() function provides an overview of the data type, size, and content of a
data set. Apply it to determine the structure of the card data set and answer the
questions that follow.
str(card)
The card data set contains _____ observations and _____ variables.
3010, 34
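The counts reported by str() can also be retrieved directly with nrow() and ncol(). The sketch below uses a small made-up data frame as a stand-in, so it runs without the wooldridge package.

```r
# Made-up data frame standing in for a real data set.
toy <- data.frame(id = 1:5, wage = c(548L, 481L, 721L, 250L, 729L))

str(toy)             # compact overview: dimensions, column types, first values
n_obs  <- nrow(toy)  # number of observations (rows)
n_vars <- ncol(toy)  # number of variables (columns)
print(c(observations = n_obs, variables = n_vars))
```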
What data type is lwage? _____. How about wage? ______. (Use the full-name
description of the data type in your answers.)