Chapter 2 Questions & Answers
Hypothesis - ANSWERS: a proposed explanation for some phenomenon, used as the
basis for further investigation
Digital divide - ANSWERS: the gap in people's access to computing and the Internet,
which differs based on socioeconomic or geographic characteristics
README - ANSWERS: a document providing background information about a data set
CSV - ANSWERS: abbreviation of "comma-separated values"; a widely used
format for storing data
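As a minimal sketch of what "comma-separated values" means in practice, the snippet below reads a small CSV text with Python's standard `csv` module. The sample data and the column names (`state`, `households`, `with_internet`) are made up for illustration and do not come from any real data set.

```python
import csv
import io

# Hypothetical sample data: the first line holds the column names,
# and each later line is one record with values separated by commas.
csv_text = "state,households,with_internet\nOhio,100,82\nIowa,50,41\n"

# csv.DictReader uses the first row as the column names and
# returns every value as a string.
rows = list(csv.DictReader(io.StringIO(csv_text)))

for row in rows:
    print(row["state"], row["with_internet"])
```

Note that every value is read as text, so numbers usually need to be converted (for example with `int(...)`) before doing arithmetic on them.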
Raw data - ANSWERS: the original data as it was collected
Summary table - ANSWERS: a table of aggregate information about a data set (e.g., the
average, sum, or count of some values)
Limitations of the computer - ANSWERS: Using computational tools to analyze data has
made it much easier to find trends and patterns in large data sets. When preparing data
for this kind of analysis, however, it's important to remember that the computer is much
less "intelligent" than we might imagine. Small discrepancies in the data may prevent
accurate interpretation of trends and patterns and can even make it impossible to use
the data at all. Cleaning data is therefore an important step in analyzing it, and in many
contexts, it may actually take more time than any other step.
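The effect of small discrepancies can be sketched in a few lines of Python. The survey answers below are hypothetical; the point is only that a computer treats "Yes", "yes " and " YES" as three different values until they are cleaned.

```python
from collections import Counter

# Hypothetical raw survey answers with inconsistent case and stray spaces.
raw_answers = ["Yes", "yes ", " YES", "No", "no"]

# Counting the raw values: each spelling is treated as its own category.
raw_counts = Counter(raw_answers)

def clean(value):
    """Remove surrounding whitespace and normalize to lowercase."""
    return value.strip().lower()

# Counting the cleaned values: the variants collapse into two categories.
clean_counts = Counter(clean(v) for v in raw_answers)

print(len(raw_counts), "categories before cleaning")
print(len(clean_counts), "categories after cleaning")
```

This kind of rule-based cleaning only handles mechanical differences such as case and spacing; it does not decide what an ambiguous answer means.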
Free-form text - ANSWERS: data that needs to be standardized by a human rather than a
computer, because sorting answers to open-ended questions into categories is a matter of judgment
Cleaning and filtering data - ANSWERS: necessary to ensure that data is in a form that
computers can process more easily
Pivot table - ANSWERS: the tool used by most spreadsheet programs to create a
summary table
1. used to quickly perform aggregate computations and groupings on a set of raw data
2. used to generate a summarized view of a large data set, which is helpful for gaining insight
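The grouping and aggregation that a pivot table performs can be approximated in plain Python. This is a rough sketch, not how any particular spreadsheet program implements it, and the rows and column names (`region`, `hours`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw data: one row per survey response.
raw_rows = [
    {"region": "North", "hours": 4},
    {"region": "South", "hours": 2},
    {"region": "North", "hours": 6},
    {"region": "South", "hours": 3},
    {"region": "South", "hours": 1},
]

# Step 1: group the raw values by the chosen category.
groups = defaultdict(list)
for row in raw_rows:
    groups[row["region"]].append(row["hours"])

# Step 2: compute aggregates for each group to form the summary table.
summary = {
    region: {
        "count": len(values),
        "sum": sum(values),
        "average": sum(values) / len(values),
    }
    for region, values in groups.items()
}

for region, stats in summary.items():
    print(region, stats)
```

Five raw rows become a two-row summary, one per region, which is the kind of condensed view that makes patterns easier to spot in larger data sets.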