ACTUAL Exam Questions and CORRECT
Answers
Why is data visualization helpful? - CORRECT ANSWER -1. amplifies cognition
2. expands working memory
3. reduces search time
4. improves pattern detection
5. controls attention
Describe the difference between data processing and querying - CORRECT ANSWER -In
both instances, the user knows what they want. The difference is that with querying, they can
only describe it whereas in data processing they actually have a way to compute it.
Describe the difference between data exploration and navigation - CORRECT ANSWER -
In exploration, the user does not know what they want but wants to get an idea about the data. In
navigation, the user DOES know what they want but does not know how to describe/locate it.
What do these acronyms describe? INS, 3Vs, HMLE - CORRECT ANSWER -Data
challenges
What does INS stand for and mean? - CORRECT ANSWER -INS is a list of data
challenges: Imprecision, Noise, Sparsity
What does 3Vs stand for and mean? - CORRECT ANSWER -3Vs is a list of data
challenges: Volume, Velocity, Variety
What does HMLE stand for and mean? - CORRECT ANSWER -HMLE is a list of data
challenges: High-dimensional, Multi-modal, Inter-Linked, Evolving
What is a data schema? - CORRECT ANSWER -A set of constraints that...
1. describe the "properties" of data
2. describe the structure of data
3. enable validation & efficient storage of data
4. enable querying and retrieval of data
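As an illustration of how a schema enables validation, here is a minimal sketch. The field names and the `validate` helper are hypothetical, chosen only for this example; real systems use richer schema languages.

```python
# Hypothetical schema: field name -> required Python type.
SCHEMA = {"id": int, "name": str, "height_cm": float}

def validate(record, schema=SCHEMA):
    """Return a list of constraint violations (empty means the record is valid)."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"{field}: expected {expected.__name__}")
    for field in record:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors

good = {"id": 1, "name": "Ada", "height_cm": 170.0}
bad = {"id": "1", "name": "Ada"}  # wrong type for id, height_cm missing

good_errors = validate(good)
bad_errors = validate(bad)
```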
Advantages of a structured database? - CORRECT ANSWER -1. Easier to query
2. Easier to optimize
3. Easier to explore
Advantages of semi-structured database? - CORRECT ANSWER -Data organization is
flexible/malleable (easier to integrate and exchange).
Describe the curse of dimensionality - CORRECT ANSWER -The more dimensions we
have, the more data we need to discover patterns reliably and avoid overfitting.
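One rough way to see this, assuming each axis is split into a fixed number of bins: the number of cells grows exponentially with the number of dimensions, so a fixed sample spreads ever thinner. The bin count and sample size below are arbitrary illustrative choices.

```python
def samples_per_cell(n_samples, bins_per_axis, dims):
    """Average number of samples per grid cell when every axis has the same bin count."""
    cells = bins_per_axis ** dims  # cell count grows exponentially with dims
    return n_samples / cells

n = 1_000_000  # illustrative sample size
density = {d: samples_per_cell(n, 10, d) for d in (1, 2, 3, 6, 9)}

for dims, avg in density.items():
    print(f"{dims} dimensions: {avg} samples per cell on average")
```

With these numbers, a million samples average one per cell at 6 dimensions and far less than one per cell at 9.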
True or False: The distance between two points is equal to the length of the distance vector. -
CORRECT ANSWER -True
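A minimal sketch of this statement, assuming Euclidean distance: subtract the two points to get the distance vector, then take its length (L2 norm). The function name and sample points are illustrative.

```python
import math

def distance(p, q):
    """Euclidean distance between points p and q (equal-length sequences)."""
    diff = [b - a for a, b in zip(p, q)]        # the distance vector q - p
    return math.sqrt(sum(d * d for d in diff))  # its length (L2 norm)

p, q = (1.0, 2.0), (4.0, 6.0)
d = distance(p, q)  # distance vector is (3, 4), so the length is 5.0
```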
Give an example of data transformation - CORRECT ANSWER -A gender column that
originally holds "M" or "F" in a single column is transformed into two columns (one
for M and one for F) holding 0s and 1s that indicate which value applies.
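This transformation is commonly called one-hot encoding. A small sketch in plain Python follows; the `one_hot` helper and the sample values are assumptions made for illustration.

```python
def one_hot(values, categories):
    """Turn a single categorical column into one 0/1 column per category."""
    return [{c: int(v == c) for c in categories} for v in values]

gender = ["M", "F", "F", "M"]  # original single column
encoded = one_hot(gender, ["M", "F"])

for row in encoded:
    print(row)
```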
Which aspects of data should be handled by a scalable data exploration system? - CORRECT
ANSWER -1. The amount of data
2. The diversity of the data types
3. The speed at which new data is generated
Give an example of prefix search - CORRECT ANSWER -Find all strings that start with
"tab"
• "table"; "tabular"; "tablet";...
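One possible way to run a prefix search, sketched under the assumption that the strings are kept in a sorted list: binary search jumps to the first candidate, and all matches sit next to each other. The word list is made up for the example.

```python
from bisect import bisect_left

def prefix_search(sorted_words, prefix):
    """Return every word in a sorted list that starts with the given prefix."""
    start = bisect_left(sorted_words, prefix)  # first word >= prefix
    matches = []
    for word in sorted_words[start:]:
        if not word.startswith(prefix):
            break  # sorted order: no later word can match
        matches.append(word)
    return matches

words = sorted(["table", "tabular", "tablet", "stable", "tab", "taco"])
result = prefix_search(words, "tab")
```

Note that "stable" is excluded: it contains "tab" but does not start with it.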
Give an example of subsequence search - CORRECT ANSWER -Find all strings that
contain the subsequence "ark"
• "marketing"; "spark"; "quark";...
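A brief sketch of this search, reading "subsequence" as a contiguous run of characters, as the examples suggest. The candidate word list is an assumption for illustration.

```python
def contains_search(words, pattern):
    """Return every word that contains the pattern as a contiguous run of characters."""
    return [w for w in words if pattern in w]

candidates = ["marketing", "spark", "quark", "table", "arc"]
hits = contains_search(candidates, "ark")
```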
Give an example of subsequence match - CORRECT ANSWER -- Find the longest
matching subsequence between "plasticity" and "scholastic"
- Find the most frequently repeating 3 character subsequence
• "abcbbbaabbaabcbbbaaabbc"
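Both tasks can be sketched in a few lines, again treating a subsequence as contiguous. The first function uses a standard dynamic-programming approach to the longest common substring; the second counts every 3-character window. The function names are illustrative, not from the source.

```python
from collections import Counter

def longest_common_substring(a, b):
    """Longest contiguous run of characters shared by a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1  # extend the match ending at i-1, j-1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]

def most_frequent_windows(s, k=3):
    """Return (count, sorted list of k-character windows tied for most frequent)."""
    counts = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    top = max(counts.values())
    return top, sorted(w for w, c in counts.items() if c == top)

common = longest_common_substring("plasticity", "scholastic")
top_count, top_windows = most_frequent_windows("abcbbbaabbaabcbbbaaabbc")
```

For the two words above the shared run is "lastic". In the sample string, three windows tie for most frequent, which is why the helper returns all of them rather than picking one.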
What is the edit distance between two sequences? - CORRECT ANSWER -The minimum
number of edit operations (insert, remove, replace) needed to convert one sequence to the other.
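A compact sketch of this definition using the classic dynamic-programming formulation, assuming each insert, remove, or replace costs 1. The example words are illustrative.

```python
def edit_distance(a, b):
    """Minimum number of inserts, removes, and replaces needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        cur = [i]  # turning a[:i] into "" takes i removals
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # remove ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # replace, or free if the characters match
            ))
        prev = cur
    return prev[-1]

dist = edit_distance("kitten", "sitting")  # 3: k->s, e->i, insert g
```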
True or False: A skyline query is a type of exploratory query. - CORRECT ANSWER -
True. Other types can include: similarity queries, ranked, drill-down, frequent itemsets,
aggregate/iceberg queries.
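To make the skyline idea concrete: a skyline query returns the points that no other point beats on every criterion. The sketch below assumes smaller is better in every dimension; the (price, distance) pairs are made-up data.

```python
def dominates(p, q):
    """True if p is at least as good as q everywhere and strictly better somewhere."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))

def skyline(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical hotels as (price, distance to beach); lower is better for both.
hotels = [(50, 8), (80, 2), (60, 5), (90, 6), (70, 9)]
best = skyline(hotels)
```

Here (90, 6) drops out because (60, 5) is both cheaper and closer, and (70, 9) drops out because of (50, 8).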
What are the four data types for visual variables? - CORRECT ANSWER -Nominal,
Ordinal, Interval, Ratio
True or False: For nominal data, order matters. - CORRECT ANSWER -False. Nominal
data is data whose categories have no implied ordering.
Give an example of ordinal data - CORRECT ANSWER -Small, Medium, Large
What is ordinal data? - CORRECT ANSWER -Data that has a specified order, but no
specified distance metric.