Exam (elaborations)

DATA MINING EXAM REVIEW GUIDE QUESTIONS WITH VERIFIED ANSWERS

Pages
9
Uploaded on
26-03-2025
Written in
2024/2025


Institution
DATA MINING
Course
DATA MINING









DATA MINING EXAM REVIEW GUIDE
QUESTIONS WITH VERIFIED
ANSWERS
True or False: A good predictive model is one that fits the data closely. - Answer-
False: A good predictive model predicts new cases accurately, whereas an
explanatory model fits data closely.
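
The distinction can be sketched with a small experiment. The example below is illustrative only: the data are made up (a trend y = 2x with alternating +1/-1 noise), and `memorizer`, `fit_line`, and `mean_abs_error` are hypothetical helpers, not part of the study guide. It compares a model that reproduces the training data exactly with a straight line that fits the training data less closely.

```python
# Made-up training data: trend y = 2x plus alternating +1 / -1 noise.
train_x = list(range(10))
train_y = [2 * x + (1 if x % 2 == 0 else -1) for x in train_x]

# New cases that follow the same underlying trend.
new_x = [x + 0.4 for x in range(10)]
new_y = [2 * x for x in new_x]


def memorizer(x):
    """Predict the y of the nearest training point (fits training data exactly)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def fit_line(xs, ys):
    """Least-squares straight line; returns a prediction function."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def mean_abs_error(model, xs, ys):
    """Average absolute difference between predictions and actual values."""
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)


line = fit_line(train_x, train_y)
memorizer_train = mean_abs_error(memorizer, train_x, train_y)
memorizer_new = mean_abs_error(memorizer, new_x, new_y)
line_train = mean_abs_error(line, train_x, train_y)
line_new = mean_abs_error(line, new_x, new_y)

print(f"memorizer: training error {memorizer_train:.2f}, new-case error {memorizer_new:.2f}")
print(f"line:      training error {line_train:.2f}, new-case error {line_new:.2f}")
```

On this toy data the memorizer has zero training error yet a larger error on the new cases than the line, which is the sense in which a close fit alone does not make a good predictive model.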

Categorical is one type of variable. Which of the following is not a categorical
variable?
A. Hair color
B. Gender
C. Integer
D. Political affiliation - Answer-C: There are two types of variables, categorical and
numeric. Categorical variables are either ordered (low, medium, high) or unordered (male or
female). Numeric variables are either continuous or integer-valued, so "Integer"
describes a numeric variable, not a categorical one.

True/False: The equation that describes how y is related to x and the error term is
called the regression model. - Answer-True: The simple linear regression model is
y = B0 + B1x + E. B0 and B1 are called the parameters of the model, and E is a random
variable called the error term.

True or False: Data Mining is a scientific approach to managerial decision making in
which raw data are processed and manipulated to produce meaningful information? -
Answer-True: In order to make those decisions you must extract data from large data
sets. With data analysis you can detect meaningful patterns and rules, ultimately
finding meaningful correlations, patterns, and trends.

The variable being predicted is called the _____, while the variables being used to predict
its value are called the _____.

A: Independent variable, denoted by y / Dependent variables, denoted by x

B: Dependent variable, denoted by x / Independent variables, denoted by y

C: Dependent variable, denoted by y / Independent variables, denoted by x

D: Independent variable, denoted by x / Dependent variables, denoted by y - Answer-C:
Y is dependent upon X. Understanding the relationship between two or more variables helps
managers make decisions. Regression analysis can be used to develop an
equation showing how the variables are related.
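
As a minimal sketch of how regression analysis produces such an equation, the snippet below estimates the intercept and slope of y = b0 + b1x by ordinary least squares. The five data points and the helper name `simple_linear_regression` are made up for illustration.

```python
def simple_linear_regression(xs, ys):
    """Return (b0, b1) minimizing the sum of squared errors for y = b0 + b1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = mean_y - b1 * mean_x
    return b0, b1


x = [1, 2, 3, 4, 5]   # independent variable (predictor), made-up values
y = [2, 4, 5, 4, 5]   # dependent variable (the value being predicted)

b0, b1 = simple_linear_regression(x, y)
predicted = b0 + b1 * 6   # prediction for a new x value
print(f"y = {b0:.2f} + {b1:.2f}x; prediction at x = 6: {predicted:.2f}")
```

For these five points the estimates are b0 = 2.2 and b1 = 0.6, so the fitted equation predicts y from x exactly as answer C describes: y depends on x.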

The usefulness of a data mining method depends on _________.

A. The size of dataset

B. The types of patterns that exist in the data

C. Noisiness of data

D. The particular goal of the analysis

E. All of the above - Answer-E: Every method of data mining has some advantages
and disadvantages. The method that is most useful for the current goal should be
used.

True or false: The goal of unsupervised learning is to segment data into meaningful
segments; detect patterns. - Answer-True: With unsupervised learning there is no
target variable to predict or classify.
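
One common way to segment data without a target variable is clustering. The sketch below uses a bare-bones two-cluster k-means on one-dimensional data. K-means is swapped in here as an illustration and is not named in the guide, and the function name `k_means_1d` and the spending figures are assumptions.

```python
def k_means_1d(values, iterations=10):
    """Split 1-D values into two clusters; returns (labels, centroids)."""
    centroids = [min(values), max(values)]   # simple deterministic start
    labels = []
    for _ in range(iterations):
        # Assign each value to its nearest centroid.
        labels = [min((0, 1), key=lambda c: abs(v - centroids[c])) for v in values]
        # Move each centroid to the mean of its assigned values.
        updated = []
        for c in (0, 1):
            members = [v for v, label in zip(values, labels) if label == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break   # assignments have stabilized
        centroids = updated
    return labels, centroids


# Made-up customer spending figures; no outcome column is supplied.
spend = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
labels, centroids = k_means_1d(spend)
print(labels, centroids)
```

The algorithm is given only the values themselves and still separates the low spenders from the high spenders, which is the kind of segment detection the answer refers to.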

What is the name of the type of regression that compares one independent variable
with one dependent variable?

A) Multiple Linear Regression
B) Simple Linear Regression
C) Logistic Regression
D) Regression Trees - Answer-B

T/F: Regression analysis is a poor way to show the relationship between the
dependent variable and independent variable(s). - Answer-False: Regression
analysis is one of the best ways to show the relationship between the two types of
variables.

Out of the six core ideas in data mining, which are associated with unsupervised
learning algorithms?

a.) Association rules, classification, data reduction, data exploration

b.) Data reduction, prediction, data visualization, association rules

c.) Association rules, data visualization, data exploration, data reduction

d.) Prediction, data reduction, data exploration, classification - Answer-C:
Unsupervised learning algorithms are those used where there is no outcome variable
to predict or classify.

T/F: Training data refers to that portion of the data used to assess how well the
model fits. - Answer-False: Training data refers to that portion of the data used to fit
a model. Validation data refers to that portion of the data used to assess how well
the model fits.
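
The partitioning described in this answer can be sketched as follows. The 60/40 split, the seed, the made-up records, and the helper name `partition` are all assumptions for illustration. The "model" is deliberately trivial (it always predicts the training mean) so the focus stays on which portion is used for what.

```python
import random


def partition(records, train_fraction=0.6, seed=1):
    """Shuffle a copy of the records, then split into (training, validation)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


records = [(x, 3 * x + 2) for x in range(20)]   # made-up (x, y) pairs
training, validation = partition(records)

# "Fit" a deliberately simple model on the training portion only:
# always predict the mean y seen in training.
mean_y = sum(y for _, y in training) / len(training)

# Assess that model on the validation portion, which played no part in fitting.
validation_error = sum(abs(y - mean_y) for _, y in validation) / len(validation)
print(len(training), len(validation), round(validation_error, 2))
```

Keeping the two portions disjoint is the point of the design: the validation records give an assessment that the fitting step could not have tuned itself to.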

True/False: The first step in trying to reduce the number of predictors should always
be to use domain knowledge. - Answer-True: This is the first step because it is very
important to understand what the various predictors are measuring and why. By
using domain knowledge, the user can ensure he or she has condensed the data to
a manageable level. This will make finding the solution much easier.
