Exam (elaborations)

DATA MINING PIPELINE EXAM QUESTIONS WITH VERIFIED ANSWERS

Pages
5
Grade
A+
Uploaded on
26-03-2025
Written in
2024/2025

Institution
DATA MINING
Course
DATA MINING










What are some examples of attribute selection? - Answer-
➢ Forward selection
  • Keep adding (most informative) attributes
➢ Backward elimination
  • Keep removing (least informative) attributes
➢ Feature engineering
  • Domain knowledge, decision tree induction, ...
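
The forward selection idea above can be sketched as a greedy loop. This is a minimal illustration, not a reference implementation: the `score` callback, the `GAIN` table, and the attribute names are hypothetical stand-ins for whatever measure of informativeness (for example, validation accuracy) a real pipeline would use.

```python
from typing import Callable, FrozenSet, List, Sequence

def forward_selection(
    attributes: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    max_attrs: int,
) -> List[str]:
    """Greedily add the attribute that improves the score the most."""
    selected: List[str] = []
    best_score = score(frozenset())
    while len(selected) < max_attrs:
        candidates = [a for a in attributes if a not in selected]
        if not candidates:
            break
        # Score every candidate when added to the current selection.
        scored = [(score(frozenset(selected + [a])), a) for a in candidates]
        top_score, top_attr = max(scored)
        if top_score <= best_score:
            break  # no remaining attribute is informative enough
        selected.append(top_attr)
        best_score = top_score
    return selected

# Hypothetical per-attribute usefulness; a real score would come from a model.
GAIN = {"age": 5, "income": 3, "zip": 0, "id": -1}

def toy_score(subset: FrozenSet[str]) -> float:
    return sum(GAIN[a] for a in subset)

chosen = forward_selection(list(GAIN), toy_score, max_attrs=4)
print(chosen)
```

Backward elimination follows the same pattern in reverse: start from all attributes and repeatedly drop the one whose removal hurts the score the least.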

What are some examples of numerosity reduction? - Answer-
➢ Parametric methods
  • Assume the data fits a certain model
  • Estimate model parameters
  • E.g., linear/multi-linear/log-linear regression
➢ Non-parametric methods
  • Do not assume a certain model
  • Use fewer/smaller data representations
Describe a data warehouse - Answer-William H. Inmon defines a data warehouse as
"a subject-oriented, integrated, time-variant, and nonvolatile collection of data
in support of management's decision-making process."

Describe OLTP & OLAP - Answer-
➢ Online Transactional Processing (OLTP)
  • Transaction-oriented tasks: bank transfer, purchase, ...
  • Daily operations: insert, update, delete
➢ Online Analytical Processing (OLAP)
  • Complex queries on historical data
  • Data analysis for insights and decision making
Describe facts & dimensions in a data warehouse - Answer-Examples:
➢ Fact: Sales
  • Linked dimensions: customer, item, time
➢ Dimension: Customer
  • Attributes: name, address, DOB
➢ Dimension: Time
  • Attributes: year, month, date

What are some common schemas for data warehousing? - Answer-
➢ Star schema
  • one fact table, multiple dimension tables
➢ Snowflake schema
  • one fact table, multiple levels of dimension tables
➢ Fact constellation schema
  • multiple fact tables, shared dimension tables
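
As a rough illustration of a star schema, the sketch below builds one fact table and three dimension tables in an in-memory SQLite database. The table and column names (`fact_sales`, `dim_customer`, and so on) and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT, address TEXT, dob TEXT);
CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER, full_date TEXT);

-- The fact table holds one key per dimension plus the numeric measure.
CREATE TABLE fact_sales (
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    item_id INTEGER REFERENCES dim_item(item_id),
    time_id INTEGER REFERENCES dim_time(time_id),
    amount REAL
);

INSERT INTO dim_customer VALUES (1, 'Ann', '1 Main St', '1990-01-01');
INSERT INTO dim_item VALUES (1, 'pen'), (2, 'book');
INSERT INTO dim_time VALUES (1, 2024, 1, '2024-01-15'), (2, 2024, 2, '2024-02-03');
INSERT INTO fact_sales VALUES (1, 1, 1, 2.5), (1, 2, 1, 12.0), (1, 2, 2, 12.0);
""")

# A typical analytical query joins the fact table to a dimension and aggregates.
sales_by_month = conn.execute("""
    SELECT t.year, t.month, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_time AS t ON f.time_id = t.time_id
    GROUP BY t.year, t.month
    ORDER BY t.year, t.month
""").fetchall()
print(sales_by_month)
```

A snowflake schema would further split a dimension such as `dim_time` into additional linked tables, and a fact constellation would add a second fact table that reuses the same dimensions.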

Describe a data cube - Answer-
➢ Multi-dimensional data model
  • Dimensions: cube attributes, e.g., year, product, color
  • Facts: numeric measures, e.g., sales volume/value

What are some data cube operations? - Answer-
➢ Roll up: aggregation
  • E.g., daily => monthly
➢ Drill down: reverse of roll up
  • E.g., North America => USA, Mexico, Canada, ...
➢ Pivot: rotate (visualization)
  • E.g., <country, item> => <item, country>
➢ Slicing: select along a single dimension
  • E.g., country = "USA"
➢ Dicing: select along multiple dimensions
  • E.g., country = "USA", year = "2011 - 2020"
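
These operations can be mimicked on a small list of records. The sketch below uses plain Python with invented sales rows; a real OLAP engine would run the same logic over pre-aggregated cuboids.

```python
from collections import defaultdict

# Hypothetical sales records: three dimensions and one measure.
sales = [
    {"country": "USA", "item": "pen", "date": "2020-01-05", "amount": 10},
    {"country": "USA", "item": "book", "date": "2020-01-20", "amount": 30},
    {"country": "Canada", "item": "pen", "date": "2020-02-11", "amount": 5},
    {"country": "USA", "item": "pen", "date": "2021-02-02", "amount": 7},
]

def aggregate(rows, key):
    """Sum the measure for each distinct value of key(row)."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row["amount"]
    return dict(totals)

# Roll up: daily => monthly => yearly (drill down goes the other way).
monthly = aggregate(sales, lambda r: r["date"][:7])
yearly = aggregate(sales, lambda r: r["date"][:4])

# Slice: fix one dimension.
usa = [r for r in sales if r["country"] == "USA"]

# Dice: restrict several dimensions at once.
usa_2020 = [r for r in sales if r["country"] == "USA" and r["date"].startswith("2020")]

# Pivot: swap the axes of a two-dimensional view.
by_country_item = aggregate(sales, lambda r: (r["country"], r["item"]))
by_item_country = {(item, country): v for (country, item), v in by_country_item.items()}
```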

What are the different levels of data cube materialization? - Answer-
➢ Full materialization
  • Pre-compute all cuboids and cells
➢ No materialization
  • No precomputation, on-demand
➢ Partial materialization
  • (heuristically) pre-compute some cuboids and cells
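
To make the trade-off concrete, the sketch below pre-computes every cuboid of a tiny three-dimensional cube. With n dimensions there are 2^n cuboids, which is why full materialization becomes expensive quickly. The rows and the helper names are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical rows: (year, product, color, sales)
rows = [
    ("2020", "pen", "red", 10),
    ("2020", "pen", "blue", 4),
    ("2021", "book", "red", 6),
]

def cuboid(rows, dim_indexes):
    """Group by the chosen dimensions and sum the measure (last column)."""
    cells = defaultdict(int)
    for row in rows:
        cells[tuple(row[i] for i in dim_indexes)] += row[-1]
    return dict(cells)

def full_materialization(rows, n_dims):
    """Pre-compute one cuboid for every subset of the dimensions."""
    return {
        idx: cuboid(rows, idx)
        for k in range(n_dims + 1)
        for idx in combinations(range(n_dims), k)
    }

cube = full_materialization(rows, 3)
print(len(cube))    # number of cuboids
print(cube[(0,)])   # sales grouped by year only
```

No materialization corresponds to calling `cuboid(rows, ...)` at query time, while partial materialization would store only a chosen subset of the keys in `cube`.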

Describe ETL Staging - Answer-
➢ Extract data from various data sources
➢ Transform data
➢ Load data into the data warehouse
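
The three steps can be sketched with the standard library alone. Everything here is illustrative: the two "sources" are an inline CSV string and a list of dictionaries, and the "warehouse" is an in-memory SQLite table named `sales`.

```python
import csv
import io
import sqlite3

# Two hypothetical sources with inconsistent formatting.
CSV_SOURCE = "customer,amount\nann, 20.5\nBOB,10\n"
API_SOURCE = [{"customer": "Cy", "amount": "7"}]

def extract():
    """Pull raw rows from every source into one list of dicts."""
    rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    return rows + API_SOURCE

def transform(rows):
    """Normalize names and convert amounts to numbers."""
    return [(r["customer"].strip().lower(), float(r["amount"])) for r in rows]

def load(conn, rows):
    """Write the cleaned rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
loaded = conn.execute("SELECT customer, amount FROM sales ORDER BY customer").fetchall()
print(loaded)
```
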

What are the stages of a data mining pipeline? - Answer-
➢ Data understanding
➢ Data preprocessing
➢ Data warehousing
➢ Data modeling
➢ Pattern evaluation

What makes up the central tendency of data? - Answer-
➢ Mean
➢ Median
➢ Mode
➢ Midrange
  • (Max + Min)/2
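
A short example with Python's `statistics` module shows the four measures side by side. The data values are made up; note that the midrange is the average of the smallest and largest values.

```python
import statistics

data = [2, 3, 3, 5, 12]

mean = statistics.mean(data)            # sum of values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (max(data) + min(data)) / 2  # average of the two extremes

print(mean, median, mode, midrange)
```

The single large value 12 pulls the mean and the midrange upward, while the median and mode stay at 3.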

What is the dispersion of a dataset? - Answer-
➢ How much a distribution is stretched or squeezed
  • Range: max - min
  • Quartiles: Q1 (25%), Q3 (75%)
  • IQR (interquartile range): Q3 - Q1
  • Variance
  • Standard deviation
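
These measures can be computed with the `statistics` module as sketched below. The sample data is invented, and quartile values depend on the interpolation convention; this example relies on the default ("exclusive") method of `statistics.quantiles`, so other tools may report slightly different Q1 and Q3.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

data_range = max(data) - min(data)

# quantiles(n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

variance = statistics.variance(data)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)      # square root of the sample variance

print(data_range, q1, q3, iqr, variance, std_dev)
```

Use `statistics.pvariance` and `statistics.pstdev` instead when the data is the whole population rather than a sample.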

What are some approaches to encoding relationships between nominal attributes? - Answer-
➢ Similarity
  • s = 1 if x = y; otherwise s = 0
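
The similarity rule above translates directly into code. The second function, which averages the per-attribute matches across a whole record, is an assumption added for illustration and is not part of the notes.

```python
def nominal_similarity(x, y):
    """s = 1 if the two nominal values are equal, otherwise 0."""
    return 1 if x == y else 0

def simple_matching(a, b):
    """Fraction of attributes on which two records agree (illustrative extension)."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(nominal_similarity(x, y) for x, y in zip(a, b)) / len(a)

print(nominal_similarity("red", "red"))
print(nominal_similarity("red", "blue"))
print(simple_matching(("red", "S", "cotton"), ("red", "M", "cotton")))
```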

Get to know the seller

biggdreamer Havard School