Exam (elaborations)

DATA MINING FINAL EXAM Q&A

Written for

Institution: DATA MINING
Course: DATA MINING

Document information

Uploaded on: March 26, 2025
Number of pages: 4
Written in: 2024/2025
Type: Exam (elaborations)
Contains: Unknown

Content preview

What constitutes a good cluster? What are our two main goals during clustering? -
Answer-The two main goals are a) minimizing intra-cluster distances and b) maximizing
inter-cluster distances. Equivalently, a good cluster has high intra-cluster similarity
(its members resemble one another) and low inter-cluster similarity (its members
differ from the members of other clusters).
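
As a rough illustration of the two goals, the sketch below compares the average distance between points in the same cluster with the average distance between points in different clusters. The toy points and the helper names are hypothetical, chosen only for this example; Euclidean distance is an assumption as well.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def mean_intra_cluster_distance(clusters):
    """Average distance between pairs of points that share a cluster."""
    pairs = [dist(p, q) for pts in clusters for p, q in combinations(pts, 2)]
    return sum(pairs) / len(pairs)


def mean_inter_cluster_distance(clusters):
    """Average distance between pairs of points in different clusters."""
    pairs = [
        dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    ]
    return sum(pairs) / len(pairs)


# Hypothetical toy data: two tight, well-separated groups of 2-D points.
clusters = [
    [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)],
]

intra = mean_intra_cluster_distance(clusters)
inter = mean_inter_cluster_distance(clusters)
print(f"mean intra-cluster distance: {intra:.3f}")
print(f"mean inter-cluster distance: {inter:.3f}")
```

A good clustering of this data keeps the first number small and the second number large.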

What are the two main clustering methods? Explain the differences between them. -
Answer-The two types of clustering are a) partitional and b) hierarchical. There are
two main differences between these clustering methods. One is that partitional
clustering divides the data into flat, non-overlapping clusters, whereas hierarchical
clustering creates nested clusters organized as a tree. Another difference is that
hierarchical clustering doesn't have to assume any particular number of clusters,
whereas partitional clustering requires the number of clusters to be chosen in advance.

What are the trade-offs to consider while selecting a minimum support threshold
value? - Answer-If minsup is set too high, we could miss item sets involving interesting
rare items (e.g., expensive products). Alternatively, if minsup is set too low, mining
becomes computationally expensive and the number of item sets becomes very large.
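
The trade-off can be seen on a small example. The sketch below enumerates item sets by brute force (not Apriori or any optimized algorithm) and counts how many pass two different thresholds. The transactions and the function name `frequent_itemsets` are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, minsup):
    """Return {itemset: support} for every item set with support >= minsup.

    Brute-force enumeration, only suitable for tiny examples.
    """
    items = sorted(set().union(*transactions))
    n = len(transactions)
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count / n >= minsup:
                result[candidate] = count / n
    return result


# Hypothetical market-basket data; "caviar" plays the rare, expensive item.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs"},
    {"bread", "milk", "caviar"},
]

high = frequent_itemsets(transactions, minsup=0.6)
low = frequent_itemsets(transactions, minsup=0.2)
print(len(high), "item sets at minsup=0.6")
print(len(low), "item sets at minsup=0.2")
```

With the high threshold the rare item disappears from the results entirely; with the low threshold it is kept, but the number of item sets to examine grows.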

Briefly discuss the similarities and differences between association rule mining and
collaborative filtering. - Answer-Association rule mining (ARM) focuses on frequent
item combinations, whereas collaborative filtering (CF) focuses on user preferences.
ARM's data rows are single transactions and ignore the user dimension, whereas CF's
data rows are a user's purchases or ratings over time. ARM is used for displays (what
goes with what), whereas CF is useful for recommendations involving unusual items.

Define the Information Retrieval task in text analytics and briefly explain how a
typical Information Retrieval system works - Answer-Information Retrieval is finding
the documents whose set of words most closely matches the words in a query. The
system works in three main steps: 1) taking the query string as input, 2) cross-referencing
this query with the document corpus, and 3) ranking the documents.
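
The three steps can be sketched in a few lines. This is a minimal illustration that scores documents by Jaccard overlap of word sets, which is an assumption made here; the notes do not name a scoring method, and real systems typically use weighting schemes such as TF-IDF. The corpus and function names are hypothetical.

```python
def tokenize(text):
    """Lowercase a string and split it into a set of words."""
    return set(text.lower().split())


def rank_documents(query, corpus):
    """Return ids of matching documents, best match first."""
    query_words = tokenize(query)          # step 1: take in the query
    scored = []
    for doc_id, text in corpus.items():    # step 2: compare with each document
        doc_words = tokenize(text)
        union = query_words | doc_words
        score = len(query_words & doc_words) / len(union) if union else 0.0
        scored.append((score, doc_id))
    # step 3: rank by descending score, breaking ties by document id
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored if score > 0]


# Hypothetical four-document corpus.
corpus = {
    "d1": "k-means clustering partitions data",
    "d2": "association rules for market basket data",
    "d3": "hierarchical clustering builds nested clusters",
    "d4": "network density and centrality",
}

ranking = rank_documents("clustering data", corpus)
print(ranking)
```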

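Support Formula - Answer-support(X -> Y) = (# of transactions containing both X and Y) /
(total # of transactions)

Confidence Formula - Answer-confidence(X -> Y) = (# of transactions containing both X
and Y) / (# of transactions containing X)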
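
A small worked example of both formulas, assuming each transaction is represented as a Python set of items. The data and the function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of X and Y together divided by the support of X alone."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical transactions for the rule {bread} -> {milk}.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "eggs"},
    {"milk"},
]

rule_support = support(transactions, {"bread", "milk"})          # 2 of 4 transactions
rule_confidence = confidence(transactions, {"bread"}, {"milk"})  # 2 of the 3 with bread
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```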

Network Density - Potential Connections - Answer-[n * (n-1)] / 2 (where n = number
of nodes, for an undirected network)

Network Density - Actual connections - Answer-# of links

Network Density - Answer-Actual / Potential = (# of links) / ([n * (n-1)] / 2)
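
The density calculation is short enough to check by hand. The sketch below assumes an undirected network with no self-loops, given as a list of node pairs; the example network and the function name are hypothetical.

```python
def network_density(num_nodes, edges):
    """Actual links divided by potential links in an undirected network."""
    if num_nodes < 2:
        raise ValueError("density needs at least two nodes")
    potential = num_nodes * (num_nodes - 1) / 2
    # frozenset makes (A, B) and (B, A) count as the same link
    actual = len({frozenset(edge) for edge in edges})
    return actual / potential


# Hypothetical network: 4 nodes, 4 links, so 6 potential links.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]
density = network_density(4, edges)
print(f"density = {density:.3f}")
```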

Distance between nodes - Answer-Length (number of links) of the shortest path between them

Highest degree centrality - Answer-Node with most links
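
Both measures can be computed from an adjacency structure. The following is a minimal sketch for an unweighted, undirected network, using breadth-first search for the shortest path; the example links and the function names are hypothetical.

```python
from collections import deque


def build_adjacency(edges):
    """Map each node to the set of its neighbours (undirected)."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def shortest_path_length(adjacency, source, target):
    """Number of links on the shortest path, or None if unreachable."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, distance = queue.popleft()
        if node == target:
            return distance
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


def highest_degree_node(adjacency):
    """Node with the most links; ties go to the alphabetically first node."""
    return max(sorted(adjacency), key=lambda node: len(adjacency[node]))


# Hypothetical network with five nodes.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C"), ("C", "E")]
adjacency = build_adjacency(edges)

distance_a_d = shortest_path_length(adjacency, "A", "D")  # path A - C - D
hub = highest_degree_node(adjacency)                      # C has four links
print(distance_a_d, hub)
```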

Briefly explain the initial centroid selection problem in k-means clustering and
suggest possible ways to overcome this problem - Answer-The initial centroid problem
occurs because the initial centroids are picked at random. This can affect the
