Summary

ECB3ADAVE2 - Applied Data Analysis and Visualization II - Full Summary

SKU: doc_1376376
Rating: 4.82 (17 reviews)
Author: lisannelouwerse


A detailed summary of all the relevant unsupervised learning methods, based on the book, articles, lecture slides, exercises and assignments, and additional articles and videos I found through Google. Edit: I was told that the hyperlinks in the document don't work. Once you have bought the summary, please send me a message and I'll send you the PDF with working hyperlinks :)


Written for

Institution: Universiteit Utrecht (UU)
Study: Economics And Business Economics
Course: Applied Data Analysis and Visualization II (ECB3ADAVE2)


Document information

Uploaded on: November 7, 2021
File latest updated on: November 8, 2021
Number of pages: 49
Written in: 2021/2022
Type: Summary

Subjects

data science
data analysis
unsupervised learning
utrecht university
association rule analysis
clustering
pca
nmf
plsa
ica
mds
universiteit utrecht
uu
economics and business economics
fa
ca

Content preview

Applied Data Analysis and Visualization II
Universiteit Utrecht – ECB3ADAVE2

Written by Lisanne Louwerse

Summary

Table of contents
Week 1
    Supervised vs. unsupervised learning
    Association rule analysis
Week 2
    What is clustering?
    K-means clustering
    Hierarchical clustering
Week 3
    Dimension reduction
    Principal component analysis (PCA)
Week 4
    Non-negative matrix factorization (NMF)
    Probabilistic latent semantic analysis (PLSA)
Week 5
    Factor analysis (FA)
    Independent component analysis (ICA)
Week 6
    Multidimensional scaling (MDS)
Week 7
    Contingency tables and correspondence tables
    Correspondence analysis (CA)
Key takeaways
    Association rule analysis
    Cluster analysis
    Principal component analysis
    Non-negative matrix factorization
    Probabilistic latent semantic analysis
    Factor analysis
    Independent component analysis
    Multidimensional scaling
    Correspondence analysis


Week 1
Key Words
▪ Supervised / unsupervised learning
▪ Antecedent and consequent
▪ Support, confidence and lift
▪ Apriori algorithm and Apriori principle

Supervised vs. unsupervised learning

▪ Supervised learning
Building a statistical model for predicting / estimating an output (y) based on one or
more inputs (x).
o Classification: predict to which category an observation belongs (qualitative
outcomes).
o Regression: predict a quantitative outcome.

▪ Unsupervised learning
Only inputs (x) are observed; there are no outputs (y). The aim is to learn structure and relationships from the data, for example:
o Discovering associations among variable values → association rule analysis
o Discovering unknown subgroups of observations → clustering
o Dimension reduction → principal component analysis

Association rule analysis
Goal: to find joint values of the variables $x_1, \dots, x_p$ that appear together most frequently in the database.
In the case of binary-valued data, association rule analysis is called ‘market basket’ analysis.
Transactions are represented in a binary incidence matrix:

$$
x_{ij} =
\begin{cases}
1, & \text{if the $j$th item is purchased as part of transaction $i$,} \\
0, & \text{if the $j$th item is not purchased as part of transaction $i$.}
\end{cases}
$$

This matrix can now be used to find association rules.
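
As a minimal sketch of the incidence matrix described above, the snippet below builds it for a small, made-up set of market baskets. The item names and the four transactions are purely illustrative assumptions and do not come from the course material.

```python
# Hypothetical toy data: each transaction is the set of items in one basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter"},
    {"bread", "butter"},
]

# Fix a column order for the items (alphabetical here, an arbitrary choice).
items = sorted(set().union(*transactions))

# incidence[i][j] = 1 if item j is part of transaction i, and 0 otherwise.
incidence = [
    [1 if item in basket else 0 for item in items]
    for basket in transactions
]

print(items)
for row in incidence:
    print(row)
```

Each row is one transaction and each column is one item, so column sums give the number of baskets that contain a given item.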
An association rule is an implication of the form

A ⇒ B (antecedent ⇒ consequent)

In market basket analysis, it can be seen as an if-then statement:
If you buy A, there is a chance that you buy B as well.

Properties of association rules
The support (or prevalence) of association rule A ⇒ B is the relative frequency of the rule.
It is the probability of simultaneously observing A and B in a randomly selected market basket, so Pr(A, B).

$$
\operatorname{supp}(A \Rightarrow B) = \frac{\text{number of transactions containing $A$ and $B$}}{\text{total number of transactions}}
$$

Note that this is the support of an association rule. The support of a single item or itemset A is defined as:

$$
\operatorname{supp}(A) = \frac{\text{number of transactions containing $A$}}{\text{total number of transactions}}
$$
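
The two support definitions above can be sketched in a few lines of Python. The function name `support` and the toy baskets are assumptions made for illustration only; the support of a rule A ⇒ B is obtained by passing the union of A and B.

```python
# Hypothetical toy data: each transaction is the set of items in one basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter"},
    {"bread", "butter"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


# Support of a single item: 3 of the 4 baskets contain bread.
supp_bread = support({"bread"}, transactions)

# Support of the rule bread => milk: 2 of the 4 baskets contain both items.
supp_bread_milk = support({"bread", "milk"}, transactions)

print(supp_bread, supp_bread_milk)
```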

The confidence of association rule A ⇒ B is the conditional probability of B given A, so Pr(B|A). It is the likelihood of item B being purchased when item A is purchased.

$$
\operatorname{conf}(A \Rightarrow B) = \frac{\text{number of transactions containing $A$ and $B$}}{\text{number of transactions containing $A$}}
$$

▪ If conf = 1: B is always purchased when A is purchased.
▪ If conf = 0: B is never purchased when A is purchased.
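
A minimal sketch of the confidence calculation, again on made-up baskets. The function name `confidence` is an assumption for illustration, and the sketch assumes that the antecedent occurs in at least one transaction (otherwise the ratio is undefined).

```python
# Hypothetical toy data: each transaction is the set of items in one basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter"},
    {"bread", "butter"},
]


def confidence(antecedent, consequent, transactions):
    """Estimate Pr(consequent | antecedent) from transaction counts."""
    a = set(antecedent)
    both = a | set(consequent)
    n_a = sum(1 for basket in transactions if a <= basket)
    n_both = sum(1 for basket in transactions if both <= basket)
    return n_both / n_a  # assumes n_a > 0


# bread => milk: 2 of the 3 bread baskets also contain milk.
conf_bread_milk = confidence({"bread"}, {"milk"}, transactions)

# milk => bread: both milk baskets also contain bread, so conf = 1.
conf_milk_bread = confidence({"milk"}, {"bread"}, transactions)

print(conf_bread_milk, conf_milk_bread)
```

The example also shows that confidence is not symmetric: swapping antecedent and consequent changes the denominator.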

Drawback: Confidence ignores how frequent the consequent (B) is on its own. If B appears in almost every transaction, the confidence of A ⇒ B will tend to be high for any antecedent (A), even an infrequent one. Because of this, a rule containing two items that actually have a weak association may still have a high confidence value.
To overcome this problem, lift is introduced.
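
To make the drawback concrete, here is a small constructed example. The counts are invented so that A and B are exactly independent (Pr(A, B) = Pr(A) · Pr(B)), yet the confidence of A ⇒ B is still 0.9 simply because B is in 90% of all baskets.

```python
# Hypothetical data: 100 baskets in which A and B are statistically independent.
transactions = (
    [{"A", "B"}] * 9   # A and B together
    + [{"A"}] * 1      # A without B
    + [{"B"}] * 81     # B without A
    + [set()] * 9      # neither A nor B
)

n_total = len(transactions)
n_a = sum(1 for t in transactions if "A" in t)
n_b = sum(1 for t in transactions if "B" in t)
n_ab = sum(1 for t in transactions if {"A", "B"} <= t)

conf_a_b = n_ab / n_a    # Pr(B | A)
share_b = n_b / n_total  # Pr(B), the baseline frequency of B

print(conf_a_b, share_b)
```

The confidence equals the baseline share of B, so knowing that A is in the basket tells us nothing about B, even though the confidence looks high.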

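The lift of association rule A ⇒ B is the conditional probability of item B given A, divided by the support (relative frequency) of B. In this way it controls for how frequent B is on its own.

$$
\operatorname{lift}(A \Rightarrow B) = \frac{\operatorname{conf}(A \Rightarrow B)}{\operatorname{supp}(B)} = \frac{(\text{number of transactions containing $A$ and $B$}) \,/\, (\text{number of transactions containing $A$})}{(\text{number of transactions containing $B$}) \,/\, (\text{total number of transactions})}
$$

In other words:

$$
\operatorname{lift}(A \Rightarrow B) = \frac{\text{probability of having $B$ in the transaction given that $A$ is present}}{\text{probability of having $B$ in the transaction without any knowledge about the presence of $A$}} = \frac{\Pr(B \mid A)}{\Pr(B)}
$$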

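▪ If lift = 1: A and B are independent.
▪ If lift > 1: A and B occur together more often than would be expected if they were independent.
▪ If lift < 1: A and B occur together less often than would be expected if they were independent, which suggests that they are substitutes for each other. The presence of one item has a negative effect on the presence of the other item.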

Lift can be seen as the “strength” of the rule.
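
A minimal sketch of lift, computed as confidence divided by the support of the consequent. The function name `lift` and the toy baskets are assumptions for illustration; the sketch assumes that both the antecedent and the consequent occur in at least one transaction.

```python
# Hypothetical toy data: each transaction is the set of items in one basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter"},
    {"bread", "butter"},
]


def lift(antecedent, consequent, transactions):
    """Estimate Pr(B | A) / Pr(B) for the rule A => B from transaction counts."""
    a, b = set(antecedent), set(consequent)
    n_total = len(transactions)
    n_a = sum(1 for basket in transactions if a <= basket)
    n_b = sum(1 for basket in transactions if b <= basket)
    n_ab = sum(1 for basket in transactions if (a | b) <= basket)
    conf = n_ab / n_a        # confidence of A => B (assumes n_a > 0)
    supp_b = n_b / n_total   # support of the consequent (assumes n_b > 0)
    return conf / supp_b


# lift > 1: bread and milk appear together more often than under independence.
lift_bread_milk = lift({"bread"}, {"milk"}, transactions)

# lift < 1: bread and butter appear together less often than under independence.
lift_bread_butter = lift({"bread"}, {"butter"}, transactions)

print(round(lift_bread_milk, 3), round(lift_bread_butter, 3))
```

Unlike confidence, lift is symmetric: swapping antecedent and consequent gives the same value, because both orderings reduce to Pr(A, B) / (Pr(A) · Pr(B)).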



Reviews from verified buyers


ab3800 Economics And Business Economics · 1 review

1 year ago

very good and detailed summary, only thing that is missing is deep learning week 8.

giovannatullume Business And Economics · 2 reviews

1 year ago

This is a very good summary of the course, but week 2 on linear algebra is missing.
