0HM270 - Supercrunchers Summary – 19-20 - Q4

Content
Lecture 1: Intro lecture
Lecture 2: User aspects of Recommender systems
Lecture 3: Manager vs Machine and more
Lecture 4: Interactive recommender systems
Lecture 5: Brunswik’s Lens model / Dawes 1974
Lecture 6: Learning analytics and skin cancer detection
Lecture 7: Some notes on prediction
Lecture 8: Netflix for Good – Guest lecture Alain Stark
Lecture 9: Website (online) adaptation

Lecture 1: Intro lecture
Supercrunching = using (sometimes a lot of) data to predict something that
- We normally cannot predict well
- Humans normally tend to predict (i.e., predictions that are usually left to human judgment)

HMI = human model interaction

The timeline of ideas: idea → … → … → … → world-wide implementation (difficult to get here)
- Which hurdles need to be overcome?
- Can we find consistencies across topics?
- Which kind of crunchers are more likely to be adopted?
- When do which kind of counter-arguments pop up? What can we do about these?
- Etc.

Example: Cook County Hospital
Not enough rooms, overworked staff, many patients without insurance, etc.
The most common complaint: acute chest pain. There was not much agreement between physicians on what counts as high, medium or low risk.
Goldman found that only 4 things matter: ECG, blood pressure, fluid in the lungs, and unstable angina. He turned these into a simple scheme.
Reilly tested Goldman’s idea: physicians were right about 82% of the time, Goldman’s scheme about 95% of the time.
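
The notes don’t give Goldman’s actual decision rules, but a minimal sketch of what a checklist-style scheme over these four factors could look like (the scoring and cut-offs below are illustrative placeholders, not Goldman’s real criteria):

# Illustrative sketch of a checklist-style risk scheme in the spirit of
# Goldman's chest-pain rule. The four factors come from the notes; the
# scoring and cut-offs are made-up placeholders, NOT the real criteria.

def chest_pain_risk(ecg_abnormal: bool,
                    low_blood_pressure: bool,
                    fluid_in_lungs: bool,
                    unstable_angina: bool) -> str:
    """Count how many of the four risk factors are present and map the
    count to a coarse risk category."""
    score = sum([ecg_abnormal, low_blood_pressure, fluid_in_lungs, unstable_angina])
    if score >= 3:
        return "high risk"
    elif score >= 1:
        return "medium risk"
    return "low risk"

# Example: abnormal ECG plus fluid in the lungs -> medium risk under this toy rule
print(chest_pain_risk(True, False, True, False))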

Clinical prediction (human (expert)) versus statistical prediction (computer model, scheme, etc.)
Most often, the model wins!! But this depends on the context.

Where to expect that a human will outperform a computer
- Emotion recognition / emotional support
- In situations where social cues are important
- Where human interaction is very important
- Intuition

Why is it that computer models often beat (expert) humans?
In total, there are 88 well-documented reasons/flaws in human judgment. Some of these are:
- Our memory fools us (Wagenaar)
- Dealing with probabilities / base rate neglect (Bar-Hillel)
- We emphasize the improbable (Stickler)
- Confirmation bias (Edwards, Wason)
- Hindsight bias (Fischhoff)
- Cognitive dissonance (Festinger)
- Mental floating frankfurter: what you see when you put your fingers close to your eyes and
try to see through them: a floating piece of meat. You know it is not really there, but as soon
as you see it, you cannot help it; you just see it. The same holds for decision-making biases:
even if you know you have them, that knowledge doesn’t help you get rid of them.
- Mental sets: certain ways of thinking you have learned. These make it difficult to think outside
the box, so you use less of your creativity. (Redelmayer, Tversky)
- Memory: people are not very good at remembering things. We don’t only forget things, we
also get ‘extra stuff in’ that is not supposed to be there, so we remember things (partly)
wrong.
- Availability heuristic: a mental shortcut that relies on immediate examples that come to mind
when evaluating a specific topic.
- Dealing with probabilities = difficult for people
- Overconfidence. E.g. estimates of how many quiz questions you will get correct are
generally too high. When you are better at something, the overconfidence is generally
worse.
- Finding non-existent patterns. Predict what is going to happen when 2/3 of the trials are
green and 1/3 are red. The optimal strategy is to always press green, because you don’t know
what the next trial will be; this gives 2/3 correct. The other strategy is to guess each time,
with about 2 greens to every 1 red (probability matching). However, this scores lower than
2/3 correct: you are right with probability (2/3)·(2/3) + (1/3)·(1/3) = 5/9 ≈ 0.56. (A small
simulation sketch follows this list.)
- The broken leg cue. E.g. trying to predict whether or not you are going to the cinema this
Saturday. When I hear you have a broken leg, I know you won’t go. In our situation, the
corona-virus would be the broken leg cue. Because of this, you can predict that people are
not going to the cinema. If you know this broken leg cue, in all likelihood you would have a
perfect prediction. The problem: humans see broken leg cues everywhere, way more often
than they actually should.
- The issue of feedback. People learn when they get immediate and unambiguous feedback,
but in many cases this feedback is simply not there: there is often a lot of time in between, it
is not obvious what exactly you did right or wrong, and you don’t know whether what you
did influenced the outcome or whether it was something else.
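
A minimal simulation of the green/red task above, assuming independent trials with P(green) = 2/3; it compares the always-green strategy against probability matching (guessing green 2/3 of the time):

import random

random.seed(0)
N = 100_000
P_GREEN = 2 / 3  # the light is green 2/3 of the time, red 1/3 of the time

outcomes = ["green" if random.random() < P_GREEN else "red" for _ in range(N)]

# Strategy 1: always press green -> correct whenever the outcome is green (~2/3)
always_green = sum(o == "green" for o in outcomes) / N

# Strategy 2: probability matching -> guess green 2/3 of the time, red 1/3
matching = sum(
    ("green" if random.random() < P_GREEN else "red") == o for o in outcomes
) / N

print(f"always green:         {always_green:.3f}")   # ~0.667
print(f"probability matching: {matching:.3f}")       # ~0.556 = (2/3)^2 + (1/3)^2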

Decision making = store, retrieve, combine information + learn from feedback
A human is not very well equipped to do that, a computer is.

Two competing theories
1. Naturalistic decision making (NDM)
a. = an attempt to understand how people make decisions in real-world contexts that
are meaningful and familiar to them.
b. It is not clear why people make a certain decision, but there is a certain experience
and intuition built up over time that helps them make the decision (Klein, Shanteau)
c. Counterargument: studies are done in the lab, where decision making is different
from normal life.
2. Fast and frugal heuristics

a. People don’t decide in perfect ways, but they have shortcuts which are (over
time) good enough to make decisions (Gigerenzer)

Difficult issues when implementing ideas:
- When the model makes a mistake, who can we blame?
- Patients may complain, e.g. ‘Who is this idiot treating me with a card/scheme? Why can’t I
get a real doctor, who doesn’t need a card?’
Possible solution: look at the scheme before entering the patient’s room, so you remember it and
no longer need it in the room with the patient.

Conclusion:
It is not:
- Humans (or experts) are stupid
Instead:
- Models can beat humans, sometimes
- We have a quite good idea as to why this happens: people make mistakes that are consistent
(not random)
- Implementation issues are often more complicated to solve than building the model itself
(modeling is easy, humans are complicated)

Lecture 2: User aspects of Recommender systems
Recommender systems:
- Field that combines machine learning, information retrieval and human-computer
interaction (HCI)
- Help overcome information overload, find relevant stuff in the big pile of information
- Offers personalized suggestions based on what it knows about the user, e.g. history of what
the user liked and disliked
- Main task: predict what other items the user would also like
- The prediction task is part of the recommendation task. When you have a good prediction, it
doesn’t automatically mean it is a good recommendation.

Most popular methods of recommender systems:
- Content-based filtering
- Collaborative filtering (CF)
o Neighborhood methods
▪ User-based
▪ Item-based
o Matrix factorization / SVD (singular value decomposition)

What data to use to build a user profile?
- Explicit data
o Ratings of individual items
o Different types of scales
- Implicit data
o Click streams
o Wish list
o Purchase data
o Viewing times

Content-based recommender system (personalization)
- User profile is a content description of previous interests (expressed through ratings); see the sketch after this list

- It uses these content features (meta-data) to find other movies
o Meta-data can be the genre, the actors etc.
- Advantages:
o Profiles are individual in nature and don’t rely on other users (benefit of privacy!)
o Easy to explain and control by the user
o Can be run client-side (privacy!)
- Drawbacks:
o Overspecializes the item selection
▪ Only based on previous ratings by this particular user
o Difficult to get unexpected items
▪ And people value novel, serendipitous items the most. We want to find new
things.
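
A minimal sketch of the content-based idea, assuming binary genre meta-data and a profile built as the rating-weighted sum of the features of previously rated movies; the movie names, features and ratings are made-up examples, not course data:

# Toy content-based recommender: items are described by binary genre features,
# the user profile is the rating-weighted sum of the features of rated items,
# and unrated items are ranked by cosine similarity to that profile.
import math

# Made-up meta-data: item -> genre feature vector [action, comedy, drama]
items = {
    "Movie A": [1, 0, 0],
    "Movie B": [1, 0, 1],
    "Movie C": [0, 1, 0],
    "Movie D": [0, 1, 1],
}
user_ratings = {"Movie A": 5, "Movie B": 4}  # items this user already rated

# Build the profile: weight each rated item's features by its rating
profile = [0.0, 0.0, 0.0]
for name, rating in user_ratings.items():
    for i, f in enumerate(items[name]):
        profile[i] += rating * f

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = math.sqrt(sum(x * x for x in a)), math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Rank the items the user has not rated yet by similarity to the profile
candidates = {n: cosine(profile, f) for n, f in items.items() if n not in user_ratings}
for name, score in sorted(candidates.items(), key=lambda kv: -kv[1]):
    print(name, round(score, 3))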

Collaborative filtering (CF)
- Matching user’s ratings with those of similar users
o Find out how users are similar in what they like and dislike
o Completely data driven, no meta-data needed
- Advantages:
o Domain-free and no explicit profiles/content needed
o Can express aspects in the data that are hard to profile
- Drawbacks:
o Cold-start problem: new users have not rated anything / new items have no ratings
yet. So, you don’t know what to recommend.
o Sparsity: each user has only rated a few items, so you are missing a lot of information
o Server-side: privacy issues in data collection and storage

2 types of collaborative filtering (CF):
- Neighborhood methods (clustering, K-NN)
- Latent factor models (matrix factorization, dimensionality reduction methods)

2 types of neighborhood methods:
- User-based collaborative filtering
o Find users similar to A, then form a neighborhood (clique). Find items rated by the
clique but not by A and predict how A would rate each of these (weighting the other
users’ ratings by their similarity to A). Recommend the items with the highest
predicted ratings (see the sketch after this list)
o Drawbacks:
▪ Computationally expensive, because you have to find similar users among all
of the users in the system (a large database). This will take a lot of time.
- Item-based collaborative filtering
o Similar, but based on similar items:
▪ Find items that are similar (instead of users), by calculating the similarity
between items based on user ratings. Generate a similarity matrix between
the items, based on similarity in the rating profile. So, when movies are rated
similarly by the same people, they are more similar. Use the similarity matrix
to calculate what the expected rating of other items would be.
▪ ‘If you like these items, you might like this as well’
o Computationally better for cases with many more users than items
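
A minimal sketch of user-based CF as described above, assuming cosine similarity computed over co-rated items and a similarity-weighted average as the prediction; the users, items and ratings are made-up. Item-based CF works the same way with the roles of users and items swapped:

# Toy user-based CF: predict user "A"'s rating for an item she has not rated,
# as a similarity-weighted average of the ratings given by other users.
import math

ratings = {  # made-up example data: user -> {item: rating}
    "A": {"i1": 5, "i2": 3},
    "B": {"i1": 4, "i2": 2, "i3": 4},
    "C": {"i1": 1, "i2": 5, "i3": 2},
}

def similarity(u, v):
    """Cosine similarity between two users over the items both have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    nu = math.sqrt(sum(ratings[u][i] ** 2 for i in common))
    nv = math.sqrt(sum(ratings[v][i] ** 2 for i in common))
    return dot / (nu * nv)

def predict(user, item):
    """Similarity-weighted average of the neighbours' ratings for the item."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        sim = similarity(user, other)
        num += sim * their[item]
        den += abs(sim)
    return num / den if den else None

print(predict("A", "i3"))  # A's predicted rating for item i3 (~3.2 here)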

How to measure the performance of CF? And how to optimize the prediction model?
- Deviation of the algorithmic prediction from the actual user ratings
- Training/test-set approach: fit the model on a training set and evaluate its predictions on a held-out test set (see the sketch below).
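
A minimal sketch of the training/test-set idea. The deviation is measured here as RMSE (root mean squared error), which is one common choice; the data and the trivial ‘model’ (predicting each item’s mean training rating) are placeholders:

# Toy evaluation: hold out some ratings as a test set, "train" a trivial model
# (each item's mean rating in the training set), and report RMSE on the test set.
import math

train = [("u1", "i1", 5), ("u2", "i1", 4), ("u1", "i2", 2), ("u3", "i2", 3)]
test  = [("u3", "i1", 4), ("u2", "i2", 1)]  # held-out (user, item, true rating)

# "Train": compute each item's mean rating on the training data
sums, counts = {}, {}
for _, item, r in train:
    sums[item] = sums.get(item, 0) + r
    counts[item] = counts.get(item, 0) + 1
item_mean = {item: sums[item] / counts[item] for item in sums}

# Evaluate: deviation of the predictions from the actual held-out ratings
sq_err = [(item_mean[item] - r) ** 2 for _, item, r in test]
rmse = math.sqrt(sum(sq_err) / len(sq_err))
print(f"RMSE on test set: {rmse:.3f}")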
