0HM270 - Supercrunchers Summary – 19-20 - Q4

Content
Lecture 1: Intro lecture

Lecture 2: User aspects of Recommender systems

Lecture 3: Manager vs Machine and more

Lecture 4: Interactive recommender systems

Lecture 5: Brunswik’s Lens model / Dawes 1974

Lecture 6: Learning analytics and skin cancer detection

Lecture 7: Some notes on prediction

Lecture 8: Netflix for Good – Guest lecture Alain Stark

Lecture 9: Website (online) adaptation

Lecture 1: Intro lecture
Supercrunching = using (sometimes a lot of) data to predict something that
- we normally cannot predict well, or
- humans normally tend to predict themselves

HMI = human-model interaction

The timeline of ideas: ideas → .. → … → … → world-wide implementation (difficult to get here)
- Which hurdles need to be overcome?
- Can we find consistencies across topics?
- Which kind of crunchers are more likely to be adopted?
- When do which kind of counter-arguments pop up? What can we do about these?
- Etc.

Example: Cook County Hospital
Not enough rooms, overworked staff, many patients without insurance, etc.
Most common complaint: acute chest pain. There was not much agreement between physicians on what
counts as high, medium, or low risk.
Goldman found out that only 4 things matter: ECG, blood pressure, fluid in the lungs, and unstable
angina. He created a scheme out of these.
Reilly tested Goldman’s idea: physicians were right 82% of the time, Goldman’s scheme 95% of the
time.
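As a hedged sketch of how such a scheme can work: count which of the four cues are present and map the count to a risk level. The function name and the thresholds below are invented for illustration; the real Goldman criteria combine the cues in a more refined way.

```python
# Hypothetical sketch of a Goldman-style decision rule: count how many of the
# four risk factors are present and map the count to a risk level.
# The real scheme's thresholds and combinations are more involved.

def chest_pain_risk(ecg_abnormal: bool, low_blood_pressure: bool,
                    fluid_in_lungs: bool, unstable_angina: bool) -> str:
    """Classify acute chest pain risk from four yes/no cues (illustrative only)."""
    score = sum([ecg_abnormal, low_blood_pressure, fluid_in_lungs, unstable_angina])
    if score >= 3:
        return "high"
    elif score >= 1:
        return "medium"
    return "low"

print(chest_pain_risk(True, False, False, False))  # medium
```

The point of the example is not the medicine but the form: a fixed, explicit rule that every physician applies identically, which is exactly what makes it easy to test against human judgment.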

Clinical prediction (human (expert)) versus statistical prediction (computer model, scheme, etc.)
Most often, the model wins!! But this depends on the context.

Where to expect that a human will outperform a computer
- Emotion recognition / emotional support
- In situations where social cues are important
- Where human interaction is very important
- Intuition

Why is it that computer models often beat (expert) humans?
In total, there are 88 well-documented reasons/flaws in human judgment. Some of these are:
- Our memory fools us (Wagenaar)
- Dealing with probabilities / base rate neglect (Bar-Hillel)
- We emphasize the improbable (Stickler)
- Confirmation bias (Edwards, Wason)
- Hindsight bias (Fischhoff)
- Cognitive dissonance (Festinger)
- Mental floating frankfurter: the illusion you see when you hold your fingertips close together
in front of your eyes and try to look through them: a floating piece of meat appears. You know
it is not really there, but you cannot help seeing it anyway. Decision-making biases work the
same way: even though you know you have them, knowing about them does not help you get rid of
them.
- Mental sets: certain ways of thinking you have learned. This makes it difficult to think outside
of the box, you use less of your creativity. (Redelmayer, Tversky)
- Memory: people are not very good at remembering things. We don’t only forget things, we
also get ‘extra stuff in’ that was never there, so we remember things (partly) wrong.
- Availability heuristic: a mental shortcut that relies on immediate examples that come to mind
when evaluating a specific topic.
- Dealing with probabilities = difficult for people
- Overconfidence. E.g. estimates of how many quiz questions you answered correctly are
generally too high. When you are better at something, the overconfidence is generally
worse.
- Finding non-existent patterns. E.g. predicting an outcome that is green 2/3 of the time and
red 1/3 of the time. The optimal strategy is to always press green, because you don’t know
what the next outcome will be; this gets you 2/3 correct. The other strategy is to guess each
time, with about 2 greens for every 1 red, but that scores lower than 2/3 correct.
- The broken leg cue. E.g. trying to predict whether or not you are going to the cinema this
Saturday. When I hear you have a broken leg, I know you won’t go. In our situation, the
corona-virus would be the broken leg cue. Because of this, you can predict that people are
not going to the cinema. If you know this broken leg cue, in all likelihood you would have a
perfect prediction. The problem: humans see broken leg cues everywhere, way more often
than they actually should.
- The issue of feedback. People learn when they get immediate and unambiguous feedback, but in
many cases such feedback is simply not there: there is often a lot of time in between, it is
not obvious what exactly you did right or wrong, and you don’t know whether your action
influenced the outcome or something else did.
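The ‘finding non-existent patterns’ item above can be checked with a small simulation: always predicting the majority colour (‘maximizing’) versus probability matching. The 2/3 green / 1/3 red numbers follow the lecture example; the rest is an illustrative sketch.

```python
import random

random.seed(0)
N = 100_000
p_green = 2 / 3

# Simulated sequence of outcomes: green with probability 2/3, red otherwise.
outcomes = ["green" if random.random() < p_green else "red" for _ in range(N)]

# Strategy 1: maximizing -- always predict the more likely colour.
always_green = sum(o == "green" for o in outcomes) / N

# Strategy 2: probability matching -- guess green about 2/3 of the time.
matching = sum(
    ("green" if random.random() < p_green else "red") == o for o in outcomes
) / N

print(f"always green: {always_green:.3f}")  # ~2/3
print(f"matching:     {matching:.3f}")      # ~p^2 + (1-p)^2 = 5/9, clearly lower
```

Probability matching scores p² + (1−p)² ≈ 0.56 here, which is why hunting for a pattern in pure noise loses to the boring constant guess.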

Decision making = store, retrieve, combine information + learn from feedback
A human is not very well equipped to do that, a computer is.

Two competing theories
1. Naturalistic decision making (NDM)
a. = an attempt to understand how people make decisions in real-world contexts that
are meaningful and familiar to them.
b. It is not clear why people make a certain decision, but there is experience
and intuition built up over time that helps in making the decision (Klein, Shanteau)
c. Counterargument: studies are done in the lab, where decision making is different
from how it is in normal life.
2. Fast and frugal heuristics

a. People don’t decide in perfect ways, but they have shortcuts which are (over
time) good enough to make decisions (Gigerenzer)

Difficult issues when implementing ideas:
- When the model makes a mistake, who can we blame?
- Patients may complain, e.g.: who is this idiot treating me with a card/scheme? Why can’t I
get a real doctor, one who doesn’t need a card?
Possible solution: look at the scheme before entering the patient’s room, so you can recall it
from memory and don’t need it in the room with the patient anymore.

Conclusion:
It is not:
- Humans (or experts) are stupid
Instead:
- Models can beat humans, sometimes
- We have a quite good idea as to why this happens: people make mistakes that are consistent
(not random)
- Implementation issues are often more complicated to solve than building the model is
(modeling is easy, humans are complicated)

Lecture 2: User aspects of Recommender systems
Recommender systems:
- Field that combines machine learning, information retrieval and human-computer
interaction (HCI)
- Help overcome information overload, find relevant stuff in the big pile of information
- Offers personalized suggestions based on what it knows about the user, e.g. history of what
the user liked and disliked
- Main task: predict what other items the user would also like
- The prediction task is part of the recommendation task, but a good prediction does not
automatically mean a good recommendation.

Most popular methods of recommender systems:
- Content-based filtering
- Collaborative filtering (CF)
o Neighborhood methods
▪ User-based
▪ Item-based
o Matrix factorization / SVD (singular value decomposition)

What data to use to build a user profile?
- Explicit data
o Ratings of individual items
o Different types of scales
- Implicit data
o Click streams
o Wish list
o Purchase data
o Viewing times

Content-based recommender system (personalization)
- User profile is content description of previous interests (expressed through ratings)

, - It uses these content features (meta-data) to find other movies
o Meta-data can be the genre, the actors etc.
- Advantages:
o Profiles are individual in nature and don’t rely on other users (benefit of privacy!)
o Easy to explain and control by the user
o Can be run client-side (privacy!)
- Drawbacks:
o Overspecializes the item selection
▪ Only based on previous ratings by this particular user
o Difficult to get unexpected items
▪ And people value novel, serendipitous items the most. We want to find new
things.
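A minimal sketch of the content-based idea described above, with made-up movies and genres: items get binary genre vectors, the user profile is the average vector of the items the user liked, and unseen items are ranked by cosine similarity to that profile.

```python
import math

# Hypothetical movies described by binary genre vectors (meta-data).
GENRES = ["action", "comedy", "drama", "sci-fi"]
movies = {
    "Movie A": [1, 0, 0, 1],  # action / sci-fi
    "Movie B": [1, 0, 0, 1],  # action / sci-fi
    "Movie C": [0, 1, 1, 0],  # comedy / drama
}
liked = ["Movie A"]  # items this user rated highly

# User profile = mean of the liked items' content vectors.
profile = [sum(movies[m][i] for m in liked) / len(liked) for i in range(len(GENRES))]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Score every unseen movie against the profile; recommend the best match.
scores = {m: cosine(profile, vec) for m, vec in movies.items() if m not in liked}
print(max(scores, key=scores.get))  # Movie B, which shares Movie A's genres
```

The overspecialization drawback is visible directly: the top recommendation is always a near-clone of what was already liked, so serendipitous items like Movie C score low by construction.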

Collaborative filtering (CF)
- Matching user’s ratings with those of similar users
o Find out how users are similar in what they like and dislike
o Completely data driven, no meta-data needed
- Advantages:
o Domain-free and no explicit profiles/content needed
o Can express aspects in the data that are hard to profile
- Drawbacks:
o Cold-start problem: new users have not rated anything / new items have no ratings
yet. So, you don’t know what to recommend.
o Sparsity: each user has only rated a few items, so you are missing a lot of information
o Server-side: privacy issues in data collection and storage

2 types of collaborative filtering (CF):
- Neighborhood methods (clustering, K-NN)
- Latent factor models (matrix factorization, dimensionality reduction methods)
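As a hedged illustration of the latent-factor idea, here is a tiny matrix factorization trained by stochastic gradient descent on a made-up rating dictionary. The ratings, factor count k, and hyperparameters are all invented for the sketch; real systems (e.g. SVD-style models) are far larger but follow the same shape.

```python
import random

random.seed(42)

# (user, item) -> rating on a 1-5 scale; missing pairs are unknown.
ratings = {
    (0, 0): 5, (0, 1): 4, (1, 0): 4, (1, 2): 2, (2, 1): 1, (2, 2): 5,
}
n_users, n_items, k = 3, 3, 2  # k latent factors per user and per item

P = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
Q = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]

lr, reg = 0.05, 0.02  # learning rate and regularization strength
for _ in range(500):
    for (u, i), r in ratings.items():
        pred = sum(P[u][f] * Q[i][f] for f in range(k))
        err = r - pred
        for f in range(k):  # nudge both factor vectors toward the residual
            puf = P[u][f]
            P[u][f] += lr * (err * Q[i][f] - reg * puf)
            Q[i][f] += lr * (err * puf - reg * Q[i][f])

# Predict an unseen rating, e.g. user 0 on item 2:
print(round(sum(P[0][f] * Q[2][f] for f in range(k)), 2))
```

After training, the dot product of a user's and an item's factor vectors approximates the known ratings, and the same dot product fills in the missing cells, which is exactly the prediction task.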

2 types of neighborhood methods:
- User-based collaborative filtering
o Find similar users like A, then form a neighborhood (clique). Find items rated by the
clique but not by A and predict how A would rate all of these (weighing other user
ratings by their similarity to A). Recommend the items with the highest predicted
ratings
o Drawbacks:
▪ Computationally expensive, because you have to find similar users among all
the users in the system (a large database), which takes a lot of time.
- Item-based collaborative filtering
o Similar, but based on similar items:
▪ Find items that are similar (instead of users), by calculating the similarity
between items based on user ratings. Generate a similarity matrix between
the items, based on similarity in the rating profile. So, when movies are rated
similarly by the same people, they are more similar. Use the similarity matrix
to calculate what the expected rating of other items would be.
▪ ‘If you like these items, you might like this as well’
o Computationally better for cases with many more users than items
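The item-based steps above can be sketched on a tiny hypothetical rating matrix (rows = users, columns = items, 0 = not rated): build an item-item cosine similarity matrix from the rating columns, then predict an unseen rating as the similarity-weighted average of the user's other ratings.

```python
import math

# Hypothetical rating matrix: users 0-1 like items 0-1, users 2-3 like items 2-3.
R = [
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
]

def item_cosine(R, i, j):
    """Cosine similarity between two item columns, over users who rated both."""
    pairs = [(row[i], row[j]) for row in R if row[i] and row[j]]
    if not pairs:
        return 0.0
    dot = sum(a * b for a, b in pairs)
    na = math.sqrt(sum(a * a for a, _ in pairs))
    nb = math.sqrt(sum(b * b for _, b in pairs))
    return dot / (na * nb)

def predict(R, user, item):
    """Similarity-weighted average of the user's ratings on the other items."""
    num = den = 0.0
    for j, r in enumerate(R[user]):
        if j != item and r:
            s = item_cosine(R, item, j)
            num += s * r
            den += abs(s)
    return num / den if den else 0.0

# User 0 has not rated item 2; item 2 is most similar to item 3, which user 0
# rated low, so the prediction is pulled down accordingly.
print(round(predict(R, 0, 2), 2))  # ≈ 2.64
```

Note that the similarity matrix depends only on the ratings, not on any meta-data, which is the ‘completely data driven’ property mentioned above.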

How to measure performance of CF? And how to optimize the prediction model?
- Deviation of algorithmic prediction from actual user ratings
- Training-test set approach: fit the prediction model on a training set and evaluate its
predictions on a held-out test set.
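The deviation measure above is typically reported as RMSE (root mean squared error) or MAE (mean absolute error) on the held-out test ratings. A minimal sketch with invented test ratings and model predictions:

```python
import math

# Held-out test ratings and the model's predictions for the same (user, item)
# pairs -- both lists are made up for illustration.
actual    = [4, 3, 5, 2, 4]
predicted = [3.8, 3.4, 4.5, 2.6, 3.9]

# RMSE penalizes large errors more heavily; MAE weighs all errors equally.
rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
print(f"RMSE = {rmse:.3f}, MAE = {mae:.3f}")  # RMSE = 0.405, MAE = 0.360
```

Lower is better for both; optimizing the prediction model means tuning it to minimize this deviation on data it was not trained on.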
