Explainability Part
— Lecture 1: XAI Intro —
What is Explainable AI/ML
• No consensus on a universal definition: definitions are domain-specific
• Interpretability: ability to explain or to present in understandable terms to a
human
⁃ The degree to which a human can understand the cause of a decision
⁃ The degree to which a human can consistently predict the result of a
model
• Explanation: answer to a why question
⁃ Usually relates the feature values of an instance to its model prediction
in a humanly understandable way
• Molnar: model interpretability (global) vs explanation of an individual
prediction (local)
• Ribeiro: explainable models are interpretable if they use a small set of
features; "an explanation is a local linear approximation of the model's
behavior" (see the sketch below)
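Ribeiro's "local linear approximation" idea can be illustrated in a few lines: perturb the instance of interest, query the black box on the perturbations, and fit a distance-weighted linear model whose coefficients serve as the explanation. A minimal sketch in that spirit (not the actual LIME implementation; the black-box model, perturbation scale, and proximity kernel below are arbitrary assumptions):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

# Toy black-box model; dataset and perturbation scale are illustrative choices.
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

x0 = X[0]                                                # instance to explain
rng = np.random.default_rng(0)
Z = x0 + rng.normal(0.0, 0.5, size=(1000, X.shape[1]))   # perturbed neighbours
p_z = black_box.predict_proba(Z)[:, 1]                   # black-box outputs
weights = np.exp(-np.linalg.norm(Z - x0, axis=1) ** 2)   # closer points count more

# The weighted linear surrogate is the local explanation: its coefficients
# approximate how each feature drives the prediction around x0.
surrogate = Ridge(alpha=1.0).fit(Z, p_z, sample_weight=weights)
for name, coef in zip([f"x{i}" for i in range(X.shape[1])], surrogate.coef_):
    print(name, round(coef, 3))
```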
Motivation: why do we need XAI?
• Scientific understanding
• Bias/fairness issues: does my model discriminate?
• Model debugging and auditing: why did my model make this mistake? How can I
understand/interfere with the model?
• Human-AI cooperation/acceptance
• Regulatory compliance: does my model satisfy legal requirements (e.g. GDPR)?
Healthcare, finance/banking, insurance
• Applications: affect recognition in video games, intelligent tutoring systems,
bank loan decision, bail/parole decisions, critical healthcare predictions (e.g.
cancer, major depression), film/music recommendation, job interview
recommendation/job offer, personality impression prediction for job interview
recommendation, tax exemption
Taxonomy
• Feature statistics: feature importance and interaction strengths (see the
permutation-importance sketch after this list)
• Feature visualizations: partial dependence and feature importance plots
• Model internals: linear model weights, DT structure, CNN filters, etc.
• Data points: exemplars in counterfactual explanations
• Global or local surrogates via intrinsically interpretable models
• Example: play tennis decision tree:
⁃ Intrinsic, model specific, global & local, model internals
• Example: CNN decision areas in images:
⁃ Post-hoc, model specific, local, model internals
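As an illustration of the "feature statistics" category above, a minimal permutation-importance sketch: shuffle one feature at a time and record the drop in test accuracy. The dataset and model are placeholders; any fitted classifier would do.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Placeholder dataset and model; permutation importance is model-agnostic.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
baseline = model.score(X_te, y_te)

rng = np.random.default_rng(0)
importances = []
for j in range(X_te.shape[1]):
    X_perm = X_te.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])     # break feature-target link
    importances.append(baseline - model.score(X_perm, y_te))

# Largest accuracy drops = most important features (a global feature statistic).
print(sorted(enumerate(np.round(importances, 4)), key=lambda t: -t[1])[:5])
```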
Scope of Interpretability
• Algorithmic transparency: how does the algorithm generate the model?
• Global, holistic model interpretability:
⁃ How does the trained model make predictions?
⁃ Can we comprehend the entire model at once?
• Global model interpretability on a modular level: how do parts of the model
affect predictions?
• Local interpretability for a single prediction: why did the model make a certain
prediction for an instance?
• Local interpretability for a group of predictions:
⁃ Why did the model make specific predictions for a group of instances?
⁃ May be used for analyzing group-wise bias
Evaluation of interpretability
• Application-level evaluation (real task):
⁃ Deploy the interpretation method on the application
⁃ Let the experts experiment and provide feedback
• Human-level evaluation (simple task): during development, by lay people
• Function-level evaluation (proxy task):
⁃ Does not use humans directly
⁃ Uses measures from a previous human evaluation
• All of the above can be used for evaluating model interpretability as well as
individual explanations
Properties of explanation methods
• Expressive power: the "language" or structure of the explanations
⁃ E.g. IF-THEN rules, tree itself, natural language etc.
• Translucency: describes how much the explanation method relies on looking
into the machine learning model
• Portability: describes the range of machine learning models with which the
explanation method can be used
• Algorithmic complexity: computational complexity of the explanation method
Properties of individual explanations
• Accuracy: how well does an explanation predict unseen data?
• Fidelity: how well does the explanation approximate the prediction of the black
box model?
• Certainty/confidence: does the explanation reflect the certainty of the machine
learning model?
• Comprehensibility/plausibility:
⁃ How well do humans understand the explanations?
⁃ How convincing (trust building) are they?
⁃ Difficult to define and measure, but extremely important to get right
• Consistency: how much does an explanation differ between models that are
trained on the same task and produce similar predictions?
• Stability: how similar are the explanations for similar instances?
⁃ Stability within a model vs consistency across models
• Degree of importance: how well does the explanation reflect the importance of
features or parts of the explanation?
• Novelty and representativeness
How is explainability usually measured?
• Fidelity: should be measured objectively; but not all explanations can be
checked for fidelity
• Plausibility: requires a user study
• Simulatability: the degree to which a human can calculate/predict the
model's outcome, given the explanation
What is a good explanation?
• Contrastive: requires a point of reference for comparison
• Selective: precise, a small set of most important factors; humans can handle
at most 7 ± 2 cognitive entities at once
• Social: considers the social context (environment/audience)
• Truthful (scientifically sound): good explanations prove to be true in reality
• General and probable: a cause that can explain many events is very general
and could be considered a good explanation
• Consistent with prior beliefs of the explainee
Interpretability vs Explainability revisited
• Individual terms may not have a precise definition, but interpretable models:
⁃ Are transparent and simple enough to understand
⁃ Stand for their own explanation
⁃ Thus, their explanations reflect perfect fidelity
• Black-box models:
⁃ Require post-hoc explanations (as an excuse for their opacity)
⁃ Cannot have perfect fidelity with respect to the original model
⁃ Their explanations often do not make sense, or do not provide enough
detail to understand what the black box is doing
⁃ Are often not compatible with situations where information outside the
database needs to be combined with a risk assessment
— Lecture 2: Interpretable models —
Linear models
• Nested model families: Generalized Additive Models ⊃ Additive Models ⊃
Linear Models ⊃ Scoring Systems
• Assumptions:
⁃ Linearity: f(x+y) = f(x) + f(y), f(cx) = cf(x)
⁃ Normality of the target variable
⁃ Homoscedasticity: constant variance
⁃ Independent instance distribution
⁃ Absence of multicollinearity: no pairs of strongly correlated features (a
rough check is sketched after this list)
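Two of these assumptions can be checked informally on a fitted model: pairwise feature correlations for multicollinearity, and residual spread across the fitted range for homoscedasticity. A rough sketch on synthetic data (the data and the split-at-the-median check are illustrative only, not a formal test):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic regression data for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

model = LinearRegression().fit(X, y)
fitted = model.predict(X)
residuals = y - fitted

# Multicollinearity check: large off-diagonal correlations are a warning sign.
print("feature correlation matrix:\n", np.corrcoef(X, rowvar=False).round(2))

# Rough homoscedasticity check: residual spread should not depend on the fit.
low, high = fitted < np.median(fitted), fitted >= np.median(fitted)
print("residual std (low / high fitted values):",
      residuals[low].std().round(2), residuals[high].std().round(2))
```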
Interpretation of linear models
• Modular view: we assume all remaining feature values are fixed
• Numerical feature weight: increasing the numerical feature by one unit
changes the estimated outcome by its weight
• Binary feature weight: the contribution of the feature when it is set to 1
• Categorical feature with L categories:
⁃ Carry out one-hot-encoding into L binary features (e.g., 3 levels: 1 → [1, 0, 0],
2 → [0, 1, 0], 3 → [0, 0, 1]); see the sketch below
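A small sketch of these interpretation rules on made-up data (feature names and values are invented for illustration): the temperature weight is the predicted change in the outcome per one-unit increase, and each one-hot season weight is that category's contribution when its dummy feature equals 1.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up toy data: predict bike rentals from temperature and season.
df = pd.DataFrame({
    "temperature": [30, 25, 28, 22, 35, 20, 27, 24],
    "season": ["summer", "spring", "summer", "winter",
               "summer", "winter", "spring", "spring"],
    "rentals": [120, 90, 110, 40, 150, 35, 95, 85],
})

# One-hot-encode the categorical feature into L binary columns (here L = 3).
# In practice one level is often dropped as the reference category so that the
# remaining weights are interpreted relative to it.
X = pd.concat([df[["temperature"]],
               pd.get_dummies(df["season"], prefix="season")], axis=1)
model = LinearRegression().fit(X, df["rentals"])

# temperature weight: predicted change in rentals per +1 degree;
# season_* weight: contribution when that dummy feature equals 1.
for name, coef in zip(X.columns, model.coef_):
    print(name, round(coef, 2))
print("intercept:", round(model.intercept_, 2))
```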