2025-2026
This summary is based on the handbook Data Science Ethics, written by David Martens, and on the
topics covered in the course Data Science & Ethics, taught by Prof. Martens at the
FBE faculty of the University of Antwerp.
1: introduction
PRACTICAL INFORMATION
1. Exam (16/20)
Questions
● large question: explain and illustrate certain techniques or concepts ( /6)
● discussion case ( /6)
● 5 small questions that you have to answer in 2 lines ( /4)
INTRODUCTION
Data science has an impact
● costs and benefits for businesses
● decisions on humans
● more than making calls to predefined Python libraries
1. Responsible AI
The double-edged sword of advancement
It was the best of times…
● reduce risk (e.g. predict terrorist attacks)
● reduce crime (e.g. detect tax fraud)
● increase profitability
● improve medical diagnosis
● increased “good”
It was the worst of times…
● data leaks
● discrimination*
● digital pawns
● filter bubble
● increased “bad”
* models are built to discriminate good from bad, but they should not do so based on ethnicity or other sensitive attributes
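A first check on whether a model discriminates on a sensitive attribute is to compare how often each group receives a positive decision. The sketch below is only an illustration: the groups "A" and "B", the decisions, and the helper name approval_rates are all hypothetical and do not come from the handbook. A large gap is not proof of unfair discrimination, but it is a signal to investigate.

```python
from collections import defaultdict

# Hypothetical model decisions: (sensitive group, approved?)
decisions = [
    ("A", True), ("A", True), ("A", True), ("A", False),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

def approval_rates(records):
    """Return the share of positive decisions per group."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in records:
        totals[group] += 1
        approved[group] += ok  # True counts as 1, False as 0
    return {group: approved[group] / totals[group] for group in totals}

rates = approval_rates(decisions)
# Difference between the best- and worst-treated group.
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

In this toy data group A is approved three times as often as group B, which would prompt a closer look at the features the model relies on.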
Responsible AI = knowing what is right and wrong when working with AI
1.1 Data science and ethics
Ethics = moral principles that control or influence a person’s behavior or the conducting of an activity
Moral = connected with principles of right and wrong behavior
● law = what you can do
● ethics = what you should do
Data science ethics = the domain of what is right and wrong when doing data science
Responsible AI = the development and application of AI that is aligned with the moral values of society
2. Why care?
① Expected from society
We have reached an age where society expects business leaders and data scientists alike to be ethically responsible.
Generation Z
● born between 1995 and 2010
● 90 million in the US
● cares about social justice and ethics
② Huge potential risks
Be aware of the risks and countermeasures!
Risks for humans
● physical and mental well-being (e.g. self-driving cars)
● privacy
● discrimination
Risks for business
● reputational
● financial
AI in banks determines whether you get a loan or not, which has a big influence on your life. It is important to know
how these AI systems work and whether all ethical steps are considered.
③ Potential benefits
Understanding ethical concerns and applying techniques to address them can improve the data science model and
serve as a marketing instrument.
● remove bias in data: improve the accuracy and fairness of the model
● explain predictions: improve the trust in the model
● ensure proper data gathering: better data quality
● part of a company’s brand
④ The AI Act
● risk-based approach to regulate AI
● applies to every company that works with EU citizens
Requirements for high-risk uses of AI
● establish safeguards against biases in datasets
● prescribed data governance and management practices
● ability to verify and trace back outputs through the system’s life cycle
● provisions for acceptable levels of transparency and understandability for users of the systems
● appropriate human oversight
A risk-based approach
● unacceptable risk (e.g. AI systems that threaten the safety of people): banned
● high risk (e.g. AI systems used in employment, law enforcement or migration): strict obligations
● limited risk (e.g. chatbots)
● minimal risk (e.g. spam filters)
2.1 Summary
There are multiple reasons to care about data science ethics
● life goal in itself (philosophical goal)
● societal and business reasons
○ expected from society
○ huge potential risks
○ data science ethics can bring value
○ AI Act
MACHINE LEARNING
Machine learning = automatic extraction of knowledge from data
→ prediction of new things based on patterns detected in historical data
EXAMPLE
Banks have been using credit scoring for many years to predict whether or not customers will be able to repay their
loan.
If you give a dataset of past customers and their repayment outcomes to a machine learning algorithm, it can detect
patterns and build a prediction model of who will default. When a new customer asks for a loan, you enter their
information into the model and it predicts whether they are a good customer or not.
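The credit scoring example can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how a bank would build its model: the single feature (debt-to-income ratio), the historical records, and the function names fit_threshold and predict_default are all hypothetical. The "model" is just the cut-off that misclassifies the fewest past customers, whereas real credit scoring uses many features and richer models.

```python
# Hypothetical historical data: (debt-to-income ratio, did the customer default?)
history = [
    (0.10, False), (0.15, False), (0.20, False),
    (0.30, False), (0.35, False), (0.40, True),
    (0.45, True), (0.50, True), (0.60, True),
]

def fit_threshold(records):
    """Learn the ratio cut-off that misclassifies the fewest past customers."""
    ratios = sorted({ratio for ratio, _ in records})
    # Candidate cut-offs: midpoints between consecutive observed ratios.
    candidates = [(low + high) / 2 for low, high in zip(ratios, ratios[1:])]

    def errors(cutoff):
        # A customer is predicted to default when their ratio exceeds the cut-off.
        return sum((ratio > cutoff) != defaulted for ratio, defaulted in records)

    return min(candidates, key=errors)

def predict_default(cutoff, ratio):
    """Apply the learned pattern to a new customer."""
    return ratio > cutoff

cutoff = fit_threshold(history)
print(predict_default(cutoff, 0.55))  # new customer with a high ratio
print(predict_default(cutoff, 0.12))  # new customer with a low ratio
```

The pattern is learned from historical data only, so any bias in those past decisions is carried into the predictions for new customers.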
DATA SCIENCE & ETHICS
1. Ethics theories
Utilitarianism (a form of consequentialism) = judges an act by the consequences it produces
● an action is moral if its consequence is moral: the act is a means to an end
→ decides what is good or bad based on the impact of the action
● can therefore justify immoral actions when the outcome is good
Deontology = not doing immoral actions, irrespective of the impact
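The contrast between the two theories can be caricatured as two decision rules. The sketch below is deliberately simplified and entirely hypothetical: the actions, the benefit scores, and the rule-violation flags are invented for illustration, and letting the deontological rule pick the most beneficial of the permitted actions is an added assumption, not part of the definition above.

```python
# Hypothetical options: (action, net benefit to society, violates a moral rule?)
actions = [
    ("use all personal data without consent", 10, True),
    ("use only consented data", 7, False),
    ("use no data at all", 0, False),
]

def utilitarian_choice(options):
    """Pick the action with the best consequences, whatever the means."""
    return max(options, key=lambda option: option[1])[0]

def deontological_choice(options):
    """Rule out immoral actions first, irrespective of their impact."""
    permitted = [option for option in options if not option[2]]
    # Assumption: among permitted actions, prefer the most beneficial one.
    return max(permitted, key=lambda option: option[1])[0]

print(utilitarian_choice(actions))
print(deontological_choice(actions))
```

With these invented scores the utilitarian rule selects the rule-violating action because it has the highest benefit, while the deontological rule never considers it.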
1.2 Aristotle’s Nicomachean Ethics
Moral behavior can be found at the mean between two extremes: excess and deficiency.
“Golden mean” of data science ethics
● deficiency: not using any data at all
● excess: using all available data for any application, without regard for ethical concerns