Summary

Summary Lecture Slides & notes Real-life Machine learning (300363-B-6)

Rating
-
Sold
3
Pages
71
Uploaded on
17 December 2023
Written in
2023/2024

This document contains all the lecture slides and notes of the course 'Real-life Machine learning (300363-B-6)', given at Tilburg University as part of the premaster for JADS. It is complete and covers everything needed for the exam. Good luck with the course!


Document information

Whole book summarized?
No
Which parts of the book are summarized?
Everything needed for the exam

Preview of the content

Lecture 1
What are we going to learn today?
- What is machine learning?
- What are supervised and unsupervised machine learning?
- What are the most common types of machine learning problems?
- What are the basic steps of the CRoss Industry Standard Process for Data Mining (CRISP-DM)?

Machine learning is the field of study that gives computers the ability to learn without being
explicitly programmed.

Machine learning
Assume that you are repeating an exercise over and over again.
What should be constant in your exercise?
- Learning! - machine learning applies strategies and algorithms, combined with data
and statistics
- Improving! - machine learning applies statistical indices to measure the overlap
between the ML prediction and the expected result
When you do it, it is human learning.
When a machine does it, it is machine learning!
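
One such statistical index is accuracy: the fraction of predictions that match the expected result. The sketch below is only an illustration; the labels are made up and `accuracy` is a hypothetical helper, not something defined in the lecture.

```python
def accuracy(predictions, expected):
    """Fraction of predictions that match the expected results."""
    if len(predictions) != len(expected):
        raise ValueError("predictions and expected must have the same length")
    if not expected:
        raise ValueError("cannot score an empty set of examples")
    matches = sum(p == e for p, e in zip(predictions, expected))
    return matches / len(expected)


# Hypothetical labels, made up for this illustration.
expected = ["spam", "ham", "spam", "ham"]
predictions = ["spam", "ham", "ham", "ham"]

score = accuracy(predictions, expected)
print(score)  # 0.75 (3 of the 4 predictions match)
```

A higher score means a larger overlap between prediction and expected result, so the score can be tracked across iterations to see whether the model is improving.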




An example of supervised learning
Supervised learning - classification
Given a labelled dataset, the model learns to
predict the labels of new, unseen examples
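
A minimal sketch of classification, assuming a nearest-neighbour rule (one of many possible classifiers, chosen here only because it is short). The dataset and the function name `predict_nearest_neighbour` are hypothetical.

```python
import math


def predict_nearest_neighbour(train_points, train_labels, new_point):
    """Return the label of the training point closest to new_point."""
    distances = [math.dist(point, new_point) for point in train_points]
    nearest = distances.index(min(distances))
    return train_labels[nearest]


# Hypothetical labelled dataset: (height in cm, weight in kg) -> animal.
train_points = [(25, 4), (30, 5), (60, 25), (70, 30)]
train_labels = ["cat", "cat", "dog", "dog"]

# New, unlabelled examples receive the label of their closest neighbour.
prediction_small = predict_nearest_neighbour(train_points, train_labels, (28, 6))
prediction_large = predict_nearest_neighbour(train_points, train_labels, (65, 28))
print(prediction_small, prediction_large)  # cat dog
```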




An example of unsupervised learning
Unsupervised learning - clustering,
dimensionality reduction, anomaly detection
and novelty detection
Given a dataset without labels, the model
learns to cluster/group similar data
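
A minimal sketch of clustering, assuming k-means on one-dimensional data with fixed starting centroids (so the result is reproducible). The values and the function name `k_means_1d` are hypothetical; no labels are used anywhere.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group 1-D values around centroids; returns (centroids, assignments)."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each value to its closest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
            for value in values
        ]
        # Update step: move each centroid to the mean of its group.
        for i in range(len(centroids)):
            members = [v for v, a in zip(values, assignments) if a == i]
            if members:  # keep the old centroid if a group is empty
                centroids[i] = sum(members) / len(members)
    return centroids, assignments


# Hypothetical unlabelled data with two visible groups.
values = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, assignments = k_means_1d(values, centroids=[0.0, 5.0])
print(centroids)    # [1.5, 10.0]
print(assignments)  # [0, 0, 0, 1, 1, 1]
```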

CRISP-DM process model




Business understanding in the CRISP-DM process




Determine business objectives and success criteria
Business objectives and measures to evaluate the results have to be established.

Business objectives:
● What is the customer’s primary objective? For example:
● Increase the number of loyal customers
● Sell more of a certain product
● Run a successful marketing campaign

Business success criteria:
● An objective measure to establish success (e.g. return on investment)

Main steps in a data mining project
1. Define the goals:
Business and data mining experts have to define the goals together. For each goal, a
measure must be defined to judge its success.
2. Obtain the models:
Pre-process the data and apply data mining algorithms.
3. Evaluate the results:
Use the pre-specified measures to evaluate the models.
4. Deploy:
If the evaluation is successful, the model can be deployed.

Costs & benefits
Perform a cost-benefit analysis
Compute the benefits of the project (e.g. return on investment)
Compute the costs of the project - main factors:
● Data sources
● Data mining problem to be solved
● Available tools
● Expertise of the development team

Quantify the risk that the project fails:
● Knowledge not available
● Data not available
● Missing tools
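
The cost-benefit analysis can be sketched with the usual return on investment formula, ROI = (benefit - cost) / cost. Weighting the benefit by the probability that the project succeeds is an assumption added here to illustrate the risk of failure. All amounts and both function names are hypothetical.

```python
def return_on_investment(benefit, cost):
    """ROI as a fraction: (benefit - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost


def risk_adjusted_roi(benefit, cost, failure_probability):
    """ROI when the benefit is only realised if the project succeeds."""
    if not 0 <= failure_probability <= 1:
        raise ValueError("failure_probability must be between 0 and 1")
    expected_benefit = (1 - failure_probability) * benefit
    return return_on_investment(expected_benefit, cost)


# Hypothetical project figures, made up for this illustration.
benefit = 150_000
cost = 100_000

plain = return_on_investment(benefit, cost)
adjusted = risk_adjusted_roi(benefit, cost, failure_probability=0.2)
print(round(plain, 2), round(adjusted, 2))
```

With these made-up figures, a 20% chance of failure lowers the ROI from 50% to roughly 20%, which shows why the risk has to be quantified before the project starts.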

Quality data & feature engineering
What are we going to learn today?
- What kind of data exists?
- How to prepare data?
- What is data balancing?
- How to apply data cleaning and feature scaling?
- What is feature selection?

What kind of data exists?
- Structured data
- Unstructured data
- Semi-structured data

Structured data
Tabular data (rows and columns) with a very well-defined structure.
We know which columns there are and what kind of data they contain (the format is very
strict).
Such data is often stored in databases that also represent the relationships between the
data. Questions about the data can be answered by using a query language.
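
A minimal sketch of structured data, using Python's built-in `sqlite3` module with an in-memory database. The tables, columns, and rows are hypothetical. The column types are fixed in advance, the relationship between the two tables is explicit, and a question is answered with an SQL query.

```python
import sqlite3

connection = sqlite3.connect(":memory:")

# Strict format: every column has a name and a type.
connection.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
connection.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customers(id), "
    "amount REAL NOT NULL)"
)

# Hypothetical rows, made up for this illustration.
connection.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Anna"), (2, "Bram")]
)
connection.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 20.0), (2, 1, 35.0), (3, 2, 10.0)],
)

# Question: how much did each customer spend in total?
rows = connection.execute(
    "SELECT c.name, SUM(o.amount) "
    "FROM customers AS c JOIN orders AS o ON o.customer_id = c.id "
    "GROUP BY c.id, c.name ORDER BY c.name"
).fetchall()
connection.close()

print(rows)  # [('Anna', 55.0), ('Bram', 10.0)]
```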

Unstructured data
The rawest form of data; it can be any type of file.
Extracting value from data in this form is hard, because you first need to extract structured
features from it.
For example, you might want to extract topics from movies.
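
A minimal sketch of turning unstructured data into structured features, assuming the movie is represented by a plot synopsis in plain text. Counting frequent terms is a much simpler stand-in for real topic extraction; the synopsis, the stop-word list, and `top_terms` are all hypothetical.

```python
import re
from collections import Counter

# A tiny, hand-picked stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}


def top_terms(text, n=3):
    """Return the n most frequent non-stop-words as (term, count) pairs."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return counts.most_common(n)


# Hypothetical raw text, made up for this illustration.
synopsis = (
    "The detective follows the killer. "
    "The killer taunts the detective and the detective breaks."
)

terms = top_terms(synopsis, n=2)
print(terms)  # [('detective', 3), ('killer', 2)]
```

The output is structured: each row has a term and a count, so it can be stored in a table and queried like any other structured data.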

Semi-structured data
This format sits between structured and unstructured data.
A consistent format is defined. However, the structure is not very strict. For example, the
data may not be tabular, or parts of the data may be missing.
Semi-structured data are often stored as files. However, some kinds of semi-structured data
can be stored in document-oriented databases. Such databases allow you to query the
semi-structured data.
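
A minimal sketch of semi-structured data, assuming JSON as the format (a common choice, not one named in the lecture). The records are hypothetical: both follow the same general format, but the second one is missing fields, so the code has to handle the gaps.

```python
import json

# Hypothetical JSON document: consistent format, but not every field is present.
raw = """
[
  {"name": "Anna", "email": "anna@example.com", "tags": ["loyal", "newsletter"]},
  {"name": "Bram"}
]
"""

records = json.loads(raw)

# Flatten into a strict tabular form, filling in defaults for missing parts.
table = [
    (record["name"], record.get("email"), len(record.get("tags", [])))
    for record in records
]
print(table)  # [('Anna', 'anna@example.com', 2), ('Bram', None, 0)]
```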

Seller: Dee25, Tilburg University