Exam (worked solutions)

CPSC 330 final exam Ch 16, 17, 18 with complete solutions

Pages: 16
Grade: A+
Uploaded on: 05-03-2025
Written in: 2024/2025

Institution: CPSC
Course: CPSC

Preview of the contents

CPSC 330 final exam Ch 16, 17, 18 with
complete solutions
.... gives you the ability to summarize the major themes in a large collection of
documents (corpus). - ANSWER-Topic modelling

.... is a great EDA tool to get a sense of what's going on in a large corpus. - ANSWER-
Topic modelling

2 approaches to reduce multi-class classification into binary classification - ANSWER-
the one-vs.-rest approach
- 1v{2,3}, 2v{1,3}, 3v{1,2}
- Learn a binary model for each class which tries to separate that class from all of the
other classes.

the one-vs.-one approach
- 1v2, 1v3, 2v3
- Build a binary model for each pair of classes.
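
As a rough illustration of the two reductions, the sketch below fits both on toy data with scikit-learn and counts the binary models each one trains. The 4-class `make_blobs` data and the choice of `LogisticRegression` as the base model are assumptions made for this example, not part of the original notes.

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

# Toy data with 4 classes (illustrative only).
X, y = make_blobs(n_samples=200, centers=4, random_state=0)

ovr = OneVsRestClassifier(LogisticRegression()).fit(X, y)
ovo = OneVsOneClassifier(LogisticRegression()).fit(X, y)

n_ovr = len(ovr.estimators_)  # one binary model per class
n_ovo = len(ovo.estimators_)  # one binary model per pair of classes
print(n_ovr, n_ovo)
```

With K classes, one-vs.-rest trains K models and one-vs.-one trains K(K-1)/2, so 4 classes give 4 and 6 models respectively.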

After creating a user profile in content-based filtering, what do we do? - ANSWER-Train a
Ridge() regression model on the profile.

Use it to predict the rating of a movie that the user has not seen.
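
A minimal sketch of this step for a single user might look like the following. The item features, the ratings, and the use of `np.nan` for unseen movies are all made up for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical item features: one row per movie,
# columns = [action, comedy, romance].
item_features = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 1],
], dtype=float)

# Hypothetical ratings by one user; np.nan marks movies not yet seen.
ratings = np.array([5.0, 4.0, 2.0, np.nan, np.nan])

seen = ~np.isnan(ratings)

# The profile is the features and ratings of the movies this user rated.
model = Ridge(alpha=1.0)
model.fit(item_features[seen], ratings[seen])

# Predict ratings for the movies the user has not seen.
predicted = model.predict(item_features[~seen])
print(predicted)
```

In this setup each user gets their own regression model, so the loop over users (omitted here) would repeat the fit and predict steps per user.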

An unsupervised approach which only uses the user-item interactions given in the
ratings matrix - ANSWER-Collaborative filtering in a recommender system

Apply all of the classifiers on the test example.

Count how often each class was predicted.

Predict the class with most votes.

These are the properties of - ANSWER-One Vs. One approach (OVO) for prediction
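
The voting procedure can be sketched by hand as below. The toy data, the `LogisticRegression` base model, and the helper name `predict_ovo` are assumptions for this example; ties are broken arbitrarily by vote order.

```python
from collections import Counter
from itertools import combinations

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

X, y = make_blobs(n_samples=150, centers=3, random_state=0)
classes = np.unique(y)

# Build a binary model for each pair of classes.
pair_models = {}
for a, b in combinations(classes, 2):
    mask = (y == a) | (y == b)
    pair_models[(a, b)] = LogisticRegression().fit(X[mask], y[mask])


def predict_ovo(x):
    """Apply every pairwise classifier to x and return the most-voted class."""
    votes = Counter(
        int(m.predict(x.reshape(1, -1))[0]) for m in pair_models.values()
    )
    return votes.most_common(1)[0][0]


test_point = X[0]
prediction = predict_ovo(test_point)
print(prediction)
```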

Basic text preprocessing (7) - ANSWER-Tokenization
- the process of breaking down a text or document into individual words, phrases,
symbols, or other meaningful elements known as tokens. In simpler terms, tokenization
is like splitting a sentence into its component parts
- Sentence segmentation: Split text into sentences
- Word tokenization: Split sentences into words

Converting text to lowercase

Removing punctuation and stopwords
- stopwords: commonly used words that are often considered insignificant or carry little
meaning for understanding the context of a text (such as "a", "an", "the", "is", "in",
and "and").

Discarding words with length < threshold OR word frequency < threshold

Lemmatization:
- Consider the lemmas instead of inflected forms.
- lemmatization: finding the base form of a word
- For example, lemmatizing the words "running," "ran," and "runs" would give you the
base form "run."
- Vancouver's → Vancouver
- computers → computer
- rising, rose, rises → rise

POS: restrict to a specific part of speech; For example, only consider nouns, verbs, and
adjectives

Stemming
- the process of reducing words to their base or root form, often by removing suffixes, to
simplify analysis and improve text processing efficiency.
- Before stemming: UBC is located in the beautiful province of British Columbia... It's
very close to the U.S. border.
- After stemming: ubc is locat in the beauti provinc of british columbia ... it 's veri close to
the u.s. border .
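
Several of these steps can be sketched with the Python standard library alone. The stopword list, the `min_length` threshold, and the function name `preprocess` are illustrative assumptions. Lemmatization, POS filtering, and stemming are left out because they usually rely on an NLP library.

```python
import re

# A small hand-picked stopword list for illustration; real lists are longer.
STOPWORDS = {"a", "an", "the", "is", "in", "and", "of", "to", "it", "s"}


def preprocess(text, min_length=3):
    # Sentence segmentation: naive split after ., ! or ? plus whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    tokens = []
    for sentence in sentences:
        # Lowercase, then word tokenization; the regex also drops punctuation.
        words = re.findall(r"[a-z0-9]+", sentence.lower())
        # Remove stopwords and words shorter than the length threshold.
        tokens.extend(
            w for w in words if w not in STOPWORDS and len(w) >= min_length
        )
    return tokens


text = (
    "UBC is located in the beautiful province of British Columbia. "
    "It's very close to the U.S. border."
)
tokens = preprocess(text)
print(tokens)
# ['ubc', 'located', 'beautiful', 'province', 'british', 'columbia',
#  'very', 'close', 'border']
```

The naive sentence splitter breaks after "U.S.", which shows why real pipelines use a trained segmenter. Here it does not change the final tokens.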

Collaborative filtering vs. Content-based filtering (1) - ANSWER-use item features or not

Collaborative: Recommends items based on similar users or items without requiring
explicit knowledge of item features.

Content-based: Recommends items based on item features like text, genre, or
metadata

Explain Neural Networks - ANSWER-Neural networks apply a sequence of
transformations on your input data.

We are adding one "layer" of transformations in between the features (inputs) and the
target (output).

The hidden units (e.g., h[1], h[2], ...) represent the intermediate processing steps.

At a very high level you can also think of them as Pipelines in sklearn.
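
To make the "sequence of transformations" concrete, here is a minimal forward pass through one hidden layer in NumPy. The layer sizes, the random weights, and the ReLU non-linearity are assumptions chosen for the sketch; no training is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

X = rng.normal(size=(4, 3))    # 4 examples, 3 input features
W1 = rng.normal(size=(3, 5))   # input -> hidden weights (5 hidden units)
b1 = np.zeros(5)
W2 = rng.normal(size=(5, 1))   # hidden -> output weights
b2 = np.zeros(1)

# First transformation: linear map followed by a ReLU non-linearity.
h = np.maximum(0, X @ W1 + b1)  # hidden units, one column per unit

# Second transformation: linear map from the hidden units to the output.
y_hat = h @ W2 + b2
print(h.shape, y_hat.shape)
```

Each step consumes the output of the previous one, which is the sense in which the layers resemble the steps of an sklearn Pipeline.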

Explain what transfer learning is - ANSWER-Recall: CNNs can take in images without
flattening them, which makes them well suited to image classification.

Training a CNN from scratch is not common due to the need for a large dataset,
powerful computers, and significant human effort.

Instead, a common practice is to download a pre-trained model and fine-tune it for your
task.

This is called transfer learning.
- Transfer learning is like using what you already know to learn something new faster. In
machine learning, it means using a pre-trained model's knowledge to solve a different
problem instead of starting from scratch. It saves time and resources while improving
performance.

Given a test point, get scores from all binary classifiers (e.g., raw scores for logistic
regression). This is - ANSWER-OVR
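
A hand-rolled sketch of this prediction step is shown below. The toy data and variable names are assumptions. The final argmax over scores is the usual one-vs.-rest decision rule and is added here for completeness.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

X, y = make_blobs(n_samples=150, centers=3, random_state=0)
classes = np.unique(y)

# Train one binary "this class vs. the rest" model per class.
models = [LogisticRegression().fit(X, (y == c).astype(int)) for c in classes]

test_point = X[:1]
# Raw score of the test point under each binary classifier.
scores = np.array([m.decision_function(test_point)[0] for m in models])
# Predict the class whose classifier gives the highest score.
prediction = classes[np.argmax(scores)]
print(scores, prediction)
```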

How does LDA work (4) - ANSWER-1. Create BOW representation for the text column
using CountVectorizer

2. Create a topic model with sklearn's LatentDirichletAllocation

3. The fitted LDA model gives access to these two associations

Topic-words association
- `lda.components_` gives us the weights associated with each word (columns) for each
topic (rows).
- In other words, it tells us which word is important for which topic.

Document-topic association
- Calling `transform` on the data gives us document_topics association.
- It tells us which topic is important for which document.

4. You could change the data representation (i.e., change the labels, round values, and
drop sum column)

If you have K classes, it'll train K binary classifiers, one for each class.

this is - ANSWER-One-vs.-Rest approach (OVR)

If you want to pull documents related to a particular lawsuit, we use - ANSWER-Topic
modeling

ImageNet - ANSWER-An image dataset
It contains about 14 million images; the widely used ILSVRC subset has 1000 classes

Seller: CLOUND Exam