Exam (elaborations)

ISYE 6501 - ACTUAL Midterm 1 and 2 EXAM and Final Exam QUESTIONS, ANSWERS & RATIONALES.

Pages: 64
Grade: A+
Uploaded on: 10 December 2025
Written in: 2025/2026


Institution: ISYE 6501
Course: ISYE 6501












Contains: Questions and answers


Content preview




ISYE 6501 - Midterm 1




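In KNN, will a large value of k lead to a large variance in predictions?

No. A large k averages over many neighbors, which lowers variance; it is a small k that leads to a large
variance in predictions.

Setting a large value of k in KNN will ...

lead to a large model bias.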

What are real effects?

Real relationships between attributes and responses. They are the same in all data sets.

What are random effects?

Effects that are random but look like real effects. They are different in every data set.

Why can't we measure a model's effectiveness on data it was trained on?

The model's performance on its training data is usually too optimistic. The model is fit to both real and
random patterns in the data, so it becomes overly specialized to the specific randomness in the training
set, which doesn't exist in other data.

If we use the same data to fit a model as we do to estimate how good it is, what is likely to happen?

The model will appear to be better than it really is.

The model will be fit to both real and random patterns in the data. The model's effectiveness on this
data set will include both types of patterns, but its true effectiveness on other data sets (with different
random patterns) will only include the real patterns.

When comparing models, if we use the same data to pick the best model as we do to estimate how
good the best one is, what is likely to happen?

The model will appear to be better than it really is.

The model with the highest measured performance is likely to be both good and lucky in its fit to
random patterns.

What is a training set used for?

It is used to fit the models.

What is a validation set used for?

It is used to choose the best model.

Why would we use two sets?

If the training set has unique random effects that the classifier was fit to, those benefits will not
carry over to the validation set, so measuring effectiveness on the validation set does not count
them.

What effects does randomness have on training /validation performance?

Sometimes the randomness will make the performance look worse than it really is, and sometimes the
randomness will make the performance look better than it really is.

How are high-performing models affected by randomness?

They are often boosted by above-average random effects, which make them look better than they really are.

What is a test data set used for?

To estimate the performance of the chosen model.

When do we need a validation set?

When we are choosing between multiple models.

What are the data splits when working with one model?

70-90% training, 10-30% test

What are the data splits when comparing models?

50-70% training, split the rest between validation and test

What are two methods of splitting data?

Random and rotation.

What is the rotation method of splitting data?

You take turns assigning points to each set in a fixed order.
5 data point rotation sequence: (Training - Validation - Training - Test - Training)
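
The two splitting methods can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the 60/20/20 fractions, and the fixed seed are hypothetical choices, not part of the course material. The rotation pattern is the five-point sequence given above.

```python
import random

def rotation_split(points, pattern=("train", "validation", "train", "test", "train")):
    """Assign points to sets by cycling through a fixed pattern."""
    sets = {"train": [], "validation": [], "test": []}
    for i, point in enumerate(points):
        sets[pattern[i % len(pattern)]].append(point)
    return sets

def random_split(points, train_frac=0.6, validation_frac=0.2, seed=0):
    """Shuffle a copy of the points, then cut it into three contiguous pieces."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * validation_frac)
    return {
        "train": shuffled[:n_train],
        "validation": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }

data = list(range(10))
rot = rotation_split(data)   # train gets points 0, 2, 4, 5, 7, 9
rnd = random_split(data)     # same set sizes, but membership depends on the shuffle
```

With rotation, every stretch of five consecutive points contributes to all three sets; with a random split, that is only true on average.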

What is the advantage of rotation over randomness?

Rotation ensures that each part of the data is evenly represented in every set.

What is the disadvantage of using rotation?

We have to make sure we aren't creating some other type of bias when we assign points.

What is k-fold cross validation?

Split the training/validation data into k parts. For each part in turn, we train on the other k-1 parts
and validate on the remaining part.

What metric do you use for k-fold cross validation when comparing models?

The average of all k evaluations.
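
A minimal sketch of k-fold cross validation, assuming a toy "model" that predicts the mean of the training responses. The helper names (`k_fold_indices`, `cross_validate`, `fit_mean`, `mean_abs_error`) and the data are hypothetical, and folds are formed by taking every k-th index rather than by shuffling.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs; fold j validates on indices j, j+k, j+2k, ..."""
    for fold in range(k):
        val_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        yield train_idx, val_idx

def cross_validate(xs, ys, k, fit, score):
    """Return the average validation score over the k folds."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(score(model, [xs[i] for i in val_idx], [ys[i] for i in val_idx]))
    return sum(scores) / k

# Toy model: ignore the attributes and predict the mean training response.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)

def mean_abs_error(model, xs, ys):
    return sum(abs(y - model) for y in ys) / len(ys)

xs = list(range(8))
ys = [2.0, 4.0, 2.0, 4.0, 2.0, 4.0, 2.0, 4.0]
avg_error = cross_validate(xs, ys, k=4, fit=fit_mean, score=mean_abs_error)
```

Each index appears in exactly one validation fold and in the training data of the other k-1 folds, and `avg_error` is the single number used to compare this model against others.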

What do we use when important data only appears in the validation or test sets?

cross-validation

What do we do after we've performed cross-validation?

We train the model again using all the data.

What are the benefits of k-fold cross validation?

Better use of the data, a better estimate of model quality, and more effective model selection.

What can clustering be used for?

Grouping data points (e.g., market segmentation) and discovering groups in data points (e.g.,
personalized medicine).

Which should we use most of the data for: training, validation, or test?

training

In k-fold cross-validation, how many times is each part of the data used for training, and for
validation?

k-1 times for training, and 1 time for validation

What is rectangular distance useful for?

Calculating driving distance when the city is mapped in a grid.

What is the value of p for Euclidean distance?

2

What is the general equation for p-norm distance?

$d_p(x, y) = \left( \sum_i |x_i - y_i|^p \right)^{1/p}$

Straight-line distance corresponds to which distance metric?

2-norm

How do you find the distance of an infinity norm?

You find the largest $|x_i - y_i|$ over all i.
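
The three distances can be computed with one small function. This is a sketch; `p_norm_distance` is a hypothetical name, and the infinity norm is handled as a special case because the general formula cannot be evaluated at p = infinity directly.

```python
import math

def p_norm_distance(x, y, p):
    """Distance between equal-length points x and y; p may be math.inf."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1 / p)

x, y = (0, 0), (3, 4)
rectangular = p_norm_distance(x, y, 1)        # 3 + 4 = 7
euclidean = p_norm_distance(x, y, 2)          # sqrt(9 + 16) = 5
infinity = p_norm_distance(x, y, math.inf)    # max(3, 4) = 4
```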

What is a centroid?

the center of a cluster

What are the steps of k means?

0. Pick k cluster centers within the range of the data.
1. Assign each data point to the nearest cluster center.
2. Recalculate the cluster centers (centroids).
3. Repeat 1 and 2 until no assignments change.
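
The steps above can be sketched in plain Python. Several details are assumptions made for the illustration: the starting centers are k randomly sampled data points (one common choice, not the only one), distances are squared Euclidean, and an empty cluster keeps its old center. The function names and data are hypothetical.

```python
import random

def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def mean_point(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def k_means(points, k, seed=0, max_iter=100):
    # Step 0: pick k starting centers (here: k distinct data points, an assumption).
    centers = random.Random(seed).sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Step 1: assign each point to its nearest center.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centers[c])) for p in points
        ]
        # Step 3: stop once no assignment changes.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 2: move each center to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = mean_point(members)
    return centers, assignment

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
centers, assignment = k_means(points, k=2)
```

On this well-separated toy data the two groups are recovered regardless of which starting points are sampled; on real data the result can depend on the starting centers.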

How do we find the cluster centers?

We take the mean of all the data points in the cluster.

Why is k-means an expectation-maximization algorithm?

Finding the mean of all the points in a cluster is similar to finding an expectation (the expectation step).

Assigning data points to cluster centers is the maximization step. Really we are minimizing, but we could
think of it as maximizing the negative of the distance to a cluster center.

What are some of the consequences of outliers in k-means?

An outlier will drag the cluster center artificially to one side.
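
A small worked example of that drag, using hypothetical points. The centroid is the coordinate-wise mean, so a single distant point shifts it noticeably.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

cluster = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
outlier = (22.0, 2.0)  # hypothetical far-away point

without_outlier = centroid(cluster)            # (1.5, 1.5)
with_outlier = centroid(cluster + [outlier])   # (5.6, 1.6)
```

With the outlier included, the center sits at x = 5.6, far from all four of the original points.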

Because k-means is a heuristic and thus fast, what can we do?

Run it several times with different initial cluster centers and keep the best result. We can also try
different values of k.

How do bias and variance change as k changes in KNN?

The higher the k, the higher the bias; the lower the k, the higher the variance. When k = 1 the model is
at its most complex and is therefore likely to overfit the data.
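
The two extremes can be seen on a tiny example. This sketch assumes 1-D attributes, majority voting, and made-up training data in which the "B" at x = 3 stands in for random noise; `knn_predict` and `training_accuracy` are hypothetical helper names.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points nearest to query (1-D attributes)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training data; the "B" at x = 3 plays the role of random noise.
train_x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
train_y = ["A", "A", "B", "A", "A", "B", "B", "B", "B"]

def training_accuracy(k):
    hits = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(train_x, train_y))
    return hits / len(train_x)

acc_k1 = training_accuracy(1)              # k = 1 memorizes every point, noise included
acc_k3 = training_accuracy(3)              # k = 3 smooths over the noisy point
acc_kn = training_accuracy(len(train_x))   # k = n always predicts the majority class
```

With k = 1 the training accuracy is perfect because each point is its own nearest neighbor, which is exactly the over-optimistic fit to random patterns. With k = n the attributes are ignored altogether, the high-bias extreme.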

How do we find the best value of k in k-means?

Elbow method: for each value of k, we calculate the total distance from each data point to its cluster
center and plot that total against k. We look for the kink (the elbow) in the graph.
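
A rough sketch of the elbow calculation on made-up 1-D data with three obvious groups. Everything here is an assumption for illustration: the tiny `k_means_1d` helper, its evenly spread starting centers, and in particular the "largest bend" rule, which is only a simple stand-in for looking at the plot by eye.

```python
def k_means_1d(values, k, max_iter=100):
    """Tiny 1-D k-means; starting centers are spread evenly over the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    centers = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for c in range(k):
            members = [v for v, a in zip(values, assignment) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return centers, assignment

def total_distance(values, k):
    """Sum of distances from each point to its assigned cluster center."""
    centers, assignment = k_means_1d(values, k)
    return sum(abs(v - centers[a]) for v, a in zip(values, assignment))

values = [1, 2, 3, 11, 12, 13, 21, 22, 23]
ks = [1, 2, 3, 4, 5]
totals = [total_distance(values, k) for k in ks]

# Heuristic stand-in for "look for the kink": the k where the drop before it
# most exceeds the drop after it.
bends = [(totals[i - 1] - totals[i]) - (totals[i] - totals[i + 1]) for i in range(1, len(ks) - 1)]
elbow_k = ks[1 + bends.index(max(bends))]
```

Here the total distance falls sharply up to k = 3 and barely improves afterwards, so the kink is at k = 3, matching the three groups in the data.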

When clustering for prediction how do we choose the prediction?

When we see a new point, we just choose whichever cluster center is closest.
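
As a minimal sketch, predicting with clusters is a nearest-center lookup. The centers below are hypothetical values of the kind a clustering run might produce, and `nearest_cluster` is an illustrative name.

```python
import math

def nearest_cluster(point, centers):
    """Return the index of the cluster center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda c: math.dist(point, centers[c]))

# Hypothetical cluster centers.
centers = [(1.0, 1.0), (9.0, 9.0), (1.0, 9.0)]
label = nearest_cluster((2.0, 8.0), centers)   # closest to (1.0, 9.0), so index 2
```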

What is the difference between classification and clustering?

With classification models, we know each data point's attributes and we already know the right
classification for the data points (supervised). In clustering (unsupervised) we know the attributes but
we don't know what group any of these data points are in.

What is the difference between supervised learning and unsupervised learning?
Seller: NursingTotur2, Walden University