Exam (elaborations)

Georgia Tech ISYE - 6501 Homework 2 Due Date: Thursday, September 3rd, 2020, Graded A+

Rating
-
Sold
-
Pages
9
Grade
A+
Uploaded on
25-04-2023
Written in
2022/2023

ISYE - 6501 Homework 2
Due Date: Thursday, September 3rd, 2020

Contents
1 ISYE - 6501 Homework 2
2 Homework Analysis
2.1 Analysis 3.1
2.2 Analysis 4.1
2.3 Analysis 4.2

1 ISYE - 6501 Homework 2

This document contains my analysis for ISYE - 6501 Homework 2, which is due on Thursday, September 3rd, 2020. Enjoy!

2 Homework Analysis

2.1 Analysis 3.1

Q: Using the same data set (credit_card_ or credit_card_) as in Question 2.2, use the ksvm or kknn function to find a good classifier:

(a) using cross-validation (do this for the k-nearest-neighbors model; SVM is optional)

RESULTS

Using 10-fold cross-validation on a k-nearest-neighbors (KNN) model with k = 15 and a rectangular kernel, we achieved an accuracy score of roughly 85% (85.47009%). This means that about 85 out of every 100 applicants are predicted correctly!
THE CODE:

# needed libraries
rm(list=ls())
library(kknn)
library(dplyr)
set.seed(12345)

# read data into R
data_path <- "data 3.1/"
data_filename <- "credit_card_"
credit_data <- read.table(paste0(data_path, data_filename), header=TRUE)

# train-valid-test split
sample_split <- sample(1:3, size=nrow(credit_data), prob=c(0.7, 0.15, 0.15), replace=TRUE)
train_credit <- credit_data[sample_split==1,]
valid_credit <- credit_data[sample_split==2,]
test_credit <- credit_data[sample_split==3,]

# training our model, selecting k and kernel by cross-validation
train_model <- train.kknn(R1~., data=train_credit, kmax=100, scale=TRUE, kcv=10,
                          kernel=c("rectangular", "triangular", "epanechnikov",
                                   "gaussian", "rank", "optimal"),
                          kpar=list())
train_model

## Call:
## train.kknn(formula = R1 ~ ., data = train_credit, kmax = 100, kernel = c("rectangular", "triangul
##
## Type of response variable: continuous
## Minimal mean absolute error: 0.
## Minimal mean squared error: 0.
## Best kernel: rectangular
## Best k: 15

Using cross-validation at 10 folds, we can see that the best kernel for our model is rectangular, with a k of 15. Now that we have the best parameters for our model, let's use them to fit our validation data.

# validating our model
valid_model <- train.kknn(R1~., data=valid_credit, ks=15, kernel="rectangular",
                          scale=TRUE, kpar=list())
valid_pred <- round(predict(valid_model, valid_credit))
accuracy_score <- sum(valid_pred == valid_credit[,11]) / nrow(valid_credit)
accuracy_score * 100

## [1] 91

Our validation model provides an accuracy score of 91%. Now, let's run the model through our test data to measure its true performance on data it hasn't seen before.

# run test data through model
test_pred <- round(predict(valid_model, test_credit))
accuracy_score <- sum(test_pred == test_credit[,11]) / nrow(test_credit)
accuracy_score * 100

## [1] 85.47009

Our test data provides an accuracy score of roughly 85%, which is lower than the validation accuracy score (91%).
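Note that train.kknn performs its fold-based model selection internally. The 10-fold procedure it summarizes can also be written out by hand; below is a minimal, self-contained sketch of explicit 10-fold cross-validation with kknn. The data here is small synthetic stand-in data (the homework itself uses the credit card data; the response name R1 follows the code above), so the exact accuracy will differ.

```r
# Minimal sketch: explicit 10-fold cross-validation for KNN with kknn.
# Synthetic stand-in data; the homework itself uses the credit card data.
library(kknn)
set.seed(12345)

n <- 300
dat <- data.frame(V1 = rnorm(n), V2 = rnorm(n))
dat$R1 <- as.integer(dat$V1 + dat$V2 > 0)   # binary response, as in the credit data

k_folds <- 10
fold_id <- sample(rep(1:k_folds, length.out = n))   # assign each row to a fold

fold_acc <- numeric(k_folds)
for (f in 1:k_folds) {
  cv_train <- dat[fold_id != f, ]
  cv_test  <- dat[fold_id == f, ]
  # fit KNN with the parameters selected above (k = 15, rectangular kernel)
  fit <- kknn(R1 ~ ., train = cv_train, test = cv_test,
              k = 15, kernel = "rectangular", scale = TRUE)
  fold_acc[f] <- mean(round(fitted(fit)) == cv_test$R1)
}
mean(fold_acc)   # average held-out accuracy across the 10 folds
```

Averaging held-out accuracy across folds is what protects an estimate like the 85% figure from the optimism seen in the 91% validation score: every row is scored by a model that never saw it.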
We can conclude that our model favored the randomness of the data it was validated on (valid_credit) over data it hadn't seen before (test_credit); the test-set accuracy is therefore closer to the model's true performance.

(b) splitting the data into training, validation, and test data sets (pick either KNN or SVM; the other is optional).

RESULTS

A train-valid-test split allows us to produce a Support Vector Machine (SVM) with a splinedot kernel and a C value of 1 that achieves an accuracy score of roughly 82% (82.05128%).

OUR CLASSIFIER EQUATION

Based on the SVM model classifier equation below:

classifier = β0 + β1x1 + β2x2 + ... + βpxp

With an error margin (C) of 1 and the splinedot kernel, we produced the following classifier equation:

classifier = −0. + 0.x1 + 0.x2 − 0.x3 + 0.x4 + 0.x5 − 0.x6 − 0.x7 − 0.x8 + 0.x9 + 0.x10

THE CODE:

# needed libraries
library(kernlab)
library(magicfor)
library(ggplot2)
library(hrbrthemes)
set.seed(12345)

# train-valid-test split
sample_split <- sample(1:3, size=nrow(credit_data), prob=c(0.7, 0.15, 0.15), replace=TRUE)
train_credit <- credit_data[sample_split==1,]
valid_credit <- credit_data[sample_split==2,]
test_credit <- credit_data[sample_split==3,]

# training our model with each candidate kernel
magic_for(print, silent = TRUE)
kerns <- list("rbfdot", "polydot", "vanilladot", "tanhdot", "laplacedot",
              "besseldot", "anovadot", "splinedot")
for (kern in kerns) {
  train_model <- ksvm(R1~., data=train_credit, type="C-svc", kernel=kern,
                      C=1, scaled=TRUE, kpar=list())
  train_pred <- predict(train_model, train_credit[,1:10])
  accuracy <- sum(train_pred == train_credit[,11]) / nrow(train_credit)
  print(accuracy)
}
kern_accuracy <- magic_result_as_dataframe()

# displaying our model's best kernel
ggplot(kern_accuracy, aes(x=kern, y=accuracy, color=accuracy)) +
  geom_point(size=8) +
  ylim(c(.7, 1)) +
  labs(title="Accuracy Scores vs Kernel Function",
       y="Accuracy Scores", x="Kernel Function") +
  theme(plot.title = element_text(hjust=0.5),
        axis.title.x = element_text(hjust=0.5, size=14),
        axis.title.y = element_text(hjust=0.5, size=14),
        legend.position = "None",
        axis.line = element_line(size = 1, linetype="solid")) +
  geom_vline(xintercept="splinedot", linetype="dotted", color="blue", size=2)
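Classifier coefficients like the ones reported above can be recovered from a fitted ksvm object: kernlab stores the support vectors in the xmatrix slot, their weights in coef, and the negative intercept in b. A minimal sketch on synthetic stand-in data (the variable names here are illustrative, not the homework's), using vanilladot, a linear kernel, where the weighted sum of support vectors yields the coefficients directly:

```r
# Minimal sketch: recovering beta_0..beta_p from a fitted ksvm model.
# Synthetic stand-in data; vanilladot (linear kernel) so the sum is a true linear classifier.
library(kernlab)
set.seed(12345)

X <- matrix(rnorm(200 * 3), 200, 3)
y <- factor(ifelse(X[, 1] - X[, 3] > 0, 1, 0))
model <- ksvm(X, y, type = "C-svc", kernel = "vanilladot", C = 1, scaled = TRUE)

# beta_1..beta_p: weighted sum of the (scaled) support vectors
a <- colSums(xmatrix(model)[[1]] * coef(model)[[1]])
# beta_0: kernlab stores the negative intercept in slot b
a0 <- -b(model)
a
a0
```

For nonlinear kernels such as splinedot these slots still exist, but the weighted sum no longer collapses to a single linear equation in the original features.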

Institution
ISYE 6501
Course
ISYE 6501









Contains
Questions and answers