Question 2.1
An everyday situation most of us can relate to is deciding whether food is safe or unsafe to eat. When
food is left out, or kept in the refrigerator for too long, it can spoil and become unsafe to consume.
This can be framed as a classification problem: each food or produce item is labeled safe or unsafe. We can use
predictors such as presence of mold, expiration date, appearance of discoloration, odor, and days
in the refrigerator.
Presence of mold: A Yes/No indicator of whether mold is visible on the food or produce
Expiration date: The printed “best by” or “expiration” date, which indicates how fresh the food or
produce is
Appearance of discoloration: An indicator of whether the food or produce has changed color
Odor: A Yes/No indicator of whether an unpleasant odor is present, which suggests spoilage and that
the food or produce may be unsafe to consume
Days in the refrigerator: The number of days the food or produce has been stored
Question 2.2.1
Methodology: I started by locating the directory where the dataset was saved on my laptop.
I then read the dataset into R and assigned it to “data.” Next, I used the View
function to inspect the whole dataset, the head and tail functions to see its first and last 10
rows, and the summary function to get an overview of each column.
Then, I installed the kernlab package and loaded it so that I could use the ksvm
function later. I applied the as.matrix function to data so that the dataset is stored as a
matrix, which is the form the ksvm function expects here. I called ksvm in the next line,
assigned the result to “model,” and printed it. Next, I computed the weight vector of the SVM from the
support vectors and their coefficients, and then the intercept a0, which completes the equation of the
classifier for a given C value. Lastly, I used the predict function to get the model’s predictions
and computed the fraction of predictions that match the actual classifications. The code and
output for various C values (lambdas) are provided below.
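The steps described above can be sketched in R roughly as follows. This is a minimal sketch, not the original script: it uses a synthetic stand-in dataset (10 predictors plus a 0/1 response) so that it runs on its own, and the column names, the helper `fit_svm`, the linear kernel (`vanilladot`), `scaled = TRUE`, and the C values tried are all assumptions made for illustration. It also assumes kernlab has already been installed with `install.packages("kernlab")`.

```r
library(kernlab)  # assumes install.packages("kernlab") has been run beforehand

set.seed(42)

# Hypothetical stand-in data: 10 numeric predictors and a 0/1 response in column 11.
n <- 300
x <- matrix(rnorm(n * 10), ncol = 10)
y <- as.integer(x[, 1] + 0.5 * x[, 2] + rnorm(n, sd = 0.5) > 0)
data <- as.matrix(cbind(x, y))
colnames(data) <- c(paste0("V", 1:10), "response")  # illustrative names only

# Quick looks at the data, as in the methodology.
head(data, 10)
tail(data, 10)
summary(data)

# Fit a linear SVM for one C value and return weights, intercept, and accuracy.
fit_svm <- function(C) {
  model <- ksvm(data[, 1:10], as.factor(data[, 11]),
                type = "C-svc", kernel = "vanilladot", C = C, scaled = TRUE)

  # Weight vector: sum over support vectors of (coefficient * support vector).
  a <- colSums(model@xmatrix[[1]] * model@coef[[1]])
  # Intercept: kernlab stores the negative of a0 in the b slot.
  a0 <- -model@b

  # Fraction of predictions that match the actual labels.
  pred <- predict(model, data[, 1:10])
  accuracy <- mean(as.character(pred) == as.character(data[, 11]))

  list(C = C, a = a, a0 = a0, accuracy = accuracy)
}

# Try several C values (chosen arbitrarily for this sketch).
results <- lapply(c(0.01, 1, 100), fit_svm)
for (r in results) cat("C =", r$C, " accuracy =", r$accuracy, "\n")
```

Because the accuracy here is measured on the same rows used to fit the model, it describes how well the classifier fits the data rather than how well it would generalize to new observations.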
When C = 100
Equation of classifier: -0.0010065348A1 - 0.0011729048A2 - 0.0016261967A3 +
0.0030064203A8 + 1.0049405641A9 - 0.0028259432A10 + 0.0002600295A11 -
0.0005349551A12 - 0.0012283758A14 + 0.1063633995A15 + 0.08158492 = 0
Accuracy of the model: 0.8639144 or 86.4%
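To illustrate how the classifier equation is used, the sketch below evaluates a1*A1 + ... + a15*A15 + a0 for a single observation, using the coefficients reported for C = 100. The function name `decision_value` and the example observation are hypothetical. Two caveats are assumptions rather than facts from the write-up: if the model was fit with scaling turned on, the inputs would need to be scaled the same way before applying these weights, and which class label corresponds to the positive side depends on how the model was fit.

```r
# Coefficients and intercept copied from the classifier equation for C = 100.
a <- c(A1  = -0.0010065348, A2  = -0.0011729048, A3  = -0.0016261967,
       A8  =  0.0030064203, A9  =  1.0049405641, A10 = -0.0028259432,
       A11 =  0.0002600295, A12 = -0.0005349551, A14 = -0.0012283758,
       A15 =  0.1063633995)
a0 <- 0.08158492

# Value of the left-hand side of the classifier equation for one observation.
# x must be a named numeric vector containing every predictor named in a.
decision_value <- function(x) {
  sum(a * x[names(a)]) + a0
}

# Hypothetical observation: every predictor at 0 except A9 = -1.
x_new <- setNames(rep(0, length(a)), names(a))
x_new["A9"] <- -1

score <- decision_value(x_new)
side <- if (score >= 0) "positive side" else "negative side"
cat("decision value:", score, "->", side, "\n")
```

The sign of the decision value tells which side of the separating hyperplane the observation falls on. A9 has by far the largest weight in this equation, so it dominates the result.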