Summary Week 13
Using the code in section 4.4, create 10 subsets of the okc_train data set. Create an analysis set and an assessment set.

ANSWER THE FOLLOWING QUESTIONS:
1. What does sdf_random_split() do?
2. Explain the function of (rbind, vfolds[2:10]).

Use the code in section 4.4 to transform the analysis set by scaling age in each of the training and validation sets, creating a function that finds the mean and standard deviation.

ANSWER THE FOLLOWING QUESTION:
3. What does the function(data) code do? Explain in detail, including all elements.

Use your code to normalize the essay length variable. Use glimpse() to show the training set with the essay length variable normalized. Use logistic regression to show 'not_working' as a product of a combination of variables.

ANSWER THE FOLLOWING QUESTIONS:
4. What is the coefficient of the intercept?
5. Obtain a summary of performance metrics on the assessment set and print them.

Plot the ROC curve. Compute the AUC. Plot 10 ROC curves.

ANSWER THE FOLLOWING QUESTION:
6. Are the ROC and AUC in agreement? How do you know?

Submit a Word document with the single and ten-fold ROC charts. Include a time stamp. Answer the questions, providing the question text and number.
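The fold-splitting and scaling steps described above can be sketched conceptually as follows. This is a minimal plain-Python illustration, not the sparklyr code from section 4.4: the helper names `make_folds` and `scale_by_analysis` and the sample ages are hypothetical, and treating the first fold as the assessment set while combining the other nine into the analysis set is an assumption. The point it shows is that the mean and standard deviation come from the analysis set only and are then applied to both sets.

```python
import random
import statistics


def make_folds(rows, k=10, seed=42):
    """Shuffle the rows and deal them into k roughly equal folds."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def scale_by_analysis(analysis, assessment):
    """Standardize both sets using the analysis set's mean and standard deviation."""
    mean = statistics.mean(analysis)
    sd = statistics.stdev(analysis)

    def scale(values):
        return [(v - mean) / sd for v in values]

    return scale(analysis), scale(assessment)


# Hypothetical data: 50 ages standing in for the age column.
ages = [float(a) for a in range(18, 68)]

folds = make_folds(ages, k=10)
assessment = folds[0]                                # assumption: fold 1 is held out
analysis = [x for fold in folds[1:] for x in fold]   # combine the other nine folds

scaled_analysis, scaled_assessment = scale_by_analysis(analysis, assessment)
```

Computing the statistics on the analysis set alone keeps information from the assessment set out of the transformation, which is why the scaling is wrapped in a function that is fitted once and then reused.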
Written for
- Institution: Big Data Tools & Architecture
- Course: Big Data Tools & Architecture
Document information
- Summarized whole book? No
- Which chapters are summarized? Unknown
- Uploaded on: July 15, 2023
- File latest updated on: February 22, 2024
- Number of pages: 4
- Written in: 2022/2023
- Type: Summary