Summary Week 12
Use the code in section 4.3 to scale the age variable. Answer the following questions:
1. Explain what the mutate function is doing in this line of code: mutate(scaled_age = (age - !!scale_values$mean_age) / !!scale_values$sd_age)
2. What do the two exclamation marks next to each other do?

Use the code in the book to create a histogram of Scaled Age. Answer the following question:
3. Approximately how many profiles in the training set fall in the 0 bin?

Using the code in section 4.3, aggregate the profiles in the training set by ethnicity. Answer the following question:
4. What does it mean to have a 'combination of ethnicities'?

Use the code to create dummy variables for each race/ethnicity. Answer the following questions:
5. What are the dummy variables?
6. Show how okc_train has been transformed.

Using the code from section 4.3, add a column called essay_length to okc_train. Answer the following question:
7. Explain what is happening in this code: essay_length = char_length(paste(!!!syms(paste0("essay", 0:9))))

Use the code in section 4.3 to create a histogram of essay length. Answer the following question:
8. What does bins = 100 mean in the code?

Using the code in section 4.3, save the training file as a Parquet file.
Linked book
- 2019
- 9781492046325
- Unknown
Written for
- Institution
- Big Data Tools & Architecture
- Course
- Big Data Tools & Architecture
Document information
- Whole book summarized?
- No
- Which parts of the book are summarized?
- Unknown
- Uploaded on
- 15 July 2023
- File last updated on
- 22 February 2024
- Number of pages
- 4
- Written in
- 2022/2023
- Type
- SUMMARY
Topics
- week 12