NLP - Answer: natural language processing
Tokenization - Answer: a computer turning letters and/or
words into something it can read and understand, such as
numbers
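As a rough illustration of turning words into numbers, here is a minimal word-level sketch. The helper names `build_vocab` and `encode` are hypothetical, and real tokenizers are usually more elaborate (punctuation handling, subwords, and so on).

```python
# Toy word-level tokenizer: split on whitespace, map each word to an integer id.
def build_vocab(texts):
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)  # next unused id
    return vocab


def encode(text, vocab):
    # Words never seen while building the vocabulary map to -1 in this sketch.
    return [vocab.get(word, -1) for word in text.lower().split()]


corpus = ["the cat sat", "the dog sat down"]
vocab = build_vocab(corpus)
ids = encode("the dog sat", vocab)
print(vocab)
print(ids)
```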
Two of the most common recommenders, often used
together - Answer: user-based, item-based
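One way to picture the difference: a user-based recommender compares rows of a ratings table (users), while an item-based one compares columns (items). The sketch below assumes a made-up ratings matrix, treats 0 as "not rated", and uses cosine similarity, which is only one of several possible similarity measures.

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are items, 0 means "not rated".
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [0.0, 1.0, 5.0, 4.0],
])


def cosine_similarity_matrix(m):
    # Normalize each row to unit length, then take all pairwise dot products.
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    unit = m / norms
    return unit @ unit.T


user_sim = cosine_similarity_matrix(ratings)    # user-based: compare users
item_sim = cosine_similarity_matrix(ratings.T)  # item-based: compare items
print(np.round(user_sim, 2))
print(np.round(item_sim, 2))
```

In this toy data, users 0 and 1 rate items alike, so their similarity is higher than that of users 0 and 2.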
Imagine you have a dataset with 2 columns, both filled with
continuous numbers. You believe the first column is a
predictor of the second column. Which of the model
approaches below could work?
1. random forest
2. running .describe and .info on the data
3. regression
4. decision trees
Answer: regression (the obvious choice), random forest,
decision trees (not the best)
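A minimal sketch of the three model approaches on a two-column dataset, assuming synthetic data (a noisy straight line) and arbitrary hyperparameter values chosen only for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Synthetic two-column data: x is the predictor, y the continuous target.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 2.0 + rng.normal(0, 1.0, size=200)
X = x.reshape(-1, 1)  # scikit-learn expects a 2-D feature matrix

models = {
    "regression": LinearRegression(),
    "decision tree": DecisionTreeRegressor(max_depth=3, random_state=0),
    "random forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X, y)
    scores[name] = model.score(X, y)  # R^2 on the training data
    print(name, round(scores[name], 3))
```

Calling .describe and .info would only summarize the columns; neither fits a model.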
Most talked-about problem with decision trees -
Answer: overfitting
The LinearRegression estimator is only capable of simple
straight-line fits: true or false? - Answer: false (with
transformed inputs such as polynomial features it can also
fit curves)
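To see why the answer is false, here is a small sketch that fits a parabola. It assumes synthetic data and uses PolynomialFeatures as the transformation; other basis functions would work along the same lines.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# A curved relationship: y is a parabola in x.
x = np.linspace(-3, 3, 50)
y = x ** 2
X = x.reshape(-1, 1)

# Plain straight-line fit.
straight = LinearRegression().fit(X, y)

# Same estimator, but fed polynomial features (1, x, x^2).
curved = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)

straight_r2 = straight.score(X, y)
curved_r2 = curved.score(X, y)
print("straight line R^2:", round(straight_r2, 3))
print("polynomial R^2:", round(curved_r2, 3))
```

The model stays linear in its coefficients; only the inputs are transformed.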
5 steps to building a machine learning model - Answer:
1. choose a class of model
2. choose hyperparameters
3. arrange the data
4. fit the model
5. predict
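The five steps above can be sketched with scikit-learn as follows. The data is synthetic, and the model class and hyperparameter are arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression  # 1. choose a class of model

model = LinearRegression(fit_intercept=True)       # 2. choose hyperparameters

# 3. arrange the data into a feature matrix X and a target vector y
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=50)
y = 2.0 * x - 1.0 + rng.normal(0, 0.5, size=50)
X = x.reshape(-1, 1)

model.fit(X, y)                                    # 4. fit the model

X_new = np.array([[0.0], [5.0], [10.0]])
y_hat = model.predict(X_new)                       # 5. predict (y-hat)
print(y_hat)
```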
Difference between unsupervised and supervised learning -
Answer:
unsupervised: you have an X but no Y
supervised: you have an X and a Y
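In scikit-learn this difference shows up in what gets passed to fit. The sketch below assumes two synthetic blobs of points, with KMeans and LogisticRegression picked as example estimators.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Two well-separated blobs of 2-D points.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0, 0.5, size=(50, 2)),
    rng.normal(5, 0.5, size=(50, 2)),
])
y = np.array([0] * 50 + [1] * 50)  # labels, used only by the supervised model

# Unsupervised: only X is given; the model has to find structure by itself.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Supervised: X and y are both given; the model learns to map X to y.
clf = LogisticRegression().fit(X, y)
accuracy = clf.score(X, y)
print("cluster ids found:", sorted(set(clusters.tolist())))
print("training accuracy:", accuracy)
```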
Why is linear regression a good starting point in a modeling
task? - Answer: it is interpretable
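Interpretability here means the fitted numbers can be read directly. A minimal sketch, assuming noiseless synthetic data so the recovered values are easy to check:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data with known structure: y = 4*x0 - 2*x1 + 7 (no noise).
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(100, 2))
y = 4.0 * X[:, 0] - 2.0 * X[:, 1] + 7.0

model = LinearRegression().fit(X, y)

# Each coefficient is the change in y per one-unit change in that feature,
# holding the other feature fixed.
print("coefficients:", model.coef_)
print("intercept:", model.intercept_)
```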
Why are pipelines useful? - Answer: they help organize the
code you used to clean and treat your data, make it easy to
change small things in your model, and make it easy to
repeat/replicate steps and run multiple models
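A short sketch of those benefits, assuming synthetic data and the step names "scale" and "model" (both made up for this example): the cleaning step and the model live in one object, and the model can be swapped without touching the rest.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(120, 2))
y = 0.5 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(0, 1.0, size=120)

# Data treatment and model are bundled, so every fit repeats the same steps.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LinearRegression()),
])
pipe.fit(X, y)
linear_r2 = pipe.score(X, y)

# Change one small thing: replace the final step and rerun everything.
pipe.set_params(model=DecisionTreeRegressor(max_depth=4, random_state=0))
pipe.fit(X, y)
tree_r2 = pipe.score(X, y)

print("linear R^2:", round(linear_r2, 3))
print("tree R^2:", round(tree_r2, 3))
```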
Basic idea of regression - Answer: we have some X values
called features and some Y value, the variable we are trying
to predict
Y is our target vector, and y-hat is an output of our model
that is a _____ - Answer: estimate or prediction of Y
The first variable in a decision tree (before any of the
branches) - Answer: root
One of the problems with decision trees is that they are
prone to _____ if you are not careful or do not set the _____
appropriately - Answer: overfitting, max depth
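Overfitting tends to show up as a large gap between training and test scores. The sketch below assumes synthetic noisy data and an arbitrary max_depth of 3; a suitable depth for real data would need to be tuned.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Noisy sine wave: the noise is what an unrestricted tree ends up memorizing.
rng = np.random.default_rng(0)
x = rng.uniform(0, 2 * np.pi, size=300)
y = np.sin(x) + rng.normal(0, 0.5, size=300)
X = x.reshape(-1, 1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

deep = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
shallow = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_train, y_train)

# Gap between training and test R^2 for each tree.
deep_train = deep.score(X_train, y_train)
deep_gap = deep_train - deep.score(X_test, y_test)
shallow_gap = shallow.score(X_train, y_train) - shallow.score(X_test, y_test)

print("no max_depth, train/test gap:", round(deep_gap, 3))
print("max_depth=3, train/test gap:", round(shallow_gap, 3))
```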
True or False: the random forest algorithm prevents, or at
least avoids to some extent, the problems with overfitting
found in decision trees - Answer: true
True or False: random forests can only be used on
classification problems - Answer: false (has applications
in regression too)
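Both points can be illustrated with RandomForestRegressor, scikit-learn's regression counterpart to RandomForestClassifier. This is a sketch on synthetic noisy data with arbitrary settings; how much the forest helps will vary from dataset to dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
x = rng.uniform(0, 2 * np.pi, size=600)
y = np.sin(x) + rng.normal(0, 0.5, size=600)
X = x.reshape(-1, 1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# A single unrestricted tree versus an average over many trees.
tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

tree_test_r2 = tree.score(X_test, y_test)
forest_test_r2 = forest.score(X_test, y_test)
print("single tree test R^2:", round(tree_test_r2, 3))
print("random forest test R^2:", round(forest_test_r2, 3))
```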
When running our first decision tree, we took out
"max_depth=". This had the result of... - Answer: building a
very large, hard-to-understand tree
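The effect on tree size can be checked with get_depth and get_n_leaves. The sketch assumes synthetic data and a regression tree; the course example may have used different data or a classifier.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.5, size=300)

# With max_depth set, growth stops early; without it, the tree keeps splitting.
limited = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)
unlimited = DecisionTreeRegressor(random_state=0).fit(X, y)

print("max_depth=3:", limited.get_depth(), "levels,", limited.get_n_leaves(), "leaves")
print("no max_depth:", unlimited.get_depth(), "levels,", unlimited.get_n_leaves(), "leaves")
```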
The terminal node - Answer: the last node (sometimes
called a leaf); the tree doesn't split after this
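In a fitted scikit-learn tree, the root and the terminal nodes can be located through the low-level tree_ attribute. A sketch under the assumption of synthetic data: node 0 is the root, and a node whose children_left entry is -1 has no children, so it is a leaf.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 2.0 * X[:, 0] + rng.normal(0, 1.0, size=100)

model = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)
tree = model.tree_

root_id = 0                           # the first node, before any branches
is_leaf = tree.children_left == -1    # -1 marks "no child": a terminal node
leaf_ids = np.flatnonzero(is_leaf)

print("root splits at x <=", tree.threshold[root_id])
print("terminal node ids:", leaf_ids)
```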
Models often have a number of parameters that the analyst
can choose or set. What is the best source of up-to-date info
about the different ones that can be set? - Answer: the
scikit-learn documentation
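As a quick complement to the documentation, an estimator can list its own settable parameters and their current values through get_params. DecisionTreeRegressor is used here only as an example; the documentation is still where each parameter is explained.

```python
from sklearn.tree import DecisionTreeRegressor

model = DecisionTreeRegressor()
params = model.get_params()  # dict of parameter name -> current value

for name, value in sorted(params.items()):
    print(name, "=", value)
```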