DATACAMP DATA SCIENTIST EXAM PRACTICE
QUESTIONS AND CORRECT ANSWERS (100%
CORRECT VERIFIED ANSWERS) 2024/2025
Whenever doing a relatively standard task, there probably already exists a function on
the internet - ✔✔
Python built-in pow function: pow(base, exponent, modulus) - ✔✔
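A minimal sketch of the built-in pow with two and three arguments. The numbers are arbitrary illustrative values, not taken from the exam.

```python
# Two-argument form: same as base ** exponent.
square = pow(3, 2)

# Three-argument form: (base ** exponent) % modulus, computed efficiently.
mod_result = pow(3, 4, 5)

print(square, mod_result)
```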
Python str.replace(replace_this, with_this) - ✔✔
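A short sketch of str.replace, assuming an illustrative string. Strings are immutable, so the method returns a new string and leaves the original alone.

```python
text = "data science with python"

# replace returns a new string; it does not modify text in place.
updated = text.replace("python", "Python")

print(updated)
print(text)
```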
when coding remember to return a value - ✔✔
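A small sketch of what happens when the return is forgotten. Both function names are hypothetical and exist only for this illustration.

```python
def add_no_return(a, b):
    total = a + b  # computed, but never handed back to the caller


def add_with_return(a, b):
    return a + b


missing = add_no_return(2, 3)   # a function without return gives None
result = add_with_return(2, 3)

print(missing, result)
```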
numpy
matplotlib
scikit-learn - ✔✔
np.array converts every element to a string if it is given a list that mixes strings with other types
(type coercion)
it won't always make them strings: booleans are converted to ints if there are no
strings, and ints are converted to floats when mixed with floats - ✔✔
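The coercion rules above can be checked by looking at the dtype NumPy picks. This is a sketch with made-up lists; the exact integer width can differ by platform, so only the dtype kind is inspected.

```python
import numpy as np

mixed = np.array([1, "a", True])        # a string is present -> everything becomes a string
bools_ints = np.array([True, 2, False])  # no strings -> booleans become ints
ints_floats = np.array([1, 2.5])         # ints and floats -> floats

# dtype.kind is "U" for unicode strings, "i" for signed ints, "f" for floats.
print(mixed.dtype.kind, bools_ints.dtype.kind, ints_floats.dtype.kind)
print(bools_ints)
```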
we can obtain an array of booleans by comparing an np.array to a condition
np.arrays can also be subset with an array of booleans
only the elements where the boolean is True are selected - ✔✔
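A minimal sketch of boolean subsetting, assuming an illustrative array of heights in metres.

```python
import numpy as np

heights = np.array([1.60, 1.75, 1.82, 1.68])

# Comparing the array to a condition gives an array of booleans.
tall = heights > 1.70

# Using that boolean array as an index keeps only the True positions.
selected = heights[tall]

print(tall)
print(selected)
```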
ndarray: n-dimensional array - ✔✔
ndarray.shape attribute (size along each dimension) - ✔✔
array[1][2] is equivalent to array[1,2] - ✔✔
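A quick sketch of the shape attribute and the two equivalent ways to index a 2-D array. The array contents are illustrative.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

shape = grid.shape        # (rows, columns); an attribute, so no parentheses
chained = grid[1][2]      # select row 1, then element 2 of that row
combined = grid[1, 2]     # same element with a single index expression

print(shape, chained, combined)
```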
arithmetic on numpy arrays (add, subtract, etc.) is applied to each element of the array;
adding two lists with + instead appends the next list - ✔✔
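A sketch contrasting + on NumPy arrays with + on plain lists, using arbitrary values.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

array_sum = a + b                    # element-wise addition
array_diff = b - a                   # element-wise subtraction
list_sum = [1, 2, 3] + [10, 20, 30]  # lists are concatenated instead

print(array_sum, array_diff, list_sum)
```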
np.mean: mean of all values passed to it
np.median: median of all values passed to it
because numpy enforces a single data type in an array, it can drastically improve the
speed of calculation as compared to a list
this is one reason C++ is faster: it has more structure, which allows for more assumptions
- ✔✔
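A small sketch of np.mean and np.median on an illustrative array with one outlier, which shows how the two summaries can differ.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 100])

mean_value = np.mean(values)      # pulled upward by the outlier
median_value = np.median(values)  # the middle value once sorted

print(mean_value, median_value)
```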
np.random.normal
np.column_stack - ✔✔
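A sketch that draws two sets of normally distributed samples and pastes them together as columns. NumPy spells the function np.column_stack, with an underscore. The means, standard deviations, and sample size are assumptions chosen for illustration.

```python
import numpy as np

np.random.seed(0)  # fixed seed so repeated runs give the same samples

# np.random.normal(mean, standard deviation, number of samples)
heights = np.random.normal(1.75, 0.20, 5)
weights = np.random.normal(60.0, 15.0, 5)

# column_stack takes a tuple of 1-D arrays and makes each one a column.
people = np.column_stack((heights, weights))

print(people.shape)
```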
if you use fancy indexing to assign part of an existing array to a new array, the new array
is a copy rather than a view of the first, that is to say changes to the new array will not
be reflected in the older array - ✔✔Introduction to Python
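A sketch of the copy behaviour described above, with a basic slice added for contrast (the slice case is not in the notes; it is included because slices return views in NumPy). Values are illustrative.

```python
import numpy as np

original = np.array([10, 20, 30, 40, 50])

# Fancy indexing (indexing with a list of positions) returns a copy.
picked = original[[0, 2, 4]]
picked[0] = -1
after_fancy = original.tolist()   # original is untouched

# A basic slice returns a view, so writes do reach the original.
window = original[1:3]
window[0] = 99
after_slice = original.tolist()

print(after_fancy, after_slice)
```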
plt.xscale("log") - ✔✔intermediate python
plt.xlabel()
plt.ylabel()
plt.title()
plt.yticks(tick values, tick names) - ✔✔
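A minimal sketch that puts the plot customisation calls together. The data, label text, and tick names are made up for illustration, and the Agg backend is an assumption so the code runs without a display; in a notebook you would normally call plt.show() instead.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
import matplotlib.pyplot as plt

x = [1, 10, 100, 1000]
y = [1, 2, 3, 4]

plt.scatter(x, y)
plt.xscale("log")                      # logarithmic x axis
plt.xlabel("x value (log scale)")
plt.ylabel("y value")
plt.title("Illustrative scatter plot")
plt.yticks([1, 2, 3, 4], ["one", "two", "three", "four"])  # tick values, tick names

ax = plt.gca()  # current axes, kept so the settings can be inspected
```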
[]+nameofalist - ✔✔
plt.scatter(x,y,size)
plt.text(xloc,yloc,text)
plt.grid()
list.index() - ✔✔
country in dictionary
del(dictionary[key]) - ✔✔
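A short sketch of the membership test and del on a dictionary. Note that both work on keys, not values. The dictionary name and contents are illustrative.

```python
capitals = {"france": "paris", "spain": "madrid"}

has_france = "france" in capitals   # in checks the keys
has_paris = "paris" in capitals     # values are not searched

del capitals["spain"]               # removes the key and its value

print(has_france, has_paris, capitals)
```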
order matters: list
lookup table: dictionary - ✔✔
dataframe.index=list of labels
pandas.read_csv("path/to/csv/file",index_col=int)
pandas.read_html("url") - ✔✔
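A sketch of read_csv with index_col, followed by replacing the index with a list of labels. To stay self-contained it reads from an in-memory string via io.StringIO instead of a file path; the CSV contents and the new labels are assumptions for illustration.

```python
import io

import pandas as pd

csv_text = "code,country,capital\nFR,France,Paris\nES,Spain,Madrid\n"

# index_col=0 uses the first column as the row labels.
table = pd.read_csv(io.StringIO(csv_text), index_col=0)
labels_from_file = table.index.tolist()

# Assigning a list to .index replaces the row labels.
table.index = ["row_a", "row_b"]
labels_after = table.index.tolist()

print(labels_from_file, labels_after)
```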
dataframe["column"] returns a Series
vs
dataframe[["column","column2"]] returns a DataFrame - ✔✔intermediate python
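A sketch of single versus double square brackets on a DataFrame. The frame and its column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

one_column = df["a"]           # single brackets with one name -> Series
two_columns = df[["a", "b"]]   # double brackets (a list of names) -> DataFrame
one_as_frame = df[["a"]]       # a one-item list still gives a DataFrame

print(type(one_column).__name__, type(two_columns).__name__, type(one_as_frame).__name__)
```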
dataframe[int1:int2] selects rows by position
pd.DataFrame.loc and .iloc
.loc["row name"] returns a Series, .loc[["row name"]] returns a DataFrame
.loc[["row name 1","row name 2"]]
.loc[[row names],[column names]] (a ":" can be used in place of the row list to select all rows)
.iloc[[1,2,3],[column positions]] - ✔✔
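A sketch of label-based selection with .loc, position-based selection with .iloc, and row slicing with plain brackets. The frame, its row labels, and its column names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"country": ["France", "Spain", "Italy"],
     "capital": ["Paris", "Madrid", "Rome"]},
    index=["FR", "ES", "IT"],
)

row_series = df.loc["FR"]                     # one label -> Series
row_frame = df.loc[["FR"]]                    # list of labels -> DataFrame
by_label = df.loc[["FR", "IT"], ["capital"]]  # rows and columns by label
all_rows = df.loc[:, ["country"]]             # ":" selects every row
by_position = df.iloc[[0, 2], [1]]            # same cells as by_label, by integer position
first_two = df[0:2]                           # plain slice selects rows by position

print(by_label)
```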
how to get a DataFrame back from columns of a different DataFrame -
✔✔firstdataframe[["column name 1","column name 2"]]
logical_and
logical_or
logical_not
np.logical_and(arraybools1,arraybools2) - ✔✔intermediate python
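A sketch of the three NumPy logical functions applied to boolean arrays built from comparisons. The array values and thresholds are arbitrary.

```python
import numpy as np

bmi = np.array([21.8, 20.9, 24.7, 18.5])

in_range = np.logical_and(bmi > 20, bmi < 22)  # both conditions True
extreme = np.logical_or(bmi < 19, bmi > 24)    # at least one condition True
not_extreme = np.logical_not(extreme)          # flips every boolean

selected = bmi[in_range]  # the result can be used directly for subsetting

print(in_range, extreme, not_extreme, selected)
```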
dictionary.items()
for key, value in dictionary.items():
    do stuff - ✔✔
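A runnable sketch of looping over a dictionary with items(). Note the parentheses: items is a method and has to be called. The dictionary is a hypothetical example.

```python
stock = {"apples": 3, "pears": 5}

lines = []
for key, value in stock.items():   # each iteration yields a (key, value) pair
    lines.append(f"{key}: {value}")

print(lines)
```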
np.nditer iterates over all the individual elements of a 2-D or larger array, not the sub-arrays
- ✔✔
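A sketch contrasting a plain for loop over a 2-D array, which yields the sub-arrays, with np.nditer, which yields each element. The array is illustrative.

```python
import numpy as np

grid = np.array([[1, 2],
                 [3, 4]])

# A plain loop over a 2-D array walks the rows (the sub-arrays).
rows = [row.tolist() for row in grid]

# np.nditer visits every individual element instead.
elements = [int(x) for x in np.nditer(grid)]

print(rows, elements)
```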