DataCamp Data Scientist Notes: Questions with Correct Answers
comparing an np.array to a condition (e.g. array > 5) returns an array of booleans
np.arrays can also be subset with a boolean array
only the elements where the boolean is True are selected
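A minimal sketch of boolean subsetting. The array name and the values are made up for illustration.

```python
import numpy as np

# Illustrative heights in metres (made-up values).
heights = np.array([1.60, 1.75, 1.82, 1.68])

# Comparing the array to a condition gives one boolean per element.
mask = heights > 1.70

# Subsetting with the boolean array keeps only the True positions.
tall = heights[mask]

print(mask)
print(tall)
```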
ndarray: n-dimensional array
array.shape: attribute (no parentheses) giving the size of each dimension, e.g. (rows, columns)
array[1][2] is equivalent to array[1, 2]
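A short sketch of shape and the two indexing styles on a 2D array. The name grid and its contents are assumptions chosen for the example.

```python
import numpy as np

# A 3 x 3 array (illustrative values).
grid = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

# shape is an attribute, not a method: (rows, columns).
shape = grid.shape

# Both forms pick row 1, column 2 (zero-based).
chained = grid[1][2]
combined = grid[1, 2]

print(shape, chained, combined)
```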
numpy array: operations (add, subtract, etc.) apply to each element of the array
list: + joins the next list onto the end instead (concatenation)
np.mean: mean of all values passed to it
np.median: median of all values passed to it
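To make the array versus list difference concrete, here is a small sketch with made-up numbers. It also calls np.mean and np.median.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

# With arrays, + works element by element.
array_sum = a + b

# With lists, + concatenates the two lists.
list_sum = [1, 2, 3] + [10, 20, 30]

# Summary statistics over all values passed in.
mean_value = np.mean(a)
median_value = np.median(np.array([1, 2, 3, 100]))

print(array_sum, list_sum, mean_value, median_value)
```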
because numpy enforces a single data type in an array, it can drastically improve the speed of
calculation as compared to a list
this is one reason C++ is faster: it has more structure, which allows for more assumptions
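The single-data-type rule can be seen directly: when the inputs are mixed, numpy converts them all to one common type. A minimal sketch with made-up values (it shows the type coercion only, not a speed measurement):

```python
import numpy as np

# int, float and bool together are all stored as floats.
mixed = np.array([1, 2.5, True])

# A number next to a string: everything becomes a string.
with_text = np.array([1, "two"])

print(mixed, mixed.dtype)
print(with_text, with_text.dtype)
```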
np.random.normal(mean, standard deviation, number of samples)
np.column_stack((array1, array2))
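A sketch combining the two calls: draw two sets of normally distributed samples and paste them together as columns. The means, standard deviations, sample count and seed are arbitrary choices for the example.

```python
import numpy as np

np.random.seed(123)  # fixed seed so the run is repeatable

# 1000 samples each: (mean, standard deviation, number of samples).
height = np.random.normal(1.75, 0.20, 1000)
weight = np.random.normal(60.0, 15.0, 1000)

# column_stack takes a tuple of 1D arrays and makes them columns of a 2D array.
people = np.column_stack((height, weight))

print(people.shape)
```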
if you use fancy indexing to assign part of an older array to a new array, the new array is a copy
and does not share data with the first; that is to say, changes to the new array will not be reflected in the
older array (Introduction to Python)
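A small sketch of the copy behaviour. The slice comparison at the end is an addition for contrast, not something from the notes: a plain slice of a numpy array is a view, so changes through it do show up in the older array.

```python
import numpy as np

original = np.array([10, 20, 30, 40, 50])

# Fancy indexing (a list of positions) returns a copy.
picked = original[[0, 2, 4]]
picked[0] = 999  # original is unaffected

# For contrast: a basic slice is a view onto the same data.
window = original[1:3]
window[0] = -1   # this does change original[1]

print(original)
print(picked)
```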
plt.xscale("log"): put the x-axis on a logarithmic scale (Intermediate Python)
plt.xlabel()
plt.ylabel()
plt.title()
plt.yticks(tick values, tick names)
[] + nameofalist: list concatenation, returns a new list
plt.scatter(x, y, s=size)
plt.text(xloc,yloc,text)
plt.grid()
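The plotting calls above fit together roughly as follows. This is a sketch with made-up data and label text; the Agg backend line is an assumption so that the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

# Made-up data for illustration.
gdp = [1000, 5000, 20000, 40000]
life_exp = [55, 65, 75, 80]
size = [20, 60, 120, 40]

plt.figure()
plt.scatter(gdp, life_exp, s=size)   # s sets the marker sizes
plt.xscale("log")                    # logarithmic x-axis
plt.xlabel("GDP per capita")
plt.ylabel("Life expectancy")
plt.title("Example scatter plot")
plt.yticks([55, 65, 75], ["55 y", "65 y", "75 y"])  # tick values, tick names
plt.text(5000, 66, "a label")        # x location, y location, text
plt.grid(True)

ax = plt.gca()  # current axes, handy for inspecting what was set
```

In an interactive session, plt.show() would display the figure.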
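list.index(value): position of the first matching item
country in dictionary: True if country is one of the keys
del(dict[key]): removes that key and its value
order matters: list
lookup table: dictionary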
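A sketch contrasting the two containers, with illustrative country and capital names. The list version needs index() to find a position first; the dictionary looks the value up by key.

```python
# Two parallel lists: order matters, lookup goes through a position.
countries = ["spain", "france", "germany"]
capitals = ["madrid", "paris", "berlin"]
position = countries.index("france")
capital_from_lists = capitals[position]

# A dictionary works as a lookup table keyed by country.
europe = {"spain": "madrid", "france": "paris", "germany": "berlin"}
has_france = "france" in europe      # membership test checks the keys
capital_from_dict = europe["france"]

# Remove one entry by key.
del(europe["germany"])

print(position, capital_from_lists, capital_from_dict, europe)
```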
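dataframe.index = list of labels
pandas.read_csv("path/to/csv/file", index_col=int): index_col is the position of the column to use as row labels
pandas.read_html("url"): returns a list of DataFrames, one per table found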
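A minimal sketch of index_col and of replacing the index afterwards. To stay self-contained it reads CSV text from memory through io.StringIO instead of a file path; the CSV content and the new labels are made up.

```python
import io
import pandas as pd

# Stand-in for a CSV file on disk (made-up content).
csv_text = "code,country,population\nES,Spain,47\nFR,France,68\n"

# index_col=0: use the first column as the row labels.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)
labels_from_file = list(df.index)

# Row labels can be replaced by assigning a list of the same length.
df.index = ["row_a", "row_b"]

print(labels_from_file)
print(df)
```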
pd.DataFrame["column"] series
vs
pd.DataFrame[["column","column2"]] dataframe (Intermediate Python)
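A sketch of single versus double brackets. The DataFrame name, column names and values are assumptions for the example.

```python
import pandas as pd

# Small illustrative DataFrame.
cars = pd.DataFrame(
    {
        "country": ["United States", "Japan", "India"],
        "cars_per_cap": [809, 588, 18],
        "drives_right": [True, False, False],
    },
    index=["US", "JPN", "IN"],
)

one_column_series = cars["country"]               # single brackets: Series
one_column_frame = cars[["country"]]              # double brackets: DataFrame
two_columns = cars[["country", "cars_per_cap"]]   # list of columns: DataFrame

print(type(one_column_series).__name__)
print(type(one_column_frame).__name__)
print(two_columns)
```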
pd.DataFrame[int1:int2]: slice of rows by position (int2 excluded)
pd.DataFrame.loc (label-based) and .iloc (position-based)
.loc["row name"] returns a series, .loc[["row name"]] returns a DataFrame
.loc[["row name 1","row name 2"]]
.loc[[row names], [column names]] (a ":" can be used in place of either list)
.iloc[[1,2,3], [column positions]]
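The row selection forms above, sketched on a small DataFrame. The names and values are made up for illustration.

```python
import pandas as pd

cars = pd.DataFrame(
    {
        "country": ["United States", "Japan", "India"],
        "cars_per_cap": [809, 588, 18],
        "drives_right": [True, False, False],
    },
    index=["US", "JPN", "IN"],
)

first_two_rows = cars[0:2]            # row slice by position, end excluded
row_series = cars.loc["JPN"]          # one label: Series
row_frame = cars.loc[["JPN"]]         # list of labels: DataFrame

# Rows and columns by label.
subset = cars.loc[["US", "IN"], ["country", "drives_right"]]

# ":" keeps every row.
all_rows = cars.loc[:, ["country"]]

# Same idea with integer positions.
by_position = cars.iloc[[0, 2], [0, 1]]

print(subset)
print(by_position)
```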
how to get a dataframe back from columns of a different dataframe:
firstdataframe[["column name 1","column name 2"]]
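np.logical_and(condition1, condition2): element-wise "and" of two boolean arrays
np.logical_or(condition1, condition2): element-wise "or" of two boolean arrays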
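A sketch of combining two conditions on an array and using the result to subset it. The values and thresholds are made up.

```python
import numpy as np

values = np.array([18.0, 21.5, 24.0, 30.2])

# True where both conditions hold.
between = np.logical_and(values > 20, values < 25)

# True where at least one condition holds.
outside = np.logical_or(values < 20, values > 25)

# The boolean result can be used to subset the array.
selected = values[between]

print(between, outside, selected)
```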