DAT 250 Test 5 With Questions and Answers
What are the 3 types of big data? - Answer - structured, unstructured, semi-structured
What percent of data is structured? - Answer - 5-10%
What percent of data is unstructured? - Answer - 80%
What percent of data is semi-structured? - Answer - 5-10%
What type of data is email? - Answer - unstructured
What is structured data? - Answer - Data that can be stored and processed in a fixed format. Highly
organized information that is readily accessible
An Excel spreadsheet would be which type of data? - Answer - Structured, because it is organized into
rows and columns
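As a rough illustration of the three data types above, the sketch below holds one tiny sample of each in Python. The sample values and variable names are made up for the example and are not from the course material.

```python
import csv
import io
import json

# Structured: fixed columns, and every row has the same fields (like a spreadsheet).
structured_text = "id,name,amount\n1,Ana,20.5\n2,Ben,13.0\n"
rows = list(csv.DictReader(io.StringIO(structured_text)))

# Semi-structured: self-describing keys, but records may differ in shape.
semi_structured_text = '[{"id": 1, "tags": ["new"]}, {"id": 2, "note": "call back"}]'
records = json.loads(semi_structured_text)

# Unstructured: free text with no predefined fields (for example, an email body).
unstructured_text = "Hi team, the shipment arrived late again. Can we talk tomorrow?"

column_names = list(rows[0].keys())                 # the same columns for every row
record_keys = [sorted(r.keys()) for r in records]   # keys vary from record to record
word_count = len(unstructured_text.split())         # only generic text operations apply

print(column_names)
print(record_keys)
print(word_count)
```

The structured sample can be queried by column name right away, while the unstructured sample needs further processing before any fields can be extracted from it.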
What are the 7 steps of the Data Science Lifecycle? - Answer - Business Understanding, Data Mining,
Data Cleaning, Data Exploration, Feature Engineering, Predictive Modeling, Data Visualization
What is Business Understanding? - Answer - Asking relevant questions and defining objectives for the
problem that needs to be solved.
What is Data Mining? - Answer - "mining the data"
-gathering and scoping the data necessary for the project
What is Data Cleaning? - Answer - "cleaning the data"
-fixing the inconsistencies in the data and handling missing values
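A minimal sketch of the data cleaning step, assuming a small hypothetical table of city temperatures. Filling missing values with the mean is only one possible strategy, chosen here for illustration.

```python
from statistics import mean

# Hypothetical raw rows: inconsistent city spelling and missing or invalid temperatures.
raw_rows = [
    {"city": " boston ", "temp_f": "71"},
    {"city": "Boston", "temp_f": ""},
    {"city": "DENVER", "temp_f": "65"},
    {"city": "denver", "temp_f": "n/a"},
]


def parse_temp(value):
    """Return the value as a float, or None when it is missing or not numeric."""
    try:
        return float(value)
    except ValueError:
        return None


def clean(rows):
    # Fix inconsistencies: trim whitespace and normalize capitalization.
    cleaned = [
        {"city": r["city"].strip().title(), "temp_f": parse_temp(r["temp_f"])}
        for r in rows
    ]
    # Handle missing values: fill them with the mean of the known values.
    known = [r["temp_f"] for r in cleaned if r["temp_f"] is not None]
    fill_value = mean(known)
    for r in cleaned:
        if r["temp_f"] is None:
            r["temp_f"] = fill_value
    return cleaned


cleaned_rows = clean(raw_rows)
for row in cleaned_rows:
    print(row)
```

Other common choices for missing values include dropping the affected rows or filling with a median. Which one fits depends on the data and the problem.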
What is Data Exploration? - Answer - "exploring the data"
-forming a hypothesis about your defined problem by visually analyzing the data
What is Feature Engineering? - Answer - Selecting important features & constructing more meaningful
ones using the raw data you have.
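To make feature engineering concrete, here is a small sketch that keeps one raw field, drops an uninformative one, and constructs two new features. The order records and the field names (`total_price`, `is_weekend`, `internal_id`) are hypothetical.

```python
from datetime import date

# Hypothetical raw order records.
raw_orders = [
    {"order_date": "2024-03-02", "quantity": 3, "unit_price": 4.0, "internal_id": "a1"},
    {"order_date": "2024-03-06", "quantity": 1, "unit_price": 10.0, "internal_id": "b2"},
]


def engineer_features(order):
    day = date.fromisoformat(order["order_date"])
    return {
        # Selected as-is: quantity is kept because it is likely informative.
        "quantity": order["quantity"],
        # Constructed: combine two raw columns into a more meaningful one.
        "total_price": order["quantity"] * order["unit_price"],
        # Constructed: derive a weekend flag from the raw date (Mon=0 ... Sun=6).
        "is_weekend": day.weekday() >= 5,
        # internal_id is intentionally dropped: it identifies a row, not a pattern.
    }


features = [engineer_features(order) for order in raw_orders]
for row in features:
    print(row)
```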
What is Predictive Modeling? - Answer - "using models to make predictions"
-train machine learning models, evaluate their performance and use them to make predictions.
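The train, evaluate, and predict steps can be sketched with a very small model. The example below fits a straight line by ordinary least squares in plain Python. The data is made up, and a real project would more likely use a machine learning library, so treat this as an illustration only.

```python
def fit_line(xs, ys):
    """Train: fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, xs, ys):
    """Evaluate: average absolute gap between predictions and true values."""
    return sum(abs(predict(model, x) - y) for x, y in zip(xs, ys)) / len(xs)


# Made-up data that follows y = 2x + 1, split into training and held-out test sets.
train_x, train_y = [1, 2, 3, 4], [3, 5, 7, 9]
test_x, test_y = [5, 6], [11, 13]

model = fit_line(train_x, train_y)                      # train
test_error = mean_absolute_error(model, test_x, test_y)  # evaluate on unseen data
new_prediction = predict(model, 10)                      # predict for a new input

print(model, test_error, new_prediction)
```

Evaluating on data the model did not see during training is what indicates whether the model generalizes rather than just memorizing the training set.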
What is Data Visualization? - Answer - "visualizing the data"
-communicating the findings to key stakeholders using plots and interactive visualizations.
What is data science? - Answer - The practice of mining large sets of raw data to identify patterns
and gain insight from them.
-raw data is both structured and unstructured
What is AI? - Answer - Getting a computer to mimic human behavior in some way.
What is machine learning? - Answer - subset of AI
-giving the computer a brain to learn from data
-helps computers figure things out from data and deliver AI applications
What is one of the most common tools data scientists use? - Answer - Open-source notebooks: web
applications for writing and running code and visualizing data all in one place
ex: Jupyter, RStudio, and Zeppelin
Who oversees the data science process? - Answer - Business managers, IT managers, and Data Science
managers