COMPLETE SOLUTIONS 2026
What are the 3 types of big data? - ANSWERSstructured, unstructured, semi-structured
What percent of data is structured? - ANSWERS5-10%
What percent of data is unstructured? - ANSWERS80%
What percent of data is semi-structured? - ANSWERS5-10%
What type of data is email? - ANSWERSunstructured
What is structured data? - ANSWERSData that is stored and processed in a fixed format.
Highly organized information that is readily accessible
An excel spreadsheet would be which type of data? - ANSWERSStructured, due to
organized rows and columns
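A minimal sketch of the difference between structured and semi-structured data, using only the Python standard library. The sample names, ages, and tags are made up for illustration. CSV stands in for a spreadsheet, and JSON stands in for a semi-structured source.

```python
import csv
import io
import json

# Structured: every row shares the same fixed columns, like a spreadsheet.
csv_text = "name,age\nAda,36\nGrace,45\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: each record carries its own keys, and they may differ.
json_text = '[{"name": "Ada", "age": 36}, {"name": "Grace", "tags": ["navy"]}]'
records = json.loads(json_text)

# Collect the distinct "shapes" (sets of field names) found in each source.
structured_shapes = {tuple(row.keys()) for row in rows}
semi_structured_shapes = {tuple(sorted(rec.keys())) for rec in records}

print(len(structured_shapes))       # one shape shared by all rows
print(len(semi_structured_shapes))  # more than one shape across records
```

The structured rows all share one shape, while the JSON records do not, which is why semi-structured data usually needs extra handling before it fits into rows and columns.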
What are the 7 steps of the Data Science Lifecycle? - ANSWERSBusiness
Understanding, Data Mining, Data Cleaning, Data Exploration, Feature Engineering,
Predictive Modeling, Data Visualization
What is Business Understanding? - ANSWERSAsking relevant questions and defining
objectives for the problem that needs to be solved.
What is Data Mining? - ANSWERS"mining the data"
-gathering and scoping the data necessary for the project
What is Data Cleaning? - ANSWERS"cleaning the data"
-fixing the inconsistencies in the data and handling missing values
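As an illustration of the data cleaning step, here is a small sketch that fixes inconsistent text values and fills in a missing value. The records are hypothetical, and filling a missing age with the mean is only one of several possible strategies (an assumption made for this example).

```python
from statistics import mean

# Hypothetical raw records: inconsistent city spellings and one missing age.
raw = [
    {"city": " new york", "age": 30},
    {"city": "New York ", "age": None},
    {"city": "BOSTON", "age": 40},
]

def clean(records):
    """Normalize city names and impute missing ages with the mean age."""
    known_ages = [r["age"] for r in records if r["age"] is not None]
    fill_value = mean(known_ages)  # assumption: mean imputation is acceptable
    cleaned = []
    for r in records:
        cleaned.append({
            # Trim whitespace and standardize capitalization.
            "city": r["city"].strip().title(),
            # Keep the age if present, otherwise use the fill value.
            "age": r["age"] if r["age"] is not None else fill_value,
        })
    return cleaned

cleaned = clean(raw)
for row in cleaned:
    print(row)
```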
What is Data Exploration? - ANSWERS"exploring the data"
-forming a hypothesis about your defined problem by visually analyzing the data
What is Feature Engineering? - ANSWERSSelecting important features & constructing
more meaningful ones using the raw data you have.
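A short sketch of constructing more meaningful features from raw data. The raw columns (height, weight, signup date) and the derived features (body mass index, a weekend flag) are hypothetical choices for illustration only.

```python
from datetime import date

# Hypothetical raw rows with three raw columns each.
raw = [
    {"height_m": 1.80, "weight_kg": 81.0, "signup": date(2024, 1, 6)},
    {"height_m": 1.60, "weight_kg": 64.0, "signup": date(2024, 1, 8)},
]

def engineer(row):
    """Derive new features from the raw columns of one row."""
    return {
        # Combine two raw columns into one more informative number.
        "bmi": row["weight_kg"] / row["height_m"] ** 2,
        # Turn a raw date into a yes/no feature (Saturday=5, Sunday=6).
        "signup_is_weekend": row["signup"].weekday() >= 5,
    }

features = [engineer(r) for r in raw]
for f in features:
    print(f)
```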
What is Predictive Modeling? - ANSWERS"using models to make predictions"
-train machine learning models, evaluate their performance and use them to make
predictions.
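The train, evaluate, predict cycle described above can be sketched with a very simple model. This example fits a straight line by ordinary least squares in plain Python. The data points are made up, and a real project would more likely use a machine learning library.

```python
def fit_line(xs, ys):
    """Train: fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Predict: apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept

def mean_absolute_error(model, xs, ys):
    """Evaluate: average absolute gap between predictions and true values."""
    return sum(abs(predict(model, x) - y) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data, split into a training set and a held-out test set.
train_x, train_y = [1, 2, 3, 4], [3, 5, 7, 9]
test_x, test_y = [5, 6], [11, 13]

model = fit_line(train_x, train_y)
mae = mean_absolute_error(model, test_x, test_y)
prediction = predict(model, 10)
print(model, mae, prediction)
```

Evaluating on data the model did not see during training is what makes the error estimate meaningful.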
What is Data Visualization? - ANSWERS"visualizing the data"
-communicate the findings to key stakeholders using plots and interactive
visualizations.
What is data science? - ANSWERSThe practice of mining large sets of raw data to
identify patterns and extract insights from them.
-raw data is both structured and unstructured
What is AI? - ANSWERSGetting a computer to mimic human behavior in some way.
What is machine learning? - ANSWERSsubset of AI
-giving the computer a brain to learn from data
-helps computers figure things out from data and deliver AI applications
What is one of the most common tools data scientists use? - ANSWERSOpen-source
notebooks: web applications for writing and running code and visualizing data all in one
place
ex: Jupyter, RStudio, and Zeppelin
Who oversees the data science process? - ANSWERSBusiness managers, IT managers,
and Data Science managers
What is the contribution of the business manager? - ANSWERSdefine the problem
and develop an analysis strategy
What is the contribution of the IT manager? - ANSWERSmay help build and update IT
environments for data science teams
-monitor operations and resource usage
What is the contribution of the data science manager? - ANSWERSoversees the data
science team
-team builders who balance team development with project planning and monitoring
What is the most important role in the data science process? - ANSWERSThe data
scientist.
What is secondary use? - ANSWERSUsing information that was collected for one purpose
for a different purpose
How is Google's Personalized Search an example of secondary use? - ANSWERSThe
information Google collects about your search queries and the Web pages you visit can be
used by companies/retailers for direct marketing.
What is collaborative filtering? - ANSWERSAnalyzing information about preferences of
large numbers of people to predict what one person may prefer.
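A minimal sketch of user-based collaborative filtering, one common way to apply the idea above. The users, items, and ratings are invented, and cosine similarity with a weighted average is an assumed design choice, not the only approach.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5, "C": 1, "D": 5},
    "cat": {"A": 1, "B": 1, "C": 5, "D": 1},
}

def cosine(u, v):
    """Cosine similarity between two users over the items both have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)

def predict_rating(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    weighted_sum = 0.0
    total_weight = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        weight = cosine(ratings[user], their_ratings)
        weighted_sum += weight * their_ratings[item]
        total_weight += weight
    return weighted_sum / total_weight if total_weight else None

# ann has not rated item "D"; estimate it from bob's and cat's ratings.
predicted = predict_rating("ann", "D")
print(predicted)
```

Because ann's past ratings resemble bob's far more than cat's, the estimate for item D is pulled toward bob's high rating rather than cat's low one.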