Science.
Table of Contents
Unit 1: Introducing Data Science and Data Collection
● Chapter 1: What Are Data and Data Science?
● Chapter 2: Collecting and Preparing Data
Unit 2: Analyzing Data Using Statistics
● Chapter 3: Descriptive Statistics: Statistical Measurements and Probability
Distributions
● Chapter 4: Inferential Statistics and Regression Analysis
Unit 3: Predicting and Modeling Using Data
● Chapter 5: Time Series and Forecasting
● Chapter 6: Decision-Making Using Machine Learning Basics
● Chapter 7: Deep Learning and AI Basics
Unit 4: Maintaining a Professional and Ethical Data Science Practice
● Chapter 8: Ethics Throughout the Data Science Cycle
● Chapter 9: Visualizing Data
● Chapter 10: Reporting Results
Chapter 1: What Are Data and Data Science?
Question 1
In the modern data science landscape, the boundaries between traditional disciplines
have blurred. Which of the following best describes the contemporary expectation for a
data scientist compared to the "traditional" model?
A. Data scientists should focus solely on mathematical modeling, leaving data collection to
domain experts and storage to computer scientists.
B. Data scientists are expected to possess expertise across domain knowledge, data
management/computing, and statistical analysis.
C. Data collection is now viewed as an automated process that no longer requires the
context provided by domain experts.
D. Statistical analysis has become secondary to the ability to manage large cloud-based
warehouses like Amazon Redshift.
Correct Answer: B
Explanation: Historically, data science tasks were siloed among domain experts, computer
scientists, and statisticians. Technological advancement has integrated these roles,
requiring modern data science teams to bridge all three domains.
Question 2
According to the 2020 Anaconda survey cited in the text, which phase of the data science
cycle consumes approximately half of a data scientist's time and effort?
A. Problem Definition
B. Data Analysis
C. Data Collection and Preparation
D. Data Reporting and Visualization
Correct Answer: C
Explanation: While analysis and reporting are critical, data scientists spend about 50% of
their total time and effort on data collection, cleaning, and preparation.
Question 3
A data scientist is analyzing global web search trends. They notice that "nighttime" search
queries are inconsistent because timestamps were stored in UTC but the
users reside in multiple time zones. Correcting this discrepancy occurs during which
stage?
A. Problem Definition
B. Data Preparation
C. Data Analysis
D. Data Reporting
Correct Answer: B
Explanation: Data preparation (or processing) addresses issues like time zone variations,
typos, and missing values to ensure the analysis yields accurate results.
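Illustration: The time zone fix described above can be sketched in Python. This is a
minimal sketch, not the method used in the text. The function names, the sample records,
and the 22:00 to 06:00 "nighttime" window are all assumptions made for illustration. It
relies on the standard-library zoneinfo module (Python 3.9+), which needs a time zone
database available on the system.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def local_hour(utc_timestamp: str, tz_name: str) -> int:
    """Convert an ISO-format UTC timestamp to the hour of day in tz_name."""
    # The stored value carries no zone information, so mark it as UTC first.
    utc_time = datetime.fromisoformat(utc_timestamp).replace(tzinfo=timezone.utc)
    return utc_time.astimezone(ZoneInfo(tz_name)).hour


def is_nighttime(hour: int) -> bool:
    """Assumed definition of nighttime: 22:00 up to (not including) 06:00."""
    return hour >= 22 or hour < 6


# Hypothetical search records: (UTC timestamp, user's time zone).
searches = [
    ("2024-01-15T03:00:00", "America/New_York"),
    ("2024-01-15T03:00:00", "Asia/Tokyo"),
]

for stamp, tz_name in searches:
    hour = local_hour(stamp, tz_name)
    print(tz_name, hour, is_nighttime(hour))
```

The two sample records share the same UTC timestamp but fall at different local hours,
which is why labeling searches as "nighttime" from the raw UTC value alone gives
inconsistent results.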
Question 4
Which technology is specifically characterized as a centralized repository that manages
large volumes of data from various sources to enable business intelligence, often offered
as a cloud service like Google BigQuery?
A. Local Storage
B. Data Warehousing
C. Spreadsheet Programs
D. Integrated Development Environments (IDEs)
Correct Answer: B
Explanation: Data warehousing systems store and manage large volumes of data centrally,
allowing for efficient retrieval and analysis in a cloud-based environment.
Question 5
Walmart’s use of predictive analytics during Hurricane Frances in 2004 discovered a
sevenfold increase in sales for which specific product, allowing them to optimize inventory
ahead of the storm?
A. Flashlights
B. Bottled Water
C. Strawberry Pop-Tarts