Data Analytics Applications
3.0 Credits
Objective Assessment Review (Qns & Ans)
2025
©2025
Question 1:
An enterprise requires immediate insights from streaming data
generated by IoT sensors in its manufacturing plants. Which key
advantage does a stream processing architecture (e.g., using Apache
Kafka with Spark Streaming) provide over traditional batch processing?
A. Higher overall data accuracy
B. Lower latency for real-time decision making
C. Simplified data integration from multiple sources
D. Reduced cost for historical data storage
Correct ANS: B. Lower latency for real‑time decision making
Rationale:
Stream processing architectures handle data continuously as it arrives,
ensuring that insights are delivered in real time. This low latency is vital
for time‑sensitive applications, such as monitoring manufacturing
processes and immediately detecting anomalies.
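The latency contrast can be illustrated with a minimal pure-Python sketch (no Kafka or Spark here; `sensor_readings` is a hypothetical stand-in for a live topic of temperature readings):

```python
def sensor_readings():
    """Stand-in for a live stream: yields one reading at a time."""
    for temp in [70.1, 70.4, 69.8, 95.3, 70.2]:
        yield temp

def process_stream(readings, threshold=90.0):
    """Handle each reading as it arrives, so an alert can fire the
    moment an anomaly appears, rather than after a nightly batch job."""
    alerts = []
    for temp in readings:
        if temp > threshold:
            alerts.append(temp)  # in production: push to an alerting system
    return alerts

alerts = process_stream(sensor_readings())
print(alerts)  # the anomalous reading is flagged as soon as it is seen
```

In a real deployment the generator would be replaced by a Kafka consumer and the per-record logic by a Spark Structured Streaming job, but the defining property is the same: processing is per-event, not per-batch.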
---
Question 2:
When designing an ETL pipeline for a data analytics application that
aggregates information from multiple heterogeneous sources, which
architectural component is most crucial?
A. A centralized high‑performance server
B. A parallel data transformation engine
C. Manual data cleansing processes
D. A legacy data warehousing solution
Correct ANS: B. A parallel data transformation engine
Rationale:
Scalable ETL pipelines rely on parallel processing to transform and
integrate data efficiently. This approach allows for the simultaneous
handling of diverse data sources, ensuring the pipeline can manage high
volumes and varying data formats.
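A parallel transformation stage can be sketched with the standard library alone. The `transform` function and the two in-memory `sources` below are hypothetical stand-ins for real per-source normalization logic and real extracts:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical transform: normalize heterogeneous records into a
# common (id, value) shape before loading.
def transform(record):
    return {"id": record["id"], "value": float(record["raw"])}

sources = [
    [{"id": 1, "raw": "3.5"}, {"id": 2, "raw": "7.0"}],  # e.g., a CSV extract
    [{"id": 3, "raw": "1.25"}],                          # e.g., an API extract
]

def transform_source(records):
    return [transform(r) for r in records]

# Transform each source concurrently instead of one after another.
with ThreadPoolExecutor() as pool:
    transformed = list(pool.map(transform_source, sources))

# Flatten into a single load-ready batch; map() preserves source order.
batch = [row for chunk in transformed for row in chunk]
print(batch)
```

Engines such as Spark apply the same idea at cluster scale, partitioning each source so transformations run on many workers at once.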
---
Question 3:
In predictive modeling applications, overfitting is a common challenge
that can lead to poor generalization on new data. Which technique is
most effective in mitigating this issue?
A. Increasing the number of model parameters
B. Applying cross‑validation during model training
C. Reducing the training dataset size
D. Incorporating hard‑coded business rules
Correct ANS: B. Applying cross‑validation during model training
Rationale:
Cross‑validation involves partitioning the dataset into multiple subsets to
validate the model’s performance on unseen data. This practice helps
detect overfitting and guides adjustments to improve generalization.
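The k-fold procedure can be sketched in plain Python. The `fit` and `score` callables below are hypothetical placeholders (here a trivial predict-the-mean model scored by mean squared error); in practice they would wrap a real estimator:

```python
def k_fold_scores(X, y, fit, score, k=5):
    """Split the data into k folds; train on k-1 folds, evaluate on the
    held-out fold, and collect one score per fold."""
    n = len(X)
    fold = n // k
    scores = []
    for i in range(k):
        lo = i * fold
        hi = (i + 1) * fold if i < k - 1 else n
        X_val, y_val = X[lo:hi], y[lo:hi]
        X_tr, y_tr = X[:lo] + X[hi:], y[:lo] + y[hi:]
        model = fit(X_tr, y_tr)
        scores.append(score(model, X_val, y_val))
    return scores

# Toy "model": predict the mean of the training targets.
def fit(X_tr, y_tr):
    return sum(y_tr) / len(y_tr)

# Score: mean squared error on the held-out fold.
def score(model, X_val, y_val):
    return sum((yv - model) ** 2 for yv in y_val) / len(y_val)

X = list(range(10))
y = [2 * x for x in X]
scores = k_fold_scores(X, y, fit, score, k=5)
print(scores)
```

A large gap between training performance and these held-out scores is the signature of overfitting; averaging the fold scores gives a less biased estimate of generalization than a single train/test split.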
---
Question 4:
A financial firm uses a data analytics system to continuously monitor
transaction flows for fraudulent patterns. Which type of algorithm is best
suited for detecting rare and anomalous activities in such applications?
A. Standard regression analysis
B. Anomaly detection algorithms (e.g., Isolation Forest)
C. Unsupervised clustering
D. Time‑series forecasting
Correct ANS: B. Anomaly detection algorithms (e.g., Isolation Forest)
Rationale:
Anomaly detection algorithms such as Isolation Forest are designed to isolate observations that deviate from the bulk of the data. Because fraudulent transactions are rare and differ from routine activity, these algorithms can flag them without requiring labeled examples of every possible fraud pattern.
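A minimal sketch using scikit-learn's `IsolationForest` (the transaction amounts are simulated; `contamination` is set to roughly the expected fraud rate, an assumption that would be tuned in practice):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Simulated transaction amounts: mostly routine values near 50,
# plus two injected extreme transactions.
normal = rng.normal(loc=50.0, scale=5.0, size=(200, 1))
fraud = np.array([[500.0], [650.0]])
X = np.vstack([normal, fraud])

model = IsolationForest(contamination=0.01, random_state=0).fit(X)
labels = model.predict(X)  # 1 = normal, -1 = anomaly
print(labels[-2:])  # the two injected extremes are flagged as -1
```

Isolation Forest works by randomly partitioning the feature space: anomalous points are isolated in few splits, so a short average path length marks a record as suspicious. This makes it well suited to unlabeled, highly imbalanced data such as transaction streams.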