Exam (elaborations)

Certified Machine Learning (Python) Practice Exam

Pages
55
Grade
A+
Uploaded on
26-03-2025
Written in
2024/2025

1. Introduction to Machine Learning
• Definition and history of machine learning
• Types of machine learning (Supervised, Unsupervised, Semi-supervised, Reinforcement)
• Key applications of machine learning in different industries
• Overview of the machine learning lifecycle
• Difference between AI, ML, and Deep Learning
• Understanding the concept of training, testing, and validation

2. Python Basics for Machine Learning
• Python programming essentials for data analysis and machine learning
• Python libraries (NumPy, pandas, scikit-learn, TensorFlow, Keras, Matplotlib)
• Working with data structures (lists, tuples, dictionaries, sets) in Python
• Control flow (if, else, for, while loops) and functions in Python
• Writing Python scripts and functions for ML workflows
• Understanding Python's object-oriented features
• Data manipulation using pandas: DataFrames, Series, indexing, and slicing

3. Data Preprocessing and Cleaning
• Data collection and loading data from different formats (CSV, Excel, SQL, etc.)
• Handling missing data (imputation, removal, interpolation)
• Encoding categorical variables (One-hot encoding, Label encoding)
• Feature scaling techniques (Standardization, Normalization)
• Data transformation and feature engineering
• Handling imbalanced data (SMOTE, undersampling, oversampling)
• Feature selection techniques (Correlation, Recursive Feature Elimination)
• Handling outliers and data smoothing

4. Exploratory Data Analysis (EDA)
• Descriptive statistics (mean, median, variance, skewness, kurtosis)
• Visualizing data distributions (histograms, box plots, density plots)
• Bivariate and multivariate analysis
• Data visualization techniques (Matplotlib, Seaborn, Plotly)
• Identifying patterns and relationships between features
• Correlation matrices and heatmaps
• Using scatter plots and pair plots for analysis

5. Supervised Learning Algorithms
• Linear Regression
  o Concept of linear regression
  o Cost function, Gradient Descent
  o Model evaluation (MSE, RMSE, MAE, R-squared)
• Logistic Regression
  o Understanding the logistic function and its application
  o Cost function for logistic regression
  o Model evaluation (Accuracy, Precision, Recall, F1-Score, ROC Curve)
• Decision Trees
  o Splitting criteria (Gini Impurity, Entropy, Information Gain)
  o Overfitting and pruning techniques
  o Hyperparameter tuning (max depth, min samples split)
• Support Vector Machines (SVM)
  o Hyperplane, margin, and kernels (linear, polynomial, RBF)
  o SVM for classification and regression
  o Regularization in SVM
• k-Nearest Neighbors (k-NN)
  o Distance metrics (Euclidean, Manhattan)
  o Model evaluation (Confusion Matrix, K-Fold Cross Validation)
  o Choosing the right value for k
• Naive Bayes
  o Probabilistic model and assumptions (Independence of features)
  o Types of Naive Bayes classifiers (Gaussian, Multinomial, Bernoulli)
• Ensemble Methods
  o Bagging (Bootstrap Aggregating)
  o Random Forests and their advantages
  o Boosting (AdaBoost, Gradient Boosting, XGBoost)
  o Stacking and Blending for improved accuracy

6. Unsupervised Learning Algorithms
• Clustering Techniques
  o K-Means clustering (Initialization, Elbow Method)
  o Hierarchical Clustering (Agglomerative, Divisive)
  o DBSCAN (Density-Based Spatial Clustering)
  o Evaluation of clustering models (Silhouette Score, Davies-Bouldin Index)
• Dimensionality Reduction
  o Principal Component Analysis (PCA)
  o t-Distributed Stochastic Neighbor Embedding (t-SNE)
  o Linear Discriminant Analysis (LDA)
  o Feature extraction vs. feature selection

7. Deep Learning (Neural Networks)
• Introduction to neural networks and how they mimic the human brain
• Layers in neural networks (Input, Hidden, Output)
• Activation functions (Sigmoid, ReLU, Tanh, Softmax)
• Backpropagation and Gradient Descent
• Loss functions (Mean Squared Error, Cross-Entropy)
• Overfitting and Regularization (Dropout, L2 regularization)
• Introduction to Convolutional Neural Networks (CNN)
  o Layers (Convolutional, Pooling, Fully Connected)
  o Applications of CNN (Image classification, Object detection)
• Introduction to Recurrent Neural Networks (RNN)
  o Understanding time-series data and sequence models
  o Long Short-Term Memory (LSTM) networks
• Autoencoders and their applications (Anomaly detection, Data compression)

8. Model Evaluation and Optimization
• Cross-validation techniques (K-Fold, Stratified K-Fold, Leave-One-Out)
• Performance metrics for classification (Accuracy, Precision, Recall, F1-Score, ROC Curve, AUC)
• Performance metrics for regression (MSE, RMSE, MAE)
• Hyperparameter tuning (Grid Search, Random Search, Bayesian Optimization)
• Model selection criteria and trade-offs
• Bias-Variance trade-off and how to achieve optimal model performance
• Feature importance and model interpretability

9. Working with Big Data in Machine Learning
• Overview of Big Data frameworks (Hadoop, Spark)
• Distributed machine learning and parallel processing
• Handling large datasets using Dask and Spark MLlib
• Introduction to GPUs and their role in accelerating machine learning

10. Deployment and Model Monitoring
• Model deployment strategies (On-premises, Cloud-based solutions like AWS, Azure)
• Model versioning and rollback
• Continuous integration/continuous deployment (CI/CD) for ML models
• Using Docker for containerization of ML models
• Model monitoring and drift detection
• Updating models in production
• Ethics in AI and machine learning (Fairness, Transparency, Accountability)

11. Machine Learning in Real-World Applications
• Natural Language Processing (NLP)
  o Text Preprocessing (Tokenization, Lemmatization, Stemming)
  o Bag-of-Words and TF-IDF
  o Sentiment analysis, Named Entity Recognition (NER)
  o Word Embeddings (Word2Vec, GloVe)


Content preview

Certified Machine Learning (Python) Practice Exam
Question 1: What is the primary goal of machine learning?
Options:
A. To design explicit algorithms
B. To learn patterns from data
C. To compute statistics manually
D. To implement database queries
Answer: B
Explanation: Machine learning focuses on automatically learning patterns from data to make predictions
or decisions without explicit programming.

Question 2: Which of the following best describes supervised learning?
Options:
A. Learning from unlabeled data
B. Learning from labeled data
C. Learning without any feedback
D. Learning by trial and error
Answer: B
Explanation: Supervised learning uses labeled data to train models so that they can predict outcomes for
new, unseen data.
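
As an illustration of learning from labeled data, the following is a minimal sketch of a 1-nearest-neighbor classifier in plain Python. The function name `predict_1nn` and the toy measurements and labels are hypothetical, chosen only for this example.

```python
import math

def predict_1nn(train_X, train_y, x):
    """Return the label of the training point closest to x (Euclidean distance)."""
    distances = [math.dist(row, x) for row in train_X]
    nearest = distances.index(min(distances))
    return train_y[nearest]

# Hypothetical labeled training data: [height_cm, weight_kg] -> label
train_X = [[20.0, 4.0], [22.0, 5.0], [60.0, 25.0], [65.0, 30.0]]
train_y = ["cat", "cat", "dog", "dog"]

# Predict labels for new, unseen points
print(predict_1nn(train_X, train_y, [21.0, 4.5]))
print(predict_1nn(train_X, train_y, [62.0, 27.0]))
```

The labels in `train_y` are what make this supervised: the prediction for a new point is derived from known answers for similar points.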

Question 3: In unsupervised learning, what is the main goal of clustering?
Options:
A. To predict target values
B. To reduce data dimensionality
C. To group similar data points
D. To enhance image resolution
Answer: C
Explanation: Clustering aims to group similar data points together based on features and similarities
without prior labeling.
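
To make the idea of grouping without labels concrete, here is a small sketch of k-means (Lloyd's algorithm) on one-dimensional data. The function name `kmeans_1d`, the starting centroids, and the sample points are assumptions made for illustration; real use would typically rely on a library implementation.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Lloyd's algorithm on 1-D data; returns (centroids, assignments)."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = sum(members) / len(members)
    return centroids, assignments

# Hypothetical unlabeled data with two visible groups
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centroids, labels = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids, labels)
```

No target values are supplied; the cluster indices in `labels` come only from the distances between points.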

Question 4: Which Python library is most commonly used for numerical computations in machine
learning?
Options:
A. pandas
B. NumPy
C. matplotlib
D. TensorFlow
Answer: B
Explanation: NumPy provides support for large, multi-dimensional arrays and matrices, making it
essential for numerical computations.
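
A brief sketch of the kind of array operations NumPy is used for, assuming NumPy is installed. The array values and variable names are made up for illustration.

```python
import numpy as np

# Hypothetical feature matrix: 3 samples, 2 features
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
w = np.array([0.5, -1.0])  # hypothetical weight vector

col_means = X.mean(axis=0)  # per-column mean
centered = X - col_means    # broadcasting subtracts the means from every row
preds = X @ w               # matrix-vector product, one value per sample

print(col_means, centered.shape, preds)
```

Each operation runs over the whole array at once, without an explicit Python loop.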

Question 5: What does the term “feature scaling” refer to?
Options:
A. Increasing the number of features
B. Reducing the number of observations
C. Normalizing data values to a common scale
D. Encoding categorical variables
Answer: C
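Explanation: Feature scaling rescales feature values to a comparable range (for example, by standardization
or min-max normalization) so that features with large magnitudes do not dominate distance-based or
gradient-based models.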
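
Two common ways to scale a feature are standardization and min-max normalization. The sketch below uses only the standard library; the function names and the income values are hypothetical, and both functions assume the input is not constant (otherwise they would divide by zero).

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def min_max(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical feature with a large numeric range
incomes = [30_000.0, 45_000.0, 60_000.0, 90_000.0]
print(standardize(incomes))
print(min_max(incomes))
```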

Question 6: Which activation function is most commonly used in deep learning hidden layers?
Options:
A. Softmax
B. Sigmoid
C. ReLU
D. Linear
Answer: C
Explanation: ReLU (Rectified Linear Unit) is popular because it helps mitigate the vanishing gradient
problem while being computationally efficient.
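
The vanishing-gradient point can be seen by comparing derivatives. This is a minimal sketch with hypothetical helper names: the sigmoid derivative shrinks toward zero for large inputs, while the ReLU derivative stays at 1 for any positive input.

```python
import math

def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # Derivative of ReLU; the value at exactly 0 is set to 0 by convention here.
    return 1.0 if x > 0 else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

for x in (0.5, 5.0, 10.0):
    print(x, relu_grad(x), sigmoid_grad(x))
```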

Question 7: What is overfitting in machine learning models?
Options:
A. Underestimating the model’s complexity
B. When a model learns noise in the training data
C. Having too few training samples
D. When a model performs equally on training and test data
Answer: B
Explanation: Overfitting occurs when a model learns the training data—including its noise—instead of
the underlying pattern, resulting in poor generalization.

Question 8: Which technique is used for reducing overfitting in neural networks?
Options:
A. Increasing learning rate
B. Dropout
C. Using more layers
D. Removing bias
Answer: B
Explanation: Dropout randomly disables neurons during training, which helps prevent the network from
overfitting.
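
A rough sketch of dropout on a list of activations follows. It uses the "inverted dropout" variant, which rescales the surviving activations by 1 / (1 - rate) during training; that scaling is an assumption of this example and is not stated in the explanation above. The function name and parameters are hypothetical.

```python
import random

def dropout(activations, rate, training=True, rng=random):
    """Zero each activation with probability `rate` (0 <= rate < 1) during training.

    Survivors are divided by the keep probability so the expected value
    of each activation is unchanged (inverted dropout).
    """
    if not training or rate == 0.0:
        return list(activations)  # dropout is a no-op at inference time
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)  # seeded generator for a repeatable demonstration
print(dropout([1.0, 1.0, 1.0, 1.0, 1.0, 1.0], rate=0.5, rng=rng))
print(dropout([1.0, 2.0], rate=0.5, training=False))
```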

Question 9: In a confusion matrix, what does the term “True Positive” (TP) represent?
Options:
A. Incorrectly predicted positive cases
B. Correctly predicted negative cases
C. Correctly predicted positive cases
D. Incorrectly predicted negative cases
Answer: C
Explanation: True Positives are cases where the model correctly predicts the positive class.
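
The four cells of a binary confusion matrix can be counted directly. A minimal sketch, with a hypothetical function name and made-up labels:

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for a binary classification problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif t == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(confusion_counts(y_true, y_pred))
```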

Question 10: Which method is used for hyperparameter tuning by exhaustively searching over
specified parameter values?
Options:
A. Random Search
B. Grid Search
C. Bayesian Optimization
D. Cross-Validation
Answer: B
Explanation: Grid Search systematically tests all parameter combinations to find the best model
configuration.
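
The exhaustive search can be sketched with `itertools.product`. Here `toy_score` is a hypothetical stand-in for "train a model with these parameters and return a validation score"; the parameter names echo the decision-tree hyperparameters in the syllabus but are otherwise arbitrary.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_score(max_depth, min_samples_split):
    # Hypothetical score surface that peaks at max_depth=5, min_samples_split=4.
    return -((max_depth - 5) ** 2) - (min_samples_split - 4) ** 2

grid = {"max_depth": [3, 5, 7], "min_samples_split": [2, 4, 8]}
best_params, best_score = grid_search(grid, toy_score)
print(best_params, best_score)
```

With 3 values for each of 2 parameters, 9 combinations are evaluated; the cost grows multiplicatively with every added parameter.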

Question 11: What does the acronym “PCA” stand for in machine learning?
Options:
A. Principal Cluster Analysis
B. Partial Component Analysis
C. Principal Component Analysis
D. Probabilistic Clustering Algorithm
Answer: C
Explanation: PCA stands for Principal Component Analysis, a technique used for dimensionality
reduction.

Question 12: Which of the following is a common cost function for linear regression?
Options:
A. Cross-entropy loss
B. Mean Squared Error (MSE)
C. Hinge loss
D. Log loss
Answer: B
Explanation: Mean Squared Error (MSE) measures the average squared difference between predicted
and actual values in linear regression.
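
A short sketch of MSE and of using it as the cost in gradient descent. For brevity it fits a line through the origin, y = w * x, with no intercept; the function names, learning rate, step count, and data are assumptions for illustration.

```python
def mse(y_true, y_pred):
    """Mean of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def fit_slope(xs, ys, lr=0.05, steps=200):
    """Fit y = w * x by gradient descent on the MSE cost."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # d(MSE)/dw = (2/n) * sum((w*x - y) * x)
        grad = (2.0 / n) * sum((w * x - y) * x for x, y in zip(xs, ys))
        w -= lr * grad
    return w

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # generated by y = 2x
w = fit_slope(xs, ys)
print(w, mse(ys, [w * x for x in xs]))
```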

Question 13: What distinguishes reinforcement learning from other types of machine learning?
Options:
A. Use of labeled data
B. Learning based on rewards and penalties
C. Clustering data points
D. Dimensionality reduction
Answer: B
Explanation: Reinforcement learning involves an agent that learns to make decisions by receiving
rewards or penalties.

Question 14: Which library is primarily used for data manipulation and analysis in Python?
Options:
A. pandas
B. scikit-learn
C. TensorFlow
D. Matplotlib
Answer: A
Explanation: pandas is a powerful library used for data manipulation and analysis, offering data
structures like DataFrames.

Question 15: In the context of decision trees, what is “pruning”?
Options:
A. Adding more branches to the tree
B. Reducing the depth of the tree to prevent overfitting
C. Increasing the number of leaves
D. Scaling features
Answer: B
Explanation: Pruning is the process of reducing the size of a decision tree to improve its generalization
by removing branches that have little importance.

Question 16: What is the purpose of one-hot encoding in data preprocessing?
Options:
A. To scale numeric features
B. To convert categorical variables into binary vectors
C. To impute missing values
D. To reduce dimensionality
Answer: B
Explanation: One-hot encoding transforms categorical variables into a binary matrix representation,
which is more suitable for ML algorithms.
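
A minimal sketch of one-hot encoding in plain Python. The function name and the color values are hypothetical; sorting the categories is one arbitrary way to fix the column order.

```python
def one_hot(values):
    """Return (categories, rows), where each row is a binary indicator vector."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1  # exactly one position is set per value
        rows.append(row)
    return categories, rows

colors = ["red", "green", "blue", "green"]
categories, rows = one_hot(colors)
print(categories)
print(rows)
```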

Question 17: Which metric is most appropriate for evaluating a regression model?
Options:
A. Accuracy
B. Precision
C. Mean Absolute Error (MAE)
D. F1-Score
Answer: C
Explanation: Mean Absolute Error (MAE) is commonly used to evaluate regression models by measuring
the average absolute differences between predicted and actual values.
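
MAE can be computed in a few lines. The sketch below also computes RMSE on the same made-up data to show that a single large error raises RMSE more than MAE; the function names and values are hypothetical.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 12.0, 13.0, 26.0]  # the last prediction is far off
print(mae(y_true, y_pred), rmse(y_true, y_pred))
```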

Question 18: Which of the following is an ensemble learning method?
Options:
A. Logistic Regression
B. k-Nearest Neighbors
C. Random Forest
D. Support Vector Machine
Answer: C
Explanation: Random Forest is an ensemble learning method that combines multiple decision trees to
improve model accuracy and reduce overfitting.
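
The "combine many trees" idea can be sketched with bagging and majority voting. This is a simplified stand-in, not a full Random Forest: each "tree" is a one-threshold decision stump on a single feature, and there is no random feature selection. All names and data are hypothetical.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Pick the threshold with the fewest training errors; predict 1 when x > threshold."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((1 if x > t else 0) != y for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def fit_bagged_stumps(xs, ys, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds

def predict(thresholds, x):
    """Majority vote across all stumps."""
    votes = Counter(1 if x > t else 0 for t in thresholds)
    return votes.most_common(1)[0][0]

xs = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_bagged_stumps(xs, ys)
print(predict(model, 0.5), predict(model, 9.5))
```

Because every stump sees a different bootstrap sample, the individual thresholds vary, and the vote averages out that variation.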

Question 19: In support vector machines, what does the “kernel trick” enable?
Options:

Seller: nikhiljain22 EXAMS