Certified Machine Learning (Python) Practice Exam

Pages: 55
Grade: A+
Uploaded: 26-03-2025
Written in: 2024/2025

1. Introduction to Machine Learning
• Definition and history of machine learning
• Types of machine learning (Supervised, Unsupervised, Semi-supervised, Reinforcement)
• Key applications of machine learning in different industries
• Overview of the machine learning lifecycle
• Difference between AI, ML, and Deep Learning
• Understanding the concepts of training, testing, and validation

2. Python Basics for Machine Learning
• Python programming essentials for data analysis and machine learning
• Python libraries (NumPy, pandas, scikit-learn, TensorFlow, Keras, Matplotlib)
• Working with data structures (lists, tuples, dictionaries, sets) in Python
• Control flow (if, else, for, while loops) and functions in Python
• Writing Python scripts and functions for ML workflows
• Understanding Python's object-oriented features
• Data manipulation using pandas: DataFrames, Series, indexing, and slicing

3. Data Preprocessing and Cleaning
• Data collection and loading data from different formats (CSV, Excel, SQL, etc.)
• Handling missing data (imputation, removal, interpolation)
• Encoding categorical variables (one-hot encoding, label encoding)
• Feature scaling techniques (standardization, normalization)
• Data transformation and feature engineering
• Handling imbalanced data (SMOTE, undersampling, oversampling)
• Feature selection techniques (correlation, Recursive Feature Elimination)
• Handling outliers and data smoothing

4. Exploratory Data Analysis (EDA)
• Descriptive statistics (mean, median, variance, skewness, kurtosis)
• Visualizing data distributions (histograms, box plots, density plots)
• Bivariate and multivariate analysis
• Data visualization techniques (Matplotlib, Seaborn, Plotly)
• Identifying patterns and relationships between features
• Correlation matrices and heatmaps
• Using scatter plots and pair plots for analysis

5. Supervised Learning Algorithms
• Linear Regression
  o Concept of linear regression
  o Cost function, gradient descent
  o Model evaluation (MSE, RMSE, MAE, R-squared)
• Logistic Regression
  o Understanding the logistic function and its application
  o Cost function for logistic regression
  o Model evaluation (Accuracy, Precision, Recall, F1-Score, ROC Curve)
• Decision Trees
  o Splitting criteria (Gini impurity, entropy, information gain)
  o Overfitting and pruning techniques
  o Hyperparameter tuning (max depth, min samples split)
• Support Vector Machines (SVM)
  o Hyperplane, margin, and kernels (linear, polynomial, RBF)
  o SVM for classification and regression
  o Regularization in SVM
• k-Nearest Neighbors (k-NN)
  o Distance metrics (Euclidean, Manhattan)
  o Model evaluation (confusion matrix, k-fold cross-validation)
  o Choosing the right value for k
• Naive Bayes
  o Probabilistic model and assumptions (independence of features)
  o Types of Naive Bayes classifiers (Gaussian, Multinomial, Bernoulli)
• Ensemble Methods
  o Bagging (bootstrap aggregating)
  o Random Forests and their advantages
  o Boosting (AdaBoost, Gradient Boosting, XGBoost)
  o Stacking and blending for improved accuracy

6. Unsupervised Learning Algorithms
• Clustering Techniques
  o K-Means clustering (initialization, elbow method)
  o Hierarchical clustering (agglomerative, divisive)
  o DBSCAN (density-based spatial clustering)
  o Evaluation of clustering models (silhouette score, Davies-Bouldin index)
• Dimensionality Reduction
  o Principal Component Analysis (PCA)
  o t-Distributed Stochastic Neighbor Embedding (t-SNE)
  o Linear Discriminant Analysis (LDA)
  o Feature extraction vs. feature selection

7. Deep Learning (Neural Networks)
• Introduction to neural networks and how they mimic the human brain
• Layers in neural networks (input, hidden, output)
• Activation functions (sigmoid, ReLU, tanh, softmax)
• Backpropagation and gradient descent
• Loss functions (mean squared error, cross-entropy)
• Overfitting and regularization (dropout, L2 regularization)
• Introduction to Convolutional Neural Networks (CNNs)
  o Layers (convolutional, pooling, fully connected)
  o Applications of CNNs (image classification, object detection)
• Introduction to Recurrent Neural Networks (RNNs)
  o Understanding time-series data and sequence models
  o Long Short-Term Memory (LSTM) networks
• Autoencoders and their applications (anomaly detection, data compression)

8. Model Evaluation and Optimization
• Cross-validation techniques (k-fold, stratified k-fold, leave-one-out)
• Performance metrics for classification (Accuracy, Precision, Recall, F1-Score, ROC Curve, AUC)
• Performance metrics for regression (MSE, RMSE, MAE)
• Hyperparameter tuning (grid search, random search, Bayesian optimization)
• Model selection criteria and trade-offs
• Bias-variance trade-off and how to achieve optimal model performance
• Feature importance and model interpretability

9. Working with Big Data in Machine Learning
• Overview of Big Data frameworks (Hadoop, Spark)
• Distributed machine learning and parallel processing
• Handling large datasets using Dask and Spark MLlib
• Introduction to GPUs and their role in accelerating machine learning

10. Deployment and Model Monitoring
• Model deployment strategies (on-premises, cloud-based solutions such as AWS and Azure)
• Model versioning and rollback
• Continuous integration/continuous deployment (CI/CD) for ML models
• Using Docker to containerize ML models
• Model monitoring and drift detection
• Updating models in production
• Ethics in AI and machine learning (fairness, transparency, accountability)

11. Machine Learning in Real-World Applications
• Natural Language Processing (NLP)
  o Text preprocessing (tokenization, lemmatization, stemming)
  o Bag-of-words and TF-IDF
  o Sentiment analysis, Named Entity Recognition (NER)
  o Word embeddings (Word2Vec, GloVe)

Institution: Computers
Degree: Computers

Certified Machine Learning (Python) Practice Exam
Question 1: What is the primary goal of machine learning?
Options:
A. To design explicit algorithms
B. To learn patterns from data
C. To compute statistics manually
D. To implement database queries
Answer: B
Explanation: Machine learning focuses on automatically learning patterns from data to make predictions
or decisions without explicit programming.

Question 2: Which of the following best describes supervised learning?
Options:
A. Learning from unlabeled data
B. Learning from labeled data
C. Learning without any feedback
D. Learning by trial and error
Answer: B
Explanation: Supervised learning uses labeled data to train models so that they can predict outcomes for
new, unseen data.

Question 3: In unsupervised learning, what is the main goal of clustering?
Options:
A. To predict target values
B. To reduce data dimensionality
C. To group similar data points
D. To enhance image resolution
Answer: C
Explanation: Clustering aims to group similar data points together based on features and similarities
without prior labeling.

Question 4: Which Python library is most commonly used for numerical computations in machine
learning?
Options:
A. pandas
B. NumPy
C. matplotlib
D. TensorFlow
Answer: B
Explanation: NumPy provides support for large, multi-dimensional arrays and matrices, making it
essential for numerical computations.
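The explanation above can be made concrete with a minimal sketch: NumPy operates on whole arrays at once, with broadcasting filling in the loops that plain Python would need.

```python
import numpy as np

# Element-wise arithmetic on whole arrays, no explicit Python loops
X = np.array([[1.0, 2.0], [3.0, 4.0]])
col_means = X.mean(axis=0)      # mean of each column
centered = X - col_means        # broadcasting subtracts the means row-wise
print(centered)
```

Centering a feature matrix like this is a common first step before many of the algorithms covered later (e.g. PCA).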

Question 5: What does the term “feature scaling” refer to?
Options:
A. Increasing the number of features
B. Reducing the number of observations
C. Normalizing data values to a common scale
D. Encoding categorical variables
Answer: C
Explanation: Feature scaling normalizes data values so that features contribute equally to the model’s
performance.
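A minimal sketch of the two scaling techniques from the syllabus, written directly in NumPy rather than with a library transformer:

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0])

# Standardization: shift to zero mean, scale to unit variance
standardized = (x - x.mean()) / x.std()

# Min-max normalization: rescale values into the range [0, 1]
normalized = (x - x.min()) / (x.max() - x.min())

print(standardized.mean(), normalized.min(), normalized.max())
```

Distance-based models such as k-NN and SVM are especially sensitive to unscaled features, which is why this step matters.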

Question 6: Which activation function is most commonly used in deep learning hidden layers?
Options:
A. Softmax
B. Sigmoid
C. ReLU
D. Linear
Answer: C
Explanation: ReLU (Rectified Linear Unit) is popular because it helps mitigate the vanishing gradient
problem while being computationally efficient.
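ReLU is simple enough to write in one line, which is part of why it is cheap to compute:

```python
import numpy as np

def relu(z):
    """ReLU: passes positive values through unchanged, zeroes out negatives."""
    return np.maximum(0.0, z)

z = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(z))   # negative inputs become 0
```

Because the gradient is exactly 1 for positive inputs, ReLU avoids the shrinking gradients that sigmoid and tanh produce in deep stacks of layers.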

Question 7: What is overfitting in machine learning models?
Options:
A. Underestimating the model’s complexity
B. When a model learns noise in the training data
C. Having too few training samples
D. When a model performs equally on training and test data
Answer: B
Explanation: Overfitting occurs when a model learns the training data—including its noise—instead of
the underlying pattern, resulting in poor generalization.

Question 8: Which technique is used for reducing overfitting in neural networks?
Options:
A. Increasing learning rate
B. Dropout
C. Using more layers
D. Removing bias
Answer: B
Explanation: Dropout randomly disables neurons during training, which helps prevent the network from
overfitting.
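A sketch of how "inverted" dropout works at training time, using raw NumPy rather than a framework layer (the layer sizes and the 0.5 rate are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, rate):
    """Inverted dropout: zero out a random fraction of units and rescale
    the survivors so the expected activation is unchanged at inference."""
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

a = np.ones(1000)
dropped = dropout(a, rate=0.5)
print((dropped == 0).mean())  # roughly half the units are disabled
```

Frameworks such as Keras apply this only during training; at inference the layer is a no-op.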

Question 9: In a confusion matrix, what does the term “True Positive” (TP) represent?
Options:
A. Incorrectly predicted positive cases
B. Correctly predicted negative cases
C. Correctly predicted positive cases
D. Incorrectly predicted negative cases
Answer: C
Explanation: True Positives are cases where the model correctly predicts the positive class.
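The four confusion-matrix cells can be counted by hand for a binary classifier, which makes the definitions above concrete (the labels here are made-up example data):

```python
# Count confusion-matrix cells for a binary classifier
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # correct positives
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # correct negatives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false alarms
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # missed positives
print(tp, tn, fp, fn)
```

Precision (tp / (tp + fp)) and recall (tp / (tp + fn)) from the earlier questions are built directly from these counts.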

Question 10: Which method is used for hyperparameter tuning by exhaustively searching over
specified parameter values?
Options:
A. Random Search
B. Grid Search
C. Bayesian Optimization
D. Cross-Validation
Answer: B
Explanation: Grid Search systematically tests all parameter combinations to find the best model
configuration.
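The exhaustive search can be sketched in a few lines of plain Python; here `score()` is a hypothetical stand-in for cross-validated model performance, not a real library call:

```python
from itertools import product

# A toy parameter grid, like those passed to grid-search utilities
grid = {"max_depth": [2, 4, 8], "min_samples_split": [2, 10]}

def score(max_depth, min_samples_split):
    # Hypothetical objective: pretend depth 4 with small splits is best
    return -abs(max_depth - 4) - 0.1 * min_samples_split

# Grid search: try every combination, keep the best-scoring one
best = max(
    (dict(zip(grid, combo)) for combo in product(*grid.values())),
    key=lambda params: score(**params),
)
print(best)
```

Random Search samples from the same grid instead of enumerating it, trading exhaustiveness for speed.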

Question 11: What does the acronym “PCA” stand for in machine learning?
Options:
A. Principal Cluster Analysis
B. Partial Component Analysis
C. Principal Component Analysis
D. Probabilistic Clustering Algorithm
Answer: C
Explanation: PCA stands for Principal Component Analysis, a technique used for dimensionality
reduction.
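A from-scratch sketch of what PCA does under the hood: center the data, factor it with SVD, and project onto the directions of largest variance. The random data here is synthetic, constructed so one column is redundant.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
X[:, 2] = X[:, 0] + X[:, 1]          # third feature is a linear combination

Xc = X - X.mean(axis=0)              # center each feature
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X2 = Xc @ Vt[:2].T                   # keep the top 2 principal components
print(X2.shape)
```

Because the third column carries no new information, the third singular value is (numerically) zero, so two components capture all the variance.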

Question 12: Which of the following is a common cost function for linear regression?
Options:
A. Cross-entropy loss
B. Mean Squared Error (MSE)
C. Hinge loss
D. Log loss
Answer: B
Explanation: Mean Squared Error (MSE) measures the average squared difference between predicted
and actual values in linear regression.
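The MSE definition above is a one-liner in NumPy (the values are made-up example data):

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.0, 8.0])

mse = np.mean((y_true - y_pred) ** 2)   # average squared residual
print(mse)
```

Squaring penalizes large errors disproportionately, which is why MSE is sensitive to outliers compared with MAE.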

Question 13: What distinguishes reinforcement learning from other types of machine learning?
Options:
A. Use of labeled data
B. Learning based on rewards and penalties
C. Clustering data points
D. Dimensionality reduction
Answer: B
Explanation: Reinforcement learning involves an agent that learns to make decisions by receiving
rewards or penalties.

Question 14: Which library is primarily used for data manipulation and analysis in Python?
Options:
A. pandas
B. scikit-learn
C. TensorFlow
D. Matplotlib
Answer: A
Explanation: pandas is a powerful library used for data manipulation and analysis, offering data
structures like DataFrames.
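A minimal DataFrame example showing the labeled-table model and boolean-mask selection the explanation refers to (the column names are made up for illustration):

```python
import pandas as pd

# A DataFrame is a labeled 2-D table; each column behaves like a named Series
df = pd.DataFrame({"height": [1.6, 1.8, 1.7], "label": ["a", "b", "a"]})

tall = df[df["height"] > 1.65]       # boolean-mask row selection
print(tall["label"].tolist())
```

This filter-by-condition idiom is the backbone of most pandas preprocessing pipelines.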

Question 15: In the context of decision trees, what is “pruning”?
Options:
A. Adding more branches to the tree
B. Reducing the depth of the tree to prevent overfitting
C. Increasing the number of leaves
D. Scaling features
Answer: B
Explanation: Pruning is the process of reducing the size of a decision tree to improve its generalization
by removing branches that have little importance.

Question 16: What is the purpose of one-hot encoding in data preprocessing?
Options:
A. To scale numeric features
B. To convert categorical variables into binary vectors
C. To impute missing values
D. To reduce dimensionality
Answer: B
Explanation: One-hot encoding transforms categorical variables into a binary matrix representation,
which is more suitable for ML algorithms.
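One common way to do this in Python is `pd.get_dummies`, which expands each category into its own binary column:

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "green", "red"]})
encoded = pd.get_dummies(df, columns=["color"])  # one binary column per category
print(encoded.columns.tolist())
```

scikit-learn's `OneHotEncoder` offers the same transformation with a fit/transform interface that is easier to reuse on new data.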

Question 17: Which metric is most appropriate for evaluating a regression model?
Options:
A. Accuracy
B. Precision
C. Mean Absolute Error (MAE)
D. F1-Score
Answer: C
Explanation: Mean Absolute Error (MAE) is commonly used to evaluate regression models by measuring
the average absolute differences between predicted and actual values.
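MAE is just as short to compute directly (again with made-up example values):

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.0, 9.0])

mae = np.mean(np.abs(y_true - y_pred))  # average absolute residual
print(mae)
```

Unlike MSE, every error contributes linearly, so a single large miss cannot dominate the metric.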

Question 18: Which of the following is an ensemble learning method?
Options:
A. Logistic Regression
B. k-Nearest Neighbors
C. Random Forest
D. Support Vector Machine
Answer: C
Explanation: Random Forest is an ensemble learning method that combines multiple decision trees to
improve model accuracy and reduce overfitting.
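The core idea of the combination step can be sketched without fitting any real models: each tree votes, and the majority wins. The three "trees" below are hypothetical hard-coded predictions, not fitted estimators.

```python
from collections import Counter

tree_predictions = [
    [1, 0, 1, 1],   # predictions from tree 1
    [1, 1, 1, 0],   # predictions from tree 2
    [0, 0, 1, 1],   # predictions from tree 3
]

# Majority vote per sample across all trees
ensemble = [Counter(votes).most_common(1)[0][0]
            for votes in zip(*tree_predictions)]
print(ensemble)
```

In a real Random Forest each tree is also trained on a bootstrap sample with a random subset of features, which decorrelates the trees and makes the vote more robust.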

Question 19: In support vector machines, what does the “kernel trick” enable?
Options:
