Data science > Machine Learning Project 2023 & study guide with complete solution
In [1]:
import pandas as pd
import numpy as np
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import zscore
import warnings
warnings.filterwarnings("ignore")
from sklearn.linear_model import LinearRegression
from sklearn.cluster import KMeans
from sklearn.metrics import mean_squared_error
from statsmodels.stats.outliers_influence import variance_inflation_factor
import math
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
from sklearn.metrics import plot_confusion_matrix
from sklearn.metrics import roc_auc_score,roc_curve,classification_report,confusion_matrix #plot_confusion_matrix

1.1 Read the dataset. Do the descriptive statistics and do the null value condition check. Write an inference on it.

In [2]:
# Load the data, using the first column of the file as the row index
df = pd.read_csv("Election_D", index_col=0)

In [3]:
# First 10 rows
df.head(10)

Out[3]:
    vote    age  nal  hold  Blair  Hague  Europe  politica
1   Labour   43    3     3      4      1       2
2   Labour   36    4     4      4      4       5
3   Labour   35    4     4      5      2       3
4   Labour   24    4     2      2      1       4
5   Labour   41    2     2      1      1       6
6   Labour   47    3     4      4      4       4
7   Labour   57    2     2      4      4      11
8   Labour   77    3     4      4      1       1
9   Labour   39    3     3      4      4      11
10  Labour   70    3     2      5      1      11

In [4]:
# Last 10 rows
df.tail(10)

Out[4]:
      vote          age  nal  hold  Blair  Hague  Europe
1516  Conservative   82    2     2      2      1      11
1517  Labour         30    3     4      4      2       4
1518  Labour         76    4     3      2      2      11
1519  Labour         50    3     4      4      2       5
1520  Conservative   35    3     4      4      2       8
1521  Conservative   67    5     3      2      4      11
1522  Conservative   73    2     2      4      4       8
1523  Labour         37    3     3      5      4       2
1524  Conservative   61    3     3      1      4      11
1525  Conservative   74    2     3      2      4      11

In [5]:
# Descriptive statistics for both categorical and numeric columns
df.describe(include='all')

Out[5]:
        vote    age    nal    hold   Blair
count   .       1525.  1525.  1525.  152
unique  2       NaN    NaN    NaN    NaN
top     Labour  NaN    NaN    NaN    NaN
freq    1063    NaN    NaN    NaN    NaN
mean    NaN     54.    3.     3.     3.
std     NaN     15.    0.     0.     1.
min     NaN     24.    1.     1.     1.
25%     NaN     41.    3.     3.     2.
50%     NaN     53.    3.     3.     4.
75%     NaN     67.    4.     4.     4.
max     NaN     93.    5.     5.     5.

In [6]:
# Number of missing values per column
df.isnull().sum()

Out[6]:
vote      0
age       0
nal       0
hold      0
Blair     0
Hague     0
Europe    0
edge      0
gender    0
dtype: int64

In [7]:
# Column names, non-null counts and dtypes
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1525 entries, 1 to 1525
Data columns (total 9 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   vote    1525 non-null   object
 1   age     1525 non-null   int64
 2   nal     1525 non-null   int64
 3   hold    1525 non-null   int64
 4   Blair   1525 non-null   int64
 5   Hague

In [8]:
print("no. of rows: ", df.shape[0], "\n" "no. of columns: ", df.shape[1])

In [9]:
# Flag rows that exactly repeat an earlier row, count them, then show them
dups = df.duplicated()
print('Number of duplicate rows = %d' % (dups.sum()))
df[dups]
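The null, shape and duplicate checks used in this section can be tried out on a small frame without the election file. The following is only a minimal sketch under stated assumptions: the `toy` frame and its values are hypothetical and made up for illustration (the column names merely echo the notebook output), and it is not the notebook's own data or result.

```python
import pandas as pd

# Hypothetical toy data: values are invented for illustration only.
# Rows 1 and 4 are deliberately identical so the duplicate check finds something.
toy = pd.DataFrame(
    {
        "vote": ["Labour", "Labour", "Conservative", "Labour"],
        "age": [43, 36, 82, 43],
        "Blair": [4, 4, 2, 4],
    },
    index=[1, 2, 3, 4],
)

# Null check: number of missing values in each column.
null_counts = toy.isnull().sum()

# Shape check: (number of rows, number of columns).
n_rows, n_cols = toy.shape

# Duplicate check: duplicated() compares column values (not the index) and
# marks every row that repeats an earlier row; the first occurrence is not marked.
dups = toy.duplicated()
n_dups = int(dups.sum())

print(null_counts)
print("no. of rows: ", n_rows, "\n" "no. of columns: ", n_cols)
print("Number of duplicate rows = %d" % n_dups)
print(toy[dups])
```

Because `duplicated()` ignores the index, two respondents with identical answers are reported as duplicates even though they have different row labels. Whether such rows are true repeats or distinct people is a judgment the analysis has to make separately.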
Written for
- Institution: Great Lakes Christian College
- Course: DATA SCIEN
Document information
- Uploaded on: March 17, 2023
- Number of pages: 72
- Written in: 2022/2023
- Type: Other
- Person: Unknown
Subjects
- project
- data science > machine learning project