Questions with Complete Solutions
What is the purpose of exploratory data analysis (EDA)?
✔✔ To visually and statistically summarize the main characteristics of a dataset
What is a correlation in data analysis?
✔✔ A statistical relationship between two variables, indicating how they change together
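As an illustration, the sketch below computes the Pearson correlation coefficient, one common way to quantify a linear relationship between two variables. The helper name pearson_r and the sample numbers are made up for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: hours studied and exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]
r = pearson_r(hours, scores)
print(f"r = {r:.3f}")  # a value close to +1 means the two rise together
```

A value near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates a weak one. Correlation alone does not show that one variable causes the other.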
How do you handle missing data in a dataset?
✔✔ By imputing missing values (for example, with the mean or median), removing rows or columns
with missing values, or using other techniques, depending on how much data is missing and why
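A minimal sketch of two of these options, assuming missing values are stored as None. The ages list is invented for illustration, and median imputation is only one of several possible choices.

```python
from statistics import median

# Hypothetical column with two missing entries.
ages = [34, None, 29, 41, None, 38]

# Option 1: drop the missing entries.
dropped = [a for a in ages if a is not None]

# Option 2: impute with the median of the observed values.
fill_value = median(dropped)
imputed = [fill_value if a is None else a for a in ages]

print(dropped)
print(imputed)
```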
What is a histogram used for in data analysis?
✔✔ To visualize the distribution of a single variable
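The idea behind a histogram can be sketched without a plotting library: group the values into equal-width bins and count how many fall into each. The values and the bin width of 5 below are arbitrary choices for this example.

```python
from collections import Counter

# Hypothetical measurements of a single variable.
values = [2, 3, 5, 7, 8, 11, 12, 13, 14, 18]
bin_width = 5

# Map each value to the lower edge of its bin, then count per bin.
counts = Counter((v // bin_width) * bin_width for v in values)

# Print a simple text histogram, one row per bin.
for start in sorted(counts):
    end = start + bin_width - 1
    print(f"{start:>2}-{end:<2} {'#' * counts[start]}")
```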
What does the term "outlier" refer to in data analysis?
✔✔ A data point that significantly differs from other observations in the dataset
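One common convention for flagging outliers, not the only one, is the 1.5 x IQR rule: a point is flagged if it lies more than 1.5 interquartile ranges below the first quartile or above the third. The sketch below applies it to made-up readings.

```python
from statistics import quantiles

# Hypothetical readings; 102 is far from the rest.
readings = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]

# Quartiles of the data (the "inclusive" method treats the data as the full population).
q1, _median, q3 = quantiles(readings, n=4, method="inclusive")
iqr = q3 - q1

# Fences under the 1.5 x IQR convention (an assumption of this sketch).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in readings if x < lower_fence or x > upper_fence]
print(outliers)
```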
What is the difference between qualitative and quantitative data?
✔✔ Qualitative data describes categories or attributes, while quantitative data represents
numerical values
What is the significance of using statistical tests in data analysis?
✔✔ To assess whether an observed pattern reflects a real effect or could plausibly have arisen by
chance alone
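As one concrete example, the sketch below runs an exact permutation test on two small, made-up groups. It asks how often a random relabeling of the pooled data produces a difference in means at least as large as the one observed. The function name and data are hypothetical, and other tests (such as a t-test) may be more appropriate in practice.

```python
from itertools import combinations

def permutation_p_value(a, b):
    """Two-sided exact permutation test on the difference in means."""
    pooled = a + b
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    extreme = 0
    total = 0
    # Enumerate every way of assigning len(a) of the pooled values to group A.
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in range(len(pooled)) if i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
        if diff >= observed:
            extreme += 1
        total += 1
    return extreme / total

# Hypothetical measurements for two groups.
treatment = [12, 14, 15, 16]
control = [8, 9, 10, 11]
p = permutation_p_value(treatment, control)
print(f"p = {p:.4f}")
```

A small p-value means that a difference this large would rarely appear if group labels were assigned at random.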
How do you calculate the mean of a dataset?
✔✔ By summing all the data values and dividing by the total number of data points
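The calculation is short enough to write out directly. The values below are arbitrary.

```python
# Hypothetical dataset.
values = [4, 8, 15, 16, 23, 42]

# Mean = sum of the values divided by the number of values.
mean = sum(values) / len(values)
print(mean)  # 108 / 6 = 18.0
```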
What is a box plot used for in data analysis?
✔✔ To visualize the distribution of data, showing the median, quartiles, and potential outliers
What is a pivot table in data analysis?
✔✔ A tool that allows you to summarize and aggregate data based on specific criteria
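Spreadsheet and dataframe tools provide pivot tables directly. The sketch below shows the underlying idea in plain Python, summing a made-up amount column by region (rows) and product (columns). All field names and records are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales records.
sales = [
    {"region": "North", "product": "A", "amount": 100},
    {"region": "North", "product": "B", "amount": 150},
    {"region": "South", "product": "A", "amount": 200},
    {"region": "North", "product": "A", "amount": 50},
    {"region": "South", "product": "B", "amount": 75},
]

# pivot[region][product] accumulates the total amount for that pair.
pivot = defaultdict(lambda: defaultdict(int))
for row in sales:
    pivot[row["region"]][row["product"]] += row["amount"]

for region in sorted(pivot):
    print(region, dict(pivot[region]))
```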
What does the term "data normalization" mean?
✔✔ The process of adjusting data to a common scale without distorting differences in the ranges
of values
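Min-max scaling is one common normalization method, chosen here only as an example. It maps the smallest value to 0 and the largest to 1 while keeping the relative spacing between values. The function name and the income figures are made up.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to preserve, so return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical incomes on a large scale.
incomes = [20000, 35000, 50000, 80000]
scaled = min_max_scale(incomes)
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```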
How does regression analysis help in data analysis?
✔✔ It is used to model the relationship between a dependent variable and one or more
independent variables
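For the simplest case, one independent variable and a straight-line model, the ordinary least squares fit can be sketched as follows. The function name fit_line and the data are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)

# Use the fitted model to predict y for a new x.
predicted = slope * 6 + intercept
```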
What is the purpose of data visualization?
✔✔ To communicate insights from the data through graphical representation, making it easier to
understand and analyze
What does "data distribution" refer to in data analysis?
✔✔ The way in which data points are spread across different values or ranges
How do you measure the central tendency of a dataset?
✔✔ By calculating the mean, median, or mode, the three common measures of central tendency
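The three measures can disagree noticeably on skewed data. The sketch below uses Python's statistics module on made-up salaries that include one very large value.

```python
from statistics import mean, median, mode

# Hypothetical salaries (in thousands); 150 pulls the mean upward.
salaries = [30, 32, 32, 35, 38, 40, 150]

avg = mean(salaries)       # sum divided by count
middle = median(salaries)  # middle value of the sorted data
common = mode(salaries)    # most frequent value

print(avg, middle, common)  # 51 35 32
```

Here the median and mode stay near the typical salaries while the mean is pulled toward the extreme value, which is why the median is often preferred for skewed data.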
What is the difference between supervised and unsupervised learning in data analysis?