Exam (elaborations)

30+ data science practice questions

  • May 23, 2023
  • 8
  • 2022/2023
  • Only questions

1. What is the difference between Cluster and Systematic Sampling?
Answer: Cluster sampling is a technique used when the target population is spread across a
wide area and simple random sampling is difficult to apply. A cluster sample is a probability
sample in which each sampling unit is a collection, or cluster, of elements. Systematic
sampling is a statistical technique in which elements are selected from an ordered sampling
frame at a fixed interval after a random start. In circular systematic sampling, the list is
treated as a loop, so once you reach the end of the list, selection continues from the top
again. The best-known example of systematic sampling is the equal-probability method.
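
The two techniques can be sketched side by side. This is a minimal illustration, not a reference implementation: the frame of 100 numbered units, the `city_*` cluster names, and both function names are hypothetical, chosen only for the example.

```python
import random

def circular_systematic_sample(frame, n, seed=None):
    """Pick n items from an ordered frame at a fixed interval, wrapping around."""
    rng = random.Random(seed)
    step = len(frame) // n               # sampling interval k
    start = rng.randrange(len(frame))    # random start anywhere in the list
    return [frame[(start + i * step) % len(frame)] for i in range(n)]

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every element inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical data: 100 units, also grouped into 10 clusters of 10 units each.
frame = list(range(100))
clusters = {f"city_{c}": list(range(c * 10, c * 10 + 10)) for c in range(10)}

systematic = circular_systematic_sample(frame, n=10, seed=1)
clustered = cluster_sample(clusters, n_clusters=2, seed=1)
print(systematic)
print(clustered)
```

Systematic sampling spreads the sample evenly over the whole frame, while cluster sampling concentrates it in a few groups, which is what makes it cheaper when the population is geographically dispersed.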

2. What does P-value signify about the statistical data?
Answer: The p-value is used to determine the significance of results after a hypothesis test
in statistics. It is the probability, assuming the null hypothesis is true, of observing a
result at least as extreme as the one obtained, so it always lies between 0 and 1. With the
conventional significance level of 0.05:

• P-value > 0.05 denotes weak evidence against the null hypothesis, which means the null
hypothesis cannot be rejected.

• P-value < 0.05 denotes strong evidence against the null hypothesis, which means the null
hypothesis can be rejected.

• P-value = 0.05 is the marginal value, indicating it is possible to go either way.
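
As an illustration of how a p-value is computed and compared with the 0.05 threshold, here is a small sketch of an exact two-sided binomial test. The scenario (60 heads in 100 coin flips, null hypothesis of a fair coin) and the function name are assumptions made for the example.

```python
from math import comb

def binomial_two_sided_p(k, n, p0=0.5):
    """Exact two-sided p-value: total probability of outcomes no more likely than k."""
    pmf = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small tolerance so that equally likely outcomes are not lost to rounding.
    return sum(p for p in pmf if p <= observed * (1 + 1e-9))

# Hypothetical experiment: 60 heads out of 100 flips, testing "the coin is fair".
p_value = binomial_two_sided_p(60, 100)
alpha = 0.05
decision = "reject" if p_value < alpha else "fail to reject"
print(f"p-value = {p_value:.4f}, decision: {decision} the null hypothesis")
```

Here the p-value lands slightly above 0.05, so the null hypothesis is not rejected even though 60 heads looks lopsided.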

3. A test has a true positive rate of 100% and a false-positive rate of 5%. There is a
population with a 1/1000 rate of having the condition the test identifies. Considering a
positive test, what is the probability of having that condition?
Answer: Suppose you are being tested for a disease. If you have the illness, the test will
always say you have it. However, if you don’t have the illness, 5% of the time the test will
say you have it and 95% of the time it will correctly report that you don’t. Thus there is a
5% error rate among people who do not have the illness.

Out of 1000 people, the 1 person who has the disease will get a true positive result.

Out of the remaining 999 people, 5% will get a false positive result.

That is close to 50 people (0.05 × 999 ≈ 49.95) with a false positive result for the disease.

This means that out of 1000 people, about 51 will test positive for the disease even though
only one person has the illness. There is only about a 2% probability (1/51 ≈ 0.0196) that
you have the disease even if your report says that you do.
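
The same result follows directly from Bayes' theorem. The sketch below uses only the three numbers given in the question; the variable names are chosen for the example.

```python
# Numbers from the question.
prevalence = 1 / 1000        # P(condition)
true_positive_rate = 1.0     # P(positive | condition)
false_positive_rate = 0.05   # P(positive | no condition)

# Total probability of a positive test, over both groups.
p_positive = (true_positive_rate * prevalence
              + false_positive_rate * (1 - prevalence))

# Bayes' theorem: P(condition | positive).
p_condition_given_positive = true_positive_rate * prevalence / p_positive
print(f"P(condition | positive) = {p_condition_given_positive:.4f}")
```

The posterior stays low because the condition is rare: the false positives among the 999 healthy people far outnumber the single true positive.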

4. What is the goal of A/B Testing?
Answer: A/B testing is statistical hypothesis testing for a randomized experiment with two
variants, A and B.

The goal of A/B testing is to identify which changes to a web page maximize or increase
the outcome of interest.

An example of this could be comparing the click-through rate of two versions of a banner ad.
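
One common way to analyze such an experiment is a two-proportion z-test on the click-through rates. The sketch below is one possible approach, not the only one; the click and view counts are made up for illustration, and the function name is hypothetical.

```python
from math import erfc, sqrt

def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return the z statistic and two-sided p-value for rate_b versus rate_a."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that both variants perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    p_value = erfc(abs(z) / sqrt(2))   # two-sided tail of the standard normal
    return z, p_value

# Hypothetical banner-ad data: variant A gets 200/5000 clicks, variant B 260/5000.
z, p_value = two_proportion_z_test(200, 5000, 260, 5000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

With these made-up counts the p-value falls below 0.05, so variant B's higher click-through rate would be treated as statistically significant.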

5. Python or R – Which one would you prefer for text analytics?
Answer: Python is usually preferred, for the following reasons:

• Python has the Pandas library, which provides easy-to-use data structures and
high-performance data analysis tools.
• R is better suited to machine learning than to text analysis alone.
• Python often performs faster for text analytics tasks.

6. What is Systematic Sampling?
Answer: Systematic sampling is a statistical technique in which elements are selected from
an ordered sampling frame at a fixed interval after a random start. In circular systematic
sampling, the list is treated as a loop, so once you reach the end of the list, selection
continues from the top again. The best-known example of systematic sampling is the
equal-probability method. (E learning portal)

7. Which technique is used to predict categorical responses?
Answer: Classification techniques are used to predict categorical responses. They are widely
used in data mining for classifying data sets.
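
As a small illustration of predicting a categorical response, here is a 1-nearest-neighbour classifier, one of the simplest classification methods. The training points, the "spam"/"ham" labels, and the function name are invented for the example.

```python
from math import dist

# Hypothetical labelled training data: (feature vector, category).
training = [
    ((1.0, 1.0), "spam"),
    ((1.2, 0.8), "spam"),
    ((4.0, 4.2), "ham"),
    ((3.8, 4.0), "ham"),
]

def predict(point):
    """Return the category of the training example closest to `point`."""
    _, label = min(training, key=lambda row: dist(row[0], point))
    return label

print(predict((1.1, 0.9)))
print(predict((4.1, 3.9)))
```

The output is a category label rather than a number, which is what distinguishes classification from regression.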

8. What are Recommender Systems?
Answer: Recommender systems are a subclass of information filtering systems that predict the
preferences or ratings a user would give to a product. They are widely used for movies,
news, research articles, products, social tags, music, etc.
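
A minimal sketch of one way to predict a rating is user-based collaborative filtering, which is one of several approaches. Assumptions: the users, films, and ratings below are invented, and similarity is measured with cosine similarity over the items two users have both rated.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ana": {"film_a": 5, "film_b": 4, "film_c": 1},
    "ben": {"film_a": 5, "film_b": 5, "film_c": 1, "film_d": 5},
    "cat": {"film_a": 1, "film_b": 1, "film_c": 5, "film_d": 1},
}

def cosine(u, v):
    """Cosine similarity over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)

def predict_rating(user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    weighted_sum = total_weight = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        weight = cosine(ratings[user], their_ratings)
        weighted_sum += weight * their_ratings[item]
        total_weight += weight
    return weighted_sum / total_weight if total_weight else None

predicted = predict_rating("ana", "film_d")
print(f"Predicted rating for ana on film_d: {predicted:.2f}")
```

Because ana's tastes match ben's much more closely than cat's, the prediction is pulled towards ben's high rating.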

9. What is power analysis?
Answer: Power analysis is a vital part of experimental design. It is the process of
determining the sample size needed to detect an effect of a given size with a given degree
of confidence. Conversely, it lets you determine the probability of detecting an effect of a
given size under a sample size constraint.
The various techniques of statistical power analysis and sample size estimation are widely
used to make accurate statistical judgments and to evaluate the sample size needed to detect
experimental effects in practice.

Power analysis helps you choose a sample size that is neither too high nor too low. If the
sample size is too low, the experiment will lack the power to provide reliable answers; if it
is too large, resources will be wasted.
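
As a rough sketch of a sample size calculation, the function below uses the standard normal approximation for comparing two group means. Assumptions: a two-sided test, equal group sizes, and an effect size expressed as a standardized difference (Cohen's d). The function name and default values are illustrative, and an exact t-test calculation can give a slightly larger number.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group: 2 * ((z_(1-alpha/2) + z_power) / d) ** 2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value of the test
    z_power = NormalDist().inv_cdf(power)           # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

n_medium = sample_size_per_group(0.5)   # medium effect
n_small = sample_size_per_group(0.2)    # small effect
print(n_medium, n_small)
```

Smaller effects need far larger samples, which is why fixing the effect size of interest before the experiment matters.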

10. How is Data modeling different from Database design?
Answer:

Data Modeling: It can be considered the first step towards the design of a database. Data
modeling creates a conceptual model based on the relationships between various data
objects. The process involves moving from the conceptual stage to the logical model to the
physical schema, systematically applying data modeling techniques.
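
The progression from conceptual model to physical schema can be sketched with a toy example. Everything here is hypothetical: the conceptual model is "a customer places orders", the dataclasses stand in for the logical model, and the SQLite tables are one possible physical schema.

```python
import sqlite3
from dataclasses import dataclass

# Logical model: entities, their attributes, and how they relate.
@dataclass
class Customer:
    customer_id: int
    name: str

@dataclass
class Order:
    order_id: int
    customer_id: int   # each order belongs to exactly one customer
    total: float

# Physical schema: concrete tables, column types, keys, and constraints.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total       REAL NOT NULL
);
""")

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```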

Database Design: This is the process of designing the database. Its output is a detailed
data model of the database. Strictly speaking,

Who am I buying these notes from?

Stuvia is a marketplace, so you are not buying this document from us, but from seller v4victoryvamshi. Stuvia facilitates payment to the seller.
