Exam (elaborations)

D204 - MW Data Analytics Tools and Techniques with 100% correct answers

Pages
7
Grade
A
Uploaded on
13-10-2023
Written in
2023/2024

What are the defining characteristics of open data?
data that is free to use with no cost and no restrictions

What is the principle of informed consent in research?
Potential research participants have to be given enough information about the goals, methods, and applications of the research project so they can decide whether they want to participate.

Self-generated data refers to what practice?
programming computers to engage with themselves to create their own training data

While it is possible to gather vast amounts of data through passive collection, researchers still need to be concerned about representativeness. Why does this matter?
Without representative data from a wide range of respondents in diverse situations, the results will not generalize well.

What practice does "data scraping" refer to?
Data scraping refers to the process of extracting data from formats that were not specifically designed for data sharing.

What kind of data can be accessed with APIs?
both proprietary and open data

One common method of gathering new data is A/B testing. What does this refer to?
A/B testing is a randomized experiment in which people visiting a website see one of two different versions. The response rates are then compared to choose the most effective version of the website.

What is potentially a major advantage of using in-house data?
You may be able to talk with the people who created the datasets.

How do expert systems mimic the decision-making of experts?
by explicitly listing decisions and outcomes in a logical chain like a flow chart

What does it mean that algorithms like neural networks develop "implicit" rules?
It means that the rules, or the basis by which the algorithm reaches its decision, may not be easy to describe to humans.

Why have expert systems not progressed as much as machine learning in the development of decision-making systems?
Expert systems quickly encounter the "combinatorial explosion," in which there are simply too many possibilities to enumerate them all.

How do decision trees provide rules for decision-making?
Decision trees explicitly define binary decisions at each step in the data to reach the outcome; these decisions can then be used in other contexts.

Statistical applications like SPSS or jamovi are useful to data projects in what way?
Their point-and-click interfaces make common analyses easier for non-specialists to conduct.

According to the video, why are spreadsheets so important to data science?
They are the "universal data container."

What is the purpose of a "package" in a programming language like Python or R?
Packages are collections of code that give additional functionality to programming languages and simplify many common tasks.

What is "machine-learning-as-a-service" or "MLaaS"?
MLaaS is a way of making machine learning easier and more accessible by hosting the software on the same cloud servers that store the data.

How does the "combinatorial explosion" make optimization difficult?
The number of possibilities increases so fast that it is often not possible to test all possible arrangements.

Computers frequently work with data in matrices that are arranged in rows and columns. What is the name for the version of algebra that works best with matrices?
linear algebra

What is a major advantage of understanding the algebra behind data science procedures?
You will better understand how to diagnose problems and respond when things don't work as expected.

According to the example calculation in the video, what information do you have to have in order to use calculus?
a function that describes the relationship between price and sales

What is a "posterior probability" in Bayes' Theorem?
A posterior probability is the probability of the cause, such as a disease, given the effect, such as a positive medical test for the disease.
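
To make the posterior probability answer above concrete, here is a minimal sketch of Bayes' Theorem applied to the disease and medical test example. The function name and all three input values (prevalence, sensitivity, false positive rate) are hypothetical, chosen only for illustration; they do not come from the course material.

```python
def posterior_probability(prior, sensitivity, false_positive_rate):
    """Return P(cause | effect), e.g. P(disease | positive test).

    prior               -- P(disease), the base rate before testing
    sensitivity         -- P(positive | disease)
    false_positive_rate -- P(positive | no disease)
    """
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1.0 - prior)
    # Bayes' Theorem: the share of all positive tests that are true positives.
    return true_positives / (true_positives + false_positives)


# Hypothetical numbers: 1% prevalence, 90% sensitivity, 5% false positive rate.
p = posterior_probability(prior=0.01, sensitivity=0.90, false_positive_rate=0.05)
print(round(p, 4))  # prints 0.1538
```

With these assumed inputs, a positive test raises the probability of disease from 1% to only about 15%, because the disease is rare and false positives outnumber true positives.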
Which two techniques are the most common choices for dimensionality reduction?
principal component analysis and factor analysis

What is meant by "autocorrelation" in time-series data?
Each point in time is influenced by the points that came before it.

Which are examples of models used in predictive analytics?
regression and neural networks

Which characteristic makes fraud detection particularly difficult?
Because fraud is rare, it leads to imbalanced distributions, which can make modeling more difficult.

If you can only choose one number to describe a distribution, then you should choose a measure of center. But what should you choose if you can have a second number?
a measure of variability

While anomaly detection is normally associated with negative outcomes like fraud or machine failure, it is more flexible than that. Which of the following is a positive outcome in anomaly detection?
identifying new markets with potential value

Why is it productive to aggregate models?
Different models tend to overestimate and underestimate their predictions, so the differences frequently cancel out.

What is meant by a "feature" in the context of feature reduction?
a variable or dimension in the data

The technique of separating time-series data into an overall trend, a seasonal or cyclical trend, and random variations or noise is known by which term?
decomposition

Which are methods used for validating models?
holdout testing data and cross-validation testing data

Which two methods are common algorithms for classifying new cases into existing categories?
k-means and k-nearest neighbors

What is the purpose of model validation in data science?
to determine how well the statistical model works with data other than the data used for modeling

What is the name for a chart that shows "branches" or cases splitting from one giant cluster to individual clusters?
a dendrogram

What does the saying "data is for doing" mean?
It means that data is typically gathered and analyzed to help direct what a person or company does.

When people make decisions based on data science analyses, what kind of factors should they focus on?
factors that are controllable and practical

What is the purpose of interpretability in data science projects?
Results that can be interpreted by humans can be used to form general principles for decision making in new situations.

What are methods for self-generated data?
1) External Reinforcement Learning 2) Generative adversarial networks 3) Internal Reinforcement Learning

What is the best use for the rules that are developed in neural networks?
for automating decision-making processes such as classification

What is the purpose of aggregating the predictions of multiple models in data science?
The combined predictions tend to be more accurate and more stable than the individual predictions.

What is potentially a major disadvantage to in-house data?
The data that you need for your project may not already exist in your organization.

APIs, or Application Programming Interfaces, generally serve what function in a data science project?
APIs allow you to access data and include it in your data science programming.

Which methods can be used for feature selection?
1) Correlation 2) Stepwise Regression 3) Lasso Regression 4) Variable Importance

What is another name for optimization formula?
mathematical programming

What does it mean when a machine learning algorithm is referred to as a "black box"?
a process that is hidden from view or difficult to understand
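
The answers above on aggregating models say that overestimates and underestimates tend to cancel out. The sketch below illustrates that idea with a simple average of two sets of predictions. The numbers are invented for illustration (one hypothetical model runs high, the other runs low), and averaging is only one possible way to aggregate; it is not guaranteed to help when the models' errors point in the same direction.

```python
def mean_absolute_error(predictions, actuals):
    """Average absolute difference between predictions and true values."""
    return sum(abs(p - a) for p, a in zip(predictions, actuals)) / len(actuals)


# Hypothetical data: model_a tends to overestimate, model_b to underestimate.
actuals = [10.0, 20.0, 30.0]
model_a = [12.0, 23.0, 31.0]
model_b = [9.0, 18.0, 28.0]

# Aggregate by averaging the two models' predictions case by case.
ensemble = [(a + b) / 2 for a, b in zip(model_a, model_b)]

mae_a = mean_absolute_error(model_a, actuals)
mae_b = mean_absolute_error(model_b, actuals)
mae_ensemble = mean_absolute_error(ensemble, actuals)

print(mae_a, mae_b, mae_ensemble)
```

In this constructed case the averaged predictions have a lower error than either model alone, because the opposite-signed errors partly offset each other.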


Document information

Contains
Questions & answers

Get to know the seller

GUARANTEEDSUCCESS Chamberlain College Nursing