Exam

ISYE 6501 FINAL EXAM WITH COMPLETE SOLUTION 2022/2023

Grade: A+

Uploaded on: 08-12-2022

Written in: 2022/2023

1. Factor-based models: classification, clustering, regression. It is implicitly assumed that there are a lot of factors in the final model.

2. Why limit the number of factors in a model? Two reasons:
- Overfitting: when the number of factors is close to or larger than the number of data points, the model may fit too closely to random effects.
- Simplicity: simple models are usually better.

3. Classical variable selection approaches: forward selection, backward elimination, stepwise regression. All three are greedy algorithms.

4. Backward elimination: variable selection; classical. Opposite of forward selection. Start with a model containing all factors; at each step find the worst factor and remove it from the model. Continue until no factor is bad enough to remove and the number-of-factors threshold is satisfied. Remove factors at the end that were not good enough.

5. Forward selection: variable selection; classical. Start with a model with no factors; at each step find the best new factor to add. Continue until no factor is good enough to add or the number-of-factors threshold is reached. Remove factors at the end that were not good enough.

6. Stepwise regression: variable selection; classical. Combination of forward selection and backward elimination. Start with all or no factors. At each step, add or remove a factor. As it continues, after adding a new factor we eliminate right away any factors that are no longer good enough. This helps the model adjust: when new factors are added, the goodness values of the existing factors change.

7. Ways of determining whether factors are good enough in variable selection: p-value, R-squared, AIC, BIC.

8. Greedy algorithm: at each step, it does the one thing that looks best without taking future options into consideration. Good for initial analysis. Examples: forward selection, backward elimination, stepwise regression.

9. Global variable selection approaches: LASSO, Elastic Net. Slower, but tend to give better predictive models.

10. LASSO: variable selection; global.
- Scale the data (as with any constrained sum of coefficients).
- Add a constraint to the standard regression equation: still minimize the sum of squared errors, but limit the sum of the coefficients' absolute values.
- T = limit or "budget" on how large the sum of the coefficients' absolute values can get. The budget will be used on the most important coefficients.
- A method for limiting the number of variables in a model by limiting the sum of all coefficients' absolute values. Can be very helpful when the number of data points is less than the number of factors.

11. Elastic Net: variable selection; global.
- Scale the data (as with any constrained sum of coefficients).
- T = limit or "budget" on a combination of the sum of the coefficients' absolute values and the sum of their squares. The budget will be used on the most important coefficients.
- Combination of LASSO and ridge regression.
- Variable selection benefits of LASSO.
- Predictive benefits of ridge regression.

12. Ridge regression:
- A method of regularization that limits the sum of the squares of the coefficients. It reduces the magnitude of the coefficients, not the number of variables chosen.
- The quadratic term in ridge regression tends to shrink the coefficient values, i.e. whatever the basic regression model coefficients would be, the quadratic constraint pushes them toward zero, or regularizes them.

13. Design of Experiments (DOE): How can we still have a representative sample of each combination of factors while only surveying 600 people? How do we determine which of the several factors are most important for predicting someone's answers?
- Comparison: to measure a difference.
- Control: control for other factors and effects.
- Blocking: blocking factors account for a known source of variation (red sports car vs. red minivan example).

14. A/B testing: whenever we want to choose between 2 alternatives, as long as the following 3 things are true:
- 1st, we need to be able to collect a lot of data quickly enough to get an answer in time to use it.
- 2nd, the data we collect has to be from a representative sample of the whole population.
- 3rd, the amount of data we collect has to be small compared to the total population we want to use the answer on.
Used before modeling and before collecting data.

15. (Full) factorial design: test every combination of variables in an experiment to find each one's effect, and the interaction effects, on the outcome.

16. Fractional factorial design: test a subset of combinations, selected so that the design is balanced and the effects can still be estimated as in a full factorial design. Used before modeling and before collecting data.

17. What approach to take if it is believed the factors we can change are independent? (Factorial design): test a subset of combinations and use regression to estimate the effect of each choice. Used before modeling and before collecting data.

18. Exploration vs. exploitation: exploration means focusing on getting more information; exploitation means getting immediate value.

19. Multi-armed bandit problem: exploration/exploitation principle. There are several slot machines and it is not known which has the highest payout, so all K alternatives must be tested.
- 1st test: equal probability for each alternative.
- 2nd test: update the probabilities based on the 1st test.
- We can also change the number of tests, how we update the probabilities, and how we assign new tests.

20. What needs to be the case when matching data to a probability distribution to gain insight based on how the distribution is derived?: The only information we have about a data point is the response, or it would be hard to collect and analyze additional information.

21. What is the Bernoulli distribution useful for modeling?: A single event, e.g. flipping a coin, will it rain or not, will I get this job offer or not. Only really useful when you put many of them together (flip a coin 10,000 times).

22. Describe a Bernoulli distribution in terms of a coin toss: probability p that a single coin flip comes up heads and probability 1 - p that it comes up tails.

23. Define a Binomial distribution: the probability of getting x yes answers out of n independent Bernoulli trials, each with probability p.

24. When is the normal distribution useful as an estimate for the Binomial distribution?: When n is large. The normal distribution is also useful for modeling errors in predictive models.

25. What is the question that describes a Geometric distribution?: How many (Bernoulli) trials are needed before we get an answer of a certain type?

26. What is the Poisson distribution good at modeling?: Random arrivals of people to lines, queues, etc.
- The function gives the probability that x people arrive, given the average arrival rate (lambda).
- Assumes arrivals are independent and identically distributed (i.i.d.).

27. What is the Exponential distribution good at modeling?: The time between arrivals or trials (inter-arrival time).

28. How are the Poisson and Exponential distributions related?: If arrivals are Poisson with arrival rate lambda, then the time between arrivals (inter-arrival time) follows the exponential distribution, with mean inter-arrival time 1/lambda. The converse also holds: if the inter-arrival time is exponential, the arrivals are Poisson.

29. When k = 1, the Weibull is what?: An Exponential distribution. The rate is constant, whether it is a failure rate (Weibull) or an inter-arrival rate (Exponential).

30. What is the Weibull distribution useful for modeling?: The amount of time it takes for something to fail, specifically the time between failures.

31. Describe a Q-Q plot: whatever variations in the data there might be, and even if the number of data points in the two sets is very different, 2 similar distributions should have about the same value at each quantile. It can also be used to match data to a probability distribution (just calculate the theoretical quantile values that follow from the distribution).

32. When k < 1, the Weibull is good for modeling what?: When the failure rate decreases with time. The worst things fail first.

33. What 2 probability distributions are memoryless?: Poisson and Exponential, in the sense that a Poisson arrival process has exponential inter-arrival times and the exponential distribution is memoryless. Strictly speaking, the memoryless distributions are the exponential and its discrete counterpart, the geometric.

34. When k > 1, the Weibull is good for modeling what?: When the failure rate increases with time. Things that wear out.
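The forward selection procedure in the notes can be sketched in a few lines. This is a minimal illustration, not the course's implementation: the function names (`solve`, `rss`, `forward_selection`), the synthetic data, and the stopping rule are all assumptions. The notes list p-value, R-squared, AIC and BIC as goodness measures; here "good enough" is taken to mean that adding the factor cuts the residual sum of squares by at least a fraction `min_gain`, which is an R-squared-style criterion.

```python
import random

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def rss(X, y, cols):
    """Residual sum of squares of a least-squares fit on an intercept plus the given columns."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(cols) + 1
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(AtA, Aty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))

def forward_selection(X, y, min_gain=0.1, max_factors=None):
    """Greedy forward selection: repeatedly add the factor that lowers RSS the most."""
    selected, remaining = [], list(range(len(X[0])))
    current = rss(X, y, selected)  # start with no factors (intercept only)
    while remaining and (max_factors is None or len(selected) < max_factors):
        scores = {c: rss(X, y, selected + [c]) for c in remaining}
        best = min(scores, key=scores.get)
        if current - scores[best] < min_gain * current:
            break  # no remaining factor is good enough to add
        selected.append(best)
        remaining.remove(best)
        current = scores[best]
    return selected

# Hypothetical data: only columns 0 and 2 actually drive the response.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
y = [3 * r[0] - 2 * r[2] + rng.gauss(0, 0.1) for r in X]
chosen = forward_selection(X, y)
print(sorted(chosen))
```

Backward elimination would run the same loop in reverse (start with all columns, drop the one whose removal hurts least), and stepwise regression would alternate the two checks after every change.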
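The constraints described for LASSO, ridge regression and Elastic Net can be written out side by side. The notation is an assumption for illustration: y_i is the response of data point i, x_{ij} its (scaled) value of factor j, a_j the regression coefficients, T the budget, and \lambda a mixing weight between 0 and 1 for Elastic Net.

```latex
\begin{aligned}
\text{All three minimize:}\quad
  & \sum_{i=1}^{n}\Bigl(y_i - a_0 - \sum_{j=1}^{m} a_j x_{ij}\Bigr)^{2} \\[4pt]
\text{LASSO, subject to:}\quad
  & \sum_{j=1}^{m} |a_j| \le T \\[4pt]
\text{Ridge, subject to:}\quad
  & \sum_{j=1}^{m} a_j^{2} \le T \\[4pt]
\text{Elastic Net, subject to:}\quad
  & \lambda \sum_{j=1}^{m} |a_j| + (1-\lambda)\sum_{j=1}^{m} a_j^{2} \le T
\end{aligned}
```

Because each constraint adds up raw coefficient values, the factors need to be on comparable scales, which is why the notes insist on scaling the data first.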
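The difference between a full and a fractional factorial design can be shown with three two-level factors. This is a small sketch under assumptions: the factor names A, B, C, the -1/+1 level coding, and the choice of the half fraction defined by C = A * B are all illustrative and not taken from the notes.

```python
from itertools import product

factors = ["A", "B", "C"]

# Full factorial: every combination of levels, 2^3 = 8 runs.
full = [dict(zip(factors, levels)) for levels in product((-1, 1), repeat=len(factors))]

# Half fraction: keep only the runs where C equals A * B, leaving 4 runs.
half = [run for run in full if run["C"] == run["A"] * run["B"]]

for run in half:
    print(run)

# Balance check: in the half fraction every factor is at each level equally often.
level_sums = {f: sum(run[f] for run in half) for f in factors}
print(level_sums)
```

The half fraction tests 4 of the 8 combinations, yet each factor still appears twice at each level and every pair of factors still sees all four level combinations once, which is what makes the subset balanced.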
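The multi-armed bandit loop (start with equal probabilities, then update them after each test) can be sketched as follows. The notes do not name an update rule, so this sketch assumes Thompson sampling with Beta posteriors as one possible choice; the payout probabilities and the function name `thompson_sampling` are hypothetical.

```python
import random

def thompson_sampling(true_payout_probs, rounds, seed=0):
    """Play `rounds` pulls across the arms and return how often each arm was pulled."""
    rng = random.Random(seed)
    k = len(true_payout_probs)
    wins = [0] * k
    losses = [0] * k
    pulls = [0] * k
    for _ in range(rounds):
        # Draw a plausible payout rate for each arm from its Beta posterior.
        # With no data, Beta(1, 1) is uniform, so all arms start out equal.
        samples = [rng.betavariate(wins[i] + 1, losses[i] + 1) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        pulls[arm] += 1
        # Observe a payout (exploitation) and update the belief (exploration pays off later).
        if rng.random() < true_payout_probs[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return pulls

# Hypothetical slot machines; the third has the highest payout probability.
pulls = thompson_sampling([0.2, 0.5, 0.8], rounds=2000)
print(pulls)
```

Early pulls are spread across the arms, and as evidence accumulates the pulls concentrate on the arm that appears best, which is the exploration/exploitation trade-off in miniature.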
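The claim that the normal distribution approximates the Binomial when n is large can be checked numerically. The values n = 100 and p = 0.5, the cutoff of 55, and the continuity correction of 0.5 are assumptions chosen for this illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_pmf(x, n, p):
    """Probability of exactly x yes answers in n independent Bernoulli(p) trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """Probability of at most x yes answers."""
    return sum(binomial_pmf(i, n, p) for i in range(x + 1))

n, p = 100, 0.5
exact = binomial_cdf(55, n, p)

# Normal with the same mean n*p and standard deviation sqrt(n*p*(1-p)),
# evaluated at 55.5 (continuity correction for a discrete variable).
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p))).cdf(55 + 0.5)

print(f"exact binomial: {exact:.4f}, normal approximation: {approx:.4f}")
```

For small n or for p close to 0 or 1, the two numbers drift apart, which is why the approximation is only recommended for large n.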
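The link between the Poisson and Exponential distributions can be illustrated by simulation: generate exponential inter-arrival times and count how many arrivals fall into each unit of time. The arrival rate of 4 per period, the number of periods, and the function name `simulate_arrival_counts` are assumptions for this sketch.

```python
import random
from statistics import mean, pvariance

def simulate_arrival_counts(rate, periods, seed=0):
    """Count arrivals per unit-length period when inter-arrival times are exponential."""
    rng = random.Random(seed)
    counts = [0] * periods
    t = rng.expovariate(rate)  # time of the first arrival
    while t < periods:
        counts[int(t)] += 1           # the arrival falls in period floor(t)
        t += rng.expovariate(rate)    # mean gap between arrivals is 1 / rate
    return counts

counts = simulate_arrival_counts(rate=4.0, periods=10_000)
avg = mean(counts)
var = pvariance(counts)
print(f"mean arrivals per period: {avg:.3f}, variance: {var:.3f}")
```

A Poisson distribution has its mean equal to its variance (both lambda), so seeing both statistics land close to the chosen rate is consistent with the per-period counts being Poisson.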
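The three Weibull cases in the notes (k < 1, k = 1, k > 1) follow from the shape of the failure-rate (hazard) function. A minimal sketch, assuming the standard two-parameter Weibull with shape k and a scale parameter `lam` that the notes do not mention:

```python
def weibull_hazard(t, k, lam=1.0):
    """Failure rate at time t for a Weibull with shape k and scale lam: (k/lam) * (t/lam)^(k-1)."""
    return (k / lam) * (t / lam) ** (k - 1)

times = [0.5, 1.0, 2.0, 4.0]
for k, label in [(0.5, "k < 1: decreasing"), (1.0, "k = 1: constant"), (2.0, "k > 1: increasing")]:
    rates = [weibull_hazard(t, k) for t in times]
    print(label, [round(r, 3) for r in rates])
```

With k = 1 the exponent vanishes and the rate is the constant 1/lam, which is the exponential distribution's constant rate; with k < 1 the rate falls over time (worst items fail first), and with k > 1 it rises (items wear out).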

Institution: ISYE 6501

Course: ISYE 6501

Document information

Uploaded on: 8 December 2022
Number of pages: 15
Written in: 2022/2023
Type: Exam
Contains: Questions & answers

Seller: NewMatic
