Georgia Tech ISYE 6501: Introduction to Analytics Modeling Professor: Dr. Joel Sokol Homework 3. 100% pass rate.
ISYE 6501: Introduction to Analytics Modeling Professor: Dr. Joel Sokol Homework 3, 31 January 2018

Overview

This week's lesson covers data preparation, including identifying and handling outliers, and an introduction to change detection. Data preparation involves inspecting the data visually for outliers and using a statistical test, Grubbs' test, to detect outliers in a univariate data set assumed to come from a normally distributed population.

The null and alternative hypotheses are two mutually exclusive statements about a population. A hypothesis test uses sample data to determine whether to reject the null hypothesis. Here, the null hypothesis states that all the data values come from the same normal distribution. The alternative hypothesis states that either the smallest or largest data value is an outlier.[1]

The CUSUM test is used for change detection:

S_t = max{0, S_{t-1} + (x_t - mu - C)}
Is S_t >= T?

Calculate the metric S_t and declare an observed change when S_t reaches or goes above some threshold T. At each time period, observe x_t, see how far above the expectation it is (x_t - mu), and add that to the previous period's metric S_{t-1}. Then take the max of 0 and that value: keep the running total if it is positive, otherwise reset it to zero. Random variation alone will sometimes put x_t above the expectation (up to 50% of the time), so we include a value C to pull the running total down a little. The bigger C is, the harder it is for S_t to get large and the LESS SENSITIVE the model is. The smaller C is, the more sensitive the model is, since S_t can grow faster.

How do you choose good values for C and T so the model finds changes quickly but isn't too sensitive? Use data! Evaluate how costly slow detection and false alarms are in your situation. Higher T = slower detection but fewer false detections. Lower T = faster detection but more likely to falsely detect changes.

Question 5.1 – Crime Data Analysis

Using crime data from
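As a rough illustration of the outlier test described in the overview, the sketch below computes the two-sided Grubbs statistic G = max|x_i - mean| / s, where s is the sample standard deviation. The function name `grubbs_statistic` and the sample values are made up for illustration and are not from the homework. The comparison of G against a critical value is left out because it needs a t-distribution quantile, which is not in Python's standard library.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def grubbs_statistic(values: Sequence[float]) -> Tuple[float, float]:
    """Return (G, suspect) for a two-sided Grubbs test.

    G is the largest absolute deviation from the sample mean, in units of
    the sample standard deviation; suspect is the value that produced it.
    Hypothetical helper: only the statistic is computed, not the p-value.
    """
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    # The most extreme point is either the smallest or the largest value.
    suspect = max(values, key=lambda v: abs(v - m))
    return abs(suspect - m) / s, suspect


# Made-up data with one value far from the rest.
sample = [2, 3, 3, 4, 3, 2, 4, 3, 12]
G, suspect = grubbs_statistic(sample)
print(f"suspect value: {suspect}, G = {G:.3f}")
```

A large G relative to the critical value for the chosen significance level and sample size would lead to rejecting the null hypothesis that all values come from the same normal distribution.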
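The CUSUM recursion described in the overview can be sketched in a few lines. This is a minimal sketch, not the homework's solution: the function name `cusum_increase`, the starting value S_0 = 0, and the sample series are assumptions made for illustration, and mu is taken as known instead of being estimated from the data.

```python
from typing import List, Optional, Sequence, Tuple


def cusum_increase(
    x: Sequence[float], mu: float, C: float, T: float
) -> Tuple[List[float], Optional[int]]:
    """Run S_t = max(0, S_{t-1} + (x_t - mu - C)) over the series x.

    Returns every S_t and the index of the first period with S_t >= T,
    or None if the threshold is never reached.
    """
    s_prev = 0.0  # assumption: start the running total at zero
    history: List[float] = []
    first_alarm: Optional[int] = None
    for t, x_t in enumerate(x):
        # Add the excess over the expectation, less the slack C,
        # and reset to zero if the running total would go negative.
        s_t = max(0.0, s_prev + (x_t - mu - C))
        history.append(s_t)
        if first_alarm is None and s_t >= T:
            first_alarm = t
        s_prev = s_t
    return history, first_alarm


# Made-up series: level near 10, then a shift upward from index 4 on.
series = [10, 11, 9, 10, 14, 15, 16, 15]

history, alarm = cusum_increase(series, mu=10, C=1, T=8)
_, alarm_small_c = cusum_increase(series, mu=10, C=0, T=8)   # more sensitive
_, alarm_high_t = cusum_increase(series, mu=10, C=1, T=13)   # slower to alarm
print(history, alarm, alarm_small_c, alarm_high_t)
```

On this toy series, lowering C moves the alarm earlier and raising T moves it later, which matches the trade-off described above.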
Written for
- Institution: ISYE 6501
- Module: ISYE 6501
Document information
- Uploaded on: April 25, 2023
- Number of pages: 11
- Written in: 2022/2023
- Type: Exam (elaborations)
- Contains: Questions & answers