PCA - PROCEDURAL OPTIONS IN PCA ACTUAL EXAM QUESTIONS AND ANSWERS
What three methods test for the normal distribution of variables?
Pearson's correlation (r), the Z score, and the Kolmogorov-Smirnov test.

What is the Kolmogorov-Smirnov test?
It compares the observed distribution with a perfect normal distribution that has the same mean and variance. It is very strict.

How do you calculate the Z score?
Z = (value - mean) / standard deviation

What Z score indicates a normal distribution?
A Z score within +/-2, i.e. |Z| < 2.

What is Tabachnick and Fidell's (2001) rule for sample size?
"It is comforting to have at least 300 cases for PCA."

What is Comrey and Lee's (1992) rule for sample size?
They class 100 as a poor sample, 300 as a good sample, and 1000 as an excellent sample.

What are the three methods of discarding variables?
1. Multiple correlation and regression analysis
2. Perform an exploratory PCA
3. Cluster the variables into groups

Why do we need to discard some variables before performing a PCA?
Regularly spaced data is needed when using PCA to study geographic variability.

What is an indicator variable?
A variable that best describes an independent trend in the correlations of the data matrix. It is kept to avoid duplication of variables and redundancy.

What is the Pearson correlation (r) equation?
r = Σ(x - mean of x)(y - mean of y) / (n × SDx × SDy), where n is the number of observations.

What is the covariance equation?
cov(x, y) = Σ(x - mean of x)(y - mean of y) / n, where n is the number of observations.

Why is correlation preferred for PCA?
It overcomes measurement-scale problems and gives every variable equal weight.

When would you use a covariance matrix for PCA?
When all the variables have the same units and you want to preserve their variability.

What is a limitation of using covariance in PCA?
The first few PCs will be strongly related to the variables with the highest variance.

What are the three coupled pairs for PCA modes?
O and P; Q and R; S and T.

What are the two most commonly used PCA modes for meteorology?
P mode and S mode.

What does S mode compare?
Component loading: geographic variability (stations). Component score: time.

What does P mode compare?
Component loading: site. Component score: time.

What does R mode compare?
Component loading: meteorological variables (fixed time). Component score: locations.

Why do we rotate PC axes?
To find the best fit and pull the loadings towards +/-1 or 0.

What are the two types of rotation?
Orthogonal and oblique.

What is orthogonal rotation? Give an example.
The component axes remain at 90° to one another, so they remain independent, but the short axes move. E.g. Varimax.

What is oblique rotation? Give an example.
The component axes are rotated around the origin but do not remain at 90°. E.g. Direct Oblimin.

What is a disadvantage of unrotated solutions?
Domain-shape dependence: predictable patterns in the PC loadings can emerge (Buell patterns).

What type of rotation would a low correlation imply?
Orthogonal.

What type of rotation would a high correlation imply?
Oblique.

What is Kaiser's rule (1960)?
The eigenvalue-one rule: retain only those PCs with eigenvalues > 1, since these components explain more variance than each original variable.

When is Kaiser's criterion accurate?
When the number of variables is < 30 and the communalities after extraction are all > 0.7, or when the number of observations for each variable is > 250 and the average communality is >= 0.6.

What is Jolliffe's rule (1972)?
Retain PCs with eigenvalues > 0.7.

What is Cattell's scree plot (1966b) and when is it used?
A graphical representation of the eigenvalue-one rule. It is used when the number of observations per variable is > 200 and the conditions for using the eigenvalue-one rule are not satisfied.

What is the North et al. rule based on?
The magnitude of adjacent eigenvalues in a series.

What is Preisendorfer's (1988) rule?
Rule N: it tries to divide the components between a "signal" and noise, and retains only those components that constitute the signal.
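
The Z score formula and the +/-2 threshold from the cards above can be tried on a small data set. The following is a minimal sketch, not a prescribed procedure: the sample values are invented for illustration, the function names are hypothetical, and the population standard deviation (dividing by n) is assumed because the cards do not say which form to use.

```python
import statistics

def z_scores(values):
    """Return Z = (value - mean) / standard deviation for each value.

    Assumption: population standard deviation (divide by n).
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def outside_threshold(values, limit=2.0):
    """Return the values whose |Z| is not below the limit (+/-2 on the cards)."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) >= limit]

# Invented illustrative sample: mean 5, population SD 2.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
scores = z_scores(sample)
flagged = outside_threshold(sample)
```

Here the value 9 sits exactly two standard deviations above the mean, so it is the only value that fails the |Z| < 2 check.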
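
The covariance and Pearson correlation equations from the cards can be written out directly, which also shows why correlation overcomes measurement-scale problems. This is a small sketch under stated assumptions: the data are invented, the function names are hypothetical, and both formulas divide by n (the number of observations) as on the cards.

```python
import math

def covariance(x, y):
    """cov(x, y) = sum((x - mean_x) * (y - mean_y)) / n."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

def pearson_r(x, y):
    """r = cov(x, y) / (SDx * SDy), using population standard deviations."""
    sd_x = math.sqrt(covariance(x, x))
    sd_y = math.sqrt(covariance(y, y))
    return covariance(x, y) / (sd_x * sd_y)

# Invented illustrative data: y is x in different units (scaled by 2).
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
y_rescaled = [100 * v for v in y]  # same variable, another measurement scale

cov_xy = covariance(x, y)
cov_rescaled = covariance(x, y_rescaled)
r_xy = pearson_r(x, y)
r_rescaled = pearson_r(x, y_rescaled)
```

Changing the units of y multiplies the covariance by 100 but leaves r unchanged, which is the scale problem that a correlation-matrix PCA avoids.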
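
Kaiser's rule and Jolliffe's rule differ only in the eigenvalue threshold, so they can be compared side by side. The sketch below assumes a PCA on the correlation matrix of six variables, where the eigenvalues sum to 6. The eigenvalues themselves are made up for illustration and the function name is hypothetical.

```python
def retained_components(eigenvalues, threshold):
    """Return the eigenvalues strictly greater than the threshold."""
    return [ev for ev in eigenvalues if ev > threshold]

# Invented eigenvalues for a 6-variable correlation-matrix PCA (sum = 6).
eigenvalues = [2.9, 1.2, 0.8, 0.6, 0.3, 0.2]

kaiser = retained_components(eigenvalues, 1.0)    # Kaiser (1960): eigenvalue > 1
jolliffe = retained_components(eigenvalues, 0.7)  # Jolliffe (1972): eigenvalue > 0.7

# Share of total variance explained by each retained set.
total = sum(eigenvalues)
kaiser_share = sum(kaiser) / total
jolliffe_share = sum(jolliffe) / total
```

With these values Kaiser's rule keeps two components and Jolliffe's rule keeps three, so the lower threshold retains a component that explains slightly less variance than a single original variable.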
Document information
- Uploaded on: April 18, 2024
- Number of pages: 5
- Written in: 2023/2024
- Type: Exam (elaborations)
- Contains: Questions & answers