Exploratory Factor Analysis
Exploratory Factor Analysis is a statistical technique used primarily to reduce a large number of variables to a smaller number of factors, and to identify and explore the underlying theoretical structure of a concept. Two extraction methods are commonly used in Exploratory Factor Analysis:
- Principal Component Method: This method is appropriate when we want to extract the maximum variance from the original data using the minimum number of factors.
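- Common Factor Analysis: This method analyzes only the variance that the variables share (common variance) and excludes unique and error variance. It is appropriate when the goal is to identify the underlying factors and their nature is not known in advance.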
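The principal component method can be sketched in a few lines. This is a minimal illustration, not a full implementation: the correlation matrix is made up for the example, and the function name `principal_component_loadings` is hypothetical. Loadings are taken as the eigenvectors of the correlation matrix scaled by the square roots of their eigenvalues.

```python
import numpy as np

def principal_component_loadings(corr, n_factors):
    """Loadings of the first n_factors principal components of a correlation matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(corr)  # eigh returns ascending order
    order = np.argsort(eigenvalues)[::-1]             # reorder, largest eigenvalue first
    eigenvalues = eigenvalues[order][:n_factors]
    eigenvectors = eigenvectors[:, order][:, :n_factors]
    return eigenvectors * np.sqrt(eigenvalues)        # scale each column

# Hypothetical correlation matrix for four variables (illustrative values only).
corr = np.array([
    [1.0, 0.6, 0.1, 0.1],
    [0.6, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.5],
    [0.1, 0.1, 0.5, 1.0],
])

loadings = principal_component_loadings(corr, n_factors=2)
communalities = (loadings ** 2).sum(axis=1)  # variance of each variable explained by the factors

print(np.round(loadings, 2))
print(np.round(communalities, 2))
```

Keeping all four components would reproduce the correlation matrix exactly; keeping two trades some of that accuracy for a simpler description.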
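To decide how many factors to extract for a concept, we first need to review the theory. This gives us a general idea of how many factors the construct needs in order to make sense. Empirically, there are three criteria for determining the number of factors to extract:
- Eigenvalue criterion (> 1): Retain the factors whose eigenvalue is greater than 1.
- Scree plot: Plot the eigenvalues in descending order and retain the factors that come before the curve levels off.
- Variance explained: Retain enough factors to account for an acceptable share of the total variance.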
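The three criteria above all start from the eigenvalues of the correlation matrix. The sketch below uses a made-up correlation matrix (an assumption for illustration) and prints the eigenvalues in descending order, which are the values a scree plot would display, together with the cumulative share of variance.

```python
import numpy as np

# Hypothetical correlation matrix for four variables (illustrative values only).
corr = np.array([
    [1.0, 0.6, 0.1, 0.1],
    [0.6, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.5],
    [0.1, 0.1, 0.5, 1.0],
])

# Eigenvalues sorted from largest to smallest.
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]

# Eigenvalue criterion: count the eigenvalues greater than 1.
n_retained = int((eigenvalues > 1.0).sum())

# Variance explained: each eigenvalue as a share of the total, accumulated.
cumulative = np.cumsum(eigenvalues / eigenvalues.sum())

for number, (value, share) in enumerate(zip(eigenvalues, cumulative), start=1):
    print(f"factor {number}: eigenvalue = {value:.3f}, cumulative variance = {share:.1%}")
print("factors with eigenvalue > 1:", n_retained)
```

The criteria do not always agree, so the empirical result is usually weighed against what the theory suggests.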
The rotation method is the next thing we need to decide when performing Exploratory Factor Analysis. Two families of rotation are available, and we can choose either of them depending on the research requirements:
- Orthogonal Rotation: The axes are rotated in such a way that the 90-degree angle between them is maintained. The extracted factors are therefore uncorrelated. Three orthogonal methods are available:
  - Quartimax: Simplifies the rows of the loading matrix so that each variable loads highly on a single factor.
  - Varimax: Simplifies the columns of the loading matrix so that each factor has a few high loadings and the rest close to zero, which improves the chances of each variable being associated with a single factor.
  - Equimax: Simplifies both the rows and the columns.
- Oblique Rotation: The axes are rotated without keeping the 90-degree angle between them, so the extracted factors are allowed to correlate and are not completely independent of each other. Oblimin is a common method of this kind.
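To make the idea of rotation concrete, here is a minimal sketch of raw varimax using the classic pairwise planar-rotation formulation: each pair of factors is rotated by the angle that maximizes the variance of the squared loadings for that pair. The function name `varimax`, the example loadings, and the 30-degree mixing angle are all assumptions for illustration, and row normalization is left out for brevity.

```python
import numpy as np

def varimax(loadings, max_sweeps=50, tol=1e-10):
    """Raw varimax by repeated pairwise planar rotations (no row normalization)."""
    rotated = np.array(loadings, dtype=float)
    p, k = rotated.shape
    for _ in range(max_sweeps):
        largest_angle = 0.0
        for i in range(k - 1):
            for j in range(i + 1, k):
                x, y = rotated[:, i].copy(), rotated[:, j].copy()
                u, v = x**2 - y**2, 2.0 * x * y
                a, b = u.sum(), v.sum()
                c, d = (u**2 - v**2).sum(), 2.0 * (u * v).sum()
                # Angle that maximizes the varimax criterion for this pair of columns.
                angle = 0.25 * np.arctan2(d - 2.0 * a * b / p, c - (a**2 - b**2) / p)
                rotated[:, i] = x * np.cos(angle) + y * np.sin(angle)
                rotated[:, j] = -x * np.sin(angle) + y * np.cos(angle)
                largest_angle = max(largest_angle, abs(angle))
        if largest_angle < tol:  # stop once a full sweep no longer changes anything
            break
    return rotated

# Hypothetical loadings: each variable belongs to one factor...
simple = np.array([[0.9, 0.0], [0.5, 0.0], [0.0, 0.8], [0.0, 0.4]])
# ...but the axes are turned by 30 degrees, which blurs the pattern.
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
unrotated = simple @ mix

rotated = varimax(unrotated)
print(np.round(unrotated, 2))
print(np.round(rotated, 2))
```

Because the rotation is orthogonal, each variable's communality (the row sum of squared loadings) is the same before and after; only the distribution of loadings across factors changes.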
The next thing we need to determine is the factor loading threshold for keeping or dropping a variable. In general, the threshold lies between 0.30 and 0.50.
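Applying a threshold is a simple filter on the loading matrix. In the sketch below, the variable names, the loadings, and the choice of 0.40 are made up for illustration, and the function `split_by_loading` is hypothetical. A variable is kept when its largest absolute loading on any factor reaches the threshold.

```python
def split_by_loading(loadings, threshold=0.40):
    """Split variables into keep/drop lists by their largest absolute loading."""
    keep, drop = [], []
    for name, row in loadings.items():
        strongest = max(abs(value) for value in row)
        if strongest >= threshold:
            keep.append(name)
        else:
            drop.append(name)
    return keep, drop

# Hypothetical loadings of four survey items on two factors.
loadings = {
    "q1": [0.72, 0.10],
    "q2": [0.15, -0.64],
    "q3": [0.22, 0.18],
    "q4": [0.41, 0.35],
}

keep, drop = split_by_loading(loadings, threshold=0.40)
print("keep:", keep)
print("drop:", drop)
```

The sign of a loading is ignored here because a strong negative loading still indicates a strong association with the factor.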
Some of the assumptions of Exploratory Factor Analysis are:
- The variables should be measured on either an ordinal or a continuous scale.
- Typically, we require a sample that is 5-10 times the number of variables. So, if there are 15 variables, we require a minimum of 75-150 respondents.
- The variables should be correlated (the data should be homogeneous).
- Data should not have any outliers.
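Two of the assumptions above lend themselves to quick checks before running the analysis. The sketch below is only illustrative: the function names are hypothetical, the data are made up, and flagging values more than 3 standard deviations from the mean is one common screening convention chosen here as an assumption, not a rule stated above.

```python
from statistics import mean, pstdev

def required_sample_range(n_variables, low=5, high=10):
    """Minimum sample size range under the 5-10 respondents per variable guideline."""
    return n_variables * low, n_variables * high

def flag_outliers(values, cutoff=3.0):
    """Indices of values whose z-score exceeds the cutoff (assumed screening rule)."""
    center, spread = mean(values), pstdev(values)
    if spread == 0:
        return []
    return [i for i, value in enumerate(values) if abs(value - center) / spread > cutoff]

print(required_sample_range(15))

# Hypothetical responses for one variable; the last value is far from the rest.
responses = [10, 11, 9, 10, 12, 11, 10, 9, 11, 10, 10, 12, 9, 11, 10, 50]
print(flag_outliers(responses))
```

A flagged value is a prompt to inspect the record, not an automatic reason to delete it.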