Factor analysis digression

Set up some factor scores

f = rnorm(100,0,15)

Some noise for the five manifest variables

e1 = rnorm(100,0,5)
e2 = rnorm(100,0,5)
e3 = rnorm(100,0,20)
e4 = rnorm(100,0,50)
e5 = rnorm(100,0,5)

And then the variables themselves, generated from the latent variable. There's plenty to play with here, e.g. the slopes and the noise SDs, to see what affects the factor loadings…

x1 = 200 * f + e1
x2 = 5 * f + e2
x3 = 2 * f + e3
x4 = 2 * f + e4
x5 = 2 * f + e5
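As a rough guide to what the loadings ought to be, here's a small sketch that works them out from the population values used in the simulation above. It assumes the standard one-factor result that a standardised loading is the square root of the share of a variable's variance coming from f. The names (sd_f, slope, sd_e and so on) are just mine for illustration.

```r
# Population values taken from the simulation above
sd_f  <- 15
slope <- c(x1 = 200, x2 = 5, x3 = 2, x4 = 2, x5 = 2)
sd_e  <- c(x1 = 5,   x2 = 5, x3 = 20, x4 = 50, x5 = 5)

# Variance due to the factor, and total variance, for each manifest variable
var_common <- (slope * sd_f)^2
var_total  <- var_common + sd_e^2

# Standardised loading = sqrt(share of variance due to f);
# uniqueness = share of variance due to noise
expected_loading    <- sqrt(var_common / var_total)
expected_uniqueness <- sd_e^2 / var_total

round(cbind(expected_loading, expected_uniqueness), 3)
```

Note that x1 and x2 both have population uniquenesses below 0.005, which is the default lower bound factanal uses during optimisation, so their estimates can't go any lower than that.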

Now let’s use factor analysis to get back f:

fa1 = factanal(~ x1 + x2 + x3 + x4 + x5, factors=1, scores = "Bartlett")

The output

> fa1

Call:
factanal(x = ~x1 + x2 + x3 + x4 + x5, factors = 1, scores = "Bartlett")

Uniquenesses:
   x1    x2    x3    x4    x5
0.005 0.005 0.348 0.677 0.029

Loadings:
   Factor1
x1 0.998
x2 0.998
x3 0.808
x4 0.569
x5 0.986

               Factor1
SS loadings      3.940
Proportion Var   0.788

Test of the hypothesis that 1 factor is sufficient.
The chi square statistic is 45.24 on 5 degrees of freedom.
The p-value is 1.3e-08

Compare the scores from FA (which are on a standardised scale, so roughly mean 0 and variance 1) with f scaled the same way.

> as.vector(fa1$scores)
[1] 0.46229282 -0.60935524 1.68547486 0.32820617 -0.31099996 …
> as.vector(scale(f))
[1] 0.45208732 -0.66695823 1.63456194 0.33728772 -0.35131222  …
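Eyeballing the first few values is fine, but a single number is easier to judge. Here's a self-contained sketch that reruns the same setup and summarises the agreement. The seed and the names n, d, fa and est are my own additions, so the numbers won't match the output above. The absolute value of the correlation is used because the sign of a factor is arbitrary.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

n <- 100
f <- rnorm(n, 0, 15)

# Same slopes and noise SDs as in the post
d <- data.frame(
  x1 = 200 * f + rnorm(n, 0, 5),
  x2 = 5 * f + rnorm(n, 0, 5),
  x3 = 2 * f + rnorm(n, 0, 20),
  x4 = 2 * f + rnorm(n, 0, 50),
  x5 = 2 * f + rnorm(n, 0, 5)
)

fa  <- factanal(d, factors = 1, scores = "Bartlett")
est <- as.vector(fa$scores)

# Mean and variance of the estimated scores, and how well they track f
c(mean = mean(est), var = var(est), abs_cor_with_f = abs(cor(est, f)))
```

Printing the mean and variance alongside the correlation lets you check the "mean 0, variance 1" expectation directly rather than taking it on trust.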

Looks good, and hopefully the theoretical model in my head isn’t too far off.

Next stop, SEM with a latent variable in…
