Chi-square in SEM

May 11, 2008 by Andy

Playing around again with SEM. Just where does that \chi^2 come from? Here’s a brain dump of the gist.

You start with the sample covariance matrix (S) and a model description (quantitative boxology; CFA tied together with regression). The fit machinery iteratively adjusts the parameter estimates until the discrepancy between S and the “implied” covariance matrix (i.e., the one predicted by the model, C) is minimised, and out pops the final set of estimates. Then you multiply that minimised discrepancy between S and C by (N - 1), where N is the sample size, to get something which (if the model and the distributional assumptions hold) approximately follows a \chi^2 distribution.

Marvellous.

First how do we get C? Loehlin (2004, p. 41) to the rescue:

C = F \cdot (I - A)^{-1} \cdot S \cdot (I - A)^{-1'} \cdot F'

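Here A and S are square, with one row and column per variable in the model (observed and latent). When there are no latent variables, as in the example below, they have the same dimensions as the sample covariance matrix. (This is a different S to the one I mentioned above; don’t be confused yet.)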

A contains the (asymmetric) path estimates, S contains the (symmetric) covariances and residual variances (the latter seem to be squared; why?), and F is the so-called filter matrix, which marks which variables are measured variables. (I is the identity matrix and M' is the transpose of M.)

I don’t quite get WHY the implied matrix is plugged together this way, but onwards…
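One way to see why the pieces fit together like this is sketched below. It assumes the usual RAM-style setup, which I take to be what Loehlin has in mind; the vectors v and u are labels introduced here, not his notation. Stack all the variables (observed and latent) in a vector v, and write each one as a weighted sum of the others (the paths in A) plus its own exogenous part u, whose covariance matrix is S. Then use the rule Cov(Mx) = M Cov(x) M'.

```latex
v = A v + u
\;\Rightarrow\; (I - A)\, v = u
\;\Rightarrow\; v = (I - A)^{-1} u

\mathrm{Cov}(v) = (I - A)^{-1} \, \mathrm{Cov}(u) \, (I - A)^{-1'}
                = (I - A)^{-1} \, S \, (I - A)^{-1'}

C = \mathrm{Cov}(F v) = F \, (I - A)^{-1} \, S \, (I - A)^{-1'} \, F'
```

The last step just keeps the rows and columns belonging to measured variables. On this reading S holds variances (of the exogenous variables and of the residuals), and a variance is a squared standard deviation, which may be why the residual entries look squared.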

So now we have a C. Take S again (the sample covariance matrix). Loehlin gives a number of different criterion measures which tell you how far off C is. I’m playing with SEM in R so let’s see what John Fox’s package does… SEEMS to be this one, the maximum likelihood criterion:

\mbox{tr}(SC^{-1}) + \mbox{log}(|C|) - \mbox{log}(|S|) - n

where \mbox{tr} is the trace of a matrix (the sum of its diagonal elements), |M| is the determinant of M, and n is the number of observed variables.

The R code for this (pulled and edited from the null \chi^2 calculation in the sem fit function) is

sum(diag(S %*% solve(C))) + log(det(C)) - log(det(S)) - n

Here you can see trace is implemented as a sum after a diag. The solve function applied to only one matrix (as here) gives you the inverse of the matrix.
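To get a feel for how this criterion behaves, here is a minimal sketch that wraps the expression in a function and tries it on a tiny made-up covariance matrix. The name ml_discrepancy and the matrix S_demo are invented for illustration; neither comes from the sem package.

```r
# ML discrepancy between a sample covariance matrix S and an implied one C.
# ml_discrepancy is a name made up for this sketch, not a function from sem.
ml_discrepancy <- function(S, C) {
  n <- nrow(S)                      # number of observed variables
  sum(diag(S %*% solve(C))) + log(det(C)) - log(det(S)) - n
}

# A small made-up covariance matrix for illustration
S_demo <- matrix(c(4, 1,
                   1, 9), nrow = 2, byrow = TRUE)

# A perfect fit (C equal to S) gives zero, up to rounding error
f_perfect <- ml_discrepancy(S_demo, S_demo)

# Forcing the covariance to zero makes the fit worse, so the value is positive
f_worse <- ml_discrepancy(S_demo, diag(diag(S_demo)))
```

So the criterion is zero when the model reproduces S exactly and grows as C drifts away from S.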

Let’s have a quick poke around with the sem package using a simple linear regression:

library(sem)

N=100

x1 = rnorm(N, 20, 20)
x2 = rnorm(N, 50, 10)
x3 = rnorm(N, 100, 15)
e = rnorm(N,0,100)

y = 2*x1 - 1.2*x2 + 1.5*x3 + 40 + e

thedata = data.frame(x1,x2,x3,y)

mod1 = specify.model()
y <-> y, e.y, NA
x1 <-> x1, e.x1, NA
x2 <-> x2, e.x2, NA
x3 <-> x3, e.x3, NA
y <- x1, bx1, NA
y <- x2, bx2, NA
y <- x3, bx3, NA

sem1 = sem(mod1, cov(thedata), N=nrow(thedata), debug=TRUE)
summary(sem1)

When I ran this, the model \chi^2 = 4.6454.
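For context, a sketch of how that number might be judged. If I have counted correctly, the model as specified estimates 4 variances and 3 regression paths but no covariances among x1, x2 and x3, so it is not saturated and has 3 degrees of freedom left over. The \chi^2 value is the one from my run; yours will differ because the data are random.

```r
n_obs     <- 4                          # x1, x2, x3, y
n_moments <- n_obs * (n_obs + 1) / 2    # 10 unique variances and covariances
n_params  <- 7                          # 4 variances + 3 regression paths
df        <- n_moments - n_params       # degrees of freedom left over

chisq   <- 4.6454                       # value from one random sample
p_value <- pchisq(chisq, df = df, lower.tail = FALSE)
```

Since the predictors were simulated independently, a non-significant result is what you would hope for here.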

The S and C matrices can be extracted using

sem1$S
sem1$C

Then plugging these into the formula …

N = 100
n = 4

S = sem1$S
C = sem1$C

(N - 1) *
(sum(diag(S %*% solve(C))) + log(det(C))-log(det(S)) - n)

… gives… 4.645429.

One other thing: to get the null \chi^2 you just set C to a diagonal matrix holding the diagonal of S, i.e., keep the variances and fix all the covariances at zero.
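A minimal sketch of the null calculation on made-up data (the data frame, the seed and the variable names are all invented for illustration). It also checks a shortcut: with a diagonal C the trace term equals n, so the whole thing should reduce to -(N - 1) times the log determinant of the correlation matrix.

```r
set.seed(1)                              # arbitrary seed, for repeatability only
N_demo <- 100
dat <- data.frame(v1 = rnorm(N_demo), v2 = rnorm(N_demo), v3 = rnorm(N_demo))
dat$v3 <- dat$v3 + 0.5 * dat$v1          # give the data some covariance

S_demo <- cov(dat)
n_demo <- ncol(S_demo)
C_null <- diag(diag(S_demo))             # variances kept, covariances set to zero

chisq_null <- (N_demo - 1) *
  (sum(diag(S_demo %*% solve(C_null))) + log(det(C_null)) -
     log(det(S_demo)) - n_demo)

# Shortcut: the trace term is n, leaving only the log determinants
chisq_null_short <- -(N_demo - 1) * log(det(cor(dat)))
```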

Next up, would be nice to build C by hand for particular model and its parameter estimates…
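Here is a rough sketch of what that could look like for the regression model, assuming the RAM formula from Loehlin quoted earlier. The parameter values are made up (loosely based on the simulation settings), not estimates from a fitted model, and the names S_ram, Fmat and C_hand are invented here. Fmat is used instead of F because F is shorthand for FALSE in R.

```r
vars <- c("x1", "x2", "x3", "y")
k <- length(vars)

# Made-up parameter values, loosely based on the simulation settings
b    <- c(x1 = 2, x2 = -1.2, x3 = 1.5)        # regression paths into y
vx   <- c(x1 = 20^2, x2 = 10^2, x3 = 15^2)    # variances of the predictors
ve_y <- 100^2                                 # residual variance of y

# A: asymmetric paths, row = variable being predicted, column = predictor
A <- matrix(0, k, k, dimnames = list(vars, vars))
A["y", names(b)] <- b

# S (the RAM one): variances on the diagonal; this model has no covariances
S_ram <- diag(unname(c(vx, ve_y)))
dimnames(S_ram) <- list(vars, vars)

# F: every variable is observed here, so the filter is just the identity
Fmat <- diag(k)

IA_inv <- solve(diag(k) - A)
C_hand <- Fmat %*% IA_inv %*% S_ram %*% t(IA_inv) %*% t(Fmat)
dimnames(C_hand) <- list(vars, vars)
```

The implied covariance between y and x1 comes out as the path times the variance of x1, and the implied covariances among the predictors are zero, which is exactly the restriction the model places on the data.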

Reference

Loehlin, J. C. (2004). Latent Variable Models (4th ed.). Lawrence Erlbaum Associates, NJ, USA.

And I thought this would be easier than just making the completion sets…

May 2, 2008 by Andy

?- Q.
% … 1,000,000 ………… 10,000,000 years later
%
% >> 42 << (last release gives the question)

Some advice on factor analysis from the 60s

April 30, 2008 by Andy

“What are the alternatives to factor (or component) analysis if one has a correlation whose analysis one cannot escape? There is only one alternative method of analysing a correlation matrix which needs to be mentioned, and that is to LOOK AT IT.”

“Quite the best alternative to factor analysis is to avoid being saddled with the analysis of a correlation matrix in the first place. (Just to collect a lot of people, to measure them all on a lot of variables, and then to compute a correlation matrix is, after all, not a very advanced way of investigating anything.)”

From Andrew S. C. Ehrenberg (1962). Some Questions About Factor Analysis. The Statistician, 12(3), 191-208 [Thanks 'lexis for the ref!]

13 ways to look at (Galton-Pearson) correlation

April 27, 2008 by Andy

Found this paper on having a nosy around to see different ways of correlating non-Gaussian variables: Joseph Lee Rodgers and W. Alan Nicewander (1988). Thirteen Ways to Look at the Correlation Coefficient. The American Statistician, 42(1), 59-66.

Therein you’ll find details of the history (apparently Gauss got there first, but didn’t care about the special case of bivariate correlation); a range of examples of how to get the coefficient (e.g., standardised covariance, standardised regression slope, a geometric interpretation in “person space”, the balloon rule). Also a nice reminder that, in terms of the maths, the dichotomy between experimental and observational analysis is false: the difference lies in interpretation. Still many people seem to think that ANOVA is for experiments and regression is for observational studies (or that SEM magically deals with causation in observational studies).
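A few of those routes to the coefficient can be checked against each other in R. This is a small sketch on made-up data (the seed and the 0.6 slope are arbitrary); the last version is my reading of the geometric interpretation, as the cosine of the angle between the two centred variable vectors.

```r
set.seed(42)                         # arbitrary seed
x <- rnorm(200)
y <- 0.6 * x + rnorm(200)            # made-up data with some linear relation

r_builtin <- cor(x, y)

# Standardised covariance
r_cov <- cov(x, y) / (sd(x) * sd(y))

# Slope of the regression of standardised y on standardised x
zx <- as.numeric(scale(x))
zy <- as.numeric(scale(y))
r_slope <- unname(coef(lm(zy ~ zx))["zx"])

# Geometric view: cosine of the angle between the centred vectors
xc <- x - mean(x)
yc <- y - mean(y)
r_cos <- sum(xc * yc) / sqrt(sum(xc^2) * sum(yc^2))
```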

All amusing stuff.

Sex differences in psychology

April 25, 2008 by Andy

My take on this:

  1. There are sex differences in ability, but not many, and the effect size is typically small* (Hyde, 2005).
  2. Brain structure development is affected by a range of factors, environmental and genetic. For instance brain structure changes as a result of learning (e.g., Maguire et al., 2000). (And the phrase “hard-wired” is annoying.)
  3. A mean difference between groups, mean(group 1) > mean(group 2), on some measure does not imply that everyone in group 1 is better than everyone in group 2. So when selecting someone for a job, say, you could (a) grab a load of people with the sex which, on average, has (very slightly—see point 1) more of the ability you want and choose someone at random, or (b) you could choose someone who has more of the ability you want, and not focus on what genitalia they happen to possess.
  4. The designers of IQ tests hack their tests to remove sex differences, for instance the designers of the British Ability Scales (version 2) “used three strategies to test for fairness and to remove items likely to increase bias” (Hill, 2005). Blinkhorn (2005) says: “Where there are sex differences to be found, detailed study of the internal workings of the test tends to show why. That’s not based on instinct, but on my professional experience in designing gender-fair tests.”
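Point 3 can be made concrete with a little arithmetic. A sketch, assuming both groups are normally distributed with equal variance and differ by a standardised mean difference d; the value 0.2 is a made-up "small" effect, not a figure taken from Hyde (2005).

```r
d <- 0.2                              # hypothetical standardised mean difference

# Probability that a randomly chosen member of the higher-scoring group
# outscores a randomly chosen member of the other group
p_superiority <- pnorm(d / sqrt(2))   # roughly 0.56

# Proportion of the lower-mean group scoring above the higher group's mean
p_above_other_mean <- pnorm(-d)       # roughly 0.42
```

Under these assumptions, choosing on group membership alone is barely better than tossing a coin.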

Blinkhorn, S. (2005). Intelligence: a gender bender. Nature, 438, 31-32.

Hill, V. (2005). Through the Past Darkly: A Review of the British Ability Scales. Second Edition. Child and Adolescent Mental Health, 10, 87-98.

Hyde, J. S. (2005). The gender similarities hypothesis. American Psychologist, 60, 581-592.

Maguire, E. A.; Gadian, D. G.; Johnsrude, I. S.; Good, C. D.; Ashburner, J.; Frackowiak, R. S. & Frith, C. D. (2000). Navigation-related structural change in the hippocampi of taxi drivers. Proceedings of the National Academy of Sciences of the United States of America, 97, 4398-4403

* Women are presumably better at giving birth, with a large effect size.

Meta-review (and another study) of unconscious decision making

April 25, 2008 by Andy

See over here: Acker, F. (2008). New findings on unconscious versus conscious thought in decision making: additional empirical data and meta-analysis. Judgment and Decision Making, 3(4), 292-303.

From the abstract: “[...] there is little evidence for an advantage to normative decision making using unconscious thought. However, a discussion of potential moderators shows that further study would help to identify situations in which unconscious thought is truly helpful and those in which it is not.”

Hints of individual differences! How cool!

See also here.

A couple of properties of correlation

April 24, 2008 by Andy

Spotted these in Langford, E.; Schwertman, N. & Owens, M. (2001) [Is the Property of Being Positively Correlated Transitive? The American Statistician, 55, 322-325.]

1. Let U, V, and W be independent random variables. Define X = U+V, Y = V+W, and Z = W-U. Then the correlation between X and Y is positive, Y and Z is positive, but the correlation between X and Z is negative.

It’s easy to see why. X and Y both contain V, each with a different, uncorrelated noise term (U and W respectively). Y and Z have W in common, again with different noise terms. But X and Z share U with opposite signs: X is U plus some noise and Z is -U plus some noise which is uncorrelated with the noise in X.
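A quick simulation of the first property, as a sketch: standard normal U, V and W are an arbitrary choice (any independent variables with non-zero variance should do), and the seed is arbitrary too.

```r
set.seed(2008)                       # arbitrary seed
n_sim <- 1e5
U <- rnorm(n_sim)
V <- rnorm(n_sim)
W <- rnorm(n_sim)

X <- U + V
Y <- V + W
Z <- W - U

r_xy <- cor(X, Y)                    # Cov(X, Y) =  Var(V), so positive
r_yz <- cor(Y, Z)                    # Cov(Y, Z) =  Var(W), so positive
r_xz <- cor(X, Z)                    # Cov(X, Z) = -Var(U), so negative
```

With equal variances the population correlations are 0.5, 0.5 and -0.5, so the simulated values should land close to those.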

2. If X, Y, and Z are random variables, X and Y are positively correlated (call the coefficient r_1), Y and Z are positively correlated (r_2), and r_1^2 + r_2^2 > 1, then X and Z are positively correlated.

And… hmm… I’m not sure why this holds.
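One possible explanation, as a sketch: assume r_1 and r_2 are both positive and write r_3 for the correlation between X and Z (a label introduced here). The 3 x 3 correlation matrix of X, Y and Z has to be positive semi-definite, so its determinant cannot be negative, and that pins r_3 inside an interval around r_1 r_2.

```latex
\begin{vmatrix} 1 & r_1 & r_3 \\ r_1 & 1 & r_2 \\ r_3 & r_2 & 1 \end{vmatrix}
  = 1 + 2 r_1 r_2 r_3 - r_1^2 - r_2^2 - r_3^2 \ge 0

\Rightarrow\; (r_3 - r_1 r_2)^2 \le (1 - r_1^2)(1 - r_2^2)

\Rightarrow\; r_3 \ge r_1 r_2 - \sqrt{(1 - r_1^2)(1 - r_2^2)}

r_1 r_2 > \sqrt{(1 - r_1^2)(1 - r_2^2)}
  \iff r_1^2 r_2^2 > 1 - r_1^2 - r_2^2 + r_1^2 r_2^2
  \iff r_1^2 + r_2^2 > 1
```

So when r_1^2 + r_2^2 > 1 the lower bound on r_3 is itself positive. Geometrically, if X and Y point in nearly the same direction, and so do Y and Z, then X and Z cannot be more than a right angle apart.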

    Nice quote on how perception determines your world

    April 11, 2008 by Andy

    From a lovely article by John Hull in New Statesman.

    “Sighted people, for the most part, do not recognise themselves as sighted. What I mean is that they seldom appreciate the extent to which they live in a world which is a projection from their sighted bodies. This leads to the common mistake of thinking that one’s own world is actually the only world, and so sighted people tend to unconsciously look upon those who are not in their sighted world as being without any world, and thus to be pitied. It took me a long time to realise that blindness is actually a world, a distinctive human way of living and being.”

    “Conscious” reasoning

    April 4, 2008 by Andy

    Read this:

    For us, however, a key difference is that only conscious reasoning can make use of working memory to hold intermediate conclusions, and accordingly reason in a recursive way (Johnson-Laird, 2006, p. 69): primitive recursion, by definition, calls for a memory of the results of intermediate computations (Hopcroft & Ullman, 1979). [... example task omitted ...] The non-recursive processes of intuition cannot make this inference, but when we deliberate about it consciously, we grasp its validity (Cherubini & Johnson-Laird, 2004). Conscious reasoning therefore has a greater computational power than unconscious reasoning, and so it can on occasion overrule our intuitions.

    There’s no evidence that whatever bits of memory intuition uses cannot do recursion. Hunting through semantic memory structures can be viewed as a recursive process, and that process is not (or at least not always) accessible to consciousness. Aside from this, you can impose recursion on just about any process you care to analyse, and you can often remove recursion from a process description depending on what primitives are available. Questioning whether a process “is” or “isn’t” recursive isn’t a healthy activity. Also the jump from “recursive” to “primitive recursive”, as if they were one and the same, is deeply confusing. See the Stanford Encyclopedia of Philosophy for details of other flavours of recursion.

    Bucciarelli, M.; Khemlani, S. & Johnson-Laird, P. N. (2008). The psychology of moral reasoning. Judgment and Decision Making, 3, 121-139

    Note to self

    April 4, 2008 by Andy

    If you decide to write a review of some corner of the literature, please mention effect sizes (but note this) to maximise its utility.