This e-book is a work in progress. Chapters will appear sporadically. Parts of it are from the draft of a book being prepared for the Springer series on using R; other parts are just interesting tidbits that would not be appropriate as chapters.
It is written in the hope that I can instill in a new generation of psychologists the love for quantitative methodology imparted to me by reading the popular and then later the scientific texts of Ray Cattell [Cattell, 1966b] and Hans Eysenck [Eysenck, 1964, Eysenck, 1953, Eysenck, 1965]. Those Penguin and Pelican paperbacks by Cattell and Eysenck were the first indications that I had that it was possible to study personality and psychology with a quantitative approach.
My course in psychometric theory, on which much of this book is based, was inspired by a course of the same name by Warren Norman. The organizational structure of this text owes a great deal to the structure of Warren's course. Warren introduced me, as well as a generation of graduate students at the University of Michigan, to the role of theory and measurement in the study of psychology. He introduced me to the "bible" of psychometrics: Jum Nunnally's Psychometric Theory [Nunnally, 1967].
The students in my psychometric theory classes over the years, by their continuing questions and sometimes confusion, have given me the motivation to try to make this text as understandable and useful as I can. The members of the Society of Multivariate Experimental Psychology, by their willingness to share cutting (and sometimes bleeding) edge ideas freely and with respect for alternative interpretations, have been a never-ending source of new and exciting ideas.
This book would not be possible without the amazing contributions of the R-Core Team and the many contributors to R and the R-Help listserv.
Psychometrics is that area of psychology that specializes in how to measure what we talk and think about. It is how to assign numbers to observations in a way that best allows us to summarize our observations in order to advance our knowledge. Although in particular it is the study of how to measure psychological constructs, the techniques of psychometrics are applicable to most problems in measurement. The measurement of intelligence, extraversion, severity of crimes, or even batting averages in baseball are all grist for the psychometric mill. Any set of observations that are not perfect exemplars of the construct of interest is open to questions of reliability and validity and to psychometric analysis.
Although it is possible to make the study of psychometrics seem dauntingly difficult, in fact the basic concepts are straightforward. This text is an attempt to introduce the fundamental concepts in psychometric theory so that the reader will be able to understand how to apply them to real data sets of interest. It is not meant to make one an expert, but merely to instill confidence and an understanding of the fundamentals of measurement so that the reader can better understand and contribute to the research enterprise.
At first glance, it would seem that there is an infinite number of ways to collect data. Measuring the diameter of the earth by finding the distance to the horizon, measuring the height of waves produced by a nuclear blast by nailing (empty) beer cans to a palm tree, or finding Avogadro's number by dropping oil into water are techniques that do not require great sophistication in the theory of measurement. In psychology we can use self report, peer ratings, reaction times, or psychophysiological measures such as the electroencephalogram (EEG), the basal level of Skin Conductance (SC), or the Galvanic Skin Response (GSR). We can measure the number of voxels showing activation greater than some threshold in a functional Magnetic Resonance Image (fMRI), or we can measure lifetime risk of cancer, length of life, risk of mortality, etc. Indeed, the basic forms of data we can collect are probably unlimited. But in fact, it is possible to organize these disparate forms of data abstractly, in terms of what is being measured and in comparison to what.
For the experimentalist, the problem becomes interpreting the effect of an experimental manipulation upon some outcome variable in terms of the effect of the manipulation on the latent outcome variable and the relationship between the latent and observed outcome variables. For the observationalist, the observed correlation between the observed person variable and the observed outcome variable is interpreted as a function of three relationships: that between the latent person trait variable and the observed trait variable, that between the latent outcome variable and the observed outcome variable, and, most importantly for inference, that between the two latent variables.
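The observationalist's case can be illustrated with a small simulation. The sketch below is not taken from the text: it assumes standardized, normally distributed variables, a linear model, and purely illustrative values for the latent correlation (0.6) and for the two latent-to-observed relationships (0.8 and 0.7). Under those assumptions the observed correlation should be close to the product of the three relationships, and therefore smaller than the correlation between the latent variables. It is written in plain Python using only the standard library.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=20000, latent_r=0.6, load_x=0.8, load_y=0.7, seed=42):
    """Simulate latent and observed person/outcome variables.

    All parameter values are illustrative assumptions, not estimates
    from any real data set.
    """
    rng = random.Random(seed)
    trait, outcome, obs_x, obs_y = [], [], [], []
    for _ in range(n):
        t = rng.gauss(0, 1)  # latent person trait
        # latent outcome, correlated latent_r with the latent trait
        o = latent_r * t + math.sqrt(1 - latent_r ** 2) * rng.gauss(0, 1)
        # observed scores = loading * latent score + random error
        x = load_x * t + math.sqrt(1 - load_x ** 2) * rng.gauss(0, 1)
        y = load_y * o + math.sqrt(1 - load_y ** 2) * rng.gauss(0, 1)
        trait.append(t)
        outcome.append(o)
        obs_x.append(x)
        obs_y.append(y)
    return pearson(trait, outcome), pearson(obs_x, obs_y)


latent_corr, observed_corr = simulate()
expected = 0.8 * 0.6 * 0.7  # product of the three relationships

print("latent correlation:   ", round(latent_corr, 3))
print("observed correlation: ", round(observed_corr, 3))
print("product of the paths: ", round(expected, 3))
```

The point of the sketch is only that an observed correlation understates the latent relationship whenever either observed variable is an imperfect indicator of its latent variable.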
Parsimony of description has been a goal of science since at least the famous dictum commonly attributed to William of Ockham to not multiply entities beyond necessity. The goal of parsimony is seen in psychometrics as an attempt either to describe (components) or to explain (factors) the relationships between many observed variables in terms of a more limited set of components or latent factors.
The typical data matrix represents multiple items or scales usually thought to reflect fewer underlying constructs. At its simplest, a set of items can be thought of as representing random samples from one underlying domain or perhaps a small set of domains. The question for the psychometrician is how many domains are represented and how well each item represents the domains. Solutions to this problem are examples of factor analysis (FA), principal components analysis (PCA), and cluster analysis (CA). All of these procedures aim to reduce the complexity of the observed data. In the case of FA, the goal is to identify fewer underlying constructs to explain the observed data. In the case of PCA, the goal can be mere data reduction, but the interpretation of components is frequently done in terms similar to those used when describing the latent variables estimated by FA. Cluster analytic techniques, although usually used to partition the subject space rather than the variable space, can also be used to group variables to reduce the complexity of the data by forming fewer and more homogeneous sets of tests or items.
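As a rough illustration of data reduction, the sketch below extracts the first principal component of a small correlation matrix by power iteration. The matrix is hypothetical: four items that all intercorrelate 0.5, as if they were sampled from a single domain. This is a minimal sketch of the component (descriptive) approach only, not of factor analysis, and the function name `first_component` is introduced here purely for illustration.

```python
import math


def first_component(R, iterations=100):
    """Largest eigenvalue and eigenvector of a symmetric matrix R,
    found by simple power iteration."""
    n = len(R)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-uniform start
    for _ in range(iterations):
        w = [sum(R[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit-length vector v gives the eigenvalue
    eigenvalue = sum(
        v[i] * sum(R[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    return eigenvalue, v


# Hypothetical correlation matrix: 4 items, every pair correlating 0.5
r = 0.5
R = [[1.0 if i == j else r for j in range(4)] for i in range(4)]

eigenvalue, weights = first_component(R)
proportion = eigenvalue / len(R)  # share of total variance described

print("first eigenvalue:", round(eigenvalue, 3))        # 2.5
print("proportion of variance:", round(proportion, 3))  # 0.625
print("item weights:", [round(w, 3) for w in weights])  # 0.5 each
```

For this matrix the first eigenvalue is 1 + 3(0.5) = 2.5, so a single component describes 62.5% of the variance of the four items, with every item weighted equally.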
Error may be both random as well as systematic. Random error reflects trial by trial variability due to unknown sources while systematic error may reflect situational or individual effects that may be specified. Perhaps the classic example of systematic error (known as the personal equation) is the analysis of individual differences in reaction time in making astronomical observations. "The personal equation of an observer is the interval of time which habitually intervenes between the actual and the observed transit of a star..." (Rogers, 1869). Before systematic individual differences were analyzed, the British astronomer Maskelyne fired his assistant Kinnebrook for making measurements that did not agree with his own (Stigler, 1986). Subsequent investigations of the systematic bias (Safford, 1898; Sanford, 1889) showed consistent individual differences as well as the effect of situational manipulations such as hunger and sleep deprivation (Rogers, 1869).
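The distinction between the two kinds of error can be sketched with a toy simulation of two observers timing the same star transits. Everything here is an assumption made for illustration: the constant bias of 0.8 seconds for the second observer, the random error standard deviation of 0.1 seconds, and the helper name `observe`. Averaging over many observations removes the random error but leaves the systematic difference between observers intact.

```python
import random
import statistics


def observe(true_times, bias, noise_sd, rng):
    """Recorded times = true time + constant personal bias + random error."""
    return [t + bias + rng.gauss(0, noise_sd) for t in true_times]


rng = random.Random(1)
true_times = [10.0 * i for i in range(500)]  # hypothetical transit times (s)

# Illustrative values only: observer B is habitually 0.8 s late
obs_a = observe(true_times, bias=0.0, noise_sd=0.1, rng=rng)
obs_b = observe(true_times, bias=0.8, noise_sd=0.1, rng=rng)

diffs = [b - a for a, b in zip(obs_a, obs_b)]
mean_diff = statistics.mean(diffs)  # estimates the systematic difference
sd_diff = statistics.stdev(diffs)   # reflects only the random error

print("mean difference between observers:", round(mean_diff, 3))
print("sd of the differences:", round(sd_diff, 3))
```

The mean difference estimates the difference between the two personal equations, while the standard deviation of the differences reflects the combined random error of the two observers (about 0.14 seconds under these assumptions).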