All scientific theories require measurement of the constructs underlying the field. Personality theories are no different. Whether we are developing theories of species-typical behavior, of individual differences in behavior, or of unique patterns of thoughts and feelings, we need to be able to measure the responses in question. The fields of psychometrics and personality assessment are devoted to the study of the measurement of psychological constructs associated with personality.
Consider the case of differences in vocabulary in a particular language (e.g., English). Although it is logically possible to organize people in terms of the specific words they know in English, the more than 2^(500,000) possible response patterns that could be found by quizzing people on each of the more than 500,000 words in English introduce more complexity rather than less. Classical Test Theory (CTT) ignores individual response patterns and estimates an individual's total vocabulary size by measuring performance on small samples of words. Words are seen as random replicates of each other, and thus individual differences in total vocabulary size are estimated from observed differences on these smaller samples. The Pearson Product Moment Correlation Coefficient (r) compares the degree of covariance between these samples with the variance within samples. As the number of words sampled increases, the correlation of individual differences between samples, and between each sample and the total domain, increases accordingly.
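The effect of sampling more words can be sketched with the Spearman-Brown formula from CTT. This is an illustration under stated assumptions, not something taken from the text above: the function names are hypothetical, the words are treated as parallel items with a common average inter-item correlation, and the value 0.1 for that correlation is made up.

```python
import math


def spearman_brown(avg_inter_item_r: float, n_items: int) -> float:
    """Reliability of a score based on n_items parallel items.

    avg_inter_item_r is the (assumed) average correlation between items.
    """
    if n_items < 1:
        raise ValueError("n_items must be at least 1")
    return n_items * avg_inter_item_r / (1 + (n_items - 1) * avg_inter_item_r)


def correlation_with_domain(avg_inter_item_r: float, n_items: int) -> float:
    """Expected correlation of a sample score with the domain (true) score.

    Under CTT this is the square root of the reliability.
    """
    return math.sqrt(spearman_brown(avg_inter_item_r, n_items))


if __name__ == "__main__":
    # Hypothetical average inter-item correlation of 0.1.
    for k in (1, 10, 50, 200):
        rel = spearman_brown(0.1, k)
        r_domain = correlation_with_domain(0.1, k)
        print(f"{k:4d} words: reliability {rel:.3f}, r with domain {r_domain:.3f}")
```

Both quantities rise toward 1 as the number of sampled words grows, which is the pattern the paragraph describes.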
Estimates of ability based upon Item Response Theory (IRT) take into account parameters of the words themselves (i.e., the difficulty and discriminability of each word) and estimate a single ability parameter for each individual. Although CTT and IRT estimates are highly correlated, CTT statistics are based on decomposing the sources of variance within and between individuals, while IRT statistics focus on the precision of an individual estimate without requiring differences between individuals. CTT estimates of the reliability of ability measures are assessed across similar items (internal consistency), across alternate forms, across different forms of assessment, and over time (stability). Tests are reliable to the extent that differences within individuals are small compared to those between individuals when generalizing across items, forms, or occasions. CTT reliability thus requires between-subject variability. IRT estimates, on the other hand, are concerned with the precision of measurement for a particular person in terms of a metric defined by item difficulty.
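One common way to express difficulty and discriminability is the two-parameter logistic (2PL) model. The sketch below assumes that model, which the text does not name, and uses hypothetical function names and made-up item parameters. It shows how the precision of an ability estimate for a single person depends on how well the item difficulties match that person's ability, with no reference to other people.

```python
import math


def p_correct(theta: float, a: float, b: float) -> float:
    """2PL probability of a correct response.

    theta: person ability, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of one 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def standard_error(theta: float, items: list) -> float:
    """Standard error of the ability estimate at theta.

    items is a list of (a, b) pairs; information adds across items.
    """
    total = sum(item_information(theta, a, b) for a, b in items)
    return 1.0 / math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical (discrimination, difficulty) pairs for five words.
    words = [(1.2, -1.0), (1.5, -0.5), (1.0, 0.0), (1.5, 0.5), (1.2, 1.0)]
    for theta in (-3.0, 0.0, 3.0):
        print(f"theta {theta:+.1f}: SE {standard_error(theta, words):.3f}")
```

With these made-up items, the standard error is smallest for abilities near the middle of the difficulty range and grows for people far above or below it.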
The test theory developed to account for sampling differences within domains can be generalized to account for differences between domains. Just as different samples of words will yield somewhat different estimates of vocabulary, different cognitive tasks (e.g., vocabulary and arithmetic performance) will yield different estimates of performance. Using multivariate procedures such as Principal Components Analysis or Factor Analysis, it is possible to decompose the total variation into between-domain covariance, within-domain covariance, and within-domain variance. One of the most replicable observations in the study of individual differences is that almost all tests thought to assess cognitive ability have a general factor (g) that is shared with other tests of ability. That is, although each test has specific variance associated with content (e.g., linguistic, spatial), form of administration (e.g., auditory, visual), or operations involved (e.g., perceptual speed, memory storage, memory retrieval, abstract reasoning), there is general variance that is common to all tests of cognitive ability.
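As a rough illustration of general variance shared across tests, the sketch below extracts the first principal component of a correlation matrix by power iteration. Several things here are assumptions: the correlation values are invented, the function name is hypothetical, and a first principal component is only a crude stand-in for g, since a factor analysis would model each test's specific variance separately.

```python
import math


def first_principal_component(R: list, iterations: int = 200):
    """Return (eigenvalue, loadings) of the largest component of R.

    R is a square correlation matrix given as a list of rows.
    Uses plain power iteration starting from a uniform vector.
    """
    n = len(R)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(R[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit-length vector v.
    eigenvalue = sum(
        v[i] * sum(R[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    loadings = [math.sqrt(eigenvalue) * x for x in v]
    return eigenvalue, loadings


if __name__ == "__main__":
    # Invented correlations among four hypothetical ability tests.
    tests = ["vocabulary", "arithmetic", "spatial", "memory"]
    R = [
        [1.0, 0.5, 0.4, 0.3],
        [0.5, 1.0, 0.4, 0.3],
        [0.4, 0.4, 1.0, 0.3],
        [0.3, 0.3, 0.3, 1.0],
    ]
    eigenvalue, loadings = first_principal_component(R)
    print(f"share of total variance: {eigenvalue / len(R):.2f}")
    for name, loading in zip(tests, loadings):
        print(f"{name:12s} loading {loading:.2f}")
```

Because every correlation in the invented matrix is positive, every test loads positively on the first component, which mirrors the observation that ability tests share common variance.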
A guide to R for the personality researcher, as well as a package of functions particularly suited for personality measurement, is now part of the personality project. The R suite of programs includes many functions useful for the personality researcher, including factor analysis, structural equation modeling, and multidimensional scaling. The psych package includes basic tools for scale construction and analysis, including finding basic descriptive statistics, using the Very Simple Structure (VSS) criterion for determining the optimal number of factors, cluster analysis of items using the ICLUST algorithm, hierarchical factor analysis with Schmid Leiman transformations, and procedures for estimating alternative measures of test reliability (i.e., alpha, beta, and omega).
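To make one of these reliability measures concrete, here is a minimal sketch of coefficient alpha computed from raw item scores. It is written in Python with only the standard library and is not the psych package's implementation; the function name and the example scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores: list) -> float:
    """Coefficient alpha for a people-by-items table of scores.

    scores is a list of rows, one per person, each a list of item scores.
    """
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    if len(scores) < 2:
        raise ValueError("alpha needs at least two people")
    # Sample variance of each item across people.
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of the total (summed) score across people.
    total_var = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Made-up responses of five people to three items.
    responses = [
        [2, 3, 3],
        [4, 4, 5],
        [1, 2, 1],
        [3, 3, 4],
        [5, 4, 5],
    ]
    print(f"alpha = {cronbach_alpha(responses):.2f}")
```

Alpha compares the sum of the item variances with the variance of the total score, so it rises as the items covary more strongly with one another.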
Statistical packages for personality research are also commercially available in such programs as SPSS, SYSTAT, or SAS. Bob Muenchen has developed a comparison of the features of R with SAS and SPSS.
Some useful publications from the APA are available online:
Comments, criticism, suggestions for additions or deletions, etc. should be sent to
William Revelle, Director
Graduate Program in Personality
Department of Psychology