by Frank Sulloway


The critical relevance of birth-order research to Harris's controversial thesis about the family is best understood in connection with the cumulative findings of behavioral geneticists. In studies of twins raised together and apart, researchers have shown that about 40 percent of the variance in personality traits is attributable to genetics. Another 35 percent of the variance is attributable to the nonshared environment (that is, experiences that are not shared by siblings who have grown up together). Of the remaining variance in personality, about 20 percent is associated with errors in measurement, which leaves just 5 percent that can be explained by the shared environment (or family milieu).

One of the most important unanswered questions arising from these behavioral genetics findings is the precise nature of the nonshared environment, which constitutes the lion's share of all environmental influences. The nonshared environment has two main sources: family microenvironments and those extrafamilial experiences, including peer group influences, that are not shared by siblings. By proclaiming that personality is primarily shaped by peer groups, Harris is forced to deny an overwhelming body of evidence that supports the influence of family microenvironments on personality. The single most convincing threat to Harris's extreme thesis is the voluminous research on birth order. Because there are no genes for being a firstborn or a laterborn, birth-order effects must be attributed to differences in within-family environments. (1)

At first glance, the literature on birth order does not appear to provide encouraging support for the influence of family microenvironments. In their important 1983 book, two Swiss investigators--Cecile Ernst and Jules Angst--undertook a comprehensive review of more than a thousand studies on this subject that had appeared between 1946 and 1980. Ernst and Angst argued that when studies are controlled for background factors that often confound results (and lead to spurious conclusions), birth-order differences are not generally observed. They concluded that birth-order effects are mostly artifacts of uncontrolled background influences, principally differences in social class and sibship size.

Relying on Ernst and Angst's critical verdict, Harris reiterated this viewpoint in her 1995 Psychological Review article, which outlined the basic thesis of her new book. Unfortunately, Ernst and Angst's literature review was impressionistic--that is, conducted without the benefit of meta-analytic techniques, which, in the early 1980s, were beginning to be employed in the biomedical and social sciences. In the Preface to their 1983 book, these two authors regretted that they had not taken advantage of these newer methods, which have shown themselves, in the interim, to be far more reliable than impressionistic forms of review.

In an article published in 1995, I performed what is called a "vote counting" meta-analysis of the birth-order data summarized in Ernst and Angst's book. In this form of meta-analysis, the number of significant findings in the published literature is compared with the number of null outcomes. For those studies reported as being controlled for social class or sibship size, I found a confirmation rate of 37 percent for 196 controlled findings. To the nonspecialist, such a modest confirmation rate might seem lackluster, but it is actually statistically impressive because the expected hit rate is nowhere near 100 percent. Given the median size of birth-order studies (about 250 subjects) and assuming that birth-order effects on personality are roughly of the same magnitude as those observed for age and sex, the best one could possibly hope for is a confirmation rate of about 50-60 percent. By contrast, if there are no true birth-order effects, the expected confirmation rate is only 2.5 percent (based on a two-tailed statistical test). Thus, the observed confirmation rate for birth-order effects is about 15 times the expected rate and leads to a very different conclusion than Ernst and Angst themselves reached. These impressive meta-analytic findings represent the Achilles heel of Harris's argument about personality development, which explains why Harris is so concerned about repudiating these findings.
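
The arithmetic behind this comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the 1995 article: the count of confirming findings is approximated as 37 percent of 196 (the exact count is not given here), the findings are treated as independent trials, and the helper name `binomial_upper_tail` is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_findings = 196       # controlled findings, from the text
observed_rate = 0.37   # confirmation rate, from the text
null_rate = 0.025      # expected rate if there are no true effects, from the text

# Assumption: approximate the number of confirmations from the rounded rate.
confirmations = round(observed_rate * n_findings)

# Number of confirmations expected by chance alone.
expected_under_null = null_rate * n_findings

# How many times larger the observed rate is than the chance rate.
rate_ratio = observed_rate / null_rate

# Assumption: findings are independent, so a binomial model applies.
tail_probability = binomial_upper_tail(confirmations, n_findings, null_rate)

print(f"confirmations (approx.): {confirmations}")
print(f"expected under the null: {expected_under_null:.1f}")
print(f"observed / expected rate: {rate_ratio:.1f}")
print(f"P(at least this many by chance): {tail_probability:.3g}")
```

Because several findings can come from the same study, the independence assumption is generous, so the tail probability should be read as a rough indication of scale rather than an exact significance level.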


In my 1996 book Born to Rebel, I summarized my previous meta-analytic results in the context of a Darwinian theory of personality development based on sibling competition. It is relevant that Harris's own response to my book was inspired by a review in Science magazine, by the historian John Modell, who remarked that he had been unable to replicate my meta-analytic totals. When I contacted Modell in order to find out exactly why he had been unable to replicate my results, it became apparent that Modell had overlooked a crucial footnote at the bottom of the table in which I presented my meta-analytic findings. In this footnote, I explicitly stated that I had tallied my results in terms of individual "findings" rather than "studies." Because Modell counted studies--ignoring multiple findings in the same study--he naturally obtained different totals. Modell subsequently acknowledged his mistake to me (personal communication).

Meta-analytic tallies in terms of "studies" make no sense. Findings are what matter. This is especially true given the goals of my own meta-analysis, which sought to test specific hypotheses about sibling strategies in terms of the Big Five personality dimensions. For example, I expected firstborns to be more conscientious than laterborns, and I expected laterborns to be more agreeable and open to experience than firstborns. Each birth-order study may report multiple findings relevant to each of these different hypotheses. A single study may therefore confirm one hypothesis and refute others. In addition, a study with multiple findings should not be given the same weight as a study containing only a single finding. Thus Modell's decision to count studies rather than findings was an inherently bad idea.

Unfortunately, Judith Harris followed in Modell's methodological footsteps, basing her own meta-analytic counts on "studies." She did so in spite of being fully aware that my own counts were by "findings." Additionally, she made no effort to ascertain how these two alternative methods of counting might differ in their outcomes. Harris subsequently submitted her meta-analytic results to two mainstream psychology journals, both of which rejected her manuscripts. As a reviewer for the second of these two journals, I first became aware of the discussion and arguments that are now presented, in much the same form, in Harris's EDGE commentary and as Appendix 1 of her book.

Not only does counting by studies lead to different results (a fact that Harris eagerly exploited as part of her critique), but it also distorts the ratio between confirmations and refutations. If, for example, a given study reported that firstborns are more conscientious than laterborns, but also more agreeable, I counted one confirmation and one refutation (in accordance with my formal hypotheses for these two dimensions). By contrast, Harris classified such "mixed" results as a single null outcome. Harris's method of counting "mixed" results as nulls tends to underestimate the number of confirming findings and to overestimate the number of nulls, skewing the results in favor of her own theoretical biases. By contrast, Harris's equally inappropriate procedure of counting interaction effects as single positive or negative outcomes has the opposite consequence. (An interaction effect occurs, for example, when birth-order effects hold for men but not for women.) It would be nice to think that these two sources of errors cancel one another out, but Harris did not discuss this issue.
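
The difference between the two counting methods can be illustrated with a small Python sketch. Everything here is hypothetical: the four example studies are invented for illustration, and the rule in `classify_study` is a simplified stand-in for study-level counting (a study with both a confirmation and a refutation becomes a single null), not a reconstruction of anyone's actual tallies. Interaction effects are not modeled.

```python
from collections import Counter

# Hypothetical data: each inner list holds the findings reported by one study.
studies = [
    ["confirm", "refute"],           # "mixed" study
    ["confirm", "confirm", "null"],  # several findings in one study
    ["null"],
    ["confirm"],
]


def tally_by_findings(studies):
    """Count every individual finding, regardless of which study it came from."""
    return Counter(outcome for study in studies for outcome in study)


def classify_study(findings):
    """Assumed study-level rule: mixed results collapse to a single null."""
    has_confirm = "confirm" in findings
    has_refute = "refute" in findings
    if has_confirm and has_refute:
        return "null"
    if has_confirm:
        return "confirm"
    if has_refute:
        return "refute"
    return "null"


def tally_by_studies(studies):
    """Count one outcome per study, using the assumed rule above."""
    return Counter(classify_study(study) for study in studies)


def confirmation_rate(tally):
    """Share of tallied units (findings or studies) that are confirmations."""
    return tally["confirm"] / sum(tally.values())


by_findings = tally_by_findings(studies)
by_studies = tally_by_studies(studies)

print("by findings:", dict(by_findings), round(confirmation_rate(by_findings), 3))
print("by studies: ", dict(by_studies), round(confirmation_rate(by_studies), 3))
```

In this invented example the mixed study contributes one confirmation and one refutation when findings are counted, but only a null when studies are counted, so the refutation disappears from the study-level tally and the confirmation rate drops.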

There is another reason why Modell and Harris both obtained differing meta-analytic totals from my own: unbeknownst to them, their tallies were riddled with errors. Neither of these two investigators consulted the original literature, relying instead on Ernst and Angst's (1983) summaries of the birth-order research through 1980. It is customary for researchers performing a meta-analysis of a specific literature to actually read the original literature. Accordingly, before I undertook my own meta-analysis, I examined more than two hundred of the original publications in an effort to verify Ernst and Angst's tabulations. In the process, I found at least 45 errors and inconsistencies, which I corrected before tallying my results. (A formal compilation of these errors is available from the author.) Last January, I sent a complete list of these errors to Judith Harris, indicating the publications involved, the specific nature of the reporting errors, and the pages in Ernst and Angst's book where the reporting errors occur. During the seven months between her receipt of this list of errors and the publication of her book, Harris made no attempt to verify these inaccuracies or to implement the necessary corrections in her own tallies. Instead, she has published her original tallies in unaltered form as part of her critique of my own meta-analysis. She has also withheld from her readers the extent of these errors, as well as her own prior knowledge of this information. In science, the knowing publication of erroneous data is considered serious misconduct. (2)

Given Harris's inappropriate method of counting "studies" rather than "findings," as well as her refusal to correct her counts for the 40-odd errors that I identified in her overall tallies, it is hardly surprising that Harris and I reached differing totals in our respective meta-analytic counts. In spite of these errors, Harris's confirmation rate was only modestly lower than mine (29 percent, for 179 studies, versus 37 percent for my total of 196 findings). Moreover, her tallies--distorted as they were--did not support her theoretical position, which becomes apparent when statistical tests are applied to her results.

