Categorization of humans in biomedical research: genes, race and disease
A forceful presentation of the second point - that racial differences are merely cosmetic - was given recently in an editorial in the New England Journal of Medicine [1]: "Such research mistakenly assumes an inherent biological difference between black-skinned and white-skinned people. It falls into error by attributing a complex physiological or clinical phenomenon to arbitrary aspects of external appearance. It is implausible that the few genes that account for such outward characteristics could be meaningfully linked to multigenic diseases such as diabetes mellitus or to the intricacies of the therapeutic effect of a drug."
The logical flaw in this argument is the assumption that the blacks and whites in the referenced study differ only in skin pigment. Racial categorizations have never been based on skin pigment, but on indigenous continent of origin. For example, none of the population genetic studies cited above, including the study of Wilson et al. [2], used the skin pigment of the study subjects, or genetic loci related to skin pigment, as predictive variables. Yet the various racial groups were easily distinguishable on the basis of even a modest number of random genetic markers; furthermore, the categorization is highly robust to the type of marker used (for example, RFLPs, microsatellites or SNPs).
Genetic differentiation among the races has also led to some variation in pigmentation across races, but considerable variation within races remains, and there is substantial overlap for this feature. For example, it would be difficult to distinguish most Caucasians and Asians on the basis of skin pigment alone, yet they are easily distinguished by genetic markers. The author of the above statement [1] errs in assuming that the only genetic differences between races, which may differ on average in pigmentation, lie in the genes that determine pigmentation.
In conjunction with Table 2, we can estimate that about 120 unselected SNPs or 20 highly selected SNPs can distinguish CA from NA, AA from AS and AA from NA. Separating CA from AA, CA from AS and AS from NA requires a few hundred random SNPs, or about 40 highly selected loci. STRP loci are more powerful and have higher effective δ values because they have multiple alleles. Table 3 reveals that fewer than 100 random STRPs, or about 30 highly selected loci, can distinguish the major racial groups. As expected, differentiating Caucasians and Hispanic Americans, who are admixed but mostly of Caucasian ancestry, is more difficult and requires a few hundred random STRPs or about 50 highly selected loci. These results also indicate that many hundreds of markers or more would be required to accurately differentiate more closely related groups, for example populations within the same racial category.
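The dependence of the required number of markers on δ can be illustrated with a small calculation. The sketch below is not the procedure used to produce Tables 2 and 3; it is a toy model under stated assumptions: δ is taken to be the absolute difference in allele frequency between two groups, every locus is biallelic with frequencies placed symmetrically around 0.5, loci are independent, and individuals are assigned to the group with the higher likelihood. The function names `misclassification_rate` and `loci_needed`, and the 0.1% error threshold, are hypothetical choices made for illustration.

```python
import math


def misclassification_rate(n_loci: int, delta: float) -> float:
    """Exact assignment error under a toy two-group model (an assumption,
    not the source's method).

    Each of n_loci independent biallelic loci has allele frequency
    0.5 + delta/2 in group 1 and 0.5 - delta/2 in group 2. With these
    symmetric frequencies, the likelihood rule assigns a diploid individual
    to group 1 when more than half of its 2 * n_loci allele copies are the
    allele that is commoner in group 1; ties are split evenly.
    """
    if n_loci < 1 or not 0.0 <= delta < 1.0:
        raise ValueError("need n_loci >= 1 and 0 <= delta < 1")
    p = 0.5 + delta / 2.0
    copies = 2 * n_loci

    def log_pmf(k: int) -> float:
        # Log of the Binomial(copies, p) probability of k copies,
        # computed in log space to avoid overflow for large n_loci.
        return (math.lgamma(copies + 1) - math.lgamma(k + 1)
                - math.lgamma(copies - k + 1)
                + k * math.log(p) + (copies - k) * math.log(1.0 - p))

    # A group 1 individual is misassigned when it carries fewer than
    # n_loci copies, and half of the time when it carries exactly n_loci.
    wrong = sum(math.exp(log_pmf(k)) for k in range(n_loci))
    tie = math.exp(log_pmf(n_loci))
    return wrong + 0.5 * tie


def loci_needed(delta: float, max_error: float = 0.001,
                max_loci: int = 2000) -> int:
    """Smallest number of loci whose error rate is at most max_error."""
    for n in range(1, max_loci + 1):
        if misclassification_rate(n, delta) <= max_error:
            return n
    raise ValueError("threshold not reached within max_loci")


if __name__ == "__main__":
    for d in (0.2, 0.3, 0.4, 0.5):
        print(f"delta={d:.1f}: loci needed = {loci_needed(d)}")
```

Under this model the number of loci needed falls steeply as δ grows, which is consistent with the pattern described above: highly selected markers with large δ achieve the same separation as a much larger set of random markers. The absolute numbers depend on the simplifying assumptions and should not be read as a reproduction of the tabulated estimates.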