Association analysis: within-sibship sampling variation and solutions.
C. Li, M. Boehnke. Dept Biostatistics, Univ Michigan, Ann Arbor, MI.
In classical case-control studies, we sample independent affected and unaffected individuals
and tally their alleles (or haplotypes) into a 2xk table, where k is the number of alleles.
We then test for independence between the alleles and affection status, or equivalently, for
equal distribution of alleles in the affected and unaffected groups. Often we sample affected
siblings and other relatives of the affected individuals to do linkage analysis for the disease
of interest. The affected relatives' information is not used in case-control studies, resulting
in inefficient use of data. We propose two methods that make efficient use of genotype data from affected
sibships. One is to count all alleles of an affected sibship, but down-weight the sibship so
that its total allele contribution is 2. The efficiency of this method relative to the standard
method is 1.3 for sib pairs and 1.5 for sib trios. Pearson's chi-squared test can be performed
as usual. We also introduce a likelihood ratio statistic and a permutation test. The second method
is to down-weight an affected sibship of size k so that its total contribution is 4k/(k+1)
(Broman 2001 Genet Epidemiol 20:307-315). Under no linkage and no association, the resulting
allele frequency estimates have the smallest variance among all weighted averages of individual
sibship allele frequencies. This second method is slightly more efficient than the first when
sibship sizes vary. However, under tight linkage, it may overcount alleles
for larger sibships and inflate the type I error rate. Analytical and simulation results show
that these methods are more powerful than using just case-control data under a variety of
sibship sizes and disease models. Further, when data on affected siblings are available, selecting
one affected individual per sibship, as the standard method does, is arbitrary and introduces
additional variability. For example, for an allele of frequency 0.05, it is not unusual for the
frequency estimate to vary from 0.039 to 0.061 for 200 sib pairs.
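
The two weighting schemes can be illustrated with a minimal sketch. It assumes genotypes are coded as pairs of allele indices and that the weight of a sibship is spread evenly over its 2k alleles, where k here is the sibship size. The function names, the scheme labels "unit" and "broman", and the data layout are hypothetical choices made for illustration, not the authors' implementation. The Pearson statistic is the textbook formula applied to a 2 x (number of alleles) table of (possibly weighted) counts; the likelihood ratio and permutation tests are not shown.

```python
from fractions import Fraction


def sibship_total_weight(k, scheme):
    """Total allele contribution assigned to an affected sibship of size k."""
    if scheme == "unit":
        # First method: every sibship contributes a total of 2 alleles.
        return Fraction(2)
    if scheme == "broman":
        # Second method: total contribution 4k/(k+1).
        return Fraction(4 * k, k + 1)
    raise ValueError("unknown scheme: %r" % scheme)


def weighted_allele_counts(sibships, n_alleles, scheme):
    """Weighted allele counts over a list of affected sibships.

    sibships: list of sibships; each sibship is a list of genotypes,
              and each genotype is a pair of allele indices in
              range(n_alleles). This layout is an assumption.
    """
    counts = [Fraction(0)] * n_alleles
    for sibs in sibships:
        k = len(sibs)
        # Spread the sibship's total weight evenly over its 2k alleles.
        per_allele = sibship_total_weight(k, scheme) / (2 * k)
        for genotype in sibs:
            for allele in genotype:
                counts[allele] += per_allele
    return counts


def pearson_chi2(table):
    """Pearson chi-squared statistic for a table given as a list of rows."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            if expected == 0:
                continue  # allele never observed; cell contributes nothing
            stat += (obs - expected) ** 2 / expected
    return stat


# One affected sib pair with genotypes (0,1) and (0,0), two alleles.
pair = [[(0, 1), (0, 0)]]
unit_counts = weighted_allele_counts(pair, 2, "unit")
broman_counts = weighted_allele_counts(pair, 2, "broman")
```

For this sib pair the first scheme yields counts summing to 2 and the second yields counts summing to 8/3, matching 4k/(k+1) with k = 2. A weighted row for the affected group can then be placed alongside the unaffected group's allele counts and passed to `pearson_chi2`.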