#Exome sequencing & #genetic basis of complex traits
http://www.nature.com/ng/journal/v44/n6/full/ng.2303.html Key point: the number of rare variants exceeds that predicted by the neutral model
Kiezun A, Garimella K, Do R, Stitziel NO, Neale BM, McLaren PJ, Gupta N, Sklar P, Sullivan PF, Moran JL, Hultman CM, Lichtenstein P, Magnusson P, Lehner T, Shugart YY, Price AL, de Bakker PI, Purcell SM, Sunyaev SR. Exome sequencing and the genetic basis of complex traits. Nature Genetics (2012) 44: 623-630
This article is part review and part research article, focusing on the use of exome sequencing to detect associations between variants and complex traits.
An important point they make, with wide-ranging implications for studying disease, is that the number of rare variants exceeds the number predicted by the neutral model. Figure 1 nicely illustrates this excess of rare variants.
I agree with their statement that the majority of these mutations are not “neutral”. They attribute this excess to population expansion or purifying selection, but another plausible explanation, given that the excess is found in all organisms regardless of demographic history, is linked selection.
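To make "the number predicted by the neutral model" concrete, here is a minimal sketch of the textbook expectation, not the authors' own calculation. It assumes the standard neutral model (constant population size, no selection), under which the expected number of segregating sites with derived-allele count i in a sample of n chromosomes is theta / i. The function names and the default theta are my own, chosen for illustration.

```python
def neutral_sfs(n_chromosomes, theta=1.0):
    """Expected site frequency spectrum under the standard neutral model.

    Returns a list whose element at index i - 1 is the expected number of
    segregating sites carried by exactly i of the n sampled chromosomes,
    i.e. theta / i for i = 1 .. n - 1. theta is the population-scaled
    mutation rate; its default here is an arbitrary illustrative value.
    """
    return [theta / i for i in range(1, n_chromosomes)]


def singleton_fraction(n_chromosomes):
    """Fraction of segregating sites expected to be singletons.

    theta cancels in the ratio, so the result depends only on sample size.
    """
    sfs = neutral_sfs(n_chromosomes)
    return sfs[0] / sum(sfs)


if __name__ == "__main__":
    # 438 diploid individuals correspond to 876 sampled chromosomes.
    print(round(singleton_fraction(876), 3))
```

For 876 chromosomes this expectation puts roughly 14% of segregating sites in the singleton class; an observed spectrum with a clearly larger share of singletons and other very rare variants is what "excess" means here.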
The authors compare statistics derived before and after filtering exome sequencing data from 438 individuals (HIV and schizophrenia data sets), illustrating the importance of filtering in obtaining high-quality calls. WGS (CGI data on 37 individuals) was used as a benchmark for the number of SNPs called in each category (silent, missense, nonsense).
They then analyze the effect of population stratification on significance values by combining individuals from the European-American HIV cohort and the Swedish schizophrenia cohort in different ratios. (Theory predicts that older populations should have more rare variants because recombination has had more time to break up linkage blocks, and because newer populations have most likely gone through homogenizing bottlenecks.) They find that calculating p-values using a permutation test yields fewer type I errors (false positives), and that this technique can competently deal with population stratification when conducting association studies.
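The permutation idea can be sketched as follows. This is a generic illustration, not the authors' pipeline: the test statistic (difference in mean rare-variant count between cases and controls), the function name, and the optional `strata` argument are all my own assumptions. Passing `strata` shuffles case/control labels only within each ancestry group, which is one common way to keep stratification from inflating significance; I am not claiming this is exactly what the paper did.

```python
import random
from collections import defaultdict


def permutation_pvalue(burden, is_case, strata=None, n_perm=2000, seed=0):
    """One-sided permutation p-value for excess rare-variant burden in cases.

    burden  : per-individual count of rare variants (hypothetical input)
    is_case : per-individual booleans, True for cases
    strata  : optional per-individual group labels; if given, labels are
              shuffled only within each group (an assumption of this sketch)
    """
    rng = random.Random(seed)
    if strata is None:
        strata = [0] * len(burden)  # a single group: unrestricted shuffling

    # Collect the indices of the individuals belonging to each stratum.
    groups = defaultdict(list)
    for idx, label in enumerate(strata):
        groups[label].append(idx)

    def mean_difference(labels):
        cases = [b for b, lab in zip(burden, labels) if lab]
        controls = [b for b, lab in zip(burden, labels) if not lab]
        return sum(cases) / len(cases) - sum(controls) / len(controls)

    observed = mean_difference(is_case)
    hits = 0
    for _ in range(n_perm):
        permuted = list(is_case)
        for idxs in groups.values():
            shuffled = [is_case[i] for i in idxs]
            rng.shuffle(shuffled)
            for i, lab in zip(idxs, shuffled):
                permuted[i] = lab
        if mean_difference(permuted) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

If case status is completely confounded with stratum (all cases in one group, all controls in the other), within-stratum shuffling cannot change any label and the p-value is 1. That is the desired behavior: a burden difference that ancestry alone could explain provides no evidence of association.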