Positive Selection of an Indel Polymorphism in the FADS Gene Cluster May be Driving Long-Chain PUFA Biosynthetic Capacity in Specific Human Populations.
This article at a glance
- This study reports a 22-bp nucleotide insertion-deletion (indel) genetic polymorphism that may be causally related to the control of gene expression of the fatty acid desaturases, enzymes that control the biosynthesis of long-chain polyunsaturated fatty acids (LCPUFA) from 18-carbon PUFA.
- The population frequency of the indel (rs66698963) is remarkably different among human populations with the insertion being far more frequent in South Asians, Africans and some East Asian populations, and far less common in European and other East Asian populations.
- The polymorphism has a significant effect on baseline arachidonic acid levels, and on the product-precursor relationship for the omega-6 LCPUFA biosynthesis pathway. The effects on omega-3 LCPUFA homeostasis remain to be reported, but further exploration of indel frequency in populations may significantly augment our understanding of the link between diet and PUFA status in health and disease.
The appearance of genetic polymorphisms, i.e. differences in the DNA nucleotide sequence between individuals, contributes to the opportunity for functional adaptation to specific environments that organisms may encounter. Over many generations, the selection of beneficial traits associated with particular polymorphisms can lead to gradual changes in the frequency of specific genetic polymorphisms in a genetically-isolated population. When a specific genetic variant reaches all members of the population, the trait becomes fixed. In contrast with the positive selection of favorable adaptive polymorphisms, a genetic variant may also gradually disappear from a population if there is no survival advantage in having it.
In addition to constituting fascinating signatures of the population genetic history, diverse types of genetic variation are also of importance to human health and chronic disease susceptibility. Several single-nucleotide polymorphisms (SNPs) that may affect long-chain polyunsaturated fatty acid (LCPUFA) biosynthesis, the distribution of dietary PUFA, and their functional effects have been described in recent years (see this Guest Article). Studies on polymorphisms found in one cluster of closely-located genes coding for fatty acid desaturases (FADS), enzymes that play an important role in the biosynthesis of both omega-6 and omega-3 LCPUFA, are providing an interesting perspective on the interaction between traditional diets and the population genetic history of humans.
Several SNPs located in the FADS gene cluster are now known to be related to changes in risk for complex and chronic diseases. None of these small polymorphisms can be clearly assigned to changes in the amino acid sequence of the FADS enzymes; they are believed mostly to influence the regulation of gene expression. In 2012, a much larger polymorphism, a 22-base pair stretch of nucleotides, was identified within a part of the FADS-2 gene that is not translated to protein (an intron). In cells derived from a group of Japanese individuals, this stretch was present in some (an insertion) but completely absent in others (a deletion), with the deletion occurring at the lower frequency (hence called the minor allele).
This genetic variant, termed an indel (insertion/deletion mutation), was found to influence the expression levels of both FADS-1 and FADS-2 in a cellular system, with the minor allele in Japanese people being associated with lower expression of the FADS-1 and FADS-2 genes. The 22-base pair sequence (named rs66698963) is likely to affect the activity of a nearby sterol response element (a binding site for sterol response element binding proteins, a class of DNA-binding transcription factors involved in sterol and fatty acid homeostasis).
The same group of researchers, working at Cornell University, Ithaca, NY, in collaboration with colleagues at the University of Kansas, KS, USA, and the Sinhgad College of Engineering, at the University of Pune, India, recently reported on a more comprehensive assessment of the frequency of this insertion-deletion polymorphism in several different ethnic human populations, and determined if positive selection of this genetic variant may have occurred. Furthermore, the biochemical implications of the indel polymorphism were determined by evaluating the fatty acid composition of red blood cell membrane phospholipids as a long-term marker for LCPUFA formation.
The population frequency of the allele corresponding to the rs66698963 insertion (allele named I) or deletion (allele named D) was determined from genomic DNA extracted from human samples (blood, breast milk, and placenta) obtained from several participating institutions in the US and Canada (n=211, nearly all from Kansas City), as well as from a group of Asian Indians (n=76). The observed allele frequency in North Americans was 0.38 for I and 0.62 for D, whereas for the Asian Indians it was 0.82 I and 0.18 D. The striking difference in allele frequency distribution prompted a more comprehensive assessment of genotype variation, as the results suggested that the minor allele (D) originally identified in Japanese and then in Asian Indians might well constitute a major allele in other populations.
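As a minimal sketch of how allele frequencies such as those above follow from genotyping, the Python snippet below counts alleles from the numbers of I/I, I/D and D/D individuals. The function name and the genotype counts are hypothetical and chosen only for illustration; they are not data from the study.

```python
def allele_frequencies(n_ii, n_id, n_dd):
    """Return (freq_I, freq_D) from counts of I/I, I/D and D/D individuals."""
    # Each individual carries two alleles at the locus.
    total_alleles = 2 * (n_ii + n_id + n_dd)
    if total_alleles == 0:
        raise ValueError("at least one genotyped individual is required")
    # Homozygotes contribute two copies of I, heterozygotes one.
    freq_i = (2 * n_ii + n_id) / total_alleles
    return freq_i, 1.0 - freq_i


# Hypothetical counts, for illustration only (not data from the study).
freq_i, freq_d = allele_frequencies(n_ii=10, n_id=20, n_dd=20)
print(f"I: {freq_i:.2f}, D: {freq_d:.2f}")
```

With these illustrative counts, 40 of the 100 alleles are insertions, so the I allele frequency is 0.40 and the D allele frequency is 0.60.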
To that end, whole-genome sequencing data from the 1000 Genomes Project were accessed to estimate the global genotype frequency distribution of rs66698963. Genotype frequency distributions were estimated in ethnic populations of African (7 different populations), European (5), East Asian (5), and South Asian (5) origin. The I/I genotype was the most frequent genotype in African and South Asian populations. In European and East Asian populations, the heterozygous I/D genotype was most abundant, except in Japanese (Tokyo) and Han Chinese (Beijing), where the I/I genotype was also common. Among people of South Asian ancestry, in particular Gujarati and Telugu Indian people, the D/D genotype was nearly absent and approximately 80% of the population was homozygous for the insertion (I/I). Finnish people had the highest frequency of the D/D genotype (approx. 40%), and people from the United Kingdom had the lowest frequency of the I/I genotype.
This remarkable population-associated genotypic variation encouraged testing the hypothesis that positive selection of this fatty acid desaturase gene cluster indel may be at play. First, the degree of population differentiation in allele frequencies was quantified. A higher than expected differentiation of population structure for the indel was recognized using the FST statistic, a test for co-ancestry of alleles among individuals. Employing pair-wise comparisons of the FST statistic between the ethnic populations grouped according to the four continental regions, it was found that significant genetic divergence for the indel has occurred between South Asians and Europeans, Africans and Europeans, and between South Asians and East Asians. Together with results from a second test for population differentiation, the researchers report that positive natural selection of the rs66698963 insertion may have occurred in several of the studied populations.
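To illustrate what such a differentiation statistic measures, the Python sketch below computes a textbook form of FST for a single two-allele locus, comparing the heterozygosity expected in the pooled population with the average heterozygosity within each population. This simple form, the equal weighting of the two populations, and the function names are assumptions made for illustration; the estimator actually used by the researchers may differ. The inputs are the I allele frequencies reported above for the North American (0.38) and Asian Indian (0.82) samples.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) for a two-allele locus."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1, p2):
    """Textbook F_ST = (H_T - H_S) / H_T for two equally weighted populations."""
    p_bar = (p1 + p2) / 2.0
    h_t = expected_heterozygosity(p_bar)  # pooled (total) heterozygosity
    if h_t == 0.0:
        return 0.0  # locus is not variable in either population
    # Average within-population heterozygosity.
    h_s = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    return (h_t - h_s) / h_t


# I allele frequencies reported for North Americans and Asian Indians.
fst = fst_two_populations(0.38, 0.82)
print(f"F_ST = {fst:.3f}")
```

Under these assumptions the value is about 0.20. A value of 0 would mean identical allele frequencies in both samples, and values approaching 1 indicate that the samples are close to fixation for different alleles.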
Further evidence for positive selection of the insertion in different populations was obtained by site frequency spectrum (SFS) analysis, and by tests for natural selection based on haplotype frequency. SFS analysis was carried out with three approaches that assess (loss of) genetic diversity (π test), an excess of rare variants (Tajima’s D test), and an excess of high-frequency derived alleles (Fay and Wu test). SFS tests compare DNA sequences from individuals from different populations and make estimations of the differences in sequence by gradually stepping through the sequence in 1,000 bp steps and comparing 5,000 bp-long sequences. A probability scoring is performed that calculates the likelihood that the frequency of sequence variants in a population is statistically significantly different from surrounding DNA sequences that are on average not under selective evolutionary pressure. The results from these tests showed statistically significant (strong) positive selection of the insertion in South Asian populations, and a less significant one for the African populations, but not in European or East Asian populations. Strong positive selection of the insertion was detected in all the South Asian populations individually as well, in particular in Gujarati and Telugu Indian.
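The sliding-window procedure and one of the three SFS statistics can be sketched in Python as follows. The snippet generates the start positions of 5,000-bp windows advanced in 1,000-bp steps and computes Tajima's D from the standard published formula for a single window. The function names and the example derived-allele counts are hypothetical; this is an illustration of the principle under those assumptions, not the authors' pipeline, and it does not cover the π or Fay and Wu tests or the significance scoring against surrounding sequence.

```python
import math


def window_starts(region_start, region_end, size=5000, step=1000):
    """Start positions of overlapping windows that fit fully within the region."""
    return list(range(region_start, region_end - size + 1, step))


def tajimas_d(derived_counts, n):
    """Tajima's D for one window.

    derived_counts: for each segregating site, how many of the n sampled
    chromosomes carry the derived allele (each value between 1 and n-1).
    """
    s = len(derived_counts)
    if s == 0 or n < 4:
        return None  # statistic is undefined without variation or for tiny samples
    # Mean number of pairwise differences (pi), summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in derived_counts)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0.0:
        return None
    # Compare pi with Watterson's estimator S / a1.
    return (pi - s / a1) / math.sqrt(variance)


starts = window_starts(0, 10000)            # six windows: 0, 1000, ..., 5000
d_rare = tajimas_d([1, 1, 1, 1, 1], n=10)   # hypothetical: only rare variants
d_common = tajimas_d([5, 5, 5, 5, 5], n=10) # hypothetical: intermediate frequencies
```

In this sketch a window that contains only rare variants gives a negative D, whereas a window dominated by intermediate-frequency variants gives a positive D.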
Haplotype-based tests for positive selection provided evidence of a strong selective pressure for the stretch of DNA that encompasses the rs66698963 indel in South Asians. Haplotypes are stretches of DNA (containing genes and regulatory sites) that are inherited as a unit. To detect a higher than expected inheritance of particular haplotypes within populations, three approaches were employed: the integrated Haplotype Score (iHS), the number of segregating sites by length (nSL), and the Cross-Population Extended Haplotype Homozygosity (XPEHH). These tests evaluate all sequence variants that occur with a frequency >5% against the overall genome-wide variation, and a normalized probability of occurrence is calculated. The haplotype carrying the insertion was found to have been subject to positive selection in South Asians, and also in African populations. In East Asian populations analyzed separately, significant selection for the insertion was found to have occurred in the Chinese Han of Beijing and in the Japanese population of Tokyo, but not in the other three populations (Southern Han Chinese, Chinese Dai and Vietnam Kinh).
Based on these results, the authors propose that the rs66698963 indel, or a DNA sequence located very closely to it, constitutes a so called “adaptive allele”, i.e. a genetic element that is subject to positive selection related to some specific advantageous function in that human population. The presence of the insertion was found to represent the “ancestral” allele, having much stronger extended homozygosity (forming part of larger haplotypes, and having lower recombination frequency) than the “derived” allele, which is the evolutionary loss of the insertion (i.e. the deletion).
Since the indel is closely located to a regulatory sequence that controls expression of both FADS1 and FADS2, the authors next assessed if basal PUFA levels, as well as the substrate-product relationships for several omega-6 LCPUFA that can be formed from linoleic acid (LA: 18:2), were different between individuals with the I/I, I/D, and D/D genotypes of the original North American subjects. The biochemical phenotypes turned out to be remarkably distinct. As assessed from the composition of omega-6 PUFA esterified within erythrocyte phospholipids, the conversion of 20:3 to arachidonic acid (AA) (and onwards to 22:4) was significantly increased in carriers of the insertion (both in hetero- and homozygotes). Linoleic acid levels were unchanged, suggesting that in the tested individuals more than enough linoleic acid is available that can be converted to longer chain omega-6 PUFA, and that it is the FADS-1 and FADS-2 enzyme levels that are rate-limiting. In rs66698963 insertion homozygotes, the difference in the level of AA minus that of LA was 30% higher than the difference measured in deletion homozygotes. Several other product-precursor relationships all suggested that the insertion plays a measurable role in increasing baseline FADS-mediated LCPUFA biosynthesis. At this stage this has formally only been shown to occur for omega-6 LCPUFA, but it is expected that the same will apply to omega-3 LCPUFA biosynthesis given that the biochemical pathways for both LCPUFA families are shared.
In summary, this study has provided new indications that in different human populations the rs66698963 insertion constitutes an advantageous variant that leads to an enhanced endogenous transformation of LA to longer-chain omega-6 PUFA. In some populations, such as the South Asian, a selective “sweep” of this insertion has led to near fixation. In other populations, the insertion does not appear to have undergone positive selection, and a high frequency of the deletion is found. These results are in line with the consideration that people who have traditionally ingested a vegetable-based diet over many generations have experienced a population fitness benefit from increasing the conversion of plant-derived LA to omega-6 LCPUFA. In populations where traditional diets already provide LCPUFA from foods such as fish and meat, no selective advantage is associated with retaining the insertion, which has to a significant extent been lost, resulting in a polymorphic distribution among members of European and some East Asian populations.
Expanding the analysis to further ethnic and geographically-defined populations will provide an even better idea of the polymorphic distribution of this important indel. In the extreme case of the Greenlandic Inuit, reliance over many generations on a marine diet that is rich in preformed LCPUFA has obviated the need for maintaining the insertion, which has consequently been nearly totally lost from the Inuit genome. The latter has not been formally shown, but the ancestral form of a SNP (rs174570) that is nearly fixed in Inuit segregates closely with the indel reported here, and was also found in South Asians.
Positive selection of various genetic polymorphisms in the FADS gene cluster that are associated with an increased efficiency to convert medium chain PUFA to LCPUFA had previously been reported in African populations. It is thought that positive selection for increased conversion may have allowed African populations to establish themselves across the continent using predominantly plant-based diets as substrate for LCPUFA synthesis. Why these polymorphisms are not represented in populations that migrated out of Africa is not clear. The information in the present study suggests that the ancestral allele containing the insert has been present in the human genome for a long time: the 22-bp indel is absent in other primates but it was found in the DNA of Denisovan and Neanderthal humans.
This study has a number of potentially important implications that have been clearly outlined by the authors. FADS2 indel genotypes contribute to the variability in response to PUFA consumption. For vegans and vegetarians who predominantly consume LCPUFA precursors produced by plants, tissue LCPUFA composition will depend on the relative proportions of LA and alpha-linolenic acid consumed. For vegans and vegetarians who have the I/I genotype and ingest a diet dominated by LA, higher baseline AA levels are expected to result, likely associated with a higher risk of developing chronic diseases related to inflammation. Direct consumption of preformed omega-3 LCPUFA, such as EPA and DHA from marine sources, may be required for these people to a much greater extent than for those with a D/D genotype to balance the formation of prostanoids produced during inflammatory responses. D/D carriers maintain lower AA levels and may be less susceptible to chronic inflammatory disease, and less susceptible to excess LA intake. D/D individuals may however benefit more from supplemental EPA/DHA intake in periods of development dependent on omega-3 LCPUFA (e.g. during pregnancy).
Testing these implications in conjunction with FADS2 indel screening holds promise to provide us with new insights into the question of who is most likely to benefit from the ingestion of which specific LCPUFA and/or its precursors, and at which doses. Indels are inaccurately detected and annotated by current whole-genome sequencing algorithms, and the authors note that the true D/D frequency may be even higher, and the I/D frequency lower, than currently estimated from whole-genome sequencing projects, particularly in populations with a significant frequency of the D allele. Further research into this FADS2 indel is likely to provide very interesting insight into the role of PUFA in the health of humans living in a modern society that is increasingly disconnected from locally-sourced traditional food sources, and in some ethnically-mixed societies with a high diversity of indel genotypes.
Kothapalli KS, Ye K, Gadgil MS, Carlson SE, O’Brien KO, Zhang JY, Park HG, Ojukwu K, Zou J, Hyon SS, Joshi KS, Gu Z, Keinan A, Brenna JT. Positive selection on a regulatory insertion-deletion polymorphism in FADS2 influences apparent endogenous synthesis of arachidonic acid. Mol. Biol. Evol. 2016;33(7):1726-1739. [PubMed]
Fumagalli M, Moltke I, Grarup N, Racimo F, Bjerregaard P, Jorgensen ME, Korneliussen TS, Gerbault P, Skotte L, Linneberg A, Christensen C, Brandslund I, Jorgensen T, Huerta-Sanchez E, Schmidt EB, Pedersen O, Hansen T, Albrechtsen A, Nielsen R. Greenlandic Inuit show genetic signatures of diet and climate adaptation. Science 2015;349(6254):1343-1347. [PubMed]
Li Y, Wang DD, Ley SH, Howard AG, He Y, Lu Y, Danaei G, Hu FB. Potential impact of time trend of life-style factors on cardiovascular disease burden in China. J. Am. Coll. Cardiol. 2016;68(8):818-833. [PubMed]
Malerba G, Schaeffer L, Xumerle L, Klopp N, Trabetti E, Biscuola M, Cavallari U, Galavotti R, Martinelli N, Guarini P, Girelli D, Olivieri O, Corrocher R, Heinrich J, Pignatti PF, Illig T. SNPs of the FADS gene cluster are associated with polyunsaturated fatty acids in a cohort of patients with cardiovascular disease. Lipids 2008;43(4):289-299. [PubMed]
Mathias RA, Fu W, Akey JM, Ainsworth HC, Torgerson DG, Ruczinski I, Sergeant S, Barnes KC, Chilton FH. Adaptive evolution of the FADS gene cluster within Africa. PLoS One 2012;7(9):e44926. [PubMed]
Minihane A-M. The Impact of Common Gene Variants on EPA and DHA Status and Responsiveness to Increased Intakes. PUFA Newsletter, December 2015.
1000 Genomes Project: http://www.1000genomes.org/about#1000G_PROJECT
Reardon HT, Zhang J, Kothapalli KS, Kim AJ, Park WJ, Brenna JT. Insertion-deletions in a FADS2 intron 1 conserved regulatory locus control expression of fatty acid desaturases 1 and 2 and modulate response to simvastatin. Prostaglandins Leukot. Essent. Fatty Acids 2012;87(1):25-33. [PubMed]
Weir BS. Estimating F-statistics: A historical view. Philos. Sci. 2012;79(5):637-643. [PubMed]