Genetic Variations Influencing Glucose Homeostasis and Insulin Secretion and their Associations with Autism Spectrum Disorder in Kazakhstan

Introduction: There is a complex interaction between glucose and insulin homeostasis pathways, diabetes and autism spectrum disorder (ASD). Neuronal migration pathways may be interrupted by intrauterine hyperinsulinemia and hyperglycemia. Moreover, neonatal hypoglycemia, which is related to mitochondrial dysfunction, may play a role in ASD pathogenesis [6,7]. We present a preliminary case-control study of children and adolescents (8 – 15 years old) with and without ASD in Kazakhstan, examining the association between ASD and genetic polymorphisms that affect glucose and insulin homeostasis.
Methods: In this case-control study of 211 samples, we examined the associations of 10 glucose and insulin homeostasis polymorphisms in 9 genes, together with demographic variables, with autism spectrum disorder. Fisher’s exact test and multivariate logistic regression models were used to assess the associations of polymorphisms and other predictors with ASD.
Results: Preliminary results suggest that there is a complex relation between autism spectrum disorder and genetic variations that are associated with impaired glucose and insulin homeostasis susceptibility. There is a significant association of the T allele of ADIPOQ (rs1501299) (OR=1.75, 95% CI:1.04-2.93, p-value=0.035); the T allele of GCKR (rs1260326) (OR=0.6, 95% CI:0.39-0.93, p-value=0.023); the T allele of SLC30A8 (rs13266634) (OR=1.77, 95% CI:1.12-2.78, p-value=0.014); and the recessive GG genotype of rs10757278 (CDKN2B) (OR=2.58, 95% CI: 1.24-5.36, p-value=0.011) with autism spectrum disorder in the Kazakhstan population.
Conclusion: Overall, this preliminary study found evidence of significant associations between glucose and insulin homeostasis gene polymorphisms and autism spectrum disorder susceptibility in Kazakhstan. Further study is needed to verify these findings.


INTRODUCTION
Autism spectrum disorder (ASD) is a neurodevelopmental disorder that comprises autistic disorder, Asperger syndrome and Pervasive Developmental Disorder-Not Otherwise Specified [1].
The latter two are considered milder forms of the disorder. The condition involves impairments in social interaction and in verbal and non-verbal communication skills, together with rigid, repetitive behaviors. In addition, there is a wide range of co-occurring conditions such as attention problems, self-injurious behavior, hypersensitivity to sounds, smells, taste and light, anxiety and delays in motor skills. As its name suggests, ASD is expressed in a wide spectrum of ways, which differ in the limitations they impose on daily life and, more generally, in how they shape the lives of people with ASD and of their families. According to the CDC, the estimated prevalence in the US was 1 in 150 children in 2000, 1 in 110 in 2006, 1 in 68 in 2012, and approximately 1 in 54 based on 2016 data; boys are four times more likely to be diagnosed with autism than girls [2]. In Kazakhstan, reported figures are still lower than worldwide estimates but show an upward trend: the number of people diagnosed with ASD rose from 77 in 2003 to almost 2000 in 2017 [3].
There is a complex interaction between glucose and insulin homeostasis pathways, diabetes and ASD. Maternal diabetes is associated with a doubling of the incidence of ASD [4]. Although maternal insulin does not cross the placenta, maternal diabetes is reported to raise both fetal blood glucose levels and fetal insulin secretion [4]. In addition, a ketogenic diet (high in fat, adequate in protein and low in carbohydrate) has been associated with improvements in ASD, with increased beta-hydroxybutyrate after glucose challenge that is consistent with satisfactory levels of ketosis [5]. Moreover, insulin is known to cross the blood-brain barrier, and it can regulate synaptic activity in vital portions of the brain. It has been proposed that insulin could participate in ASD pathways in genetically susceptible populations [6]. ASD is hypothesized to be associated with distortions in neuronal migration and with neuronal mitochondrial dysfunction. Interruptions of neuronal migration pathways may be caused by intrauterine hyperinsulinemia and hyperglycemia, and neonatal hypoglycemia is related to mitochondrial dysfunction [7].
TCF7L2 is a transcription factor in the Wnt-signaling pathway and is expressed in many tissues, including fat, liver and the pancreatic islets of Langerhans [8]. TCF7L2 helps to initiate gene transcription once WNT ligands bind to membrane receptors and the signal is relayed to the nucleus [9]. The T allele at the rs7903146 site of the gene is associated with Type 2 Diabetes Mellitus (T2D), specifically with impaired glucose-stimulated insulin secretion; an elevated proinsulin-to-insulin ratio has also been noted. A meta-analysis of 68 studies covering 115,809 subjects [10] likewise found a strong association between T2D and the rs7903146 polymorphism.
The rs1501299 polymorphism of ADIPOQ, a gene expressed in adipose tissue, has also been shown to be associated with diabetes risk [11], and adiponectin is linked to insulin resistance [13]. The gene product has an insulin-sensitizing effect, and other variants of the gene are associated with body size and serum adiponectin concentrations and may also modify the risk of developing T2D [35]. Disturbances in adipocytokines and immunoinflammatory factors during neurodevelopment are common in ASD patients [12].
The SLC30A8 gene, which encodes the ZnT8 zinc transporter (solute carrier family 30 member 8), is another important gene in T2D development. Loss-of-function mutations of this gene are known to be protective against diabetes [14]. A common T2D risk variant associated with proinsulin and glucose levels is the rs13266634 polymorphism, or p.Trp325Arg. Polymorphisms in genes encoding Zrt- and Irt-like proteins (ZIP) and zinc transporters (ZnT) have been suggested to be associated with schizophrenia and ASD [15][16][17].
A meta-analysis of polymorphisms in T2D susceptibility genes (GCKR, SLC30A8 and FTO) found that they are associated with the risk of gestational diabetes mellitus. The GCKR gene encodes glucokinase regulatory protein, which regulates glucokinase activity. Glucokinase governs the storage and disposal of glucose in the liver and insulin secretion in the pancreas. However, the rs1260326 C/T SNP of the GCKR gene did not demonstrate an association with T2D [17]. Nevertheless, rs1260326 has been found to be associated with high plasma triglyceride (TG) levels and with long-term activation of hepatic glucokinase [18]. The T allele of GCKR is associated with an increased risk of higher serum TG concentrations [19]. The A allele of the rs780094 SNP of the GCKR gene was found to be associated with low fasting plasma glucose and high TG levels [20].
The T allele of the rs10811661 SNP of CDKN2A/2B is reported to increase the risk of T2D by 21-27% [21]. The CDKN2B gene polymorphism rs10757278 shows a weak association with T2D; however, the number of studies on this polymorphism is low. In a Russian population, rs5219 (KCNJ11), rs13266634 (SLC30A8), rs10811661 (CDKN2B) and rs9465871, rs7756992 and rs10946398 of the CDKAL1 gene showed a significant association with impaired glucose metabolism or impaired β-cell function [22]. The T allele of rs7903146 (TCF7L2) has been shown to be associated with impaired insulin secretion and an enhanced rate of hepatic glucose production [23,8]. The rs4402960 (IGF2BP2), rs5219 (KCNJ11) and rs1799884 (GCK) polymorphisms were associated with an increased risk of Type 2 Diabetes Mellitus [24][25][26]. The rs560887 (G6PC2) polymorphism was also found to be associated with higher fasting plasma glucose (FPG) [27]. However, little information is available about the associations of these polymorphisms with ASD.
Overall, the following sites were studied (Table 1): rs10757278 (CDKN2A/B) is located in the enhancer region and disrupts a binding site for STAT1. rs1260326 is located in exon 15 and rs780094 in intron 16 of GCKR. rs13266634 is in the C-terminal region of the ZnT-8 protein encoded by the SLC30A8 gene. rs1501299 (+276G>T) is located in intron 2 of the adiponectin (ADIPOQ) gene. rs7903146 (IVS3C>T) is located in intron 3 of the TCF7L2 gene. rs4402960 is in the promoter region of the IGF2BP2 gene. rs5219 (E23K) is located in exon 1 of the KCNJ11 gene. rs560887 is in the third intron of the G6PC2 gene, and rs1799884 (-30G>A) is in the β-cell-specific promoter region of the GCK gene.
The existing literature on insulin secretion and glucose metabolism pathways reports associations of these polymorphisms with diabetes and with ASD separately [8-9, 11-13, 15-26, 35]. Diabetes and ASD are both multifactorial disorders that overlap in insulin and glucose regulatory pathways. The current study asks whether polymorphisms of the TCF7L2, ADIPOQ, SLC30A8, GCKR, GCK, CDKN2B, KCNJ11, IGF2BP2 and G6PC2 genes are associated with ASD. We hypothesized that the prevalence of polymorphisms in genes related to impaired glucose metabolism and insulin secretion differs between ASD cases and healthy controls.

METHODS
Study Design
A case-control study was conducted to determine the association of SNPs in genes affecting diabetes or glucose or insulin homeostasis with ASD. The inclusion criterion for cases was a diagnosis of ASD, whereas a control was a person with no previously diagnosed neurological or other health concerns, selected to match the age of the case group. Participants were not excluded on the basis of gender or ethnicity. In total, 211 children were eligible for the study: 101 children with ASD, who were matched by age and gender with 110 healthy controls.
Study participants were recruited from three regions of Kazakhstan (Nur-Sultan, Almaty and Pavlodar), and their blood samples, anthropometric measurements and clinical records were collected between 2018 and 2019. For this purpose, service providers from non-governmental organizations, governmental organizations and privately owned autism centers that provide services to children with autism (autism parental networks), as well as charitable organizations ("Asyl Miras"), were contacted to assist in recruiting participants. Blood samples and buccal cell samples were collected by venipuncture and mouth swabs, respectively. All sample collection procedures were carried out in non-threatening and familiar surroundings. Healthy control samples were collected on the basis of voluntary informed consent after information had been provided to parents through the various organizations. Notably, blood collection was performed by trained phlebotomists from the "INVITRO-Kazakhstan" laboratory.
Ethics approval was obtained from the Institutional Research Ethics Committee of Nazarbayev University. All study participants provided written informed consent and assent forms. To decrease potential risk, parents and legal guardians were informed that all information about their children that they volunteered to the researchers would be de-identified and would not be used in any way that could lead back to the identity of the child or family. A short survey of epidemiological and relevant medical data was also collected together with the informed consent form.

Demographic Data
The dependent variable was the presence or absence of ASD. Independent variables included age, sex, body mass index (BMI) and ethnicity. Age was categorized into two groups, less than 10 years old and 10 years or older, based on the WHO definition of 'Adolescents' [37]. Categorizing age was considered reasonable because there are developmental differences around this cutoff of 10 years. BMI was categorized into underweight (less than the 5th percentile), healthy weight (5th to less than the 85th percentile), overweight (85th to less than the 95th percentile) and obese (equal to or greater than the 95th percentile) according to the CDC growth charts for children and teens [28]. The study included participants of Kazakh, Russian, Uighur, Korean, Tajik, Belarusian, Azerbaijani and Ukrainian ethnicities. However, because few participants belonged to groups other than Kazakh, ethnicity was stratified into "Kazakh" and "Other" categories.
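The categorization rules above can be sketched in a few lines of Python. This is only an illustration of the stated cutoffs: the function names `bmi_category` and `age_group` are hypothetical, and the BMI-for-age percentile is assumed to have been computed beforehand from the CDC growth charts.

```python
def bmi_category(percentile: float) -> str:
    """Map a BMI-for-age percentile (0-100) to the four weight categories."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if percentile < 5:
        return "underweight"
    if percentile < 85:
        return "healthy weight"
    if percentile < 95:
        return "overweight"
    return "obese"


def age_group(age_years: float) -> str:
    """Dichotomize age at the 10-year cutoff described in the text."""
    return "<10" if age_years < 10 else ">=10"
```

Boundary values fall into the higher category, so a child exactly at the 85th percentile is coded as overweight and a child aged exactly 10 is placed in the older group.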

Genotype Data
Genotypes were obtained for SNPs in genes associated with T2D or with glucose or insulin homeostasis (Table 1). DNA was extracted from whole blood using the Promega Wizard Genomic DNA extraction kit (A1125) following the manufacturer's protocol. The quantity and quality of the DNA were assessed using a Nanodrop spectrophotometer.
The DNA samples were sent to Fitgenes Pty Ltd in Australia for SNP analysis. Genotyping was conducted on a Life Technologies QuantStudio 12K Flex Real-Time PCR System using the cycle relative threshold (Crt) method and Taqman genotyping assays with two primers and a Taqman probe for each target SNP.

Statistical Analysis
Data cleaning was performed using Microsoft Excel [36]. All statistical analyses were conducted using the Stata 13 statistical program and the SNPStats web tool [29].
Basic descriptive statistics, such as frequencies and mean values, were generated. To assess association with the outcome variable, Fisher's exact test was used for categorical independent variables and the Wilcoxon rank-sum test for continuous independent variables. To estimate the strength of the association between polymorphisms and ASD, multivariate logistic regression (MLR) analysis was performed, and the odds ratio (OR) and 95% confidence interval (CI) were calculated. The model was built by including all statistically significant genotype variables (p<0.05) together with clinically important covariates (age, gender, ethnicity and BMI). These demographic covariates were included to adjust for their possible confounding effect on the outcome variable. All statistical tests were two-sided, with a significance level (α) of 0.05.
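For readers unfamiliar with these statistics, the following is a minimal standard-library sketch of a two-sided Fisher's exact test and of a crude (unadjusted) odds ratio with a Wald 95% confidence interval for a 2x2 table. It is not the Stata procedure used in the study, the function names are hypothetical, and the ORs reported in this paper come from adjusted logistic regression models rather than from this crude formula.

```python
import math


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.959964):
    """Crude odds ratio and Wald confidence interval (all cells must be > 0)."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se_log_or), math.exp(log_or + z * se_log_or)
```

Because the interval is built on the log-odds scale, it is symmetric around the OR multiplicatively rather than additively, which is why published intervals such as 1.04-2.93 around 1.75 look lopsided.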
The Hardy-Weinberg equilibrium test and bivariate statistics for the different inheritance patterns were computed using the SNPStats web tool. The Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) were used to select the best-fitting model.
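As an illustration of what these checks involve, the sketch below implements a chi-square goodness-of-fit test for Hardy-Weinberg equilibrium and the standard AIC and BIC formulas. It is an assumption that a chi-square test is an adequate stand-in here; SNPStats may use a different (for example exact) test, and the function names are hypothetical.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    if p in (0.0, 1.0):
        raise ValueError("monomorphic locus: HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood
```

When inheritance models are compared for the same SNP, the model with the lowest AIC or BIC is preferred; BIC penalizes additional parameters more heavily once the sample exceeds a handful of observations.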
In addition, since obesity is a risk factor for metabolic diseases and diabetes, the presence or absence of obesity was included in the final model. The prevalence of ASD is known to vary by gender, so gender was also retained. Ethnicity, specifically Kazakh ethnicity, was kept in the model because ethnicity can affect lifestyle and confound the findings.

RESULTS
Study Subjects
The demographic data of the study subjects are summarized in Table 2. Females made up 26.4% of controls and 17.8% of cases. The proportions of underweight and healthy-weight participants were higher in controls than in cases (11.8% vs 7.1% for underweight and 70.9% vs 60.6% for healthy weight). Descriptive statistics showed that controls were older than cases (11.2±2.1 vs 10.4±2.2 years).

Association Study
Of the 10 SNPs tested, 7 were in Hardy-Weinberg equilibrium and 3 were not: TCF7L2 (rs7903146), KCNJ11 (rs5219) and GCKR (rs780094) (Table 3). Figure 1 summarizes the associations between ASD and the 10 SNPs analyzed.
Significant differences in allele prevalence between ASD and control subjects were found for rs13266634 (SLC30A8) (p<0.02) and rs1501299 (ADIPOQ) (p<0.02) (see Table 2).
Associations between the various SNPs and ASD under the codominant, additive and overdominant genetic models are also summarized in Table 3.
Based on AIC and BIC values, a recessive inheritance pattern, in which two copies of the risk allele are necessary to cause an effect, was obtained for the statistically significant SNPs rs10757278 (CDKN2B), rs13266634 (SLC30A8) and rs1501299 (ADIPOQ). The rs1260326 (GCKR) SNP follows an overdominant inheritance pattern, otherwise called "heterozygote advantage". In addition, ADIPOQ had a small sample size for the TT genotype (Table 3); because it did not satisfy the minimal sample size for each category, it was not eligible for MLR. For CDKN2B rs10757278, a significant association with ASD was found under the codominant and recessive models (p<0.05). For GCKR rs1260326, statistically significant associations were found under the codominant and overdominant models (p<0.008).
Table footnote: MAF, minor allele frequency; * indicates p<0.25; ** indicates p<0.05.
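The inheritance models named above differ only in how a genotype is turned into a predictor. A minimal sketch, assuming biallelic genotypes written as two-letter strings and using a hypothetical helper `encode_genotype`, is given below; the codominant model is not shown because it treats the three genotypes as separate categories rather than as a single number.

```python
def encode_genotype(genotype: str, risk_allele: str, model: str) -> int:
    """Code a biallelic genotype such as 'GT' under a given inheritance model."""
    if len(genotype) != 2:
        raise ValueError("genotype must contain exactly two alleles")
    copies = genotype.count(risk_allele)  # 0, 1 or 2 copies of the risk allele
    if model == "additive":
        return copies               # effect grows with each copy
    if model == "dominant":
        return int(copies >= 1)     # one copy is enough
    if model == "recessive":
        return int(copies == 2)     # two copies are required
    if model == "overdominant":
        return int(copies == 1)     # heterozygotes versus both homozygotes
    raise ValueError(f"unknown model: {model}")
```

For example, under the recessive model only `encode_genotype("GG", "G", "recessive")` yields 1, whereas under the overdominant model only the heterozygote is coded 1.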
No collinear variables were found. Logistic regression analysis was applied to identify confounders. Only BMI and Age were statistically significantly associated with ASD (Table 2), and GCKR (rs1260326) was found to be related to both Age and BMI. The unadjusted estimated slope coefficient of GCKR on ASD was 0.44 (0.26-0.77), whereas an 11% increase in the slope coefficient was observed for Age. Age also resulted in a slight increase in the standard error. Being obese did not confound the association.
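A change-in-estimate check of this kind can be sketched as follows. The 10% threshold is a common rule of thumb and is an assumption here, not a value taken from this study; the function names are hypothetical, and whether the comparison is made on the OR scale or on the log-odds scale is left to the analyst.

```python
def percent_change_in_estimate(crude: float, adjusted: float) -> float:
    """Relative change (%) of the adjusted estimate compared with the crude one."""
    if crude == 0:
        raise ValueError("crude estimate must be non-zero")
    return (adjusted - crude) / abs(crude) * 100


def is_confounder(crude: float, adjusted: float, threshold_pct: float = 10.0) -> bool:
    """Flag a covariate whose inclusion shifts the estimate by at least the threshold."""
    return abs(percent_change_in_estimate(crude, adjusted)) >= threshold_pct
```

With hypothetical estimates of 0.50 before and 0.555 after adjustment, the change is about 11% and the covariate would be flagged under a 10% threshold.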
Overall, the odds of having ASD increase with the recessive GG genotype of CDKN2B (rs10757278). Being 10 years of age or older is associated with a 54% decrease in the odds of ASD, and Kazakh ethnicity with 59% lower odds of ASD, adjusting for the other variables. In contrast, being male resulted in 1.53 (0.73-3.2) times the odds of ASD compared with female participants, adjusting for other variables. Although the TT genotype of ADIPOQ lacked an adequate sample size, the overall effect of this genotype is associated with a risk of ASD.
As for the allele-based analysis (Table 4), after variables with p<0.25 were entered into the model, CDKN2B (rs10757278) was not a statistically significant predictor of ASD. In the overall model, the odds of ASD were 57% lower in children older than 10 years of age, adjusting for other variables. Obesity was associated with 2.76 (1.38-5.53) times higher odds of ASD, adjusting for other covariates. The same pattern as in the genotype-based MLR was observed for ethnicity: ethnic Kazakhs had 46% lower odds of ASD, adjusted for other variables. Males had 61% higher odds of ASD than females, adjusting for other covariates. The T allele of rs1501299 (ADIPOQ) was associated with higher odds of ASD in this study, namely 1.75 (1.04-2.93), adjusting for other variables. The T allele of rs1260326 (GCKR) was associated with 0.6 (0.39-0.93) times the odds of ASD, and the C allele of rs13266634 (SLC30A8) with 0.57 (0.36-0.89) times the odds of ASD, adjusting for the variables mentioned above. Thus, these two alleles have somewhat "protective" effects in ASD.
Table footnote: * indicates p<0.05; **p<0.01; ***p<0.001; 'Reference' indicates OR=1; "Co", Control; "Ca", Case; + indicates in Hardy-Weinberg equilibrium.
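The percentages and the "protective" ORs quoted here follow from simple arithmetic on the odds ratios. The sketch below, with hypothetical function names, shows the two conversions: an OR to a percent change in odds, and an OR reported for one allele re-expressed for the other allele of the same SNP.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Convert an odds ratio to a percent change in odds (negative means lower odds)."""
    return (odds_ratio - 1) * 100


def flip_reference_allele(odds_ratio: float, ci_low: float, ci_high: float):
    """Re-express an OR and its CI with the other allele as the effect allele."""
    # Taking reciprocals swaps the roles of the interval bounds.
    return 1 / odds_ratio, 1 / ci_high, 1 / ci_low
```

For instance, `flip_reference_allele(1.77, 1.12, 2.78)` gives roughly 0.56 (0.36-0.89), which agrees, up to rounding of the published values, with the 0.57 (0.36-0.89) reported for the C allele of rs13266634; likewise an OR of 0.6 corresponds to 40% lower odds.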

DISCUSSION
This case-control study investigated the association between ASD risk and glucose/insulin homeostasis gene polymorphisms in 211 children and adolescents in Kazakhstan. We addressed genetic variants that were previously reported to be associated with susceptibility to impaired glucose and insulin homeostasis and Type 2 diabetes. To our knowledge, this is the first study showing significant associations between genetic polymorphisms in the SLC30A8, GCKR, CDKN2B and ADIPOQ genes and susceptibility to ASD in children and adolescents.
The T allele of rs1501299 in the ADIPOQ gene was a risk factor for ASD in our cohort (OR=1.75; CI:1.04-2.93). It is difficult to compare this result with other studies on ASD since, to the best of our knowledge, no such study has been published previously. Studies of rs1501299 and diabetes show a complex picture: the rs1501299 TT genotype in ADIPOQ was associated with an increased risk of prediabetes in a Jordanian population (p=0.006) [13], but the G allele was significantly more frequent in diabetes cases than in controls in a Han Chinese population [11]. Interestingly, in a study of cord and early childhood plasma adiponectin and autism, cord adiponectin levels were inversely associated with ASD risk, indicating that the lower the level of adiponectin in the plasma, the greater the risk of ASD [12].
For rs1260326 in the GCKR gene, the T allele was found to be protective (OR=0.6; CI:0.39-0.93). In a study examining whether loci associated with metabolic traits also play a significant role in BMI and mental traits or disorders, the T allele of GCKR rs1260326 was associated with BMI (p=9.2E-05) [30]. GCKR is also known to modulate TG levels, and the T allele was found to be strongly associated with lower fasting glucose (p=8E-13) and fasting insulin (p=3E-07) levels and with higher TG levels (p=1E-04). In that study, Vaxillaire et al. found that GCKR-L446 carriers, who have the T allele, are protected against type 2 diabetes despite higher triglyceride levels and a higher risk of dyslipidemia [26]. Interestingly, our genotype data showed that the CT genotype of the GCKR polymorphism was also protective against ASD (OR=0.42; CI:0.23-0.79). A future study with a larger sample size and with fasting blood glucose measurements is needed to clarify this finding.
The SLC30A8 rs13266634 T allele was statistically significantly associated with a 77% increase in the odds of ASD (CI:1.12-2.78), and the TT genotype was associated with a 3.1-fold increase in the odds of ASD (CI:1.26-7.65). In a study of SNPs associated with T2D in a Kazakh population, the T allele was shown to be protective against type 2 diabetes mellitus (OR=0.68) [31]. The protein encoded by SLC30A8 mediates zinc transport, and zinc plays a role in processing proinsulin to insulin and in releasing insulin in response to glucose [40].
Lastly, the CDKN2B rs10757278 GG genotype was found to be significantly associated with ASD risk (OR=2.58; CI:1.24-5.36). In comparison, in a study examining genetic variation in 9p21 and its association with fasting insulin, rs107575727 was found to be associated with higher serum insulin in females (p=0.001) [32]. Overall, based on the limited number of SNPs that we examined, the findings were not consistent with the hypothesis of a simple positive relationship between impaired glucose and insulin homeostasis and ASD risk.
The chosen study design has several advantages. Namely, it is cost-effective in comparison with other studies such as cohort studies. Moreover, since it is a retrospective study there is no long follow-up period, which is especially beneficial for diseases with long latency periods. In addition, such a design is useful for the study of rare diseases and multiple exposures.
The major strength of the study is its novelty. A wide body of literature reports associations of selected diabetes susceptibility polymorphisms and of maternal diabetes with the risk of ASD separately. However, the genetic susceptibility of children with ASD and its association with genes that impair glucose and insulin homeostasis and with diabetes is a new area of research. Moreover, to our knowledge, this area of study is entirely new in Kazakhstan.
Case-control studies have several drawbacks, such as selection bias and recall bias. Since this study used healthy controls, DNA samples and medical records, selection and recall bias are likely to have been minimized. Another weakness is the inability to calculate incidence rates or to establish the temporal relationship between the outcome and the variables [33].
No information was available on diabetes in the participants or their mothers, or on impaired glucose/insulin homeostasis in the mothers of study participants. This information would have added clarity to the data used in this study.
Although the sample size was reasonable, the use of dummy variables threatens the statistical power of the study. For instance, ADIPOQ was not included in the final genotypic model because of the small number of participants homozygous for the risk allele.
ASD patients are known to tend to have a higher BMI than their healthy counterparts [34]. However, regardless of obesity, there were different outcomes. Thus, although obesity is a risk factor for metabolic diseases, it was not significant in this study, and another mechanism probably underlies this condition. One possible explanation is antipsychotic-induced weight gain (AIWG) [39] caused by the second-generation antipsychotic drugs (risperidone and aripiprazole) that are commonly used by ASD patients to help with irritability, self-injurious behaviour and aggressiveness [38]. Several ADIPOQ variants have been reported [39] to be associated with AIWG in a Chinese population. Specifically, the G allele of the rs1501299 polymorphism was found to be associated with an increase in weight, which is consistent with the finding in the Han Chinese population mentioned above [11]. The G allele of rs1501299 was also found to be associated with T2D. The authors further reported that CDKN2A/B (rs3731245, rs2811708) and SLC6A4 (rs3813034) were associated with risperidone-induced weight gain [39]. Another study reports that SLC30A8 (rs13269119) was associated with AIWG [40].
Thus, drug prescription records should be included in future analyses. Moreover, this adds to the importance of genetic profiling of the Kazakh population, since the prevalence of these genes and of other genes associated with drug metabolism differs between ethnicities.
This is an early study with a limited number of samples; future analyses should therefore use a larger sample and consider gene-gene interactions and haplotype effects. Identifying genetic risk factors and considering them in combination with environmental factors could provide information that helps with prevention and with improving conditions for people with ASD. To sum up, this study adds a new perspective to the literature on polymorphisms and their association with ASD. The selected polymorphisms are known to be strongly associated with diabetes and impaired glucose/insulin homeostasis; however, little or no information existed about the risks they carry in ASD patients. Overall, this study found a relationship between ASD risk and genetic susceptibility related to glucose and insulin homeostasis and diabetes.