Genetic Variant in CLPTM1L Confers Reduced Risk of Lung Cancer: a Replication Study in Chinese and a Meta-analysis

Introduction
Lung cancer is the most common cancer worldwide in terms of both incidence and mortality, accounting for 13% (1.6 million) of the total cases and 18% (1.4 million) of the deaths in 2008 (Jemal et al., 2011). Cigarette smoking is by far the greatest risk factor for developing lung cancer. Nevertheless, about 30-40% of lung cancers in Asian countries occur in never smokers, compared with only 10-15% in Europe and North America (Toh and Lim, 2007). Age, gender, race, pre-existing lung disease, radon exposure, viral infection, environmental pollution, second-hand smoking and occupational exposures are also important contributors. In addition, direct evidence for a genetic predisposition to lung cancer comes from several recent genome-wide association studies (GWAS) of lung cancer populations (McKay et al., 2008; Wang et al., 2008; Landi et al., 2009; Li et al., 2010; Miki et al., 2010; Yoon et al., 2010; Hu et al., 2011; Lan et al., 2012; Shiraishi et al., 2012). Multiple common single nucleotide polymorphisms (SNPs) in chromosomal loci 15q24-25.1, 5p15.33 and 6p21 have been identified as associated with lung cancer susceptibility.
Among them, rs31489 is located in chromosomal region 5p15, which contains two genes: cleft lip and palate transmembrane 1-like (CLPTM1L) and telomerase reverse transcriptase (TERT). It was first identified as associated with lung cancer risk by Wang et al. in pooled data from their own and two other GWAS, comprising 5095 cases and 5200 controls among Caucasians (Wang et al., 2008). However, the role of rs31489 in lung cancer has been found to be inconsistent in subsequently published studies. For example, among Caucasians, several studies (Landi et al., 2009; Liu et al., 2010; Pande et al., 2011; Wauters et al., 2011) reported that the rs31489-A allele (the minor allele) was associated with reduced risk of lung cancer, while de Mello et al. (2013) found no such association. In Asians, two independent studies reported that rs31489 was not associated with risk of lung cancer (Bae et al., 2012; Zhao et al., 2013). Thus, positive results have been observed only in Caucasian populations and not in Asian populations, which may be attributable to the great genomic variation across populations of different ancestries and implies the need to verify the association of rs31489 with lung cancer in diverse populations. Herein, we conducted a replication case-control study including 611 lung cancer cases and 1062 unaffected controls in a Chinese Han population, and further performed a meta-analysis combining our data with previously published data to provide a more precise evaluation of the association between rs31489 and lung cancer risk.

Study population
This replication study included 611 patients with lung cancer and 1062 healthy control subjects from China. DNA samples from lung cancer cases and healthy controls were provided by Tongji Hospital of Huazhong University of Science and Technology (HUST) from 2009 to 2011. All participants were ethnic Chinese residing in Wuhan or the surrounding regions. Controls were randomly selected from the physical examination program during the same period in which the lung cancer cases were recruited; some of them were also involved in our previous epidemiological studies (Chen et al., 2012; Chen et al., 2013; Zhong et al., 2013). The patients and control subjects were adequately matched in terms of gender and age. At recruitment, written informed consent was obtained before a 5-ml blood sample and the characteristic data were collected from each subject. Subjects who had smoked more than one cigarette per day for 12 months or longer were defined as smokers; all others were regarded as nonsmokers. This study was approved by the ethics committee of Tongji Hospital of Huazhong University of Science and Technology.

DNA isolation and genotyping
DNA was isolated from the 5-ml blood sample with the RelaxGene Blood System DP319-02 (Tiangen, Beijing, China) following the standard procedures. Rs31489 was genotyped on a 7900HT Fast Real-Time PCR System (Applied Biosystems, Foster City, CA, USA) using the TaqMan SNP Genotyping Assay (Applied Biosystems, Foster City, CA, USA). The genotyping call rate was 99.9%. Approximately 5% of the samples were randomly chosen for duplicate genotyping to evaluate repeatability, and the concordance rate was 100%.

Statistical analysis
All statistical analyses were performed using the SPSS software package (version 13.0; SPSS, Inc., Chicago, IL, USA). The Pearson χ² test and t test were employed, where appropriate, to analyze differences in the distribution of demographic characteristics and genotype frequencies between case and control subjects. Hardy-Weinberg equilibrium (HWE) for genotypes in controls was evaluated by the goodness-of-fit χ² test. We estimated the effect of rs31489 genotypes on lung cancer risk as odds ratios (ORs) and 95% confidence intervals (95%CIs) using a multivariate logistic regression model adjusted for smoking status, sex and age. Dominant and additive models were also fitted so that the association analysis did not rest on a single assumed genetic model. A p value of less than 0.05 was considered statistically significant, except for tests of heterogeneity, where a level of 0.10 was chosen.
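The Hardy-Weinberg check mentioned above can be illustrated with a minimal Python sketch. This is not the SPSS procedure used in the study; the function name `hwe_chi_square` and the genotype counts are hypothetical and chosen only for illustration. The test compares observed genotype counts with those expected from the estimated allele frequency, using a χ² statistic with one degree of freedom.

```python
import math

def hwe_chi_square(n_aa, n_ac, n_cc):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium (1 df)."""
    n = n_aa + n_ac + n_cc
    p_a = (2 * n_aa + n_ac) / (2 * n)  # estimated frequency of allele A
    p_c = 1.0 - p_a
    observed = (n_aa, n_ac, n_cc)
    expected = (n * p_a ** 2, 2 * n * p_a * p_c, n * p_c ** 2)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 df:
    # P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical control genotype counts (assumption, not the study's data)
chi2, p = hwe_chi_square(n_aa=30, n_ac=260, n_cc=770)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

A p value above 0.05 from such a test would be read, as in the text, as no evidence of departure from equilibrium in the control group.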

Meta-analysis of rs31489 in association with lung cancer risk
To confirm the effect of rs31489 on lung cancer susceptibility, a meta-analysis combining our data and previously published data was conducted. We searched the PubMed, ISI Web of Science and EMBASE databases for all publications up to May 2014 without language restriction, using the terms 'rs31489, CLPTM1L or 5p15.33' in combination with 'lung cancer, NSCLC'. To gain more information, references listed in every retrieved article were also screened manually, and comments and reviews were checked for additional studies. The inclusion criteria were: (a) a case-control study assessing the association between rs31489 and lung cancer risk; (b) sufficient data to calculate both ORs and their corresponding 95%CIs; (c) genotypes in the control group conforming to Hardy-Weinberg equilibrium (p>0.05). Reviews, animal studies and case reports were not included. Where eligible papers had insufficient information, we contacted the authors by e-mail for additional information. When the same population was included in several publications, the larger or more complete study was selected.
The following information was extracted independently and in duplicate by two reviewers (Xia Luo and Laxmi Pangeni Lamsal): the last name of the first author, publication year, ethnicity of participants, country of origin, sample size, study design, source of controls (population- or hospital-based), cancer type, genotyping method, and counts of genotypes (AA, AC and CC) in case and control groups. Pooled frequency of the A allele in Asian and Caucasian populations was estimated with the inverse variance method (Thakkinstian et al., 2005b). ORs and corresponding 95%CIs were recalculated for AA versus CC, AC versus CC and (AA+AC) versus CC, as well as under the additive model. For one study that lacked genotype and allele frequencies (Liu et al., 2010), the adjusted OR and corresponding 95%CI were extracted directly for pooling under the additive model. Between-study heterogeneity was assessed with Cochran's Q statistic (Lau et al., 1997) and was considered significant when the p value was less than 0.1. To quantify heterogeneity, the I² statistic was also used, interpreted as follows: 0% to 25%, no heterogeneity; 25% to 50%, low heterogeneity; 50% to 75%, moderate heterogeneity; 75% to 100%, high heterogeneity (Higgins et al., 2003). A fixed-effects model was applied when heterogeneity was not obvious (p>0.1) (Mantel and Haenszel, 1959); otherwise, a random-effects model was used (DerSimonian and Laird, 1986). Stratified analyses were performed according to ethnicity (Caucasian and Asian), study design (GWAS and replication study) and source of control groups (population-based and hospital-based). Sensitivity analysis was employed to assess the robustness of our meta-analysis and the influence of every study on the overall results (Thakkinstian et al., 2005a).
Meanwhile, cumulative analysis was carried out by accumulating all eligible studies sorted by publication date (Mullen et al., 2001). Finally, publication bias was evaluated by Egger's test (Egger et al., 1997). All analyses were conducted with Stata 10.0 (Stata Corporation, College Station, TX, USA).
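The pooling and heterogeneity statistics described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the Stata analysis used in the study: the fixed-effect estimate here uses inverse-variance weights on the log OR (a common alternative to the Mantel-Haenszel method cited in the text), the random-effects estimate uses the DerSimonian-Laird moment estimator, and the function names and input values are hypothetical.

```python
import math

def pool_log_odds_ratios(log_ors, ses):
    """Pool per-study log ORs; assumes at least two studies with positive SEs."""
    k = len(log_ors)
    w = [1.0 / se ** 2 for se in ses]  # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    df = k - 1
    # I-squared as a percentage, truncated at zero
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # DerSimonian-Laird estimate of the between-study variance tau^2
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    sum_w_re = sum(w_re)
    random_est = sum(wi * yi for wi, yi in zip(w_re, log_ors)) / sum_w_re
    return {
        "fixed": (fixed, math.sqrt(1.0 / sum_w)),
        "random": (random_est, math.sqrt(1.0 / sum_w_re)),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }

def as_odds_ratio(log_est, se, z=1.96):
    """Convert a log OR and its SE to (OR, lower, upper) of a 95% CI."""
    return (math.exp(log_est),
            math.exp(log_est - z * se),
            math.exp(log_est + z * se))

# Hypothetical per-study log ORs and standard errors (not the study's data)
log_ors = [math.log(0.85), math.log(0.92), math.log(0.70), math.log(0.95)]
ses = [0.05, 0.06, 0.12, 0.08]
result = pool_log_odds_ratios(log_ors, ses)
print("fixed :", as_odds_ratio(*result["fixed"]))
print("random:", as_odds_ratio(*result["random"]))
print("Q = %.2f, I2 = %.1f%%" % (result["Q"], result["I2"]))
```

When τ² is estimated as zero, the two estimates coincide, which is why the choice of model matters only when Q exceeds its degrees of freedom.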

Results of case-control study
Population characteristics. Descriptive characteristics of the 611 lung cancer patients and 1062 controls are shown in Table 1. Males accounted for 68.6% of cases and 70.2% of controls, and the mean age was 61.0 years (±10.8) for cases and 61.7 years (±9.4) for controls. There was no statistically significant difference between case and control subjects in the distribution of age (p=0.15) or sex (p=0.48). As expected, a significant difference in smoking status was observed between case and control groups (p<0.01), with 53.5% smokers among cases and 43.3% among controls. Among the 611 cases, 427 (69.9%) were patients with NSCLC.
Association analysis. Genotyping results are listed in Table 2. There was a significant difference in the distribution of genotypes between case and control groups (p=0.0). Genotypes of the controls were in Hardy-Weinberg equilibrium (p=0.13). In the multivariate logistic regression model adjusted for age, sex and smoking status, individuals with the AC genotype had a significantly decreased risk of lung cancer (OR=0.68, 95%CI=0.52-0.88) compared with those with the CC genotype, whereas the association between the AA genotype and lung cancer risk did not reach significance (p=0.05), possibly because of the small sample size and the low population frequency of the variant allele A (15.5% in controls and 11.3% in cases). Owing to the low frequency of the AA genotype, a dominant model was also fitted. Carriers of the A allele (AA plus AC) had a significantly lower risk of lung cancer than those with the CC genotype (OR=0.65, 95%CI=0.51-0.84). Under the additive model, rs31489 remained significantly associated with decreased lung cancer risk (OR=0.68, 95%CI=0.54-0.85). We also compared NSCLC patients with controls, and the results were similar to those of the overall lung cancer analysis (Table 2). These findings suggest that genetic susceptibility is often associated with smoking status.
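As a rough illustration of how an odds ratio under the dominant model is formed, the sketch below computes a crude (unadjusted) OR with a Wald 95% CI from a 2x2 table of carriers versus non-carriers. It does not reproduce the adjusted logistic regression reported above, and the genotype counts and the function name `odds_ratio_wald` are hypothetical.

```python
import math

def odds_ratio_wald(case_exposed, case_unexposed,
                    control_exposed, control_unexposed, z=1.96):
    """Crude odds ratio with a Wald confidence interval from a 2x2 table."""
    odds_ratio = (case_exposed * control_unexposed) / (
        case_unexposed * control_exposed)
    # Standard error of log(OR): square root of the summed reciprocal counts
    se_log = math.sqrt(1.0 / case_exposed + 1.0 / case_unexposed
                       + 1.0 / control_exposed + 1.0 / control_unexposed)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical genotype counts (assumption, not the study's data)
cases = {"AA": 10, "AC": 120, "CC": 480}
controls = {"AA": 25, "AC": 280, "CC": 755}

# Dominant model: carriers of A (AA + AC) versus CC homozygotes
or_dom, lo, hi = odds_ratio_wald(
    cases["AA"] + cases["AC"], cases["CC"],
    controls["AA"] + controls["AC"], controls["CC"])
print(f"OR = {or_dom:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

An adjusted estimate would additionally require per-subject covariates (age, sex, smoking status), which a 2x2 table cannot carry.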

Results of meta-analysis
Study characteristics. As presented in Figure 1, 9 eligible publications including 11 studies (16364 cases and 15835 controls) finally met the inclusion criteria (Wang et al., 2008; Landi et al., 2009; Liu et al., 2010; Pande et al., 2011; Wauters et al., 2011; Bae et al., 2012; de Mello et al., 2013; Zhao et al., 2013). Among these, ten studies provided genotype and allele frequency data from which ORs and their corresponding 95%CIs could be calculated (Wang et al., 2008; Landi et al., 2009; Pande et al., 2011; Wauters et al., 2011; Bae et al., 2012; de Mello et al., 2013; Zhao et al., 2013). Because genotype frequency data were lacking in the remaining study (Liu et al., 2010), its adjusted OR and corresponding 95%CI were directly extracted as the effect size for pooling under the additive model. The characteristics of the included studies are presented in Table 3.
Pooled frequency of A allele. The frequency of the CLPTM1L rs31489-A allele differed widely between Caucasians and Asians. In control groups, the pooled A allele frequency was 40.2% (95%CI=39.4%-41.1%) in Caucasians under the random-effects model (P for heterogeneity <0.10) and 14.8% (95%CI=13.9%-15.7%) in Asians under the fixed-effects model (P for heterogeneity >0.10). These observations were consistent with the HapMap database (39.3%, 16.1% and 12.8% for Caucasians, Chinese and Japanese, respectively).
DOI: http://dx.doi.org/10.7314/APJCP.2014.15.21.9241
Overall meta-analyses of rs31489 in association with lung cancer. Overall, significant heterogeneity was observed in all genetic models except the heterozygous model, so the random-effects model was used to calculate the pooled ORs for the homozygous, dominant and additive models, and the fixed-effects model was applied for the heterozygous model (Figure 2).
Significant association was observed in all of these models, with pooled ORs of 0.76 (95%CI=0.67-0.87), 0.88 (95%CI=0.82-0.95), 0.87 (95%CI=0.82-0.93) and 0.91 (95%CI=0.86-0.95), respectively (Table 4). In brief, the meta-analysis of rs31489 yielded results similar to those of our case-control study.
Stratified analyses. In view of the significant heterogeneity, stratified analyses were conducted to seek its potential source (Table 4). After stratification by ethnicity, obvious heterogeneity remained in both Caucasian and Asian populations. In Caucasians, all genetic models showed a significantly decreased risk of lung cancer, while in Asians a decreased risk was found only in the homozygous and additive models; no significant association was observed in the heterozygous and dominant models (OR=0.89, 95%CI=0.78-1.01; OR=0.86, 95%CI=0.70-1.05, respectively), potentially indicating that the A allele might act differently in different ethnic populations. When stratified by study design, statistically significant findings were seen only in the GWAS group, without obvious heterogeneity, whereas heterogeneity remained large in the replication group. As for source of controls, in population-based studies all genetic models showed a significantly decreased risk of lung cancer, with hardly any heterogeneity except in the additive model. In hospital-based studies, however, the homozygous and dominant models showed no significant association and obvious heterogeneity was present.
Sensitivity analysis. Because significant heterogeneity was observed in the meta-analysis, a sensitivity analysis was conducted to locate the source of heterogeneity and to assess the influence of each individual study on the overall estimate. As shown in Table 5, two studies were the main source of the heterogeneity (Pande et al., 2011; Wauters et al., 2011). The heterogeneity was effectively removed after exclusion of either study (P for heterogeneity = 0.22 and 0.19, respectively). However, the pooled result was not meaningfully altered by these exclusions, indicating the stability of the meta-analysis results.
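A leave-one-out sensitivity analysis of this kind can be sketched as follows. The example is a simplified illustration, not the Stata routine used in the study: it re-pools with inverse-variance fixed-effect weights after omitting each study in turn and reports the pooled OR and Cochran's Q for the remaining studies. The study labels, effect sizes and function names are hypothetical.

```python
import math

def fixed_effect(log_ors, ses):
    """Inverse-variance fixed-effect pooled log OR, its SE, and Cochran's Q."""
    w = [1.0 / se ** 2 for se in ses]
    est = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum(w)
    q = sum(wi * (yi - est) ** 2 for wi, yi in zip(w, log_ors))
    return est, math.sqrt(1.0 / sum(w)), q

def leave_one_out(labels, log_ors, ses):
    """Re-pool after omitting each study; inputs are equal-length lists."""
    rows = []
    for i, label in enumerate(labels):
        ys = log_ors[:i] + log_ors[i + 1:]
        ss = ses[:i] + ses[i + 1:]
        est, se, q = fixed_effect(ys, ss)
        rows.append((label,
                     math.exp(est),
                     math.exp(est - 1.96 * se),
                     math.exp(est + 1.96 * se),
                     q))
    return rows

# Hypothetical studies (labels and numbers are assumptions for illustration)
labels = ["Study A", "Study B", "Study C", "Study D"]
log_ors = [math.log(0.85), math.log(0.92), math.log(0.60), math.log(0.95)]
ses = [0.05, 0.06, 0.10, 0.08]
for label, or_, lo, hi, q in leave_one_out(labels, log_ors, ses):
    print(f"omit {label}: OR = {or_:.2f} ({lo:.2f}-{hi:.2f}), Q = {q:.2f}")
```

A study whose omission sharply lowers Q while leaving the pooled OR nearly unchanged would be read as a source of heterogeneity that does not drive the overall conclusion.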
Cumulative meta-analysis. A cumulative meta-analysis of the association of rs31489 with lung cancer was performed. As shown in Figure 2, under the heterozygous model the 95%CIs narrowed as the number of included studies grew, suggesting that the precision of the estimate improved as data accumulated.
Publication bias. Egger's test showed no evidence of publication bias in any of the four models (P for Egger's test=0.976, 0.310, 0.326, and 0.570, respectively, Table 4).
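The idea behind Egger's regression can be illustrated with a short Python sketch: the standardized effect (log OR divided by its SE) is regressed on precision (1/SE), and an intercept far from zero suggests funnel-plot asymmetry. This is a simplified ordinary least squares version with hypothetical inputs and a hypothetical function name; it returns the intercept and its t statistic, and leaves the p value (from a t distribution with n-2 degrees of freedom) to a statistics package.

```python
import math

def egger_intercept(log_ors, ses):
    """OLS regression of (log OR / SE) on (1 / SE); needs at least 3 studies."""
    x = [1.0 / se for se in ses]                      # precision
    y = [lo / se for lo, se in zip(log_ors, ses)]     # standardized effect
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom
    resid_var = sum((yi - intercept - slope * xi) ** 2
                    for xi, yi in zip(x, y)) / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, slope, se_intercept, t_stat

# Hypothetical per-study log ORs and standard errors (not the study's data)
log_ors = [math.log(0.85), math.log(0.92), math.log(0.70),
           math.log(0.95), math.log(0.80)]
ses = [0.05, 0.06, 0.12, 0.08, 0.10]
intercept, slope, se_int, t_stat = egger_intercept(log_ors, ses)
print(f"intercept = {intercept:.3f} (SE {se_int:.3f}), t = {t_stat:.2f}")
```

In this parameterization the slope estimates the underlying effect on the log OR scale, while the intercept captures small-study effects.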

Discussion
In the current study, we successfully replicated in a Chinese Han population the association between rs31489 and lung cancer, which was identified in some previous GWAS but showed inconsistent results in subsequent studies. Rs31489 is located in the 5p15.33 region, which comprises TERT and CLPTM1L, two known candidate susceptibility genes. TERT is the crucial catalytic protein subunit of telomerase, which plays a vital role in the control of telomere length (Collins and Mitchell, 2002). High telomerase expression is frequently observed in many kinds of cancer, including lung cancer, implying that TERT may be involved in lung carcinogenesis (Hahn, 2003; Lantuejoul et al., 2007). CLPTM1L has been reported to be associated with apoptosis and to be upregulated in cisplatin-resistant cell lines (Yamamoto et al., 2001). Recently, a few SNPs located in the TERT-CLPTM1L region have shown consistent associations with lung cancer (Qi et al., 2012; Zhang et al., 2012; Zhao et al., 2014). In the current case-control study and meta-analysis, we confirmed the significant inverse association between rs31489 and lung cancer risk in both Caucasians and Asians. However, rs31489 lies in an intron of CLPTM1L and is therefore unlikely to be a causative variant. The association may instead reflect linkage disequilibrium with other functional variants, so further effort is required to find the causative variants in this locus.
When subjects were stratified according to smoking status in this study, rs31489 was significantly associated with lung cancer only in nonsmokers and not in smokers. There was no significant difference in effect size between the unadjusted and adjusted overall results when smoking was considered as a confounder of the association (Table 2). In addition, we did not find any association between this SNP and smoking status in the control group (data not shown). These findings imply that rs31489 in CLPTM1L is associated mainly with lung cancer rather than with smoking. The finding that the protective role of the rs31489 AC/AA genotypes was present only in nonsmokers is biologically plausible because smoking can reduce telomere length and increase telomerase activity (Valdes et al., 2005; Choi et al., 2009). A correlation between rs31489 and telomere length has been reported under a recessive inheritance model (Mirabello et al., 2010). Thus, smoking might counteract the protective role of the AC/AA genotypes by influencing telomere length. Indeed, a recent study of lung cancer in Asia indicated that some variants of the CLPTM1L gene on chromosome 5p15.33 were associated with lung carcinogenesis among never-smokers (Lan et al., 2012). However, inconsistent results appeared in two independent studies in Caucasians in which smoking status was documented: one found a significant association in both smokers and nonsmokers (Pande et al., 2011), and the other found it only in smokers (Landi et al., 2009). Different ethnic backgrounds and limited sample sizes might contribute to the discrepancy, and further investigation with large samples in different populations is needed.
In the meta-analysis, the obvious heterogeneity should be addressed. Significant heterogeneity remained in both Caucasians and East Asians after stratification by ethnicity. In Caucasians, all genetic models of the A variant allele were significantly associated with reduced lung cancer risk, yet in Asians only the homozygous and additive models showed a decreased risk. This might be related to the low frequency of the A allele (current study: 0.148; HapMap data: 0.161 in Chinese, 0.128 in Japanese), which makes it difficult to detect a weak association in Asians without examining a large population. Different linkage disequilibrium patterns in different populations might also contribute to the discrepancy. After stratification by study design, heterogeneity in the GWAS group was significantly reduced; in replication studies, however, heterogeneity persisted and no significant association was found in any genetic model. This may be due to diverse genotyping methods. In addition, the source of controls might partly account for the heterogeneity. Sensitivity analysis indicated the stability of the results. The cumulative meta-analysis in chronological order also supported the positive findings, showing that the precision of the estimated effect of rs31489 progressively increased as more data were integrated over time.
Several limitations of this study might affect the interpretation of the results. Firstly, the sample size of our replication study was relatively small. However, the subsequent meta-analysis, which had sufficient power, gave results consistent with our replication study. Additionally, lung cancer is caused by a complex interplay between environmental and genetic factors, but we could not examine environmental effects because environmental information was lacking. Finally, owing to limited data, stratified analysis by potentially relevant subgroups such as cancer type could not be performed, which calls for further investigation in this field.
In summary, our replication study showed that the genetic variant rs31489 in 5p15.33 was associated with a reduced risk of lung cancer in Han Chinese. In addition, the findings indicated that the CLPTM1L rs31489-A allele was a protective factor for the development of lung cancer in nonsmokers but not in smokers. Furthermore, this significant association was verified by meta-analysis in both Caucasian and Asian populations. Nevertheless, further biological analysis of 5p15.33 in cancer cells should be done to establish the functional basis of the association.