RESEARCH ARTICLE

Aberrant Methylation of Genes in Sputum Samples as Diagnostic Biomarkers for Non-small Cell Lung Cancer: a Meta-analysis

Xu Wang 1, Li Ling 2, Hong Su 1*, Jian Cheng 1, Liu Jin 1

Introduction
Lung cancer kills over one million people worldwide every year; as the leading cause of cancer death in men and the second leading cause in women, it is an evident worldwide public health burden (Molina et al., 2008). In the United States, lung cancer is the number one cancer killer for both men and women, leading to over 160,000 deaths each year (Jemal et al., 2007). Clinically, lung cancer is divided into two sub-types, small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC). SCLC is the more aggressive sub-type and accounts for 10-15% of all cases. The remaining 85-90% of cases are classified as NSCLC. Early detection of NSCLC, which is the more common and less aggressive sub-type, has the highest potential for saving lives (Paul et al., 2008).
At present, various screening options are available for the early detection of NSCLC, including chest X-ray (Gavelli et al., 2000), sputum cytology (Bach et al., 2003), low dose spiral computed tomography (LDSCT) (Carter et al., 2007), auto-fluorescence bronchoscopy (AFB) (Feller et al., 2005) and so on (Ziaian et al., 2014). However, none of these methods is truly optimal, either because the sensitivity or specificity of the test is inadequate or because the method is costly and invasive (Melvyn et al., 2000; Paul et al., 2008). Since these options have not been proven effective as early detection methods, sensitive and specific diagnostic methods remain to be found (Parkin et al., 2001).
To fill this void, the research focus has moved to molecular approaches (Suzuki et al., 2008; Ramshankar et al., 2013). The goal is to identify molecular biomarkers (generally DNA) that can be used for early detection of these lesions at the pre-invasive stage. Initial DNA methylation studies for NSCLC focused on single genes chosen because of their potential function in cancer (Jarmalaite et al., 2003; Xie et al., 2006). However, a single DNA methylation marker cannot be expected to detect all cases of a particular cancer. One way to address this problem is to assess the DNA methylation status of multiple loci (a panel). Later studies employed panels with more than one locus for DNA methylation profiling to detect a designated cancer (Tsou et al., 2007; Feng et al., 2008).
Many studies have shown that sputum could be a promising "remote medium" for early detection of NSCLC (Miozzo et al., 1996; Palmisano et al., 2000; Belinsky et al., 2005). The advantages of sputum as a "remote medium" include its non-invasive procurement and the fact that it contains cells from the lungs and lower respiratory tract (Olaussen et al., 2005; Paul et al., 2008). Shin et al. (2012) described a combination of 2 loci, MAGE A3 and p16, that served as a good panel for early detection of lung cancer in sputum specimens. Similarly, Li et al. (2007) reported a combination of FHIT and HYAL2 in sputum samples with 76% sensitivity and 85% specificity. Hwang et al. (2011) recommended HOXA9 gene methylation in sputum, with a sensitivity of 90.5% and a specificity of 97.5%, and Belinsky (2006) showed that the combined effect of methylation of at least one of the four most significant genes in sputum increased the positive predictive value to 86%. In contrast, Cirincione et al. (2006) reported that 3 loci, the RARβ2, p16 and RASSF1A genes, had limited diagnostic value in sputum for early detection of lung cancer.
In recent years, an increasing number of studies have been published that use aberrant methylation of sputum DNA as a diagnostic biomarker for NSCLC (Georgios et al., 2012; Skin et al., 2012; Hubers et al., 2013). However, the results of these studies were variable and inconsistent. To the best of our knowledge, there has been no comprehensive evaluation of the diagnostic accuracy of methylation markers in sputum samples for NSCLC. Hence, we performed a comprehensive review of the diagnostic value of sputum DNA testing for patients with NSCLC.

Search strategy and selection criteria
The PubMed, Science Direct, Web of Science, Chinese Biological Medicine (CBM), Chinese National Knowledge Infrastructure (CNKI), Wanfang and Vip databases and Google Scholar were systematically searched by two authors (Wang and Ling). Key words including "lung cancer or lung carcinoma or non-small cell lung cancer or NSCLC", "sputum or flema", "diagnostic", "sensitivity and specificity" and "methylation or hypermethylation or hypomethylation or demethylation" were used to identify appropriate studies published in English and Chinese from Jan 1, 2003 to 2013. In addition, the reference lists of all identified studies were manually searched to identify any additional studies. Duplicate results and irrelevant articles were removed. To be eligible for inclusion, studies had to use one or more methylated biomarkers in sputum samples for detecting NSCLC.
Two reviewers (Wang and Ling) independently evaluated the full text of each manuscript. Studies were selected if they investigated the association between sputum DNA and the patients' diagnosis by histopathology (endoscopy) and provided data on the numbers of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). Articles were excluded if the data were insufficient to calculate the numbers of TP, FP, TN and FN; if subjects were enrolled without a diagnosis; if the study was not a case-control study; if the study was based on tissue or animals; if the study's purpose was to evaluate technical or mechanical aspects of the NSCLC detection assay; or if the article was a review, letter, single case report, conference summary or memorandum.

Data extraction procedure and quality assessment
Two reviewers (Wang and Ling) independently extracted the following data from each article: name of first author, year of publication, source population, number of patients and controls included, TP, FP, FN, TN, detection technique used, target gene(s), the quality assessment of studies of diagnostic accuracy (QUADAS) score and whether blinding was reported. Any disagreements between the reviewers were resolved by discussion.
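The four counts extracted from each study determine every per-study accuracy measure used later in the analysis. The following Python sketch illustrates the standard definitions; the class and function names, the example counts and the 0.5 zero-cell correction are illustrative assumptions and are not taken from the software used in this study.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts for one study: index test result vs. histopathology."""
    tp: int
    fp: int
    fn: int
    tn: int


def accuracy_measures(t: TwoByTwo, correction: float = 0.5) -> dict:
    """Return sensitivity, specificity, PLR, NLR and DOR for one study.

    If any cell is zero, `correction` is added to all four cells so the
    ratios stay finite (a common convention, assumed here).
    """
    cells = [float(t.tp), float(t.fp), float(t.fn), float(t.tn)]
    if 0 in cells:
        cells = [c + correction for c in cells]
    tp, fp, fn, tn = cells
    sens = tp / (tp + fn)          # proportion of cases that test positive
    spec = tn / (tn + fp)          # proportion of controls that test negative
    return {
        "sensitivity": sens,
        "specificity": spec,
        "plr": sens / (1.0 - spec),        # positive likelihood ratio
        "nlr": (1.0 - sens) / spec,        # negative likelihood ratio
        "dor": (tp * tn) / (fp * fn),      # diagnostic odds ratio
    }


# Hypothetical study with 100 cases and 100 controls (not from the paper).
example = accuracy_measures(TwoByTwo(tp=60, fp=25, fn=40, tn=75))
```

Within a single study the DOR equals PLR divided by NLR; pooled estimates generally do not satisfy this identity exactly because each measure is pooled separately.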
The standards for reporting of diagnostic accuracy (STARD) initiative and QUADAS guidelines were used to assess the methodological quality of each study (Bossuyt et al., 2003; Whiting et al., 2004). The STARD initiative checklist contains 25 items, and a score of 1 was given for each item that was fulfilled (Bossuyt et al., 2003). The QUADAS tool contains 14 items, each scored 1 if the item was fulfilled, 0 if it was unclear and -1 if it was not achieved (Whiting et al., 2004). All studies were evaluated independently and discussed by the reviewers until a consensus was reached.

Statistical methods
We used standard methods recommended for meta-analysis of diagnostic test evaluations (Deville et al., 2004). Analyses were performed using two statistical software programs (Meta-Disc 1.4 for Windows and Stata, version 10.0). Pooled estimates of sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR) and diagnostic odds ratio (DOR) were used to examine the diagnostic accuracy of sputum DNA testing. Spearman's correlation coefficient was used to assess the threshold effect (Lijmer et al., 1999). Heterogeneity among studies was assessed with the χ² test based on the Cochran Q statistic. The I² statistic, which measures the extent of inconsistency between studies, was also calculated (Deville et al., 2002).
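As an illustration of the heterogeneity statistics, the Python sketch below computes Cochran Q for a set of log DORs under inverse-variance (fixed-effect) weighting and derives I² from Q. It is a minimal sketch under stated assumptions, not the Meta-Disc or Stata implementation: the function names are hypothetical, and the weighting scheme is assumed rather than confirmed by the text.

```python
import math


def log_dor_and_variance(tp, fp, fn, tn, correction=0.5):
    """ln(DOR) and its approximate variance (sum of reciprocal cell counts)."""
    cells = [float(tp), float(fp), float(fn), float(tn)]
    if 0 in cells:  # zero-cell continuity correction (assumed convention)
        cells = [c + correction for c in cells]
    a, b, c, d = cells
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def cochran_q(effects, variances):
    """Weighted sum of squared deviations from the inverse-variance pooled effect."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def i_squared(q, k):
    """I² in percent for k studies: 100 * (Q - df) / Q, floored at 0."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Consistency check with the reported Q = 91.87, assuming all 22 reviewed
# studies entered the pooled DOR (df = 21); this assumption is ours.
i2 = i_squared(91.87, 22)  # approximately 77.1
```

Under that assumption the result agrees closely with the I² of 77.10% reported for the pooled DOR.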
To further explore heterogeneity, subgroup analyses were conducted. Separate analyses investigated the effect of source population (Asians and non-Asians), the classification of control groups (cancer-free smokers, patients with pulmonary benign disease, and both), blinding (blinded and not mentioned), study quality (high quality: QUADAS ≥8, medium quality: QUADAS =7 and low quality: QUADAS ≤6), assay method (qualitative and quantitative) and biomarker status (single genes and multiple genes), given their potential impact on test performance.
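The quality subgroups follow directly from the QUADAS scoring rule described earlier (1 if an item is fulfilled, 0 if unclear, -1 if not achieved, over 14 items) and the cut-offs given above. A minimal Python sketch, with hypothetical function names and a made-up rating vector:

```python
def quadas_score(item_ratings):
    """Sum the 14 QUADAS item ratings (each 1, 0 or -1)."""
    if len(item_ratings) != 14:
        raise ValueError("QUADAS has 14 items")
    if any(r not in (-1, 0, 1) for r in item_ratings):
        raise ValueError("each rating must be 1, 0 or -1")
    return sum(item_ratings)


def quality_group(score):
    """Map a QUADAS score to the subgroup used in the analysis."""
    if score >= 8:
        return "high"
    if score == 7:
        return "medium"
    return "low"


# Hypothetical study: 10 items fulfilled, 2 unclear, 2 not achieved.
ratings = [1] * 10 + [0] * 2 + [-1] * 2
group = quality_group(quadas_score(ratings))  # score 8, so "high"
```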
Publication bias was assessed with Deeks' funnel plot asymmetry test. A P value of less than 0.1 for the slope coefficient was considered to indicate significant asymmetry and therefore potential publication bias (Pai et al., 2003).
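Deeks' test regresses the log DOR of each study on the inverse square root of its effective sample size (ESS), weighting by ESS, and examines the slope. The Python sketch below shows that regression using only the standard library. The function names and example counts are hypothetical, the zero-cell correction is an assumed convention, and the sketch stops at the t statistic: the P value would be obtained from a t distribution with k - 2 degrees of freedom, which is not computed here.

```python
import math


def effective_sample_size(n_cases, n_controls):
    """ESS = 4 * n1 * n2 / (n1 + n2) for n1 cases and n2 controls."""
    return 4.0 * n_cases * n_controls / (n_cases + n_controls)


def weighted_slope(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (b, se_b, t)."""
    k = len(x)
    if k < 3:
        raise ValueError("at least three studies are needed")
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss = sum(wi * (yi - intercept - slope * xi) ** 2
              for wi, xi, yi in zip(w, x, y))
    se = math.sqrt(rss / (k - 2) / sxx)
    t = slope / se if se > 0 else math.inf
    return slope, se, t


def deeks_regression(studies, correction=0.5):
    """studies: iterable of (tp, fp, fn, tn); returns (slope, se, t)."""
    x, y, w = [], [], []
    for tp, fp, fn, tn in studies:
        ess = effective_sample_size(tp + fn, fp + tn)
        cells = [float(tp), float(fp), float(fn), float(tn)]
        if 0 in cells:  # zero-cell continuity correction (assumed convention)
            cells = [c + correction for c in cells]
        a, b, c, d = cells
        x.append(1.0 / math.sqrt(ess))
        y.append(math.log((a * d) / (b * c)))  # ln(DOR)
        w.append(ess)
    return weighted_slope(x, y, w)


# Hypothetical 2x2 counts for four studies (not data from this review).
studies = [(30, 10, 20, 40), (60, 25, 40, 75), (15, 3, 10, 22), (90, 30, 60, 120)]
slope, se, t = deeks_regression(studies)
```

A slope whose P value falls below 0.1 would be read as significant funnel plot asymmetry under the criterion stated above.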

Subgroup analysis of the classification of control groups
SROC curve analysis of studies using cancer-free smokers as controls produced a sensitivity of 0.637 (95%CI: 0.600 to 0.673), a specificity of 0.665 (95%CI: 0.631 to 0.697) and an AUC of 0.784. The corresponding values for studies using patients with pulmonary benign disease as controls were 0.584 (95%CI: 0.528 to 0.639) for sensitivity, 0.978 (95%CI: 0.950 to 0.993) for specificity and 0.885 for AUC.
Studies whose controls included both cancer-free smokers and patients with pulmonary benign disease showed a sensitivity of 0.632 (95%CI: 0.556 to 0.704), a specificity of 0.677 (95%CI: 0.600 to 0.747) and an AUC of 0.726 (see Table 2).

Subgroup analysis of the study quality
In the low study quality subgroup, a sensitivity of 0.682 (95%CI: 0.626 to 0.734), a specificity of 0.929 (95%CI: 0.864 to 0.969) and an AUC of 0.825 were achieved for the detection of NSCLC. The corresponding values for the high study quality subgroup were 0.678 (95%CI: 0.638 to 0.716) for sensitivity, 0.715 (95%CI: 0.683 to 0.746) for specificity and 0.793 for AUC. The medium study quality subgroup showed a sensitivity of 0.473 (95%CI: 0.418 to 0.528), a specificity of 0.679 (95%CI: 0.620 to 0.733) and an AUC of 0.692 (Table 2).
Publication bias was evaluated using Deeks' test, which showed no significant publication bias among the studies that evaluated biomarkers in sputum samples from NSCLC patients (Figure 3).

Threshold effect
The Spearman correlation coefficient between the logit of sensitivity and the logit of 1-specificity of sputum DNA testing was 0.432 (p=0.045).

Diagnostic accuracy analyses
For all studies, the pooled DOR was 10.31 (95%CI: 5.88 to 18.08), with Cochran Q = 91.87 (p<0.001) and I² = 77.10%. There appeared to be heterogeneity between studies, as assessed by inspection of the forest plot (Figure 2E). Figure 2F presents the symmetrical SROC curve of sputum DNA testing; the AUC was 0.7827.

Discussion
Lung cancer is responsible for a million cancer deaths per year worldwide, and its detrimental effects will continue to increase (Paul et al., 2008). The lack of an effective approach for early detection is one of the important reasons for the high lung cancer mortality (Gavelli et al., 2000). In the present meta-analysis, we found that methylated genes in sputum samples for the early detection of NSCLC yielded an overall sensitivity of 62% and an overall specificity of 73%. The AUC was 0.783, indicating moderate accuracy. Furthermore, the PLR was 3.86, the NLR was 0.46 and the DOR was 10.31. Taken together, these results indicate that the overall accuracy of NSCLC detection using sputum DNA testing was not good enough.
The results of the subgroup analysis supported the higher diagnostic value of multiple markers: the 11 studies that involved single markers appeared to have lower sensitivity and AUC. In addition, our results showed that the accuracy of quantitative methods for the detection of NSCLC was higher than that of routine qualitative analysis.
Subgroup analysis of source populations showed that studies on Asians had higher specificity and AUC than those on non-Asians (Europe and America). It appears that lung cancer in Asians has unique characteristics (Federico et al., 2010). In Asians, lung cancer often occurs at an earlier age, is more common in people who have never smoked, and has a better overall prognosis (Jiang et al., 2012). To our surprise, there was a lack of studies that focused on Africans. The reasons might be that the incidence of lung cancer is low and that sputum DNA testing has not been widely investigated in most African countries (Jacques et al., 2010). However, the incidence of lung cancer is increasing worldwide, and environmental exposure to asbestos, dusty occupations and perhaps indoor air pollution may also contribute to the development of lung cancer in Africa (Abdul et al., 2010). Therefore, early detection remains the key to successful outcomes (Claudia et al., 1999), and the participation of scientists around the world, especially in African regions, is needed.
Heterogeneity decreased when control groups were divided into cancer-free smokers, patients with pulmonary benign disease, or both. Most recent diagnostic guidelines note that diagnostic studies compare the results of the index test in patients with an established diagnosis of the target condition with its results in healthy controls or in controls with another diagnosis (Lijmer et al., 1999). The present meta-analysis found that studies with control groups of only cancer-free smokers had the highest value for early detection, whereas studies with only patients with pulmonary benign disease as controls had the lowest. Our results indicated that diagnostic accuracy might be over- or under-estimated, respectively, in sputum DNA testing with only healthy controls or only patients with pulmonary benign disease (Anne et al., 2006). Therefore, screening programs should pay more attention to the selection of controls in order to assess diagnostic value accurately.
Another important factor that influenced the diagnostic value of sputum DNA testing was the quality of the selected studies. Methodology checklists for diagnostic test accuracy cover participant representativeness, selection criteria, selection method, blinding and so on (Penny et al., 2011). The present meta-analysis concluded that high quality studies had higher sensitivity than the low and medium quality studies. These findings indicate that robust methodological design is important for diagnostic tests (Brian et al., 2011). In addition, the present study suggested that diagnostic accuracy had been overestimated in some low quality studies.
After systematic review of 22 studies, we identified several common limitations. Firstly, many studies were flawed by inappropriate selection of controls. An ideal diagnostic study should recruit both healthy controls and patients with pulmonary benign disease (Whiting et al., 2004). Of the 22 studies, 11 selected only cancer-free smokers as controls. Such inadequate control groups would likely lead to over-estimation of specificity (Lijmer et al., 1999; Pai et al., 2003). Furthermore, many of the available studies did not report blinding (Jadad et al., 1996). The present meta-analysis revealed that the absence of blinding and low-quality study design would likely lead to over-estimation of diagnostic accuracy. Lastly, most studies suffered from a small sample size, as only 6 studies had a sample size greater than 150. Small sample size is a serious limitation when interpreting the findings and increases the potential for bias (Gordon et al., 2011).
In conclusion, this was the first meta-analysis of sputum DNA testing for NSCLC. The current evidence suggests that the diagnostic accuracy of aberrant methylation of genes in sputum samples is at least not lower than that of single biomarkers for NSCLC. However, the overall accuracy of the test is currently not high enough for it to be used for the detection of NSCLC in clinical practice. The discovery and evaluation of additional biomarkers with improved sensitivity and specificity in high quality studies deserve further investigation.

Figure 1. Flow Diagram of Studies through the Review Process

Figure 2. A) Forest plots of sensitivity of sputum DNA testing in NSCLC; B) Forest plots of specificity of sputum DNA testing in NSCLC; C) Forest plots of PLR of sputum DNA testing in NSCLC; D) Forest plots of NLR of sputum DNA testing in NSCLC; E) Forest plots of DOR of sputum DNA testing in NSCLC; F) SROC curves for sputum DNA testing for the detection of NSCLC.

Figure 3. Assessment of the Potential Publication Bias in the Detection of NSCLC