Transcriptional Profiles of Peripheral Blood Leukocytes Identify Patients with Cholangiocarcinoma and Predict Outcome

Introduction
Cholangiocarcinoma (CCA) is a relatively rare type of primary liver cancer worldwide; however, its incidence is extremely high in Southeast Asia, especially in the northeast of Thailand. Risk factors for the development of CCA include long-standing inflammation and chronic injury of the biliary epithelium. Infection with the liver flukes Opisthorchis viverrini and Clonorchis sinensis has often been singled out as the leading risk factor in East and Southeast Asia (Sithithaworn et al., 2014). CCA is normally difficult to diagnose until the disease becomes advanced or disseminated, which makes it a tumor with an extremely poor prognosis. Prognostic markers that predict tumor behavior would help inform the patient and clinician during the decision-making process. Whole-exome sequencing has revealed several somatic mutations in CCAs associated with different carcinogenic exposures (Ong et al., 2012; Chan-On et al., 2013). The identification of molecular markers associated with patient survival by a non-invasive approach is therefore essential for effective treatment of this cancer (Vaeteewoottacharn et al., 2014). Moreover, biomarkers associated with adverse outcome may themselves represent novel therapeutic targets (Wongkham and Silsirivanit, 2012; Silsirivanit et al., 2014).
It is now becoming clear that the tumor microenvironment, which is largely populated by leukocytes, is capable of promoting tumor cell invasion through expression of signaling molecules such as cytokines, chemokines, and growth factors. Peripheral blood leukocytes (PBLs) normally act as a comprehensive surveillance system that may change function in the face of inflammation, infection and other diseases, including cancer. Recent studies have shown that PBLs can respond differentially to varying environmental, physiological or pathological conditions of the body. Therefore, a PBL gene expression signature has the potential to be a disease-specific marker for conditions such as autoimmune, cardiovascular and neurological diseases and cancers (Mohr and Liew, 2007), and to be useful in diagnosis or in predicting response to therapy (Baranzini et al., 2005; Desire et al., 2013).
Tumor associated macrophages are known to be the major class of infiltrating leukocytes in solid tumors and are accepted as playing promotional roles in tumorigenesis (Mantovani et al., 2004). The pro-tumorigenic functions of tumor associated macrophages in certain cancers are related to their differentiation state as M2-polarized macrophages, which release various factors that support tumor growth, metastasis, angiogenesis and tissue remodeling and that suppress adaptive immunity (Pollard, 2004). We have recently shown that MAC387-positive tissue macrophages, through their proteolytic activities, play important roles in CCA tumor progression and are significantly related to shortened patient survival (Subimerb et al., 2010a). Our group also demonstrated that the level of CD14+CD16+ monocytes was elevated in the blood of CCA patients and correlated with the degree of MAC387-positive tumor associated macrophages (Subimerb et al., 2010b). Recently, it has been shown that activated macrophages secrete various cytokines that can induce epithelial-mesenchymal transition in CCA cell lines (Techasen et al., 2012).
Since circulating blood is easily accessible and has been suggested as an alternative to tissue samples for molecular profiling of human disease (Wopereis et al., 2012), disease risk (Yang et al., 2012) and cancer diagnosis or prognosis (Burczynski et al., 2005; Osman et al., 2006), we hypothesized that the expression profile of circulating blood cells from CCA patients could be used to predict clinical outcome. In this study, we show for the first time that a small set of three PBL-expressed genes related to proteolytic function discriminated good from poor prognostic outcome in CCA patients.

Study subjects
CCA subjects were patients who were admitted to Srinagarind Hospital, Faculty of Medicine, Khon Kaen University, Thailand, for surgical treatment of CCA. Peripheral blood was collected prospectively before any therapy. Only samples from patients with histologically proven CCA were included in the study. Tumor stage was defined according to the American Joint Committee on Cancer Staging Manual (Greene et al., 2002). Survival of each CCA subject was recorded after surgery for an average of 8 months, and 48.57% (17/35 cases) of the patients died during this follow-up period. Control subjects were blood donors with no known medical condition and normal blood laboratory tests who were age and sex matched with the CCA patients. All subjects provided written informed consent before entry into the study. The protocol for this study was approved by the Human Research Ethics Committee, Khon Kaen University (HE471214 and HE480312).

RNA extraction and one cycle eukaryotic expression sample processing
PBL gene expression profiling was performed on 9 CCA patients and 8 healthy subjects. Heparinized blood (0.5 ml) was added to 4.5 ml of TRIzol® reagent (Invitrogen, Carlsbad, CA, USA). Total RNA was extracted using TRIzol® reagent with the PureLink™ Micro-to-Midi system kit (Invitrogen) according to the manufacturer's protocol. The integrity and amount of RNA were determined with a 2100 Bioanalyzer RNA LabChip (Agilent Technologies, Palo Alto, CA, USA) and a NanoDrop (ND-1000 spectrophotometer and ND1000 version 3.2.1 software), respectively. Total RNA was converted to cDNA, from which biotinylated cRNA was synthesized; the cRNA was then fragmented and hybridized to the Human Genome U133 Plus 2.0 oligonucleotide microarray (approximately 54,675 probe sets or ~39,000 genes; Affymetrix Inc., Santa Clara, CA, USA) in a GeneChip® Hybridization Oven 640. Array chips were stained with streptavidin-phycoerythrin and biotinylated anti-streptavidin antibody using a GeneChip® Fluidics Station 450. RNA expression levels were quantitated by measuring the fluorescence intensity using a GeneChip® Scanner 3000.

Gene expression analysis
Arrays were scanned using standard Affymetrix protocols and an Affymetrix GSC3000 scanner. Each probe set was considered as a separate gene. Expression values for each gene were captured using GeneChip Operating Software (GCOS; Affymetrix, Santa Clara, CA). Partek software (Agilent Technologies, Palo Alto CA) was used for downstream analysis of the GCOS-processed data (principal component analysis, differentially expressed genes, hierarchical clustering analysis, Venn diagram and Gene Ontology). Principal component analysis (PCA) was conducted on all genes analyzed to assign the general variability in the data to a reduced set of variables called principal components. Replicate sample analysis was performed on all possible pair-wise comparisons between arrays originating from CCA patients and healthy controls. Signals from all probe sets were normalized using the Human Genome U133 Plus 2.0 Array Normalization Controls. Differentially expressed genes were identified by analysis of variance (ANOVA). Hierarchical cluster analysis was done on each comparison to assess correlations among samples for each identified gene set. The criteria for selecting differentially expressed genes between CCA and healthy subjects were: (i) the mean fluorescence intensity of the probe set was equal to or greater than 8 for all up- or down-regulated genes; (ii) the difference in expression was significant at p<0.05; and (iii) the expression level differed from that of the control group by at least 1.5 fold.
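The three selection criteria above can be illustrated with a minimal sketch. It is not the Partek workflow used in the study: the record fields (`mean_intensity`, `p_value`, `ratio`), the probe names and all values are hypothetical, and the sketch assumes that the fold change is supplied as a CCA/healthy expression ratio, so that a 1.5-fold decrease corresponds to a ratio of 1/1.5 or less.

```python
def select_probes(probes, min_intensity=8.0, max_p=0.05, min_fold=1.5):
    """Split probe sets into up- and down-regulated lists.

    A probe set is kept only if it passes all three filters:
    mean intensity >= min_intensity, p-value < max_p, and a
    CCA/healthy ratio at least min_fold away from 1 in either direction.
    """
    up, down = [], []
    for probe in probes:
        if probe["mean_intensity"] < min_intensity:
            continue  # criterion (i): signal too weak
        if probe["p_value"] >= max_p:
            continue  # criterion (ii): not significant
        ratio = probe["ratio"]
        if ratio >= min_fold:
            up.append(probe["id"])          # criterion (iii), up-regulated
        elif ratio <= 1.0 / min_fold:
            down.append(probe["id"])        # criterion (iii), down-regulated
    return up, down


# Hypothetical records, for illustration only.
example = [
    {"id": "probe_A", "mean_intensity": 9.2, "p_value": 0.010, "ratio": 2.4},
    {"id": "probe_B", "mean_intensity": 8.5, "p_value": 0.030, "ratio": 0.5},
    {"id": "probe_C", "mean_intensity": 7.1, "p_value": 0.001, "ratio": 3.0},
    {"id": "probe_D", "mean_intensity": 10.0, "p_value": 0.200, "ratio": 1.8},
    {"id": "probe_E", "mean_intensity": 9.0, "p_value": 0.020, "ratio": 1.2},
]
up, down = select_probes(example)
print(up, down)
```

In this toy input, only probe_A and probe_B pass all three filters; the others fail on intensity, significance or fold change, respectively.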

Real-time RT-PCR assay validation
Transcript copy number for specific genes of interest was determined in the samples used for the arrays (9 CCA patients and 8 healthy subjects) and in the training set (35 CCA patients and 20 healthy subjects) using an adaptation of a two-step real-time reverse transcriptase-polymerase chain reaction (real-time RT-PCR) method. Real-time RT-PCR for specific genes of interest and the internal control was performed using a SYBR green assay. cDNA was synthesized using a Kit for RT-PCR [AMV] according to the manufacturer's instructions. PCR was performed on a LightCycler using the LightCycler FastStart DNA Master SYBR Green I kit and ~5 ng of cDNA sample. Threshold cycles (Ct) for each amplification reaction were determined using LightCycler Software version 3. All samples were amplified with the human β-actin LightCycler-Primer Set, and β-actin was used as the internal control for normalization. Relative change in CCA-specific gene expression was quantified using the $2^{-\Delta\Delta C_t}$ method, where $\Delta\Delta C_t = (C_{t,\mathrm{target}} - C_{t,\mathrm{actin}})_{\mathrm{patients}} - (C_{t,\mathrm{target}} - C_{t,\mathrm{actin}})_{\mathrm{healthy}}$. The expression of each target gene in healthy subjects was set to 1X. Thus, relative change values of more than 1X represented up-regulation, whereas values of less than 1X represented down-regulation.
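As a worked illustration of the relative quantification described above, the following sketch computes the fold change from four threshold cycles. The function name and the Ct values are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from this study.

```python
def relative_expression(ct_target_patient, ct_actin_patient,
                        ct_target_healthy, ct_actin_healthy):
    """Return the fold change 2^(-ddCt) of a target gene, normalized to beta-actin."""
    delta_patient = ct_target_patient - ct_actin_patient   # dCt in patients
    delta_healthy = ct_target_healthy - ct_actin_healthy   # dCt in healthy subjects
    ddct = delta_patient - delta_healthy
    return 2.0 ** (-ddct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in patients than in healthy subjects, with identical beta-actin Ct.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold)  # ddCt = (24 - 18) - (26 - 18) = -2, so fold = 2^2 = 4.0
```

A value above 1 (here 4.0) indicates up-regulation relative to healthy subjects, and identical Ct values in both groups give exactly 1.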

Statistical analysis
Statistical analyses were done using SPSS statistical software version 16.0.1 (SPSS Inc., Chicago, Illinois, USA) and STATA version 8 (Stata Corporation, College Station, Texas, USA). The expression of each candidate gene was compared between groups using Student's t-test. Cox regression was used to establish a prognostic index from the expression levels of candidate genes and eventual patient outcome. Kaplan-Meier survival analysis was used to estimate disease-specific survival, and comparisons between groups were done with the log-rank test. Cross tabulations were analyzed with the chi-squared test for associations between the prognostic index and clinicopathological features of CCA patients.
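For readers unfamiliar with the Kaplan-Meier estimate mentioned above, a minimal product-limit sketch is given here. It is an illustration only: the study used SPSS and STATA, the function name and follow-up data below are hypothetical, and the log-rank comparison between groups is not shown.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time of each patient (e.g. days)
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival_probability) at each death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # deaths and censored patients leave the risk set
    return curve


# Hypothetical follow-up data for five patients (days, death indicator).
curve = kaplan_meier([100, 200, 200, 300, 400], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

With these toy data the survival estimate steps down at days 100, 200 and 300; censored patients reduce the number at risk without producing a step.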

Identification of differentially expressed genes through cDNA microarray
Since we aimed to identify a molecular signature of PBLs that could be used as a prognostic marker for CCA, the expression profiles of PBLs obtained from CCA patients and healthy controls were compared. Peripheral blood cells from four intrahepatic and five extrahepatic CCA patients (mean age=57.7 years, range: 45.7-69.7 years) and eight healthy individuals (mean age=49.1 years, range: 45.54-52.6 years) were used for Affymetrix GeneChip system analysis. The mean age of these two groups was not significantly different. The principal component analysis (PCA) for determining the expression trend within the dataset of CCA patients and healthy subjects demonstrated a system variance of 49.2% between the transcriptomes of the two groups (Figure 1A). Using a P value of less than 0.05 and a 1.5-fold change, there were 2,199 genes equally expressed in the PBLs from both groups, whereas 117 genes were up-regulated and 60 genes were down-regulated in the PBLs from CCA patients (Figure 1B). The differentially expressed genes between CCA and healthy subjects were arranged according to the similarities in gene expression patterns using cluster analysis. The hierarchical clustering analysis of the up- or down-regulated genes resulted in a clear separation of the CCA patients from healthy controls (Figure 1C-1D). The molecular and cellular functions predicted for the differentially expressed genes were determined according to the Gene Ontology database. Genes that were differentially expressed in peripheral blood leukocytes of CCA patients are shown in Table 1. The over-expressed genes were related to protease, peptidase and invasion; growth factor/angiogenic/cell adhesion; and M2-related immune response and cytokine/chemokine functions, whereas the down-regulated genes were associated with M1-related immune response and cytokine/chemokine functions, protein biosynthesis and regulatory proteins. The top significant biological functions predicted by Ingenuity Pathway Analysis (IPA) software were antigen presentation, cell death, cellular movement, cell-to-cell signaling and interaction, and cellular growth and proliferation (Figure 2).

Figure 1. Principal Component Analysis for Expression Trend between CCA Patients and Healthy Subjects.
A) The ellipsoid view showed that specimens were grouped by disease. The result showed 49.2% of the system variance between the expression of PBLs from 9 CCA patients (red dots) and 8 healthy subjects (blue dots); B) Of the 177 genes, 117 genes were up-regulated and 60 genes were down-regulated in the PBLs from CCA patients; C, D) Hierarchical clustering analysis of up- (C) or down- (D) regulated genes in the expression profiles of PBLs from CCA patients and healthy controls.

Selection of candidate genes and verification
Since our recent studies indicated that tumor associated macrophages have proteolytic function and that elevated levels of these cells were associated with poor outcome in CCA patients [11,12], a set of 11 genes that were differentially expressed in PBLs from CCA vs. healthy subjects and associated with tumor progression was selected. The initially identified candidates, comprising nine up-regulated genes (PLAU, SERPINB2, VEGFA, EREG, MMP9, IL-8, PTGS, CTSL, and CXCL3) and two down-regulated genes (CXCL10 and TLR8), were investigated by quantitative reverse transcription (RT)-PCR. Using the same set of specimens as for the microarray analysis, the expression level of each gene obtained from the quantitative RT-PCR was comparable with that identified in the microarray analysis (Figure 3).

Construction and validation of classification models for predicting the outcome of CCA patients
The nine up-regulated genes were evaluated for their potential as a prognostic signature for CCA in a training set of 35 CCA patients (Table 2) and 20 controls using quantitative RT-PCR. The expression level of each gene was evaluated as $2^{-\Delta\Delta C_t}$, where $\Delta\Delta C_t = (C_{t,\mathrm{target}} - C_{t,\mathrm{ref}})_{\mathrm{CCA}} - (C_{t,\mathrm{target}} - C_{t,\mathrm{ref}})_{\mathrm{Healthy}}$, and the expression of β-actin was used as reference. The $2^{-\Delta\Delta C_t}$ values of the nine up-regulated genes from the training data set were used to obtain the regression coefficients and hazard ratios, which were then used to define the "prognostic index" by the Cox regression model. The combination giving the best prognostic discrimination power comprised three invasion-related genes: PLAU, SERPINB2 and CTSL. The final equation for predicting the survival outcome of CCA patients (prognostic index) was 1.020 PLAU+1.214
To determine whether the prognostic index was related to patient prognosis, Kaplan-Meier curve analysis and the log-rank test were performed based on the levels of the prognostic index. Using the prognostic index at the 50th percentile of the training set (=4.21) as the cutoff point, the CCA patients were stratified into two groups: patients with a low (<4.21) and those with a high (≥4.21) prognostic index. The overall survival of CCA patients who had a high prognostic index (mean survival 239 days; 95% CI, 167-311 days) was significantly reduced when compared with those who had a low prognostic index (mean survival 403 days; 95% CI, 320-485 days) (p=0.021, Figure 4).
Since several clinical parameters have been shown to correlate with the prognosis of CCA patients, we further determined whether the association of the prognostic index with survival was confounded by clinical condition by performing univariate and multivariate Cox proportional hazard regression analysis. A univariate analysis of various clinical variables demonstrated that only staging and "prognostic index" were significant predictors of patient survival (p<0.05) (Table 3).
The multivariate Cox regression model for survival, which included age, sex, staging and "prognostic index" of the patients, indicated that both staging and "prognostic index" were independent predictors of survival for CCA, with hazard ratios of 3.85 (95% CI, 0.64-23.05; stage III versus I-II), 7.35 (95% CI, 1.40-40.31; stage IV versus I-II) and 3.61 (95% CI, 0.95-13.72; high versus low score), respectively.
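The median-split stratification described above can be sketched as follows. This is a minimal illustration, assuming the prognostic index of each patient has already been computed from the Cox model; the function name, patient labels and index values are hypothetical and are not data from this study.

```python
from statistics import median


def stratify_by_cutoff(index_by_patient, cutoff=None):
    """Split patients into low (< cutoff) and high (>= cutoff) groups.

    If no cutoff is given, the 50th percentile (median) of the supplied
    prognostic index values is used.
    """
    if cutoff is None:
        cutoff = median(index_by_patient.values())
    groups = {"low": [], "high": []}
    for patient, value in index_by_patient.items():
        groups["high" if value >= cutoff else "low"].append(patient)
    return cutoff, groups


# Hypothetical prognostic index values for five patients.
scores = {"P1": 2.9, "P2": 5.6, "P3": 4.21, "P4": 3.8, "P5": 6.3}
cutoff, groups = stratify_by_cutoff(scores)
print(cutoff, groups)
```

The two resulting groups would then be compared by Kaplan-Meier analysis and the log-rank test; a patient whose index equals the cutoff falls in the high group, matching the ≥4.21 rule in the text.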

Discussion
Differential gene expression signature profiling between primary tumor and normal tissues has been used for diagnostic and prognostic purposes in various cancers including CCA (Subrungruanga et al., 2014). In this study, we report the feasibility of using PBL gene expression profiling to predict the overall survival of CCA patients.
The principal component and hierarchical clustering analyses of the expression profiles of PBLs obtained from CCA patients and healthy subjects indicated a distinct set of 177 genes that were differentially expressed in the PBLs of CCA patients. The biological functions of these differentially expressed genes suggest a pro-tumorigenic role and suppression of the immune response in CCA patients. Growth and cell cycle regulatory genes, e.g., epiregulin (EREG), vascular endothelial growth factor (VEGF), and transforming growth factor, beta 1 (TGFB1), were 1.5-4 fold up-regulated in PBLs of CCA patients. The PBL signature of CCA patients was also associated with tumor progression. For example, angiogenic chemokines (CXCL2, CXCL3 and CXCL8/IL8), which are potent promoters of angiogenic activity, were up-regulated, whereas CXCL10 (an interferon-inducible chemokine), a potent inhibitor of angiogenesis, was suppressed. The contribution of CXCL1, 2, and 3 to angiogenesis and tumor progression has been shown in immortalized murine melanocytes (Luan et al., 1997; Owen et al., 1997). Elevation of tumor associated CXCL8/IL8 within tumors correlates with neovascularization and is inversely correlated with survival in patients with ovarian carcinoma and non-small cell lung carcinoma (Smith et al., 1994; Yoneda et al., 1998), whereas CXCL10 mediates its angiostatic activity via CXCR3 on endothelium (Strieter et al., 2006). Taken together, the up-regulation of angiogenic CXC chemokines (CXCL2, 3 and 8) and down-regulation of the angiostatic CXC chemokine (CXCL10) observed in CCA-PBLs may reflect the tumor angiogenic environment in CCA. A recent limited genetic study of CCA patient blood showed that a minor blood monocyte subpopulation (CD14+CD16+) expressed elevated levels of growth and angiogenic factor related genes, e.g., epiregulin, VEGF-A, and CXCL3, as compared to normal subjects and biliary disease patients (Subimerb et al., 2010a).
Although the current study reports the transcriptome of CCA PBLs as a whole, the previously described CD14+CD16+ monocyte subset within the PBL population may be responsible for the disease-specific genes identified in this study. The use of PBLs for genetic analysis in CCA allows for the development of a simple test that does not require separation into cellular subsets prior to analysis.
Certain chemokines are differentially expressed in polarized macrophages (e.g., CXCL10 for classically activated macrophages or M1; CCL17 for alternatively activated macrophages or M2) (Martinez et al., 2006). In the present study, the down-regulation of CXCL10 in CCA-PBLs may have been associated with a Th2 response, which is known to promote monocytes/macrophages toward an M2 phenotype. Moreover, it was shown recently that supernatants from human CCA cell lines induced activation of signal transducer and activator of transcription-3 (Stat3) and macrophage polarization toward the M2 phenotype (Hasita et al., 2010). These M2-type macrophages are well known for supporting tumor progression and suppressing immune responses.
Seven genes involved in proteolytic degradation of extracellular matrix, a process which supports tumor invasion, were differentially expressed in CCA-PBLs. These genes included plasminogen activator urokinase (PLAU), matrix metalloproteinase 9 (MMP9), serpin peptidase inhibitor clade B member 2 (SERPINB2), cathepsin L (CTSL), ADAM metallopeptidase domain 9 (ADAM9), and TIMP metallopeptidase inhibitor 1 (TIMP1). This information supports our recent report that tumor associated macrophages (TAMs) in CCA tissues, especially at the leading edge of the tumor, expressed PLAU and MMP9 proteins (Subimerb et al., 2010b). In addition, patients with a high density of MMP9 and PLAU expressing TAMs had a reduced overall survival after surgical resection (Subimerb et al., 2010b). Recently, it was shown that lipopolysaccharides (LPS) could elevate Wnt3 expression and activate the production of several cytokines in a macrophage cell line. The LPS-activated macrophages reduced the expression of the epithelial markers E-cadherin and CK-19 and enhanced the expression of the mesenchymal markers S100A4 and MMP9 (Techasen et al., 2012) and of beta-catenin (Loilome, 2014) in CCA cell lines. These data suggest that evaluation of PBL gene expression may identify genes that encode molecules present in TAMs, are critical for tumor cell invasion, and are directly associated with CCA patient survival.
Recently, small sets of genes from the differential expression profiles of several cancers have been used as molecular signatures for tumor diagnosis and prognosis in patients with urinary bladder cancer, breast cancer (Osman et al., 2006; de Reynies et al., 2009; Ma et al., 2008), adrenocortical tumor, colorectal cancer (Xu et al., 2013) and intrahepatic CCA (Nishino et al., 2008; Kraiklang et al., 2014). From the typical expression profile found in PBLs of CCA patients, we identified a set of three proteolysis-related genes (PLAU, SERPINB2, CTSL) that, when combined into a "prognostic index", had significant power to predict the prognosis of CCA patients. CCA patients who have PBLs with high expression of PLAU, SERPINB2, and CTSL (high prognostic index) have shorter survival than patients who have PBLs with low expression of these genes (low prognostic index). This association corresponds with and supports our previous finding that CCA patients with a high density of MMP9 and PLAU expressing TAMs in tumor tissues had shorter survival than those with a low density of MMP9 and PLAU expressing TAMs (Subimerb et al., 2010b). In addition, the multivariate Cox regression model for survival indicated that the prognostic index was an independent predictor of survival for CCA. In summary, this study supports the idea of using PBL transcriptome analysis as an accessible surrogate monitor of a tissue and system that are not easily sampled by standard approaches.
From a technical point of view, one can argue that the PBLs from CCA patients may be contaminated with circulating tumor cells, which may have affected gene expression patterns in the present study, since tumor cells are known to circulate in the blood at a low frequency (Paterlini-Brechot and Benali, 2007). To test whether the results from the current study were influenced by the presence of blood-borne tumor cells, three databases containing gene expression profiles from primary tissues of CCA (Jinawath et al., 2006; Obama et al., 2005; Wang et al., 2006) were checked for similarities with the profile from the PBL transcriptome. No highly expressed transcripts present in primary CCA tumor tissues were found in our database and vice versa. In addition, the differentially expressed genes in CCA-PBLs did not include any epithelial cell related genes. Therefore, the data presented in this study reflect the expression of PBL genes without any significant contribution of genes expressed in CCA tissue.
The expression profile of PBLs from CCA subjects could probably be used as a diagnostic marker for CCA. However, more control groups from related pathological conditions, such as patients with benign biliary diseases, liver fluke infection and other gastrointestinal cancers, are needed. In conclusion, informative gene expression profiles of PBLs that could reliably distinguish CCA patients from healthy subjects were identified. On the basis of these data, a small set of candidate genes differentially expressed in CCA-PBLs has the potential to be developed into a test that could predict the survival of CCA patients. However, this approach will need to be evaluated more extensively in a larger sample to achieve better discrimination power before being applied to clinical practice.