Artificial Neural Network for Prediction of Distant Metastasis in Colorectal Cancer


Akbar Biglarian1*, Enayatollah Bakhshi1, Mahmood Reza Gohari2, Reza Khodabakhshi3

Introduction
Colorectal cancer (CRC) is one of the most common cancers and a major health problem worldwide (Boyle and Langman, 2000; Parkin, 2004; Moradi et al., 2009). Higher incidence rates have been reported in Australia, North America, and Northern and Western Europe, and lower rates in Asia and particularly Africa (Ferlay et al., 2010). However, the CRC incidence rate is increasing in Asia (Song et al., 2005; Missaoui et al., 2010; WHO, 2010). In Iran, CRC is the third most common cancer in females and the fifth in males (Iranian annual national cancer registration report, 2007), and the five-year survival probability of CRC patients has been reported as 0.45 and 0.39 for women and men, respectively (Moradi et al., 2009).
Distant metastasis (DM) in CRC patients is defined as metastasis to one organ or site, or metastasis to more than one organ/site or the peritoneum (American Cancer Society, 2011). The main sites of metastasis are the liver, peritoneum, and lungs for colon cancer, and the liver, lungs, and adrenal gland for rectum cancer (Disibio and French, 2008; Talmadge and Fidler, 2010; Coghlin and Murray, 2010; National Cancer Institute, 2010). DM is treated with systemic therapy (chemotherapy, biological therapy, targeted therapy, and, less commonly, hormonal therapy), local therapy (surgery, radiation therapy), or a combination of them. In addition, researchers are studying new ways to treat metastatic cancer (Disibio and French, 2008; Talmadge and Fidler, 2010; Coghlin and Murray, 2010; National Cancer Institute, 2010).
In the present study, an artificial neural network (ANN) model was used to predict DM in CRC patients, and its accuracy was compared with that of a logistic regression (LR) model.

Materials and Methods
In this study, we analyzed data from 1219 patients with CRC that had been collected by the cancer registry of the Research Center for Gastroenterology and Liver Disease of Shahid Beheshti University of Medical Sciences, Tehran, Iran (Asghari et al., 2009). We first excluded patients whose survival time was shorter than one month or longer than six years, as well as those who died of other causes. Accordingly, a total of 1007 patients (786 colon cancer and 204 rectum cancer patients) entered the study. Observed distant metastasis (metastasis to one organ or site, or metastasis to more than one organ/site or the peritoneum) over the follow-up was the outcome variable. The covariates were age at diagnosis (years), sex (female/male), ethnicity (Persian/Kurd/Azeri/Lur/other), marital status (married/other), high risk behavior (i.e., tobacco smoking, alcohol history, opium use, IV drug use, or betel use), pathologic stage grouping (primary/advanced), and first treatment (surgery/biopsy/chemotherapy/radiotherapy).
To predict DM by LR, backward stepwise selection was used for model building, based on main effects and all possible interaction terms; a p-value less than 0.05 was considered significant.
In the ANN strategy, the data were first divided into two subsets: a training/learning subset (70%) and a testing/validation subset (30%). The model was built on the training dataset using a multilayer perceptron (MLP) and then validated on the testing dataset. The area under the receiver operating characteristic curve (AUROC) and the concordance index (C index) were used to compare the predictive ability of the two models. The C index estimates the probability of concordance (agreement) between predicted and observed responses. In fitting the ANN model, we used a three-layer MLP network with 7 variables in the input layer, 5 to 16 nodes in the middle layer, and one node in the output layer, with the sigmoid transfer function in the middle and output layers and a back-propagation learning algorithm. Data were analyzed using R 2.14.1 software (available at: http://cran.r-project.org/).
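The workflow described above (70/30 split, three-layer sigmoid MLP trained by back-propagation, evaluation by the C index) can be illustrated with a minimal Python sketch. This is not the study's code: the analysis was done in R, and everything below is an assumption made for illustration, including the synthetic 7-feature data, the choice of 5 hidden nodes, the cross-entropy loss, the learning rate, and all function names. For a binary outcome, the C index computed here equals the AUROC.

```python
import math
import random


def sigmoid(z):
    # Clamp the argument so math.exp cannot overflow.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Forward pass; the last entry of each weight row is the bias."""
    h = [sigmoid(sum(w * v for w, v in zip(row[:-1], x)) + row[-1])
         for row in w_hidden]
    y = sigmoid(sum(w * v for w, v in zip(w_out[:-1], h)) + w_out[-1])
    return h, y


def train(data, n_hidden=5, epochs=300, lr=0.1, seed=0):
    """Online back-propagation (assumed loss: binary cross-entropy)."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, t in data:
            h, y = forward(x, w_hidden, w_out)
            delta_out = y - t  # error signal at the output node
            # Update hidden weights first, using the not-yet-updated w_out.
            for j in range(n_hidden):
                delta_h = delta_out * w_out[j] * h[j] * (1.0 - h[j])
                for i in range(n_in):
                    w_hidden[j][i] -= lr * delta_h * x[i]
                w_hidden[j][-1] -= lr * delta_h
            for j in range(n_hidden):
                w_out[j] -= lr * delta_out * h[j]
            w_out[-1] -= lr * delta_out
    return w_hidden, w_out


def c_index(scores, labels):
    """Share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = sum(1.0 if p > n else 0.5 if p == n else 0.0
                     for p in pos for n in neg)
    return concordant / (len(pos) * len(neg))


def make_toy_data(n, rng):
    """Synthetic stand-in for the 7 covariates; NOT the registry data."""
    data = []
    for _ in range(n):
        x = [rng.random() for _ in range(7)]
        t = 1 if x[0] + x[1] + rng.gauss(0, 0.1) > 1.0 else 0
        data.append((x, t))
    return data


rng = random.Random(42)
data = make_toy_data(300, rng)
rng.shuffle(data)
cut = int(0.7 * len(data))  # 70% training, 30% testing
train_set, test_set = data[:cut], data[cut:]

w_hidden, w_out = train(train_set, n_hidden=5)
scores = [forward(x, w_hidden, w_out)[1] for x, _ in test_set]
labels = [t for _, t in test_set]
print("C index on the held-out 30%:", round(c_index(scores, labels), 3))
```

In practice the number of hidden nodes would be tuned over the stated range of 5 to 16, and categorical covariates such as ethnicity would need to be encoded numerically before entering the input layer.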

Results
Of the 204 rectum cancer patients, 128 (62.7%) were men and the rest were women; 92.5% of these patients were married. The mean±SD age at diagnosis (in years) for men and women was 51.47±14.3 and 53.96±12.3, respectively. The 25th, 50th, and 75th percentiles of age at diagnosis were 43, 52, and 63 years for men and 44, 54, and 66 years for women. Of the 788 colon cancer patients, 484 (61.6%) were men and the rest were women; 92.8% of these patients were married. The mean±SD age at diagnosis for men and women with this cancer was 51.5±14.2 and 55.11±14.6, respectively. The mean±SD age at diagnosis for patients with distant metastasis was 52.88±13.99 (49.73±15.23 for women and 54.31±13.07 for men). Most CRC patients had surgery as the first treatment (91.7% for rectum and 73.6% for colon). Only 42.5% of rectum and 36.9% of colon cancer patients had at least one high risk behavior. The tumor was at an advanced stage in 48.5% of rectum and 57.8% of colon cancer patients (Table 1).
Based on the validation set, the ANN model was used to determine the important factors. In the ANN importance analysis, pathologic stage grouping, first treatment, sex, age at diagnosis, ethnicity, marital status, and high risk behavior were, in that order, the important factors for colon cancer. First treatment, ethnicity, pathologic stage grouping, age at diagnosis, sex, high risk behavior, and marital status were, in that order, the important factors for rectum cancer (Table 2).
The specificity and sensitivity of the ANN model vs. the LR model for colon cancer data were 91.4% vs. 92.3% and 48.6% vs. 32.4%, respectively. That is, the ANN and LR predictions identify patients without DM about equally well, but the ANN predictions identify patients with DM better than the LR predictions. For rectum cancer data, the specificity and sensitivity of the ANN model were 85.7% and 44.4%, respectively (Table 3).

Discussion
After primary therapy, including main and adjuvant treatment, patients are usually placed on a follow-up schedule. Patients are sometimes lost during follow-up, so the ability to identify high-risk patients would let us focus the follow-up program on detecting distant failure in time. Such a prediction could improve our insight in the future. Researchers are studying new ways to treat metastatic cancer (Disibio and French, 2008; Talmadge and Fidler, 2010; Coghlin and Murray, 2010; National Cancer Institute, 2010). Related published studies have reported that ANN prediction of lymph node metastasis was more accurate in esophageal cancer (Kan et al., 2004), gastric cancer (Bollschweiler et al., 2004), head and neck cancer (Darby et al., 2005), and breast cancer (Baltzer et al., 2010). Similarly, our findings showed that the ANN strategy is more accurate than the LR model for predicting DM in CRC patients. Accurate prediction of DM may improve CRC care and may affect patient survival. In conclusion, the ANN model is suggested as a suitable tool for predicting DM in CRC patients and may be applied clinically in the future.

Table 2. ANN and LR Modeling Results to Determine the Important Factors on DM in CRC
* For the rectum cancer data, the LR model did not fit the data.

Table 3. Classification Accuracy of ANN and LR Models for DM in the Validation Set of CRC
* For the rectum cancer data, the LR model did not fit the data.
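
The sensitivity and specificity used to summarize classification accuracy follow from cross-classifying predicted against observed DM status. A minimal Python sketch is given below; the function name is illustrative and the counts are hypothetical, chosen only to make the arithmetic easy to follow. They are not the study's validation data.

```python
def sensitivity_specificity(predicted, observed):
    """Return (sensitivity, specificity) for 0/1 predictions and outcomes."""
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p == 1 and o == 1)  # DM correctly flagged
    fn = sum(1 for p, o in pairs if p == 0 and o == 1)  # DM missed
    tn = sum(1 for p, o in pairs if p == 0 and o == 0)  # no DM, correctly cleared
    fp = sum(1 for p, o in pairs if p == 1 and o == 0)  # no DM, wrongly flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient with and one without DM")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical validation set: 50 patients with DM, 100 without.
observed = [1] * 50 + [0] * 100
predicted = [1] * 30 + [0] * 20 + [0] * 90 + [1] * 10

sens, spec = sensitivity_specificity(predicted, observed)
print(sens, spec)  # 0.6 0.9
```

Sensitivity is the share of patients with DM that the model identifies, and specificity is the share of patients without DM that it correctly clears, which is why a model can match another on specificity while differing clearly on sensitivity.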