MicroRNA Expression Profile Analysis Reveals Diagnostic Biomarker for Human Prostate Cancer

Prostate cancer is a highly prevalent disease in older men of the western world (Chan et al., 2004; Linton and Hamdy, 2004). It is estimated that 241,740 men will be diagnosed with and 28,170 men will die of prostate cancer in 2012 in the United States (Siegel et al., 2012). Although the age-adjusted rate of cancer deaths has decreased steadily in the past 10 years, prostate cancer remains the second leading cause of cancer deaths in men after lung cancer (Shen and Abate-Shen, 2010). The morbidity and mortality of prostate cancer are principally caused by its propensity to metastasize to other tissues, such as lung, liver and bone (Bubendorf et al., 2000; Logothetis and Lin, 2005). MicroRNAs (miRNAs) were discovered in 1993 by Victor Ambros et al. during a study of the gene lin-14 in C. elegans development (Lee et al., 1993). They are short, non-coding RNAs with an average length of 22 nucleotides that usually bind to partially complementary sites in the 3’-untranslated region (UTR) of their mRNA targets (Lee et al., 1993; Wightman et al., 1993). They regulate gene expression at the posttranscriptional level by mRNA cleavage or translational suppression and play important roles in various biological and metabolic processes (Bartel, 2004; Min and Yoon, 2010). Associations between miRNA expression and tumorigenesis have been observed in a variety of human malignancies.

Several miRNA expression profiles have been reported for prostate cancer (Lu et al., 2005; Mattie et al., 2006; Porkka et al., 2007; Ozen et al., 2008). However, there is still an urgent need to identify new miRNAs as diagnostic biomarkers for prostate cancer. In this study, we collected a miRNA expression microarray dataset of prostate cancer from the GEO database and identified miRNAs differentially expressed in cancerous tissue compared with normal tissue. After network analysis and pathway enrichment analysis, we found that miR-20 may play an important role in the regulation of prostate cancer onset.

Affymetrix microarray data
We extracted the miRNA expression profile from the study of Wach et al. (2012), which was deposited in the GEO (Gene Expression Omnibus) database (ID: GSE23022). The study was carried out to identify and characterize the diagnostic potential of miRNAs in prostate cancer. A total of 40 chips were available, including 20 chips from moderately differentiated prostate cancer and 20 chips from adjacent noncancerous tissue. These tissues were prepared from prostatectomy specimens obtained between 1994 and 1999 from men with previously untreated prostate cancer. None of the patients had detectable distant metastases at the time of surgery.

Pathway data
KEGG (Kyoto Encyclopedia of Genes and Genomes) is a collection of online databases dealing with genomes, enzymatic pathways, and biological chemicals (Kanehisa, 2002). The PATHWAY database records networks of molecular interactions in cells, and variants of them specific to particular organisms (http://www.genome.jp/kegg/). A total of 130 pathways, involving 2287 genes, were collected from KEGG.

Data preprocessing
The probe-level data in CEL files were converted into expression measures, with background correction and quantile normalization performed by the robust multiarray average (RMA) algorithm (Irizarry et al., 2003) with default parameters in the R affy package (Gautier et al., 2004; Team, 2011).
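To make the normalization step concrete, the following is a minimal Python sketch of quantile normalization only. It is not the affy implementation used in this study: background correction and probe-set summarization are omitted, ties are broken arbitrarily by sort order, and the function name `quantile_normalize` and the toy intensities are assumptions made for illustration.

```python
from statistics import mean


def quantile_normalize(columns):
    """Quantile-normalize arrays given as equal-length lists (one list per chip).

    Illustrative sketch: ties are ordered arbitrarily rather than averaged.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all arrays must contain the same number of probes")

    # Sort each array, then average across arrays at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [mean(col[i] for col in sorted_cols) for i in range(n)]

    normalized = []
    for col in columns:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            # Each probe receives the cross-array mean for its rank.
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy intensities for three hypothetical chips with four probes each.
chips = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 4.5, 2.0], [3.0, 4.0, 6.0, 8.0]]
normalized_chips = quantile_normalize(chips)
```

After this step every chip shares the same empirical distribution of values, which is the property that makes intensities comparable across arrays.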

Differentially expressed miRNA (DE-miRNA) analysis
The t-test and the Wilcoxon test were each used to identify miRNAs that were significantly differentially expressed between prostate cancer tissue and noncancerous tissue. We then selected the DE-miRNAs identified by both methods as the final result. P-values were adjusted by the Benjamini and Hochberg (BH) method (Benjamini, 1995) using the multtest package (van der Laan et al., 2004), and an adjusted p-value of 0.05 was used as the cut-off criterion.
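The selection logic described here can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the multtest code used in the study: the raw p-values from the two tests are taken as given inputs, and the miRNA labels and p-values in the example are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def overlapping_hits(names, p_ttest, p_wilcoxon, alpha=0.05):
    """Keep miRNAs whose BH-adjusted p-value is below alpha in both tests."""
    adj_t = bh_adjust(p_ttest)
    adj_w = bh_adjust(p_wilcoxon)
    return [n for n, a, b in zip(names, adj_t, adj_w) if a < alpha and b < alpha]


# Hypothetical labels and raw p-values, for illustration only.
names = ["miR-a", "miR-b", "miR-c", "miR-d"]
p_ttest = [0.001, 0.02, 0.03, 0.8]
p_wilcoxon = [0.004, 0.01, 0.2, 0.9]
hits = overlapping_hits(names, p_ttest, p_wilcoxon)
```

Requiring significance in both a parametric and a rank-based test is the more conservative choice, since a miRNA flagged by only one of them is discarded.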

Obtaining target genes for DE-miRNAs
The TargetScan (Garcia et al., 2011), miRanda (John et al., 2004) and PicTar (Krek et al., 2005) databases were used to retrieve the target genes of DE-miRNAs. They generally take both miRNA sequences and the 3'UTR sequences of protein-coding mRNAs as input files in FASTA format and determine their binding ability by calculating the minimum free energy for hybridization. After predicting putative miRNA target genes with each tool under default parameters, we retained only those target genes shared by at least 2 of the 3 tools to obtain a more robust result.
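The "at least 2 of 3 tools" rule amounts to a simple vote count over predicted miRNA-gene pairs. The sketch below assumes each tool's predictions have already been exported as a set of (miRNA, gene) pairs; the function name and the placeholder identifiers are hypothetical.

```python
from collections import Counter


def consensus_targets(predictions, min_tools=2):
    """Return (miRNA, gene) pairs predicted by at least `min_tools` tools.

    predictions: dict mapping a tool name to an iterable of (miRNA, gene) pairs.
    """
    # set(pairs) ensures one tool cannot vote twice for the same pair.
    counts = Counter(pair for pairs in predictions.values() for pair in set(pairs))
    return {pair for pair, n in counts.items() if n >= min_tools}


# Placeholder predictions, for illustration only.
predictions = {
    "TargetScan": {("miR-x", "GENE1"), ("miR-x", "GENE2")},
    "miRanda": {("miR-x", "GENE1"), ("miR-y", "GENE3")},
    "PicTar": {("miR-x", "GENE2"), ("miR-y", "GENE3"), ("miR-y", "GENE4")},
}
shared = consensus_targets(predictions)
```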

Network analysis and pathway enrichment analysis
The STRING (Search Tool for the Retrieval of Interacting Genes) (Szklarczyk et al., 2011) database provides both experimental and predicted interaction information. Version 9.0 of STRING covers more than 1100 completely sequenced organisms. All associations are provided with a probabilistic confidence score, which is derived by separately benchmarking groups of associations against the manually curated functional classification scheme of the KEGG database. Each score represents a rough estimate of how likely a given association describes a functional linkage between two proteins that is at least as specific as that between an average pair of proteins annotated on the same 'map' or 'pathway' in KEGG. We used the STRING database to annotate functional interactions between DE-miRNA target genes and other genes by calculating their confidence scores.
To functionally classify the genes in the interaction network, we performed pathway enrichment analysis by mapping these genes to the KEGG database (Kanehisa, 2002). A gene count greater than 2 and a false discovery rate (FDR) below 0.01 were chosen as the cut-off criteria.
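The text does not name the statistical test behind the enrichment analysis. The sketch below assumes a one-sided hypergeometric test followed by BH correction, which is a common choice but an assumption here; the function names, pathway names and gene labels are all hypothetical, and only the two cut-offs (count greater than 2, FDR below 0.01) come from the text.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N genes, K in pathway, n drawn)."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted, running_min = [0.0] * m, 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enrich(network_genes, pathways, background, min_count=2, max_fdr=0.01):
    """Return (pathway, count, FDR) for pathways passing both cut-offs."""
    background = set(background)
    network = set(network_genes) & background
    names = sorted(pathways)
    counts, pvals = [], []
    for name in names:
        members = set(pathways[name]) & background
        k = len(network & members)
        counts.append(k)
        pvals.append(hypergeom_sf(k, len(background), len(members), len(network)))
    fdrs = bh_adjust(pvals)
    return [(name, k, fdr) for name, k, fdr in zip(names, counts, fdrs)
            if k > min_count and fdr < max_fdr]


# Toy background of 100 genes and two made-up pathways.
background = {f"g{i}" for i in range(100)}
pathways = {
    "pathway_A": {f"g{i}" for i in range(10)},
    "pathway_B": {f"g{i}" for i in range(50, 60)},
}
network_genes = ["g0", "g1", "g2", "g3", "g4", "g5", "g50", "g90", "g91"]
enriched = enrich(network_genes, pathways, background)
```

In this toy case only `pathway_A` is retained, since `pathway_B` shares a single gene with the network and therefore fails the count cut-off regardless of its p-value.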

Differentially expressed miRNA analysis between prostate cancer and healthy control
We obtained the publicly available microarray dataset GSE23022 from the GEO database. The t-test and the Wilcoxon test were used to identify the miRNAs differentially expressed between prostate cancer tissue and noncancerous controls, with multiple testing correction. At an adjusted p-value cut-off of 0.05, 16 miRNAs showed significant differential expression (Table 1).

Target genes of DE-miRNAs
Since miRNAs play important roles in posttranscriptional repression by targeting mRNAs, we studied the function of the DE-miRNAs by identifying their putative target genes. Target genes were retrieved from the TargetScan, miRanda and PicTar databases. Finally, we obtained 9 target genes corresponding to 3 miRNAs (Table 2). Among the 9 target genes, 4 (shown in bold) were previously validated by experiments and the rest remain putative.

Interaction network construction of DE-miRNA target genes
We mapped the DE-miRNA target genes to the STRING database and screened for significant interactions with a score larger than 0.9. By integrating these relationships, we constructed an interaction network between DE-miRNA target genes and their interacting genes (Figure 1). Only 3 of the above 9 target genes were present in this network: HIF1A (hypoxia inducible factor 1 A), VEGFA (vascular endothelial growth factor A) and CDKN1A (cyclin dependent kinase inhibitor 1A). The average degrees of these three genes were 44, 43 and 33, respectively. The average degree is the average number of edges connecting all the nodes in the network. Higher values of average degree indicate a better connected network that is likely more robust. This result suggests that these three genes were hub nodes in the network and may play critical roles in prostate cancer.
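The score filter and the degree calculation can be sketched as below. This is an illustration only, assuming interactions are available as (gene, gene, score) triples with scores on a 0 to 1 scale; the function name, gene labels and scores are hypothetical and do not reproduce the network in Figure 1.

```python
from collections import defaultdict


def node_degrees(edges, min_score=0.9):
    """Count distinct interaction partners per gene among high-confidence edges.

    edges: iterable of (gene_a, gene_b, score), score assumed to lie in [0, 1].
    """
    neighbors = defaultdict(set)
    for a, b, score in edges:
        # Keep only edges above the cut-off and ignore self-interactions.
        if score > min_score and a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {gene: len(partners) for gene, partners in neighbors.items()}


# Placeholder interactions, for illustration only.
edges = [
    ("TARGET1", "P1", 0.95),
    ("TARGET1", "P2", 0.99),
    ("TARGET1", "P3", 0.85),  # dropped: score not above 0.9
    ("TARGET2", "P1", 0.92),
    ("P1", "P2", 0.91),
]
degrees = node_degrees(edges)
# Network-level average degree: mean number of edges per node.
average_degree = sum(degrees.values()) / len(degrees)
```

Genes whose degree is well above the network average are the natural candidates for hub nodes.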

Pathway enrichment analysis of genes in the interaction network
We performed pathway enrichment analysis by mapping genes in the interaction network to the KEGG database. A total of 16 pathways were enriched under the strict cut-off criteria, including pathways in cancer, cell cycle and prostate cancer (Table 3). Figure 2 shows the KEGG map of prostate cancer. We found that the DE-miRNA target gene CDKN1A (cyclin-dependent kinase inhibitor 1A, p21) is an oncogene of prostate cancer. As CDKN1A is the target gene of hsa-miR-20, we concluded that miR-20 might play an important role in the regulation of prostate cancer onset.

Discussion
In this study, we analysed the expression profile of 678 human miRNAs in 20 matched tissue samples of histologically confirmed prostate cancer tissue and adjacent nonmalignant tissue downloaded from the GEO database. A total of 16 miRNAs displayed significant differential expression in cancerous tissue compared to noncancerous tissue. Among these DE-miRNAs, we found that miR-20 may play an important role in the regulation of prostate cancer onset.
Two previous studies using these same expression data have been published. The first was the initial description of the population and the array data (Wach et al., 2012), in which the authors identified a total of 25 miRNAs whose expression differed between the groups and indicated that miRNAs, as single biomarkers or in combination, could be useful in the diagnosis of prostate cancer. The second study reanalyzed the original data and identified a miRNA for normalization in miRNA expression studies of prostate cancer. Our current study reanalyzes the original data and adds interaction network analysis, which provides information on the relationships between DE-miRNA target genes and other genes. With the strict cut-off score of 0.9, 3 target genes (HIF1A, VEGFA and CDKN1A) were identified in the network and were shown to be hub nodes. The importance of a gene often depends on how well it associates with other genes in a network. Studies suggest that more centralized genes in a network are more likely to be key drivers of proper cellular function than peripheral genes (nodes) (Horvath et al., 2006).
HIF1A is a key transcription factor that has been implicated in promoting tumor cell survival, proliferation and invasion following the onset of tumor hypoxia (Semenza, 2003). Increased expression of HIF1A in prostate cancer cells has been correlated with faster tumor growth and higher metastatic potential (Hao et al., 2004; Kimbro and Simons, 2006). HIF1A expression has also been observed to increase as prostate tumors progress from androgen-dependent to androgen-independent states (Zhong et al., 1998). VEGFA is a member of the VEGF growth factor family, which promotes endothelial cell proliferation, survival and migration by binding to 2 specific tyrosine kinase receptors (Ferrara et al., 2003). Up-regulation of VEGF has been associated with a significantly increased risk of prostate cancer in two small case-control studies (McCarron et al., 2002; Sfar et al., 2006). CDKN1A functions as a regulator of cell cycle G1 phase arrest in response to a variety of stress stimuli. CDKN1A has a great impact on the cell cycle of prostate cancer cells and may act in these cells through a p53-independent pathway (Wang et al., 2005).
By mapping the genes in the interaction network to KEGG pathways, we concluded that miR-20, whose target gene is CDKN1A, might play an important role in prostate cancer onset. The human miR-20, located on chromosome 13q31, undergoes loss of heterozygosity in several different cancers, including prostate cancer. It increased apoptosis in A549 lung cancer cells and promoted osteogenic differentiation of human mesenchymal stem cells by co-regulating BMP signaling (Zhang et al., 2011). MiR-20 regulates cell growth via suppression of E2F1 expression (O'Donnell et al., 2005) and appears to be over-expressed in colon, pancreas and prostate tumors while being down-regulated in breast tumors (Volinia et al., 2006).
In conclusion, we used the miRNA expression profile downloaded from the GEO database to identify miRNAs that are differentially expressed in prostate cancer tissue compared to noncancerous tissue. Our analysis identified several DE-miRNA target genes that might play crucial roles in prostate cancer onset, including HIF1A, VEGFA and CDKN1A. Further, our results suggest that miR-20 might play an important role in the regulation of prostate cancer onset. MiR-20, as a single biomarker or in combination, could be useful in the diagnosis of prostate cancer. We anticipate that our study could provide groundwork for further experiments.