Identifying Differentially Expressed Genes and Small Molecule Drugs for Prostate Cancer by a Bioinformatics Strategy

PURPOSE
Prostate cancer, caused by the abnormal, disorderly growth of prostatic acinar cells, is the most prevalent cancer in men in western countries. We aimed to screen for differentially expressed genes (DEGs) and to explore small molecule drugs for prostate cancer.


MATERIALS AND METHODS
The GSE38241 gene expression profile of prostate cancer, which includes 21 normal samples and 18 prostate cancer samples, was downloaded from the Gene Expression Omnibus database. The DEGs were identified with the Limma package in R, and gene ontology and pathway enrichment analyses were performed. In addition, potential regulatory microRNAs and the target sites of transcription factors were screened based on the Molecular Signatures Database. Finally, the DEGs were mapped to the Connectivity Map database to identify potential small molecule drugs.


RESULTS
A total of 6,588 genes were filtered as DEGs between normal and prostate cancer samples. Genes such as ITGB6, ITGB3, ITGAV and ITGA2 may induce prostate cancer through actions on the focal adhesion pathway. Furthermore, the transcription factor SP1 and its target genes ARHGAP26 and USF1 were identified. The most significant microRNA, MIR-506, was screened and found to regulate genes including ITGB1 and ITGB3. Additionally, the small molecules MS-275, 8-azaguanine and pyrvinium were found to have the potential to repair the disordered metabolic pathways and, furthermore, to remedy prostate cancer.


CONCLUSIONS
The results of our analysis bear on the mechanism of prostate cancer and allow screening for small molecule drugs for this cancer. The findings have the potential for future clinical use in the treatment of prostate cancer.


Introduction
Prostate cancer is a malignancy that results from pathological changes in prostate tissue (DeMarzo et al., 2003). Its incidence shows significant geographic and racial differences.
In this paper, microarrays were utilized to identify differentially expressed genes (DEGs) between cancer and normal prostate cells. The significance of differential expression was tested with Limma and adjusted for multiple testing with the Benjamini and Hochberg (BH) procedure. The functions of the DEGs were investigated by Gene Ontology (GO) and pathway enrichment analysis. Additionally, several transcription factor target sites and some regulatory microRNAs were screened, helping to elucidate the mechanism of prostate cancer at the molecular level. Finally, candidate small molecules were identified for their potential use in the treatment of prostate cancer.

Derivation of genetic data
The gene expression profile GSE38241 (Aryee et al., 2013) was downloaded from GEO (Gene Expression Omnibus, http://www.ncbi.nlm.nih.gov/geo/), a public functional genomics data repository. A total of 39 specimens, including 21 normal samples and 18 prostate cancer specimens, were available, based on the GPL4133 platform.

DEGs analysis
We analyzed the derived genetic data with the GEOquery and Limma packages in R (v.2.13.0) (Team, 2011). GEOquery provides quick access to expression profiling data in the GEO database, while Limma is one of the most popular methods for the statistical analysis of DEGs (Diboun et al., 2006; Smyth, 2004). The preprocessed microarray data were obtained with GEOquery and then log2-transformed. We applied the Limma package, which fits a linear model to each gene, to compare the normal samples with the prostate cancer samples. Only genes with a p-value < 0.05 were retained as DEGs.
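The two-group comparison described above can be illustrated with a minimal Python sketch. It is not the authors' R code: it uses a plain Welch t-statistic rather than Limma's empirical-Bayes moderated statistic, and the function names and intensity values are hypothetical.

```python
import math
import statistics


def log2_transform(values):
    """Log2-transform strictly positive intensities."""
    return [math.log2(v) for v in values]


def welch_t(group_a, group_b):
    """Return (log2 fold change, Welch t-statistic) for group_b versus group_a.

    Inputs are log2-scale expression values for a single probe.
    """
    mean_a = statistics.fmean(group_a)
    mean_b = statistics.fmean(group_b)
    var_a = statistics.variance(group_a)  # sample variance
    var_b = statistics.variance(group_b)
    std_err = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    diff = mean_b - mean_a
    return diff, diff / std_err


# Hypothetical raw intensities for one probe (not values from GSE38241).
normal = log2_transform([100, 120, 110, 130])
tumor = log2_transform([400, 480, 440, 520])

log_fc, t_stat = welch_t(normal, tumor)
print(f"log2 fold change = {log_fc:.2f}, t = {t_stat:.2f}")
```

In a real analysis the statistic would be converted to a p-value and computed for every probe; Limma additionally shrinks the per-gene variances toward a common value, which this sketch omits.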

GO enrichment analysis
GO analysis has become a common approach for the functional annotation of large-scale genomic data (Hulsegge et al., 2009). GOEAST (Gene Ontology Enrichment Analysis Software Toolkit) is an easy-to-use web-based toolkit that identifies statistically overrepresented GO terms within given gene sets (Zheng and Wang, 2008).
GOEAST was utilized for GO enrichment analysis to identify the cellular compartments in which the DEGs are located and the molecular functions they affect, based on the hypergeometric distribution, with a false discovery rate (FDR) of less than 0.05.
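The hypergeometric test behind this kind of over-representation analysis can be sketched as follows. This is a generic illustration and not GOEAST's implementation; the function name and the toy counts are assumptions.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population: number of genes in the background
    annotated:  background genes carrying the GO term
    selected:   number of DEGs drawn from the background
    overlap:    DEGs that carry the GO term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy example: 20 background genes, 5 annotated, 4 DEGs, 2 of them annotated.
p_value = hypergeom_enrichment_p(population=20, annotated=5, selected=4, overlap=2)
print(f"enrichment p-value = {p_value:.4f}")
```

A small p-value indicates that the term occurs among the DEGs more often than expected by chance; the p-values for all tested terms are then corrected for multiple testing before the FDR threshold is applied.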

Biological pathway enrichment analysis
Biological pathways were studied to explore the changes in prostate cancer cells at the molecular level. All metabolic and non-metabolic pathways were downloaded from the open WikiPathways database (Pico et al., 2008; Kelder et al., 2012), and WikiPathways cluster analysis (Zhang et al., 2005; Duncan et al., 2010) of the DEGs was conducted on the Gene Set Analysis Toolkit V2 platform. A gene count larger than 2 and a p-value less than 0.00001 were chosen as the cut-off criteria.

Exploring potential target sites of transcription factors and potential regulatory microRNAs
Gene set enrichment analysis (GSEA) (Subramanian et al., 2005) was performed on the well-annotated gene sets in MSigDB (Molecular Signatures Database, http://www.broadinstitute.org/gsea/msigdb/index.jsp). The significance of the GSEA results was then assessed with the hypergeometric distribution, and the resulting p-values were adjusted for multiple testing with the BH procedure. Finally, target sites with FDR < 0.05 were selected as potential transcription factor target sites. Similarly, potential regulatory microRNAs were identified with FDR < 0.05.
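The BH adjustment and the FDR < 0.05 filter can be sketched in a few lines of Python. This is a generic illustration of the Benjamini and Hochberg step-up procedure with hypothetical p-values, not the code used in the study.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (FDR estimates) in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four candidate gene sets.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(fdr) if q < 0.05]
print(fdr, significant)
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw p-values increase.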

Identification of candidate small molecules
The CMap (Connectivity Map) database contains 7056 gene expression profiles covering 6100 small-molecule treatment-control pairs (Lamb et al., 2006). The DEGs were divided into up- and down-regulated groups, and the top 500 most significant genes in each group were selected. These genes were then analyzed by GSEA and compared with the differential expression profiles in the CMap database. Finally, a correlation score ranging from -1 to +1 was calculated for each perturbagen (Braconi et al., 2010).
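A score of this kind can be sketched as below. The sketch follows the Kolmogorov-Smirnov-style scheme generally described for CMap, but it is a simplified illustration under stated assumptions, not the CMap implementation: the function names, the 10-gene ranking, and the tag positions are all hypothetical. Rank 1 is taken to be the gene most up-regulated by the compound.

```python
def ks_score(tag_ranks, n_genes):
    """KS-like enrichment of a tag set within a ranked list of n_genes genes.

    tag_ranks holds the 1-based positions of the query genes in the
    compound's ranked expression profile.
    """
    ranks = sorted(tag_ranks)
    t = len(ranks)
    a = max((j + 1) / t - r / n_genes for j, r in enumerate(ranks))
    b = max(r / n_genes - j / t for j, r in enumerate(ranks))
    return a if a > b else -b


def raw_connectivity(up_ranks, down_ranks, n_genes):
    """Combine up- and down-tag enrichment; zero if both have the same sign."""
    ks_up = ks_score(up_ranks, n_genes)
    ks_down = ks_score(down_ranks, n_genes)
    if (ks_up > 0) == (ks_down > 0):
        return 0.0
    return ks_up - ks_down


def normalize(raw_scores):
    """Scale positive scores by the maximum and negative ones by the minimum."""
    highest, lowest = max(raw_scores), min(raw_scores)
    scaled = []
    for s in raw_scores:
        if s > 0:
            scaled.append(s / highest)
        elif s < 0:
            scaled.append(-(s / lowest))
        else:
            scaled.append(0.0)
    return scaled


# Hypothetical positions of disease up/down genes in three compound profiles.
instances = [
    ([1, 2], [9, 10]),   # compound mimics the disease signature
    ([9, 10], [1, 2]),   # compound reverses the disease signature
    ([1, 10], [2, 9]),   # no consistent relationship
]
raw = [raw_connectivity(up, down, n_genes=10) for up, down in instances]
scores = normalize(raw)
print(scores)
```

Under this convention, a strongly negative score marks a compound whose expression profile opposes the disease signature, which is why negatively correlated molecules are treated as therapeutic candidates.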

Screening DEGs in prostate cancer cells
Using the GEOquery and Limma packages in R with a p-value threshold of 0.05, a total of 10611 probes were identified as differentially expressed in prostate cancer samples compared with normal controls, corresponding to 6588 DEGs.

Gene function annotation
To determine the functions of the DEGs in prostate cancer, the DEGs were mapped to the GO database. The cellular compartments in which the DEGs were most frequently located included centrioles, spindle fibers and microtubules (Figure 1). The molecular functions of the DEGs included double-stranded DNA binding, transcription cofactor activity and signal transduction activity (Figure 2).

Pathway enrichment analysis
To gain further insight into the changes in biological pathways in prostate cancer cells, we used WikiPathways cluster analysis to identify the significant pathways related to the DEGs. A total of 49 pathways were identified, and the 10 most significant pathways are listed in Table 1. The most significantly enriched pathway was focal adhesion (p-value = 7.08E-21). The genes enriched in focal adhesion included ITGB6, ITGB3, ITGAV and ITGA2.

Exploring potential target sites
As important regulatory elements, transcription factors regulate gene expression. Taking the upstream sequences of the DEGs as the object of analysis, we explored the potential target sites of transcription factors. The 20 most significant target sites are listed in Table 2. The most significant transcription factor was SP1, which may regulate ARHGAP26 and USF1 by binding the target sequence GGGCGGR.

Exploring the potential regulatory microRNA
MicroRNAs can regulate gene expression by altering the stability of mRNA. The potential regulatory microRNAs were screened based on the sequences of the DEGs. The 20 most significant microRNAs are listed in Table 3. The most significant microRNA was MIR-506, which may regulate ITGB1 and ITGA3 by binding the target sequence GTGCCTT.

Related small molecule drugs screening
To screen for small molecule drugs, we performed a computational bioinformatics analysis of the DEGs using CMap. The 20 most significantly related small molecules are listed in Table 4, including 9 negatively related and 11 positively related molecules. Among these, MS-275, 8-azaguanine and pyrvinium, which had the highest negative correlations, showed potential for the treatment of prostate cancer.

Discussion
The clinical heterogeneity and high prevalence of prostate cancer raise challenges in the management of newly diagnosed patients (Taylor et al., 2010). Therefore, there is an urgent need to explore the mechanism of prostate cancer and to develop an effective prevention strategy for it. In the present study, we used a gene expression profile downloaded from GEO to explore the mechanism of prostate cancer development and screened for small molecule drugs with the potential to treat prostate cancer. A total of 6588 DEGs were identified between prostate cancer samples and normal controls.
Current strategies typically study entire pathways, whether by singular enrichment analysis or by gene set enrichment analysis (Sherman and Lempicki, 2009). In this study, we identified 49 pathways and found focal adhesion to be the most significant pathway in the development of prostate cancer. Focal adhesion is a prominent determinant of cancer initiation, progression and metastasis (Luo and Guan, 2010). Genes in the integrin family, such as ITGB6, ITGB3, ITGA2 and ITGAV, are closely related to focal adhesion. Within this family, ITGB6 (integrin beta 6), which mediates interactions between adjacent cells and between cells and the extracellular matrix, is down-regulated during tumor progression (Pontes-Júnior et al., 2009). Meanwhile, ITGB3 and ITGA2 have been found to play important roles in breast cancer and colorectal cancer (Langsenlehner et al., 2006). In addition, ITGAV is related to many cancer types, including prostate and breast cancer, in which it is important within the bone environment for the growth and pathogenesis of bone metastases (Daemen et al., 2008). Therefore, our study suggests that these integrin genes in the focal adhesion pathway play crucial roles in cancer initiation, progression and metastasis.
Many reports have shown that numerous transcription factors are overactive in human cancer cells, which makes them targets for studies of cancer mechanisms. In this paper, the most significant transcription factor is SP1 (specificity protein 1), which binds to GC/GT-rich promoter elements such as GGGCGGR. A number of genes that contain this sequence, such as ARHGAP26 and USF1, can be recognized by SP1. ARHGAP26 has been shown to be expressed at high levels in a subset of ovarian cancer tissues, while it is absent or present only at low levels in normal tissues (Jarius et al., 2013). USF1 (upstream stimulatory factor 1) is a transcriptional suppressor of human telomerase reverse transcriptase in oral cancer cells (Chang et al., 2005).
MicroRNAs are small regulatory RNAs that regulate the translation and degradation of target mRNAs and are extensively involved in human cancers (Fabbri et al., 2007). The most significant microRNA in our study is MIR-506, and its target sequence is GTGCCTT. Genes that contain this sequence, such as ITGB1 and ITGA3, can be regulated by MIR-506. ITGB1, a member of the focal adhesion pathway, has been found to play crucial roles in biological pathways, and our analysis indicates that it can also be regulated by MIR-506. Therefore, ITGB1 may play an essential role in the pathogenesis of prostate cancer. To our knowledge, this finding has not been reported previously, so our study offers a new insight into the mechanism of prostate cancer.
Although several useful medicines exist for prostate cancer treatment, such as dutasteride (Andriole et al., 2010) and microRNA miR-34a inhibitor (Liu et al., 2011), they are far from sufficient. Recent studies indicate that small molecules can enhance the efficacy of cancer drugs and treat cancer (Sugahara et al., 2010). In our study, three small molecule drugs (MS-275, 8-azaguanine and pyrvinium) were identified that may be effective for prostate cancer treatment. MS-275, a histone deacetylase inhibitor, has been demonstrated to display antiproliferative activity against several human cancer cell lines, including breast, colorectal, lung, ovarian and pancreatic cancer cells (Altmann et al., 2010). Meanwhile, the incorporation of 8-azaguanine into the mRNA of tumor cells has been found to inhibit protein synthesis, and 8-azaguanine has been implicated as a lead molecule in cancer therapy (Gogia and Puranik, 2013). Additionally, pyrvinium, a classical anthelminthic, potently inhibited proliferation and STAT3 Tyr705 phosphorylation in human myeloma/erythroleukemia cells (Harada et al., 2012). Our findings are therefore consistent with previous studies.
In conclusion, our study sheds new light on the mechanism and treatment of prostate cancer. The DEGs of prostate cancer were analyzed with a computational bioinformatics approach. The altered biological pathways in cancer cells were identified, and the potential transcription factor target sites and regulatory microRNAs were enriched. Furthermore, three small molecule drugs (MS-275, 8-azaguanine and pyrvinium) with the potential to treat prostate cancer were identified. Our research may provide a new strategy for the medical therapy of prostate cancer. Given the increasing public availability of genomic data, we expect that this approach will be an attractive strategy for many other studies.

Figure 1. The Enriched Gene Ontology (GO) Terms of the Cellular Compartment of the Differentially Expressed Genes (DEGs). The colored entries are the significantly enriched terms (FDR < 0.05), and deeper colors represent more significant terms.

Figure 2. The Enriched Gene Ontology (GO) Terms of the Molecular Function of the Differentially Expressed Genes (DEGs). The colored entries are the significantly enriched terms (FDR < 0.05), and deeper colors represent more significant terms.