Bioinformatics Analysis Reveals Significant Genes and Pathways to Target for Oral Squamous Cell Carcinoma

Purpose: The purpose of our study was to explore the molecular mechanisms underlying the development of oral squamous cell carcinoma (OSCC). Method: We downloaded the Affymetrix microarray dataset GSE31853 and identified differentially expressed genes (DEGs) between OSCC and normal tissues. Gene Ontology (GO) and protein-protein interaction (PPI) network analyses were then conducted to investigate the DEGs at the functional level. Results: A total of 372 DEGs with |logFC| > 1 and P value < 0.05 were obtained, including NNMT, BAX, MMP9 and VEGF. The enriched GO terms were mainly associated with the nucleoplasm, response to DNA damage stimulus and DNA repair. PPI network analysis indicated that GMNN and TSPO were significant hub proteins, and that steroid biosynthesis and synthesis and degradation of ketone bodies were significantly dysregulated pathways. Conclusion: The genes and pathways identified in our work may play critical roles in OSCC development. Our data provide a comprehensive perspective on the mechanisms underlying OSCC, and the significant genes (proteins) and pathways may become therapeutic targets in the future.


Introduction
Oral squamous cell carcinoma (OSCC) is a common solid tumor originating from abnormal squamous cells in the oral region. OSCC has the highest incidence (80%) among head and neck squamous cell carcinomas (Landis et al., 1999). It is characterized by invasion of epithelial tumor cells into underlying tissues and is more prevalent in the elderly population (Deyhimi et al., 2013; Neville et al., 2002). OSCC is often asymptomatic in its early stage and remains so until it has advanced to a late stage (Scott et al., 2005). The diagnosis of OSCC is usually delayed, and the five-year survival rate among patients with advanced oral cancers is only 20% (Spiro, 1985; Mashberg and Samit, 1989). OSCC remains a serious and unresolved global health issue.
The high incidence and fairly poor prognosis of OSCC have encouraged many studies of the mechanisms underlying its development. Several genes have been reported to be aberrantly methylated during OSCC progression, such as p16, p15, hMLH1 (human Mut-L homologue 1), MGMT (O6-methylguanine-DNA methyltransferase) and E-cadherin (Viswanathan et al., 2003). Patients carrying these methylated genes were indicated to be at high risk for OSCC. In addition, some cytokines and pathways have been reported to play key roles in OSCC development. The expression of interleukin-8 (IL-8) was shown to be elevated and was speculated to contribute to the invasion of oral squamous tumor cells by regulating
Although considerable contributions have been made, the molecular mechanism of OSCC remains poorly clarified. In this work, we used bioinformatics methods to explore the differentially expressed genes (DEGs) between oral squamous carcinoma tissues and normal tissues. Gene Ontology (GO) analysis was performed to investigate the critical genes in the progression of OSCC. Furthermore, protein-protein interaction (PPI) networks were constructed and pathway enrichment analysis was performed to identify the significant pathways. This work provides a systematic perspective on the mechanisms underlying OSCC development.

Materials and Methods

Affymetrix microarray data
The transcription profile GSE31853 was obtained from the GEO (Gene Expression Omnibus) database (http://www.ncbi.nlm.nih.gov/geo/). A total of 11 chips were deposited by Yap et al. (2009). In this paper, we selected only 9 samples for analysis, comprising 8 samples of oral squamous carcinoma tissue and 1 sample of normal tissue, all based on the platform GPL96 [HG-U133A] Affymetrix Human Genome U133A Array. The raw data and the probe annotation files were downloaded for further analysis.

Differentially expressed genes (DEGs) analysis
The raw data in CEL files were normalized using the RMA (Robust Multiarray Analysis) algorithm in the R affy package (Irizarry et al., 2003). The 9 datasets were then assigned to two groups: the oral squamous carcinoma group (8 samples) and the control group (1 sample). Differential expression between oral squamous carcinoma tissue and normal tissue was assessed with the t-test in the limma package (Diboun et al., 2006). From the resulting logFC and P values, we set |logFC| > 1 and P < 0.05 as the cutoff criteria.
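
As an illustration of the cutoff step, the thresholds can be applied to a table of per-probe statistics as in the following Python sketch. The analysis itself was carried out with the R limma package; the record type `ProbeResult`, the function `select_degs` and the example probe identifiers and values are hypothetical and serve only to show how |logFC| > 1 and P < 0.05 split probes into up-regulated and down-regulated sets.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """One row of a differential-expression table (hypothetical layout)."""
    probe_id: str
    log_fc: float   # log2 fold change, tumor versus normal
    p_value: float


def select_degs(results, lfc_cutoff=1.0, p_cutoff=0.05):
    """Return (up, down) probe IDs passing |logFC| > cutoff and P < cutoff."""
    up, down = [], []
    for r in results:
        if r.p_value < p_cutoff and abs(r.log_fc) > lfc_cutoff:
            # The sign of logFC decides the direction of regulation.
            (up if r.log_fc > 0 else down).append(r.probe_id)
    return up, down


# Made-up values for illustration only; these are not data from GSE31853.
example = [
    ProbeResult("probe_a", 2.3, 0.001),   # passes both cutoffs, up
    ProbeResult("probe_b", -1.7, 0.020),  # passes both cutoffs, down
    ProbeResult("probe_c", 0.4, 0.001),   # fold change too small
    ProbeResult("probe_d", 1.5, 0.200),   # P value too large
]
up_probes, down_probes = select_degs(example)
print(up_probes, down_probes)
```

Both inequalities are strict, matching the criteria as stated, so a probe with |logFC| exactly equal to 1 would be excluded.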

Gene Ontology (GO) analysis
Gene Ontology (GO) is a freely available tool for gene and gene sequence annotation, covering the domains of molecular and cellular biology (Harris et al., 2004). The DAVID bioinformatics resources provide a rich set of tools for the systematic and integrative analysis of large gene lists (Da Wei Huang and Lempicki, 2008). The DEGs obtained in this paper were divided into two groups: an up-regulated group and a down-regulated group. To investigate the DEGs at the functional level, the two groups underwent GO analysis independently using DAVID (Huang et al., 2007). Finally, we identified the overrepresented GO categories with P < 0.05.
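
Over-representation of a GO category in a gene list is commonly assessed with a hypergeometric (one-sided Fisher) test. The Python sketch below shows that calculation under stated assumptions. DAVID reports a modified Fisher exact statistic (the EASE score), so this sketch does not reproduce the values DAVID returns, and the function name and the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: genes in the background
    K: background genes annotated with the GO term
    n: genes in the submitted list (for example, the up-regulated DEGs)
    k: list genes annotated with the GO term
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("counts are inconsistent")
    total = comb(N, n)
    tail = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(k, min(n, K) + 1)
    )
    return tail / total


# Hypothetical counts, chosen small enough to verify by hand:
# 10 background genes, 4 annotated, a list of 3 genes with 2 annotated.
p = hypergeom_upper_tail(k=2, n=3, K=4, N=10)
print(p)
```

A category would then be called overrepresented when this tail probability falls below the chosen cutoff, which is P < 0.05 in this paper.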

Protein-protein interaction (PPI) network
Protein-protein interactions help to explain protein function at the molecular level and to explore cellular regulatory mechanisms. The interactions of proteins can be inferred from the interactions of the genes that encode them (von Mering et al., 2003a). The STRING database evaluates and pre-computes protein interactions on a global scale (von Mering et al., 2003b); it contains data from 89 fully sequenced genomes comprising 261,033 orthologous genes.
The DEGs in the up-regulated and down-regulated groups were mapped into PPI networks using Cytoscape software. The interactions between pairs of proteins were scored with the STRING 9.0 tool. Protein pairs with a Required Confidence Score > 4 were defined as closely interacting.
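
The filtering of scored protein pairs and the identification of highly connected (hub) proteins can be sketched in Python as follows. This is a minimal illustration, not the Cytoscape workflow used in this paper. The function `hub_degrees`, the protein names and the scores are hypothetical, and the cutoff is passed in as a parameter because the scale of the confidence score is not specified here.

```python
from collections import Counter


def hub_degrees(scored_pairs, min_score):
    """Count the degree of each protein over pairs scoring above min_score.

    scored_pairs: iterable of (protein_a, protein_b, score) tuples.
    Self-interactions and repeated pairs (in either order) are ignored.
    """
    degree = Counter()
    seen = set()
    for a, b, score in scored_pairs:
        if a == b or score <= min_score:
            continue
        edge = frozenset((a, b))  # undirected: (a, b) equals (b, a)
        if edge in seen:
            continue
        seen.add(edge)
        degree[a] += 1
        degree[b] += 1
    return degree


# Toy network with placeholder names and arbitrary scores.
pairs = [
    ("P1", "P2", 0.9),
    ("P1", "P3", 0.7),
    ("P2", "P1", 0.8),  # same pair as the first row, reversed
    ("P3", "P4", 0.2),  # below the cutoff
    ("P1", "P4", 0.6),
]
degrees = hub_degrees(pairs, min_score=0.5)
print(degrees.most_common(1))
```

Proteins with the highest degree in such a count are the candidates that a network view displays as hubs.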

Cluster analysis for PPI networks
We performed cluster analysis of the PPI networks using ClusterONE in Cytoscape and set P value < 1.0E-5 as the threshold for selecting enriched modules.
The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database of large-scale genome information that assigns predicted genes to their respective pathways (Altermann and Klaenhammer, 2005). KEGG pathway analysis was then conducted with the DAVID online tool to evaluate the modules at the functional level. Pathways with P value < 0.05 were defined as significant.

Results

Differentially expressed genes (DEGs)
To explore the DEGs between oral squamous carcinoma tissues and normal tissues, we applied the limma package to the GSE31853 dataset from GEO. We selected 372 DEGs with |logFC| > 1 and P value < 0.05, including 67 up-regulated genes such as NNMT (nicotinamide N-methyltransferase) and VEGFA (vascular endothelial growth factor), and 305 down-regulated genes such as MMP9 (matrix metalloproteinase-9), LAMC2 (laminin gamma 2 chain gene), BAX (Bcl-2-associated X protein) and CD44 (a cell-surface glycoprotein). The DEG expression heat map is shown in Figure 1.

GO enrichment analysis
To identify the functions of the DEGs, we performed GO analysis with P < 0.05 as the cutoff value. The results are shown in Tables 1 and 2. The most enriched GO terms of the down-regulated DEGs were related to biological process (BP) terms (sterol metabolism, cholesterol metabolism, steroid metabolism and isoprenoid metabolism), endoplasmic reticulum related cellular component (CC) terms (endoplasmic reticulum membrane, nuclear envelope-endoplasmic reticulum network and endoplasmic reticulum) and molecular function (MF) terms associated with enzyme inhibition (endopeptidase inhibitor activity, peptidase inhibitor activity, serine-type endopeptidase inhibitor activity and enzyme inhibitor activity).

Cluster analysis for PPI networks
To explore the functions of the network modules, we performed the cluster analysis. As shown in Figure 3, one module (P value < 1.0E-5) with 31 nodes and 271 edges was obtained from the up-regulated PPI network. In the down-regulated PPI network, there were two enriched modules: A, with 18 nodes and 97 edges, and B, with 28 nodes and 41 edges (Figure 4).
After KEGG pathway analysis of the enriched modules, we obtained significant pathways with P value < 0.05 (Figure 5). In the up-regulated module, the significant pathways were DNA replication and cell cycle. In the down-regulated modules, the significant pathways were steroid biosynthesis, terpenoid backbone biosynthesis, and synthesis and degradation of ketone bodies in module A, and the Notch signaling pathway in module B.
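
The node and edge counts reported for the three modules allow a simple comparison of how tightly each module is connected. The Python sketch below computes the standard density of an undirected graph, 2E / (N(N - 1)). Density is used here only as an illustrative summary; it is not a measure reported in this study, and ClusterONE applies its own scoring, which is not reproduced here. The function name `graph_density` is hypothetical.

```python
def graph_density(nodes, edges):
    """Density of a simple undirected graph: edges present / edges possible."""
    if nodes < 2:
        raise ValueError("density needs at least two nodes")
    possible = nodes * (nodes - 1) / 2
    if edges > possible:
        raise ValueError("more edges than a simple graph allows")
    return edges / possible


# (nodes, edges) as reported for the enriched modules.
modules = {
    "up-regulated": (31, 271),
    "down-regulated A": (18, 97),
    "down-regulated B": (28, 41),
}
densities = {name: graph_density(n, e) for name, (n, e) in modules.items()}
for name, value in densities.items():
    print(f"{name}: {value:.3f}")
```

On these counts the up-regulated module and module A each contain more than half of their possible edges, whereas module B is comparatively sparse.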

Discussion
Oral squamous cell carcinoma (OSCC), with its high-ranking mortality rate, is a health problem widely discussed all over the world (Lo et al., 2007). A comprehensive analysis of the mechanisms underlying OSCC development is of crucial importance for its management. In this paper, we provided a systematic perspective on these mechanisms using bioinformatics methods. We first analyzed the differentially expressed genes (DEGs) between oral squamous carcinoma tissues and normal tissues. We then performed GO analysis to evaluate the DEGs at the functional level and constructed protein-protein interaction (PPI) networks to identify significant pathways.
The results showed that a total of 372 DEGs were detected in our study. Among them, NNMT, BAX, MMP9 and VEGF have been reported to be tightly associated with OSCC. In our study, NNMT was significantly up-regulated in tumor tissues compared with normal tissue. NNMT encodes nicotinamide N-methyltransferase, which plays a key role in the biotransformation of drugs and other xenobiotics. A previous study suggested that NNMT regulates cell migration, which is necessary for cancer cell invasion and is linked to tumor stage (Wu et al., 2008). The expression of NNMT is markedly elevated during the progression and development of several types of tumor cells. It has been reported that NNMT expression is related to the differentiation of oral cancer cells and may be a biomarker for OSCC treatment (Emanuelli et al., 2010). A recent study revealed that abnormal NNMT expression is closely associated with the metabolism of oral squamous carcinoma cells (Pozzi et al., 2013). Decreased NNMT expression suppressed cancer cell proliferation and tumor development, and NNMT inhibition showed potential for the molecular treatment of oral cancer (Pozzi et al., 2013). In addition, MMP9 (matrix metalloproteinase-9) is reported to be significantly activated in malignant tissues of patients with oral cancer (Patel et al., 2007). The expression of BAX and VEGF has been demonstrated to serve as a prognostic biomarker for OSCC (Bose et al., 2012; Kämmerer et al., 2012).
After GO analysis, the results showed that the overrepresented GO terms were related to the nucleoplasm, response to DNA damage stimulus and DNA repair. The nucleoplasm plays a fundamental role in the proliferation of cancer cells (Tsai and McKay, 2002). Nucleophosmin (NPM1) has multiple functions in transcription, centrosome duplication, ribosome biogenesis and genomic stability. Overexpression of acetylated NPM1 has been found in high-grade OSCC (Shandilya et al., 2009). Acetylated NPM1 is primarily located in the nucleoplasm and regulates the survival and proliferation of oral cancer cells (Shandilya et al., 2009). Endogenous DNA damage increases the risk of cancer development (Loft and Poulsen, 1996). Cdc7-Dbf4 kinase (Dbf4-dependent kinase, DDK) is a critical factor regulating DNA replication and DNA damage (Costanzo et al., 2003). Cdc7 was found to affect the outcome of OSCC patients receiving chemotherapy. Up-regulated Cdc7 enhanced drug resistance by suppressing apoptosis and promoting DNA repair in oral squamous cancer cells (Cheng et al., 2013).

Figure 5. The x-coordinate represents gene count; the y-coordinate represents pathway term. Orange: pathways in the up-regulated PPI network; Blue: pathways in down-regulated network A; Green: pathways in down-regulated network B.

DOI: http://dx.doi.org/10.7314/APJCP.2014.15.5.2273
Furthermore, the results of the PPI network analysis indicated that the significant hub proteins (GMNN and TSPO) and the markedly dysregulated pathways (steroid biosynthesis, and synthesis and degradation of ketone bodies) play key roles in OSCC development. GMNN is a critical factor in the cell cycle, especially in the S phase to M phase transition (Ma et al., 2012). Recently, GMNN was revealed to be a common cancer gene that is amplified in tumor cells (Kim et al., 2012). The expression of GMNN was also found to be increased in oral epithelial dysplasia (OED) cases; likewise, the GMNN/Ki67 ratios were significantly increased compared with premalignant and malignant tumours (Torres-Rendon et al., 2009). GMNN has been considered a prognostic biomarker for OSCC progression. TSPO (translocator protein) is involved in cell proliferation, tumor invasion and metastasis (Batarseh and Papadopoulos, 2010). TSPO expression was significantly increased in oral cancer tumors compared with adjacent normal tissues, and patients with high levels of TSPO showed a low five-year survival rate (Nagler et al., 2010). Although the detailed mechanism by which TSPO affects OSCC is far from clear, TSPO appears to be of critical importance in OSCC development and progression.
The steroid biosynthesis pathway was found to be dysregulated between oral squamous cancer cells and normal cells. Reports concerning the role of steroids in OSCC are relatively rare. A recent study suggested that a steroid hormone receptor (estrogen receptor beta, ERβ) was abundantly expressed in oral squamous cancer cells from both female and male patients (Marocchio et al., 2013). Detecting ERβ expression may help to clarify the role of these proteins in OSCC progression. The other significant pathway in this paper was synthesis and degradation of ketone bodies. Production of ketone bodies has been found in the systemic metabolic response to early-stage oral cancer, and the accumulation of ketone bodies is more pronounced in later-stage cancer (Tiziani et al., 2009). The synthesis of ketone bodies has been considered a marker with therapeutic implications for cancers (Veech, 2004). Recently, a novel compound, BBSKE (1,2-[bis(1,2-benzisoselenazolone-3(2H)-ketone)]ethane), was reported to possess anti-cancer properties and to have a potential therapeutic effect on OSCC (Xing et al., 2008). BBSKE significantly inhibited cancer cell proliferation and induced cancer cell apoptosis. The synthesis and degradation of ketone bodies may be of crucial importance throughout the process of OSCC development.
In summary, the differentially expressed genes, hub proteins and pathways identified in our work showed close association with the progression of OSCC. The bioinformatics analysis provided a comprehensive perspective on the mechanisms underlying OSCC development. The significant genes and pathways may be targets for the treatment of OSCC. However, further investigations are still necessary to unravel the mechanisms of OSCC development.

Figure 1. The Heat Map of DEGs. The x-coordinate represents sample symbols (from left to right: 1 sample of normal tissue and 8 samples of oral squamous carcinoma tissues); the y-coordinate represents differentially expressed probes. Green: low expression; Red: high expression.

The DEGs were mainly enriched in three GO categories: Biological Process (BP), Molecular Function (MF) and Cellular Component (CC). The up-regulated DEGs were mainly enriched in CC and BP. The CC-related GO terms were nucleoplasm, nuclear lumen, membrane-enclosed lumen, organelle lumen, intracellular organelle lumen and nucleolus. The BP-associated GO terms included DNA metabolic process, response to DNA damage stimulus, cellular response to stress, DNA repair, double-strand break repair, DNA recombination, chromosome, chromosomal part, condensed chromosome and nuclear chromosome.

Figure 3. Cluster Analysis for Up-regulated Network (P value=0). Nodes represent proteins; edges represent interactions between two proteins. The higher the node shape, the greater the degree of connection.

Figure 4. Cluster Analysis for Down-regulated Network. A: P value=5.914E-7. B: P value=1.488E-6. Nodes represent proteins; edges represent interactions between two proteins. The higher the node shape, the greater the degree of connection.