Microarray and Next-Generation Sequencing to Analyse Gastric Cancer
Yuan Dang 1 *, Ying-Chao Wang 2, 3, Qiao-Jia Huang 1

Introduction
Gastric cancer is a tumor that can originate in any part of the stomach. Currently the fourth most common malignancy in the world, it is second only to lung cancer as a cause of cancer mortality (Kamangar et al., 2006). Globally, about 990,000 new cases and 740,000 deaths occur each year. China is a high-incidence country, with up to about 400,000 new cases and about 300,000 deaths each year. If gastric cancer is diagnosed at an early stage, cure can be achieved through complete surgical removal of the tumor tissue (Yamashita et al., 2011). However, gastric cancer causes few symptoms in its early development, so the diagnosis is usually made only after the cancer has reached an advanced stage. Moreover, even after surgical removal of a gastric tumor, many patients experience recurrence and die within a few months to years. The 5-year relative survival rate of gastric cancer has not improved during the past 35 years and remains at 20-30%. Patients whose disease is detected early and treated with limited surgery survive longer, so new and reliable candidate markers for the early detection of gastric cancer are needed.
Giving the right drug combination to the right patient is the most important step in extending survival, ideally beyond 5 years. Finding the right treatment strategy and the right drug for each patient, so as to inhibit tumor growth and prolong survival, is therefore the main hope of clinicians. In recent years, microarray and next-generation sequencing (NGS) have become very useful tools for the comprehensive study of gastric cancer. Advances in high-throughput technologies such as microarray and NGS for profiling gene expression and oncogenic signaling pathways have accelerated the discovery of treatment targets and predictive biomarkers. Informative biomarkers are also essential for developing targeted treatment. DNA-based markers include mutations, single nucleotide polymorphisms (SNPs), chromosomal aberrations, changes in DNA copy number and differential methylation. RNA-based biomarkers include over-expressed or under-expressed transcripts and microRNAs. Microarray and NGS can be applied to the same kinds of gastric cancer samples; they differ in what they measure. NGS is better suited to discovering novel transcripts or mutations, whereas microarray is better suited to measuring alterations in known genes or transcripts. Either technology, applied at the DNA or RNA level, can address particular questions in gastric cancer.
Microarray technology is a powerful tool for genomic analysis. It gives a global view of the genome in a single experiment. Data analysis is a vital part of a microarray experiment: each study comprises multiple microarrays, each yielding tens of thousands of data points. In an lncRNA microarray study (Song et al., 2013), a total of 135 lncRNAs showed more than twofold differential expression between tumor and non-tumorous tissues. This expression profile, together with in-depth analysis of two of the lncRNAs, suggests potential roles for lncRNAs in the occurrence and development of gastric cancer.
Overexpression of H19 in gastric cancer suggests that H19 may participate in gastric carcinogenesis. The decreased expression of uc001lsz in gastric cancer cell lines and tissues, its association with tumor node metastasis (TNM) stage, and its dysregulation in early cancer and precancerous lesions suggest that uc001lsz may be a potential marker for the diagnosis of early gastric cancer.
NGS provides an unbiased approach to profiling all the transcribed molecules in a sample and is not limited to previously known or annotated transcripts. NGS accommodates a large dynamic range of expression levels, thereby allowing much more accurate quantification of genes with very low or very high expression (Wang et al., 2009). There are two main kinds of NGS: genome sequencing and transcriptome sequencing. In addition to gene-level expression quantification, RNA-seq can detect other types of transcriptional signals, including alternative splicing, transcriptional starts and stops, gene fusions and expressed alleles (Kahvejian et al., 2008). Recently, some researchers have focused on particular parts of the genome (exons) or of the transcriptome (small RNAs or lncRNAs), because a smaller dataset can be analysed in greater depth for functionally relevant features. Some groups have performed next-generation transcriptome sequencing in gastric cancer and detected many fusion transcript candidates by bioinformatic filtering. After validation of some of these candidates, a novel DUS4L-BCAP29 fusion transcript was found in 2 cell lines and 10 gastric cancer tissues. Knockdown of the DUS4L-BCAP29 transcript with siRNA inhibited cell proliferation.
Finding new altered genes or transcripts is not the only aim of microarray and NGS studies. Another aim, probably more important, is to find the cause of each alteration and to determine how the altered gene affects gastric cancer clinically. Fusion genes and lncRNAs are two important classes of alteration, and they are receiving increasing attention in clinical analyses of gastric cancer. Gene fusion is involved in the development of various types of malignancy. Recent advances in NGS technology have facilitated the identification of gene fusions and have stimulated research in this field of cancer biology. Fusion genes, which result from chromosomal rearrangements or abnormal transcription, act as potent oncogenes in many human cancers. Although multiple gastric cancer genomes have been sequenced, the driving recurrent gene fusions have not been well characterized (Zang et al., 2011). Long noncoding RNAs (lncRNAs) have emerged recently as major players in governing fundamental biological processes, and many of them are altered in expression and likely to have a functional role in tumorigenesis (Sun et al., 2014b).

Integrative Analyses of Gastric Cancer by NGS
A growing number of researchers are studying DNA and RNA markers in the hope of using them as novel clinical markers. DNA-based markers include mutations, SNPs, chromosomal aberrations, changes in DNA copy number and differential methylation. RNA-based biomarkers include upregulated or downregulated transcripts and microRNAs. This review focuses on SNPs, gene fusions and RNAs with altered expression.
One group used whole-genome and transcriptome sequencing to detect mutations in gastric cancer cell lines and primary tissues. They identified 18,377 microsatellite (MS) mutations of five or more repeat nucleotides in coding sequences and untranslated regions (UTRs) of genes, and discovered 139 individual genes whose expression was down-regulated in association with UTR MS mutation. In addition, they found that 90.5% of MS mutations with deletions in gene regions occurred in UTRs. This analysis emphasizes the genetic diversity of microsatellite instability-high (MSI-H) gastric tumors and provides clues to the mechanistic basis of instability in microsatellite-unstable gastric cancers.
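As an illustration of what a screen for repeats of five or more nucleotides can look like, the following minimal Python sketch scans a sequence for mononucleotide runs. It assumes that the repeats are homopolymer runs, which is only one reading of the criterion above; the function name, the threshold constant and the toy sequence are hypothetical and are not taken from the cited study, whose pipeline also had to compare tumor and normal repeat lengths.

```python
import re

# Assumption: a "repeat of five or more nucleotides" is read here as a
# homopolymer run; real microsatellite callers also handle di-, tri- and
# longer motifs and compare tumor against normal.
MIN_REPEAT = 5
RUN_PATTERN = re.compile(r"([ACGT])\1{%d,}" % (MIN_REPEAT - 1))

def find_homopolymer_runs(sequence):
    """Return (start, base, length) for each run of >= MIN_REPEAT identical bases."""
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in RUN_PATTERN.finditer(sequence.upper())]

toy_utr = "ACGTAAAAAGGCTTTTTTTCA"  # made-up sequence, for illustration only
runs = find_homopolymer_runs(toy_utr)
print(runs)
```

A deletion within such a run in the tumor but not in the matched normal sample would then be a candidate MS mutation.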
Exome sequencing uses sequence capture technology to capture and enrich the exonic regions of the whole genome efficiently before high-throughput sequencing. Because it costs less than whole-genome sequencing, it has clear advantages for studying SNPs, indels and other variants in known genes. Exome sequencing can directly reveal genetic variants that alter protein function, and can thus identify disease genes and susceptibility genes associated with various diseases and support their diagnosis and treatment. Targeted deep exome sequencing of tumor tissue can detect mutations in protein-coding regions as well as rearrangement events, revealing mechanisms of cancer development. An exon (expressed region) is a part of a eukaryotic gene that is retained after splicing and can be expressed as protein during protein synthesis. Exon sequences, also known as expressed sequences, are present both in the primary transcript and in the mature RNA molecule. The human genome contains about 180,000 exons, which account for about 1% of the genome, or approximately 30 Mb.
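The exome figures quoted above can be checked with simple arithmetic. The short Python sketch below assumes a haploid human genome size of roughly 3.1 Gb, a value that is not given in the text; the exon count and exome size are the approximate figures quoted above.

```python
# Back-of-the-envelope check of the exome figures quoted in the text.
GENOME_BP = 3.1e9   # assumed approximate haploid genome size (not from the text)
EXOME_BP = 30e6     # approximately 30 Mb of exonic sequence, as quoted
N_EXONS = 180_000   # approximate number of exons, as quoted

exome_fraction = EXOME_BP / GENOME_BP
mean_exon_bp = EXOME_BP / N_EXONS

print(f"exome fraction of genome: {exome_fraction:.2%}")
print(f"mean exon length: {mean_exon_bp:.0f} bp")
```

Under these assumptions the exome is close to 1% of the genome and the average exon is well under 200 bp, which is consistent with the figures in the text and helps explain why capture-based exome sequencing costs less than whole-genome sequencing.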
Three studies have used whole-exome sequencing to analyse gastric cancer, with different sample types and numbers. In 2011, Wang and colleagues sequenced 22 gastric cancer samples, and they found and validated frequent inactivating mutations or protein deficiency of ARID1A in 83% of gastric cancers with microsatellite instability (MSI) and 73% of those with Epstein-Barr virus (EBV) infection. The mutation spectrum of ARID1A differs between molecular subtypes of gastric cancer, and mutation prevalence is negatively associated with mutations in TP53. Clinically, ARID1A alterations were associated with better prognosis in a stage-independent manner. In 2012, Zang and colleagues sequenced 15 gastric adenocarcinomas and their matched normal DNAs. They identified frequently mutated genes including TP53, PIK3CA, ARID1A, FAT4, MLL3 and MLL, and reported that FAT4 and ARID1A exert tumor-suppressor activity (Zang et al., 2012). In 2013, Kang and colleagues sequenced 4 samples from patients with early gastric cancer (EGC) and compared the results with those from advanced gastric cancer (AGC). A total of 268 genes were mutated in both EGCs and AGCs, whereas 516 genes were mutated only in EGCs and 3104 genes only in AGCs. The DYRK3, GPR116, MCM10, PCDH17, PCDHB1, RDH5 and UNC5C genes are recurrently mutated in EGCs and may be involved in early carcinogenesis (Kang et al., 2013).
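The EGC and AGC comparison is a set partition of mutated genes into shared, EGC-only and AGC-only groups. The Python sketch below shows that partition on placeholder gene names, which are hypothetical, and then derives group totals from the counts reported above (268, 516 and 3104). The Jaccard index is added only as one possible summary of overlap and is not reported in the cited study.

```python
def partition_mutated_genes(egc_genes, agc_genes):
    """Split two sets of mutated genes into shared, EGC-only and AGC-only."""
    shared = egc_genes & agc_genes
    return shared, egc_genes - shared, agc_genes - shared

# Placeholder gene names, for illustration only.
toy_egc = {"geneA", "geneB", "geneC"}
toy_agc = {"geneB", "geneC", "geneD", "geneE"}
toy_shared, toy_egc_only, toy_agc_only = partition_mutated_genes(toy_egc, toy_agc)

# Counts reported in the text (Kang et al., 2013).
shared, egc_only, agc_only = 268, 516, 3104
egc_total = shared + egc_only            # all genes mutated in EGCs
agc_total = shared + agc_only            # all genes mutated in AGCs
jaccard = shared / (shared + egc_only + agc_only)

print(egc_total, agc_total, round(jaccard, 3))
```

On the reported counts, fewer genes are mutated in EGCs than in AGCs, and only a small fraction of all mutated genes is shared between the two stages.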
By integrating expression data for protein-coding genes and miRNAs, researchers identified six key miRNAs in gastric cancer (miR-548d-3p, miR-20b, miR-135b, miR-140-3p, miR-93 and miR-19a) (Kim et al., 2012). These key miRNAs not only show significant expression variation across different sample groups, but also have detectable repressive effects on the expression of their target genes. Ribeiro-dos-Santos sequenced a small RNA library of normal stomach tissue and identified 15 highly expressed miRNAs (Ribeiro-dos-Santos et al., 2010). Li sequenced small RNAs from one pair of noncancerous and tumor gastric tissues. Interestingly, they found that the 5p-arm and 3p-arm miRNAs derived from the same pre-miRNA have different tissue preferences (noncancerous tissue vs. tumor tissue), implying a novel mechanism regulating the selection of mature miRNAs.
Kim and colleagues applied an RNA-sequencing approach to gastric tumor and noncancerous specimens from an Asian population, generating 680 million informative short reads to characterize quantitatively the entire transcriptome of gastric cancer, including mRNAs and microRNAs. They identified the central metabolic regulator AMPK-α as a potential functional target in Asian gastric cancer. They also demonstrated the translational relevance of AMPK-α as a potential therapeutic target for early-stage gastric cancer in Asian patients (Kim et al., 2012).
Yun and colleagues used paired-end transcriptome sequencing of 18 human gastric cancer cell lines and 18 pairs of primary human gastric cancer tissues and adjacent normal tissues to identify a novel gene fusion, PPP1R1B-STARD3. The presence of PPP1R1B-STARD3 correlated with elevated levels of PPP1R1B mRNA. The PPP1R1B-STARD3 fusion transcript was detected in 21.3% of primary human gastric cancers but not in adjacent matched normal gastric tissues. The increased proliferation associated with the fusion was mediated by activation of phosphatidylinositol-3-kinase (PI3K)/AKT signaling (Yun et al., 2013).
Park and colleagues performed RNA-seq experiments comparing gastric cancer with normal tissues to find differentially expressed transcripts in intergenic regions. By analysing RNA-seq and public microarray data, they identified 31 transcripts, including a known expressed sequence tag, BM742401, which was downregulated in cancer and whose downregulation was associated with poor survival in gastric cancer patients. Ectopic overexpression of BM742401 inhibited metastasis-related phenotypes and decreased the concentration of extracellular MMP9. These results suggest that BM742401 is a potential lncRNA marker and therapeutic target (Park et al., 2013).

Integrative Analyses of Gastric Cancer by Microarray
Recently, a growing number of studies have used genome-wide association study (GWAS) chips to search for SNPs associated with gastric cancer (Sakamoto et al., 2008; Roukos et al., 2009; Abnet et al., 2010; Wang et al., 2010; Luo et al., 2011; Shi et al., 2011; Jin et al., 2012). For example, GWAS analysis showed that the PSCA gene, which is possibly involved in regulating gastric epithelial-cell proliferation, influences susceptibility to diffuse-type gastric cancer (Sakamoto et al., 2008), and two new susceptibility loci for non-cardia gastric cancer were identified at 5p13.1 and 3q13.31 (Shi et al., 2011). A notable signal for gastric cancer and esophageal squamous cell carcinoma (ESCC) was rs2274223, a non-synonymous SNP located in PLCE1 (Abnet et al., 2010). PLCE1 and C20orf54 were also highlighted in a GWAS of ESCC, and they have important biological implications for both ESCC and gastric cancer. That study suggests that PLCE1 might regulate cell growth, differentiation, apoptosis and angiogenesis. C20orf54 is responsible for transporting riboflavin, and riboflavin deficiency has been documented as a risk factor for ESCC and gastric cardia adenocarcinoma (GCA) (Wang et al., 2010). These reports indicate that GWAS is one of the technologies available for finding new approaches to the diagnosis and therapy of gastric cancer (Roukos et al., 2009). Beyond identifying candidate SNPs as diagnostic biomarkers, some studies have examined individual candidates in detail, such as PLCE1 at 10q23 (Luo et al., 2011). Data from recent GWAS of gastric cancer and ESCC in Chinese living in the Taihang Mountains of north-central China suggest that 1q22 and 10q23 are susceptibility regions for gastric cancer. Yang selected the seven most significantly associated SNPs at 1q22 and 10q23 and conducted a population-based case-control study in a southern Chinese population. The results show that two SNPs at 1q22, rs4072037 and rs4460629, were significantly associated with a reduced risk of gastric cancer, best fitting the dominant genetic model (Yang et al., 2012).
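Under a dominant genetic model, carriers of at least one copy of the variant allele are compared with non-carriers. The Python sketch below shows how an unadjusted odds ratio and a Woolf-type 95% confidence interval can be computed from genotype counts. The counts are hypothetical and are not data from Yang et al. (2012), and published analyses typically also adjust for covariates such as age and sex.

```python
import math

def dominant_model_or(case_counts, control_counts):
    """Unadjusted odds ratio and 95% CI for carriers vs non-carriers.

    Each argument is (ref_homozygous, heterozygous, variant_homozygous).
    """
    a = case_counts[1] + case_counts[2]        # carrier cases
    b = case_counts[0]                         # non-carrier cases
    c = control_counts[1] + control_counts[2]  # carrier controls
    d = control_counts[0]                      # non-carrier controls
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # Woolf standard error
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical genotype counts, for illustration only.
cases = (600, 350, 50)
controls = (500, 420, 80)
odds_ratio, ci_lower, ci_upper = dominant_model_or(cases, controls)
print(f"OR = {odds_ratio:.2f} (95% CI {ci_lower:.2f}-{ci_upper:.2f})")
```

An odds ratio below 1 whose confidence interval excludes 1, as in this toy example, is the pattern described as a significantly reduced risk.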
Furuta evaluated gastric and colorectal cancer cell lines using the GeneChip Human Exon 1.0 ST Array (Affymetrix), focusing on protein kinase genes and genes belonging to the cadherin-catenin family. They found that isoform 1 of PRKACB was a novel cancer-related variant transcript in gastric cancers. In addition, the exon array analysis clearly identified exon 3 or exon 3-4 skipping in catenin beta 1, a short intron insertion with exon 9 skipping in CDH1, and a deletional transcript of CDH13 (Furuta et al., 2012). Cui used exon arrays to identify differentially expressed mRNAs in 80 paired gastric cancer and reference tissues. They found that 715 and 150 genes were significantly differentially expressed in all cancers and in early-stage cancers, respectively, versus reference tissues (Cui et al., 2011).
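In paired tumor and reference designs such as this one, a common first screen is a fold-change cutoff, for example the twofold threshold mentioned in the Introduction. The Python sketch below computes the mean paired log2 ratio per transcript and keeps those at or beyond twofold. The transcript names and values are made up, and this is not the method of any cited study, which would normally combine fold change with a statistical test and multiple-testing correction.

```python
import math
from statistics import mean

def twofold_changed(tumor, reference, log2_threshold=1.0):
    """Return {transcript: mean paired log2(tumor/reference)} for transcripts
    whose mean absolute log2 ratio is at least log2_threshold (1.0 = twofold)."""
    changed = {}
    for name, tumor_values in tumor.items():
        log_ratios = [math.log2(t / r)
                      for t, r in zip(tumor_values, reference[name])]
        mean_log_ratio = mean(log_ratios)
        if abs(mean_log_ratio) >= log2_threshold:
            changed[name] = mean_log_ratio
    return changed

# Made-up expression values for three patients (paired by position).
tumor = {"transcript_up": [8.0, 16.0, 4.0],
         "transcript_down": [1.0, 2.0, 4.0],
         "transcript_flat": [5.0, 5.0, 5.0]}
reference = {"transcript_up": [2.0, 4.0, 1.0],
             "transcript_down": [4.0, 8.0, 16.0],
             "transcript_flat": [5.0, 4.0, 6.0]}

result = twofold_changed(tumor, reference)
print(result)
```

Working on the log2 scale makes up- and downregulation symmetric, so a single threshold captures both directions.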
Some researchers have focused on patients who developed lymph node metastasis, hoping to find new or altered lncRNAs that could serve as candidate biomarkers for the clinical diagnosis of advanced-stage gastric cancer and as targets for therapy. Comparison of differentially expressed transcripts between the groups identified 26 pathways that corresponded to altered transcripts, which indicates that lncRNA dysregulation plays important roles in gastric cancer lymph node metastasis. An lncRNA chip can detect hundreds of altered lncRNAs, but in gastric cancer H19 is an important lncRNA that is consistently found among them. The relationship between the levels of H19 and uc001lsz and the clinicopathological factors of patients with gastric cancer has been confirmed. The reduced expression of uc001lsz in gastric cancer cell lines and tissues, its association with TNM stage, and its dysregulation in early cancer and precancerous lesions suggest that uc001lsz may be a potential marker for the diagnosis of early gastric cancer (Song et al., 2013). A receiver operating characteristic (ROC) curve was constructed for differentiating gastric cancer from benign gastric diseases (Song et al., 2013). H19 was also highly expressed in stomach and liver cancer cell lines but lowly expressed in lung cancer and prostate cancer cell lines. uc001lsz was lowly expressed in gastric, lung and liver cancer cell lines but highly expressed in prostate cancer (Song et al., 2013). Other researchers downloaded publicly available microarray data and reanalysed them. The results indicated that 88 lncRNAs were differentially expressed in gastric cancer, some of which have been reported to play a role in cancer, such as LINC00152, taurine upregulated 1, urothelial cancer associated 1, Pvt1 oncogene, small nuclear RNA host gene 1 and LINC00261. Reanalysis of public microarray data thus offers a new way to study gastric cancer.
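The area under a ROC curve equals the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The Python sketch below computes it in that pairwise way for a downregulated marker, where lower expression points to cancer and the score is therefore the negated expression level. The expression values are invented for illustration and are not data from Song et al. (2013).

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs in which the case scores
    higher; ties count as one half (Mann-Whitney formulation)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Invented relative expression levels of a downregulated marker.
cancer_expression = [0.2, 0.4, 0.5, 0.9]
benign_expression = [0.6, 0.8, 1.0, 1.2]

# Lower expression suggests cancer, so negate expression to obtain a score.
auc = roc_auc([-x for x in cancer_expression],
              [-x for x in benign_expression])
print(auc)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of cancer from benign disease.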
A novel miRNA expression microarray has also been developed and applied to the study of gastric cancer. Volinia (Volinia et al., 2006) and Ueda (Ueda et al., 2010) evaluated aberrant miRNA expression signatures in gastric tumor samples from Italian and Japanese patients, respectively.

Genes Detected by Microarray and NGS Associated with Gastric Cancer
Microarray and NGS studies always detect a large number of altered genes. Do these altered genes take part in the malignant transformation of normal tissue, and how do they act? Altered genes can tell us how cancer differs from normal tissue and can reveal the mechanisms underlying TNM staging and lymph node metastasis of gastric tumors, cell proliferation, cell apoptosis and so on. Here we review two kinds of altered genes: fusion genes and lncRNAs. A gene fusion is the joining of two genes of different origin into a new gene as a result of translocation, interstitial deletion or chromosomal inversion. The fusion gene may acquire new functions, and it may affect the expression level, and possibly the function, of the two original genes. Since gene fusion can have dramatic functional consequences, a number of fused genes have been identified as therapeutic targets.
Often, fusion genes are oncogenes. Examples include BCR-ABL (Nowell et al., 1962), TEL-AML1, and TMPRSS2-ERG, which arises from an interstitial deletion on chromosome 21 and often occurs in prostate cancer (Tomlins et al., 2005). Alternatively, a proto-oncogene is fused to a strong promoter, and its oncogenic function is activated through upregulation driven by the strong promoter of the upstream fusion partner (Vega et al., 2003). Oncogenic fusion transcripts may also be caused by trans-splicing or read-through events (Nacu et al., 2011).
The PPP1R1B-STARD3 fusion gene was first identified in a breast cancer cell line. Its fusion transcript was also detected by NGS in gastric cancer, where it was present in gastric cancer tissues but not in normal tissues. Overexpression of PPP1R1B-STARD3 significantly enhanced colony formation and tumor growth (Yun et al., 2013). More fusion genes have since been detected by sequencing in gastric cancer, including CDK12-ERBB2 (Zang et al., 2011), NEUROD2-ERBB2 (Zang et al., 2011), AGTRAP-BRAF (Palanisamy et al., 2010) and DUS4L-BCAP29. In functional studies, overexpression of fusion transcripts was associated with cell proliferation.
LncRNAs play essential regulatory roles and are dysregulated in a variety of tumors. However, which lncRNAs are altered in gastric cancer, and through which mechanisms they act, remain largely unknown. Recently, a growing number of lncRNAs have been found to be upregulated [e.g. HULC (Zhao et al., 2014), HOTAIR (Hajjari et al., 2013)] or downregulated [e.g. MEG3 (Sun et al., 2014b)] in gastric cancer tissues compared with adjacent normal tissues. Their expression levels correlated substantially with TNM stage, depth of invasion, lymph node metastasis, distant metastasis, tumor size and poor survival. Moreover, patients with low expression of some lncRNAs had a relatively poor prognosis. GHET1 is upregulated in gastric carcinoma. GHET1 overexpression promotes the proliferation of gastric carcinoma cells in vitro and in vivo, whereas knockdown of GHET1 inhibits cell proliferation (Yang et al., 2014).
The lncRNA H19 is the one most often detected in microarray studies of many cancers [ovarian cancer (Liu et al., 2013), kidney cancer (Zhou et al., 2014), etc.]. It is also upregulated in gastric cancer and plays important roles in gastric tumorigenesis. Recently, a new H19/miR-675/RUNX1 pathway that regulates gastric cancer development was discovered (Zhuang et al., 2014). HULC is overexpressed in gastric cancer cell lines and gastric cancer tissues compared with normal controls. Overexpression of HULC promoted proliferation and invasion and inhibited cell apoptosis, while knockdown of HULC in cells had the opposite effect (Zhao et al., 2014). MEG3 is downregulated in gastric cancer tissues compared with adjacent normal tissues. Knockdown of MEG3 expression promoted cell proliferation, whereas overexpression of MEG3 inhibited cell proliferation, promoted cell apoptosis and modulated p53 expression in gastric cancer cell lines (Sun et al., 2014b). HOTAIR is a long non-coding RNA associated with the progression of several cancer types [cervical cancer (Huang et al., 2014), small-cell lung cancer (Ono et al., 2014), breast cancer (Bhan et al., 2014), endometrial carcinoma (He et al., 2014), etc.]. HOTAIR is upregulated in gastric adenocarcinoma samples compared with normal adjacent gastric epithelium, and its upregulation was associated with TNM staging and lymph node metastasis of gastric tumors (Hajjari et al., 2013).
Downregulation of the lncRNA GAS5 (growth arrest-specific transcript 5) has been studied in several cancers, such as renal cell carcinoma (RCC) (Qiao et al., 2013). Qiao's study showed that the expression level of GAS5 in RCC specimens was clearly lower than that in adjacent normal tissues, and that overexpression of GAS5 inhibited cell proliferation, induced cell apoptosis and arrested the cell cycle. In gastric cancer, Sun found that GAS5 expression was markedly downregulated in gastric cancer tissues and was associated with larger tumor size and advanced pathologic stage. Moreover, ectopic expression of GAS5 decreased gastric cancer cell proliferation and induced apoptosis in vitro and in vivo, while downregulation of endogenous GAS5 promoted cell proliferation. Finally, they found that GAS5 could influence the proliferation of gastric cancer cells, partly by regulating E2F1 and P21 expression (Sun et al., 2014a).

Challenges and Future Directions
The microarray and NGS studies of gastric cancer described above have provided tremendous insight into the molecular mechanisms of gastric cancer and have identified novel potential therapeutic targets, but they still have limitations.
Firstly, most groups have used small sample sizes, because accumulating clinical samples takes a long time and a single-sample assay with microarray or NGS is expensive (around $1000-3000), let alone a multi-sample assay. Thus, only research groups with sufficient funding can carry out an independent study with microarray or NGS technology. Bioinformatic analysis requires at least 15 paired samples, yet some studies have used fewer because of cost and resource constraints. Gastric cancer is a heterogeneous disease in which genetic and molecular profiles differ greatly between patients, so there is limited statistical power to detect prevalent therapeutic targets accurately on the basis of 30 samples. Additional samples should be analysed to obtain reliable data. Although each microarray or sequencing study identifies a large number of differences or mutations, high-frequency mutations are very difficult to identify conclusively, because such genes are a minority and finding them depends on very large sample numbers and adequate funding.
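The effect of sample size on detecting recurrent mutations can be made concrete with a binomial calculation. The Python sketch below computes the probability that a mutation is observed in at least two patients, and so appears recurrent, in cohorts of 15 and 30. The 5% mutation frequency is a hypothetical value chosen for illustration, and the model assumes independent patients and perfect detection in every carrier.

```python
from math import comb

def prob_seen_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance that a mutation with
    population frequency p is observed in at least k of n patients."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))

MUTATION_FREQUENCY = 0.05  # hypothetical frequency, for illustration only

p_recurrent_15 = prob_seen_at_least(2, 15, MUTATION_FREQUENCY)
p_recurrent_30 = prob_seen_at_least(2, 30, MUTATION_FREQUENCY)
print(f"n=15: {p_recurrent_15:.2f}, n=30: {p_recurrent_30:.2f}")
```

Even with 30 patients, a mutation at this assumed frequency would be seen twice or more in fewer than half of such studies, which illustrates why small cohorts have limited power in a heterogeneous disease.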
Secondly, microarray or NGS analysis of DNA and RNA is usually used as the only approach to detect gene alterations comprehensively. Gastric cancer is a complex disease involving interactions between multiple layers of aberration, and the function of a gene is exerted through proteins or pathways. The formation of gastric cancer is a very complex process involving many factors, including Helicobacter pylori infection, lifestyle, genetic factors, and even lymph node metastasis and metastasis to other organs. It is impossible to explain the occurrence and development of gastric cancer, or to establish a reliable diagnostic target, with only one or two genes or pathways. Studying gastric cancer requires a variety of methods and a combination of skills and knowledge.
Thirdly, functional studies of the genes discovered by microarray and NGS are insufficient. Because of limited resources and too few researchers, an individual laboratory usually chooses only one or two genes for functional investigation. Further analysis of each gene takes a long time, and the choice of gene depends on the research perspective of the laboratory.
Fourthly, a shared platform to which results can be submitted still needs to be built; its absence limits thorough analysis of studies with small sample sizes.

Conclusions
Gastric cancer is one of the most common cancers and one of the most frequent causes of cancer-related death, and its prognosis and 5-year survival remain very poor. As described above, microarray and NGS technologies are useful for studying the formation, development and progression of gastric cancer in a high-throughput way. Besides identifying oncogenes, tumor suppressors and biomarkers, both technologies help us to understand further the mechanisms of tumorigenesis, metastasis, infiltration and so on. They can also be applied to early diagnosis through gastric cancer biomarkers and to personalized treatment through the determination of drug-resistance genes. We believe that, as genomic and transcriptomic technologies develop, they will bring breakthroughs in tumor diagnosis and treatment in the near future.