Linear and Conformational B Cell Epitope Prediction of the HER 2 ECD-Subdomain III by in silico Methods

Human epidermal growth factor receptor 2 (HER2), also known as ErbB2, c-erbB2 or HER2/neu, is a 185 kDa protein (p185) involved in signal transduction and in the regulation of cell growth. Since the discovery of its role in tumorigenesis, HER2 has received great attention in cancer research and vaccine design over the past two decades (Tai et al., 2010). HER2 is a transmembrane glycoprotein that contains three distinct regions: an N-terminal extracellular domain (ECD), a single transmembrane α-helix (TM), and an intracellular tyrosine kinase domain. The largest part of HER2, the N-terminal ECD, is composed of approximately 600 residues (90–110 kDa) and can be divided into four subdomains (I–IV). Subdomains I (residues 23-217) and III (343-510) form a binding site for the receptor’s potential ligands (Lax, 1988), while the cysteine-rich subdomains II (218-342) and IV (511-582) are involved in homodimerization and heterodimerization (Fiore, 1987; Pietras, 1995). HER2 is highly expressed in a significant proportion of breast, ovarian, gastric, colon, prostate and lung cancers. HER2 overexpression in these cancers often correlates with


Introduction
Human epidermal growth factor receptor 2 (HER2), also known as ErbB2, c-erbB2 or HER2/neu, is a 185 kDa protein (p185) involved in signal transduction and in the regulation of cell growth. Since the discovery of its role in tumorigenesis, HER2 has received great attention in cancer research and vaccine design over the past two decades (Tai et al., 2010). HER2 is a transmembrane glycoprotein that contains three distinct regions: an N-terminal extracellular domain (ECD), a single transmembrane α-helix (TM), and an intracellular tyrosine kinase domain. The largest part of HER2, the N-terminal ECD, is composed of approximately 600 residues (90-110 kDa) and can be divided into four subdomains (I-IV). Subdomains I (residues 23-217) and III (343-510) form a binding site for the receptor's potential ligands (Lax, al., 2003; Park et al., 2008). Hence, these two MAbs with clinical effects bind to the HER2 receptor in two different regions, corresponding to domains II and IV, which indicates the potential inhibitory effects of other MAbs that target HER2 at different sites (Rockberg et al., 2009). However, further clinical studies showed that patients who had responded well to Herceptin treatment began to develop drug resistance (Calabrich et al., 2008). These results urge researchers to conduct more investigations and to develop novel humanized recombinant MAbs against HER2. Subsequently, the anti-HER2 MAbs muA21 and CH401 were developed (Wang et al., 2000; Ishida et al., 1994). MAb muA21 was found to specifically inhibit the growth of human breast cancer SK-BR-3 cells, and MAb CH401 showed cytotoxicity against HER2-expressing tumor cells. These MAbs target subdomain I of HER2 ECD.
HER2 is also considered a target for vaccine development for the primary prevention of cancers (Wiwanitkit, 2007). Although passive immunotherapy with trastuzumab is approved for the treatment of breast cancer, it raises a number of concerns. For instance, the treatment is expensive, has a limited duration of action, and requires repeated administration of the MAb. Active immunotherapy with synthetic peptide B-cell epitopes increases the possibility of generating an adaptive immune response that elicits protein-reactive, high-affinity antibodies (Garrett et al., 2007). Therefore, there is a need to explore new epitopes of HER2 ECD and to use them as peptide vaccines. Identification and selection of suitable epitopes is time-consuming, expensive and difficult work that requires accurate experimental screening. Hence, computational prediction can be used for faster and cheaper screening. Today, many bioinformatics tools are available that can serve as a primary screening step to identify targets of interest more efficiently (Chen et al., 2011). There are limitations, however: given the incomplete knowledge of the immune response, it is not possible to develop an exact algorithm, and these methods work only approximately. Moreover, the predictions produced by different tools may differ markedly, so multiple tools should be used to reach a consensus result. Therefore, in the current research new linear and conformational B-cell epitopes were predicted by a combination of bioinformatics analyses.
B-cell epitopes are defined as regions on the surface of the native antigen that are recognized and bound by B-cell receptors or specific antibodies. B-cell epitopes can be categorized into two types: linear (continuous) epitopes and conformational (discontinuous) epitopes (Zhang et al., 2011). Linear epitopes comprise residues that are continuous in the sequence, while conformational epitopes consist of residues that are distantly separated in the sequence but lie in spatial proximity. These epitopes are the focus of pathogenesis and immunological research as well as targets for the development of vaccines, cancer therapies and diagnostic reagents (Jiang et al., 2010). Our review showed that no MAb targets subdomain III of HER2. Because this subdomain contributes to the formation of the ligand binding site of the receptor, we sought to design B cell epitopes in subdomain III. The identification of B-cell epitopes for subdomain III of HER2 ECD can therefore provide important information for the development of new MAbs and cancer vaccines that target different epitopes or structural domains of HER2 ECD. In the present study, the linear and conformational B-cell epitopes of HER2 ECD were screened and identified using comprehensive bioinformatics analyses. An integrated strategy was employed for B cell epitope prediction by combining results from sequence-based and structure-based methods and protein-protein interaction tools. Furthermore, we used the SUPERFICIAL software, which is based on protein structures and generates library proposals including linear and nonlinear peptides.

Prediction of B cell epitopes by integrated strategy
For determination of linear epitopes, the sequence of the ECD of the HER2 receptor was submitted to the ABCpred, BCPREDS, Bcepred, Bepipred and Ellipro web servers (Saha and Raghava, 2004; 2006; Larsen et al., 2006; Ponomarenko et al., 2008; Manzalawy et al., 2008a,b). For prediction of discontinuous (conformational) epitopes, the PDB structures 1N8Z, 1S78 and 2A91 and a sequence-based structure obtained from the m4t server, a fully automated comparative protein structure modeling server (http://www.fiserlab.org/servers/m4t), were submitted to the Discotope 1.2 server (Haste Andersen et al., 2006). CBtope was also used to predict conformational B cell epitopes from the antigen primary sequence (Ansari et al., 2010). Protein-protein interaction sites on the PDB structures were predicted using the cons-PPISP online tool (Bradford and Westhead, 2005). Solvent accessibility of all residues was calculated by ASAView (http://www.netasa.org/asaview). The absolute surface area (ASA) of each residue was calculated by the DSSP program, and these values were transformed to relative values of ASA. Default settings were applied to all the tools used. Regions recognized by four or more tools were selected. The protein structures were analyzed and viewed with PyMOL V 1.
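The selection rule described above (keep regions recognized by four or more tools) can be illustrated with a minimal sketch. It assumes each tool's output has already been reduced to inclusive residue ranges; the function name `consensus_regions` and the example ranges are hypothetical and are not the predictions reported in this work.

```python
from collections import Counter


def consensus_regions(predictions, min_tools=4):
    """Return inclusive (start, end) regions supported by at least min_tools tools.

    predictions maps a tool name to a list of inclusive (start, end) ranges.
    """
    votes = Counter()
    for ranges in predictions.values():
        covered = set()
        for start, end in ranges:
            covered.update(range(start, end + 1))
        # Each tool contributes at most one vote per residue.
        for position in covered:
            votes[position] += 1

    selected = sorted(p for p, n in votes.items() if n >= min_tools)

    # Merge consecutive residue positions into contiguous regions.
    regions = []
    for position in selected:
        if regions and position == regions[-1][1] + 1:
            regions[-1][1] = position
        else:
            regions.append([position, position])
    return [(start, end) for start, end in regions]


# Hypothetical ranges, for illustration only.
example = {
    "ABCpred": [(379, 394)],
    "BCPREDS": [(376, 395)],
    "Bcepred": [(380, 392)],
    "Bepipred": [(378, 393)],
    "Ellipro": [(400, 410)],
}
print(consensus_regions(example))  # [(380, 392)]
```

Voting per residue rather than per predicted peptide is one possible design choice; it tolerates the slightly different peptide boundaries that different servers report for the same region.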

Theoretical physico-chemical properties of the peptides
The theoretical physicochemical properties of the synthetic peptides, such as the ionic status (calculated as the isoelectric point) and the hydrophobicity (measured as the grand average of hydropathicity (GRAVY) index), were analyzed using the ProtParam algorithm (http://www.expasy.ch/tools/protparam.html). The GRAVY index indicates the hydrophobicity of the peptide and is calculated as the sum of the hydropathy values (Kyte and Doolittle parameters) of the constituent amino acids, divided by the number of residues in the sequence. Peptides with a positive GRAVY index are hydrophobic, whereas peptides with a negative GRAVY index are hydrophilic (Lebreton et al., 2011).
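As a worked illustration of the GRAVY definition above, the following minimal sketch sums the Kyte and Doolittle hydropathy values and divides by the peptide length. The function name `gravy` is an assumption made for this example; the values reported in this work were obtained with ProtParam, not with this code.

```python
# Kyte and Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide):
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    unknown = set(peptide) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


# Region 378-393 of HER2 ECD as given in the text.
print(f"{gravy('PESFDGDPASNTAPLQ'):.3f}")  # -0.925
```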

Prediction of B cell epitopes by SUPERFICIAL software
We used SUPERFICIAL (http://bioinformatics.charite.de/superficial), a structure-based program for designing peptide libraries that mimic the whole surface or a particular region of a protein. It automatically defines the protein surface, using preset default values suitable for a range of proteins. SUPERFICIAL takes protein structures (PDB codes) as input and generates library proposals consisting of linear and nonlinear peptides. Neighboring sequence segments are linked by short spacers to conserve local conformation. All generated peptides are listed in a table. Such a structure-based peptide library provides the source for chemically prepared peptides used to identify and characterize binding sites (Goede et al., 2005).

Sequence Alignment
The amino acid sequences of the ECD of the human and mouse ErbB2 receptors and of the peptides (P1 C, P2 C) were aligned using the Cobalt Constraint-based Multiple Protein Alignment Tool (http://www.ncbi.nlm.nih.gov/tools/cobalt/cobalt.cgi?link_loc=BlastHomeAd) and The Universal Protein Resource (UniProt)/Align (http://www.uniprot.org/) with default parameters. The output was manually checked to optimize the alignment.
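How a percent identity can be read off a finished pairwise alignment is sketched below. This is a minimal illustration, not the procedure used by Cobalt or UniProt/Align: the function name `percent_identity`, the choice of denominator (all alignment columns, excluding columns that are gaps in both sequences) and the toy sequences are assumptions made for the example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must be rows of the same alignment (equal length, gaps as '-').
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both rows (possible in multiple alignments).
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * identical / len(columns)


# Toy alignment, for illustration only (not taken from Figure 3).
identity = percent_identity("PESFDG-D", "PESLDGAD")
print(identity)          # 75.0
print(100.0 - identity)  # 25.0, one simple way to express foreignness
```

Other conventions divide by the length of the shorter sequence or by the number of ungapped columns, so the denominator should be stated whenever identities from different tools are compared.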

Prediction of B cell epitopes by integrated strategy
The predicted linear B-cell epitopes of subdomain III of HER2 ECD are shown in Table 1. For each peptide, the sequence, position, molecular weight, isoelectric point and GRAVY score are indicated. As shown in Table 1, P5 and P15 were recognized most frequently, by four and five of the tools used in this work. These peptides correspond to regions 379-394 and 491-506, respectively. P8 and P12 were identified only by ABCpred.
Furthermore, CBtope identified four conformational B-cell epitope regions, two of which lay in the regions of the linear B-cell epitopes (376-394 and 492-508). The conformational B-cell epitope 376-394 was also identified by the Discotope server (Table 2). Further analyses of protein-protein interaction sites and of the solvent accessibility of all residues on the PDB structures, using the cons-PPISP and ASAView online tools, showed that the B cell epitopes 378-393 (PESFDGDPASNTAPLQ), P1 C, and 500-510 (PEDECVGEGLA), P2 C, have higher solvent accessibility and that their residues are exposed on the surface (Figure 1). These are important factors for the immunogenicity of an antigen or synthetic peptide. Of the 170 residues in subdomain III of HER2 ECD, around 43% were nonpolar, 22% were charged and 34% were polar. The relative surface accessibility values calculated by ASAView showed that 34.91% of residues had greater than 30% solvent accessibility, whereas 27.81% of residues had greater than 50% solvent accessibility. The cons-PPISP online tool was used to determine buried residues; 40.24% of residues in subdomain III of HER2 ECD were

Figure 3. Multiple alignments between peptides (P1 C, P2 C) and ErbB2 ECD of human and mouse using the Cobalt Constraint-based Multiple Protein Alignment Tool.

Prediction of B cell epitopes by SUPERFICIAL software
The PDB structures 1N8Z, 1S78 and 2A91 and the sequence-based structure obtained from the m4t server were submitted to the SUPERFICIAL software. This program takes protein structures as input and generates library proposals consisting of linear and nonlinear peptides (Tables 3 and 4).
Tables 3 and 4 show the linear and nonlinear peptides, respectively. The linear peptides predicted by SUPERFICIAL were similar to the linear B cell epitopes obtained by the integrated strategy (Table 1). In Table 4, the conformational B-cell epitopes P4-P9 lay in the region of P1 C (378-394) of HER2 ECD subdomain III, and P13-15 were nearly identical to P2 C. SUPERFICIAL also predicted new conformational B cell epitopes, or nonlinear peptides, such as P1-3 and P10-12. To preserve their conformation, the gaps between the peptide fragments are filled with linkers, short amino acid sequences derived from the LIP (Loops in Proteins) database (Michalsky et al., 2003). In some peptides, amino acids are represented by the character "X", which in practice could be replaced by poly-alanine and/or glycine instead of using the linker. Figure 2 shows the 3D view of the predicted conformational B-cell epitope PESFDGD-X-TAPLQ (378-385, 389-393) in SUPERFICIAL and PyMOL.
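The construction of a nonlinear peptide from separate sequence segments joined by a linker can be sketched as follows. This is only an illustration of the idea, not SUPERFICIAL's algorithm: the function name `build_nonlinear_peptide` is hypothetical, the default linker is the placeholder "X", and the segment boundaries in the example (378-384 and 389-393) were chosen so that the output reproduces the printed string PESFDGD-X-TAPLQ.

```python
def build_nonlinear_peptide(region, first_residue, segments, linker="X"):
    """Join inclusive residue ranges of a region with a linker string.

    region is the amino acid sequence whose first letter has the residue
    number first_residue; segments is a list of inclusive (start, end) pairs.
    """
    last_residue = first_residue + len(region) - 1
    parts = []
    for start, end in segments:
        if not (first_residue <= start <= end <= last_residue):
            raise ValueError(f"segment {start}-{end} lies outside the region")
        parts.append(region[start - first_residue:end - first_residue + 1])
    return linker.join(parts)


region_378_393 = "PESFDGDPASNTAPLQ"  # residues 378-393 as given in the text
segments = [(378, 384), (389, 393)]

print(build_nonlinear_peptide(region_378_393, 378, segments))
# PESFDGDXTAPLQ
print(build_nonlinear_peptide(region_378_393, 378, segments, linker="GG"))
# PESFDGDGGTAPLQ (glycine spacer; the length of two is an arbitrary assumption)
```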

Sequence alignment
Alignment of the ECDs of the human and mouse ErbB2 receptor isoforms using UniProt showed 85.42% identity, with 539 identical positions and 63 similar positions. Additionally, the sequence homology of the predicted HER2 peptides was tested against the human and mouse ErbB2 sequences (Figure 3). The peptides showed 100% homology with HER2 ECD. The alignment also indicated 64 and 50% homology of peptides P1 C and P2 C with mouse ErbB2 ECD, respectively. These results describe the degree of foreignness of HER2 ECD for immunization of mice. It should be noted that foreignness is a very important factor for the immunogenicity of an antigen or synthetic peptide (Richard et al., 2003).
Asian Pacific Journal of Cancer Prevention, Vol 13, 2012. DOI: http://dx.doi.org/10.7314/APJCP.2012.13.7.3053

Discussion
Our study is the first to report the identification of novel conformational B-cell epitopes of HER2 ECD subdomain III. Although the majority of epitopes are conformational, most computational methods and databases are centered on sequential epitopes (Vita et al., 2010). Linear epitope prediction methods can be categorized into those based on physicochemical properties, Hidden Markov Models (HMM) and Artificial Neural Networks (ANN) (Saha and Raghava, 2004; Larsen et al., 2006; Saha and Raghava, 2006). For this purpose, ABCpred, BCPREDS, Bepipred, Bcepred and Ellipro were used in this research. If the tertiary structure of an antigen is known, improved methods are available for the prediction of conformational B cell epitopes. Examples are the SUPERFICIAL software and the Discotope web server (Goede et al., 2005; Haste Andersen et al., 2006; Moreau et al., 2008; Ponomarenko et al., 2008). These are based on features such as flexibility, solvent accessibility and amino acid propensity scales (Kulkarni-Kale et al., 2005; Haste Andersen et al., 2006).
In the current study, attempts were made to predict linear and conformational B cell epitopes in the HER2 receptor from both the primary sequence and the tertiary structure. Two different methods based on bioinformatics analyses were used. In the first method, the integrated strategy, the best peptides were the linear B-cell epitopes P5: 379-394 and P15: 491-506 (Table 1) and the conformational B-cell epitopes P1 C: 378-393 and P2 C: 500-510 (Table 2), which differ from the previously reported epitopes for subdomain III of the receptor. The linear and conformational epitopes lay in approximately the same regions of HER2 ECD. Based on secondary structure prediction results (PSIPRED), positions 378-393 of HER2 ECD have an alpha-helical structure. In the second method, the SUPERFICIAL software predicted many new conformational peptides joined by linkers (Table 4). This software also predicted the conformational B-cell epitopes P1 C: 378-393 and P2 C: 500-510 in the form of two segments, or linear B-cell epitopes, joined by a linker (Figure 2 and Table 3).
In the majority of past studies on the HER2 receptor, either linear B cell epitopes or T-cell epitopes were used as the synthetic peptide or antigen, while the vaccine peptides were designed to be chimeric, with multiple epitopes for B cells and helper T cells (Dakappagari et al., 2000; Montgomery et al., 2005). Synthetic peptides can be presented in different ways, such as conjugation to different carrier proteins, incorporation into liposomes, covalent conjugation to fatty acids and generation of multiple antigen peptides (MAP) (Haro and Gomara, 2004; Veje et al., 2011). Miyako et al. (2011) generated 22 kinds of synthetic MAPs containing 20 amino acid residues, based on the region from N: 143 to N: 370 of HER2 ECD. They identified a new epitope of MAb CH401 against the HER2 extracellular domain (N: 167-175) and evaluated the effect of active immunization with the 20mer peptide containing the epitope (CH401 peptide). The 20mer peptide (N: 163-182) including the CH401 epitope (CH401 epitope peptide) was immunogenic, causing a high titer of peptide-specific IgG antibody in immunized BALB/c mice, because it contained epitopes for both B cells and helper T cells. In contrast to the present study, these epitopes were located on subdomain I of HER2 ECD. In addition, Dakappagari et al. (2003) identified two linear HER2 ECD B cell epitopes (316-339 and 485-503) based on computer-aided analyses. These epitopes were incorporated with the measles virus fusion (MVF) 'promiscuous' T cell epitope via a four-residue linker sequence. They induced high-titer Abs that inhibited the growth of human breast cancer cells (Dakappagari et al., 2003). Moreover, Garrett et al. (2007) reported a phase I clinical trial using the first-generation peptide vaccines MVF 316-339 and MVF 628-647. They also used the peptide constructs MVF 613-626, MVF 563-598 and MVF 597-626. All epitopes were immunogenic in FVB/n mice, and the 597-626 epitope significantly reduced cancer growth in transgenic BALB-neuT mice.
In addition, Wiwanitkit (2007) predicted B cell epitopes of the HER2 oncoprotein with the ABCpred prediction server. He identified the peptide 947SRMARDPQRFVVIQNE963 as having the best binding affinity. Thus, the findings of the present work differ from the epitopes or peptides predicted in previous studies (Dakappagari et al., 2003; Garrett et al., 2007; Wiwanitkit, 2007; Miyako et al., 2011). Furthermore, Rockberg et al. (2009) generated antibodies to linear epitopes of HER2 that affect cell growth, using a strategy based on polyclonal antibodies. The functional antibodies with growth-inhibiting properties mapped to domains II and III of the receptor. One of the epitopes mapped to domain II overlapped with the conformational epitope recognized by pertuzumab (266-296), while, similar to our work, other epitopes on the surface of domain III were close to each other in a linear stretch of 25 amino acids, LPESFDGDPASNTAPLQPEQLQVF. Jasinska et al. (2003) also identified seven B cell epitopes on the sequence of HER2 ECD by computer analyses. The peptides (P1: 115-132 AVLDNGDPLNNTTPVTGA; P2: 149-162 LKGGVLIQRNPQLC; P3: 274-295 YNTDTFESMPNPEGRYTFGAS; P4: 378-398 PESFDGDPASNTAPLQPEQLQ; P5: 489-504 PHQALLHTANRPEDE; P6: 544-560 CRVLQGLPREYVNARHC; P7: 610-623 YMPIWKFPDEEGAC) were used for immunization and successfully induced a humoral immune response with anti-tumor activity in an animal model. The linear peptide P5: 379-394 that we predicted is similar to epitope P4: 378-398 used by Jasinska et al. (2003). In another study, Dakappagari et al. (2005) designed a complex immunogen, a conformational B-cell epitope peptide vaccine derived from the extracellular domain of HER2-(626-649) that represents a three-dimensional epitope. They successfully introduced two disulfide bonds into this sequence.
Their study demonstrates the feasibility and importance of designing conformational epitopes that mimic the tertiary structure of the native protein in order to elicit biologically relevant anti-tumor antibodies. Such approaches are a prerequisite for the design of effective peptide vaccines and immunogenic synthetic peptides. In the present work, the new conformational B cell epitopes P1 C: 378-393 (PESFDGDPASNTAPLQ) and P2 C: 500-510 (PEDECVGEGLA) (Table 2) were predicted by bioinformatics analyses (integrated strategy and SUPERFICIAL software). These peptides are discontinuous parts of regions 378-393 and 500-510 of HER2 ECD, which were also predicted with linear B cell epitope tools.
Tolerance to self antigens such as HER2 may limit a functional immune response to whole-protein antigens because of the activation of suppressor T cells that maintain tolerance to host antigens (Disis and Cheever, 1997; Sakaguchi, 2000; Bonilla and Oettgen, 2010), while immunizing with peptides derived from self antigens may be an effective means of circumventing tolerance (Ercolini et al., 2003). However, it should be noted that peptides with a sequence identical to that of a self antigen will not be able to break self tolerance, since the specific lymphocytes are expected to have been deleted during the establishment of self identity. Mittelman et al. (2002) analyzed the level of similarity of HER2 peptides to the host's proteome to understand the relationship between molecular mimicry and peptide immunogenicity. They reported that a low level of sequence similarity to the host's proteome plays an important role in shaping the pool of B cell epitopes and in determining peptide immunogenicity. This finding has important implications for HER2 cancer therapy. For this purpose, the foreignness, or similarity level, of the peptides was evaluated in our study by multiple sequence alignment. The results indicated that the peptides are less similar to mouse ErbB2 ECD than to human HER2 ECD (Figure 3). Furthermore, the peptides P1 C and P2 C showed a high degree of foreignness (50 and 36%, respectively) for immunization of mice.
In conclusion, the findings of the present work, obtained using bioinformatics analyses, could be applied in MAb production, cancer therapy, vaccine design and the development of diagnostic tools, reducing the time and the total number of tests required to find suitable epitopes. As a next step, in vitro synthesis of the identified peptides and in vivo experimental studies are essential to confirm the predicted epitopes. In these experiments, identifying epitopes on HER2 that are either stimulatory or inhibitory is very important for developing strategies to better manipulate the Ab response for therapeutic benefit. Thus, selecting suitable epitopes from specific regions of HER2 that are capable of inducing cancer-inhibitory MAbs, and carefully eliminating epitopes that stimulate cancer cell growth, will identify candidate peptides that could provide beneficial effects.