Targeted Resequencing of 30 Genes Improves the Detection of Deleterious Mutations in South Indian Women with Breast and/or Ovarian Cancers

RESEARCH ARTICLE
Thangarajan Rajkumar 1*, Balaiah Meenakumari 1, Samson Mani 1, Veluswami Sridevi 2, Shirley Sundersingh 3

Introduction
Hereditary breast and ovarian cancers constitute around 5-10% of breast and ovarian cancers. In almost all of the urban cancer registries in India, breast cancer has overtaken cervical cancer and is the most common cancer among women in these cities (Ferlay et al., 2013). The identification of BRCA1 and BRCA2 as genes associated with hereditary breast and/or ovarian cancers, and the availability of reliable technologies for identifying mutations in all the coding regions of these genes, have helped in screening high-risk, cancer-affected women for mutations in these genes (Majeed et al., 2014). Patients with deleterious mutations in TP53 have an increased risk for breast cancer in addition to brain tumours, acute leukaemias, bone cancers, sarcomas and gastrointestinal cancers, among others (Nichols et al., 2001).
In addition to the above-mentioned genes, further genes contributing to intermediate risk for breast cancer have been identified. These include STK11 (Peutz-Jeghers syndrome) (Hearle et al., 2006), PTEN (Cowden's syndrome) (Saal et al., 2008) and CDH1 (gastric cancer and lobular breast cancer) (Masciari et al., 2007; Schrader et al., 2008). NBN (Nijmegen breakage syndrome) has been shown to be associated with a 3-fold increased risk for breast cancer in carriers of a 5-bp deletion (Bogdanova et al., 2008). Other intermediate-risk genes include RAD50, CHEK2, BRIP1, PALB2, FANCD1 and ATM (Ripperger et al., 2009; Yang et al., 2012). These genes have been associated with an odds ratio above 2 when there is a heterozygous germline mutation. The mismatch repair (MMR) genes MSH2, MLH1, MSH3, MSH6, PMS1, PMS2 and MUTYH are known to be associated with increased ovarian cancer risk, while their association with breast cancer risk has ranged from intermediate to low (Scott et al., 2001; Vasen et al., 2001). Scott et al. (2001) studied MSH2 and MLH1 mutations in HNPCC patients and found a significantly increased risk for breast cancer among those with MLH1 mutations.
Low-penetrance single nucleotide polymorphisms (SNPs) contributing to breast cancer susceptibility constitute the third, low-risk group. These SNPs generally have an odds ratio of less than 2, usually around 1.2. They include, but are not limited to, SNPs in FGFR2, TGFB1, LSP1, MAP3K1 and TOX3 (Cox et al., 2007; Easton et al., 2007; Samson et al., 2007; Shin et al., 2008; Samson et al., 2009; Luo et al., 2014; Sharma et al., 2014). Several genome-wide association studies that are underway may provide additional information on these low-risk polymorphisms (Mahdi et al., 2013).
The introduction of next-generation sequencing (NGS) platforms has enabled more comprehensive coverage of potential genes associated with these cancers. Targeted resequencing with these massively parallel sequencing technologies not only allows samples to be multiplexed, thereby bringing the cost down, but also allows more genes of interest to be studied. With the increasing capacity of the technology to generate terabytes of data, whole exome sequencing at a fraction of the current cost is likely to become possible. Walsh et al. (2010) used targeted resequencing to analyse 21 genes likely to be involved in hereditary breast and ovarian cancers.
NGS platforms come with their own set of problems as well. The volume of data generated is huge and requires immense storage capacity. The method of creating libraries may have implications for the patterns of errors in the reads. Additionally, the data analysis algorithms are still evolving, and it will take some time before the analysis is free of false positives (Meldrum et al., 2011; Braggio et al., 2013).
In our earlier study we used PCR-dHPLC, followed by Sanger sequencing when an abnormality was detected (Soumittra et al., 2009). Since June 2012, we have used targeted resequencing covering 30 genes involved in DNA repair pathways, including BRCA1 and BRCA2. This manuscript discusses the results of this analysis.

Patient samples
One hundred and forty-eight eligible cases, including the 91 published earlier (Soumittra et al., 2009), were screened for mutations in BRCA1 and BRCA2. The eligibility criteria were as described earlier (Soumittra et al., 2009). Screening for mutations in these two genes was done using PCR-dHPLC, followed by Sanger sequencing when an abnormality was detected, as described earlier (Soumittra et al., 2009). Twenty-one samples were found to contain a deleterious mutation in either BRCA1 or BRCA2 and were excluded from NGS analysis. Of the remaining 127 samples, in which PCR-dHPLC had not detected a deleterious mutation, 74 had sufficient good-quality DNA for targeted sequencing and were included; the rest had no, insufficient or poor-quality DNA and were excluded. In addition, samples from 17 patients that had not been analysed earlier were included, bringing the total to 91. In Run 3, a patient DNA sample with a 7 base pair insertion in exon 16 of BRCA1 was included as a positive control and, in addition, DNA from 2 healthy individuals with no family history of cancer was included as healthy controls. The study analysed 30 genes essentially involved in the DNA repair pathway (Table 2) (Walsh et al., 2010).
All the patients had provided their informed consent for the analysis, and the study had been cleared by the Institutional ethical committee.
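The sample accounting described in this section can be checked with a few lines of arithmetic. The sketch below uses only the counts stated above; the variable names are illustrative and are not part of the original study.

```python
# Sample accounting for the NGS cohort, using the counts stated in the text.
# Variable names are illustrative only.
eligible_cases = 148      # screened earlier by PCR-dHPLC
dhplc_positive = 21       # deleterious BRCA1/BRCA2 mutation found; excluded from NGS
dhplc_negative = eligible_cases - dhplc_positive

reanalysed = 74           # dHPLC-negative samples with adequate DNA
excluded_for_dna = dhplc_negative - reanalysed
new_patients = 17         # not analysed earlier
patient_samples = reanalysed + new_patients

positive_controls = 1     # BRCA1 exon 16 insertion, included in Run 3
healthy_controls = 2
all_samples = patient_samples + positive_controls + healthy_controls

print(dhplc_negative, excluded_for_dna, patient_samples, all_samples)
```

The intermediate values reproduce the 127 dHPLC-negative samples and the 91 patient samples quoted in the text.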

DNA extraction
Peripheral blood mononuclear cells were separated from 20 ml of peripheral blood collected in EDTA. The cells were stored in aliquots, and one aliquot was used for DNA extraction using the QIAamp DNA Blood kit (Qiagen), as per the manufacturer's protocol. The quantity, quality and integrity of the extracted DNA were checked using NanoDrop and Bioanalyzer.

Targeted resequencing
Briefly, 1 µg of DNA was sheared to a fragment size of 250 bp using a Bioruptor (Diagenode, Belgium). The DNA was end repaired, A tailed, adapter ligated and run on a 2% gel for size selection of 300-500 bp fragments, and then used for library preparation using the TruSeq DNA sample preparation kit (Illumina, Inc, USA), as per the manufacturer's instructions. Around 2.4 Mb of targeted sequence, including 2 kb upstream and downstream of the 30 genes (Table 1), was captured using the TruSeq custom selected oligos enrichment kit (Illumina, Inc, USA). Ten pmoles of the captured library DNA was denatured and subjected to cluster generation using the cBot instrument (Illumina, Inc, USA) on a paired-end flow cell v.3.0. The sequencing was done using SBS version 3.0 reagent in a HiScanSQ instrument (Illumina, Inc, USA). The samples were run in 3 different flow cells: 24 samples in the first run, 25 in the second and 60 in the third (including 13 samples with HNPCC and other conditions).
The R1S2 sample, which had multiple intronic and exonic polymorphisms in BRCA1 and BRCA2 detected by our earlier PCR-dHPLC, was used as a positive control to help set the parameters for variant detection (Table 2).

Data analysis
The .bcl files generated by the instrument were converted into FASTQ files using CASAVA software (Illumina, Inc, USA). CLC Bio suite (CLC Bio, Aarhus, Denmark) version 5.5 was used for further processing of the data. Alignment was done using all 24 chromosomes of NCBI GRCh37 as the reference.
The mapping parameters were as follows: No masking; Mismatch cost 2; Insertion cost 3; Deletion cost 3; Length fraction 0.8; Similarity fraction 0.8; No global alignment; and Ignore non-specific matches. For variant detection, the Quality-based variant detection tool was used with the following parameters: Neighbourhood radius 5; Maximum gap and mismatch count 2; Minimum neighbourhood quality 30; Minimum central quality 35; Minimum variant frequency (%) 25; Minimum coverage 30; and Ignore non-specific matches and broken pairs. Indels in the coding region and SNPs resulting in premature termination were taken up for validation using Sanger sequencing.
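As an illustration of how the coverage and frequency cut-offs and the selection rule for Sanger validation fit together, here is a minimal Python sketch. It does not reproduce the CLC Bio algorithm, whose quality-based steps are not modelled; the `Variant` record and the function names are hypothetical, and only the two thresholds (minimum coverage 30, minimum variant frequency 25%) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one called variant (not a CLC data structure)."""
    kind: str               # "SNV" or "indel"
    coverage: int           # reads covering the position
    variant_reads: int      # reads supporting the variant
    in_coding_region: bool
    creates_stop: bool      # True if the change introduces a premature stop codon


# Thresholds quoted in the text.
MIN_COVERAGE = 30
MIN_VARIANT_FREQUENCY_PCT = 25.0


def passes_thresholds(v: Variant) -> bool:
    """Apply the minimum coverage and minimum variant frequency cut-offs."""
    if v.coverage < MIN_COVERAGE:
        return False
    frequency_pct = 100.0 * v.variant_reads / v.coverage
    return frequency_pct >= MIN_VARIANT_FREQUENCY_PCT


def needs_sanger_validation(v: Variant) -> bool:
    """Coding indels and stop-gain SNVs go forward to Sanger sequencing."""
    if not passes_thresholds(v):
        return False
    if v.kind == "indel":
        return v.in_coding_region
    return v.creates_stop


# Made-up example calls, for illustration only.
calls = [
    Variant("indel", 120, 55, True, False),   # coding indel, about 46% of reads
    Variant("SNV", 80, 38, True, True),       # stop-gain SNV
    Variant("SNV", 25, 12, True, True),       # coverage below 30
    Variant("indel", 200, 30, True, False),   # 15% frequency, below 25%
    Variant("SNV", 90, 45, True, False),      # missense change, not pursued here
]
selected = [needs_sanger_validation(v) for v in calls]
print(selected)
```

Only the first two example calls are selected; the others fail on coverage, on variant frequency, or because they are neither a coding indel nor a termination SNP.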

Sanger sequencing
NGS-identified deleterious mutations (indels and termination SNPs) were validated using Sanger sequencing as described earlier (Rajkumar et al., 2003). All the samples showing a deleterious abnormality were reconfirmed using DNA freshly extracted from a different aliquot of cells from the same patient.

Survival analysis
Associations between clinico-pathological features (age, stage, grade and disease-free status) and deleterious mutation status were analysed using the Chi-square test of independence. Kaplan-Meier survival probabilities (disease-free survival [DFS] and overall survival [OS]) were computed for deleterious mutation-positive and mutation-negative cases, and the differences were tested using the log-rank test.
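The survival comparison described above can be sketched in plain Python. The article does not name the statistical software used, so the following is only an illustrative implementation of the Kaplan-Meier product-limit estimator and the two-group log-rank test; the function names and the follow-up data are hypothetical and are not taken from the study.

```python
import math


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value), 1 df."""
    group_a = list(zip(times_a, events_a))
    group_b = list(zip(times_b, events_b))
    event_times = sorted({t for t, e in group_a + group_b if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x, _ in group_a if x >= t)   # at risk in group A
        n_b = sum(1 for x, _ in group_b if x >= t)   # at risk in group B
        d_a = sum(1 for x, e in group_a if x == t and e)
        d_b = sum(1 for x, e in group_b if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical follow-up data (months); 1 = event, 0 = censored.
positive_times, positive_events = [6, 13, 21, 30, 37], [1, 0, 1, 1, 0]
negative_times, negative_events = [9, 16, 24, 33, 40], [1, 1, 0, 1, 0]

print(kaplan_meier(positive_times, positive_events))
print(logrank_test(positive_times, positive_events, negative_times, negative_events))
```

In practice a validated statistics package would normally be used; the sketch is meant only to make the two calculations explicit.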

Results
Of the 94 samples, two were from healthy individuals and one was a positive control (BRCA1:c.4704_4705insTGGAATC; p.S1569Wfs*7). Thus the data presented here correspond to 91 patient samples, of which 74 had been analysed earlier using PCR-dHPLC. The BAM files created after the alignment have been submitted to SRA (PRJNA208774).
The R1S2 sample had 6 intronic and exonic polymorphisms in BRCA1 and one in BRCA2, as seen with our earlier PCR-dHPLC method. With the analysis parameters set, targeted resequencing picked up all 7 polymorphisms (Figure 1), as well as 4 additional ones in the exonic region (Figure 2). The 7 bp TGGAATC insertion in BRCA1 in sample R3S21 lies in a repeat sequence; the software did not align it correctly (Figure 3) and called it as a 7 bp CTGGAAT insertion instead.
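One possible reading of the TGGAATC versus CTGGAAT discrepancy is that, inside a repeat, the same insertion can be described at neighbouring positions with a rotated inserted sequence. The Python sketch below illustrates this general property only; the flanking sequence is hypothetical, not the actual BRCA1 exon 16 sequence, and whether this fully explains the software's call in this study is an assumption.

```python
def insert_at(reference: str, offset: int, inserted: str) -> str:
    """Return the sequence with `inserted` placed after the first `offset` bases."""
    return reference[:offset] + inserted + reference[offset:]


def is_rotation(a: str, b: str) -> bool:
    """True if b is a cyclic rotation of a."""
    return len(a) == len(b) and b in a + a


# Hypothetical flanking sequence containing one copy of the repeat unit.
reference = "ACGT" + "TGGAATC" + "AGG"

reported = insert_at(reference, 11, "TGGAATC")  # inserted after the existing copy
called = insert_at(reference, 10, "CTGGAAT")    # inserted one base to the left

print(reported == called)
print(is_rotation("TGGAATC", "CTGGAAT"))
```

Both comparisons evaluate to True: the two descriptions produce the same mutated sequence, which is why repeat-region indels generally need a consistent placement convention before calls from different methods are compared.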
The most common deleterious mutation in our series (n=10; 22%) is the Ashkenazi mutation (BRCA1 c.68_69delAG; p.Glu23ValfsX16). The ethnic characteristics of the women who had the Ashkenazi mutation are given in Table 3. This mutation was seen in a wide range of ethnic populations from different parts of the country, and also in patients without any family history of cancer (n=3). A multicentre study is ongoing to carry out haplotype analysis in these patients.
The number of novel deleterious mutations in BRCA1 and BRCA2 was 6 and 2, respectively, while the deleterious mutations in ATM, RAD50, RAD52 and TP53BP1 were all novel. Clinico-pathologic variables such as age, stage of disease, histologic type, grade of tumour, ER, PR, HER2, lymph nodal status and disease-free status were analysed for correlation with deleterious mutation status. We did not find any significant correlation between the clinico-pathologic parameters and deleterious mutation status.
The Kaplan-Meier survival analysis was done by comparing patients with deleterious mutations against those without detectable deleterious mutations in the genes studied. The analysis was also repeated with the genes limited to BRCA1 and BRCA2. For the latter, survival details were available for 29 mutation-positive patients and 82 patients who did not have a deleterious mutation. There was no significant difference in survival between deleterious mutation-positive and mutation-negative patients.

Discussion
Our targeted resequencing study analysing 30 genes in patients with possible hereditary breast and/or ovarian cancers is the first from India. The genes were chosen based on their role in the DNA repair pathway as well as on the literature [e.g. CDH1 in lobular carcinoma and gastric cancer]. The 30-gene panel also helped in the analysis of other hereditary cancer syndromes such as hereditary non-polyposis colon cancer [HNPCC]. The total number of samples with evaluable data is 91, of which 74 had been analysed earlier using PCR-dHPLC and had not been found to carry a deleterious mutation. In addition, data from 17 patients' samples analysed for the first time were evaluable. This study primarily presents data on frameshift and nonsense mutations, which have a potentially deleterious effect. We have not pursued missense mutations of unknown functional effect or the known SNPs.
BRCA1 deleterious mutations were more common than BRCA2 mutations. This differs from the pattern seen in the Philippines, where BRCA2 deleterious mutations predominate (Laudico et al., 2009). Of the deleterious mutations detected in our population, BRCA1 c.68_69delAG; p.Glu23ValfsX16 is the most common [n=10]. Interestingly, 3 of the 10 patients with the mutation did not have a family history of cancer. The others were from different regions and castes with a low frequency of intercaste marriages. This suggests that the mutation may not be a classical founder mutation for south Indian patients. However, we will need to wait for the haplotype analysis. Bar-Sade et al. (1998) have also shown that a non-Ashkenazi population with a distinct haplotype carries the same mutation. Similar to our study, Vaidyanathan et al. (2009) found that 10/61 patients in their series carried the Ashkenazi mutation. However, the ethnicity of the patients studied was not mentioned. Kooshyar et al. (2013) described detecting 185delAG in 2/39 patients with breast cancer in northeastern Iran.
The 5 mutations [4 missense and 1 nonsense] seen in p53 are likely to interfere with the function of the p53 protein, as per the IARC p53 database (Petitjean et al., 2007). Of these, G245S and R273H are non-functional, while R267Q and N263D are partially functional. R306* is a nonsense mutation with functional effects. Among these mutations, N263D has not been reported as a germline mutant in the IARC database (Petitjean et al., 2007). Of the 5, one was found in a family with Li Fraumeni syndrome (Tinat et al., 2009), with breast cancer in the proband at 41 years, a brain tumour in her 12-year-old daughter and breast cancer in her sister at 60 years; another patient had early onset ovarian carcinoma [27 years] and with her ...
In addition, we saw the RAD52 S346* and Y415* variants, which are classified as non-pathogenic polymorphisms, in 5 and 8 samples respectively (Bell et al., 1999; Tong et al., 2003). Interestingly, Han et al. (2002), while reporting no significant breast cancer risk for S346* in their case-control study, mentioned a patient with the S346* polymorphism who had both breast and uterine cancers. Our family with a deleterious RAD52 mutation [c.519delA; p.Arg173Serfs*3] had breast and uterine cancers. The study by Han et al. (2002) looked only for the specific S346* mutation and hence could have missed other potentially deleterious mutations in RAD52.
The ATM c.8495_8505delGTTACTTCTGC; p.Arg2832Hisfs*12 mutation was seen in a woman with breast cancer at the age of 38 years [proband]. Her paternal aunt had a cancer, possibly of the ovary [no documents about the site available], at 48 years of age and died. ATM is a kinase that senses DNA damage and activates several proteins involved in DNA damage repair. The kinase domain [amino acid residues 2712-2962] and the FATC domain [residues 3024-3056] are likely to be affected by this truncating mutation (De la Torre et al., 2003).
The TP53BP1 c.1362delT; p.Ile455Serfs*36 frameshift mutation was seen in a patient who also had a deleterious mutation in BRCA2 Ex11D, c.5388delT; p.Asp1796Glufs*9. The proband was an early onset breast cancer patient [33 years] whose mother had breast cancer at the age of 47 years. The Tudor-like domain [residues 1483-1604], BRCT1 [residues 1724-1848] and BRCT2 [residues 1864-1964] are likely to be lost, compromising the function of the protein.
There were 6 novel mutations in BRCA1, 2 in BRCA2 and one in p53, while the mutations in ATM, RAD50, RAD52 and TP53BP1 were all novel [13/45; 29%]. This highlights the genetic differences in south Indian women and emphasizes that complete analysis of the genes, rather than a search for a few known mutants, may be required to identify these high-risk women.
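The proportions quoted for novel mutations and for the Ashkenazi mutation can be reproduced as a quick arithmetic check. The counts below are those reported in the article (45 deleterious mutations in total, 13 of them novel, 10 carriers of BRCA1 c.68_69delAG); the variable names are illustrative.

```python
# Counts reported in the article; variable names are illustrative.
total_deleterious = 45
novel_total = 13
novel_high_risk = {"BRCA1": 6, "BRCA2": 2, "p53": 1}

# Novel mutations left for ATM, RAD50, RAD52 and TP53BP1 combined.
novel_other_genes = novel_total - sum(novel_high_risk.values())

pct_novel = round(100 * novel_total / total_deleterious)
pct_ashkenazi = round(100 * 10 / total_deleterious)

print(novel_other_genes, pct_novel, pct_ashkenazi)
```

The rounded percentages agree with the 29% and 22% given in the text.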
Kaplan-Meier survival curves did not show any significant difference between patients with deleterious mutations and patients with no deleterious mutations.
In conclusion, we present the targeted resequencing data for 30 genes in 91 breast cancer patients with a family history or early onset disease [35 years or less]. Our data indicate that the NGS platform can help identify more individuals with deleterious mutations in a cost-effective manner.

Table 2. Collated Data on Deleterious Mutants
*One patient had both a BRCA2 and a TP53BP1 deleterious mutation.

Including the samples analysed earlier, we have data on 165 samples. Of the 148 samples run with PCR-dHPLC, we had identified a deleterious mutation in 21. Now, including the NGS data, we have a total of 45 deleterious mutations detected in 44 patients, involving both high-risk (BRCA1, BRCA2 and p53) and intermediate-risk (RAD50, RAD52, ATM and TP53BP1) genes. The deleterious mutation in TP53BP1 was seen in a patient who also had a deleterious BRCA2 mutation. The collated data are summarised in Table 2.

Figure 1.

TTGAATC is Repeated Twice in the Normal Sequence; However, the Faded out Sequence Clearly Shows the First Run of the Repeat Sequence which is Followed by Two more Repeats
BRCA1-Exon 16-c.4704_4705insTTGAATC, p.Ser1569Leufs*7
DOI: http://dx.doi.org/10.7314/APJCP.2015.16.13.5211

Table 3. Ethnic Characteristics of the Women who had Ashkenazi Mutation
The proband had developed breast cancer at the age of 28 years; her mother had died of uterine cancer detected at 32 years of age, and her sister had breast cancer at the age of 34 years. The RAD52 gene is an important component of the homology-dependent double-stranded DNA break repair mechanism. Sequences distal to residue 220 are critical for binding to RPA [replication protein A complex], and residues 290-330 are needed for RAD51 binding (Park et al., 1996).