Characterization of BRCA1/2 mutations in patients with family history of breast cancer in Armenia

Background. Breast cancer is one of the most common cancers in women worldwide. Germline mutations of the BRCA1 and BRCA2 genes are the most significant and best characterized genetic risk factors for hereditary breast cancer. Intensive research in recent decades has demonstrated that the incidence of mutations varies widely among populations. In this pilot study we aimed to identify and characterize mutations in the BRCA1 and BRCA2 genes among Armenian patients with a family history of breast cancer and their healthy relatives. Methods. We performed targeted exome sequencing of the BRCA1 and BRCA2 genes in 6 patients and their healthy relatives. After alignment of short reads to the reference genome, germline single nucleotide variation and indel discovery was performed using GATK software. Functional implications of the identified variants were assessed using the ENSEMBL Variant Effect Predictor tool. Results. In total, 39 single nucleotide variations and 4 indels were identified, of which 15 SNPs and 3 indels were novel. No known pathogenic mutations were identified, but 2 SNPs causing missense amino acid substitutions had significantly increased frequencies in the study group compared to the 1000 Genomes populations. Conclusions. Our results demonstrate the importance of screening BRCA1 and BRCA2 gene variants in the Armenian population in order to identify the specifics of the mutation spectrum and frequencies and to enable accurate risk assessment of hereditary breast cancers.


Introduction
Breast cancer (BC) is one of the most common cancers in females worldwide 1 and particularly in Armenia 2 . Although the disease is highly prevalent in developed countries, it has also become highly prevalent in developing countries (50% of all cancer cases) and is characterized by a high mortality rate (58% of all breast cancer related deaths) 3 .
The germline mutations of the BRCA1 4 and BRCA2 5 genes are the most significant and well characterized genetic risk factors for hereditary breast cancer, which constitutes about 5-10% of all cases 6 . Inherited mutations in BRCA1 and BRCA2 genes account for 30-50% of all known mutations associated with this disease 7,8 . Women who carry BRCA1 mutations are particularly susceptible to the development of breast cancer before the age of 35-40 with a probability rate of 45%-60%, whereas women who inherit a BRCA2 mutation have a 25%-40% risk of developing breast cancer 7,8 . The association of BRCA1/BRCA2 gene mutations with breast cancer was first well described in Ashkenazi Jews [8][9][10][11] . Intensive research in the last decades has demonstrated that the incidence of mutations in high-risk families varies widely among different populations 6 . For example, the mutations in BRCA1 and BRCA2 were each estimated to account for 45-50% of families with multiple cases of breast and ovarian cancer in UK and USA 3,12 , whereas mutation prevalence among African-Americans with family breast and ovarian cancer history was 16.3% for BRCA1 and 11.3-14.4% for BRCA2 13,14 , which is significantly lower compared to Caucasian populations. Identification of the BRCA1/BRCA2 mutations in different populations and ethnic groups is an important endeavor, which enables geneticists and oncologists to make more specific choices in genetic testing of members of high-risk families 15-17 .
Here we have attempted to perform a pilot study for identification and characterization of mutations in BRCA1 and BRCA2 genes among Armenian patients with family history of breast cancer and their healthy relatives.

Samples
Six patients with confirmed family history of breast cancer (at least two cases in a family) and their first-degree healthy relatives were recruited in this study (except for the BC10 patient, see Table 1

Results
In this study we performed exome sequencing of the BRCA1 and BRCA2 genes in patients of Armenian origin with a positive family history of breast cancer and in their healthy relatives. Patients' clinical data and the family structure of the studied subjects are presented in Table 2 and Supplementary file S1.
In total, variant calling resulted in detection of 232 sequence variations (200 SNVs and 32 indels, Supplementary datasets S2 and S3). Thirty-nine SNVs and 4 indels passed the thresholds after applying hard filters (Table 3).
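The hard-filtering step can be illustrated with a minimal sketch. The thresholds actually used are not given in this section, so the values below are assumptions taken from commonly cited GATK hard-filtering recommendations. The record layout (`ref`, `alt`, `info`) and the example variants are hypothetical and do not come from the study data.

```python
# Minimal sketch of hard filtering on variant annotations.
# ASSUMPTION: thresholds follow commonly cited GATK hard-filtering
# recommendations; they are not the values reported by the authors.
SNV_FAIL_RULES = {
    "QD": lambda v: v < 2.0,               # low quality by depth
    "FS": lambda v: v > 60.0,              # strong strand bias
    "MQ": lambda v: v < 40.0,              # low mapping quality
    "MQRankSum": lambda v: v < -12.5,
    "ReadPosRankSum": lambda v: v < -8.0,
}
INDEL_FAIL_RULES = {
    "QD": lambda v: v < 2.0,
    "FS": lambda v: v > 200.0,
    "ReadPosRankSum": lambda v: v < -20.0,
}


def variant_type(ref, alt):
    """Classify a biallelic variant as 'SNV' or 'indel' by allele length."""
    return "SNV" if len(ref) == 1 and len(alt) == 1 else "indel"


def passes_hard_filters(variant):
    """Return True if no available annotation violates its threshold.

    Annotations missing from the record are skipped in this sketch.
    """
    is_snv = variant_type(variant["ref"], variant["alt"]) == "SNV"
    rules = SNV_FAIL_RULES if is_snv else INDEL_FAIL_RULES
    info = variant["info"]
    return not any(
        name in info and fails(info[name]) for name, fails in rules.items()
    )


# Hypothetical records for illustration only.
variants = [
    {"ref": "A", "alt": "G", "info": {"QD": 12.5, "FS": 3.1, "MQ": 60.0}},
    {"ref": "C", "alt": "T", "info": {"QD": 1.2, "FS": 3.1, "MQ": 60.0}},
    {"ref": "TC", "alt": "T", "info": {"QD": 20.0, "FS": 150.0}},
    {"ref": "G", "alt": "A", "info": {"QD": 20.0, "FS": 150.0, "MQ": 60.0}},
]
kept = [v for v in variants if passes_hard_filters(v)]
```

Separate rule sets are kept for SNVs and indels because the recommended strand-bias and read-position cut-offs differ between the two variant classes.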
Of these variants, 18 were novel (15 SNVs and 3 indels), and the rest have already been described in 1000 Genomes populations.
HGVSg - genomic position of mutation in Human Genome Variation Society notation; Consequence - consequence of mutation; Impact - functional impact of mutation (MD - modifier, MO - moderate, L - low, H - high); HGVSp - protein sequence name in Human Genome Variation Society notation; SIFT - prediction of protein function change depending on amino acid substitution using SIFT software (http://sift.jcvi.org/); PolyPhen - prediction of protein function change depending on amino acid substitution using PolyPhen software (genetics.bwh.harvard.edu/pph2/).

Discussion
This study provides a preliminary characterization of variations in the BRCA1 and BRCA2 genes in Armenian patients with a family history of breast cancer. Our data suggest that no known clinically significant variants 22 contribute to disease development in these patients. Meanwhile, two other frequent mutations were identified that cause missense substitutions in coding regions of BRCA1 and BRCA2 and were predicted to have pathogenic consequences. The results of this study are in agreement with a previous report, which also failed to identify known high-risk mutations of the BRCA1 and BRCA2 genes in Armenian patients using a high-resolution melting PCR approach 23,24 .
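A comparison of allele frequencies between a small study group and a reference population can be sketched as follows. This section does not say which statistical test was applied, so the one-sided Fisher's exact test below is an assumption, chosen because it is a common choice for small counts. The function name and the allele counts in the usage lines are hypothetical and are not the study's data.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    Rows: study group, reference population.
    Columns: alternate allele count, reference allele count.
    Returns the probability of seeing at least `a` alternate alleles in
    the study group if both groups share the same allele frequency.
    """
    row1 = a + b          # alleles sampled in the study group
    col1 = a + c          # alternate alleles across both groups
    total = a + b + c + d
    upper = min(row1, col1)
    numerator = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(total, row1)


# Hypothetical counts for illustration only:
# 6 alternate alleles among 24 study chromosomes versus
# 50 alternate alleles among 5008 reference chromosomes.
p_value = fisher_exact_greater(6, 18, 50, 4958)
enriched = p_value < 0.05
```

With a cohort of this size the result is sensitive to single allele counts, so a sketch of this kind is better suited to flagging candidate variants for follow-up than to supporting firm conclusions.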
Mutations in the BRCA1 and BRCA2 genes are known markers for hereditary breast/ovarian cancer 25 . Currently more than 100 clinically important mutations and polymorphisms have been described. Genetic testing for these mutations was among the first to be included in guidelines for cancer prognostics 3,4 . Nowadays, in many countries genetic testing is routinely prescribed to patients in high-risk groups for hereditary breast and ovarian cancer [26][27][28] . However, it has also become apparent that the distribution and appearance of particular risk alleles in the BRCA1 and BRCA2 genes are population dependent, and in many cases population-specific mutations are being identified 8-11 . This is especially relevant to populations that have remained culturally and genetically isolated for a long time 8-11 , as in the case of Armenians. Recent research has demonstrated that the genetic structure of Armenians "stabilized" about 4000 years ago and has remained almost unchanged since that time 20 . Furthermore, our own data indicate that the frequencies of genetic variations associated with various complex human diseases share similarities with both European and Asian populations [29][30][31] . On the other hand, Armenian genomes are highly underrepresented in current human genome sequencing initiatives, and little is known about genetic predisposition to complex diseases in this particular population.
In conclusion, despite the limitation of a small sample size, our results demonstrate the importance of screening BRCA1 and BRCA2 gene variants in the Armenian population in order to identify the specifics of mutation spectra and frequencies and to enable accurate assessment of the risk of hereditary breast cancers.

Supplementary File 1. Sequencing statistics: Coverage and target enrichment statistics.
This file contains details on sequencing coverage and enrichment, which were extracted from the QC report compiled by Admera Health LLC, South Plainfield, NJ, USA.
Author contributions
AA conceived the study, performed data analysis and drafted the manuscript. SA, AC and RZ performed experiments, data analysis and participated in drafting. NB and SA were responsible for patient selection, data analysis and contributed to manuscript writing.

Open Peer Review
I have a few notes to consider:
1. In the Methods section, the authors used 1000 Genomes phase 1 genotype data for variant filtration. Is there any reason why they preferred phase 1 data over phase 3, which they used for assessing allelic frequencies?
2. I noticed that the authors did not verify the NGS-detected variants by other methods, e.g. by Sanger sequencing. It is especially important to confirm the detected novel mutations to exclude the possibility that they are false positives.
3. In Table 3, the authors report a frameshift variant, 13:g.32913172delC, which has a high functional impact. Was it detected in a patient or in a healthy relative? Could it be a novel BRCA2 mutation specific to the Armenian population? It is known that PolyPhen and SIFT may fail to predict the impact of some variants. The authors might consider verifying this mutation by other methods, e.g. Sanger sequencing, and reporting it to the appropriate databases. I would suggest mentioning this variant in the Discussion section.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
No competing interests were disclosed.

Breast cancer is an important health problem in Armenia, and identifying specific genetic factors that may predispose to breast cancer development, especially in the families of patients who have already been diagnosed with this condition, may significantly improve the dire situation with breast cancer prevention in Armenia. Although it covers a small number of patients and family members, the manuscript represents a good step forward and sets an example of how genetic studies in a larger cohort of breast cancer patients and members of their families could identify clinically relevant BRCA1/2 variants that are known outside of the Armenian population, as well as variants that could be specific to the Armenian population only.
The title of the manuscript is appropriate and the abstract summarizes the reported findings well. The study design is appropriate, albeit with a small number of patients. The materials and methods and the data analyses are suitable for the design, and the conclusions are justified. The methodology provides sufficient information and references for replication of the experiments, as well as for building up the database with a larger cohort of patients and their family members.
A few suggestions to improve the discussion of the results: In this study no known clinically relevant variants were identified. Could this be because of the small number of patients, in addition to the notion that the genetic structure of Armenians "stabilized" 4000 years ago? Could there be other predisposing factors as well? This needs a bit more discussion.
What is the value of the novel variants identified in this study? Could these novel variants be specific to the Armenian population? Are there any other "close ethnic groups" that have shown novel variants that are not clinically relevant for the "mainstream population" but became relevant for the specific ethnic group? A brief discussion will suffice.
I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Competing Interests: No competing interests were disclosed.