Keywords
breast cancer, BRCA1, BRCA2, mutation screening, targeted exome sequencing
Breast cancer (BC) is one of the most common cancers in females worldwide1 and particularly in Armenia2. Although the disease is highly prevalent in developed countries, it has also become highly prevalent in developing countries (50% of all cancer cases) and is characterized by a high mortality rate (58% of all breast cancer related deaths)3.
Germline mutations of the BRCA14 and BRCA25 genes are the most significant and best characterized genetic risk factors for hereditary breast cancer, which constitutes about 5–10% of all cases6. Inherited mutations in the BRCA1 and BRCA2 genes account for 30–50% of all known mutations associated with this disease7,8. Women who carry BRCA1 mutations are particularly susceptible to developing breast cancer before the age of 35–40, with a probability of 45%–60%, whereas women who inherit a BRCA2 mutation have a 25%–40% risk of developing breast cancer7,8. The association of BRCA1/BRCA2 gene mutations with breast cancer was first well described in Ashkenazi Jews8–11. Intensive research in recent decades has demonstrated that the incidence of mutations in high-risk families varies widely among different populations6. For example, mutations in BRCA1 and BRCA2 were each estimated to account for 45–50% of families with multiple cases of breast and ovarian cancer in the UK and USA3,12, whereas mutation prevalence among African–Americans with a family history of breast and ovarian cancer was 16.3% for BRCA1 and 11.3–14.4% for BRCA213,14, which is significantly lower than in Caucasian populations. Identification of BRCA1/BRCA2 mutations in different populations and ethnic groups is an important endeavor, which enables geneticists and oncologists to make more specific choices in genetic testing of members of high-risk families15–17.
Here we report a pilot study to identify and characterize mutations in the BRCA1 and BRCA2 genes among Armenian patients with a family history of breast cancer and their healthy relatives.
Six patients with a confirmed family history of breast cancer (at least two cases in a family) and their healthy first-degree relatives were recruited to this study (except for the BC10 patient, see Table 1). Patients were admitted to the National Center of Oncology MH RA and ARTMED Medical Rehabilitation CJSC. Written informed consent was obtained from all study participants. This study was approved by the Institutional Review Board (IRB00004079) of the Institute of Molecular Biology NAS RA.
Blood samples were collected in EDTA-containing tubes and genomic DNA was extracted according to the protocol described elsewhere18. The A260/A280 ratio, measured to evaluate the quality and quantity of the extracted DNA, was in the range of 1.8–2.0.
BRCA1 and BRCA2 exome sequencing was performed by an external service provider (Admera Health LLC, South Plainfield, NJ, USA) using the proprietary breast cancer panel iBRCA™, which detects genetic variations in all exons of BRCA1 and BRCA2. According to the service provider’s description, this panel uses a targeted amplicon sequencing method (166 amplicons) based on the Seq-Ready™ TE Panels protocol (WaferGen Biosystems Inc, Fremont, CA, USA). Reagent cocktails and samples were aliquoted into a 384-well sample source plate. The source plate and the BRCA1/2 SmartChip™, pre-dispensed with Seq-Ready™ TE BRCA1/2 Primers, were placed into the SmartChip™ Multisample Nanodispenser. The SmartChip™ was then amplified with the Bio-Rad T100 SmartChip™ TE Cycler. The PCR product was purified with Agencourt AMPure XP (Beckman Coulter, Inc.) according to the manufacturer’s instructions. Samples were then quantified with a Qubit® 2.0 Fluorometer (Thermo Fisher Scientific, Inc.) and their quality was analyzed with a Tapestation (Agilent Technologies). Sequencing was performed on the Illumina MiSeq platform on a single lane. Raw reads for each sequenced sample were stored in separate fastq files. DNA samples were shipped on ice to avoid degradation and passed an internal quality check before processing.
For each sample, raw sequences were aligned to the human reference genome sequence (hg19, see Public genome data section) using Burrows-Wheeler Aligner (BWA) version 0.7.10 with default parameters. The resulting bam files were used in downstream variant discovery analysis.
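The per-sample alignment step can be sketched as a small Python helper that only assembles the command line. This is a minimal sketch under stated assumptions: the text does not say which BWA algorithm was used, so `bwa mem` is assumed here, as are paired-end input, the read-group string, and all file names and the sample identifier, which are hypothetical.

```python
import shlex


def bwa_mem_command(reference_fasta, fastq_r1, fastq_r2, sample_id):
    """Build (but do not run) a `bwa mem` command for one paired-end sample.

    Assumptions not stated in the article: the `mem` algorithm, paired-end
    reads, and the read-group fields. Alignment parameters are left at
    their defaults, as in the text.
    """
    # bwa expects a literal backslash-t in the -R string and expands it itself.
    read_group = f"@RG\\tID:{sample_id}\\tSM:{sample_id}\\tPL:ILLUMINA"
    return ["bwa", "mem", "-R", read_group, reference_fasta, fastq_r1, fastq_r2]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = bwa_mem_command("hg19.fa", "sample_R1.fastq", "sample_R2.fastq", "sample01")
    print(shlex.join(cmd))
```

`bwa mem` writes SAM to standard output, so in practice the command would be piped into a sorting and compression step to produce the bam files used downstream.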
Variant discovery was performed using the Genome Analysis Toolkit (GATK) version 3.6 according to the recommended workflows for germline single nucleotide variation (SNV) and indel discovery in whole genome and exome sequencing data19. Base quality score recalibration, indel realignment and mate pair fixing were performed on the bam files. Variant calling was performed without duplicate read removal. SNV and indel discovery and genotyping were performed simultaneously across all samples using standard hard filtering parameters19.
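To illustrate what hard filtering does, the sketch below applies threshold rules to the annotations of a single variant. The text says only that "standard" parameters were used; the thresholds here are GATK's generic hard-filter recommendations and are an assumption about what was actually applied. The function name and the example annotation values are hypothetical.

```python
# Assumed thresholds (GATK generic hard-filter recommendations); the article
# itself does not list the values that were used.
SNV_FILTERS = {
    "QD": lambda v: v < 2.0,
    "FS": lambda v: v > 60.0,
    "MQ": lambda v: v < 40.0,
    "MQRankSum": lambda v: v < -12.5,
    "ReadPosRankSum": lambda v: v < -8.0,
}
INDEL_FILTERS = {
    "QD": lambda v: v < 2.0,
    "FS": lambda v: v > 200.0,
    "ReadPosRankSum": lambda v: v < -20.0,
}


def failed_filters(annotations, is_indel=False):
    """Return the names of the hard filters that a variant fails.

    `annotations` maps INFO-field names (e.g. "QD") to numeric values.
    An annotation that is absent is not counted as a failure in this sketch.
    """
    rules = INDEL_FILTERS if is_indel else SNV_FILTERS
    return [
        name
        for name, fails in rules.items()
        if name in annotations and fails(annotations[name])
    ]


if __name__ == "__main__":
    # Hypothetical annotation values for one SNV.
    example = {"QD": 1.5, "FS": 10.0, "MQ": 60.0}
    print(failed_filters(example))  # a variant passes only if this list is empty
```

A variant counts as having passed the thresholds only when no rule fires for its type.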
For the alignment, we used the human reference genome sequence (GRCh37/hg19) from the UCSC (University of California, Santa Cruz) database (http://genome.ucsc.edu). Known SNPs (single nucleotide polymorphisms) were annotated using the UCSC database (single nucleotide polymorphism database, dbSNP version 135). 1000 Genomes phase 1 genotype data were used for filtering human genetic variations (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/). Allelic frequencies of detected variants were compared against 1000 Genomes phase 3 genotypes, as well as against genome-wide association study (GWAS) data from 54 healthy Armenian females who were genotyped in the framework of a population genetics study by Haber et al.20 (ftp://ngs.sanger.ac.uk/scratch/project/team19/Armenian). Data on clinically significant BRCA1 and BRCA2 variants were obtained from the Breast Cancer Information Core (BIC) database maintained by the National Human Genome Research Institute (https://research.nhgri.nih.gov/bic/).
Allele frequency distributions in the study group were compared with those in 1000 Genomes and healthy Armenians using Fisher’s exact test, as available in the base distribution of R 3.3.2. Functional annotation of variants was performed using the Ensembl Variant Effect Predictor tool21.
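The allele frequency comparison can be illustrated with a two-sided Fisher's exact test on a 2x2 table of allele counts. The analysis in this study was run in R; the following is a minimal Python sketch of the same test using only the standard library, and the function name is hypothetical. The example uses the textbook "tea tasting" table rather than any data from this study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    For an allele comparison, rows would be the two groups and columns the
    alternate and reference allele counts.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


if __name__ == "__main__":
    # Classic tea-tasting table, not study data.
    print(fisher_exact_two_sided(3, 1, 1, 3))
```

The two-sided p-value here sums the probabilities of all tables at least as unlikely as the observed one, which is one common definition of the two-sided test.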
In this study we performed exome sequencing of the BRCA1 and BRCA2 genes in patients of Armenian origin with a positive family history of breast cancer and in their healthy relatives. Patients’ clinical data and the family structure of the studied subjects are presented in Table 1. The aligned sequencing data are available in the NCBI Sequence Read Archive (SRA, https://www.ncbi.nlm.nih.gov/sra/) under accession SRP095082. For each sample, a total of 166 different primer pairs were used to amplify all the coding regions of BRCA1 and BRCA2 (as described in the Methods section). The average sequencing depth per base per sample was 6696±606. Detailed NGS statistics are presented in Table 2 and Supplementary file S1.
In total, variant calling resulted in detection of 232 sequence variations (200 SNVs and 32 indels, Supplementary datasets S2 and S3). Thirty-nine SNVs and 4 indels passed the thresholds after applying hard filters (Table 3).
This table provides functional annotation of mutations in BRCA1 and BRCA2 genes that passed filters during variant calling with GATK.
HGVSg: genomic position of the mutation in Human Genome Variation Society notation; Consequence: consequence of the mutation; Impact: functional impact of the mutation (MD, modifier; MO, moderate; L, low; H, high); HGVSp: protein sequence name in Human Genome Variation Society notation; SIFT: prediction of protein function change depending on amino acid substitution using SIFT software (http://sift.jcvi.org/); PolyPhen: prediction of protein function change depending on amino acid substitution using PolyPhen software (genetics.bwh.harvard.edu/pph2/).
Of these variants, 18 were novel (15 SNVs and 3 indels), and the rest have already been described in 1000 Genomes populations (Table 4). Each novel variant was detected in only one or two subjects (8 in healthy relatives and 7 in patients). We identified 12 missense variants (5 in BRCA1 and 7 in BRCA2), 8 synonymous variants (5 in BRCA1 and 3 in BRCA2), 15 intronic variants (8 in BRCA1 and 7 in BRCA2) and 4 variants in untranslated regions of BRCA2. The frequency distributions of known BRCA1/2 variants were similar to those in 1000 Genomes populations and/or the GWAS of healthy Armenians, except for g.32914236 C>T (pFisher=8.35E-24 vs Armenians, pFisher=0.013 vs 1000 Genomes) and g.41245471 C>T (pFisher=0.013 vs Armenians, pFisher=4.7E-05 vs 1000 Genomes). No known clinically significant variants were detected in breast cancer patients or their healthy relatives.
The frequency distributions of identified mutations in the study group were compared with data from 1000 Genomes population, as well as the genome-wide association study from 54 healthy Armenian females20.
This study provides a preliminary characterization of variations in the BRCA1 and BRCA2 genes in Armenian patients with a family history of breast cancer. Our data suggest that no known clinically significant variants22 contribute to disease development in these patients. However, two other frequent mutations were identified that cause missense substitutions in coding regions of BRCA1 and BRCA2 and were predicted to have pathogenic consequences. The results of this study are in agreement with a previous report, which also failed to identify known high-risk mutations of the BRCA1 and BRCA2 genes in Armenian patients using a high-resolution melting PCR approach23,24.
Mutations in the BRCA1 and BRCA2 genes are known markers for hereditary breast/ovarian cancer25. Currently, more than 100 clinically important mutations and polymorphisms have been described. Genetic testing for these mutations was among the first to be included in the guidelines for cancer prognostics3,4. Nowadays, in many countries genetic testing is routinely prescribed to patients in high-risk groups for hereditary breast and ovarian cancer26–28. However, it has also become apparent that the distribution and appearance of particular risk alleles in the BRCA1 and BRCA2 genes is population dependent, and in many cases population-specific mutations are being identified8–11. This is especially relevant to populations that have remained culturally and genetically isolated for a long time8–11, as in the case of Armenians. Recent research has demonstrated that the genetic structure of Armenians “stabilized” about 4000 years ago and has remained almost unchanged since that time20. Furthermore, our own data indicate that the frequencies of genetic variations associated with various complex human diseases share similarities with both European and Asian populations29–31. On the other hand, Armenian genomes are highly underrepresented in current human genome sequencing initiatives, and little is known about genetic predisposition to complex diseases in this particular population.
In conclusion, despite the limitation of a small sample size, our results demonstrate the importance of screening BRCA1 and BRCA2 gene variants in the Armenian population in order to identify the specifics of mutation spectra and frequencies and to enable accurate assessment of the risk of hereditary breast cancers.
The aligned sequencing data is available in the NCBI Sequence Read Archive (SRA) under accession number SRP095082 (https://www.ncbi.nlm.nih.gov/sra/?term=SRP095082). Scripts and vcf files with called and filtered genotypes are available: DOI, 10.5281/zenodo.21561532.
AA conceived the study, performed data analysis and drafted the manuscript. SA, AC and RZ performed experiments, data analysis and participated in drafting. NB and SA were responsible for patient selection, data analysis and contributed to manuscript writing.
This research was supported by a grant from the Armenian National Science & Education Fund (ANSEF) [#molbio-4334] to AC and SA.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Supplementary File 1. Sequencing statistics: Coverage and target enrichment statistics.
This file contains details on sequencing coverage and enrichment, which were extracted from the QC report compiled by Admera Health LLC, South Plainfield, NJ, USA.
Competing Interests: No competing interests were disclosed.
Version 1: 10 Jan 17