Research Article
Negative/null result

A Genome-Wide Association Study of spontaneous preterm birth in a European population

[version 1; peer review: 2 approved with reservations]
PUBLISHED 25 Nov 2013

Abstract

Background: Preterm birth is defined as a birth prior to 37 completed weeks’ gestation. It affects more than 10% of all births worldwide, and is the leading cause of neonatal mortality in non-anomalous newborns. Even if the preterm newborn survives, there is an increased risk of lifelong morbidity. Despite the magnitude of this public health problem, the etiology of spontaneous preterm birth is not well understood. Previous studies suggest that genetics is an important contributing factor. We therefore employed a genome-wide association approach to explore possible fetal genetic variants that may be associated with spontaneous preterm birth.
Methods: We obtained preterm birth phenotype and genotype data from the National Center for Biotechnology Information Genotypes and Phenotypes Database (study accession phs000103.v1.p1). This dataset contains participants enrolled by the Danish National Birth Cohort and includes approximately 1,000 preterm births and 1,000 term births as controls. Whole genomes were genotyped on the Illumina Human660W-Quad_v1_A platform, which contains more than 500,000 markers. After data quality control, we performed genome-wide association studies for the 22 autosomal chromosomes.
Results: No single nucleotide polymorphism reached genome-wide significance after Bonferroni correction for multiple testing.
Conclusion: We found no evidence of genetic association with spontaneous preterm birth in this European population. Approaches that facilitate detection of both common and rare genetic variants, such as evaluation of high-risk pedigrees and genome sequencing, may be more successful in identifying genes associated with spontaneous preterm birth.

Introduction

Definition and significance of preterm birth

Preterm birth (PTB), defined as birth prior to 37 completed weeks’ gestation, is a major public health problem that affects more than 10% of all births worldwide [1,2]. Globally, an estimated 15 million babies are born prematurely each year [1,2]. Despite substantial public health efforts over the past several decades, the U.S. PTB rate remained at 11.72% in 2011 [3,4].

PTB is the leading cause of neonatal mortality in non-anomalous newborns [1,3,5,6]. It is also associated with a broad spectrum of lifelong morbidity in surviving preterm infants, including neuro-developmental delay, cerebral palsy, blindness, deafness, and chronic lung disease [7–9].

Definition of spontaneous preterm birth

Some PTBs are iatrogenic and can be attributed to obstetric intervention aimed at reducing maternal and/or fetal risk. The remaining PTBs are known as spontaneous PTB (SPTB) and are the focus of research efforts to identify genetic and environmental risk factors. Although SPTB is a pressing health issue, the incomplete understanding of its biology has inhibited development of effective prevention and treatment strategies.

Genetics of spontaneous preterm birth

The etiology of SPTB is complex and multifactorial [9–12], and genetic factors are important contributors [9–14]. In addition, PTB prevalence varies among population groups [15–19]. African-American ancestry is consistently associated with an increased PTB risk, even after adjusting for epidemiologic risk factors such as income, education, lack of prenatal care, and other socioeconomic factors [10,20,21].

A candidate gene approach has identified polymorphisms that are mildly to modestly associated with SPTB in genes encoding the progesterone receptor, tumor necrosis factor alpha, interleukins 4, 6, and 10, and mannose-binding lectin [22–33]. However, results have been inconsistent [29,34–36].

Our genome-wide approach for spontaneous preterm birth

Here, we employed an unbiased, genome-wide approach to search for possible candidate genes associated with SPTB. Genome-wide association studies (GWAS), also called whole genome association studies, are a commonly used genetic approach to study a disease or a trait. A GWAS compares thousands or even millions of common genetic markers, mainly single nucleotide polymorphisms (SNPs), between individuals with and without a disease or trait. At the time of writing, 1,688 publications had reported 11,299 SNPs significantly associated with diseases or traits in 17 different categories [37,38]. We obtained the SPTB phenotype and genotype data deposited in the National Center for Biotechnology Information (NCBI) Genotypes and Phenotypes Database (dbGaP) [39,40], study accession phs000103.v1.p1, to perform a GWAS exploring genetic variants associated with SPTB.

Materials and methods

Data application and approval

We applied to and received approval from dbGaP [39,40] for access to a dataset of SPTB phenotypes and genotypes (study accession phs000103.v1.p1). We followed the Data Use Certification Agreement that we signed during the application.

To access the data, one must apply and agree to the dbGaP terms of usage. Detailed instructions and procedures for application are available at https://dbgap.ncbi.nlm.nih.gov/.

Data content

This dataset contains participants enrolled by the Danish National Birth Cohort (DNBC) [41]. DNBC is a prospective cohort that enrolled more than 100,000 pregnant women in the first trimester, before most adverse outcomes occurred, which limits sampling or collection bias. The dataset contains approximately 1,000 preterm births and 1,000 term births as controls. The study subjects were enrolled from 1997 to 2003. All are singleton gestations, and each birth is recorded as a mother-child pair. With the exception of 24 children with one or two grandparents from other Nordic countries, all children in the dataset had parents and all four grandparents born in Denmark.

The case (preterm) group contains births delivered before 37 gestational weeks. The control (term) group contains births delivered at approximately 40 weeks’ gestation. In both preterm and term groups, children born with any recognized congenital or genetic abnormality were excluded. Pregnancies with maternal conditions known to be associated with PTB, iatrogenic or spontaneous (placenta previa, placental abruption, hydramnios, isoimmunization, placental insufficiency, pre-eclampsia/eclampsia), were also excluded by DNBC.

A blood sample (buffy coat) was collected from each mother-child pair. Whole genomes were genotyped on the Illumina Human660W-Quad_v1_A platform (Illumina, Inc., San Diego, California, USA), which contains more than 500,000 markers. Genotyping was performed by the Johns Hopkins University Center for Inherited Disease Research (Baltimore, Maryland, USA). Further data cleaning and harmonization were done at the GENEVA Coordinating Center at the University of Washington (Seattle, Washington, USA).

Quality control

In this study, we analyzed only the fetal genomes. Individuals with a missing genotype rate greater than 3% were filtered out. Individuals whose heterozygosity rate deviated by more than 3 standard deviations from the mean were also excluded [42]. After per-individual quality control, we performed per-SNP filtering. SNPs with a missing genotype rate greater than 3% were excluded [42], as were SNPs whose missing genotype rate differed significantly between the case and control groups, using a conservative cut-off of p < 1×10^-5 [42]. We also excluded SNPs that deviated significantly from Hardy-Weinberg equilibrium in the control group (p < 1×10^-5) [42,43].
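The filters described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the study: the genotype coding (0, 1, 2 copies of the minor allele, `None` for a missing call), the function names, and the use of a 1-df chi-square test for Hardy-Weinberg equilibrium are all assumptions made for the example. Dedicated software may use a different Hardy-Weinberg test, such as an exact test.

```python
import math
from statistics import mean, stdev

def missing_rate(calls):
    """Fraction of genotype calls that are missing (None)."""
    return sum(c is None for c in calls) / len(calls)

def heterozygosity(calls):
    """Fraction of non-missing calls that are heterozygous (coded 1)."""
    observed = [c for c in calls if c is not None]
    return sum(c == 1 for c in observed) / len(observed)

def failing_individuals(genotypes, max_missing=0.03, n_sd=3.0):
    """Return the IDs that fail either per-individual filter.

    genotypes: dict mapping individual ID -> list of calls (0, 1, 2 or None).
    """
    het = {i: heterozygosity(calls) for i, calls in genotypes.items()}
    mu = mean(list(het.values()))
    sd = stdev(list(het.values()))
    failed = set()
    for i, calls in genotypes.items():
        if missing_rate(calls) > max_missing:
            failed.add(i)                      # too many missing genotypes
        if abs(het[i] - mu) > n_sd * sd:
            failed.add(i)                      # heterozygosity outlier
    return failed

def hwe_chi2_p(n_AA, n_Aa, n_aa):
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Illustrative only; assumes both alleles are observed (no zero
    expected counts).
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)            # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return math.erfc(math.sqrt(chi2 / 2))      # chi-square survival, 1 df
```

A SNP would then be dropped when `hwe_chi2_p(...)` computed in controls falls below 1×10^-5, and an individual would be dropped when its ID appears in `failing_individuals(...)`.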

Data analysis

After data quality control, we performed GWAS for the fetal genomes on the 22 autosomal chromosomes using the PLINK software package v1.07 [44] (http://pngu.mgh.harvard.edu/purcell/plink/). The level of genome-wide significance was set at 9.18×10^-8, corresponding to a Bonferroni correction of a 0.05 significance level for 544,675 independent tests (0.05/544,675).

The quantile-quantile (QQ) plot was made using STATA Statistical Software, Release 12 [45]. Manhattan plots were generated using Haploview [46].
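As a quick check of the significance threshold, the Bonferroni arithmetic can be reproduced directly. The 0.05 family-wise level is the value given in the figure legends; the function name is only illustrative.

```python
import math

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under Bonferroni correction."""
    return alpha / n_tests

threshold = bonferroni_threshold(0.05, 544_675)
print(f"{threshold:.2e}")                # 9.18e-08
print(f"{-math.log10(threshold):.2f}")   # 7.04
```

The second value is the height of the significance line on a -log10(P) scale.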

Results

Data overview

We started with a dataset comprising 1,900 children. Thirty-one children had a missing genotype rate greater than 3%. The average heterozygosity was 0.3238, with a standard deviation of 0.0059, and 31 children had a heterozygosity that deviated from the mean by more than 3 standard deviations. The two filtering criteria overlapped considerably (Figure 1): 26 individuals were identified by both. Per-individual quality control therefore excluded a total of 36 individuals, leaving 1,864. Of these, 66 had a missing phenotype, yielding a final total of 849 cases and 949 controls (Table 1).

Figure 1. Per-individual quality control.

The X axis is the missing genotype rate for each individual. The Y axis is the heterozygosity rate for each individual. Each dot represents one individual. The vertical dashed line is the cut-off for the per-individual missing rate (3%); individuals with a missing rate greater than 3% are excluded. The two horizontal dotted lines represent the mean heterozygosity ± 3 standard deviations; individuals whose heterozygosity deviates from the mean by more than 3 standard deviations are also excluded. The two criteria overlap largely in the lower right part of the graph.

Table 1. Number of participants and markers in the dataset.

I. Participants (individuals)
Original dataset                                                              1,900
Per-individual quality control exclusion                                         36
    Missing genotype rate > 3%                                                   31
    Heterozygosity > 3 sd                                                        31
Missing phenotype                                                                66
Remained for analysis                                                         1,798

II. Markers (SNPs)
Original dataset                                                            560,768
Per-marker quality control exclusion                                          2,670
    Missing genotype rate > 3%                                                1,933
    Missing genotype rate significantly different between case and control      367
    Significantly deviate from Hardy-Weinberg equilibrium in control group      885
Not on autosomes                                                             13,423
Remained for analysis                                                       544,675

sd: standard deviations.

Among the 560,768 markers, 1,933 SNPs exceeded the missing rate threshold of 3%, 367 SNPs had a significantly different missing rate between the case and control groups (P value < 1×10^-5), and 885 SNPs deviated significantly from Hardy-Weinberg equilibrium in the control group (P value < 1×10^-5). These three criteria identified some overlapping SNPs; in total, 2,670 SNPs were excluded. These quality control steps left 558,098 SNPs in the dataset, of which 544,675 are located on the 22 autosomes and were included in the analysis (Table 1).
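The counts reported in this section can be checked with simple arithmetic. The sketch below uses only numbers stated in the text; the helper name `union_size` is introduced here for illustration and applies inclusion-exclusion to the two per-individual filters.

```python
def union_size(a, b, both):
    """Size of the union of two sets given their sizes and their overlap."""
    return a + b - both

# Individuals: 31 fail on missingness, 31 on heterozygosity, 26 on both.
excluded_individuals = union_size(31, 31, 26)             # 36
after_individual_qc = 1_900 - excluded_individuals        # 1,864
analysed_individuals = after_individual_qc - 66           # 1,798 (849 + 949)

# Markers: 2,670 excluded by quality control, 13,423 not on autosomes.
after_marker_qc = 560_768 - 2_670                         # 558,098
analysed_markers = after_marker_qc - 13_423               # 544,675

print(excluded_individuals, after_individual_qc, analysed_individuals)
print(after_marker_qc, analysed_markers)
```

The three per-marker criteria sum to 3,185 (1,933 + 367 + 885), which exceeds the 2,670 SNPs actually excluded, consistent with the overlap noted in the text.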

Allelic test

We carried out GWAS for these 1,798 individuals, of whom 849 are SPTB cases and 949 are term controls, over 544,675 SNPs on the 22 autosomal chromosomes. An allelic test was carried out first, and no SNP reached genome-wide significance after Bonferroni correction for multiple testing (Manhattan plot, Figure 2; QQ plot, Figure 3).
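For readers unfamiliar with the allelic test, the following is a minimal sketch of a standard 1-df chi-square test on allele counts for a single SNP. It is an illustration under stated assumptions, not a reproduction of the software output: the function name and the genotype counts in the usage note are hypothetical, and no continuity correction is applied.

```python
import math

def allelic_test(case_genotypes, control_genotypes):
    """Chi-square test (1 df) comparing allele counts in cases and controls.

    Each argument is a tuple of genotype counts (n_AA, n_Aa, n_aa).
    Returns (chi2, p_value).
    """
    def allele_counts(genotype_counts):
        n_AA, n_Aa, n_aa = genotype_counts
        return 2 * n_AA + n_Aa, 2 * n_aa + n_Aa    # (count of A, count of a)

    a, b = allele_counts(case_genotypes)           # cases:    A, a
    c, d = allele_counts(control_genotypes)        # controls: A, a
    n = a + b + c + d
    # Pearson chi-square for a 2x2 table, written in closed form.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))       # chi-square survival, 1 df
    return chi2, p_value

chi2, p = allelic_test((0, 10, 5), (5, 10, 0))     # hypothetical counts
```

With these made-up counts the allele table is 10/20 in cases against 20/10 in controls, which gives a chi-square of about 6.67. In a genome-wide scan, each SNP's P value would be compared with the Bonferroni threshold rather than with 0.05.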

Figure 2. Manhattan plot for allelic test.

The X axis shows the position of each SNP, grouped by chromosome and distinguished by color. The Y axis shows the P value of the test for each SNP on a -log10 scale. The horizontal blue line indicates the threshold of genome-wide significance after Bonferroni correction, at 7.04, which corresponds to -log10(0.05/544,675) for 544,675 independent tests. No SNP reached genome-wide significance.

Figure 3. QQ plot for allelic test.

The X axis shows the expected P value on a -log10 scale. The Y axis shows the observed P value from the allelic test, also on a -log10 scale. The red diagonal line is the line Y = X, where observed equals expected. The genome-wide significance threshold on the -log10(P) scale is at 7.04, which corresponds to -log10(0.05/544,675) for 544,675 independent tests. No SNP reached genome-wide significance.
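The coordinates of a QQ plot such as Figure 3 can be sketched as follows. The plotting position `(rank - 0.5) / n` for the expected P values is one common convention and is an assumption here; the statistical package used for the figure may use a different one.

```python
import math

def qq_points(p_values):
    """Return (expected, observed) pairs on a -log10 scale.

    Under the null hypothesis P values are uniform on (0, 1), so the
    rank-th smallest of n values is expected near (rank - 0.5) / n.
    """
    n = len(p_values)
    ordered = sorted(p_values)                     # smallest P value first
    return [
        (-math.log10((rank - 0.5) / n), -math.log10(p))
        for rank, p in enumerate(ordered, start=1)
    ]

points = qq_points([0.75, 0.25])                   # toy input on the diagonal
```

Points that follow the Y = X line indicate P values consistent with the null distribution, while an upward departure at the right-hand end would suggest associated SNPs.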

Other genetic models

We then performed GWAS under different genetic models, testing three classical Mendelian inheritance models. The recessive model assumes that two copies of the variant allele are required to produce a different phenotype, whereas in the dominant model one copy of the variant allele produces the same phenotype as two copies. The additive model assumes that heterozygotes present a phenotype intermediate between the two homozygotes and thus considers the three genotypes separately. The dominant and recessive models refer to the action of the minor allele. Manhattan plots for the additive model (Figure 4), dominant model (Figure 5), and recessive model (Figure 6) are shown. Under each of these genetic models, no SNP reached genome-wide significance after Bonferroni correction for multiple testing.
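One way to picture the three models is as different numeric codings of the same genotype. The sketch below is illustrative only; the function name and the 0/1/2 minor-allele-count input are assumptions, and association software may parameterize these tests differently.

```python
def encode_genotype(minor_allele_count, model):
    """Code a genotype (0, 1 or 2 copies of the minor allele) under a model."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    if model == "additive":
        return minor_allele_count                  # heterozygote is intermediate
    if model == "dominant":
        return 1 if minor_allele_count >= 1 else 0 # one copy is enough
    if model == "recessive":
        return 1 if minor_allele_count == 2 else 0 # two copies are required
    raise ValueError(f"unknown model: {model}")

codings = {
    model: [encode_genotype(g, model) for g in (0, 1, 2)]
    for model in ("additive", "dominant", "recessive")
}
```

Here `codings` maps the three genotypes to `[0, 1, 2]`, `[0, 1, 1]`, and `[0, 0, 1]` respectively, which shows how the dominant and recessive models collapse the heterozygote into one of the homozygous classes.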

Figure 4. Manhattan plot for additive model.

The X axis shows the position of each SNP, grouped by chromosome and distinguished by color. The Y axis shows the P value of the test for each SNP on a -log10 scale. The horizontal blue line indicates the threshold of genome-wide significance after Bonferroni correction, at 7.04, which corresponds to -log10(0.05/544,675) for 544,675 independent tests. No SNP reached genome-wide significance.

Figure 5. Manhattan plot for dominant model.

This is the GWAS result assuming dominant action of the minor allele. The Y axis shows the P value of the test for each SNP on a -log10 scale. The horizontal blue line indicates the threshold of genome-wide significance after Bonferroni correction, at 7.04, which corresponds to -log10(0.05/544,675) for 544,675 independent tests. No SNP reached genome-wide significance.

Figure 6. Manhattan plot for recessive model.

This is the GWAS result assuming recessive action of the minor allele. The Y axis shows the P value of the test for each SNP on a -log10 scale. The horizontal blue line indicates the threshold of genome-wide significance after Bonferroni correction, at 7.04, which corresponds to -log10(0.05/544,675) for 544,675 independent tests. No SNP reached genome-wide significance.

Discussion

Here we describe a negative result for associations between SPTB and genetic polymorphisms on the 22 autosomal chromosomes in a homogeneous European population. Myking et al. reported a GWAS focusing on the X chromosome [47] that incorporated Danish cases and controls from DNBC [41,47] in addition to participants enrolled in the Norwegian Mother and Child Cohort Study (MoBa) [48]. Even with a larger sample size (DNBC + MoBa) and fewer independent tests, limited to markers on the X chromosome, no SNP reached genome-wide significance after Bonferroni correction [47].

One way to decrease the probability of a negative result is to increase the sample size, either by recruiting more cases and controls directly or by combining different studies in a meta-analysis [49,50]. Instead of sequencing thousands or millions of sporadic cases plus controls, another approach is to study SPTB using a family-based design, i.e., to identify high-risk pedigrees in which a genetic mutation is more likely to be present in multiple individuals. Pedigree studies have the additional advantage of reduced phenotypic heterogeneity [49]. Several loci associated with SPTB have been identified using family-based linkage studies [51,52].

Another approach is to employ whole genome or whole exome sequencing. This will help to identify rare genetic variants with potentially larger effect sizes.

Another approach that may increase statistical power is to analyze the SPTB phenotype as a quantitative trait instead of a dichotomous one. The distribution of gestational age in the population is approximately normal [53]. Therefore, to analyze SPTB as a quantitative trait (i.e., gestational age), samples should be drawn randomly from the population of newborns.

Conclusion

We found no evidence of genetic association with SPTB in a Danish population using an unbiased genome-wide approach. A family-based design in high-risk pedigrees, and whole genome or exome sequencing, may yield higher detection rates of both common and rare variants associated with SPTB.

How to cite this article:
Wu W, Clark EAS, Manuck TA et al. A Genome-Wide Association Study of spontaneous preterm birth in a European population [version 1; peer review: 2 approved with reservations]. F1000Research 2013, 2:255 (https://doi.org/10.12688/f1000research.2-255.v1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.
Open Peer Review

Reviewer Report 24 Nov 2014
David Olson, Department of Obstetrics and Gynecology, University of Alberta, Edmonton, AB, Canada 
Scott Williams, Department of Genetics, Institute of Quantitative Biomedical Sciences, Geisel School of Medicine, Dartmouth College, Hanover, NH, USA 
Approved with Reservations
From Scott Williams:

I have now read the paper and as it stands it is ok. I would however recommend several things that may improve it:
  1. State that the data generally suggests that most effects are maternal so that this may be ...
HOW TO CITE THIS REPORT
Olson D and Williams S. Reviewer Report For: A Genome-Wide Association Study of spontaneous preterm birth in a European population [version 1; peer review: 2 approved with reservations]. F1000Research 2013, 2:255 (https://doi.org/10.5256/f1000research.2631.r6795)
Reviewer Report 06 Feb 2014
Mohamad Saad, Division of Medical Genetics, University of Washington, Seattle, WA, USA 
Approved with Reservations
In their paper, “A Genome-Wide Association Study of spontaneous preterm birth in a European population”, the authors perform a GWAS on spontaneous preterm birth in a European population. The GWAS is based on 849 cases (preterm births) and 949 controls (term ...
HOW TO CITE THIS REPORT
Saad M. Reviewer Report For: A Genome-Wide Association Study of spontaneous preterm birth in a European population [version 1; peer review: 2 approved with reservations]. F1000Research 2013, 2:255 (https://doi.org/10.5256/f1000research.2631.r3489)