Data Note
Revised

A curated transcriptome dataset collection to investigate inborn errors of immunity

[version 2; peer review: 2 approved]
PUBLISHED 30 Aug 2019

This article is included in the Data: Use and Reuse collection.

Abstract

Primary immunodeficiencies (PIDs) are a heterogeneous group of inherited disorders, frequently caused by loss-of-function and less commonly by gain-of-function mutations, which can result in susceptibility to a broad or a very narrow range of infections but also in inflammatory, allergic or malignant diseases. Owing to the wide range in clinical manifestations and variability in penetrance and expressivity, there is an urgent need to better understand the underlying molecular, cellular and immunological phenotypes in PID patients in order to improve clinical diagnosis and management. Here we have compiled a manually curated collection of public transcriptome datasets mainly obtained from human whole blood, peripheral blood mononuclear cells (PBMCs) or fibroblasts of patients with PIDs and of control subjects for subsequent meta-analysis, query and interpretation. A total of eighteen (18) datasets derived from studies of PID patients were identified and retrieved from the NCBI Gene Expression Omnibus (GEO) database and loaded in GXB, a custom web application designed for interactive query and visualization of integrated large-scale data. The dataset collection includes samples from well characterized PID patients that were stimulated ex vivo under a variety of conditions to assess the molecular consequences of the underlying, naturally occurring gene defects on a genome-wide scale. Multiple sample groupings and rank lists were generated to facilitate comparisons of the transcriptional responses between different PID patients and control subjects. The GXB tool enables browsing of a single transcript across studies, thereby providing new perspectives on the role of a given molecule across biological systems and PID patients. This dataset collection is available at http://pid.gxbsidra.org/dm3/geneBrowser/list.

Keywords

Transcriptomics, microarray, primary immunodeficiency disorders, inborn errors of immunity.

Amendments from Version 1

In this new version, we took the reviewers' very helpful comments into consideration and made some minor changes. The link to the tutorials on how to use the tool was updated. In addition, the GSE29536 dataset was removed from the instance and from the data note (Table 1). Finally, Figure 1 was modified.

See the authors' detailed response to the review by Bertrand De Meulder
See the authors' detailed response to the review by John B. Ziegler

Introduction

Primary immunodeficiencies (PIDs) are a heterogeneous group of inherited disorders, most often caused by loss-of-function mutations and less commonly by gain-of-function mutations, affecting components of the innate and/or adaptive immune system [1–3]. These inborn errors of immunity can result in profoundly increased susceptibility to a broad or a very narrow range of infections, but also in autoimmune disorders, allergies and malignancies [1,4–6]. The spectrum of clinical manifestations of PIDs is very broad and largely dependent upon the affected gene(s) and the degree to which normal gene function is lost or altered. In addition, a variety of other factors such as germline or somatic mosaicism, modifier genes and environmental factors can play an important role in the clinical penetrance and expressivity of a given disease phenotype [3,7,8]. To date, mutations in more than 300 genes have been identified to cause PIDs, which are classified into major groups reflecting the diverse immunological phenotypes [5,9]. Nonetheless, PIDs often go unrecognized or are not properly diagnosed [10]. With the recent developments and rapidly declining costs of next-generation sequencing technologies and other high-throughput methods, it is expected that many more as yet unknown disease-related genetic variants will be discovered in the near future [7].

A considerable challenge for identifying causal genetic variants, which is critical for the diagnosis and clinical management of PID patients, lies in the vast heterogeneity of the underlying immunological phenotypes and clinical manifestations on the one hand and in the degree of human genetic variation between individuals on the other hand. Despite considerable advances in recent years, specific gene functions in humans, their roles and regulation in biological processes, and the extent to which a particular gene is essential or redundant for protective immunity of the human host remain poorly understood [6]. In many cases, the use of forward and reverse genetics in mice or other model organisms has provided insufficient insights into the pathophysiology of PIDs, owing to interspecies differences and the fact that inbreeding has led to various deficiencies in laboratory animals, rendering them susceptible to a broad range of infections that often poorly recapitulate the clinical phenotypes in humans [11]. On the other hand, studying naturally occurring genetic defects in humans is much more challenging given the difficulty of obtaining biological samples, the ethical implications and the potential risks involved. An additional challenge is the low frequency of most null alleles. Although PIDs are not necessarily rare when considered collectively, the small number of individuals that suffer from a specific deficiency usually does not permit classic case-control or family-based genetic association studies. Indeed, a considerable proportion of monogenic etiologies of PIDs were initially reported in single patients [12]. The ability to identify single-gene inborn errors in PID patients requires validation of the disease-causing variant by in-depth mechanistic studies demonstrating the structural and functional consequences of the mutations using blood or other accessible biological samples such as fibroblasts from skin biopsies [12].
In this context, several transcriptomics studies have been conducted using whole blood, PBMCs and fibroblasts of well-characterized PID patients to assess the underlying immunological phenotypes at the molecular and cellular levels in more detail (Table 1) and, in many cases, to further validate the causal relationship between the underlying genotypes and clinical phenotypes. Notable are several seminal studies of PID patients with susceptibility to a very narrow range of pathogens, such as patients with MYD88 or IRAK4 deficiency, who are primarily susceptible to pyogenic bacterial infections [13,14]; patients with TBK1, TRIF or TLR3 deficiency [15–17], which underlie herpes simplex encephalitis of childhood; or a recent study of a child with IRF7 deficiency who was primarily susceptible to severe influenza but otherwise immunocompetent with regard to other common infectious diseases [18]. Such studies have highlighted that the underlying gene defect often affects only a narrow repertoire of transcriptional responses, while the affected individual's cells remain highly responsive to specific stimulation through alternate receptors, pathways and signaling networks, and in particular to ex vivo stimulation with whole organisms, reflecting the high degree of human gene redundancy in host defenses [6].

Table 1. List of datasets constituting the collection.

| Title | Platform used | PID classification | Disease | Genetic defects | Cell type/Tissue | Number of samples | Citation # | GEO ID |
|---|---|---|---|---|---|---|---|---|
| Complete TLR3 deficiency. GSE30951. | Illumina HumanHT-12 v4 | Defects in intrinsic and innate immunity | MyD88, TLR3, UNC93B1 deficiencies | Mutations of MyD88, TLR3, and UNC93B1 | Fibroblast/PBMC | 44 | 17 | GSE30951 |
| Genome-wide profiling of whole blood from patients with defects in Toll-like receptors (TLRs) and IL-1Rs (the TIR pathway) signaling. | Illumina HumanHT-12 v4 | Defects in intrinsic and innate immunity | IRAK4, MyD88 deficiencies | Mutations of IRAK4, and MyD88 | Whole blood | 365 | 14 | GSE25742 |
| Identification of IL-21-induced STAT3 dependent genes in human B cells. | Affymetrix HuGene 1.0 ST v1 | Diseases of immune dysregulation | STAT3 GOF mutations | Mutations in STAT3 | B cells | 14 | 20 | GSE51587 |
| Impaired intrinsic immunity to HSV-1 in human iPSC-derived TLR3-deficient CNS cells. | Illumina HumanWG-6 v3 | Defects in intrinsic and innate immunity | UNC93B1 deficiency | Mutations of UNC93B1 | iPS | 18 | 21 | GSE40593 |
| In vitro response of fibroblasts isolated from patients with immunodeficiencies. | Illumina Human-6 v2 | Defects in intrinsic and innate immunity; Combined immunodeficiencies with associated or syndromic features | UNC93B1, MyD88, IRAK4, STAT1, NEMO deficiencies | Mutations of UNC93B1, MyD88, IRAK4, STAT1, and NEMO | Fibroblast | 72 | 13 | GSE12124 |
| In vitro response of fibroblasts isolated from patients with RBCK1 deficiency. | Illumina HumanHT-12 v4 | Defects in intrinsic and innate immunity; Combined immunodeficiencies with associated or syndromic features | MyD88, HOIL1, NEMO deficiencies | Mutations of MyD88, HOIL1/RBCK1, and NEMO | Fibroblast | 46 | 22 | GSE31064 |
| In vitro responses of fibroblasts from patients with TBK1 deficiency after TLR3 dependent and independent stimuli. | Illumina HumanHT-12 v4 | Defects in intrinsic and innate immunity | TLR3, TBK1, STAT1 deficiencies | Mutations of TLR3, TBK1, and STAT1 | Fibroblast | 58 | 15 | GSE38652 |
| In vitro responses of PBMC and fibroblasts from patients with TRIF deficiency after TRIF dependent and independent stimuli. | Illumina HumanHT-12 v4 | Defects in intrinsic and innate immunity | TLR3, TRIF and MyD88 deficiencies | Mutations of TLR3, TRIF, and MyD88 | Fibroblast, PBMC | 27 | 16 | GSE32390 |
| Increased Wnt and Notch signaling: A clue to the renal disease in Schimke immuno-osseous dysplasia?. | Affymetrix HG-U133_Plus_2 | Combined immunodeficiencies with associated or syndromic features | Schimke immuno-osseous dysplasia (SIOD) | Mutations in SMARCAL1 | Kidney biopsy | 5 | 23 | GSE81156 |
| Inherited human IRAK-1 deficiency selectively abolishes TLR signaling in fibroblasts. | Illumina HumanHT-12 v4 | Defects in intrinsic and innate immunity | IRAK4, MyD88 deficiencies | Mutations of IRAK1, IRAK4, MyD88, and MECP2 | Fibroblast | 64 | 24 | GSE92466 |
| Interferon Signature in the Blood in Inflammatory Common Variable Immune Deficiency [Test Set]. | Illumina HumanHT-12 v4 | Predominantly antibody deficiencies | Common Variable Immune Deficiency (CVID) | Unknown | Whole blood | 53 | 25 | GSE51404 |
| Interferon Signature in the Blood in Inflammatory Common Variable Immune Deficiency [Training Set]. | Illumina HumanHT-12 v3 | Predominantly antibody deficiencies | Common Variable Immune Deficiency (CVID) | Unknown | Whole blood | 83 | 25 | GSE51405 |
| IRAK-4- and MyD88-dependent pathways are essential for the removal of developing autoreactive B cells in humans. | Affymetrix HG-U133_Plus_2 | Defects in intrinsic and innate immunity | UNC93B1, MyD88, and IRAK4 deficiencies | Mutations of UNC93B1, MyD88, and IRAK4 | B cells | 5 | 26 | GSE13300 |
| Response of IRF7-deficient peripheral blood mononuclear cells to pH1N1 influenza virus infection. | Illumina HumanHT-12 v4 | Defects in intrinsic and innate immunity | UNC93B1, and IRF7 deficiencies | Mutations of UNC93B1, and IRF7 | PBMC | 18 | 18 | GSE66486 |
| Screening for differentially expressed genes in patients with a novel immunodeficiency syndrome. | Affymetrix HG-U133A | Congenital defects of phagocyte number, functions, or both | p14/LAMTOR2 deficiency | Mutations in ROBLD3/LAMTOR2 | B cells | 4 | 27 | GSE6322 |
| Transcriptional analysis of whole blood in patients with auto-inflammatory disorders. | Illumina HumanHT-12 v3 | Defects in intrinsic and innate immunity; Autoinflammatory disorders | UNC93B1, MyD88, and IRAK4 deficiencies; Defects affecting the inflammasome | Mutations of UNC93B1, MyD88, and IRAK4, NLRP3, MVK | Whole blood | 51 | 22 | GSE40561 |
| Transcriptome analysis in peripheral blood mononuclear cells (PBMC) from HOIL-1-deficient patients upon TNF-a or IL-1b stimulation. | Illumina HumanHT-12 v4 | Combined immunodeficiencies with associated or syndromic features | HOIL1 deficiency | Mutations of HOIL1/RBCK1 | PBMC | 45 | 22 | GSE40838 |
| Transcriptome analysis in primary fibroblasts from HOIL-1-deficient patients upon TNF-a or IL-1b stimulation. | Illumina HumanHT-12 v4 | Defects in intrinsic and innate immunity; Combined immunodeficiencies with associated or syndromic features | MyD88, HOIL1, NEMO deficiencies | Mutations of MyD88, HOIL1/RBCK1, and NEMO | Fibroblast | 54 | 22 | GSE40560 |

Here, we compiled a curated collection of 18 transcriptome datasets, retrieved from the NCBI Gene Expression Omnibus (GEO) database, to provide a resource for investigating inborn errors of immunity. The datasets were loaded into a custom interactive web application, the Gene Expression Browser (GXB; http://pid.gxbsidra.org/dm3/geneBrowser/list), which allows seamless access to the data and interactive visualization of the transcriptional responses, along with demographic and clinical information [19]. The user can customize data plots by adding multiple layers of parameters (e.g. age, gender, sample type and type of genetic defect), select and modify the sample ordering and gene rank lists, and generate links (mini-URLs) that can be shared via e-mail or used in publications. The GXB tool enables browsing of a single transcript across multiple studies and datasets, providing new perspectives on the role of a given molecule across biological systems and PID patients. In summary, this dataset collection can help clinicians and researchers to study and quickly visualize the functional consequences of a variety of well-characterized, naturally occurring mutations on a genome-wide scale.

Methods

A total of 163 datasets were identified in GEO using the following search query: Homo sapiens AND (“primary immunodeficiency diseases” OR “primary immunodeficiencies diseases” OR PID OR “autosomal recessive” OR “autosomal dominant” OR “inherited deficiency”) AND (“Expression profiling by array”). All GEO entries returned by this query were manually curated. This process involved reading all the descriptions available for the datasets, the study designs and the corresponding research articles. Finally, a total of 18 datasets were retained because they contained samples obtained either from well-characterized patients with known PIDs (i.e. the genetic etiology had been identified) or from patients who were considered to have common variable immunodeficiency (CVID). These include datasets that were generated from whole blood, PBMCs, fibroblasts, B cells, induced pluripotent stem cells (iPS) and kidney biopsy of individuals with defects in intrinsic and innate immunity; combined immunodeficiencies with associated or syndromic features; autoinflammatory disorders; congenital defects of phagocyte number, functions, or both; predominantly antibody deficiencies; and diseases of immune dysregulation. In the future, this instance may be updated with additional datasets. The selected datasets are listed in Table 1. A breakdown of the dataset collection by category, in accordance with the most recently published update on PID classification from the International Union of Immunological Societies Expert Committee [5,9], is shown in Figure 1.
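The search described above can also be expressed programmatically. The following is a minimal sketch, not the authors' procedure (the text does not say whether the GEO web interface or an API was used). It assembles the query string quoted above and builds a URL for the NCBI E-utilities ESearch endpoint against the GEO DataSets database (`db=gds`). The function names and the `retmax` value are illustrative assumptions, straight quotes are used in place of the typographic quotes in the text, and the number of hits returned today may differ from the 163 reported here.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Disease-related terms exactly as quoted in the Methods text.
DISEASE_TERMS = [
    '"primary immunodeficiency diseases"',
    '"primary immunodeficiencies diseases"',
    "PID",
    '"autosomal recessive"',
    '"autosomal dominant"',
    '"inherited deficiency"',
]


def build_geo_query() -> str:
    """Assemble the boolean query string described in the Methods."""
    disease_clause = " OR ".join(DISEASE_TERMS)
    return f'Homo sapiens AND ({disease_clause}) AND ("Expression profiling by array")'


def build_esearch_url(term: str, retmax: int = 500) -> str:
    """Build an ESearch URL for the GEO DataSets database (hypothetical helper)."""
    base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    params = {"db": "gds", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{base}?{urlencode(params)}"


query = build_geo_query()
url = build_esearch_url(query)

# Round-trip check: the URL-encoded term decodes back to the original query.
decoded_term = parse_qs(urlsplit(url).query)["term"][0]
print(url)
print(decoded_term == query)
```

The block only constructs the request URL; fetching it and manually curating the returned entries, as described above, is left to the reader.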


Figure 1. Breakdown of the dataset collection by category.

The pie chart shows the PID classification of the 18 datasets.

The selected datasets were downloaded from GEO in the SOFT file format and then uploaded onto our web tool, the Gene Expression Browser (GXB), an interactive application hosted on the Amazon Web Services cloud [19]. Information about the samples and study design was also uploaded. The available samples were assigned to groups based on the individuals and deficiencies studied, and genes were ranked according to different group comparisons, allowing the identification of transcripts that were differentially expressed between cells from patients and from control subjects cultured or stimulated ex vivo under the same conditions. Our dataset collection, uploaded in GXB, is available at http://pid.gxbsidra.org/dm3/geneBrowser/list. A web tutorial for the use of GXB can be accessed at: http://pid.gxbsidra.org/dm3/tutorials.gsp.
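For readers unfamiliar with SOFT files, the sketch below shows one way the sample sections of such a file might be read. It is an illustration only and does not reproduce the GXB loading code, which is not described here. It relies on the general SOFT conventions that `^` lines start a new entity, `!` lines carry attributes, and per-sample data tables sit between `!sample_table_begin` and `!sample_table_end`. The sample names, probe names and values in `SOFT_TEXT` are invented for the example.

```python
from collections import defaultdict

# Toy SOFT-style text; all identifiers and values below are made up.
SOFT_TEXT = """\
^SAMPLE = GSM_DEMO_1
!Sample_title = control, unstimulated
!Sample_characteristics_ch1 = cell type: fibroblast
!sample_table_begin
ID_REF\tVALUE
probe_A\t120.5
probe_B\t8.0
!sample_table_end
^SAMPLE = GSM_DEMO_2
!Sample_title = patient, unstimulated
!Sample_characteristics_ch1 = cell type: fibroblast
!sample_table_begin
ID_REF\tVALUE
probe_A\t30.0
probe_B\t9.5
!sample_table_end
"""


def parse_soft_samples(text):
    """Return {sample_id: {"meta": {key: [values]}, "values": {probe: float}}}."""
    samples = {}
    current = None   # record of the sample currently being read, if any
    columns = None   # header row of the current data table
    in_table = False
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("^"):
            # A new entity begins; only SAMPLE entities are kept in this sketch.
            entity, _, name = line[1:].partition("=")
            in_table = False
            if entity.strip().upper() == "SAMPLE":
                current = {"meta": defaultdict(list), "values": {}}
                samples[name.strip()] = current
            else:
                current = None
        elif current is None:
            continue
        elif line.lower().startswith("!sample_table_begin"):
            in_table, columns = True, None
        elif line.lower().startswith("!sample_table_end"):
            in_table = False
        elif in_table:
            fields = line.split("\t")
            if columns is None:
                columns = fields  # first table row is the column header
                continue
            row = dict(zip(columns, fields))
            try:
                current["values"][row["ID_REF"]] = float(row["VALUE"])
            except (KeyError, ValueError):
                pass  # skip rows with missing or non-numeric values
        elif line.startswith("!"):
            key, _, value = line[1:].partition("=")
            current["meta"][key.strip()].append(value.strip())
    return samples


samples = parse_soft_samples(SOFT_TEXT)
for sample_id, record in samples.items():
    print(sample_id, record["meta"]["Sample_title"], record["values"])
```

Attribute values are stored as lists because keys such as `Sample_characteristics_ch1` can repeat within one sample.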

A detailed description of GXB has been published previously [19,28–30] and is reproduced here so that readers can use this article as a standalone resource. Briefly, datasets of interest can be quickly identified either by filtering on criteria from pre-defined sections on the left or by entering a query term in the search box at the top of the dataset navigation page. Clicking on one of the studies listed in the dataset navigation page opens a viewer designed to provide interactive browsing and graphic representations of large-scale data in an interpretable format. This interface presents ranked gene lists and displays expression results graphically in a context-rich environment. Selecting a gene from the rank-ordered list on the left of the data-viewing interface displays its expression values graphically in the screen’s central panel.

Directly above the graphical display, drop-down menus give users the ability: a) to change how the gene list is ranked - this allows the user to change the method used to rank the genes, or to only include genes that are selected for specific biological interest; b) to change sample grouping (Group Set button) - in some datasets, a user can switch from groups based on cell type to groups based on disease type, for example; c) to sort individual samples within a group based on associated categorical or continuous variables (e.g. gender or age); d) to toggle between the bar chart view and a box plot view, with expression values represented as a single point for each sample (samples are split into the same groups whether displayed as a bar chart or box plot); e) to provide a color legend for the sample groups; f) to select categorical information that is to be overlaid at the bottom of the graph - for example, the user can display gender or smoking status in this manner; g) to provide a color legend for the categorical information overlaid at the bottom of the graph; h) to download the graph as a portable network graphics (png) image.

Measurements have no intrinsic utility in the absence of contextual information. It is this contextual information that makes the results of a study or experiment interpretable. It is therefore important to capture, integrate and display information that will give users the ability to interpret data and gain new insights from it. We have organized this information under different tabs directly above the graphical display. The tabs can be hidden to make more room for displaying the data plots, or revealed by clicking on the blue “show info panel” button in the top right corner of the display. Information about the gene selected from the list on the left side of the display is available under the “Gene” tab. Information about the study is available under the “Study” tab. Rolling the mouse cursor over a bar chart feature while displaying the “Sample” tab lists any clinical, demographic, or laboratory information available for the selected sample. Finally, the “Downloads” tab allows advanced users to retrieve the original dataset for analysis outside this tool. It also provides all available sample annotation data for use alongside the expression data in third-party analysis software.

Other functionalities are provided under the “Tools” drop-down menu located in the top right corner of the user interface. Some of the notable functionalities available through this menu include: a) Annotations, which provides access to all the ancillary information about the study, samples and dataset, organized across different tabs; b) Cross-project view, which lets the user browse a given gene across all available studies; c) Copy link, which generates a mini-URL that encapsulates information about the display settings in use and can be saved and shared with others (clicking on the envelope icon on the toolbar inserts the URL in an email message via the local email client); d) Chart options, which gives users the option to customize chart labels.
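To make the notion of a gene rank list based on a group comparison concrete, the sketch below ranks genes by the absolute log2 ratio of mean expression between a patient group and a control group. This is one simple, commonly used criterion chosen for illustration. The text does not state which ranking methods GXB applies, so this should not be read as the tool's algorithm. The gene names, sample names, expression values and the `pseudocount` parameter are all hypothetical.

```python
import math
from statistics import mean


def rank_genes(expression, groups, case, control, pseudocount=1.0):
    """Rank genes by |log2(mean case / mean control)|, largest difference first.

    expression: {gene: {sample: value}}
    groups:     {group_name: [sample, ...]}
    """
    ranked = []
    for gene, by_sample in expression.items():
        case_mean = mean(by_sample[s] for s in groups[case])
        ctrl_mean = mean(by_sample[s] for s in groups[control])
        # The pseudocount avoids division by zero for unexpressed genes.
        log2_ratio = math.log2((case_mean + pseudocount) / (ctrl_mean + pseudocount))
        ranked.append((gene, log2_ratio))
    ranked.sort(key=lambda item: abs(item[1]), reverse=True)
    return ranked


# Invented toy data: two control and two patient samples under the same condition.
groups = {"control": ["c1", "c2"], "patient": ["p1", "p2"]}
expression = {
    "GENE_A": {"c1": 100, "c2": 120, "p1": 25, "p2": 30},
    "GENE_B": {"c1": 50, "c2": 50, "p1": 52, "p2": 48},
    "GENE_C": {"c1": 10, "c2": 14, "p1": 30, "p2": 34},
}

ranked = rank_genes(expression, groups, case="patient", control="control")
for gene, log2_ratio in ranked:
    print(f"{gene}\t{log2_ratio:+.2f}")
```

Changing the `case` and `control` arguments corresponds loosely to switching the group set in the viewer: the same expression matrix yields a different rank list for each comparison.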

Data availability

All datasets included in our curated collection are publicly available via the NCBI GEO website (https://www.ncbi.nlm.nih.gov/gds/) and are referenced throughout the manuscript by their GEO accession numbers (e.g. GSE92466). Signal files and sample description files can also be downloaded from the GXB tool under the “Downloads” tab.
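As a convenience, the accession numbers listed in Table 1 can be turned into GEO record links. The sketch below assumes the standard GEO accession viewer URL pattern (`acc.cgi?acc=...`). The helper name and the validation rule are illustrative and are not part of GXB or of the original data note.

```python
import re
from urllib.parse import urlencode

# GEO series accessions as listed in Table 1.
GEO_IDS = [
    "GSE30951", "GSE25742", "GSE51587", "GSE40593", "GSE12124", "GSE31064",
    "GSE38652", "GSE32390", "GSE81156", "GSE92466", "GSE51404", "GSE51405",
    "GSE13300", "GSE66486", "GSE6322", "GSE40561", "GSE40838", "GSE40560",
]

_GSE_PATTERN = re.compile(r"^GSE\d+$")


def geo_record_url(accession: str) -> str:
    """Return the GEO accession viewer URL for a series (hypothetical helper)."""
    if not _GSE_PATTERN.match(accession):
        raise ValueError(f"not a GEO series accession: {accession!r}")
    base = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"
    return f"{base}?{urlencode({'acc': accession})}"


for geo_id in GEO_IDS:
    print(geo_id, geo_record_url(geo_id))
```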

how to cite this article
Bougarn S, Boughorbel S, Chaussabel D and Marr N. A curated transcriptome dataset collection to investigate inborn errors of immunity [version 2; peer review: 2 approved]. F1000Research 2019, 8:188 (https://doi.org/10.12688/f1000research.18048.2)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

VERSION 2
PUBLISHED 30 Aug 2019
Reviewer Report 17 Sep 2019
John B. Ziegler, School of Women's & Children's Health, University of New South Wales, Sydney, NSW, Australia;  Department of Immunology & infectious Diseases, Sydney Children's Hospital, Sydney, Australia 
Approved
All is now in order with ... Continue reading
HOW TO CITE THIS REPORT
Ziegler JB. Reviewer Report For: A curated transcriptome dataset collection to investigate inborn errors of immunity [version 2; peer review: 2 approved]. F1000Research 2019, 8:188 (https://doi.org/10.5256/f1000research.22465.r53217)
Reviewer Report 09 Sep 2019
Bertrand De Meulder, European Institute for Systems Biology & Medicine, Université de Lyon, Lyon, France;  Association EISBM, Vourles, France 
Approved
I feel my comments have been taken into ... Continue reading
HOW TO CITE THIS REPORT
De Meulder B. Reviewer Report For: A curated transcriptome dataset collection to investigate inborn errors of immunity [version 2; peer review: 2 approved]. F1000Research 2019, 8:188 (https://doi.org/10.5256/f1000research.22465.r53218)
VERSION 1
PUBLISHED 15 Feb 2019
Reviewer Report 28 Mar 2019
John B. Ziegler, School of Women's & Children's Health, University of New South Wales, Sydney, NSW, Australia;  Department of Immunology & infectious Diseases, Sydney Children's Hospital, Sydney, Australia 
Approved with Reservations
This is a useful tool to examine publicly available gene transcript data.
 
The paper would be more useful if it included more detailed instructions for its use, perhaps including screenshots to illustrate the text.
 
... Continue reading
HOW TO CITE THIS REPORT
Ziegler JB. Reviewer Report For: A curated transcriptome dataset collection to investigate inborn errors of immunity [version 2; peer review: 2 approved]. F1000Research 2019, 8:188 (https://doi.org/10.5256/f1000research.19737.r45485)
  • Author Response 30 Aug 2019
    Salim Bougarn, Systems Biology and Immunology, Sidra Medicine, Doha, Qatar
    30 Aug 2019
    Author Response
    Dear Dr. Ziegler,
    We are  thankful to Dr. Ziegler  for his positive feed back regarding our manuscript and for the careful revision. We respond to the specific comments and describe the ... Continue reading
Reviewer Report 21 Mar 2019
Bertrand De Meulder, European Institute for Systems Biology & Medicine, Université de Lyon, Lyon, France;  Association EISBM, Vourles, France 
Approved with Reservations
This article presents a tool to gather and make preliminary analyses on a set of curated sequencing datasets related to primary immunodeficiencies. 

Regarding replication, I could not reproduce the list of datasets with the criteria that are ... Continue reading
HOW TO CITE THIS REPORT
De Meulder B. Reviewer Report For: A curated transcriptome dataset collection to investigate inborn errors of immunity [version 2; peer review: 2 approved]. F1000Research 2019, 8:188 (https://doi.org/10.5256/f1000research.19737.r45968)
  • Author Response 30 Aug 2019
    Salim Bougarn, Systems Biology and Immunology, Sidra Medicine, Doha, Qatar
    30 Aug 2019
    Author Response
    Dear Bertrand, 

    We are  thankful to Dr. De Meulder for his positive feed back regarding our manuscript and for the careful revision. We respond to the specific comments and ... Continue reading