ALL Metrics
-
Views
-
Downloads
Get PDF
Get XML
Cite
Export
Track
Software Tool Article

Iron Hack - A symposium/hackathon focused on porphyrias, Friedreich’s ataxia, and other rare iron-related diseases

[version 1; peer review: 2 approved]
* Equal contributors
PUBLISHED 19 Jul 2019
Author details Author details
OPEN PEER REVIEW
REVIEWER STATUS

This article is included in the Python collection.

This article is included in the Rare diseases collection.

This article is included in the Hackathons collection.

Abstract

Background: Basic and clinical scientific research at the University of South Florida (USF) have intersected to support a multi-faceted approach around a common focus on rare iron-related diseases. We proposed a modified version of the National Center for Biotechnology Information’s (NCBI) Hackathon-model to take full advantage of local expertise in building “Iron Hack”, a rare disease-focused hackathon. As the collaborative, problem-solving nature of hackathons tends to attract participants of highly-diverse backgrounds, organizers facilitated a symposium on rare iron-related diseases, specifically porphyrias and Friedreich’s ataxia, pitched at general audiences.
Methods: The hackathon was structured to begin each day with presentations by expert clinicians, genetic counselors, researchers focused on molecular and cellular biology, public health/global health, genetics/genomics, computational biology, bioinformatics, biomolecular science, bioengineering, and computer science, as well as guest speakers from the American Porphyria Foundation (APF) and Friedreich’s Ataxia Research Alliance (FARA) to inform participants as to the human impact of these diseases.
Results: As a result of this hackathon, we developed resources that are relevant not only to these specific disease-models, but also to other rare diseases and general bioinformatics problems. Within two and a half days, “Iron Hack” participants successfully built collaborative projects to visualize data, build databases, improve rare disease diagnosis, and study rare-disease inheritance.
Conclusions: The purpose of this manuscript is to demonstrate the utility of a hackathon model to generate prototypes of generalizable tools for a given disease and train clinicians and data scientists to interact more effectively.

Keywords

Hackathon, Data Science, Ataxia, Porphyria, Rare Diseases, Friedreich’s Ataxia, Clinical Informatics, Bioinformatics

Introduction

Iron Hack: Genesis of a new hackathon model

Hackathons are an effective avenue for the generation of software prototypes in the biomedical informatics space, several of which have been sponsored by the National Institutes of Health (NIH NCBI). A long-standing interest and active research programs on rare diseases, including Friedreich’s ataxia and porphyrias at the University of South Florida (USF), prompted us to modify the traditional NCBI Hackathon model and initiate a specific disease-focused hackathon13. Our event “Iron Hack” was named after the rare diseases upon which we focused.

Because of the diverse scientific and bioinformatic backgrounds of hackathon participants, the organizers felt it necessary to have a symposium on rare iron-related diseases, and specifically ataxia and porphyrias, in the early part of each of the three days. During the symposium, renowned scientists, clinicians and investigators in rare iron-related disease research covered the major aspects of Friedreich’s ataxia, porphyrias and sideroblastic anemia and emphasized pressing questions that need to be addressed for advancement of the field. As a result of the hackathon, we developed resources that are relevant not only to rare iron-related diseases, but also to other rare diseases and some general bioinformatics problems. The objective of this report is to demonstrate the utility of a hackathon model to develop generalizable tools for evaluation, diagnosis, and management of a given disease.

Rare iron-related diseases: Hurdles to overcome

Rare diseases have a large impact on the population, with 7,000 orphan diseases collectively affecting about 1 in 10 to 20 people4. These diseases place a heavy burden on patients and families, with a diagnosis taking up to ten years to identify, if ever. Limited patient-numbers and resources for each disease severely hamper research, prognoses, diagnoses, and treatments. Although quite disparate in symptoms, these conditions all stem from problems with iron metabolism or heme synthesis. A brief description of these rare diseases is provided below to provide context for the “Iron Hack” team projects.

Porphyrias

Biochemical basis and clinical manifestation. Porphyrias are a group of rare metabolic disorders caused by malfunction of the enzymes involved in heme biosynthesis2,3,5. There are eight different types of porphyria, each of which arises from mutation(s) in the genes for each of the eight enzymes of the heme biosynthetic pathway2,3,6,7. With the exception of porphyria cutanea tarda (PCT), all other seven porphyrias are inherited as autosomal-dominant, autosomal-recessive, or X-linked traits2,6,8,9.

All porphyrias are characterized by the accumulation and excretion of porphyrins or porphyrin precursors, though each type has disparate clinical manifestations (including neurovisceral and/or cutaneous symptoms) depending upon which enzyme of the pathway is defective10. In general, neurological disturbances are manifested in the form of acute attacks (e.g., extreme abdominal and chest pain, vomiting, confusion, constipation, fever, high blood pressure, low blood sodium levels, and seizures), while photosensitivity is at the root of cutaneous manifestations (e.g., skin blistering, redness, scarring, and pain when exposed to the sun)3,9,11,12. Suggested treatments and disease-outcomes also vary with the particular pathway-defect10,1319.

Diagnosis. Acute porphyrias can prevail undiagnosed for 10–15 years following the onset of symptoms19. Perhaps not surprisingly, diagnosis of porphyrias remains challenging: they are rare, and their symptoms are nonspecific, often mimicking other, more common disorders11. Thus once porphyria clinical symptoms are recognized, biochemical laboratory testing should be performed to identify the specific type of porphyria10. Genetic testing, normally targeted gene sequencing, becomes critical to define the mutation(s) in a family and make genetic counseling possible. However, diagnosis of porphyrias are often overlooked due to, a large extent, the difficulty in performing the specific biochemical assays and absence of specialized porphyria centers, particularly in developing countries2022.

Prevalence. In the United States, porphyrias collectively afflict fewer than 200,000 people, with similar prevalence in the European Union4,10,23. Estimates of porphyria prevalence vary by type, with values of 1 in 10,000 for the most common type of porphyria, PCT (OMIM 176090) and 1 in 1,000,000 for congenital erythropoietic porphyria (CEP; OMIM 263700)10. Many people with a genetic mutation associated with the disease never experience signs or symptoms, a phenomenon known as incomplete or reduced penetrance24.

Friedreich’s ataxia

Biochemical basis and clinical manifestation. Friedreich’s ataxia (FRDA, FA; OMIM 229300) is a rare autosomal-recessive disease associated with progressive spinocerebellar ataxia, cardiomyopathy, scoliosis, diabetes, and vision and hearing impairment1,25. Most FRDA patients are largely asymptomatic during the first 5 – 10 years of life. But, with advancing gait and limb ataxia, they require use of a wheelchair and are unable to perform daily activities independently, often during adolescence25,26.

Symptoms result from reduced synthesis of the mitochondrial protein frataxin, an iron chaperone that, by shielding this metal, prevents the production of reactive oxygen species (ROS) and renders it bioavailable as ferrous iron2629. When frataxin levels are low, iron accumulates in the mitochondria, largely in an oxidized and insoluble form30,31. The accumulated iron can participate in Fenton chemistry leading to formation of extensive reactive oxygen species (ROS) that cause damage and cell death. While there is a general agreement that frataxin is critical for mitochondrial iron metabolism and cellular iron homeostasis, its precise biological role remains a controversial matter26,32,33. Involvement of frataxin in 1) iron delivery to the iron–sulfur cluster assembly and repair machinery, 2) repair of oxidatively inactivated [3Fe–4S] aconitase to yield an active enzyme, 3) delivery of ferrous iron to ferrochelatase for heme biosynthesis, 4) detoxification of iron by catalyzing the oxidation of Fe(II) to Fe(III) and storing the metal as a ferrihydrite mineral within structurally organized frataxin oligomers are among the reported functions ascribed to frataxin26,31,3441. Despite the lack of consensus, frataxin and heme biosynthesis are linked. Frataxin may participate in the assembly of the [2Fe-2S] cluster, an essential cofactor for an active human ferrochelatase, the terminal enzyme of the heme biosynthetic pathway4244. Alternatively, by maintaining and chaperoning iron in a reduced form, frataxin may donate Fe(II) to ferrochelatase, which has strict physiological specificity for Fe(II) as substrate37,39,4446. Clearly, a combination of these two functional possibilities cannot be ruled out.

Prognoses and treatment. Presently, there is neither a cure nor a U.S. Food and Drug Administration (FDA)-approved treatment for FRDA47. Advances in understanding the underlying mechanism of FRDA, in particular the recognition that frataxin deficiency is the root cause of FRDA, have prompted the development of therapeutic strategies. Since increased oxidative stress and mitochondrial respiratory chain dysfunction have been associated with the pathogenesis of FRDA, antioxidants and inhibitors of free radical formation (e.g., idebenone, L-acetylcarnitine, resveratrol, and RT001 and other deuterated polyunsaturated fatty acids) have been assessed as a promising treatment option4750. Iron chelators, such as deferiprone, have been considered as a therapeutic approach of FRDA by controlling iron accumulation and decreased frataxin synthesis29,51. Regulation of frataxin gene expression by increasing either histone acetylation or transcription of the frataxin gene, represents yet another treatment possibility being explored52,53. Histone deacetylase inhibitors reverse or, at least, diminish silencing and the reduced transcription of the frataxin gene observed in FRDA patients54. While protein interferon- increases transcription of the frataxin gene and consequent production of the frataxin, its therapeutic benefit remains to be established52,55. Frataxin gene replacement is also being developed for as a potential treatment for FRDA. Because of the scope of this report, an evaluation of the therapies for FRDA can be neither extensive nor complete47.

Prevalence. FRDA affects about 1 in 150,000 individuals of Caucasian descent and accounts for 50% of overall cases of hereditary ataxia and for 75% of those with onset before age 251,25,56,57.

Genetics of porphyrias and FRDA: a tractable problem?

The robust evidence suggesting these devastating, rare diseases named porphyrias can largely be pinpointed to dysfunction in a single pathway of eight enzymes, caused by mutation(s) inherited in well-understood, classical Mendelian patterns made them an attractive case for potentially-impactful tool-development. However, even in these seemingly straightforward cases of diseases exhibiting classical Mendelian inheritance, disease phenotypes are not entirely explained by the presence of known pathogenic variants. The discrepancy between the low penetrance of symptomatic patients for autosomal-dominant acute intermittent porphyria (AIP; OMIM 176000) and the high frequency of pathogenic mutations led Chen et al. to propose that predisposing- or protective-modifier genes alter expression of the AIP phenotype58. Indeed, a small number of modifier-genes, regulatory and pathophysiological mechanisms have since been identified to contribute to onset of porphyrias, though these findings remain insufficient to explain the disease-penetrance puzzle8,20.

In rare diseases resulting from trinucleotide copy-number repeat-variation, such as FRDA, there is some degree of relationship between severity of disease phenotype and copy-number57,59. In FRDA, expanded trinucleotide (GAA) tracts in intron 1 of the FXN gene, commonly between 600 and 900 repeats, result in pathologically decreased levels of frataxin6062. However, number of trinucleotide repeats are not reliably predictive of disease severity, further suggesting the importance of as-yet unknown modifying genes or environmental factors that may contribute to disease outcomes47.

The number of disease-phenotypes entirely decided by single-gene variants are in the minority63. Most inherited diseases are likely to have a more complicated etiology determined by some combination of genomic variants, impacted by myriad environmental factors as well.

Critical gaps Iron-Hack projects sought to address

We organized Iron Hack to address these challenges, including the great need for genomics tools to handle rare-disease data, such that new data-mining concepts and computational tools could be developed and further adapted to serve the rare-disease communities. We established five Iron Hack teams to develop five computational-tool prototypes broadly focused on (1) exploration of consumer-genomics data, (2) large-scale RNAseq data mining, (3) genomic data visualization, (4) rare-disease variants discovery, and (5) genotype-to-phenotype mapping. These team-efforts have led to the convergence of iron-research communities and genomics data-science researchers to produce promising computational tools, strengthened through an iterative process of soliciting ideas and feedback from domain experts.

The remainder of this report is organized into subsections by project, beginning with a detailed description for the five projects, the motivations behind them and the gaps they seek to fill. We next describe the methodologies and implementations of the projects into usable software applications, how to operate the software applications, and results produced using the software applications. Finally, we discuss the pros and cons of this new highly-interdisciplinary and community-driven twist on more traditional hackathons.

Project descriptions and goals

Project 1: UPWARD

Uniting People Working Against Rare Diseases (UPWARD) will be a Health Insurance Portability and Accountability Act (HIPAA)-compliant database which will allow people with rare diseases to declare interest in participating in research studies, and subsequently share their personal disease stories, clinical symptoms, and consumer genetic testing data with researchers and clinicians. Figure 1 shows that, as consumer-genetic testing data are submitted, they are analyzed alongside 43 porphyria-related pathogenic SNPs currently held in UPWARD. A set of statistical computation and machine learning methods can be used downstream to parse out the novel modifiers of diseases as well as the interactions of genetic loci underlying pathologies. This information with be compiled and analyzed within UPWARD using, in part, a program which identifies all rare disease-related pathogenic or likely-pathogenic Single Nucleotide Polymorphisms (SNPs) that are currently included on SNP microarray chips used by common consumer genetic testing companies. Table 1 shows that variants sourced from ClinVar, a crowdsourced genotype-phenotype database hosted by NCBI, against consumer-genetics data sourced from Illumina OmniExpress and GSA microarray chips used by Ancestry and 23andMe. These resulting 43 variants will be used in analysis of patient-submitted consumer-genomics results. The goal of this platform is to facilitate data-driven discovery of rare-disease determinants, such as modifiers that affect penetrance, by leveraging the growing data of consumer genomics. To facilitate use of this database, UPWARD has focused its tools to benefit people living with porphyria, and porphyria research as a whole.

385c446f-4e4c-4e59-a97e-5c55e4e3b3ee_figure1.gif

Figure 1. UPWARD - Uniting People Working Against Rare Disease.

UPWARD opens with a web interface designed to clearly communicate research and advocacy goals to the public, request consent and gather data in a HIPPA-compliant manner.

Table 1. List of porphyria-related pathogenic SNPs.

UPWARD includes a tool built to map highly-pathogenic and likely-pathogenic porphyria-associated variants.

NameGeneRSIDChip
NM_000374.4(UROD):c.603A>G (p.Pro201=)URODrs2228084GSA
NM_000374.4(UROD):c.842G>A (p.Gly281Glu)URODrs121918057GSA
NM_000374.4(UROD):c.842G>T (p.Gly281Val)URODrs121918057GSA
NM_000374.4(UROD):c.874C>G (p.Arg292Gly)URODrs121918059GSA
NM_000374.4(UROD):c.912C>A (p.Asn304Lys)URODrs121918065GSA
NM_000374.4(UROD):c.932A>G (p.Tyr311Cys)URODrs121918061GSA
NM_000374.4(UROD):c.995G>A (p.Arg332His)URODrs121918066GSA
NM_000309.4(PPOX):c.-90G>TPPOXrs115158839GSA
NM_001122764.1(PPOX):c.199delC (p.Leu67Terfs)PPOXrs786204784GSA
NM_001122764.3(PPOX):c.502C>T (p.Arg168Cys)PPOXrs121918325GSA
NM_000097.5(CPOX):c.814A>C (p.Asn272His)CPOXrs1131857GSA
NM_000410.3(HFE):c.187C>G (p.His63Asp)HFE|LOC108783645rs1799945GSA
NM_000410.3(HFE):c.193A>T (p.Ser65Cys)HFE|LOC108783645rs1800730GSA
NM_000410.3(HFE):c.845G>A (p.Cys282Tyr)HFErs1800562GSA
NM_000031.5(ALAD):c.823G>A (p.Val275Met)ALADrs121912981GSA
NM_000031.5(ALAD):c.718C>T (p.Arg240Trp)ALADrs121912982GSA
NM_000031.5(ALAD):c.397G>A (p.Gly133Arg)ALADrs121912980GSA
NM_000031.5(ALAD):c.36C>G (p.Phe12Leu)ALADrs121912984GSA
NM_000375.2(UROS):c.683C>T (p.Thr228Met)UROSrs121908014GSA
NM_000375.2(UROS):c.673G>A (p.Gly225Ser)UROSrs121908020GSA
NM_000375.2(UROS):c.244G>T (p.Val82Phe)UROSrs121908016GSA
NM_000375.2(UROS):c.217T>C (p.Cys73Arg)UROSrs121908012GSA
NM_000375.2(UROS):c.184A>G (p.Thr62Ala)UROSrs28941775GSA
NM_000375.2(UROS):c.10C>T (p.Leu4Phe)UROSrs121908015GSA
NM_000190.4(HMBS):c.445C>T (p.Arg149Ter)HMBSrs118204120GSA
NM_000190.4(HMBS):c.499C>T (p.Arg167Trp)HMBSrs118204101GSA
NM_000190.4(HMBS):c.500G>T (p.Arg167Leu)HMBSrs118204095GSA
NM_000190.4(HMBS):c.500G>A (p.Arg167Gln)HMBSrs118204095GSA
NM_000190.4(HMBS):c.601C>T (p.Arg201Trp)HMBSrs118204109GSA
NM_000190.4(HMBS):c.606G>T (p.Val202=)DPAGT1|HMBSrs1131488GSA
NM_000190.4(HMBS):c.1075G>A (p.Asp359Asn)HMBSrs144949995GSA
NM_001382.3(DPAGT1):c.1177A>G (p.Ile393Val)DPAGT1|HMBSrs643788GSA
NM_001382.3(DPAGT1):c.994T>G (p.Phe332Val)DPAGT1|HMBSrs138544311GSA
NM_000374.4(UROD):c.603A>G (p.Pro201=)URODrs2228084OmniExpress
NM_000309.4(PPOX):c.-186C>APPOXrs2301286OmniExpress
NM_000410.3(HFE):c.187C>G (p.His63Asp)HFE|LOC108783645rs1799945OmniExpress
NM_000190.3(HMBS):c.-65C>THMBSrs589925OmniExpress
NM_000190.4(HMBS):c.88-14G>AHMBSrs17075OmniExpress
NM_000190.4(HMBS):c.613-19C>AHMBSrs1784304OmniExpress
NM_001382.3(DPAGT1):c.*427T>GDPAGT1|HMBSrs28990975OmniExpress
NM_001382.3(DPAGT1):c.*417T>CDPAGT1|HMBSrs7759OmniExpress
NM_001382.3(DPAGT1):c.*265A>GDPAGT1|HMBSrs28990974OmniExpress
NM_001382.3(DPAGT1):c.1177A>G (p.Ile393Val)DPAGT1|HMBSrs643788OmniExpress

When people with porphyria access UPWARD, they are met with a survey built to collect consent, contact information and disease-associated information, such as clinical symptoms, genetic and environmental data, including targeted questions concerning environmental factors suspected to trigger acute porphyria attacks. Participants are given the option to share this survey with family members and friends, both those with and without porphyria symptoms. Although family members and friends without symptoms at the time of the survey will likely never develop symptoms (due to the low penetrance of porphyria-associated mutations), we seek to identify modifying genes and environmental factors that contribute to the phenotype through comparing genotypes of these individuals with those of people reporting latent and active porphyria64. We plan to explore the possibility of recruiting participants by adding UPWARD links to the SNPedia research database, as well as through collaborating with porphyria advocacy and patient-education groups, and clinical partners.

Project 2: Variants Discovery and Rapid Clinical Diagnosis

Many different mutations can contribute to the onset and progression of porphyria65. We designed a method to search for underlying genetic variants associated with symptoms of congenital erythropoietic porphyria (CEP). The first diagnostic steps to confirm CEP often happen after referral to a genetic counsellor, who recommends targeted screening for a panel of known-pathogenic porphyria-associated SNPs. In cases where no known-pathogenic variants are found, whole-exome sequencing may be recommended of both the patient and their parents to catalog variation in the symptomatic person versus their asymptomatic parents. These variants filtered from the parent-child “trio” data can then be annotated with available disease-associated information, if any, using existing tools (such as dbNSFP and WGSA)66,67. With our Variants Discovery tool, we aimed to generate a workflow which operates on trio-data to identify, categorize and then rigorously assess candidate disease-causing mutations in cases where the underlying mutation is unknown, modeled after existing workflows for whole-exome sequence analysis (Figure 2)68.

385c446f-4e4c-4e59-a97e-5c55e4e3b3ee_figure2.gif

Figure 2. Overview of the Variants Discovery pipeline to report possible pathogenic variants associated with Mendelian diseases.

Abbreviations: dbNSFP, database for nonsynonymous SNPs’ functional predictions; WGSA, whole genome sequencing annotator; HGMD, Human gene mutation database; eQTL, expression quantitative trait loci.

Tier 1 variants are known disease-causing mutations in known disease-causing genes. Tier 2 variants are uncharacterized de novo mutations predicted to be damaging (see Methods) in known disease-causing genes. Tier 3 variants are uncharacterized, damaging, inherited mutations in known disease-causing genes (parents are not affected). Tier 4 variants are functional mutations with unknown significance in known disease-causing genes. Tier 5 variants are damaging mutations in the extended gene list (e.g. those genes associated with symptoms of disease). Candidate disease-causing variants are categorized into five evidence-based tiers, where Tier 1 variants are known-pathogenic and have the highest support. We intend to expand this workflow so that it might be used to assist in the diagnosis of patients with other difficult-to-identify conditions.

Project 3: MassiveSeq: Automated meta-analysis of RNA-Seq Data from GEO data

The fields of biology and medicine have undergone swift changes to the manner in which ribonucleic acid (RNA) can be studied using deep-sequencing techniques to investigate expression-differences in possible RNA species that may be associated with deleterious disease outcomes69. RNA-Seq technology has revolutionized detection and analysis of aberrant RNA transcripts associated with disease69.

In rare disease research in particular, obtaining sample-sizes enabling confident identification of disease-associated transcripts is a considerable challenge. The amount of RNA-seq data contributed to NCBI’s Gene Expression Omnibus (GEO), a public repository for functional genomics data, is increasing at a rapid pace. A simple query for “Expression profiling by high throughput sequencing” yielded 14,200 unique datasets as of March 6, /2019. The availability of these massive quantities of data creates an open opportunity in many research areas for meta-analyses using these published datasets to strengthen analytical power. Our massive parallel-sequencing analysis tool, MassiveSeq, provides an opportunity for researchers and bioinformaticians to easily extract and process meaningful information (such as quantitative gene expression, novel transcripts and their isoforms, alternative splice-site variants, SNPs or copy-number variation) from these large datasets to evaluate the associations between biological processes, gene expression and disease outcomes. MassiveSeq automates downloading and processing of large-scale RNA-seq datasets with the aim of easing computational time and complexity7073.

MassiveSeq differs from conventional, comprehensive RNA-seq pipelines in that it combines multiple RNA-seq datasets to increase analytical power (Figure 3)7476. The search is confined to samples meeting the criteria, e.g., disease, library source (genomic, transcriptomic or metagenomic), platform (Illumina, PacBio), or instrument (Genome Analyzer, Hiseq, Nextseq). MassiveSeq additionally allows exploration of novel clustering methods to enable meta-analysis of differential gene expression. Initial steps in processing raw sequencing reads on even a single, traditional dataset are often computationally intensive, and obtaining additional publicly-available RNAseq datasets at a massive scale for such processing is resource-consuming as well. MassiveSeq takes raw-sequencing data (fastq format) automatically streamed from NCBI’s Short-Read Archive (SRA) as input, using a GEO query specifying parameters such as disease and experimental type (e.g., high-throughput RNAseq). Datasets can be further filtered as needed. The MassiveSeq pipeline next utilizes dockerized HISAT2 (version 2.1.0) and StringTie (v1.3.5) to enable automated, parallel processing of each experiment77,78. Reads are automatically streamed directly from SRA, mapped to a reference genome, assembled into transcripts--including novel splice-variants--and annotated in parallel within each dataset. The MassiveSeq pipeline allows uniform processing of multiple, independent RNAseq datasets, enabling powerful identification of differentially expressed genes and transcripts associated with diseases of interest. We applied MassiveSeq to 99 Friedreich’s Ataxia SRA datasets to identify disease-associated transcripts for Iron Hack7073.

385c446f-4e4c-4e59-a97e-5c55e4e3b3ee_figure3.gif

Figure 3. Flowchart for Massiveseq Methodology.

The pipeline takes metadata from the Sequence Read Archive (SRA) and parses it for quality control (QC). The primary work takes place in a custom snakemake script that aligns sequences with Hisat2 and then quantifies transcripts with Stringtie in a parallelized fashion across available machines and cores.

Project 4: Phenogeno Viz: Rapid aberrantly-expressed gene identification from RNA-Seq

Abnormal gene-expression patterns can cause a broad range of diseases. However identifying abnormally-expressed genes and correctly interpreting expression data across experiments can be complicated by inconsistencies in gene-expression normalization strategies, as well as inadequate filtering of noisy data. Here, we developed an algorithm to rapidly identify genes with abnormal gene expression patterns in samples of interest (e.g., disease-presenting patient) as compared to controls (Figure 4). This method was built utilizing ~2000 RNA-Seq datasets publicly available on GTExPortal79. The package implements three commonly used RNA-Seq normalization methods: Fragments per kilobase of transcript per million mapped reads (FPKM), transcripts per million mapped reads (TPM) and differential gene expression analysis based on the negative binomial distribution (DEseq). A Gaussian-mixture model is utilized here to remove RNA-Seq noise and the DE-Seq method is finally implemented to capture abnormally expressed genes corresponding to query tissue. Simulation data were generated to test algorithm performance, and we intend to expand this system so that it might be used to assist in the diagnosis of patients with difficult to identify conditions80.

385c446f-4e4c-4e59-a97e-5c55e4e3b3ee_figure4.gif

Figure 4. The work flow chart for identifying abnormal genes based on RNA-Seq.

After RNA-Seq is performed on a patient sample, the program searches the Genotype-Tissue Expression Project (GTEx) database for RNA-Seq data from the specific tissue potentially associated with the disease. Three methods are used for RNA-Seq normalization (Fragments per kilobase of transcript per million mapped reads (FPKM), transcripts per million mapped reads (TPM) and Differential gene expression analysis based on the negative binomial distribution (as implemented in DESeq)), and the data were fit to a Gaussian mixture model to remove noise within samples. The differentially expressed genes in the patient sample are finally captured by using the R program DESeq.

Project 5: Phenotype-to-Genotype Mapping: Assessing combinatorial variant-contribution to disease phenotypes

Disease-phenotypes are unlikely to be entirely explained by the presence of single pathogenic variants. Pleiotropy, modulating genes and combinatorial effects are the rule, rather than the exception; however assessing combinatorial effects underlying disease quickly becomes computationally expensive, with a practically-infinite number of variant-combinations that could be assessed. We developed a tool-set to enable thoughtful reduction of variants to feasibly assess the role of modifying genes in rare diseases such as Friedreich’s ataxia.

Most of the alleles a person inherits are unlikely to be involved in modulating the disease phenotype, and models incorporating many extraneous variables are unnecessarily cumbersome and perform more poorly than models incorporating domain-specific feature-selection. Therefore the first step of our pipeline was to reduce the disease-associated variant search-space to genes fitting a profile of interest.

We focused on broadly-applicable features of likely disease-causing variants (as opposed to disease-specific features) for our first layer of feature-selection in this first iteration of our pipeline. Input variant-call data are filtered based on the likelihood that any particular variant is deleterious (as predicted by Polyphen-2 scores) and by residual variation inheritance scores81,82. As features-of-interest for disease-associated variants should change depending upon the particular disease, phenotype, and available domain-specific knowledge, the feature-selection component of our tool is intended to be easily extendable for investigating the combinatorial contributions of multiple variants to disease phenotypes by any number of characteristics. We incorporated two existing annotation packages (Open-CRAVAT (version 1.4.0) and ANNOVAR version (On 2018Apr16)) to thoroughly annotate available information for each variant, any of which can be filtered on in the feature-selection module83. Highly ranked variants are then assessed for their contribution to disease-phenotype via the equally modular “analysis” part of our pipeline. Our current analysis module utilizes the APRIORI algorithm to detect variant co-occurrence relationships with disease, though the output from the feature-selection module is in a common format to facilitate application of other machine-learning approaches to identifying combinatorial interactions, all implemented via a simple web-user interface84.

We developed this pipeline with modularity being a primary goal. The APRIORI algorithm is currently implemented to identify genes that frequently co-occur in the feature-selected set of genes. Future work will implement tools that check for over-representation of gene ontology terms among the genes determined to have deleterious alleles.

Methods and implementation

Key concepts informing methods and implementations of each project are described below.

UPWARD

To build a database of pathogenic or likely pathogenic SNPs, we sourced Rapid Stain Identification Series (RSID) information from the Illumina OmniExpress & Illumina Global Screening Array (GSA) microarray chips (used by Ancestry and 23andMe respectively), then filtered out non disease-associated genes using the NCBI ClinVar database. For specific application to porphyrias, we selected all genetic polymorphisms annotated to be associated with any of the porphyrias, as well as their associated RSID, SNP location in the genome, and degree of pathogenicity85,86. Participants’ raw genomics data and environmental data are stored in a non-relational database, which has been proven to be more efficient than relational databases for storing and accessing genomic data87,88. A secure, HIPAA-compliant human subject meta-information database will be built as part of the next iteration of development89,90. A secure, HIPAA-compliant human subject meta-information database will be built as part of the next iteration of development89,90. At that time, the database will be expanded to capture the following information: 1) patient-reported phenotype and symptom information of people identified as potentially carrying a pathogenic or likely pathogenic variant in a porphyria gene and 2) people with a clinical diagnosis of porphyria, as well as de-identified information on their family members to try to capture data on asymptomatic people.

Our system currently consists of a cloud-database built on MongoDB Community Edition, and a web server run through NGINX to accept input data from participants. The entire system is containerized and orchestrated by Docker Compose for ease of replication and to enable application to other diseases.

Variants Discovery and Rapid Clinical Diagnosis

Our pipeline categorizes patient variant-data into five tiers of pathogenic certainty based on quality of evidence, the logic of which is broadly outlined in Figure 2. The pipeline accepts dbNSFP or WGSA-annotated patient variant-files (in tab-delimited format, one variant per line). Annotated variants are first checked against known disease-associated variant databases, namely HGMD and ClinVar, to identify any previously reported pathogenic mutations matching the patient phenotype; these known, disease-causing variants in known disease-causing genes are categorized into the most confident classification, Tier 1. All variants not represented in the HGMD and ClinVar databases are next checked to see if they are located in genes that are involved in known disease-associated pathways. Variants in disease-associated pathways are then evaluated for probability of being deleterious (with start-loss, stop-gain, essential splicing variant, frameshift, indel or missense-mutations being highly likely to be deleterious). Damaging variants in known disease-causing genes are then contrasted against variant files from the non-affected parents to distinguish de novo (Tier 2, uncharacterized, damaging de novo mutation in known disease-causing gene) from inherited (Tier 3, uncharacterized, damaging inherited mutation in known disease-causing gene) variants. Non-deleterious variants are not considered further.

Damaging variants not occurring in known disease-causing genes themselves, but mapping to known expression-Quantitative Trait Loci (eQTLs, loci associated with expression-changes in transcripts from known disease-causing genes), are stratified into Tier 4. We report all other damaging variants in or associated with genes that are related to symptoms of the disease as Tier 5 (damaging mutation in the extended gene list). Any other known disease-causing mutations associated with unrelated diseases are additionally reported in an extended report to allow for possible incidental or secondary findings.

MassiveSeq

We planned the main snakemake (version 5.4.2) to automate dispatching of jobs depending on the available cores and memory of a machine91. Here, the core steps involved Hisat2 for alignment, followed by Stringtie (v1.3.5) for transcript annotation and de novo annotation. Finally, reads were quantified by using featureCounts to measure at the exon level from the subRead (version 1.6.3) package92. This quantification pipeline follows a common, recently published protocol on Stringtie and Hisat293. It allows for both known as well as novel isoform transcripts to be identified and measured.

Once the gene counts were fully quantified for each sample, we analyzed the overall dataset comprised of all 4 studies using the R package metaSeq (version 1.22.1)94. This package adapts the non-parametric NOISeq method for differential RNA-seq analysis to allow for multiple studies in a meta-analysis framework95.

We used the gene counts with GSVA (Gene Set Variation Analysis, version 1.30) to estimate per-same GSEA (Gene Set Enrichment Analysis) pathway enrichments for the 50 hallmark datasets from MSigDB96,97. We used these pathway enrichments as features (a binary up- or down-regulated pathway) for a deep learning model, along with the remaining gene estimates. We used the fast.ai library to construct a Convolutional Neural Network (CNN)98. One of the novel features of fast.ai, especially for our data, is that it facilitates rapid construction of neural networks with tabular data via embeddings similar to Word2Vec99. The training split was 70/30; afterwards the CNN was trained for 5 epochs (cycles of the data), with a learning rate of 0.1.

We are also in the early stages of adapting the GATK RNA-seq best practices to this pipeline so that we can rapidly call variants on these samples100. Our workflow for the procedures, and methods used can be found in Figure 3.

Phenogeno Viz

This package is designed to detect abnormal genes exhibited differential expression compared with normal tissue cell. For each patient RNA-Seq result, we first download the gene expression level from normal tissue same as patient tissue. Then for multiple gene expression RNA-Seq samples available in website GTExportal (https://gtexportal.org/home/), three normalized methods including TPM, FPKM and DEseq are available. The Gaussian-mixture model is utilized to remove RNA-Seq noise. The basic idea is to use the EM algorithm to find two best fitted Gaussian distribution and only maintain the distribution with relative higher mean value as a true signal. After noise reduction, DEseq algorithm is used to identify significant up or down regulated genes in patients.

Phenotype to Genotype-Mapping

This tool is envisioned to be most useful for analyzing variant-call data from networks of families affected by rare disease (Figure 5). Currently the input data are variant-call files (.vcf) obtained from patients with a rare disease; future versions will incorporate additional genomic information from family members. The user uploads patient .vcf files through a web-app interface, then selects features on which to filter (currently a maximum and minimum score for residual variance-intolerance, a measure of gene-tolerance to variation based on population allele-frequencies). The web interface can be run locally to keep patient data secure. The back-end next runs the files through CRAVAT and ANNOVAR to assemble annotation information on all variants compiled from multiple databases that the user selects upon install (including ClinVar, Pubmed, etc). These annotation data are then filtered according to user-specification using the feature selection module.

385c446f-4e4c-4e59-a97e-5c55e4e3b3ee_figure5.gif

Figure 5. Phenotype-to-Genotype Mapping: Assessing combinatorial variant-contribution to disease phenotypes general workflow.

Input data are variant-call files in .vcf format collected from patient samples. The feature-selection module collects all available annotation information for each identified variant, then narrows down to variants most likely to be associated with the phenotype based on user-specified parameters. These feature-selected variants are then analyzed for combinatorial contribution to the disease using the tools in the analysis module. The output of the analysis modules are tables and graphs that summarize the results.

Operation

Operation can be performed on a computational cluster with multiple cores. The system can use a Lustre parallel file system for fast Input and Output. Remote mounting onto the cluster should be available for flexible data access and movement.

UPWARD

The only requirement to build this system is having Docker and Docker Compose installed on your machine. For instructions on running the system refer to the associated GitHub readme at bit.ly/UPWARD19101,102.

Variants Discovery and Rapid Clinical Diagnosis

GitHub readme and description available at https://bit.ly/2FGqkv7103,104.

“Massive” RNA-seq Combined Analysis of Multiple Datasets

For full instructions on how to clone and implement the code, please refer to:

The MassiveSeq github repository: https://bit.ly/2HKA61y105,106.

PhenogenoViz: Rapid abnormal gene identification based on RNA-Seq

Running the web app requires the installation of Ruby, R, and Python on the server. The instructions for installing Ruby on Rails on Windows 10, Ubuntu, and OS X can be seen here, and should be similar for different OS versions107. Install Python 3 from here and install R from here108,109. The web application is available on GitHub at https://bit.ly/2V3Hpo2 and instructions for installation are detailed on the ReadMe110,111.

Phenotype to Genotype-Mapping

GitHub readme and description available at https://github.com/NCBI-Hackathons/pheno_geno_ataxia112,113.

Lessons learned

Throughout this process we identified several areas where improvements could be made for future disease-focused hackathons. A few of these are described below.

  • 1) We were successful in prototyping for a specific disease.

  • 2) It was helpful to learn more about the diseases and current problems that need to be solved before starting the projects.

  • 3) If trying to solve a clinical problem, such as how to improve and speed up the rate at which patients receive a diagnosis for rare diseases, include clinicians as part of the group.

  • 4) It may also be advantageous to have the didactic presentations about the diseases in advance of the hackathon such that everyone has a basic understanding of the issues and disease symptoms and time for brainstorming.

  • 5) Having a team meeting prior to the hackathon to assign roles and discuss overall flow for each day was helpful.

  • 6) Providing literature to read about the disease/genetic condition was also useful background for preparing for the hackathon.

  • 7) Having two leads on each team increased efficiency, as each could take turns fielding questions from less-experienced team-members while the other could keep the hacking on-task for the day.

  • 8) A few things some specific teams learned

    • a) UPWARD: Forming a team composed of people with a variety of training backgrounds (e.g. clinicians, researchers, organizers, computer scientists, biologists, geneticists, etc.) brings strength and utility to team ideas and project results. Additionally devoting a portion of the first day or meeting period to brainstorming, idea proposal, and arguments facilitates the formation of a plan which team members are able to agree upon and work towards while reducing the chance of a schism further down the road.

    • b) MassiveSeq: We initially overscoped/planned on a completely different toolset. Scaling back to a core set of tools that we were comfortable with made completing the project feasible. People were coming in from really different backgrounds (choice of programming language, familiarity with genomics data formats, etc.) and in retrospect we would have liked to have planned a bit more for some specific tasks.

Results

UPWARD

After identifying porphyria-related pathogenic SNPs currently included on Illumina OmniExpress and Illumina GSA microarray chips, 43 porphyria-related pathogenic SNPs were found. This list (presented in Table 1) will be maintained and updated at the UPWARD GitHub repository (labeled as pathogenic SNPs.csv)101. This list will be compared against participant-submitted consumer genomics test results within UPWARD once the project is reviewed and approved by the Institutional Review Board (IRB).

Variants discovery and rapid diagnosis

We examined the genetic basis with the following examples of porphyria genetics. Using our developed pipeline, we successfully identified several candidate SNPs (in Tier 1, Tier 2 and Tier 4) that were previously unnoticed in a porphyria patient in a clinical setting. These SNPs are located in genes known to cause different kinds of porphyria, e.g. UROS and CPOX genes. The discovered SNPs can, based on the prediction using our pipeline, affect the transcription of the candidate genes, their translation or both. All of these possibilities would result in functional abnormalities of the final gene product. For example, two of the identified SNPs were eQTL (expression quantitative trait loci), which led to significantly decreased expression level of the UROS gene (Figure 6 and Figure 7).

385c446f-4e4c-4e59-a97e-5c55e4e3b3ee_figure6.gif

Figure 6. Expression change of the UROS gene caused by eQTL SNP No. 1 across all tissue types in the Genotype-Tissue Expression Project (GTEx).

There is significant down-regulation of UROS gene associated with this variant in all tissues (except ovary). NES: normalized effect size.

385c446f-4e4c-4e59-a97e-5c55e4e3b3ee_figure7.gif

Figure 7. Expression change of the UROS gene caused by eQTL SNP No. 2 across all tissue types in the Genotype-Tissue Expression Project (GTEx).

There is significant down-regulation of UROS gene associated with this variant in all tissues (except ovary). NES: normalized effect size.

These variants examples demonstrated that our pipeline can help physicians and/or clinical geneticists quickly filter out the vast majority of neutral variants and report the remaining variants in clinically meaningful tiers to facilitate further experimental validation and explanation.

MassiveSeq

Using Metaseq we identified over 2000 genes upregulated in Friedreich’s Ataxia patients compared to controls (Figure 8). However, we emphasize that this analysis was merely a proof of concept, and further work needs to be done to explore methods and techniques for standardizing phenotypes (data harmonization) alongside the meta-analysis itself in Friedreich’s Ataxia. (Figure 8).

385c446f-4e4c-4e59-a97e-5c55e4e3b3ee_figure8.gif

Figure 8. Significance of up-regulated genes from metaseq analysis; red bar denotes .05 significance cutoff.

Distribution of significance in downregulated genes from metaseq analysis; no genes were significant at 0.05 threshold.

We used fast.ai to train a CNN on an embedded feature-space of these counts as well as 50 gene-set enrichment features from Msigdb (see Methods). The trained model had an overall accuracy of 0.75, which seems promising given the number of features and low number of samples for training.

MassiveSeq’s implementation of HISAT2 and StringTie identified novel-isoform transcripts in various samples. We focused our analyses on the FXN gene, as trinucleotide GAA-repeats at this locus are causative of FRDA. We identified multiple novel-isoform transcripts within 1kb up and downstream of FXN in affected, unaffected and carrier-patients (Table 2). We were able to visualize the truncation of the FXN transcripts from the above samples using IGV. Shallow read-coverage of the whole transcriptome from this particular study made it difficult to confirm the reliability of the identified transcript truncation.

Table 2. List of novel-isoform transcripts within 1kb of the FXN gene.

Novel TranscriptChr.StrandStartEndFPKMTPMDisease
SRR8038380_chr.305729+69035259691001781.7870763.825795Friedrich Ataxia
SRR8038380_chr.305739-69107926691082170.1361390.291447Friedrich Ataxia
SRR8038387_chr.176999+69035259690790760.2740680.490055Carrier
SRR8038389_chr.198449+69035751690748501.0705711.139182Unaffected
SRR8038390_chr.212539+69035259691001781.0334841.298192Unaffected
SRR8038399_chr.264279+69035259690790763.1269597.802162Unaffected

Phenogeno Viz: Rapid abnormal gene identification based on RNA-Seq

For each input patient RNA-Seq data, the RNA-Seq data related to query tissue are extracted from the database. The available tissues and number of RNA-seq data are listed in Table 3.

Table 3. Available RNA-Seq data samples in Genotype-Tissue Expression Project (GTEx) for different tissues.

TissuesNumber of
RNA-Seq data
Adipose797
Adrenal190
Bladder11
Blood536
Blood913
Brain1671
Breast290
Cervix11
Colon507
Esophagus1021
Fallopian7
Heart600
Kidney45
Liver175
Lung427
Muscle564
Nerve414
Ovary133
Pancreas248
Pituitary183
Prostate152
Salivary97
Skin1203
Small137
Spleen162
Stomach262
Testis259
Thyroid446
Uterus111
Vagina115

For convenience, brain tissue is selected here for the following discussion. Considering different sequencing depth for each sample, we provide three methods for data normalization: DEseq, FPKM, and TPM. As there are a large number of samples, we used uniform sampling to select n genes for visualization. As shown in Figure 9, n = 20 genes are shown here to compare different normalization methods. DEseq normalization results show relative lower fluctuation compared with the other two methods (F-test p < 2.2e16), indicating better performance of DEseq. Except for Bladder, Cervix, and Fallopian, most tissues in our database exhibit large RNA-Seq sample number. Therefore, a method is required to select the data with relatively high signal/noise ratio. A Gaussian-mixture model is fit for each gene and returns the posterior probability to be ‘true’ signal for each RNA-Seq sample (Figure 10). The top ten samples with the highest average posterior probability are picked as background and compared with patient samples. Then, DEseq is used for differential expression gene identification. As shown in Figure 11A, the green dots represent significant differential expression gene (p-adj < 0.01) between patient and samples from database. Figure 11B shows top 10 abnormal genes and their geneID in patients.

385c446f-4e4c-4e59-a97e-5c55e4e3b3ee_figure9.gif

Figure 9. n = 20 genes are sampled here to compare different normalization method: Fragments per kilobase of transcript per million mapped reads (FPKM), transcripts per million mapped reads (TPM) and Differential gene expression analysis based on the negative binomial distribution (DESeq).

385c446f-4e4c-4e59-a97e-5c55e4e3b3ee_figure10.gif

Figure 10. The Gaussian mixture model is implemented here to filter out noise.

Hist plot shows the distribution of gene expression level for gene ‘CELSR2’ in 1671 different brain RNA-Seq samples. The Gaussian mixture model is fitted by the EM algorithm and the noise is filtered out by posterior probability bigger than 0.5.

385c446f-4e4c-4e59-a97e-5c55e4e3b3ee_figure11.gif

Figure 11. Differential gene expression analysis based on the negative binomial distribution (DESeq) is used here to find differential expression genes between patient and database.

A) Scatter plot shows significant differential genes (green dot, p-adj < 0.01). B) Boxplot shows top 10 abnormal genes in simulation compared with data from database.

(‘Neuroblastoma’ related genes is used here).

Phenotype-to-Genotype Mapping

The code was tested using the related individuals from the 1000 genomes project. Flagging the genes most likely to have deleterious alleles decreased the search space enough to allow the APRIORI algorithm to run on the dataset.

Conclusion and next steps

Common questions in the community about hackathons include whether they can focus on specific diseases and how clinical personnel can interact more effectively with data scientists. We found that it was indeed possible to focus on a given disease while developing generalized tools in a hackathon. In fact, we found it helpful to have cases to use in our analyses from a specific disease. Finally, we found it was to the benefit of everyone to have clinical personnel involved, especially in the later stages of the event.

Software availability

Rapid Clinical Diagnostics

Source code: https://github.com/NCBI-Hackathons/Rapid_Clinical_Diagnostics

Archived source code: http://doi.org/10.5281/zenodo.3236563104

License: MIT

Phenotype-to-Genotype Mapping

Source code: https://github.com/NCBI-Hackathons/pheno_geno_ataxia

Archived source code: http://doi.org/10.5281/zenodo.3236569112

License: MIT

Data availability

Underlying data

All data underlying the results are available as part of the article and no additional source data are required.

Comments on this article Comments (0)

Version 1
VERSION 1 PUBLISHED 19 Jul 2019
Comment
Author details Author details
Competing interests
Grant information
Copyright
Download
 
Export To
metrics
Views Downloads
F1000Research - -
PubMed Central
Data from PMC are received and updated monthly.
- -
Citations
CITE
how to cite this article
Ferreira GC, Oberstaller J, Fonseca R et al. Iron Hack - A symposium/hackathon focused on porphyrias, Friedreich’s ataxia, and other rare iron-related diseases [version 1; peer review: 2 approved]. F1000Research 2019, 8:1135 (https://doi.org/10.12688/f1000research.19140.1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.
track
receive updates on this article
Track an article to receive email alerts on any updates to this article.

Open Peer Review

Current Reviewer Status: ?
Key to Reviewer Statuses VIEW
ApprovedThe paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approvedFundamental flaws in the paper seriously undermine the findings and conclusions
Version 1
VERSION 1
PUBLISHED 19 Jul 2019
Views
11
Cite
Reviewer Report 04 Dec 2019
Robert J. Desnick, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA 
Approved
VIEWS 11
General Comments:
This is an interesting and insightful manuscript that describes several novel computer-based algorithms that can investigate and/or uncover new insights into the porphyrias and Fredrich Afaxia. It proves the usefulness of bringing together experts in various ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Desnick RJ. Reviewer Report For: Iron Hack - A symposium/hackathon focused on porphyrias, Friedreich’s ataxia, and other rare iron-related diseases [version 1; peer review: 2 approved]. F1000Research 2019, 8:1135 (https://doi.org/10.5256/f1000research.20975.r56584)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
Views
23
Cite
Reviewer Report 28 Aug 2019
Karl E Anderson, Department of Preventive Medicine and Community Health, The University of Texas Medical Branch at Galveston, Galveston, TX, USA 
Approved
VIEWS 23
“Iron Hack” is a rare disease-focused hackathon. As a collaborative, problem-solving effort it has successfully attracted participants of diverse backgrounds at the U of South Florida and focusses on rare iron-related genetic diseases, specifically porphyrias and Friedreich’s ataxia. 
  ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Anderson KE. Reviewer Report For: Iron Hack - A symposium/hackathon focused on porphyrias, Friedreich’s ataxia, and other rare iron-related diseases [version 1; peer review: 2 approved]. F1000Research 2019, 8:1135 (https://doi.org/10.5256/f1000research.20975.r51424)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.

Comments on this article Comments (0)

Version 1
VERSION 1 PUBLISHED 19 Jul 2019
Comment
Alongside their report, reviewers assign a status to the article:
Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations - A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions
Sign In
If you've forgotten your password, please enter your email address below and we'll send you instructions on how to reset your password.

The email address should be the one you originally registered with F1000.

Email address not valid, please try again

You registered with F1000 via Google, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Google account password, please click here.

You registered with F1000 via Facebook, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Facebook account password, please click here.

Code not correct, please try again
Email us for further assistance.
Server error, please try again.