MIREyA: a computational approach to detect miRNA-directed gene activation

Emerging studies demonstrate the ability of microRNAs (miRNAs) to activate genes via different mechanisms. Specifically, miRNAs may trigger an enhancer promoting chromatin remodelling in the enhancer region, thus activating the enhancer and its target genes. Here we present MIREyA, a pipeline developed to predict such miRNA-gene-enhancer trios based on an expression dataset which obviates the need to write custom scripts. We applied our pipeline to primary murine macrophages infected by Mycobacterium tuberculosis (HN878 strain) and detected Mir22, Mir221, Mir222, Mir155 and Mir1956, which could up-regulate genes related to immune responses. We believe that MIREyA is a useful tool for detecting putative miRNA-directed gene activation cases. MIREyA is available from: https://github.com/veania/MIREyA


Introduction
Conventionally, microRNAs (miRNAs) are considered to suppress gene expression through RNA interference (RNAi) by binding complementarily to mRNAs, forming a RISC complex, and causing RNA degradation. 1 However, recent studies provide evidence that some miRNAs act in the opposite way, stimulating gene activation. Numerous studies have demonstrated the ability of miRNAs to up-regulate genes by targeting their promoters. 2,3 Ago1 from the miRNA-Ago complex associates with the Ccnb1 promoter, and miR-744 induces enrichment of RNA Pol II and H3K4me3 at the Ccnb1 transcription start site. 3 MiRNA let-7i interacts with the TATA-box of the IL-2 gene and stimulates transcription initiation by contributing to the assembly of the pre-initiation complex. 4 Fewer miRNAs have been shown to unconventionally target and activate enhancers, thus inducing genes regulated by these enhancers. MiR-24-1 acts as a modulator of the chromatin state of an enhancer and increases p300 and RNA Pol II binding at the enhancer locus. MiR-24-1 itself originates from that enhancer locus. However, some genes regulated by other enhancers are also expressed at higher levels when miR-24-1 is transfected, and the enhancers of the induced genes contain a sequence similar to the seed of the miRNA. 5 These observations suggest that other miRNAs might trigger enhancers and activate gene expression.
When miRNAs function as activators in the nucleus, different targeting mechanisms are possible: miRNA:DNA Watson-Crick duplex formation as well as miRNA:DNA Hoogsteen triple helix formation. Nuclear miRNA target prediction tools are based on the idea that miRNA:DNA interaction requires an intact seed region. 6 MicroPIR2 predicts targets in mouse and human promoter regions. 7 Trident predicts miRNA:DNA Hoogsteen-type base pairings. 8 Some tools designed to predict conventional miRNA targets may also be utilized to find nuclear activational targets, e.g. miRanda. 9 In this work, we report MIREyA (MIRnas functioning through Enhancer Activation), a pipeline that detects, in a provided expression dataset, miRNAs and the gene targets they up-regulate by triggering the enhancers of these genes. We applied MIREyA to identify and characterize activational miRNAs in a dataset of macrophages infected with Mycobacterium tuberculosis (Mtb). MiRNAs are important regulators of macrophage responses during Mtb infection that may act as host immunity agents as well as tools exploited by the pathogen to manipulate host cell pathways. 10 Yet, only the classical role of miRNAs has been investigated so far in the context of Mtb infection. [11][12][13][14][15][16][17][18][19] Although multiple studies have shown the possibility of an activational role of miRNAs, this potential remains neglected in the study of miRNAs in bacterial infections. MIREyA found several miRNAs which have not yet been shown to be functional in TB, suggesting it could be useful for finding candidate activational miRNAs for further experimental validation. Mir155 has previously been shown to act as a negative regulator of essential mRNAs during TB, 20,21 but not as an activator.

Methods
Main steps of the algorithm
MIREyA aims to detect miRNAs with the potential to upregulate a gene via activation of its enhancer. It consists of three major steps:
1) The algorithm detects miRNA bound to the enhancers. This step is implemented with three different approaches described below.
2) The algorithm selects genes regulated by the enhancers selected in step 1). For this step the output from the first step is required as well as a table with enhancer:gene pairs where an enhancer is assumed to regulate the corresponding gene.
3) The algorithm calculates the Spearman's correlation coefficient (SCC) between the expression levels of miRNAs and genes regulated by corresponding enhancers selected in step 1), and estimates the p-value of the SCC with a Benjamini-Hochberg correction using the number of the miRNA:gene pairs for one miRNA (FDR < 0.05). The input data is gene expression data with sample size ≥ 8 and the output of the previous step.
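The correlation and multiple-testing part of step 3 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the code of the pipeline: the function names `average_ranks`, `spearman` and `benjamini_hochberg` are hypothetical, and the raw p-values of the SCC are assumed to come from elsewhere (for example from `scipy.stats.spearmanr`, which returns both the coefficient and its p-value).

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's coefficient: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

In this sketch the correction would be applied separately to the p-values of all miRNA:gene pairs of one miRNA, and pairs with an adjusted value below 0.05 would be kept.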

REVISED Amendments from Version 1
This version contains several minor changes. First, we added a more detailed explanation of the arguments which a user provides to our pipeline and a description of the output of the pipeline. Secondly, the missing number of enhancers is now present in Table 2. Thirdly, we improved logging of the pipeline in the new version and updated the archived version of the repository. We also corrected the percent identity (PI) threshold from 0.5 to 50 to fit the definition of the percent in the text and made corresponding changes in the code. Other minor changes include the replacement of reference 1 and a fix of the gene name Dot1l.
Any further responses from the reviewers can be found at the end of the article

Figure 1 illustrates the workflow of our algorithm implemented in Python and R.

Prediction of miRNA-enhancer interaction
We speculate that in order to activate an enhancer, miRNA should bind to enhancer DNA. Since the mechanism of such binding is unclear, we decided to implement several reasonable prediction strategies. The first two strategies assume that miRNA binds to DNA forming an RNA:DNA double helix, while the third assumes RNA:DNA triple helix formation.
1) The first approach is to select an enhancer containing an exact match of the user-provided seed sequence of a miRNA, then expand each seed by 14 bp of the corresponding mature miRNA and align it to the enhancer with Needle tool 22 and keep only enhancers with the percent identity (PI) > 50 (PI defined as a percent of matches between miRNA and DNA region).
2) The second approach is to scan miRNA sequences against enhancer sequences and detect potential target sites with MiRanda. 9
3) The third approach is to predict RNA:DNA triplexes between miRNAs and enhancers with the Triplexator tool. 23 We relaxed the error-rate and lower-length-bound Triplexator default parameters in order to adjust the algorithm to work with extremely short miRNA sequences (error-rate=19, lower-length-bound=11).
The approaches are interchangeable, and the user can also merge the results of all approaches to reflect multiple mechanisms of potential miRNA:DNA binding.
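The logic of the first approach can be illustrated with a short Python sketch. It is a simplification under explicit assumptions: the pipeline aligns with the Needle tool (a global alignment with gaps), whereas the sketch below uses an ungapped, position-by-position comparison as a stand-in, looks at one strand only, and uses hypothetical function names (`seed_hits`, `percent_identity`, `candidate_sites`) and a toy sequence.

```python
def seed_hits(enhancer, seed):
    """Start positions of exact seed matches in the enhancer (overlaps allowed)."""
    hits = []
    start = enhancer.find(seed)
    while start != -1:
        hits.append(start)
        start = enhancer.find(seed, start + 1)
    return hits


def percent_identity(a, b):
    """Percent of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def candidate_sites(enhancer, query, seed, min_pi=50.0):
    """Enhancer windows that contain the seed and exceed the PI threshold.

    `query` is the mature miRNA written as DNA; the window is placed so
    that the seed in the query lines up with the seed hit in the enhancer.
    Returns a list of (window_start, percent_identity) tuples.
    """
    offset = query.find(seed)
    if offset == -1:
        raise ValueError("seed is not part of the query sequence")
    sites = []
    for hit in seed_hits(enhancer, seed):
        start = hit - offset
        end = start + len(query)
        if start < 0 or end > len(enhancer):
            continue  # window would run past the enhancer boundaries
        pi = percent_identity(enhancer[start:end], query)
        if pi > min_pi:
            sites.append((start, pi))
    return sites
```

With a gapped aligner such as Needle the PI values will generally differ from this ungapped estimate, so the sketch only conveys the order of operations: exact seed match first, extension and identity filtering second.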
The main script to run the pipeline is src/run_mireya.py. Input and output files depend on the choice of one of the previously described approaches to predict miRNA-enhancer interaction and must be specified with the following flags:
-d/ --detection_mir_enh_interaction: approach to predict miRNA-enhancer interaction, accepts one of three possible values: seed_match_needle / miranda / triplexator
-e/ --enhancers: path to a fasta file with sequences of enhancers of interest
-o/ --output: full path to output directory
-ge/ --gene_expression: path to .tsv file with normalized gene expression
-me/ --mirnas_expression: path to .tsv file with expression of miRNAs of interest
-ei/ --enh_gene_interaction: path to .tsv file with enhancers and corresponding genes they are assumed to regulate
-m/ --mature_mirnas: path to a fasta file with sequences of mature miRNAs of interest
Pipeline mode with the first approach to predict miRNA-enhancer interaction (-d seed_match_needle) requires additional input data which must be provided with flags:
-g/ --genome: path to a folder with fasta files with complete genome of the organism of interest: one chromosome per file
-s/ --seeds_mirnas_forward: path to a tab-delimited file with sequences of seeds of miRNAs of interest, without header with 2 columns: 1) general name of mirnas; 2) their seeds as DNA sequences (make sure that U are replaced with T)
-sr/ --seeds_mirnas_reverse_compl: path to a tab-delimited file with reverse complementary sequences of seeds of miRNAs of interest, without header with 2 columns: 1) general name of mirnas; 2) their seeds as DNA sequences (make sure that U are replaced with T)
-eb/ --enhancers_bed: path to bed file with coordinates of enhancers of interest
-ms/ --mature_mirnas_separate: path to a directory containing directories with mature miRNA sequences: one folder per miRNA containing fasta files with one mature sequence per file.
Each directory with mature mirna fastas must be named exactly as in other input files. Names of fasta files will be used further as "mature_mirna" column in result tables.
MiRNAs must have the same names in -ge, -me, -ei, -s, -sr input data and for folders in -ms argument. Please, use example files to make sure the input is in correct format. MiRNA names for mature sequences must be the same in -m and -ms arguments.
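The two seed tables expected by -s and -sr (two tab-delimited columns, no header, seeds written as DNA) could be prepared along the following lines. This is a hedged sketch: the helper names are hypothetical, and taking nucleotides 2-8 of the mature sequence as the seed is an assumption of the example, since the text leaves the seed definition to the user.

```python
def rna_to_dna(seq):
    """Upper-case the sequence and replace U with T."""
    return seq.upper().replace("U", "T")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(dna))


def seed_rows(mature_mirnas, seed_start=1, seed_end=8):
    """Build forward and reverse-complement seed rows.

    mature_mirnas maps a miRNA name to its mature RNA sequence.
    seed_start/seed_end are 0-based slice bounds; the default slice
    (nucleotides 2-8) is an assumption of this sketch.
    """
    forward, reverse = [], []
    for name, rna in mature_mirnas.items():
        seed = rna_to_dna(rna)[seed_start:seed_end]
        forward.append((name, seed))
        reverse.append((name, reverse_complement(seed)))
    return forward, reverse


def to_tsv(rows):
    """Serialize (name, seed) rows as tab-delimited text without a header."""
    return "".join(f"{name}\t{seed}\n" for name, seed in rows)
```

The names written to these tables would still have to match the names used in the other input files, as required above.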
Examples of commands to run the main script, one for each approach to predict miRNA-enhancer interaction (names of files coincide with names of example input files in the repository):
The output file is called mir_enh_gene_trios.tsv. An example output file is placed in the out/ directory of the repository. One line in the output file corresponds to one trio of a mature miRNA, an enhancer and a gene ("mature_mirna", "Gene.Name", "enhancer" columns). Columns "corr (miRNA, gene)" and "p.value adj" correspond to Spearman's correlation coefficient of the miRNA and the gene expression and to the FDR, respectively. The "PI" column corresponds to percent identity and is present only in the output of the seed_match_needle approach.
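A downstream script could read the output table and keep only significant trios, for instance as sketched below. The column names are those listed in the text; the function name `significant_trios` and the default cut-off argument are hypothetical, and the sketch assumes a tab-delimited file with a header row.

```python
import csv
import io


def significant_trios(tsv_text, fdr=0.05, pvalue_column="p.value adj"):
    """Return (miRNA, enhancer, gene) tuples whose adjusted p-value is below fdr.

    tsv_text is the content of an output table such as
    mir_enh_gene_trios.tsv, read into a string.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        if float(row[pvalue_column]) < fdr:
            kept.append((row["mature_mirna"], row["enhancer"], row["Gene.Name"]))
    return kept
```

If the header of a given output file spells a column differently, the name can be passed explicitly through `pvalue_column`.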

Implementation
The pipeline is implemented as Python, R and bash scripts, and can be run with a master script run_mireya.py.

Operation
Python>=3.5 and r-base are expected to be pre-installed. In addition, two modes of the pipeline require the following tools to be installed: MiRanda and Triplexator. The pipeline was tested on Ubuntu and Ubuntu-based Linux systems (Ubuntu>=16.04).

Use case
We applied MIREyA to three time-series (0, 4, 12, 24, 48, 96 hours) expression datasets (CAGE) of mouse bone marrow-derived macrophages infected with the hypervirulent Beijing/W lineage Mycobacterium tuberculosis (Mtb) HN878 strain, with 2-4 replicates per time point. 24 Each dataset corresponds to the time series after infection for macrophages of a different phenotype: not pre-stimulated (M0), interferon-γ stimulated (M1-polarized) and interleukin-4/interleukin-13 stimulated (M2-polarized). Only differentially expressed (DE) miRNAs and genes were considered. We obtained the enhancer-gene interactome from a previous study, 25 where an enhancer is predicted to regulate a gene if their expression levels correlate significantly and they belong to the same topologically associated domain (TAD).
We searched for candidate enhancers targeted by miRNAs with all three methods described previously and merged the results for further steps. CAGE enables one to estimate expression at the promoter level, while enhancers are associated with whole genes. As a proxy of gene expression, we used either an expression value of a promoter with the highest SCC (a) or summed up the expression values of all promoters of the gene (b). To reduce the number of false positive predictions, we selected among miRNA:enhancer duplexes only such cases where (1) a duplex was predicted both by miRanda and Needle-based approach; (2) an enhancer was associated with several genes since in the original paper on miRNA-activated enhancers 5 one nuclear miRNA affected expression of multiple genes regulated by the triggered enhancer; (3) the miRNA-gene pair was obtained in both ways (a and b) to estimate expression of the gene. We also added miRNA-gene pairs with highly correlated expression levels (SCC ≥ 0.8) in any of four combinations of two methods to detect miRNA:enhancer interactions (seed match + Needle and miRanda) and two approaches to treat expression of different promoters (a and b). Among predicted miRNA:enhancer triplexes we selected cases where (1) an enhancer was associated with multiple genes; (2) both approaches to estimate gene expression (a and b) yielded this triplex.
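The two ways of turning promoter-level CAGE expression into a gene-level proxy, (a) the best-correlated promoter and (b) the sum over promoters, can be sketched as follows. The function names, the dictionary layout and the injectable `correlation` callable are assumptions of this illustration rather than parts of MIREyA; in practice `correlation` would be a Spearman's correlation function.

```python
def sum_promoters(promoter_expression, promoter_to_gene):
    """Option (b): sum the expression of all promoters of each gene.

    promoter_expression maps promoter id -> list of values per sample;
    promoter_to_gene maps promoter id -> gene name.
    """
    gene_expression = {}
    for promoter, values in promoter_expression.items():
        gene = promoter_to_gene[promoter]
        if gene in gene_expression:
            gene_expression[gene] = [
                a + b for a, b in zip(gene_expression[gene], values)
            ]
        else:
            gene_expression[gene] = list(values)
    return gene_expression


def best_promoter_per_gene(mirna_values, promoter_expression,
                           promoter_to_gene, correlation):
    """Option (a): keep, per gene, the promoter with the highest correlation.

    correlation is any callable taking two equal-length lists and
    returning a number. Returns gene -> (promoter id, score).
    """
    best = {}
    for promoter, values in promoter_expression.items():
        gene = promoter_to_gene[promoter]
        score = correlation(mirna_values, values)
        if gene not in best or score > best[gene][1]:
            best[gene] = (promoter, score)
    return best
```

Requiring a miRNA-gene pair to be found under both options, as in criterion (3) above, then amounts to intersecting the pairs obtained from the two gene-level tables.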

Results
We applied MIREyA to three time-series datasets of Mtb-infected macrophages with three different phenotypes prior to infection. In M0 macrophages, 10 miRNAs were differentially expressed (DE) in at least one time point compared to the state before infection (0h); in M1 there was no DE miRNA, while in M2 only Mir1956 was DE. Figure 2 and Table S1 (Extended data 43 ) represent detected miRNA-enhancer-gene trios for M0 macrophages (Extended data: Table S2 for M2 macrophages 43 ). We investigated the functions of the obtained genes and miRNAs and confirmed that many of them might be involved in the response to Mtb infection. We detected Mir155, which has been extensively studied and is known to subvert autophagy in human dendritic cells 20 and to be a potential diagnostic marker of active tuberculosis. 21 Other miRNAs and their targets which we consider promising for further investigation are summarized in Tables 1 and 2. Mir22, Mir221 and Mir222 are annotated with high confidence. 26 Expression of only some promoters of the Klf6 and BC016423 genes correlates with Mir22 expression, but we included them because they have predicted triplexes with three different enhancers. We reconstructed regulatory networks for these miRNAs based on miRNA-enhancer-gene trios and investigated their potential role in the response to Mtb infection. The Mir22-activated gene network is highly likely to be involved in the Mtb response. Klf6, a potential target of Mir22, is a transcription factor essential for macrophage motility 27 and plays an important role in the regulation of macrophage polarization, promoting the M1 phenotype cooperatively with NF-κB. 28 The Nfkb1 gene, a putative target of miRNAs Mir221, Mir222 and Mir155, encodes a subunit of the NF-κB protein complex, a master transcription factor in macrophage immune responses.
The human ortholog of Ube2d3, potentially regulated by the same miRNAs, facilitates polyubiquitination of NFKBIA (a member of the NF-kappa-B inhibitor family), stimulating its subsequent degradation. 29 Peli1, a detected target of both Mir22 and Mir221, regulates NF-κB activity negatively and attenuates the induction of proinflammatory cytokines in T-cells. 30 Cxcl1 and Cxcl2, detected targets of both Mir22 and Mir221, are chemoattractants for neutrophils contributing to tissue inflammation. 31 Malt1, potentially up-regulated by Mir155, is known to activate NF-κB in lymphocytes. 32 Among the targets of Mir1956 detected in the M2 macrophage dataset we found the Ccrl2 gene, encoding a chemokine receptor-like protein which is expressed at high levels in primary neutrophils and primary monocytes. Another Mir1956 target, Dot1l, is an H3K79 methyltransferase which facilitates the expression of IL-6 and IFN-β in macrophages. 33 Cd14 leads to NF-κB activation and inflammatory response; 34 Cd14 KO mice infected with Mtb are protected due to reduced inflammatory responses at the chronic stage. 35 Rab20 plays a role in the maturation and acidification of phagosomes and the fusion of phagosomes with lysosomes during mycobacterial infection. 36 Ticam1 is involved in native immunity against pathogens: it interacts specifically with toll-like receptor 3 and activates NF-κB. 37 Tnfaip3 (A20) is an important regulatory protein that down-regulates NF-κB activity. 38

Relationships are depicted with arrows of a different colour: green for activation, red for repression. Dashed arrows indicate predicted relationships; solid arrows indicate relationships known from published studies and described previously. Full coordinates of enhancers are available in Table S1.
We further investigated protein networks associated with the selected miRNAs. Proteins regulated by miRNAs Mir22, Mir155, Mir221 and Mir222 are involved in the regulation of NF-κB, a vital orchestrator of the response of innate immune cells to pathogens 39 (Figure 3A). Cxcl1 and Cxcl2, regulated by Mir22 and Mir155 (Figure 3B), are chemokines which signal through CXC receptor 2 to attract neutrophils to the site of inflammation, which is essential to control tissue infection. 31

Discussion
MIREyA aims to easily find candidate activating miRNAs which trigger an enhancer, together with the genes up-regulated by that enhancer. Emerging methods for studying the RNA-DNA interactome detect previously unknown miRNAs bound to chromatin, 40,41 which might be promising for further experimental investigation aimed at understanding complex gene regulation networks in detail. Although high-throughput data on RNA:DNA interactions are now available for several cell types (but not macrophages), we could not confirm the miRNAs detected with MIREyA for Mtb infection using either RADICL-seq 41 or GRID-seq, 42 which is unsurprising since ncRNA:DNA interactions are highly cell-type-specific. 41 Although the implemented algorithm considers multiple aspects of the suggested mechanism of gene activation, several factors remain unaccounted for. To run MIREyA, a user requires a priori knowledge of enhancer-gene interactions. A common fast computational approach to determine gene-enhancer pairs is to use genes and enhancers co-localized in the genome, ignoring long-distance spatial interactions. One solution is the approach suggested previously: 25 calculating the correlation between the expression of genes and enhancers which belong to the same TAD.
Although the exact mechanism of RNA activation of enhancers remains unclear, we do know that miRNA as a mediator facilitates epigenetic modifications in the enhancer region. At this stage, MIREyA does not consider chromatin availability or any other epigenetic information. In order to reduce the number of false positive predictions, nuclear localisation of specific predicted miRNAs should be validated experimentally.
Despite the discussed limitations, using MIREyA we detected several promising miRNA candidates. We suggest that MIREyA provides a promising approach to select miRNAs which up-regulate genes by triggering enhancers for further experimental validation.

Conclusion
Our method extends the study of activational miRNAs and provides a basis for further research. The use case on Mtb-infected macrophages suggests that novel miRNAs up-regulating gene expression may exist.

Underlying data
The CAGE time series expression datasets from the use case are available at https://fantom.gsc.riken.jp/data/ in mm9 Phase 2 release and can be selected to download with FANTOM5 This project contains the following extended data: -

Open Peer Review
However, the manuscript lacks this important part. Though the authors applied the pipeline to primary murine macrophages infected by Mycobacterium tuberculosis and detected some miRNAs could up-regulate genes related to immune responses, these predicted results need to be proved by experiments. At present, we can't determine whether these regulatory relationships are correct, so the reliability of MIREyA can not be proved. The authors can use the published miRNAs with activation function, such as MiR-24-1, miR-744 and MiRNA let-7i, to verify the reliability of the pipeline.
1. Are there any miRNAs coming from primary murine macrophages affected by Mycobacterium tuberculosis that can bind to enhancers but reduce gene expression?
2. "keep only enhancers with the percent identity (PI) > 0.5 (PI defined as a percent of matches between miRNA and DNA region)". Will 0.5 be too low to increase false positives? What is the basis for the author to choose the parameter?
3. Are there any other sRNAs that share very similar sequence to miRNA and express even higher than miRNA in cells? These sRNAs may be involved in the regulation of enhancers too. Whether authors can improve the software to find these co-regulated sRNAs from high throughput sequence data, so as to build a more complete regulatory network.

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? Yes
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? Yes

Thank you for leaving the review. As for the questions:
A newly developed software needs to prove its function is correct before it is applied. However, the manuscript lacks this important part. Though the authors applied the pipeline to primary murine macrophages infected by Mycobacterium tuberculosis and detected some miRNAs could up-regulate genes related to immune responses, these predicted results need to be proved by experiments. At present, we can't determine whether these regulatory relationships are correct, so the reliability of MIREyA can not be proved. The authors can use the published miRNAs with activation function, such as MiR-24-1, miR-744 and MiRNA let-7i, to verify the reliability of the pipeline.

○
To find already studied miRNAs with this tool, we need an expression dataset for the same cell types and conditions as in the references. MiRNA expression is highly dependent on cell type and conditions, thus any other expression dataset would have a different set of expressed miRNAs which might not include miRNAs studied in the papers. Another limitation is that the expression dataset must include at least 8 samples so that correlation analysis would make sense. For now we are not aware of such a complete dataset.
In the case of MiR-24-1, there was no expression dataset to run our pipeline, but the first step of the algorithm (miRNA/enhancer interaction prediction) would automatically find the enhancer from the article: the miRNA sequence is apparently present in the sequence of the enhancer in reference 3, as the enhancer contains the gene sequence of the miRNA. We found an article (doi.org/10.1038/srep12987) reporting gene activation by miR-744 and an article (doi.org/10.1073/pnas.1803384115) reporting similar activation by miR-744, but the authors did not demonstrate that these miRNAs do that through enhancer activation. The lack of a known enhancer in a regulatory circuit makes the use of MIREyA impossible.
Are there any miRNAs coming from primary murine macrophages affected by Mycobacterium tuberculosis that can bind to enhancers but reduce gene expression?
○ Since we focused on the activatory role of miRNA, we believe that the analysis of repressor miRNAs bound to enhancers is beyond the scope of the current paper.
"keep only enhancers with the percent identity (PI) > 0.5 (PI defined as a percent of matches between miRNA and DNA region)". Will 0.5 be too low to increase false positives? What is the basis for the author to choose the parameter?

○
We believe that 0.5 is high enough. We chose the threshold by comparison with experimentally validated examples of miRNA interaction with RNA, where a minimum of 11 of 22 miRNA bases represented an exact match. We hypothesized that such a PI would be sufficient for miRNA interaction with DNA.
Are there any other sRNAs that share very similar sequence to miRNA and express even higher than miRNA in cells? These sRNAs may be involved in the regulation of enhancers too. Whether authors can improve the software to find these co-regulated sRNAs from high throughput sequence data, so as to build a more complete regulatory network.
○ So far we have not seen any article describing sRNAs with similar to miRNA properties which are able to regulate enhancers. We believe that for now this suggestion is beyond the scope of this work.
the correlation test? At least 3 line expression data about Ccdc59 in the example expression data data/DE_gene_expression.tsv, which one is used in the correlation test?
I could not reproduce the first example 1 on page 5: 3.

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? Partly
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? Partly the future. Some genes have multiple copies across the genome, when calculating the correlation between gene expression and enhancer/miRNA, will copy/genomic coordinate be given to the correlation test? At least 3 line expression data about Ccdc59 in the example expression data data/DE_gene_expression.tsv, which one is used in the correlation test? 1.
The expression dataset we use was obtained by the CAGE-seq technique and represents promoters, not genes. It has been shown previously that on average each human gene has 5 promoters. So in our dataset "copies" represent different promoters of genes. The pipeline will consider them separately. If one has a similar dataset, they may choose a maximum correlation coefficient per gene after running the pipeline. Alternatively, it is possible to sum promoter expression into gene expression from the beginning and pass it to the pipeline using the flag -ge. In the example dataset we used promoters separately.
I could not reproduce the first example 1 on page 5:

Michiel Jan Laurens De Hoon
RIKEN Center for Integrative Medical Sciences, Yokohama, Japan

Summary:
This manuscript describes a novel software tool to detect interactions between miRNAs, enhancers, and the genes they regulate. Though the analysis performed by this software pipeline seems straightforward, it is a welcome addition as a bioinformatics tool, as the activation of enhancers by miRNAs has not been studied in much detail compared to the well-known mechanism of gene regulation by miRNAs via the RNAi pathway.

Comments and suggestions for improvement:
○ Some of the references to the scientific literature are appropriate but quite old (for example, references 1, 9, and 26). Can these references be updated?
○ On page 4, I assume that "percent identity (PI) > 0.5" should be "percent identity (PI) > 50".
○ On page 5, the format and contents of the output file mir_enh_gene_trios.tsv are not clearly explained.
○ In Table 2, should a column with the number of enhancers be added (same as in Table 1)?

Optional:
In the results, what is the evidence that miRNA-gene interactions shown in Table 1  Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? Partly
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? Partly

We replaced reference 1 with a reference to a more recent review. The 9th reference is the original article of the miRanda tool, which is rather old but still functioning and suitable for our needs. The 26th reference is the most recent (2014) article to cite for the miRBase database.
On page 4, I assume that "percent identity (PI) > 0.5" should be "percent identity (PI) > 50".
○ We replaced 0.5 with 50 in the text of the article and we replaced PIs in the algorithm with values from 0 to 100 instead of 0 to 1.
On page 5, the format and contents of the output file mir_enh_gene_trios.tsv is not clearly explained.

○
We added the following explanation to the main text of the article: One line in the output file corresponds to one trio of a mature miRNA, an enhancer and a gene ("mature_mirna", "Gene.Name", "enhancer" columns). Columns "corr(miRNA, gene)" and "p.value adj" correspond to Spearman's correlation coefficient of the miRNA and the gene expression and FDR respectively. "PI" column corresponds to percent identity and is present only in the output of seed_match_needle approach. In Table 2, should a column with the number of enhancers be added (same as in Table 1)? We added a column with the number of enhancers to table 2.

Optional:
In the results, what is the evidence that miRNA-gene interactions shown in Table 1 and 2 are true? If no such evidence is available, can the pipeline be used to rediscover the miRNA/enhancer/gene interactions described in reference 3, 4, and 5? I understand that this may be difficult as currently there are not many examples available together with suitable expression data sets, but perhaps it can be shown that at least the first step in the pipeline (detecting miRNA-enhancer interactions by Needle, MiRanda, or Triplexator) gives results consistent with what is described in these three references? That would make the paper more convincing.

○
As for the evidence for the miRNA-gene interactions shown in Tables 1 and 2, we do understand that the output of our pipeline needs to be validated experimentally. Rediscovering the miRNA/enhancer/gene interactions described in references 3, 4, and 5 is impossible since there is no expression dataset available in these references. MiRNA expression is highly dependent on cell type and conditions, thus any other expression dataset would have a different set of expressed miRNAs which might not include the miRNAs studied in the papers.
Repeating the first step (miRNA/enhancer interaction prediction) in the pipeline is impossible for references 4 and 5, since they describe other mechanisms of gene activation by miRNA without enhancer activation. The miRNA MiR-24-1 is apparently present in the sequence of the enhancer in reference 3, because the enhancer contains the gene sequence of the miRNA as described in the article. Therefore an alignment of the miRNA to the enhancer will automatically be successful.
As a general point: Are both Python and R needed to run the pipeline? Could the pipeline be designed such that only Python or only R are needed?

○
We started developing our pipeline as a set of R and bash scripts and added Python later due to the need of Biopython functions to work with fasta files. We will consider using only Python for MIREyA as a further step.