<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="methods-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.135164.2</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Method Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>On the limits of inferring biophysical parameters of RBP-RNA interactions from in vitro RNA Bind&#x2019;n Seq data</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 2; peer review: 1 approved, 2 approved with reservations, 1 not approved]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Schlusser</surname>
                        <given-names>Niels</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Zavolan</surname>
                        <given-names>Mihaela</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Biozentrum, Universitat Basel, Basel, Basel-Stadt, 4056, Switzerland</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:mihaela.zavolan@unibas.ch">mihaela.zavolan@unibas.ch</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>29</day>
                <month>5</month>
                <year>2024</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2023</year>
            </pub-date>
            <volume>12</volume>
            <elocation-id>742</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>8</day>
                    <month>5</month>
                    <year>2024</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2024 Schlusser N and Zavolan M</copyright-statement>
                <copyright-year>2024</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/12-742/pdf"/>
            <abstract>
                <p>We develop a thermodynamic model describing the binding of RNA binding proteins (RBP) to oligomers 
                    <italic toggle="yes">in vitro.</italic> We apply expectation-maximization to infer the specificity of RBPs, represented as position-specific weight matrices (PWMs), by maximizing the likelihood of RNA Bind&#x2019;n Seq data from the ENCODE project. Analyzing these public data we find sequence motifs that can partly explain the data for more than half of the studied 111 RBPs, and for 48 of the proteins these motifs are consistent with the known specificity. Our code is publicly available, facilitating analysis of RBP binding data.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>Systems biology</kwd>
                <kwd>bioinformatics</kwd>
                <kwd>computational biology</kwd>
                <kwd>machine learning</kwd>
                <kwd>maximum entropy method</kwd>
                <kwd>Bayesian statistics</kwd>
                <kwd>RNA binding proteins</kwd>
                <kwd>RNA Bind'n'Seq</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="http://dx.doi.org/10.13039/501100001711">
                    <funding-source>Schweizerischer Nationalfonds zur F&#x00f6;rderung der Wissenschaftlichen Forschung</funding-source>
                    <award-id>310030_204517</award-id>
                </award-group>
                <funding-statement>This work was funded by the Swiss National Fund under grant number 310030_204517</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
        <notes>
            <sec sec-type="version-changes">
                <label>Revised</label>
                <title>Amendments from Version 1</title>
                <p>Following the suggestion of Johannes S&#x00f6;ding, we reduced the degree of the Markov model used to compute the sequence frequency priors. Consequently, we could confirm many more of the binding motifs obtained from our model with data from other sources. Therefore, we decided to expand the application of our model from just a few examples of public RNA Bind'n Seq data sets (RBFOX2, CELF1, hnRNPD, hnRNPK, MBNL1, hnRNPL, FUS, TAF15) to the entirety of publicly available Bind'n Seq data sets (111 RBPs in total on ENCODE). As a result, the run statistics are much more comprehensive (Fig. 1), and we added Fig. 2 as a comparison of binding affinities and the respective motifs. Beyond that, we expanded the number of discussed RBP examples (Subsec. 4.1-4.14, Fig. 3-15). A derivation of relative binding affinities from our model terminology is appended to Sec. 2 and serves as a measure of comparison between PWMs of different length instead of the previous BIC and AIC. Following the suggestions of the reviewers, we clarified the derivation of the model (Sec. 2) and the EM procedure (Subsec. 3.2). We reflect these changes in slight changes to the abstract, which highlight the different outcome compared to version 1. We removed Appendix A since a list of the data sets used is already given in the metadata of the article.</p>
            </sec>
        </notes>
    </front>
    <body>
        <sec id="sec1" sec-type="intro">
            <title>1. Introduction</title>
            <p>RNA-binding proteins (RBPs) interact with RNAs at every step of their life cycle. Due to their modular structure, usually consisting of an assortment of RNA-binding domains, RBPs interact with both RNAs and proteins, and couple various layers of gene expression.
                <sup>
                    <xref ref-type="bibr" rid="ref1">1</xref>
                </sup> While 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:mo>&#x223c;</mml:mo>
                        <mml:mn>2500</mml:mn>
                    </mml:math>
                </inline-formula> RBPs are currently known, most remain to be functionally characterized. A first step in this process is to determine the interaction partners and the sequence/structure specificity of the RBP. Many RBPs recognize their targets in a sequence-specific manner, although the accessibility of binding sites within the targets also plays a role.
                <sup>
                    <xref ref-type="bibr" rid="ref2">2</xref>
                </sup> The sequence specificity is usually represented by a position weight matrix (PWM), which specifies the probability of finding each of the four nucleotides at each position in the RBP binding site. This is an obvious simplification, as dependencies between positions in the binding site likely occur. However, training more complex models requires substantially more data, which are often not available. Moreover, at least for another class of nucleic acid binding proteins, the transcription factors, more complex models improve binding site prediction only modestly.
                <sup>
                    <xref ref-type="bibr" rid="ref3">3</xref>
                </sup> With the realization that the presence of canonical RNA-binding domains is not necessary for a protein to bind RNAs
                <sup>
                    <xref ref-type="bibr" rid="ref4">4</xref>
                </sup> came a pressing need to identify the determinants of RNA-RBP interactions and the sequence/structure specificity of the proteins newly found to interact with RNAs.</p>
            <p>The past two decades have seen the development and broad application of experimental methods for RBP target identification. They include 
                <italic toggle="yes">in vivo</italic> high-throughput approaches such as HITS-CLIP, PAR-CLIP, iCLIP and eCLIP (reviewed in Refs. 
                <xref ref-type="bibr" rid="ref5">5</xref>, 
                <xref ref-type="bibr" rid="ref6">6</xref>), and more recently developed 
                <italic toggle="yes">in vitro</italic> approaches such as RNA Bind&#x2019;n Seq.
                <sup>
                    <xref ref-type="bibr" rid="ref7">7</xref>
                </sup> While the CLIP methods rely on the sequencing of RNAs that interact with and can therefore be crosslinked to RBPs 
                <italic toggle="yes">in vivo</italic>, RNA Bind&#x2019;n Seq relies on the affinity-dependent interaction of RBPs with random RNAs 
                <italic toggle="yes">in vitro.</italic> The oligonucleotides selected in this experiment are computationally analyzed to identify short sequence motifs that mediate the interaction with the RBP. So far, analyses of such data involved the identification of enriched kmers (short oligonucleotide sequences of a specified length, 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:mi>k</mml:mi>
                    </mml:math>
                </inline-formula>), followed by a greedy alignment procedure that yields PWM representations of the RBP binding motifs. This left open the question of whether the derived PWMs accurately predict the interaction energies of RBPs with their binding sites. In contrast, the aim of our work was to develop a biophysics-anchored method to directly infer the PWMs from RNA Bind&#x2019;n Seq data. Our paper is organized as follows: 
                <xref ref-type="sec" rid="sec2">Sec. 2</xref> explains how we derive our thermodynamic model. We comment on the practical implementation of this model in 
                <xref ref-type="sec" rid="sec5">Sec. 3</xref>, where we also explain how we account for sequence composition biases in the pool of oligomers. Results for different RBPs are presented in 
                <xref ref-type="sec" rid="sec8">Sec. 4</xref>, where we also comment on the accuracy of the results obtained from this type of data for different RBPs. Concluding remarks are given in 
                <xref ref-type="sec" rid="sec24">Sec. 5</xref>.</p>
        </sec>
        <sec id="sec2">
            <title>2. Model</title>
            <p>Our model is an adaptation of a Bayesian, thermodynamic model that was constructed to infer di-nucleotide weight tensors from SELEX data.
                <sup>
                    <xref ref-type="bibr" rid="ref8">8</xref>
                </sup> In the following, we derive the log-likelihood of RNA Bind&#x2019;n Seq data given the PWM for the RBP of interest, which will be inferred by expectation-maximization as described in 
                <xref ref-type="sec" rid="sec5">Sec. 3</xref>.</p>
            <sec id="sec3">
                <title>2.1 Derivation of the data likelihood</title>
                <p>We assume that an RBP binds an oligomer over a binding site 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>s</mml:mi>
                        </mml:math>
                    </inline-formula> of length 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>w</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> and that the likelihood of binding taking place, according to Boltzmann&#x2019;s law, goes as 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x221d;</mml:mo>
                            <mml:mo>exp</mml:mo>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:msubsup>
                                    <mml:mo>&#x2211;</mml:mo>
                                    <mml:mrow>
                                        <mml:mi>i</mml:mi>
                                        <mml:mo>=</mml:mo>
                                        <mml:mn>1</mml:mn>
                                    </mml:mrow>
                                    <mml:msub>
                                        <mml:mi>L</mml:mi>
                                        <mml:mi>w</mml:mi>
                                    </mml:msub>
                                </mml:msubsup>
                                <mml:msubsup>
                                    <mml:mi>E</mml:mi>
                                    <mml:mi>i</mml:mi>
                                    <mml:msub>
                                        <mml:mi>s</mml:mi>
                                        <mml:mi>i</mml:mi>
                                    </mml:msub>
                                </mml:msubsup>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>&#x2261;</mml:mo>
                            <mml:msup>
                                <mml:mi mathvariant="normal">e</mml:mi>
                                <mml:mrow>
                                    <mml:mi>E</mml:mi>
                                    <mml:mrow>
                                        <mml:mo stretchy="true">(</mml:mo>
                                        <mml:mi>s</mml:mi>
                                        <mml:mo stretchy="true">)</mml:mo>
                                    </mml:mrow>
                                </mml:mrow>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>, where 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>s</mml:mi>
                                <mml:mi>i</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> is the nucleotide at position 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>i</mml:mi>
                        </mml:math>
                    </inline-formula>, so 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>s</mml:mi>
                                <mml:mi>i</mml:mi>
                            </mml:msub>
                            <mml:mo>&#x2208;</mml:mo>
                            <mml:mrow>
                                <mml:mo stretchy="true">{</mml:mo>
                                <mml:mi mathvariant="normal">A</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:mi mathvariant="normal">C</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:mi mathvariant="normal">G</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:mi mathvariant="normal">T</mml:mi>
                                <mml:mo stretchy="true">}</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula>. Therefore, each element of the position weight matrix (PWM) 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>M</mml:mi>
                        </mml:math>
                    </inline-formula> can be identified with 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msubsup>
                                <mml:mi>m</mml:mi>
                                <mml:mi>i</mml:mi>
                                <mml:mi>&#x03b1;</mml:mi>
                            </mml:msubsup>
                            <mml:mo>&#x2261;</mml:mo>
                            <mml:mo>exp</mml:mo>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:msubsup>
                                    <mml:mi>E</mml:mi>
                                    <mml:mi>i</mml:mi>
                                    <mml:mi>&#x03b1;</mml:mi>
                                </mml:msubsup>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula>, with columns being normalized as 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mo>&#x2211;</mml:mo>
                                <mml:mi>&#x03b1;</mml:mi>
                            </mml:msub>
                            <mml:msubsup>
                                <mml:mi>m</mml:mi>
                                <mml:mi>i</mml:mi>
                                <mml:mi>&#x03b1;</mml:mi>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:mn>1</mml:mn>
                            <mml:mspace width="1em"/>
                            <mml:mo>&#x2200;</mml:mo>
                            <mml:mspace width="0.5em"/>
                            <mml:mi>i</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:mn>1</mml:mn>
                            <mml:mo>,</mml:mo>
                            <mml:mo>&#x2026;</mml:mo>
                            <mml:mo>,</mml:mo>
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>w</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula>.</p>
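                <p>As an illustration of these definitions, the following minimal Python sketch normalizes the columns of a PWM and evaluates the Boltzmann weight of a candidate site, together with the corresponding energy. The function names and the example matrix are hypothetical and chosen only for illustration; they are not taken from our published code.</p>
                <code language="python">
```python
import math

NUCLEOTIDES = "ACGT"

def normalize_columns(raw):
    """Scale each PWM column so that its four weights sum to 1."""
    pwm = []
    for col in raw:
        total = sum(col[n] for n in NUCLEOTIDES)
        pwm.append({n: col[n] / total for n in NUCLEOTIDES})
    return pwm

def site_weight(pwm, site):
    """Boltzmann weight exp(E(s)): product over positions of m_i^{s_i}."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal the PWM length L_w")
    weight = 1.0
    for col, nt in zip(pwm, site):
        weight *= col[nt]
    return weight

def site_energy(pwm, site):
    """E(s): sum over positions of E_i^{s_i}, with E_i^alpha = log m_i^alpha."""
    return sum(math.log(col[nt]) for col, nt in zip(pwm, site))

# Hypothetical 3-column matrix of unnormalized weights, for illustration only.
raw = [
    {"A": 1, "C": 1, "G": 7, "T": 1},
    {"A": 1, "C": 7, "G": 1, "T": 1},
    {"A": 7, "C": 1, "G": 1, "T": 1},
]
pwm = normalize_columns(raw)
w = site_weight(pwm, "GCA")
print(w, site_energy(pwm, "GCA"))
```
                </code>
                <p>Because each PWM entry is the exponential of the corresponding energy contribution, multiplying the entries along a site is equivalent to exponentiating the summed energies, which is why the two functions agree up to rounding.</p>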
                <p>Additionally, we account for the fact that there are genuinely two different ways of binding, sequence-specific binding as described by the PWM, and unspecific binding to RNAs with a probability 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>exp</mml:mo>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula>. Combining these two possibilities, we arrive at the probability of a site 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>s</mml:mi>
                        </mml:math>
                    </inline-formula> being bound by the RBP
                    <disp-formula id="e1">
                        <mml:math display="block">
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mtext>bound</mml:mtext>
                                <mml:mo>|</mml:mo>
                                <mml:mi>s</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:mi>c</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mfrac>
                                <mml:mrow>
                                    <mml:mi>c</mml:mi>
                                    <mml:mrow>
                                        <mml:mo stretchy="true">(</mml:mo>
                                        <mml:msup>
                                            <mml:mi mathvariant="normal">e</mml:mi>
                                            <mml:mrow>
                                                <mml:mi>E</mml:mi>
                                                <mml:mrow>
                                                    <mml:mo stretchy="true">(</mml:mo>
                                                    <mml:mi>s</mml:mi>
                                                    <mml:mo stretchy="true">)</mml:mo>
                                                </mml:mrow>
                                            </mml:mrow>
                                        </mml:msup>
                                        <mml:mo>+</mml:mo>
                                        <mml:msup>
                                            <mml:mi mathvariant="normal">e</mml:mi>
                                            <mml:msub>
                                                <mml:mi>E</mml:mi>
                                                <mml:mn>0</mml:mn>
                                            </mml:msub>
                                        </mml:msup>
                                        <mml:mo stretchy="true">)</mml:mo>
                                    </mml:mrow>
                                </mml:mrow>
                                <mml:mrow>
                                    <mml:mn>1</mml:mn>
                                    <mml:mo>+</mml:mo>
                                    <mml:mi>c</mml:mi>
                                    <mml:mrow>
                                        <mml:mo stretchy="true">(</mml:mo>
                                        <mml:msup>
                                            <mml:mi mathvariant="normal">e</mml:mi>
                                            <mml:mrow>
                                                <mml:mi>E</mml:mi>
                                                <mml:mrow>
                                                    <mml:mo stretchy="true">(</mml:mo>
                                                    <mml:mi>s</mml:mi>
                                                    <mml:mo stretchy="true">)</mml:mo>
                                                </mml:mrow>
                                            </mml:mrow>
                                        </mml:msup>
                                        <mml:mo>+</mml:mo>
                                        <mml:msup>
                                            <mml:mi mathvariant="normal">e</mml:mi>
                                            <mml:msub>
                                                <mml:mi>E</mml:mi>
                                                <mml:mn>0</mml:mn>
                                            </mml:msub>
                                        </mml:msup>
                                        <mml:mo stretchy="true">)</mml:mo>
                                    </mml:mrow>
                                </mml:mrow>
                            </mml:mfrac>
                            <mml:mo>,</mml:mo>
                        </mml:math>
                        <label>(2.1)</label>
                    </disp-formula>where the 1 in the denominator represents the (constant) chance of the RBP being unbound. Assuming that the binding of RBPs to oligomers is not saturated, i.e. 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>c</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:msup>
                                    <mml:mi mathvariant="normal">e</mml:mi>
                                    <mml:mrow>
                                        <mml:mi>E</mml:mi>
                                        <mml:mrow>
                                            <mml:mo stretchy="true">(</mml:mo>
                                            <mml:mi>s</mml:mi>
                                            <mml:mo stretchy="true">)</mml:mo>
                                        </mml:mrow>
                                    </mml:mrow>
                                </mml:msup>
                                <mml:mo>+</mml:mo>
                                <mml:msup>
                                    <mml:mi mathvariant="normal">e</mml:mi>
                                    <mml:msub>
                                        <mml:mi>E</mml:mi>
                                        <mml:mn>0</mml:mn>
                                    </mml:msub>
                                </mml:msup>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>&#x226a;</mml:mo>
                            <mml:mn>1</mml:mn>
                        </mml:math>
                    </inline-formula>, we can linearize 
                    <xref ref-type="disp-formula" rid="e1">(2.1)</xref>
                    <disp-formula id="e2">
                        <mml:math display="block">
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mtext>bound</mml:mtext>
                                <mml:mo>|</mml:mo>
                                <mml:mi>s</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:mi>c</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>&#x2248;</mml:mo>
                            <mml:mi>c</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:msup>
                                    <mml:mi mathvariant="normal">e</mml:mi>
                                    <mml:mrow>
                                        <mml:mi>E</mml:mi>
                                        <mml:mrow>
                                            <mml:mo stretchy="true">(</mml:mo>
                                            <mml:mi>s</mml:mi>
                                            <mml:mo stretchy="true">)</mml:mo>
                                        </mml:mrow>
                                    </mml:mrow>
                                </mml:msup>
                                <mml:mo>+</mml:mo>
                                <mml:msup>
                                    <mml:mi mathvariant="normal">e</mml:mi>
                                    <mml:msub>
                                        <mml:mi>E</mml:mi>
                                        <mml:mn>0</mml:mn>
                                    </mml:msub>
                                </mml:msup>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>.</mml:mo>
                        </mml:math>
                        <label>(2.2)</label>
                    </disp-formula>
                </p>
                <p>Consequently, the probability that an RBP is bound somewhere on a longer oligomer 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>S</mml:mi>
                        </mml:math>
                    </inline-formula> with 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>S</mml:mi>
                            </mml:msub>
                            <mml:mo>&#x2265;</mml:mo>
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>w</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> is
                    <disp-formula id="e3">
                        <mml:math display="block">
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mtext>bound</mml:mtext>
                                <mml:mo>|</mml:mo>
                                <mml:mi>S</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:mi>c</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:munder>
                                <mml:mo movablelimits="false">&#x2211;</mml:mo>
                                <mml:mrow>
                                    <mml:mi>s</mml:mi>
                                    <mml:mo>&#x2208;</mml:mo>
                                    <mml:mi>S</mml:mi>
                                </mml:mrow>
                            </mml:munder>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mtext>bound</mml:mtext>
                                <mml:mo>|</mml:mo>
                                <mml:mi>s</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:mi>c</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>&#x2248;</mml:mo>
                            <mml:mi>c</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:msup>
                                    <mml:mi mathvariant="normal">e</mml:mi>
                                    <mml:mrow>
                                        <mml:mi>E</mml:mi>
                                        <mml:mrow>
                                            <mml:mo stretchy="true">(</mml:mo>
                                            <mml:mi>S</mml:mi>
                                            <mml:mo stretchy="true">)</mml:mo>
                                        </mml:mrow>
                                    </mml:mrow>
                                </mml:msup>
                                <mml:mo>+</mml:mo>
                                <mml:mrow>
                                    <mml:mo stretchy="true">(</mml:mo>
                                    <mml:msub>
                                        <mml:mi>L</mml:mi>
                                        <mml:mi>S</mml:mi>
                                    </mml:msub>
                                    <mml:mo>&#x2212;</mml:mo>
                                    <mml:msub>
                                        <mml:mi>L</mml:mi>
                                        <mml:mi>w</mml:mi>
                                    </mml:msub>
                                    <mml:mo>+</mml:mo>
                                    <mml:mn>1</mml:mn>
                                    <mml:mo stretchy="true">)</mml:mo>
                                </mml:mrow>
                                <mml:mspace width="0.1em"/>
                                <mml:msup>
                                    <mml:mi mathvariant="normal">e</mml:mi>
                                    <mml:msub>
                                        <mml:mi>E</mml:mi>
                                        <mml:mn>0</mml:mn>
                                    </mml:msub>
                                </mml:msup>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>,</mml:mo>
                        </mml:math>
                        <label>(2.3)</label>
                    </disp-formula>where 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msup>
                                <mml:mi mathvariant="normal">e</mml:mi>
                                <mml:mrow>
                                    <mml:mi>E</mml:mi>
                                    <mml:mrow>
                                        <mml:mo stretchy="true">(</mml:mo>
                                        <mml:mi>S</mml:mi>
                                        <mml:mo stretchy="true">)</mml:mo>
                                    </mml:mrow>
                                </mml:mrow>
                            </mml:msup>
                            <mml:mo>&#x2261;</mml:mo>
                            <mml:msub>
                                <mml:mo>&#x2211;</mml:mo>
                                <mml:mrow>
                                    <mml:mi>s</mml:mi>
                                    <mml:mo>&#x2208;</mml:mo>
                                    <mml:mi>S</mml:mi>
                                </mml:mrow>
                            </mml:msub>
                            <mml:msup>
                                <mml:mi mathvariant="normal">e</mml:mi>
                                <mml:mrow>
                                    <mml:mi>E</mml:mi>
                                    <mml:mrow>
                                        <mml:mo stretchy="true">(</mml:mo>
                                        <mml:mi>s</mml:mi>
                                        <mml:mo stretchy="true">)</mml:mo>
                                    </mml:mrow>
                                </mml:mrow>
                            </mml:msup>
                        </mml:math>
                    </inline-formula> is the sum over all possible 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>w</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula>-mers 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>s</mml:mi>
                        </mml:math>
                    </inline-formula> in 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>S</mml:mi>
                        </mml:math>
                    </inline-formula>. The probability of observing a read 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>S</mml:mi>
                        </mml:math>
                    </inline-formula> in the pool of oligomers selected through their interaction with the RBP is
                    <disp-formula id="e4">
                        <mml:math display="block">
                            <mml:mtable displaystyle="true" groupalign="{center}">
                                <mml:mtr>
                                    <mml:mtd>
                                        <mml:mi>P</mml:mi>
                                        <mml:mrow>
                                            <mml:mo stretchy="true">(</mml:mo>
                                            <mml:mi>IP</mml:mi>
                                            <mml:mo>|</mml:mo>
                                            <mml:mi>S</mml:mi>
                                            <mml:mo>,</mml:mo>
                                            <mml:mi>c</mml:mi>
                                            <mml:mo>,</mml:mo>
                                            <mml:mi>M</mml:mi>
                                            <mml:mo>,</mml:mo>
                                            <mml:msub>
                                                <mml:mi>E</mml:mi>
                                                <mml:mn>0</mml:mn>
                                            </mml:msub>
                                            <mml:mo stretchy="true">)</mml:mo>
                                        </mml:mrow>
                                        <mml:mo>=</mml:mo>
                                        <mml:mfrac>
                                            <mml:mrow>
                                                <mml:msub>
                                                    <mml:mi>f</mml:mi>
                                                    <mml:mi>S</mml:mi>
                                                </mml:msub>
                                                <mml:mi>P</mml:mi>
                                                <mml:mrow>
                                                    <mml:mo stretchy="true">(</mml:mo>
                                                    <mml:mtext>bound</mml:mtext>
                                                    <mml:mo>|</mml:mo>
                                                    <mml:mi>S</mml:mi>
                                                    <mml:mo>,</mml:mo>
                                                    <mml:mi>c</mml:mi>
                                                    <mml:mo>,</mml:mo>
                                                    <mml:mi>M</mml:mi>
                                                    <mml:mo>,</mml:mo>
                                                    <mml:msub>
                                                        <mml:mi>E</mml:mi>
                                                        <mml:mn>0</mml:mn>
                                                    </mml:msub>
                                                    <mml:mo stretchy="true">)</mml:mo>
                                                </mml:mrow>
                                            </mml:mrow>
                                            <mml:mrow>
                                                <mml:msub>
                                                    <mml:mo>&#x2211;</mml:mo>
                                                    <mml:mrow>
                                                        <mml:mi>&#x03c3;</mml:mi>
                                                        <mml:mo>&#x2208;</mml:mo>
                                                        <mml:mi>D</mml:mi>
                                                    </mml:mrow>
                                                </mml:msub>
                                                <mml:msub>
                                                    <mml:mi>f</mml:mi>
                                                    <mml:mi>&#x03c3;</mml:mi>
                                                </mml:msub>
                                                <mml:mi>P</mml:mi>
                                                <mml:mrow>
                                                    <mml:mo stretchy="true">(</mml:mo>
                                                    <mml:mtext>bound</mml:mtext>
                                                    <mml:mo>|</mml:mo>
                                                    <mml:mi>&#x03c3;</mml:mi>
                                                    <mml:mo>,</mml:mo>
                                                    <mml:mi>c</mml:mi>
                                                    <mml:mo>,</mml:mo>
                                                    <mml:mi>M</mml:mi>
                                                    <mml:mo>,</mml:mo>
                                                    <mml:msub>
                                                        <mml:mi>E</mml:mi>
                                                        <mml:mn>0</mml:mn>
                                                    </mml:msub>
                                                    <mml:mo stretchy="true">)</mml:mo>
                                                </mml:mrow>
                                            </mml:mrow>
                                        </mml:mfrac>
                                    </mml:mtd>
                                </mml:mtr>
                                <mml:mtr>
                                    <mml:mtd>
                                        <mml:mspace width="11em"/>
                                        <mml:mo>&#x2248;</mml:mo>
                                        <mml:mfrac>
                                            <mml:mrow>
                                                <mml:msub>
                                                    <mml:mi>f</mml:mi>
                                                    <mml:mi>S</mml:mi>
                                                </mml:msub>
                                                <mml:mrow>
                                                    <mml:mo stretchy="true">(</mml:mo>
                                                    <mml:msup>
                                                        <mml:mi mathvariant="normal">e</mml:mi>
                                                        <mml:mrow>
                                                            <mml:mi>E</mml:mi>
                                                            <mml:mrow>
                                                                <mml:mo stretchy="true">(</mml:mo>
                                                                <mml:mi>S</mml:mi>
                                                                <mml:mo stretchy="true">)</mml:mo>
                                                            </mml:mrow>
                                                        </mml:mrow>
                                                    </mml:msup>
                                                    <mml:mo>+</mml:mo>
                                                    <mml:mrow>
                                                        <mml:mo stretchy="true">(</mml:mo>
                                                        <mml:msub>
                                                            <mml:mi>L</mml:mi>
                                                            <mml:mi>S</mml:mi>
                                                        </mml:msub>
                                                        <mml:mo>&#x2212;</mml:mo>
                                                        <mml:msub>
                                                            <mml:mi>L</mml:mi>
                                                            <mml:mi>w</mml:mi>
                                                        </mml:msub>
                                                        <mml:mo>+</mml:mo>
                                                        <mml:mn>1</mml:mn>
                                                        <mml:mo stretchy="true">)</mml:mo>
                                                    </mml:mrow>
                                                    <mml:mspace width="0.1em"/>
                                                    <mml:msup>
                                                        <mml:mi mathvariant="normal">e</mml:mi>
                                                        <mml:msub>
                                                            <mml:mi>E</mml:mi>
                                                            <mml:mn>0</mml:mn>
                                                        </mml:msub>
                                                    </mml:msup>
                                                    <mml:mo stretchy="true">)</mml:mo>
                                                </mml:mrow>
                                            </mml:mrow>
                                            <mml:mrow>
                                                <mml:msub>
                                                    <mml:mo>&#x2211;</mml:mo>
                                                    <mml:mrow>
                                                        <mml:mi>&#x03c3;</mml:mi>
                                                        <mml:mo>&#x2208;</mml:mo>
                                                        <mml:mi>D</mml:mi>
                                                    </mml:mrow>
                                                </mml:msub>
                                                <mml:msub>
                                                    <mml:mi>f</mml:mi>
                                                    <mml:mi>&#x03c3;</mml:mi>
                                                </mml:msub>
                                                <mml:mrow>
                                                    <mml:mo stretchy="true">(</mml:mo>
                                                    <mml:msup>
                                                        <mml:mi mathvariant="normal">e</mml:mi>
                                                        <mml:mrow>
                                                            <mml:mi>E</mml:mi>
                                                            <mml:mrow>
                                                                <mml:mo stretchy="true">(</mml:mo>
                                                                <mml:mi>&#x03c3;</mml:mi>
                                                                <mml:mo stretchy="true">)</mml:mo>
                                                            </mml:mrow>
                                                        </mml:mrow>
                                                    </mml:msup>
                                                    <mml:mo>+</mml:mo>
                                                    <mml:mrow>
                                                        <mml:mo stretchy="true">(</mml:mo>
                                                        <mml:msub>
                                                            <mml:mi>L</mml:mi>
                                                            <mml:mi>&#x03c3;</mml:mi>
                                                        </mml:msub>
                                                        <mml:mo>&#x2212;</mml:mo>
                                                        <mml:msub>
                                                            <mml:mi>L</mml:mi>
                                                            <mml:mi>w</mml:mi>
                                                        </mml:msub>
                                                        <mml:mo>+</mml:mo>
                                                        <mml:mn>1</mml:mn>
                                                        <mml:mo stretchy="true">)</mml:mo>
                                                    </mml:mrow>
                                                    <mml:mspace width="0.1em"/>
                                                    <mml:msup>
                                                        <mml:mi mathvariant="normal">e</mml:mi>
                                                        <mml:msub>
                                                            <mml:mi>E</mml:mi>
                                                            <mml:mn>0</mml:mn>
                                                        </mml:msub>
                                                    </mml:msup>
                                                    <mml:mo stretchy="true">)</mml:mo>
                                                </mml:mrow>
                                            </mml:mrow>
                                        </mml:mfrac>
                                        <mml:mo>,</mml:mo>
                                    </mml:mtd>
                                </mml:mtr>
                            </mml:mtable>
                        </mml:math>
                        <label>(2.4)</label>
                    </disp-formula>with 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>D</mml:mi>
                        </mml:math>
                    </inline-formula> denoting the data set of reads, IP a binary variable indicating whether the read is immunoprecipitated, and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>f</mml:mi>
                                <mml:mi>S</mml:mi>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>S</mml:mi>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula> a frequency prior that corrects for the non-uniform nucleotide composition of the oligomer pool. Note that, owing to the linearization in 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>c</mml:mi>
                        </mml:math>
                    </inline-formula>, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>IP</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>S</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula> is independent of the protein concentration 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>c</mml:mi>
                        </mml:math>
                    </inline-formula>, which appears as an overall prefactor in both numerator and denominator and therefore cancels. 
                    Equation <xref ref-type="disp-formula" rid="e4">(2.4)</xref> is essentially a statement of Bayes&#x2019; theorem, with 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>IP</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>S</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula> being the likelihood that a read 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>S</mml:mi>
                        </mml:math>
                    </inline-formula> is immunoprecipitated, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>S</mml:mi>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:msub>
                                <mml:mi>f</mml:mi>
                                <mml:mi>S</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> being the prior probability of the read 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>S</mml:mi>
                        </mml:math>
                    </inline-formula> in the input, and the denominator providing the overall normalization.</p>
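                <p>The chain from (2.2) to (2.4) can be illustrated with a minimal Python sketch. It is not the implementation used in this work. The toy matrix <monospace>TOY_PWM</monospace>, the read frequencies, the function names, and in particular the additive form of the energy (the sum of matrix entries along the site) are assumptions made only for this illustration.</p>
                <preformat>
```python
import math

# Hypothetical toy PWM of width L_w = 3; entries are per-position energies.
# Assumption of this sketch: E(s) is the sum of the matrix entries along s.
TOY_PWM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "U": 0.0},
    {"A": -1.0, "C": 2.0, "G": 0.0, "U": -1.0},
    {"A": 0.0, "C": -1.0, "G": 1.5, "U": -1.0},
]


def kmer_energy(kmer, pwm):
    """Additive energy E(s) of one L_w-mer (assumed form)."""
    return sum(column[base] for column, base in zip(pwm, kmer))


def bound_weight(read, pwm, e0):
    """Bracket of Eq. (2.3): e^E(S) + (L_S - L_w + 1) * e^E0."""
    width = len(pwm)
    if width > len(read):
        raise ValueError("read is shorter than the binding site")
    n_sites = len(read) - width + 1
    specific = sum(
        math.exp(kmer_energy(read[i:i + width], pwm)) for i in range(n_sites)
    )
    return specific + n_sites * math.exp(e0)


def p_bound(read, c, pwm, e0):
    """Linearized Eq. (2.3); only meaningful while the result is far below 1."""
    return c * bound_weight(read, pwm, e0)


def p_ip(freqs, c, pwm, e0):
    """Eq. (2.4): f_S * P(bound | S), normalized over the data set D."""
    weighted = {read: f * p_bound(read, c, pwm, e0) for read, f in freqs.items()}
    total = sum(weighted.values())
    return {read: w / total for read, w in weighted.items()}


if __name__ == "__main__":
    freqs = {"ACGU": 0.5, "UUUUU": 0.3, "GACGA": 0.2}  # hypothetical f_S
    for c in (1e-3, 1e-6):
        print(c, p_ip(freqs, c, TOY_PWM, e0=-2.0))
```
                </preformat>
                <p>Running the sketch with two different concentrations should give the same normalized probabilities up to floating-point rounding, because the concentration enters numerator and denominator only as a common prefactor.</p>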
                <p>Finally, the log-likelihood of the entire library of oligomers 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>D</mml:mi>
                        </mml:math>
                    </inline-formula> explained by the specific binding to the RBP described by the PWM 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>M</mml:mi>
                        </mml:math>
                    </inline-formula> of length 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>w</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> as well as unspecific binding described by the parameter 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> is given by
                    <disp-formula id="e5">
                        <mml:math display="block">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>&#x2248;</mml:mo>
                            <mml:munder>
                                <mml:mo movablelimits="false">&#x2211;</mml:mo>
                                <mml:mrow>
                                    <mml:mi>S</mml:mi>
                                    <mml:mo>&#x2208;</mml:mo>
                                    <mml:mi>D</mml:mi>
                                </mml:mrow>
                            </mml:munder>
                            <mml:msub>
                                <mml:mi>n</mml:mi>
                                <mml:mi>S</mml:mi>
                            </mml:msub>
                            <mml:mo>log</mml:mo>
                            <mml:mrow>
                                <mml:mo stretchy="true">[</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:msub>
                                            <mml:mi>f</mml:mi>
                                            <mml:mi>S</mml:mi>
                                        </mml:msub>
                                        <mml:mrow>
                                            <mml:mo stretchy="true">(</mml:mo>
                                            <mml:msup>
                                                <mml:mi mathvariant="normal">e</mml:mi>
                                                <mml:mrow>
                                                    <mml:mi>E</mml:mi>
                                                    <mml:mrow>
                                                        <mml:mo stretchy="true">(</mml:mo>
                                                        <mml:mi>S</mml:mi>
                                                        <mml:mo stretchy="true">)</mml:mo>
                                                    </mml:mrow>
                                                </mml:mrow>
                                            </mml:msup>
                                            <mml:mo>+</mml:mo>
                                            <mml:mrow>
                                                <mml:mo stretchy="true">(</mml:mo>
                                                <mml:msub>
                                                    <mml:mi>L</mml:mi>
                                                    <mml:mi>S</mml:mi>
                                                </mml:msub>
                                                <mml:mo>&#x2212;</mml:mo>
                                                <mml:msub>
                                                    <mml:mi>L</mml:mi>
                                                    <mml:mi>w</mml:mi>
                                                </mml:msub>
                                                <mml:mo>+</mml:mo>
                                                <mml:mn>1</mml:mn>
                                                <mml:mo stretchy="true">)</mml:mo>
                                            </mml:mrow>
                                            <mml:mspace width="0.1em"/>
                                            <mml:msup>
                                                <mml:mi mathvariant="normal">e</mml:mi>
                                                <mml:msub>
                                                    <mml:mi>E</mml:mi>
                                                    <mml:mn>0</mml:mn>
                                                </mml:msub>
                                            </mml:msup>
                                            <mml:mo stretchy="true">)</mml:mo>
                                        </mml:mrow>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:msub>
                                            <mml:mo>&#x2211;</mml:mo>
                                            <mml:mrow>
                                                <mml:mi>&#x03c3;</mml:mi>
                                                <mml:mo>&#x2208;</mml:mo>
                                                <mml:mi>D</mml:mi>
                                            </mml:mrow>
                                        </mml:msub>
                                        <mml:msub>
                                            <mml:mi>f</mml:mi>
                                            <mml:mi>&#x03c3;</mml:mi>
                                        </mml:msub>
                                        <mml:mrow>
                                            <mml:mo stretchy="true">(</mml:mo>
                                            <mml:msup>
                                                <mml:mi mathvariant="normal">e</mml:mi>
                                                <mml:mrow>
                                                    <mml:mi>E</mml:mi>
                                                    <mml:mrow>
                                                        <mml:mo stretchy="true">(</mml:mo>
                                                        <mml:mi>&#x03c3;</mml:mi>
                                                        <mml:mo stretchy="true">)</mml:mo>
                                                    </mml:mrow>
                                                </mml:mrow>
                                            </mml:msup>
                                            <mml:mo>+</mml:mo>
                                            <mml:mrow>
                                                <mml:mo stretchy="true">(</mml:mo>
                                                <mml:msub>
                                                    <mml:mi>L</mml:mi>
                                                    <mml:mi>&#x03c3;</mml:mi>
                                                </mml:msub>
                                                <mml:mo>&#x2212;</mml:mo>
                                                <mml:msub>
                                                    <mml:mi>L</mml:mi>
                                                    <mml:mi>w</mml:mi>
                                                </mml:msub>
                                                <mml:mo>+</mml:mo>
                                                <mml:mn>1</mml:mn>
                                                <mml:mo stretchy="true">)</mml:mo>
                                            </mml:mrow>
                                            <mml:mspace width="0.1em"/>
                                            <mml:msup>
                                                <mml:mi mathvariant="normal">e</mml:mi>
                                                <mml:msub>
                                                    <mml:mi>E</mml:mi>
                                                    <mml:mn>0</mml:mn>
                                                </mml:msub>
                                            </mml:msup>
                                            <mml:mo stretchy="true">)</mml:mo>
                                        </mml:mrow>
                                    </mml:mrow>
                                </mml:mfrac>
                                <mml:mo stretchy="true">]</mml:mo>
                            </mml:mrow>
                            <mml:mo>,</mml:mo>
                        </mml:math>
                        <label>(2.5)</label>
                    </disp-formula>where 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>n</mml:mi>
                                <mml:mi>S</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> is the number of copies of read 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>S</mml:mi>
                        </mml:math>
                    </inline-formula> in the library.</p>
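                <p>As an illustration only, the log-likelihood in <xref ref-type="disp-formula" rid="e5">(2.5)</xref> could be evaluated along the following lines. This is a minimal sketch under stated assumptions, not the implementation used in this work: the function name and the dictionary keys are hypothetical, the energies <italic toggle="yes">E</italic>(<italic toggle="yes">S</italic>) are assumed to be precomputed and passed in as plain numbers, and the outer sum over reads weighted by their copy numbers is assumed to have the same form as in the expressions of the following section.</p>
                <code language="python">
```python
import math


def log_likelihood(reads, e0, l_w):
    """Sketch of Eq. (2.5): sum over reads of n_S * log(normalised weight).

    `reads` is a list of dicts with hypothetical keys:
      "n": copy number n_S of the read in the library
      "f": the factor f_S of the read
      "E": energy E(S) of the read (assumed precomputed)
      "L": read length L_S
    `e0` is E_0 and `l_w` is the PWM length L_w.
    """
    def weight(r):
        # f_S * (e^{E(S)} + (L_S - L_w + 1) * e^{E_0})
        n_positions = r["L"] - l_w + 1
        return r["f"] * (math.exp(r["E"]) + n_positions * math.exp(e0))

    # Denominator of Eq. (2.5): the same weight summed over all reads in D.
    z = sum(weight(r) for r in reads)
    return sum(r["n"] * math.log(weight(r) / z) for r in reads)


# Two reads with identical properties each receive normalised weight 1/2.
example_reads = [
    {"n": 1, "f": 0.5, "E": 0.0, "L": 20},
    {"n": 1, "f": 0.5, "E": 0.0, "L": 20},
]
example_ll = log_likelihood(example_reads, e0=-1.0, l_w=5)
```
                </code>
                <p>In this toy input both reads are indistinguishable, so each contributes log(1/2) regardless of the values chosen for the energies.</p>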
            </sec>
            <sec id="sec4">
                <title>2.2 Relationship to the dissociation constant 
                    <italic toggle="yes">K</italic>
                    <sub>
                        <italic toggle="yes">D</italic>
                    </sub>
                </title>
                <p>The log-likelihood of binding derived in the previous section can be related to the dissociation constant 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> known from the formalism of chemical reactions as follows. The concentration 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mrow>
                                    <mml:mo stretchy="true">[</mml:mo>
                                    <mml:mi>RBP</mml:mi>
                                    <mml:mo stretchy="true">]</mml:mo>
                                </mml:mrow>
                                <mml:mtext>bound</mml:mtext>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> of RBP bound to a multivalent oligomer with concentration 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mrow>
                                <mml:mo stretchy="true">[</mml:mo>
                                <mml:mi>S</mml:mi>
                                <mml:mo stretchy="true">]</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula> is given by
                    <sup>
                        <xref ref-type="bibr" rid="ref9">9</xref>
                    </sup>
                    <disp-formula id="e6">
                        <mml:math display="block">
                            <mml:msub>
                                <mml:mrow>
                                    <mml:mo stretchy="true">[</mml:mo>
                                    <mml:mi>RBP</mml:mi>
                                    <mml:mo stretchy="true">]</mml:mo>
                                </mml:mrow>
                                <mml:mtext>bound</mml:mtext>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mfrac>
                                <mml:mrow>
                                    <mml:mi>n</mml:mi>
                                    <mml:mrow>
                                        <mml:mo stretchy="true">[</mml:mo>
                                        <mml:mi>S</mml:mi>
                                        <mml:mo stretchy="true">]</mml:mo>
                                    </mml:mrow>
                                </mml:mrow>
                                <mml:mrow>
                                    <mml:mfrac>
                                        <mml:msub>
                                            <mml:mi>K</mml:mi>
                                            <mml:mi mathvariant="normal">D</mml:mi>
                                        </mml:msub>
                                        <mml:mrow>
                                            <mml:mo stretchy="true">[</mml:mo>
                                            <mml:mi>RBP</mml:mi>
                                            <mml:mo stretchy="true">]</mml:mo>
                                        </mml:mrow>
                                    </mml:mfrac>
                                    <mml:mo>+</mml:mo>
                                    <mml:mn>1</mml:mn>
                                </mml:mrow>
                            </mml:mfrac>
                            <mml:mo>,</mml:mo>
                        </mml:math>
                        <label>(2.6)</label>
                    </disp-formula>where 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> is the dissociation constant and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>n</mml:mi>
                        </mml:math>
                    </inline-formula> is the number of binding sites on the oligomer. As a simplification, we treat all binding sites as identical and take their maximum number to be the total number of configurations in which the RBP can contact the oligomer, i.e. 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>S</mml:mi>
                            </mml:msub>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>w</mml:mi>
                            </mml:msub>
                            <mml:mo>+</mml:mo>
                            <mml:mn>1</mml:mn>
                        </mml:math>
                    </inline-formula>.</p>
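                <p>For concreteness, <xref ref-type="disp-formula" rid="e6">(2.6)</xref> with the number of binding sites set to the number of possible binding positions can be written as a short function. This is an illustrative sketch only: the function and argument names are hypothetical, and all concentrations are assumed to be expressed in the same (arbitrary) units as <italic toggle="yes">K</italic><sub>D</sub>.</p>
                <code language="python">
```python
def bound_rbp(conc_s, conc_rbp, k_d, l_s, l_w):
    """Eq. (2.6) with n = L_S - L_w + 1 identical binding sites.

    conc_s   : oligomer concentration [S]
    conc_rbp : RBP concentration [RBP]
    k_d      : dissociation constant K_D
    l_s, l_w : oligomer length L_S and binding-site (PWM) length L_w
    """
    n_sites = l_s - l_w + 1  # number of positions the RBP can occupy
    return n_sites * conc_s / (k_d / conc_rbp + 1.0)


# [RBP] equal to K_D: every site is occupied half of the time.
example_bound = bound_rbp(conc_s=1.0, conc_rbp=10.0, k_d=10.0, l_s=20, l_w=5)
```
                </code>
                <p>With 16 possible binding positions and [RBP] equal to <italic toggle="yes">K</italic><sub>D</sub>, the occupancy factor is 1/2 and the function returns 8 times [S].</p>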
                <p>Further assuming that the probability of observing an individual oligomer in the sequencing pool is proportional to its relative abundance in complex with the RBP (which can only hold when at most one RBP molecule is bound to each oligomer), i.e. 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>S</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>&#x221d;</mml:mo>
                            <mml:msub>
                                <mml:mrow>
                                    <mml:mo stretchy="true">[</mml:mo>
                                    <mml:mi>RBP</mml:mi>
                                    <mml:mo stretchy="true">]</mml:mo>
                                </mml:mrow>
                                <mml:mtext>bound</mml:mtext>
                            </mml:msub>
                        </mml:math>
                    </inline-formula>, we can re-express 
                    <xref ref-type="disp-formula" rid="e5">(2.5)</xref> in terms of 
                    <xref ref-type="disp-formula" rid="e6">(2.6)</xref> as
                    <disp-formula id="e7">
                        <mml:math display="block">
                            <mml:mtable columnalign="left" displaystyle="true">
                                <mml:mtr>
                                    <mml:mtd>
                                        <mml:mo>log</mml:mo>
                                        <mml:mi>P</mml:mi>
                                        <mml:mrow>
                                            <mml:mo stretchy="true">(</mml:mo>
                                            <mml:mi>D</mml:mi>
                                            <mml:mo>|</mml:mo>
                                            <mml:mi>M</mml:mi>
                                            <mml:mo>,</mml:mo>
                                            <mml:msub>
                                                <mml:mi>E</mml:mi>
                                                <mml:mn>0</mml:mn>
                                            </mml:msub>
                                            <mml:mo stretchy="true">)</mml:mo>
                                        </mml:mrow>
                                        <mml:mo>=</mml:mo>
                                        <mml:mi>C</mml:mi>
                                        <mml:mo>+</mml:mo>
                                        <mml:munder>
                                            <mml:mo movablelimits="false">&#x2211;</mml:mo>
                                            <mml:mrow>
                                                <mml:mi>S</mml:mi>
                                                <mml:mo>&#x2208;</mml:mo>
                                                <mml:mi>D</mml:mi>
                                            </mml:mrow>
                                        </mml:munder>
                                        <mml:msub>
                                            <mml:mi>n</mml:mi>
                                            <mml:mi>S</mml:mi>
                                        </mml:msub>
                                        <mml:mo>log</mml:mo>
                                        <mml:mrow>
                                            <mml:mo stretchy="true">(</mml:mo>
                                            <mml:mfrac>
                                                <mml:mrow>
                                                    <mml:mi>n</mml:mi>
                                                    <mml:mrow>
                                                        <mml:mo stretchy="true">[</mml:mo>
                                                        <mml:mi>S</mml:mi>
                                                        <mml:mo stretchy="true">]</mml:mo>
                                                    </mml:mrow>
                                                </mml:mrow>
                                                <mml:mrow>
                                                    <mml:mfrac>
                                                        <mml:msub>
                                                            <mml:mi>K</mml:mi>
                                                            <mml:mi mathvariant="normal">D</mml:mi>
                                                        </mml:msub>
                                                        <mml:mrow>
                                                            <mml:mo stretchy="true">[</mml:mo>
                                                            <mml:mi>RBP</mml:mi>
                                                            <mml:mo stretchy="true">]</mml:mo>
                                                        </mml:mrow>
                                                    </mml:mfrac>
                                                    <mml:mo>+</mml:mo>
                                                    <mml:mn>1</mml:mn>
                                                </mml:mrow>
                                            </mml:mfrac>
                                            <mml:mo stretchy="true">)</mml:mo>
                                        </mml:mrow>
                                    </mml:mtd>
                                </mml:mtr>
                                <mml:mtr>
                                    <mml:mtd>
                                        <mml:mspace width="7.5em"/>
                                        <mml:mo>=</mml:mo>
                                        <mml:mi>C</mml:mi>
                                        <mml:mo>+</mml:mo>
                                        <mml:munder>
                                            <mml:mo movablelimits="false">&#x2211;</mml:mo>
                                            <mml:mrow>
                                                <mml:mi>S</mml:mi>
                                                <mml:mo>&#x2208;</mml:mo>
                                                <mml:mi>D</mml:mi>
                                            </mml:mrow>
                                        </mml:munder>
                                        <mml:msub>
                                            <mml:mi>n</mml:mi>
                                            <mml:mi>S</mml:mi>
                                        </mml:msub>
                                        <mml:mo>log</mml:mo>
                                        <mml:mrow>
                                            <mml:mo stretchy="true">(</mml:mo>
                                            <mml:mfrac>
                                                <mml:mrow>
                                                    <mml:msub>
                                                        <mml:mi>f</mml:mi>
                                                        <mml:mi>S</mml:mi>
                                                    </mml:msub>
                                                    <mml:mrow>
                                                        <mml:mo stretchy="true">(</mml:mo>
                                                        <mml:msub>
                                                            <mml:mi>L</mml:mi>
                                                            <mml:mi>S</mml:mi>
                                                        </mml:msub>
                                                        <mml:mo>&#x2212;</mml:mo>
                                                        <mml:msub>
                                                            <mml:mi>L</mml:mi>
                                                            <mml:mi>w</mml:mi>
                                                        </mml:msub>
                                                        <mml:mo>+</mml:mo>
                                                        <mml:mn>1</mml:mn>
                                                        <mml:mo stretchy="true">)</mml:mo>
                                                    </mml:mrow>
                                                </mml:mrow>
                                                <mml:mrow>
                                                    <mml:mfrac>
                                                        <mml:msub>
                                                            <mml:mi>K</mml:mi>
                                                            <mml:mi mathvariant="normal">D</mml:mi>
                                                        </mml:msub>
                                                        <mml:mrow>
                                                            <mml:mo stretchy="true">[</mml:mo>
                                                            <mml:mi>RBP</mml:mi>
                                                            <mml:mo stretchy="true">]</mml:mo>
                                                        </mml:mrow>
                                                    </mml:mfrac>
                                                    <mml:mo>+</mml:mo>
                                                    <mml:mn>1</mml:mn>
                                                </mml:mrow>
                                            </mml:mfrac>
                                            <mml:mo stretchy="true">)</mml:mo>
                                        </mml:mrow>
                                        <mml:mo>,</mml:mo>
                                    </mml:mtd>
                                </mml:mtr>
                            </mml:mtable>
                        </mml:math>
                        <label>(2.7)</label>
                    </disp-formula>where we used the fact that the concentration 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mrow>
                                <mml:mo stretchy="true">[</mml:mo>
                                <mml:mi>S</mml:mi>
                                <mml:mo stretchy="true">]</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula> of any given oligomer 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>S</mml:mi>
                        </mml:math>
                    </inline-formula> is proportional to 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>f</mml:mi>
                                <mml:mi>S</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula>, and where 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>C</mml:mi>
                        </mml:math>
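                    </inline-formula> is an additive constant that absorbs the proportionality factors and is independent of the binding specificity. Under the assumption that binding is typically of low affinity (<italic toggle="yes">K</italic><sub>D</sub> &#x226b; [RBP]), so that 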
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mfrac>
                                <mml:mn>1</mml:mn>
                                <mml:mrow>
                                    <mml:mfrac>
                                        <mml:msub>
                                            <mml:mi>K</mml:mi>
                                            <mml:mi mathvariant="normal">D</mml:mi>
                                        </mml:msub>
                                        <mml:mrow>
                                            <mml:mo stretchy="true">[</mml:mo>
                                            <mml:mi>RBP</mml:mi>
                                            <mml:mo stretchy="true">]</mml:mo>
                                        </mml:mrow>
                                    </mml:mfrac>
                                    <mml:mo>+</mml:mo>
                                    <mml:mn>1</mml:mn>
                                </mml:mrow>
                            </mml:mfrac>
                            <mml:mo>&#x2248;</mml:mo>
                            <mml:mfrac>
                                <mml:mrow>
                                    <mml:mo stretchy="true">[</mml:mo>
                                    <mml:mi>RBP</mml:mi>
                                    <mml:mo stretchy="true">]</mml:mo>
                                </mml:mrow>
                                <mml:msub>
                                    <mml:mi>K</mml:mi>
                                    <mml:mi mathvariant="normal">D</mml:mi>
                                </mml:msub>
                            </mml:mfrac>
                        </mml:math>
                    </inline-formula>, we arrive at
                    <disp-formula id="e8">
                        <mml:math display="block">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>&#x2248;</mml:mo>
                            <mml:msup>
                                <mml:mi>C</mml:mi>
                                <mml:mo>&#x2032;</mml:mo>
                            </mml:msup>
                            <mml:mo>+</mml:mo>
                            <mml:munder>
                                <mml:mo movablelimits="false">&#x2211;</mml:mo>
                                <mml:mrow>
                                    <mml:mi>S</mml:mi>
                                    <mml:mo>&#x2208;</mml:mo>
                                    <mml:mi>D</mml:mi>
                                </mml:mrow>
                            </mml:munder>
                            <mml:msub>
                                <mml:mi>n</mml:mi>
                                <mml:mi>S</mml:mi>
                            </mml:msub>
                            <mml:mo>log</mml:mo>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:mo stretchy="true">[</mml:mo>
                                        <mml:mi>RBP</mml:mi>
                                        <mml:mo stretchy="true">]</mml:mo>
                                    </mml:mrow>
                                    <mml:msub>
                                        <mml:mi>K</mml:mi>
                                        <mml:mi mathvariant="normal">D</mml:mi>
                                    </mml:msub>
                                </mml:mfrac>
                                <mml:msub>
                                    <mml:mi>f</mml:mi>
                                    <mml:mi>S</mml:mi>
                                </mml:msub>
                                <mml:mrow>
                                    <mml:mo stretchy="true">(</mml:mo>
                                    <mml:msub>
                                        <mml:mi>L</mml:mi>
                                        <mml:mi>S</mml:mi>
                                    </mml:msub>
                                    <mml:mo>&#x2212;</mml:mo>
                                    <mml:msub>
                                        <mml:mi>L</mml:mi>
                                        <mml:mi>w</mml:mi>
                                    </mml:msub>
                                    <mml:mo>+</mml:mo>
                                    <mml:mn>1</mml:mn>
                                    <mml:mo stretchy="true">)</mml:mo>
                                </mml:mrow>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>.</mml:mo>
                        </mml:math>
                        <label>(2.8)</label>
                    </disp-formula>
                </p>
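                <p>The quality of the low-affinity approximation used to obtain <xref ref-type="disp-formula" rid="e8">(2.8)</xref> can be checked numerically. The sketch below, in which all function names are hypothetical, compares the exact occupancy factor with its approximation for a given [RBP] and <italic toggle="yes">K</italic><sub>D</sub>.</p>
                <code language="python">
```python
import math


def occupancy_exact(conc_rbp, k_d):
    """Exact factor 1 / (K_D / [RBP] + 1) appearing in Eq. (2.7)."""
    return 1.0 / (k_d / conc_rbp + 1.0)


def occupancy_low_affinity(conc_rbp, k_d):
    """Low-affinity approximation [RBP] / K_D used in Eq. (2.8)."""
    return conc_rbp / k_d


def relative_error(conc_rbp, k_d):
    """Relative deviation of the approximation from the exact factor."""
    exact = occupancy_exact(conc_rbp, k_d)
    return (occupancy_low_affinity(conc_rbp, k_d) - exact) / exact


# K_D one hundred times larger than [RBP].
example_error = relative_error(conc_rbp=1.0, k_d=100.0)
```
                </code>
                <p>Algebraically, the relative error of the approximation equals [RBP]/<italic toggle="yes">K</italic><sub>D</sub>, so it amounts to 1% when <italic toggle="yes">K</italic><sub>D</sub> is one hundred times [RBP] and grows to 100% when the two are equal.</p>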
                <p>Hence, we can rank interactions by their dissociation constants relative to that obtained for some reference PWM 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msup>
                                <mml:mi>M</mml:mi>
                                <mml:mi>ref</mml:mi>
                            </mml:msup>
                        </mml:math>
                    </inline-formula> and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msubsup>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                                <mml:mi>ref</mml:mi>
                            </mml:msubsup>
                        </mml:math>
                    </inline-formula>:
                    <disp-formula id="e9">
                        <mml:math display="block">
                            <mml:mo>log</mml:mo>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mfrac>
                                    <mml:msub>
                                        <mml:mi>K</mml:mi>
                                        <mml:mi mathvariant="normal">D</mml:mi>
                                    </mml:msub>
                                    <mml:msubsup>
                                        <mml:mi>K</mml:mi>
                                        <mml:mi mathvariant="normal">D</mml:mi>
                                        <mml:mi>ref</mml:mi>
                                    </mml:msubsup>
                                </mml:mfrac>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mfrac>
                                <mml:mn>1</mml:mn>
                                <mml:mrow>
                                    <mml:munder>
                                        <mml:mo movablelimits="false">&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>S</mml:mi>
                                            <mml:mo>&#x2208;</mml:mo>
                                            <mml:mi>D</mml:mi>
                                        </mml:mrow>
                                    </mml:munder>
                                    <mml:msub>
                                        <mml:mi>n</mml:mi>
                                        <mml:mi>S</mml:mi>
                                    </mml:msub>
                                </mml:mrow>
                            </mml:mfrac>
                            <mml:mrow>
                                <mml:mo stretchy="true">[</mml:mo>
                                <mml:mo>log</mml:mo>
                                <mml:mi>P</mml:mi>
                                <mml:mrow>
                                    <mml:mo stretchy="true">(</mml:mo>
                                    <mml:mi>D</mml:mi>
                                    <mml:mo>|</mml:mo>
                                    <mml:msup>
                                        <mml:mi>M</mml:mi>
                                        <mml:mi>ref</mml:mi>
                                    </mml:msup>
                                    <mml:mo>,</mml:mo>
                                    <mml:msubsup>
                                        <mml:mi>E</mml:mi>
                                        <mml:mn>0</mml:mn>
                                        <mml:mi>ref</mml:mi>
                                    </mml:msubsup>
                                    <mml:mo stretchy="true">)</mml:mo>
                                </mml:mrow>
                                <mml:mo>&#x2212;</mml:mo>
                                <mml:mo>log</mml:mo>
                                <mml:mi>P</mml:mi>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                                <mml:mo>+</mml:mo>
                                <mml:munder>
                                    <mml:mo movablelimits="false">&#x2211;</mml:mo>
                                    <mml:mrow>
                                        <mml:mi>S</mml:mi>
                                        <mml:mo>&#x2208;</mml:mo>
                                        <mml:mi>D</mml:mi>
                                    </mml:mrow>
                                </mml:munder>
                                <mml:msub>
                                    <mml:mi>n</mml:mi>
                                    <mml:mi>S</mml:mi>
                                </mml:msub>
                                <mml:mo>log</mml:mo>
                                <mml:mrow>
                                    <mml:mo stretchy="true">(</mml:mo>
                                    <mml:mfrac>
                                        <mml:mrow>
                                            <mml:msub>
                                                <mml:mi>L</mml:mi>
                                                <mml:mi>S</mml:mi>
                                            </mml:msub>
                                            <mml:mo>&#x2212;</mml:mo>
                                            <mml:msub>
                                                <mml:mi>L</mml:mi>
                                                <mml:mi>w</mml:mi>
                                            </mml:msub>
                                            <mml:mo>+</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mrow>
                                            <mml:msub>
                                                <mml:mi>L</mml:mi>
                                                <mml:mi>S</mml:mi>
                                            </mml:msub>
                                            <mml:mo>&#x2212;</mml:mo>
                                            <mml:msubsup>
                                                <mml:mi>L</mml:mi>
                                                <mml:mi>w</mml:mi>
                                                <mml:mi>ref</mml:mi>
                                            </mml:msubsup>
                                            <mml:mo>+</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                    </mml:mfrac>
                                    <mml:mo stretchy="true">)</mml:mo>
                                </mml:mrow>
                                <mml:mo stretchy="true">]</mml:mo>
                            </mml:mrow>
                            <mml:mo>.</mml:mo>
                        </mml:math>
                        <label>(2.9)</label>
                    </disp-formula>
                </p>
                <p>As expected, the result corrects for the library size-dependence of 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula> by dividing by the total foreground reads 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mo>&#x2211;</mml:mo>
                                <mml:mrow>
                                    <mml:mi>S</mml:mi>
                                    <mml:mo>&#x2208;</mml:mo>
                                    <mml:mi>D</mml:mi>
                                </mml:mrow>
                            </mml:msub>
                            <mml:msub>
                                <mml:mi>n</mml:mi>
                                <mml:mi>S</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula>, and for differences in binding domain or read length by the last term. Assuming a random oligomer of length 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msubsup>
                                <mml:mi>L</mml:mi>
                                <mml:mi>w</mml:mi>
                                <mml:mi>ref</mml:mi>
                            </mml:msubsup>
                        </mml:math>
                    </inline-formula> for which there is no unspecific binding (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msubsup>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                                <mml:mi>ref</mml:mi>
                            </mml:msubsup>
                            <mml:mo>&#x2192;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mo>&#x221e;</mml:mo>
                        </mml:math>
                    </inline-formula>) as reference allows us to bring 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:msup>
                                    <mml:mi>M</mml:mi>
                                    <mml:mi>ref</mml:mi>
                                </mml:msup>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula> into the simple closed form
                    <disp-formula id="e10">
                        <mml:math display="block">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:msup>
                                    <mml:mi>M</mml:mi>
                                    <mml:mi>ref</mml:mi>
                                </mml:msup>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:munder>
                                <mml:mo movablelimits="false">&#x2211;</mml:mo>
                                <mml:mrow>
                                    <mml:mi>S</mml:mi>
                                    <mml:mo>&#x2208;</mml:mo>
                                    <mml:mi>D</mml:mi>
                                </mml:mrow>
                            </mml:munder>
                            <mml:msub>
                                <mml:mi>n</mml:mi>
                                <mml:mi>S</mml:mi>
                            </mml:msub>
                            <mml:mo>log</mml:mo>
                            <mml:mrow>
                                <mml:mo stretchy="true">[</mml:mo>
                                <mml:mfrac>
                                    <mml:mrow>
                                        <mml:msub>
                                            <mml:mi>f</mml:mi>
                                            <mml:mi>S</mml:mi>
                                        </mml:msub>
                                        <mml:mrow>
                                            <mml:mo stretchy="true">(</mml:mo>
                                            <mml:msub>
                                                <mml:mi>L</mml:mi>
                                                <mml:mi>S</mml:mi>
                                            </mml:msub>
                                            <mml:mo>&#x2212;</mml:mo>
                                            <mml:msubsup>
                                                <mml:mi>L</mml:mi>
                                                <mml:mi>w</mml:mi>
                                                <mml:mi>ref</mml:mi>
                                            </mml:msubsup>
                                            <mml:mo>+</mml:mo>
                                            <mml:mn>1</mml:mn>
                                            <mml:mo stretchy="true">)</mml:mo>
                                        </mml:mrow>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:msub>
                                            <mml:mo>&#x2211;</mml:mo>
                                            <mml:mrow>
                                                <mml:mi>&#x03c3;</mml:mi>
                                                <mml:mo>&#x2208;</mml:mo>
                                                <mml:mi>D</mml:mi>
                                            </mml:mrow>
                                        </mml:msub>
                                        <mml:msub>
                                            <mml:mi>f</mml:mi>
                                            <mml:mi>&#x03c3;</mml:mi>
                                        </mml:msub>
                                        <mml:mrow>
                                            <mml:mo stretchy="true">(</mml:mo>
                                            <mml:msub>
                                                <mml:mi>L</mml:mi>
                                                <mml:mi>&#x03c3;</mml:mi>
                                            </mml:msub>
                                            <mml:mo>&#x2212;</mml:mo>
                                            <mml:msubsup>
                                                <mml:mi>L</mml:mi>
                                                <mml:mi>w</mml:mi>
                                                <mml:mi>ref</mml:mi>
                                            </mml:msubsup>
                                            <mml:mo>+</mml:mo>
                                            <mml:mn>1</mml:mn>
                                            <mml:mo stretchy="true">)</mml:mo>
                                        </mml:mrow>
                                    </mml:mrow>
                                </mml:mfrac>
                                <mml:mo stretchy="true">]</mml:mo>
                            </mml:mrow>
                            <mml:mo>.</mml:mo>
                        </mml:math>
                        <label>(2.10)</label>
                    </disp-formula>
                </p>
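                <p>As a minimal sketch, the right-hand side of (2.10) can be evaluated as shown here. The function name <monospace>reference_log_likelihood</monospace>, its argument names and the toy numbers are illustrative assumptions, not part of any published implementation; the library is represented as parallel lists over distinct reads.</p>
                <p>
                    <code language="python"><![CDATA[
```python
import math


def reference_log_likelihood(counts, priors, read_lengths, ref_site_length=5):
    """Evaluate the right-hand side of (2.10) for a library of distinct reads.

    counts        -- copy numbers n_S (hypothetical argument name)
    priors        -- frequency priors f_S
    read_lengths  -- read lengths L_S
    """
    # Weight of each read: prior frequency times the number of positions
    # at which a site of length ref_site_length fits on the read.
    weights = [f * (L - ref_site_length + 1) for f, L in zip(priors, read_lengths)]
    if any(w <= 0 for w in weights):
        raise ValueError("each read needs a positive prior and must fit the reference site")
    total = sum(weights)
    return sum(n * math.log(w / total) for n, w in zip(counts, weights))


# Toy input (invented numbers): three distinct 20mers with equal priors.
example = reference_log_likelihood([2, 1, 1], [0.2, 0.2, 0.2], [20, 20, 20])
print(example)
```
]]></code>
                </p>
                <p>With equal priors and equal read lengths, every read carries the same weight, so each of the four reads in the toy input contributes log(1/3).</p>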
                <p>Combining 
                    <xref ref-type="disp-formula" rid="e9">(2.9)</xref> with 
                    <xref ref-type="disp-formula" rid="e5">(2.5)</xref>, the output of the optimization procedure, and the reference 
                    <xref ref-type="disp-formula" rid="e10">(2.10)</xref> allows us to compute the logarithm of the dissociation constant of the RBP-RNA binding for a representative RBP binding site relative to a random oligomer of length 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msubsup>
                                <mml:mi>L</mml:mi>
                                <mml:mi>w</mml:mi>
                                <mml:mi>ref</mml:mi>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:mn>5</mml:mn>
                        </mml:math>
                    </inline-formula>, a typical size for an RBP binding site. Relative 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> values also provide a measure for ranking different binding specificities, even of different lengths, for a given RBP.</p>
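                <p>The bracketed expression in (2.9) is simple enough to sketch in a few lines. The snippet assumes that the left-hand side of (2.9) is the logarithm of the dissociation constant relative to the reference; the function name <monospace>log_relative_kd</monospace>, its arguments and all numbers are hypothetical and serve only as an illustration.</p>
                <p>
                    <code language="python"><![CDATA[
```python
import math


def log_relative_kd(log_p_ref, log_p_model, counts, read_lengths,
                    site_length, ref_site_length=5):
    """Right-hand side of (2.9), assumed to equal log(K_D / K_D^ref).

    log_p_ref    -- log-likelihood of the library under the reference model
    log_p_model  -- log-likelihood under the fitted PWM and E_0
    counts       -- copy numbers n_S of the distinct reads
    read_lengths -- read lengths L_S
    """
    total_reads = sum(counts)
    # Last term of (2.9): corrects for different numbers of site positions per read.
    length_term = sum(
        n * math.log((L - site_length + 1) / (L - ref_site_length + 1))
        for n, L in zip(counts, read_lengths)
    )
    return (log_p_ref - log_p_model + length_term) / total_reads


# Invented numbers: the fitted model explains the library better than the reference.
print(log_relative_kd(-100.0, -90.0, [2, 1], [20, 20], site_length=5))
```
]]></code>
                </p>
                <p>If the fitted model and the reference have the same log-likelihood and the same site length, the function returns zero; a fitted model with a higher log-likelihood gives a negative value, i.e. a smaller dissociation constant than the reference.</p>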
            </sec>
        </sec>
        <sec id="sec5">
            <title>3. Implementation</title>
            <p>Our goal is to identify the parameters that maximize the likelihood of the library of oligomers given in 
                <xref ref-type="disp-formula" rid="e5">(2.5)</xref>. This is equivalent to optimizing 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:mi>P</mml:mi>
                        <mml:mrow>
                            <mml:mo stretchy="true">(</mml:mo>
                            <mml:mi>D</mml:mi>
                            <mml:mo stretchy="true">)</mml:mo>
                        </mml:mrow>
                    </mml:math>
                </inline-formula>, or rather its logarithm to avoid underflows. While the library 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:mi>D</mml:mi>
                    </mml:math>
                </inline-formula>, the copy number of a read 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>n</mml:mi>
                            <mml:mi>S</mml:mi>
                        </mml:msub>
                    </mml:math>
                </inline-formula>, the read and binding site lengths 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>L</mml:mi>
                            <mml:mi>S</mml:mi>
                        </mml:msub>
                    </mml:math>
                </inline-formula> and 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>L</mml:mi>
                            <mml:mi>w</mml:mi>
                        </mml:msub>
                    </mml:math>
                </inline-formula>, and &#x2013; with some limitations &#x2013; the frequency priors 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>f</mml:mi>
                            <mml:mi>S</mml:mi>
                        </mml:msub>
                    </mml:math>
                </inline-formula> are given by our data, the position-specific binding of the RBP described by the PWM and the position-unspecific binding described by 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>E</mml:mi>
                            <mml:mn>0</mml:mn>
                        </mml:msub>
                    </mml:math>
                </inline-formula> have to be inferred. Ultimately, we want to obtain the PWM, whereas 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>E</mml:mi>
                            <mml:mn>0</mml:mn>
                        </mml:msub>
                    </mml:math>
                </inline-formula> represents a hidden parameter which will be inferred via the expectation-maximization procedure. In principle, this would also apply to the concentration 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:mi>c</mml:mi>
                    </mml:math>
                </inline-formula> but none of our final expressions depend on 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:mi>c</mml:mi>
                    </mml:math>
                </inline-formula> anymore owing to the linearization. Before detailing the implementation of the EM procedure, we comment on how to infer the frequency priors 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>f</mml:mi>
                            <mml:mi>S</mml:mi>
                        </mml:msub>
                    </mml:math>
                </inline-formula>.</p>
            <sec id="sec6">
                <title>3.1 Construction of the frequency priors 
                    <italic toggle="yes">f</italic>
                    <sub>
                        <italic toggle="yes">S</italic>
                    </sub> from a Markov model</title>
                <p>RNA Bind&#x2019;n Seq data comprise not only libraries of reads pulled down by specific RBPs at non-vanishing RBP concentrations, but also libraries of input reads. The oligonucleotides that were used for RBP affinity-based selection were short, typically 20 nucleotides in length (cf. Ref. 
                    <xref ref-type="bibr" rid="ref10">10</xref>). The number of possible 20mers is 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msup>
                                <mml:mn>4</mml:mn>
                                <mml:mn>20</mml:mn>
                            </mml:msup>
                            <mml:mo>&#x2248;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>12</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>, much larger than the library sizes of 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x223c;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>7</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>. Thus, even in the absence of selection (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>c</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:mn>0</mml:mn>
                        </mml:math>
                    </inline-formula>), the expected overlap of two libraries is extremely small.</p>
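                <p>The size of this overlap can be estimated with a short back-of-the-envelope calculation. The sketch assumes, purely for illustration, that both libraries are independent uniform draws from all 20mers; real libraries are biased, so the result is only an order-of-magnitude guide. The variable names are hypothetical.</p>
                <p>
                    <code language="python"><![CDATA[
```python
import math

n_possible = 4 ** 20      # number of distinct 20mers
library_size = 10 ** 7    # order of magnitude of a library, as quoted in the text

# Probability that one particular 20mer occurs at least once among
# library_size uniform draws: 1 - (1 - 1/n_possible) ** library_size,
# evaluated with log1p/expm1 to avoid loss of precision.
p_present = -math.expm1(library_size * math.log1p(-1.0 / n_possible))

# Expected number of reads of one library that also occur in the other.
expected_shared = library_size * p_present

print(f"P(given 20mer present) ~ {p_present:.2e}")
print(f"expected shared reads  ~ {expected_shared:.0f} of {library_size}")
```
]]></code>
                </p>
                <p>Under these assumptions the expected overlap is on the order of a hundred reads out of ten million, which is why read frequencies cannot be compared directly between foreground and background libraries.</p>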
                <p>To preserve the statistical power of the foreground pool, i.e. to use all the reads detected in the foreground sample in the analysis, even those not represented in the background sample, we have to predict the frequency of foreground reads under the assumption of no selection for binding the RBP. A commonly used approach for this type of problem is to train a Markov model on the background pool and construct the expected frequency of each read in the foreground from the trained model, just as in Ref. 
                    <xref ref-type="bibr" rid="ref11">11</xref>. For a completely unbiased process of oligomer synthesis, capture and sequencing, the degree 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>d</mml:mi>
                        </mml:math>
                    </inline-formula> of the Markov model would be 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>0</mml:mn>
                        </mml:math>
                    </inline-formula>, i.e. each base would be equally likely to occur at any position in the oligomer, and all 20mers would have the same prior frequency of occurrence 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>f</mml:mi>
                                <mml:mi>S</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula>. However, various types of biases can lead to some sequences, with a specific composition of short nucleotide motifs, being present in the data more often than others. In principle, the higher the degree of the Markov model, the larger the sequence context that can be resolved. However, over-parametrization of the Markov model needs to be avoided. As binding specificities of RBPs tend to be short and our main aim is to appropriately detect enrichment of motifs on this length scale, we used a Markov model of degree 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>d</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:mn>4</mml:mn>
                        </mml:math>
                    </inline-formula>, which seemed to give a good tradeoff between the accuracy of the background read frequency prediction, the size of Bind&#x2019;n Seq libraries, and the available computational resources. The predicted frequency priors 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>f</mml:mi>
                                <mml:mi>S</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> &#x2013; and kmer-frequencies, accordingly &#x2013; need to be normalized such that their sum over the background frequency pool 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mi>B</mml:mi>
                        </mml:math>
                    </inline-formula> satisfies 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mo>&#x2211;</mml:mo>
                                <mml:mrow>
                                    <mml:mi>S</mml:mi>
                                    <mml:mo>&#x2208;</mml:mo>
                                    <mml:mi>B</mml:mi>
                                </mml:mrow>
                            </mml:msub>
                            <mml:msub>
                                <mml:mi>f</mml:mi>
                                <mml:mi>S</mml:mi>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mn>1</mml:mn>
                        </mml:math>
                    </inline-formula>.</p>
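                <p>The construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results: the class <monospace>MarkovBackground</monospace>, the function <monospace>frequency_priors</monospace>, the additive pseudocount and the treatment of the first <italic toggle="yes">d</italic> bases (their empirical frequency in the background pool) are all hypothetical choices, and the toy reads are invented.</p>
                <p>
                    <code language="python"><![CDATA[
```python
from collections import Counter
import math

BASES = "ACGT"


class MarkovBackground:
    """Markov model of a given degree trained on background reads (sketch)."""

    def __init__(self, reads, degree=4, pseudocount=1.0):
        self.degree = degree
        self.pseudocount = pseudocount
        self.start_counts = Counter()    # leading degree-mers of the reads
        self.kmer_counts = Counter()     # context of length degree plus next base
        self.context_counts = Counter()  # how often each context is followed by a base
        for read in reads:
            self.start_counts[read[:degree]] += 1
            for i in range(degree, len(read)):
                self.kmer_counts[read[i - degree:i + 1]] += 1
                self.context_counts[read[i - degree:i]] += 1
        self.n_reads = sum(self.start_counts.values())

    def log_probability(self, read):
        d, pc = self.degree, self.pseudocount
        # First d bases: smoothed empirical frequency over all 4**d possible d-mers.
        logp = math.log((self.start_counts[read[:d]] + pc)
                        / (self.n_reads + pc * 4 ** d))
        # Remaining bases: smoothed conditional probability given the previous d bases.
        for i in range(d, len(read)):
            context = read[i - d:i]
            numerator = self.kmer_counts[context + read[i]] + pc
            denominator = self.context_counts[context] + pc * len(BASES)
            logp += math.log(numerator / denominator)
        return logp


def frequency_priors(model, reads, normalization_pool):
    """Priors f_S for reads, scaled so that they sum to one over normalization_pool."""
    pool_logp = [model.log_probability(r) for r in normalization_pool]
    shift = max(pool_logp)  # log-sum-exp shift for numerical stability
    log_z = shift + math.log(sum(math.exp(lp - shift) for lp in pool_logp))
    return [math.exp(model.log_probability(r) - log_z) for r in reads]


# Toy background pool (invented reads) and a degree-2 model to keep the example small.
background = ["ACGTACGTAC", "ACGTTTGTAC", "GGGTACGTAA"]
model = MarkovBackground(background, degree=2)
print(frequency_priors(model, ["ACGTACGTAC", "CCCCCCCCCC"], background))
```
]]></code>
                </p>
                <p>In the toy example, the read that resembles the background pool receives a much larger prior than the poly-C read, which contains only contexts never seen in the background and is therefore scored by the pseudocounts alone.</p>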
            </sec>
            <sec id="sec7">
                <title>3.2 Inferring PWMs by expectation maximization</title>
                <p>Having constructed our model with the final expression 
                    <xref ref-type="disp-formula" rid="e5">(2.5)</xref>, as well as the background frequencies 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>f</mml:mi>
                                <mml:mi>S</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> described in the subsection above, the remaining task is to identify the PWM and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> that maximize the log-likelihood 
                    <xref ref-type="disp-formula" rid="e5">(2.5)</xref>. To this end, we rely on the expectation maximization algorithm.
                    <sup>
                        <xref ref-type="bibr" rid="ref12">12</xref>
                    </sup>
                    <sup>,</sup>
                    <sup>
                        <xref ref-type="bibr" rid="ref13">13</xref>
                    </sup> Given that only some of our model parameters can be directly inferred from the data, the algorithm optimizes the &#x201c;hidden&#x201d; parameters to maximize 
                    <xref ref-type="disp-formula" rid="e5">(2.5)</xref>. The expectation-maximization procedure (EM) can be divided into the following steps:
                    <list list-type="order">
                        <list-item>
                            <label>1.</label>
                            <p>Initialize 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:msub>
                                            <mml:mi>E</mml:mi>
                                            <mml:mn>0</mml:mn>
                                        </mml:msub>
                                    </mml:math>
                                </inline-formula> and the PWM elements 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:msubsup>
                                            <mml:mi>m</mml:mi>
                                            <mml:mi>i</mml:mi>
                                            <mml:mi>&#x03b1;</mml:mi>
                                        </mml:msubsup>
                                    </mml:math>
                                </inline-formula> with admissible real values, i.e. 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:msub>
                                            <mml:mi>E</mml:mi>
                                            <mml:mn>0</mml:mn>
                                        </mml:msub>
                                        <mml:mo>&#x2208;</mml:mo>
                                        <mml:mrow>
                                            <mml:mo stretchy="true">(</mml:mo>
                                            <mml:mo>&#x2212;</mml:mo>
                                            <mml:mo>&#x221e;</mml:mo>
                                            <mml:mo>,</mml:mo>
                                            <mml:mn>0</mml:mn>
                                            <mml:mo stretchy="true">]</mml:mo>
                                        </mml:mrow>
                                    </mml:math>
                                </inline-formula> and 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:msub>
                                            <mml:mo>&#x2211;</mml:mo>
                                            <mml:mi>&#x03b1;</mml:mi>
                                        </mml:msub>
                                        <mml:msubsup>
                                            <mml:mi>m</mml:mi>
                                            <mml:mi>i</mml:mi>
                                            <mml:mi>&#x03b1;</mml:mi>
                                        </mml:msubsup>
                                        <mml:mo>=</mml:mo>
                                        <mml:mn>1</mml:mn>
                                        <mml:mspace width="0.5em"/>
                                        <mml:mo>&#x2200;</mml:mo>
                                        <mml:mspace width="0.5em"/>
                                        <mml:mi>i</mml:mi>
                                        <mml:mo>=</mml:mo>
                                        <mml:mn>1</mml:mn>
                                        <mml:mo>,</mml:mo>
                                        <mml:mo>&#x2026;</mml:mo>
                                        <mml:mo>,</mml:mo>
                                        <mml:msub>
                                            <mml:mi>L</mml:mi>
                                            <mml:mi>w</mml:mi>
                                        </mml:msub>
                                    </mml:math>
                                </inline-formula>. This can either be done in an entirely unbiased way or by pre-determining some motifs and initializing PWMs with values reflecting those motifs.</p>
                        </list-item>
                        <list-item>
                            <label>2.</label>
                            <p>Recalculate 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:msub>
                                            <mml:mi>E</mml:mi>
                                            <mml:mn>0</mml:mn>
                                        </mml:msub>
                                    </mml:math>
                                </inline-formula> to maximize 
                                <xref ref-type="disp-formula" rid="e5">(2.5)</xref> holding the PWM fixed, which amounts to finding the root of
                                <disp-formula id="e11">
                                    <mml:math display="block">
                                        <mml:mtable columnalign="left" displaystyle="true">
                                            <mml:mtr>
                                                <mml:mtd>
                                                    <mml:mfrac>
                                                        <mml:mi>&#x2202;</mml:mi>
                                                        <mml:mrow>
                                                            <mml:mi>&#x2202;</mml:mi>
                                                            <mml:msub>
                                                                <mml:mi>E</mml:mi>
                                                                <mml:mn>0</mml:mn>
                                                            </mml:msub>
                                                        </mml:mrow>
                                                    </mml:mfrac>
                                                    <mml:mo>log</mml:mo>
                                                    <mml:mi>P</mml:mi>
                                                    <mml:mrow>
                                                        <mml:mo stretchy="true">(</mml:mo>
                                                        <mml:mi>D</mml:mi>
                                                        <mml:mo>|</mml:mo>
                                                        <mml:mi>M</mml:mi>
                                                        <mml:mo>,</mml:mo>
                                                        <mml:msub>
                                                            <mml:mi>E</mml:mi>
                                                            <mml:mn>0</mml:mn>
                                                        </mml:msub>
                                                        <mml:mo stretchy="true">)</mml:mo>
                                                    </mml:mrow>
                                                    <mml:mo>=</mml:mo>
                                                    <mml:munder>
                                                        <mml:mo movablelimits="false">&#x2211;</mml:mo>
                                                        <mml:mrow>
                                                            <mml:mi>S</mml:mi>
                                                            <mml:mo>&#x2208;</mml:mo>
                                                            <mml:mi>D</mml:mi>
                                                        </mml:mrow>
                                                    </mml:munder>
                                                    <mml:msub>
                                                        <mml:mi>n</mml:mi>
                                                        <mml:mi>S</mml:mi>
                                                    </mml:msub>
                                                    <mml:mspace width="0.1em"/>
                                                    <mml:mfrac>
                                                        <mml:mrow>
                                                            <mml:mrow>
                                                                <mml:mo stretchy="true">(</mml:mo>
                                                                <mml:msub>
                                                                    <mml:mi>L</mml:mi>
                                                                    <mml:mi>S</mml:mi>
                                                                </mml:msub>
                                                                <mml:mo>&#x2212;</mml:mo>
                                                                <mml:msub>
                                                                    <mml:mi>L</mml:mi>
                                                                    <mml:mi>w</mml:mi>
                                                                </mml:msub>
                                                                <mml:mo>+</mml:mo>
                                                                <mml:mn>1</mml:mn>
                                                                <mml:mo stretchy="true">)</mml:mo>
                                                            </mml:mrow>
                                                            <mml:msup>
                                                                <mml:mi mathvariant="normal">e</mml:mi>
                                                                <mml:msub>
                                                                    <mml:mi>E</mml:mi>
                                                                    <mml:mn>0</mml:mn>
                                                                </mml:msub>
                                                            </mml:msup>
                                                        </mml:mrow>
                                                        <mml:mrow>
                                                            <mml:msup>
                                                                <mml:mi mathvariant="normal">e</mml:mi>
                                                                <mml:mrow>
                                                                    <mml:mi>E</mml:mi>
                                                                    <mml:mrow>
                                                                        <mml:mo stretchy="true">(</mml:mo>
                                                                        <mml:mi>S</mml:mi>
                                                                        <mml:mo stretchy="true">)</mml:mo>
                                                                    </mml:mrow>
                                                                </mml:mrow>
                                                            </mml:msup>
                                                            <mml:mo>+</mml:mo>
                                                            <mml:mrow>
                                                                <mml:mo stretchy="true">(</mml:mo>
                                                                <mml:msub>
                                                                    <mml:mi>L</mml:mi>
                                                                    <mml:mi>S</mml:mi>
                                                                </mml:msub>
                                                                <mml:mo>&#x2212;</mml:mo>
                                                                <mml:msub>
                                                                    <mml:mi>L</mml:mi>
                                                                    <mml:mi>w</mml:mi>
                                                                </mml:msub>
                                                                <mml:mo>+</mml:mo>
                                                                <mml:mn>1</mml:mn>
                                                                <mml:mo stretchy="true">)</mml:mo>
                                                            </mml:mrow>
                                                            <mml:msup>
                                                                <mml:mi mathvariant="normal">e</mml:mi>
                                                                <mml:msub>
                                                                    <mml:mi>E</mml:mi>
                                                                    <mml:mn>0</mml:mn>
                                                                </mml:msub>
                                                            </mml:msup>
                                                        </mml:mrow>
                                                    </mml:mfrac>
                                                </mml:mtd>
                                            </mml:mtr>
                                            <mml:mtr>
                                                <mml:mtd>
                                                    <mml:mspace width="10em"/>
                                                    <mml:mo>&#x2212;</mml:mo>
                                                    <mml:munder>
                                                        <mml:mo movablelimits="false">&#x2211;</mml:mo>
                                                        <mml:mrow>
                                                            <mml:mi>S</mml:mi>
                                                            <mml:mo>&#x2208;</mml:mo>
                                                            <mml:mi>D</mml:mi>
                                                        </mml:mrow>
                                                    </mml:munder>
                                                    <mml:msub>
                                                        <mml:mi>n</mml:mi>
                                                        <mml:mi>S</mml:mi>
                                                    </mml:msub>
                                                    <mml:mspace width="0.1em"/>
                                                    <mml:mfrac>
                                                        <mml:mrow>
                                                            <mml:msub>
                                                                <mml:mo>&#x2211;</mml:mo>
                                                                <mml:mrow>
                                                                    <mml:mi>&#x03c1;</mml:mi>
                                                                    <mml:mo>&#x2208;</mml:mo>
                                                                    <mml:mi>D</mml:mi>
                                                                </mml:mrow>
                                                            </mml:msub>
                                                            <mml:msub>
                                                                <mml:mi>f</mml:mi>
                                                                <mml:mi>&#x03c1;</mml:mi>
                                                            </mml:msub>
                                                            <mml:mrow>
                                                                <mml:mo stretchy="true">(</mml:mo>
                                                                <mml:msub>
                                                                    <mml:mi>L</mml:mi>
                                                                    <mml:mi>&#x03c1;</mml:mi>
                                                                </mml:msub>
                                                                <mml:mo>&#x2212;</mml:mo>
                                                                <mml:msub>
                                                                    <mml:mi>L</mml:mi>
                                                                    <mml:mi>w</mml:mi>
                                                                </mml:msub>
                                                                <mml:mo>+</mml:mo>
                                                                <mml:mn>1</mml:mn>
                                                                <mml:mo stretchy="true">)</mml:mo>
                                                            </mml:mrow>
                                                            <mml:msup>
                                                                <mml:mi mathvariant="normal">e</mml:mi>
                                                                <mml:msub>
                                                                    <mml:mi>E</mml:mi>
                                                                    <mml:mn>0</mml:mn>
                                                                </mml:msub>
                                                            </mml:msup>
                                                        </mml:mrow>
                                                        <mml:mrow>
                                                            <mml:msub>
                                                                <mml:mo>&#x2211;</mml:mo>
                                                                <mml:mrow>
                                                                    <mml:mi>&#x03c3;</mml:mi>
                                                                    <mml:mo>&#x2208;</mml:mo>
                                                                    <mml:mi>D</mml:mi>
                                                                </mml:mrow>
                                                            </mml:msub>
                                                            <mml:msub>
                                                                <mml:mi>f</mml:mi>
                                                                <mml:mi>&#x03c3;</mml:mi>
                                                            </mml:msub>
                                                            <mml:mrow>
                                                                <mml:mo stretchy="true">(</mml:mo>
                                                                <mml:msup>
                                                                    <mml:mi mathvariant="normal">e</mml:mi>
                                                                    <mml:mrow>
                                                                        <mml:mi>E</mml:mi>
                                                                        <mml:mrow>
                                                                            <mml:mo stretchy="true">(</mml:mo>
                                                                            <mml:mi>&#x03c3;</mml:mi>
                                                                            <mml:mo stretchy="true">)</mml:mo>
                                                                        </mml:mrow>
                                                                    </mml:mrow>
                                                                </mml:msup>
                                                                <mml:mo>+</mml:mo>
                                                                <mml:mrow>
                                                                    <mml:mo stretchy="true">(</mml:mo>
                                                                    <mml:msub>
                                                                        <mml:mi>L</mml:mi>
                                                                        <mml:mi>&#x03c3;</mml:mi>
                                                                    </mml:msub>
                                                                    <mml:mo>&#x2212;</mml:mo>
                                                                    <mml:msub>
                                                                        <mml:mi>L</mml:mi>
                                                                        <mml:mi>w</mml:mi>
                                                                    </mml:msub>
                                                                    <mml:mo>+</mml:mo>
                                                                    <mml:mn>1</mml:mn>
                                                                    <mml:mo stretchy="true">)</mml:mo>
                                                                </mml:mrow>
                                                                <mml:msup>
                                                                    <mml:mi mathvariant="normal">e</mml:mi>
                                                                    <mml:msub>
                                                                        <mml:mi>E</mml:mi>
                                                                        <mml:mn>0</mml:mn>
                                                                    </mml:msub>
                                                                </mml:msup>
                                                                <mml:mo stretchy="true">)</mml:mo>
                                                            </mml:mrow>
                                                        </mml:mrow>
                                                    </mml:mfrac>
                                                </mml:mtd>
                                            </mml:mtr>
                                            <mml:mtr>
                                                <mml:mtd>
                                                    <mml:mspace width="9em"/>
                                                    <mml:mover>
                                                        <mml:mo>=</mml:mo>
                                                        <mml:mo>!</mml:mo>
                                                    </mml:mover>
                                                    <mml:mspace width="0.5em"/>
                                                    <mml:mn>0</mml:mn>
                                                    <mml:mo>.</mml:mo>
                                                </mml:mtd>
                                            </mml:mtr>
                                        </mml:mtable>
                                    </mml:math>
                                    <label>(3.1)</label>
                                </disp-formula>
                            </p>
                            <p>We apply the Brent minimization algorithm from the GSL library
                                <sup>
                                    <xref ref-type="bibr" rid="ref14">14</xref>
                                </sup> to the negative of the log-likelihood in 
                                <xref ref-type="disp-formula" rid="e10">(2.10)</xref>, which is equivalent to maximizing the log-likelihood itself.</p>
                        </list-item>
                        <list-item>
                            <label>3.</label>
                            <p>Update the PWM with the new 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:msub>
                                            <mml:mi>E</mml:mi>
                                            <mml:mn>0</mml:mn>
                                        </mml:msub>
                                    </mml:math>
                                </inline-formula> from the previous step, splitting the data set into 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:msub>
                                            <mml:mi>L</mml:mi>
                                            <mml:mi>w</mml:mi>
                                        </mml:msub>
                                    </mml:math>
                                </inline-formula>-mers 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:mi>s</mml:mi>
                                    </mml:math>
                                </inline-formula> (on a read 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:mi>S</mml:mi>
                                    </mml:math>
                                </inline-formula>) and adding the weight
                                <disp-formula id="e12">
                                    <mml:math display="block">
                                        <mml:mfrac>
                                            <mml:mrow>
                                                <mml:mi>P</mml:mi>
                                                <mml:mrow>
                                                    <mml:mo stretchy="true">(</mml:mo>
                                                    <mml:mi>s</mml:mi>
                                                    <mml:mo>|</mml:mo>
                                                    <mml:mi>c</mml:mi>
                                                    <mml:mo>,</mml:mo>
                                                    <mml:mi>M</mml:mi>
                                                    <mml:mo>,</mml:mo>
                                                    <mml:msub>
                                                        <mml:mi>E</mml:mi>
                                                        <mml:mn>0</mml:mn>
                                                    </mml:msub>
                                                    <mml:mo stretchy="true">)</mml:mo>
                                                </mml:mrow>
                                            </mml:mrow>
                                            <mml:mrow>
                                                <mml:mi>P</mml:mi>
                                                <mml:mrow>
                                                    <mml:mo stretchy="true">(</mml:mo>
                                                    <mml:mi>S</mml:mi>
                                                    <mml:mo>|</mml:mo>
                                                    <mml:mi>c</mml:mi>
                                                    <mml:mo>,</mml:mo>
                                                    <mml:mi>M</mml:mi>
                                                    <mml:mo>,</mml:mo>
                                                    <mml:msub>
                                                        <mml:mi>E</mml:mi>
                                                        <mml:mn>0</mml:mn>
                                                    </mml:msub>
                                                    <mml:mo stretchy="true">)</mml:mo>
                                                </mml:mrow>
                                            </mml:mrow>
                                        </mml:mfrac>
                                        <mml:mo>=</mml:mo>
                                        <mml:msub>
                                            <mml:mi>n</mml:mi>
                                            <mml:mi>S</mml:mi>
                                        </mml:msub>
                                        <mml:mfrac>
                                            <mml:msup>
                                                <mml:mi mathvariant="normal">e</mml:mi>
                                                <mml:mrow>
                                                    <mml:mi>E</mml:mi>
                                                    <mml:mrow>
                                                        <mml:mo stretchy="true">(</mml:mo>
                                                        <mml:mi>s</mml:mi>
                                                        <mml:mo stretchy="true">)</mml:mo>
                                                    </mml:mrow>
                                                </mml:mrow>
                                            </mml:msup>
                                            <mml:mrow>
                                                <mml:msup>
                                                    <mml:mi mathvariant="normal">e</mml:mi>
                                                    <mml:mrow>
                                                        <mml:mi>E</mml:mi>
                                                        <mml:mrow>
                                                            <mml:mo stretchy="true">(</mml:mo>
                                                            <mml:mi>S</mml:mi>
                                                            <mml:mo stretchy="true">)</mml:mo>
                                                        </mml:mrow>
                                                    </mml:mrow>
                                                </mml:msup>
                                                <mml:mo>+</mml:mo>
                                                <mml:msup>
                                                    <mml:mi mathvariant="normal">e</mml:mi>
                                                    <mml:msub>
                                                        <mml:mi>E</mml:mi>
                                                        <mml:mn>0</mml:mn>
                                                    </mml:msub>
                                                </mml:msup>
                                                <mml:mrow>
                                                    <mml:mo stretchy="true">(</mml:mo>
                                                    <mml:msub>
                                                        <mml:mi>L</mml:mi>
                                                        <mml:mi>S</mml:mi>
                                                    </mml:msub>
                                                    <mml:mo>&#x2212;</mml:mo>
                                                    <mml:msub>
                                                        <mml:mi>L</mml:mi>
                                                        <mml:mi>w</mml:mi>
                                                    </mml:msub>
                                                    <mml:mo>+</mml:mo>
                                                    <mml:mn>1</mml:mn>
                                                    <mml:mo stretchy="true">)</mml:mo>
                                                </mml:mrow>
                                            </mml:mrow>
                                        </mml:mfrac>
                                    </mml:math>
                                    <label>(3.2)</label>
                                </disp-formula> to all entries in the PWM corresponding to 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:mi>s</mml:mi>
                                    </mml:math>
                                </inline-formula>. Repeat this for all 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:mi>s</mml:mi>
                                    </mml:math>
                                </inline-formula> in 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:mi>S</mml:mi>
                                    </mml:math>
                                </inline-formula>, and over all 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:mi>S</mml:mi>
                                    </mml:math>
                                </inline-formula> in 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:mi>D</mml:mi>
                                    </mml:math>
                                </inline-formula>. Renormalize the PWM again by enforcing 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:msub>
                                            <mml:mo>&#x2211;</mml:mo>
                                            <mml:mi>&#x03b1;</mml:mi>
                                        </mml:msub>
                                        <mml:msubsup>
                                            <mml:mi>m</mml:mi>
                                            <mml:mi>i</mml:mi>
                                            <mml:mi>&#x03b1;</mml:mi>
                                        </mml:msubsup>
                                        <mml:mo>=</mml:mo>
                                        <mml:mn>1</mml:mn>
                                        <mml:mspace width="0.5em"/>
                                        <mml:mo>&#x2200;</mml:mo>
                                        <mml:mspace width="0.5em"/>
                                        <mml:mi>i</mml:mi>
                                        <mml:mo>=</mml:mo>
                                        <mml:mn>1</mml:mn>
                                        <mml:mo>,</mml:mo>
                                        <mml:mo>&#x2026;</mml:mo>
                                        <mml:mo>,</mml:mo>
                                        <mml:msub>
                                            <mml:mi>L</mml:mi>
                                            <mml:mi>w</mml:mi>
                                        </mml:msub>
                                    </mml:math>
                                </inline-formula>.</p>
                        </list-item>
                        <list-item>
                            <label>4.</label>
                            <p>Repeat the previous two steps until convergence. We terminate the iteration when the quadratic difference between the current and the updated PWM is less than 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:msup>
                                            <mml:mn>10</mml:mn>
                                            <mml:mrow>
                                                <mml:mo>&#x2212;</mml:mo>
                                                <mml:mn>6</mml:mn>
                                            </mml:mrow>
                                        </mml:msup>
                                    </mml:math>
                                </inline-formula> on average per entry, i.e. for 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:msub>
                                            <mml:mi>L</mml:mi>
                                            <mml:mi>w</mml:mi>
                                        </mml:msub>
                                        <mml:mo>=</mml:mo>
                                        <mml:mn>5</mml:mn>
                                    </mml:math>
                                </inline-formula> the total quadratic difference over the 20 PWM entries is less than 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:mn>5</mml:mn>
                                        <mml:mo>&#x00d7;</mml:mo>
                                        <mml:mn>4</mml:mn>
                                        <mml:mo>&#x00d7;</mml:mo>
                                        <mml:msup>
                                            <mml:mn>10</mml:mn>
                                            <mml:mrow>
                                                <mml:mo>&#x2212;</mml:mo>
                                                <mml:mn>6</mml:mn>
                                            </mml:mrow>
                                        </mml:msup>
                                    </mml:math>
                                </inline-formula>. Usually, this takes 
                                <inline-formula>
                                    <mml:math display="inline">
                                        <mml:mi mathvariant="script">O</mml:mi>
                                        <mml:mrow>
                                            <mml:mo stretchy="true">(</mml:mo>
                                            <mml:mn>10</mml:mn>
                                            <mml:mo stretchy="true">)</mml:mo>
                                        </mml:mrow>
                                    </mml:math>
                                </inline-formula> iterations.</p>
                        </list-item>
                    </list>
                </p>
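                <p>The stopping rule described above can be sketched as follows. This is only an illustrative sketch, not the published implementation: the function name <monospace>has_converged</monospace>, the argument names, and the use of NumPy are assumptions made here, and the PWM is assumed to be stored as an array with one row per position and one column per nucleotide.</p>
                <code language="python">
```python
import numpy as np


def has_converged(pwm_old, pwm_new, tol_per_entry=1e-6):
    """Check the stopping rule: summed squared difference per entry below tolerance.

    pwm_old, pwm_new: arrays of shape (L_w, 4), assumed layout (hypothetical).
    tol_per_entry:    average squared difference allowed per PWM entry.
    """
    pwm_old = np.asarray(pwm_old, dtype=float)
    pwm_new = np.asarray(pwm_new, dtype=float)
    # Total squared ("quadratic") difference between the two matrices.
    squared_diff = float(np.sum((pwm_new - pwm_old) ** 2))
    # For L_w = 5 the threshold is 5 * 4 * 1e-6, as stated in the text.
    threshold = tol_per_entry * pwm_old.size
    return bool(threshold > squared_diff)
```
                </code>
                <p>In such a sketch the check would be called once per EM iteration, after the PWM update, with the previous and the updated matrix.</p>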
                <p>Our code is written in C++ and Python and is publicly available.
                    <sup>
                        <xref ref-type="bibr" rid="ref15">15</xref>
                    </sup>
                </p>
            </sec>
        </sec>
        <sec id="sec8" sec-type="results">
            <title>4. Results</title>
            <p>We analyzed all RNA Bind&#x2019;n Seq data available in ENCODE
                <sup>
                    <xref ref-type="bibr" rid="ref10">10</xref>
                </sup> for 111 RBPs. For each RBP, we investigated 11 different binding site lengths, 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>L</mml:mi>
                            <mml:mi>w</mml:mi>
                        </mml:msub>
                        <mml:mo>=</mml:mo>
                        <mml:mn>5,6,7</mml:mn>
                        <mml:mo>,</mml:mo>
                        <mml:mo>&#x2026;</mml:mo>
                        <mml:mn>,14,15</mml:mn>
                    </mml:math>
                </inline-formula>nts, where the lower limit was chosen as 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:mn>5</mml:mn>
                        <mml:mi>nts</mml:mi>
                    </mml:math>
                </inline-formula> in agreement with the literature, and the upper limit of 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:mn>15</mml:mn>
                    </mml:math>
                </inline-formula>nts was determined by the available RAM. Moreover, as most reads are 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:mn>20</mml:mn>
                    </mml:math>
                </inline-formula>nts long, a larger 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>L</mml:mi>
                            <mml:mi>w</mml:mi>
                        </mml:msub>
                    </mml:math>
                </inline-formula> approaching this value would also not be warranted. For each pair of RBP and 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>L</mml:mi>
                            <mml:mi>w</mml:mi>
                        </mml:msub>
                    </mml:math>
                </inline-formula>, we set up 16 runs with randomly initialized parameters and 4 runs in which the initial PWM was set close to the reported consensus motif (from RNA Bind&#x2019;n Seq data).
                <sup>
                    <xref ref-type="bibr" rid="ref10">10</xref>
                </sup> We generally used four CPUs with three hours of maximum walltime per RBP, and eight CPUs when necessary (e.g. because the read length was larger than 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:mn>20</mml:mn>
                    </mml:math>
                </inline-formula>nts). For the overwhelming majority of runs we found these resources to be sufficient, though a few runs did not complete within this time; thus, not all RBPs had exactly the same number of finished runs. 
                <xref ref-type="fig" rid="f1">Figure 1</xref> gives an overview of the run statistics. As a local minimum was not found in all randomly initialized runs, the figure shows the fraction of &#x201c;convergent&#x201d; runs, i.e. runs in which the final log-likelihood was larger than the initial one. In addition, we observed that even convergent runs were sometimes dominated by the unspecific binding term 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msup>
                            <mml:mi mathvariant="normal">e</mml:mi>
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                        </mml:msup>
                    </mml:math>
                </inline-formula>. In this case the inferred PWM is not meaningful, as it does not explain the data as well as the sequence-unspecific term; we therefore report the &#x201c;specific&#x201d; cases, where 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>E</mml:mi>
                            <mml:mn>0</mml:mn>
                        </mml:msub>
                        <mml:mo>&lt;</mml:mo>
                        <mml:mn>0</mml:mn>
                    </mml:math>
                </inline-formula>. As shown in 
                <xref ref-type="fig" rid="f1">Figure 1</xref>, all RBPs with a large fraction (
                <inline-formula>
                    <mml:math display="inline">
                        <mml:mo>&gt;</mml:mo>
                        <mml:mn>0.4</mml:mn>
                    </mml:math>
                </inline-formula>) of convergent runs led to the recovery of a specific motif. In contrast, RBPs with a low fraction (
                <inline-formula>
                    <mml:math display="inline">
                        <mml:mo>&lt;</mml:mo>
                        <mml:mn>0.3</mml:mn>
                    </mml:math>
                </inline-formula>) of convergent runs had solutions dominated by unspecific binding.</p>
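            <p>The two run labels used above can be expressed compactly in code. The following is a minimal sketch under stated assumptions: the record type <monospace>Run</monospace>, its field names, and the choice to count &#x201c;specific&#x201d; only among convergent runs are illustrative and are not taken from the published code.</p>
            <code language="python">
```python
from dataclasses import dataclass


@dataclass
class Run:
    """One finished EM run (hypothetical record; field names are assumptions)."""
    loglik_initial: float
    loglik_final: float
    e0: float  # unspecific binding term E_0


def is_convergent(run):
    # "Convergent": the final log-likelihood exceeds the initial one.
    return run.loglik_final > run.loglik_initial


def is_specific(run):
    # "Specific": E_0 is negative.
    return 0.0 > run.e0


def summarize(runs):
    """Return (fraction convergent, fraction convergent and specific)."""
    n = len(runs)
    if n == 0:
        return (0.0, 0.0)
    convergent = [r for r in runs if is_convergent(r)]
    both = [r for r in convergent if is_specific(r)]
    return (len(convergent) / n, len(both) / n)
```
            </code>
            <p>Because the fractions are taken over finished runs only, the denominator may differ between RBPs, in line with the remark on runs that exceeded the maximum walltime.</p>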
            <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                <label>Figure 1. </label>
                <caption>
                    <title>Summary of fraction of convergent and specific outcomes of 16 runs with random initializations per 
                        <inline-formula>
                            <mml:math display="inline">
                                <mml:msub>
                                    <mml:mi>L</mml:mi>
                                    <mml:mi>w</mml:mi>
                                </mml:msub>
                                <mml:mo>&#x2208;</mml:mo>
                                <mml:mn>5,6,7</mml:mn>
                                <mml:mo>,</mml:mo>
                                <mml:mo>&#x2026;</mml:mo>
                                <mml:mn>,14,15</mml:mn>
                            </mml:math>
                        </inline-formula>.</title>
                    <p>&#x201c;Convergent&#x201d; means that the final log-likelihood was larger than the initial one, &#x201c;specific&#x201d; means that 
                        <inline-formula>
                            <mml:math display="inline">
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo>&lt;</mml:mo>
                                <mml:mn>0</mml:mn>
                            </mml:math>
                        </inline-formula>, i.e. the non-specific binding term has a limited contribution to the overall energy of RBP-oligomer interaction. Some runs did not finish within the given maximum walltime, therefore the absolute number of runs was not exactly the same for all RBPs. All runs done for a given RBP, regardless of the 
                        <inline-formula>
                            <mml:math display="inline">
                                <mml:msub>
                                    <mml:mi>L</mml:mi>
                                    <mml:mi>w</mml:mi>
                                </mml:msub>
                            </mml:math>
                        </inline-formula>, are shown together.</p>
                </caption>
                <graphic id="gr1" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure1.gif"/>
            </fig>
            <p>Following 
                <xref ref-type="disp-formula" rid="e9">eq. (2.9)</xref>, we calculated the dissociation constants 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>K</mml:mi>
                            <mml:mi mathvariant="normal">D</mml:mi>
                        </mml:msub>
                    </mml:math>
                </inline-formula>s for all convergent and specific outcomes of the EM procedure. Dissociation constants were given relative to a PWM of length 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:mn>5</mml:mn>
                    </mml:math>
                </inline-formula>, with equal probabilities for all four nucleotides (
                <inline-formula>
                    <mml:math display="inline">
                        <mml:mn>0.25</mml:mn>
                    </mml:math>
                </inline-formula>) and disregarding unspecific binding, i.e. 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>E</mml:mi>
                            <mml:mn>0</mml:mn>
                        </mml:msub>
                        <mml:mo>&#x2192;</mml:mo>
                        <mml:mo>&#x2212;</mml:mo>
                        <mml:mo>&#x221e;</mml:mo>
                    </mml:math>
                </inline-formula>. This allows us to compare the binding strength of different motifs for an RBP with each other and also of motifs for different RBPs, as done in 
                <xref ref-type="fig" rid="f2">Figure 2</xref>. Below we highlight the motif and relative 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>K</mml:mi>
                            <mml:mi mathvariant="normal">D</mml:mi>
                        </mml:msub>
                    </mml:math>
                </inline-formula>s for the motifs with the lowest and highest 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>K</mml:mi>
                            <mml:mi mathvariant="normal">D</mml:mi>
                        </mml:msub>
                    </mml:math>
                </inline-formula> from random initialization, as well as the motif with the lowest 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>K</mml:mi>
                            <mml:mi mathvariant="normal">D</mml:mi>
                        </mml:msub>
                    </mml:math>
                </inline-formula> obtained by initializing with the consensus PWM provided for the respective RBP by Ref. 
                <xref ref-type="bibr" rid="ref10">10</xref>. As a first plausibility check, we find that longer motifs tend to have lower 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>K</mml:mi>
                            <mml:mi mathvariant="normal">D</mml:mi>
                        </mml:msub>
                    </mml:math>
                </inline-formula>s, i.e. higher binding affinities, which is expected due to the larger number of bonds the larger binding sites can form with the RBP. We found convergent and specific results in 82 of 111 cases. Comparing the lowest 
                <inline-formula>
                    <mml:math display="inline">
                        <mml:msub>
                            <mml:mi>K</mml:mi>
                            <mml:mi mathvariant="normal">D</mml:mi>
                        </mml:msub>
                    </mml:math>
                </inline-formula> motifs to the ones obtained by starting from the motifs found by kmer-enrichment analysis,
                <sup>
                    <xref ref-type="bibr" rid="ref10">10</xref>
                </sup> we found that the consensus motif is contained in our highest-affinity motif, or differs from it in at most one position, in 48 cases. In 19 cases the highest-affinity motif did not contain the ENCODE consensus; in 17 of these our algorithm found a less complex motif (mostly poly(A)), while in two cases (CPEB1 and EIF4H) the motifs found by our algorithm are more complex. In 14 cases, the motif found by our algorithm appeared related to the consensus, but it was less polarized or differed from the consensus in more than one position. For one RBP, PTBP3, no motif was reported based on RNA Bind&#x2019;n Seq data, while we found a convergent and specific poly(A) motif. The seemingly large bias towards poly(A) when no other motif is found is consistent with adenine being the most prominent nucleotide in all of the investigated libraries.</p>
            <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                <label>Figure 2. </label>
                <caption>
                    <title>Inferred RBP binding specificities.</title>
                    <p>
                        <inline-formula>
                            <mml:math display="inline">
                                <mml:msub>
                                    <mml:mi>K</mml:mi>
                                    <mml:mi mathvariant="normal">D</mml:mi>
                                </mml:msub>
                            </mml:math>
                        </inline-formula>&#x2019;s are relative to an unpolarized PWM of length 5 with no unspecific binding. Top/middle: highest/lowest affinity motifs (lowest/highest relative 
                        <inline-formula>
                            <mml:math display="inline">
                                <mml:msub>
                                    <mml:mi>K</mml:mi>
                                    <mml:mi mathvariant="normal">D</mml:mi>
                                </mml:msub>
                            </mml:math>
                        </inline-formula>) from random initialization, bottom: highest affinity motif from initializations with the consensus motif.</p>
                </caption>
                <graphic id="gr2" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure2.gif"/>
            </fig>
            <p>Below we assess the correspondence of the results given by our model with prior knowledge for a few proteins that have been extensively studied. Examples where we found motifs consistent with prior literature are RBFOX2, NOVA1, PUM1, PUF60, ELAVL4, hnRNPC, and PCBP1. In contrast, for CELF1, EWSR1, hnRNPD, and hnRNPK we either did not obtain a convergent and specific result or the result was clearly different from the reported consensus. We also discuss CPEB1, EIF4H, and PTBP3 as cases where our model appears to deliver new information. Note that as we start our analysis from sequenced DNAs we use T in the sequence logos, though of course, the RNA oligonucleotides contained U&#x2019;s.</p>
            <sec id="sec9">
                <title>4.1 RBFOX2</title>
                <p>RBFOX2 is a key regulator of alternative splicing
                    <sup>
                        <xref ref-type="bibr" rid="ref16">16</xref>
                    </sup> that was extensively studied with a variety of methods (e.g. Ref. 
                    <xref ref-type="bibr" rid="ref17">17</xref>). The RBFOX2 Bind&#x2019;n Seq dataset
                    <sup>
                        <xref ref-type="bibr" rid="ref10">10</xref>
                    </sup> consists of nine libraries obtained using nine different protein concentrations and two protein-free control libraries, all containing reads of 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>50</mml:mn>
                        </mml:math>
                    </inline-formula> nucleotides (nts) in length, including the adaptor. RBFOX2 is widely used to benchmark computational analysis methods (cf. Ref. 
                    <xref ref-type="bibr" rid="ref18">18</xref>) and thus the corresponding dataset was carefully generated, to include multiple, high-quality libraries. Established techniques like kmer-enrichment analysis and the streaming-kmer-algorithm (SKA) predict a consensus 6mer TGCATG as the most prominent motif followed by other GCATG-containing 6mers.
                    <sup>
                        <xref ref-type="bibr" rid="ref18">18</xref>
                    </sup> Our results in 
                    <xref ref-type="fig" rid="f3">Figure 3</xref> reproduce both the importance of the GCATG motif and the slight T bias at the preceding position, see 
                    <xref ref-type="fig" rid="f3">Figure 3(A)</xref>. Moreover, we find the subdominant PWM in 
                    <xref ref-type="fig" rid="f3">Figure 3(B)</xref>, which shares a CATG core with the highest affinity motif, but over-emphasizes the A bias downstream. These results demonstrate that our algorithm identifies the known consensus for RBFOX2.</p>
                <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                    <label>Figure 3. </label>
                    <caption>
                        <title>Lowest (A) and highest (B) 
                            <italic toggle="yes">K</italic>
                            <sub>D</sub> PWMs for RBFOX2 obtained from our model with random initialization.</title>
                        <p>The highest affinity motif contains the known consensus GCATG, flanked by positions with low A/T bias. Both high and low affinity motifs share a CATG core.</p>
                    </caption>
                    <graphic id="gr3" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure3.gif"/>
                </fig>
                <p>The log-values of relative 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula>&#x2019;s are in the range 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>0.48</mml:mn>
                        </mml:math>
                    </inline-formula>&#x2013;
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>2.39</mml:mn>
                        </mml:math>
                    </inline-formula>, meaning that the two motifs shown in 
                    <xref ref-type="fig" rid="f3">Figure 3(A)</xref> and 
                    <xref ref-type="fig" rid="f3">(B)</xref> differ by 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2248;</mml:mo>
                            <mml:mn>100</mml:mn>
                        </mml:math>
                    </inline-formula> fold in their 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula>. The unspecific binding term in the run that yielded the motif from panel (A) was 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>15.57</mml:mn>
                        </mml:math>
                    </inline-formula>, while the one from the panel (B) run was 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.49</mml:mn>
                        </mml:math>
                    </inline-formula>. The overall log-likelihoods of the dataset were 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>8.51</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula> vs. 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>8.77</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>, respectively. Thus, the higher affinity motif corresponds to a higher peak in the likelihood landscape, and accounts for a more specific interaction. The binding specificity of RBFOX2 in 
                    <xref ref-type="fig" rid="f3">Figure 3(A)</xref> is similar to that of the closely related RBFOX3 (cf. 
                    <xref ref-type="fig" rid="f2">Figure 2</xref>).</p>
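                <p>The fold difference quoted above follows from the difference of the two relative log-
                    <italic toggle="yes">K</italic>
                    <sub>D</sub> values. As a worked check, and assuming the logarithms are taken to base 10 (an assumption made here; with natural logarithms the same difference would correspond to less than a 10-fold change), the conversion can be sketched as below. The function name <monospace>fold_difference</monospace> is hypothetical.</p>
                <code language="python">
```python
def fold_difference(log_kd_a, log_kd_b, base=10.0):
    """Fold change between two K_D values given their logarithms.

    base=10.0 is an assumption about the logarithm used for the reported values.
    """
    return base ** abs(log_kd_b - log_kd_a)


# Relative log-K_D values reported for the RBFOX2 motifs in Figure 3(A) and (B).
rbfox2_fold = fold_difference(0.48, 2.39)
print(round(rbfox2_fold, 1))  # about 81, i.e. on the order of 100
```
                </code>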
            </sec>
            <sec id="sec10">
                <title>4.2 NOVA1</title>
                <p>The neuro-oncological ventral antigen 1 (NOVA1) is a neuron-specific RBP
                    <sup>
                        <xref ref-type="bibr" rid="ref19">19</xref>
                    </sup> found by SELEX experiments to bind a triple CAT-repeat.
                    <sup>
                        <xref ref-type="bibr" rid="ref20">20</xref>
                    </sup> This is confirmed by the motif identified by our algorithm (
                    <xref ref-type="fig" rid="f4">Figure 4</xref>). In contrast, the streaming kmer-enrichment analysis of RNA Bind&#x2019;n Seq data in ENCODE predicted a single CAT.
                    <sup>
                        <xref ref-type="bibr" rid="ref10">10</xref>
                    </sup>
                </p>
                <fig fig-type="figure" id="f4" orientation="portrait" position="float">
                    <label>Figure 4. </label>
                    <caption>
                        <title>Lowest (A) and highest (B) 
                            <italic toggle="yes">K</italic>
                            <sub>D</sub> NOVA1 motifs obtained by our model with random initialization.</title>
                        <p>(A) shows the characteristic triple CAT-repeat.</p>
                    </caption>
                    <graphic id="gr4" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure4.gif"/>
                </fig>
                <p>Relative log-
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> values ranged from 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>0.76</mml:mn>
                        </mml:math>
                    </inline-formula> to 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>1.63</mml:mn>
                        </mml:math>
                    </inline-formula>, the unspecific binding term was 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>6.10</mml:mn>
                        </mml:math>
                    </inline-formula> for 
                    <xref ref-type="fig" rid="f4">Figure 4(A)</xref> and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.61</mml:mn>
                        </mml:math>
                    </inline-formula> for 
                    <xref ref-type="fig" rid="f4">Figure 4(B)</xref>, and the log-likelihoods of the data were 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>3.80</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula> and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>7.91</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>, respectively.</p>
            </sec>
            <sec id="sec11">
                <title>4.3 PUM1</title>
                <p>Pumilio homolog 1 is a well-studied member of the PUF family of proteins, which binds to 3&#x2019;UTRs of a variety of mRNAs and regulates their translation.
                    <sup>
                        <xref ref-type="bibr" rid="ref21">21</xref>
                    </sup> While immunoprecipitation (IP) experiments found its binding specificity to be TGTAHATA,
                    <sup>
                        <xref ref-type="bibr" rid="ref22">22</xref>
                    </sup> SKA analysis of RNA Bind&#x2019;n Seq yielded TGTAH without further context.
                    <sup>
                        <xref ref-type="bibr" rid="ref10">10</xref>
                    </sup> In contrast, we do find the full motif in the Bind&#x2019;n Seq data (
                    <xref ref-type="fig" rid="f5">Figure 5</xref>). All obtained motifs are related to each other via shifts and show only gradual differences at single positions, which is also reflected in the characteristic quantities (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula>, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                        </mml:math>
                    </inline-formula>, and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </inline-formula>) being distributed only over a narrow range. The relative log-dissociation constants for 
                    <xref ref-type="fig" rid="f5">Figure 5</xref> are 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>0.36</mml:mn>
                        </mml:math>
                    </inline-formula>, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>0.36</mml:mn>
                        </mml:math>
                    </inline-formula>, and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>0.67</mml:mn>
                        </mml:math>
                    </inline-formula>, where the former two differ by only 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>0.0004</mml:mn>
                        </mml:math>
                    </inline-formula>. The corresponding unspecific binding strengths are 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>3.47</mml:mn>
                        </mml:math>
                    </inline-formula>, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>1.21</mml:mn>
                        </mml:math>
                    </inline-formula>, and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.58</mml:mn>
                        </mml:math>
                    </inline-formula>. The log likelihoods of the data range from 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>1.39</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula> to 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>1.43</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>.</p>
                <fig fig-type="figure" id="f5" orientation="portrait" position="float">
                    <label>Figure 5. </label>
                    <caption>
                        <title>Lowest (A), intermediate (B), and highest (C) 
                            <italic toggle="yes">K</italic>
                            <sub>D</sub> PWMs for PUM1 obtained from our model with random initialization.</title>
                        <p>The intermediate motif (B) represents the consensus.
                            <sup>
                                <xref ref-type="bibr" rid="ref22">22</xref>
                            </sup>
                        </p>
                    </caption>
                    <graphic id="gr5" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure5.gif"/>
                </fig>
            </sec>
            <sec id="sec12">
                <title>4.4 PUF60</title>
                <p>A representative from the same family of RBPs is the Poly(U)-binding splicing factor PUF60,
                    <sup>
                        <xref ref-type="bibr" rid="ref23">23</xref>
                    </sup> which regulates both transcription and translation.
                    <sup>
                        <xref ref-type="bibr" rid="ref24">24</xref>
                    </sup> Our algorithm identifies the expected poly(T) motif as the one with the highest affinity (see 
                    <xref ref-type="fig" rid="f6">Figure 6(A)</xref>). The relative log-
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> values for T-rich motifs are 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2265;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>0.76</mml:mn>
                        </mml:math>
                    </inline-formula>, with 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>10.53</mml:mn>
                            <mml:mo>&#x2264;</mml:mo>
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>&#x2272;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>7.5</mml:mn>
                        </mml:math>
                    </inline-formula> and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>&#x2264;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.81</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>. Different variations (see 
                    <xref ref-type="fig" rid="f6">Figure 6(B)</xref>) of the poly(T) motif are found, which share an alternating pattern of strongly polarized and less polarized T positions (the latter blended with C). Poly(A)-containing motifs (cf. 
                    <xref ref-type="fig" rid="f6">Figure 6(C)</xref>) also appear at 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>&gt;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>0.33</mml:mn>
                        </mml:math>
                    </inline-formula> and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>&gt;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>5.49</mml:mn>
                        </mml:math>
                    </inline-formula>, with 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>&#x2264;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>4.04</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>. The motif with the highest dissociation constant obtained from random initializations, shown in 
                    <xref ref-type="fig" rid="f6">Figure 6(C)</xref>, has 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:mn>0.98</mml:mn>
                        </mml:math>
                    </inline-formula>, and corresponds to 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>0.64</mml:mn>
                        </mml:math>
                    </inline-formula> and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>5.68</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>.</p>
                <fig fig-type="figure" id="f6" orientation="portrait" position="float">
                    <label>Figure 6. </label>
                    <caption>
                        <title>Lowest (A), intermediate (B) and highest (C) 
                            <italic toggle="yes">K</italic>
                            <sub>D</sub> PWMs for PUF60 obtained from our model with random initialization.</title>
                    </caption>
                    <graphic id="gr6" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure6.gif"/>
                </fig>
            </sec>
            <sec id="sec13">
                <title>4.5 ELAVL4</title>
                <p>The ELAV-like protein 4 (ELAVL4) is an RBP that is exclusively expressed in neurons.
                    <sup>
                        <xref ref-type="bibr" rid="ref25">25</xref>
                    </sup> It binds A/T-rich elements according to genome-wide IP experiments,
                    <sup>
                        <xref ref-type="bibr" rid="ref26">26</xref>
                    </sup>
                    <sup>,</sup>
                    <sup>
                        <xref ref-type="bibr" rid="ref27">27</xref>
                    </sup> a pattern that we also observe in PWMs from our model (see 
                    <xref ref-type="fig" rid="f7">Figure 7</xref>). Relative log-
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> values range from 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>0.71</mml:mn>
                        </mml:math>
                    </inline-formula> to 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>1.94</mml:mn>
                        </mml:math>
                    </inline-formula>, the unspecific binding strength from 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>3.46</mml:mn>
                        </mml:math>
                    </inline-formula> to 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.24</mml:mn>
                        </mml:math>
                    </inline-formula>, and the log-likelihood of the data from 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>1.95</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula> to 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.07</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>.</p>
                <fig fig-type="figure" id="f7" orientation="portrait" position="float">
                    <label>Figure 7. </label>
                    <caption>
                        <title>Lowest (A), intermediate (B), and highest (C) 
                            <italic toggle="yes">K</italic>
                            <sub>D</sub> PWMs for ELAVL4 obtained from our model with random initialization.</title>
                    </caption>
                    <graphic id="gr7" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure7.gif"/>
                </fig>
            </sec>
            <sec id="sec14">
                <title>4.6 HNRNPC</title>
                <p>Turning towards another large family of RBPs, the heterogeneous nuclear ribonucleoproteins (hnRNPs), we highlight hnRNPC,
                    <sup>
                        <xref ref-type="bibr" rid="ref28">28</xref>
                    </sup> a protein that is involved in pre-mRNA processing.
                    <sup>
                        <xref ref-type="bibr" rid="ref29">29</xref>
                    </sup> It is known to bind to poly(U) motifs, typically pentamers, found both in SKA analysis of RNA Bind&#x2019;n Seq data
                    <sup>
                        <xref ref-type="bibr" rid="ref10">10</xref>
                    </sup> and in electrophoretic mobility shift assays.
                    <sup>
                        <xref ref-type="bibr" rid="ref30">30</xref>
                    </sup> Our model does recover the dominant poly(T) motif at 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>0.76</mml:mn>
                        </mml:math>
                    </inline-formula>, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>8.54</mml:mn>
                        </mml:math>
                    </inline-formula>, and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.09</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>. Mildly subdominant motifs are A-rich and less repetitive (
                    <xref ref-type="fig" rid="f8">Figure 8(B), (C)</xref>) with 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>0.76</mml:mn>
                        </mml:math>
                    </inline-formula>, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>7.67</mml:mn>
                        </mml:math>
                    </inline-formula>, and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.09</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>, as well as 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>1.01</mml:mn>
                        </mml:math>
                    </inline-formula>, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>8.54</mml:mn>
                        </mml:math>
                    </inline-formula>, and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.09</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>.</p>
                <fig fig-type="figure" id="f8" orientation="portrait" position="float">
                    <label>Figure 8. </label>
                    <caption>
                        <title>Lowest (A), intermediate (B), and highest (C) 
                            <italic toggle="yes">K</italic>
                            <sub>D</sub> PWMs for hnRNPC obtained from our model with random initialization.</title>
                    </caption>
                    <graphic id="gr8" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure8.gif"/>
                </fig>
                <fig fig-type="figure" id="f9" orientation="portrait" position="float">
                    <label>Figure 9. </label>
                    <caption>
                        <title>Lowest (A) and highest (B) 
                            <italic toggle="yes">K</italic>
                            <sub>D</sub> PWMs for PCBP1 obtained from our model with random initialization.</title>
                    </caption>
                    <graphic id="gr9" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure9.gif"/>
                </fig>
            </sec>
            <sec id="sec15">
                <title>4.7 PCBP1</title>
                <p>As a last example of an RBP for which we predict a motif that agrees with prior data, we highlight the poly(rC)-binding protein 1 (PCBP1).
                    <sup>
                        <xref ref-type="bibr" rid="ref31">31</xref>
                    </sup> Its binding specificity consists of a few very polarized C&#x2019;s linked by A/T-enriched sequence.
                    <sup>
                        <xref ref-type="bibr" rid="ref32">32</xref>
                    </sup> This pattern is clearly reproduced by our highest affinity motif in 
                    <xref ref-type="fig" rid="f9">Figure 9(A)</xref> (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>0.81</mml:mn>
                        </mml:math>
                    </inline-formula>) with 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>13.26</mml:mn>
                        </mml:math>
                    </inline-formula> and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.99</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>. The lowest affinity motif (
                    <xref ref-type="fig" rid="f9">Figure 9(B)</xref>) that random PWM-initialization yielded had 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:mn>1.62</mml:mn>
                        </mml:math>
                    </inline-formula>, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>1.49</mml:mn>
                        </mml:math>
                    </inline-formula>, and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>6.07</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>.</p>
            </sec>
            <sec id="sec16">
                <title>4.8 CELF1</title>
                <p>CELF1 is an RBP of the CUG-binding CELF family,
                    <sup>
                        <xref ref-type="bibr" rid="ref33">33</xref>
                    </sup>
                    <sup>,</sup>
                    <sup>
                        <xref ref-type="bibr" rid="ref34">34</xref>
                    </sup> that participates in multiple steps of post-transcriptional processing of RNAs, including splicing, translation and decay.
                    <sup>
                        <xref ref-type="bibr" rid="ref35">35</xref>
                    </sup> CELF1 requires UGU motifs for high-affinity interaction with RNAs.
                    <sup>
                        <xref ref-type="bibr" rid="ref36">36</xref>
                    </sup> The corresponding Bind&#x2019;n Seq dataset
                    <sup>
                        <xref ref-type="bibr" rid="ref10">10</xref>
                    </sup> consists of libraries generated for seven different RBP concentrations, each containing 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x223c;</mml:mo>
                            <mml:mn>2</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>7</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula> reads of 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>S</mml:mi>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mn>40</mml:mn>
                        </mml:math>
                    </inline-formula>.</p>
                <p>Since 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>16</mml:mn>
                        </mml:math>
                    </inline-formula> runs with completely random PWM initialization for 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>w</mml:mi>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mn>5</mml:mn>
                            <mml:mo>,</mml:mo>
                            <mml:mn>6</mml:mn>
                            <mml:mo>,</mml:mo>
                            <mml:mo>&#x2026;</mml:mo>
                            <mml:mn>,14,15</mml:mn>
                        </mml:math>
                    </inline-formula> did not yield any local optimum of the probability landscape, we decided to test whether biased initialization of the PWM with the known motif (TGT), which was found as an enriched 3-mer in RNA Bind&#x2019;n Seq
                    <sup>
                        <xref ref-type="bibr" rid="ref10">10</xref>
                    </sup> enables the recovery of longer motifs. However, our model predicted only poly(A) stretches as convergent optima of the probability landscape, see 
                    <xref ref-type="fig" rid="f10">Figure 10(A)</xref>. The relative log-dissociation constants were 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>1.67</mml:mn>
                            <mml:mo>&#x2273;</mml:mo>
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>&#x2273;</mml:mo>
                            <mml:mn>2.17</mml:mn>
                        </mml:math>
                    </inline-formula>, unspecific binding was characterized by 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>4.0</mml:mn>
                            <mml:mo>&#x2273;</mml:mo>
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>&#x2273;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>0.75</mml:mn>
                        </mml:math>
                    </inline-formula>, and the likelihood of the data was between 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>8.41</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula> and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>8.48</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>.</p>
            </sec>
            <sec id="sec17">
                <title>4.9 EWSR1</title>
                <p>EWSR1, or Ewing&#x2019;s sarcoma protein, forms fusions with a number of other proteins and serves as a transcriptional activator in human solid tumors like Ewing&#x2019;s sarcoma and malignant melanoma.
                    <sup>
                        <xref ref-type="bibr" rid="ref37">37</xref>
                    </sup> Information about its binding specificity is scarce, while SKA analysis of RNA Bind&#x2019;n Seq data predicts G-rich elements.
                    <sup>
                        <xref ref-type="bibr" rid="ref10">10</xref>
                    </sup> Our investigation led to different varieties of poly(A) motifs (
                    <xref ref-type="fig" rid="f11">Figure 11(A)</xref> and 
                    <xref ref-type="fig" rid="f11">(B)</xref>), which vary in relative dissociation constants 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>0.79</mml:mn>
                            <mml:mo>&#x2264;</mml:mo>
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>&#x2264;</mml:mo>
                            <mml:mn>0.91</mml:mn>
                        </mml:math>
                    </inline-formula> in log-
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula>, unspecific binding energy 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>4.18</mml:mn>
                            <mml:mo>&#x2264;</mml:mo>
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>&#x2264;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>0.55</mml:mn>
                        </mml:math>
                    </inline-formula>, and data likelihoods 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.2</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                            <mml:mo>&#x2265;</mml:mo>
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>&#x2265;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>4.50</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>.</p>
                <fig fig-type="figure" id="f10" orientation="portrait" position="float">
                    <label>Figure 10. </label>
                    <caption>
                        <title>Lowest (A) and highest (B) 
                            <italic toggle="yes">K</italic>
                            <sub>D</sub> PWMs for CELF1 predicted by our model from random initialization.</title>
                        <p>Enrichment analysis suggests a prominent 
                            <inline-formula>
                                <mml:math display="inline">
                                    <mml:mi mathvariant="italic">TGT</mml:mi>
                                </mml:math>
                            </inline-formula> as consensus, which our model cannot recover, even when initializing with 
                            <inline-formula>
                                <mml:math display="inline">
                                    <mml:mtext mathvariant="italic">NTGTN</mml:mtext>
                                </mml:math>
                            </inline-formula>.</p>
                    </caption>
                    <graphic id="gr10" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure10.gif"/>
                </fig>
                <fig fig-type="figure" id="f11" orientation="portrait" position="float">
                    <label>Figure 11. </label>
                    <caption>
                        <title>Lowest (A) and highest (B) 
                            <italic toggle="yes">K</italic>
                            <sub>D</sub> PWMs for EWSR1 predicted by our model from random initialization.</title>
                    </caption>
                    <graphic id="gr11" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure11.gif"/>
                </fig>
            </sec>
            <sec id="sec18">
                <title>4.10 HNRNPD</title>
                <p>Within the class of heterogeneous ribonucleoproteins (hnRNPs), hnRNPD (also known as AUF1) is a well-known A/U-rich element RNA-binding protein with an important role in RNA decay.
                    <sup>
                        <xref ref-type="bibr" rid="ref38">38</xref>
                    </sup> HNRNPD has been reported to bind clusters of AUUUA elements.
                    <sup>
                        <xref ref-type="bibr" rid="ref38">38</xref>
                    </sup> The ENCODE database
                    <sup>
                        <xref ref-type="bibr" rid="ref10">10</xref>
                    </sup> lists AUAAU as another possible binding site for hnRNPD. We find poly(A) stretches of different lengths as convergent and specific binding motifs (see 
                    <xref ref-type="fig" rid="f12">Figure 12</xref>). While A is the dominant nucleotide at every position, T is consistently the second most frequent, which fits the A/U-rich binding domains reported in the literature. We find unspecific binding terms in the range 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>7.56</mml:mn>
                            <mml:mo>&#x2265;</mml:mo>
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>&#x2265;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.48</mml:mn>
                        </mml:math>
                    </inline-formula>, where 
                    <xref ref-type="fig" rid="f12">Figure 12(B)</xref> corresponds to the highest affinity motif (
                    <xref ref-type="fig" rid="f12">Figure 12(A)</xref>) for which 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>3.17</mml:mn>
                        </mml:math>
                    </inline-formula>. The two motifs shown in 
                    <xref ref-type="fig" rid="f12">Figure 12</xref> come from a range of dissociation constants 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mn>0.42</mml:mn>
                            <mml:mo>&#x2264;</mml:mo>
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>&#x2264;</mml:mo>
                            <mml:mn>1.95</mml:mn>
                        </mml:math>
                    </inline-formula> and data likelihoods 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.23</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                            <mml:mo>&#x2265;</mml:mo>
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>&#x2265;</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.30</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>. Direct initialization with the kmer-enrichment consensus ATAAT did not lead to any convergent and specific result.</p>
                <fig fig-type="figure" id="f12" orientation="portrait" position="float">
                    <label>Figure 12. </label>
                    <caption>
                        <title>Lowest (A) and highest (B) 
                            <italic toggle="yes">K</italic>
                            <sub>D</sub> PWMs for hnRNPD predicted by our model from random initialization.</title>
                        <p>Enrichment analysis suggests a quite unpolar A/T-rich binding domain. Both motifs, (A) and (B), show A as the dominant nucleotide, followed by T at each position.</p>
                    </caption>
                    <graphic id="gr12" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure12.gif"/>
                </fig>
            </sec>
            <sec id="sec19">
                <title>4.11 HNRNPK</title>
                <p>We were also interested in determining whether we can recover G/C-rich binding motifs from the data and therefore applied the model to heterogeneous nuclear ribonucleoprotein K (hnRNPK), a member of the poly(C) binding family of proteins.
                    <sup>
                        <xref ref-type="bibr" rid="ref39">39</xref>
                    </sup> We could only recover one of the two consensus motifs reported in the ENCODE analysis of these data (GCCCA, from SKA)
                    <sup>
                        <xref ref-type="bibr" rid="ref10">10</xref>
                    </sup> when specifically initializing the PWM with this motif. The second reported motif, with the CACGC consensus, could not be found by our algorithm even when the PWM was initialized with the motif itself and even when sequences containing the first motif were eliminated, indicating that this motif does not correspond to a local maximum of the likelihood function. We did not find any PWMs of 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>L</mml:mi>
                                <mml:mi>w</mml:mi>
                            </mml:msub>
                            <mml:mo>&gt;</mml:mo>
                            <mml:mn>5</mml:mn>
                        </mml:math>
                    </inline-formula> in this data set, whether we used random initialization or shorter motif-guided initialization.</p>
            </sec>
            <sec id="sec20">
                <title>4.12 CPEB1</title>
                <p>The cytoplasmic polyadenylation element-binding protein 1 (CPEB1)
                    <sup>
                        <xref ref-type="bibr" rid="ref40">40</xref>
                    </sup> serves as a translational regulator by binding specific U-rich sequences in 3&#x2019; UTRs, inducing cytoplasmic polyadenylation. The streaming kmer algorithm predicts a poly(T) motif from RNA Bind&#x2019;n Seq data, which is what we recover as the lowest affinity motif in 
                    <xref ref-type="fig" rid="f13">Figure 13(D)</xref> with 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.39</mml:mn>
                        </mml:math>
                    </inline-formula>, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>3.30</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>, and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:mn>1.59</mml:mn>
                        </mml:math>
                    </inline-formula>. The penta(T) is also part of a higher affinity motif (see 
                    <xref ref-type="fig" rid="f13">Figure 13(C)</xref>) with 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>5.90</mml:mn>
                        </mml:math>
                    </inline-formula>, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.92</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>, and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:mn>0.00</mml:mn>
                        </mml:math>
                    </inline-formula>. More complex motifs with higher affinity but lower specificity are displayed in 
                    <xref ref-type="fig" rid="f13">Figure 13(A)</xref> with 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.05</mml:mn>
                        </mml:math>
                    </inline-formula>, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.56</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>, and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>0.10</mml:mn>
                        </mml:math>
                    </inline-formula> and in 
                    <xref ref-type="fig" rid="f13">Figure 13(B)</xref> with 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>1.32</mml:mn>
                        </mml:math>
                    </inline-formula>, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>2.56</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>, and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>0.10</mml:mn>
                        </mml:math>
                    </inline-formula>. The difference in affinity between the first three motifs shown in 
                    <xref ref-type="fig" rid="f13">Figure 13(A-C)</xref> is negligible, whereas the penta(T) motif alone (
                    <xref ref-type="fig" rid="f13">Figure 13(D)</xref>) differs from the highest affinity one by 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2248;</mml:mo>
                            <mml:mn>50</mml:mn>
                        </mml:math>
                    </inline-formula>-fold, which is quite substantial.</p>
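                <p>As a rough check of the fold-difference quoted above, the following minimal sketch converts two relative log-dissociation constants into a ratio of dissociation constants. It assumes that the reported logarithm is base 10, which is the convention that reproduces the quoted figure; the function name <monospace>fold_difference</monospace> is illustrative and is not part of the authors' software.</p>
                <code language="python">
```python
def fold_difference(log_kd_a, log_kd_b, base=10.0):
    """Return K_D(a) / K_D(b) from two relative log-K_D values.

    Assumption: both values are logarithms to the same base (10 by default).
    """
    return base ** (log_kd_a - log_kd_b)


# Relative log-K_D values quoted in the text for CPEB1.
penta_t = 1.59            # penta(T) motif, Figure 13(D)
highest_affinity = -0.10  # motifs in Figure 13(A) and (B)

ratio = fold_difference(penta_t, highest_affinity)
print(f"{ratio:.1f}-fold weaker binding for the penta(T) motif")
```
                </code>
                <p>Under a natural-logarithm convention the same difference of 1.69 would correspond to only about a 5-fold change, so the base of the logarithm matters when comparing these values across motifs.</p>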
                <fig fig-type="figure" id="f13" orientation="portrait" position="float">
                    <label>Figure 13. </label>
                    <caption>
                        <title>Lowest (A), intermediate (B) and (C), and highest (D) 
                            <italic toggle="yes">K</italic>
                            <sub>D</sub> PWMs for CPEB1 predicted by our model from random initialization.</title>
                        <p>Although the consensus binding motif is recovered in (C) and (D), we predict (A) and (B) to be stronger binding specificities.</p>
                    </caption>
                    <graphic id="gr13" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure13.gif"/>
                </fig>
            </sec>
            <sec id="sec21">
                <title>4.13 EIF4H</title>
                <p>As a representative of the family of eukaryotic translation initiation factors, we highlight the eukaryotic translation initiation factor 4H (EIF4H).
                    <sup>
                        <xref ref-type="bibr" rid="ref41">41</xref>
                    </sup> SKA enrichment studies of RNA Bind&#x2019;n Seq data predict a poly(G) motif for this protein,
                    <sup>
                        <xref ref-type="bibr" rid="ref10">10</xref>
                    </sup> but we did not find corroborating data obtained with other approaches. Our model identified only one convergent and specific motif, shown in 
                    <xref ref-type="fig" rid="f14">Figure 14</xref>, which is pyrimidine-rich. The relative log-
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> is comparatively high, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:mn>0.11</mml:mn>
                        </mml:math>
                    </inline-formula>, as is the contribution of unspecific binding, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>0.67</mml:mn>
                        </mml:math>
                    </inline-formula>, with 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>7.44</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>.</p>
                <fig fig-type="figure" id="f14" orientation="portrait" position="float">
                    <label>Figure 14. </label>
                    <caption>
                        <title>Single convergent and specific PWM for EIF4H predicted by our model from random initialization.</title>
                    </caption>
                    <graphic id="gr14" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure14.gif"/>
                </fig>
            </sec>
            <sec id="sec22">
                <title>4.14 PTBP3</title>
                <p>Polypyrimidine tract-binding protein 3 (PTBP3) is involved in the regulation of numerous steps of protein production, namely splicing, alternative 3&#x2019; end processing, mRNA stability and RNA localization.
                    <sup>
                        <xref ref-type="bibr" rid="ref42">42</xref>
                    </sup> Data on PTBP3 binding specificity is lacking, and no enriched kmer was reported by the SKA of RNA Bind&#x2019;n Seq data.
                    <sup>
                        <xref ref-type="bibr" rid="ref10">10</xref>
                    </sup> Our model predicts different variations of poly(A), see 
                    <xref ref-type="fig" rid="f15">Figure 15</xref>. The predicted binding specificity with the lowest 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> features 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>3.02</mml:mn>
                        </mml:math>
                    </inline-formula>, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:mn>1.28</mml:mn>
                        </mml:math>
                    </inline-formula>, and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>1.15</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>. The motif corresponding to the highest 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                            </mml:msub>
                        </mml:math>
                    </inline-formula> has 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:msub>
                                <mml:mi>E</mml:mi>
                                <mml:mn>0</mml:mn>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>3.86</mml:mn>
                        </mml:math>
                    </inline-formula>, 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:msubsup>
                                <mml:mi>K</mml:mi>
                                <mml:mi mathvariant="normal">D</mml:mi>
                                <mml:mi>rel</mml:mi>
                            </mml:msubsup>
                            <mml:mo>=</mml:mo>
                            <mml:mn>2.14</mml:mn>
                        </mml:math>
                    </inline-formula>, and 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>log</mml:mo>
                            <mml:mi>P</mml:mi>
                            <mml:mrow>
                                <mml:mo stretchy="true">(</mml:mo>
                                <mml:mi>D</mml:mi>
                                <mml:mo>|</mml:mo>
                                <mml:mi>M</mml:mi>
                                <mml:mo>,</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:mo stretchy="true">)</mml:mo>
                            </mml:mrow>
                            <mml:mo>=</mml:mo>
                            <mml:mo>&#x2212;</mml:mo>
                            <mml:mn>1.12</mml:mn>
                            <mml:mo>&#x00d7;</mml:mo>
                            <mml:msup>
                                <mml:mn>10</mml:mn>
                                <mml:mn>9</mml:mn>
                            </mml:msup>
                        </mml:math>
                    </inline-formula>. Thus, compared to other RBPs (cf. 
                    <xref ref-type="fig" rid="f14">Figure 14</xref>), the specificity of PTBP3 remains to be better defined.</p>
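                <p>As a rough illustration of how the relative log-<italic toggle="yes">K</italic><sub>D</sub> values quoted above translate into affinity ratios, the following sketch converts the difference between two such values into a fold change. The helper name kd_fold_change is hypothetical, and the use of the natural logarithm as the default base is an assumption of this sketch, since the base is not restated in this section.</p>
                <preformat>
```python
import math


def kd_fold_change(log_kd_rel_a, log_kd_rel_b, base=math.e):
    """Ratio K_D(b) / K_D(a) implied by two relative log-K_D values.

    Assumes both values are logarithms to the same base. The natural
    logarithm is used by default, which is an assumption of this sketch.
    """
    return base ** (log_kd_rel_b - log_kd_rel_a)


# Relative log-K_D values quoted in the text for the PTBP3 motifs
# with the lowest and the highest K_D.
lowest, highest = 1.28, 2.14
fold = kd_fold_change(lowest, highest)
print(f"K_D ratio (highest / lowest): {fold:.2f}")
```
</preformat>
                <p>Under the natural-logarithm assumption, the two PTBP3 motifs differ by a factor of roughly 2.4 in <italic toggle="yes">K</italic><sub>D</sub>, much less than the roughly 50-fold spread reported for CPEB1.</p>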
                <fig fig-type="figure" id="f15" orientation="portrait" position="float">
                    <label>Figure 15. </label>
                    <caption>
                        <title>Lowest (A) and highest (B) 
                            <italic toggle="yes">K</italic>
                            <sub>D</sub> PWMs for PTBP3 predicted by our model from random initialization.</title>
                    </caption>
                    <graphic id="gr15" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure15.gif"/>
                </fig>
            </sec>
            <sec id="sec23">
                <title>4.15 Other RBPs</title>
                <p>There are other proteins covered in the Bind&#x2019;n Seq data
                    <sup>
                        <xref ref-type="bibr" rid="ref10">10</xref>
                    </sup> whose specificity has been studied before. For example, we analyzed the data corresponding to MBNL1,
                    <sup>
                        <xref ref-type="bibr" rid="ref43">43</xref>
                    </sup> hnRNPL,
                    <sup>
                        <xref ref-type="bibr" rid="ref44">44</xref>
                    </sup> FUS,
                    <sup>
                        <xref ref-type="bibr" rid="ref45">45</xref>
                    </sup> and TAF15.
                    <sup>
                        <xref ref-type="bibr" rid="ref46">46</xref>
                    </sup>
                </p>
                <p>Random and consensus initialization did result in convergent and specific binding specificities for FUS; however, these were poly(A) motifs, which do not agree with the GGUG consensus from SELEX experiments.
                    <sup>
                        <xref ref-type="bibr" rid="ref47">47</xref>
                    </sup> For the other three proteins, our model did not deliver any convergent results, even when the PWM was directly initialized with the expected consensus motif. This indicates that the enrichment did not work equally well for all the RBPs studied with the Bind&#x2019;n Seq method, or that our method does not reliably identify motifs that are very short (
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x223c;</mml:mo>
                            <mml:mn>2</mml:mn>
                        </mml:math>
                    </inline-formula> nt, as observed in the SKA enrichment analysis). In this analysis, kmer enrichments are computed by counting the number of occurrences of every possible kmer in the foreground samples (RBP concentration 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>&#x2260;</mml:mo>
                            <mml:mn>0</mml:mn>
                        </mml:math>
                    </inline-formula>) and in the background samples (RBP concentration 
                    <inline-formula>
                        <mml:math display="inline">
                            <mml:mo>=</mml:mo>
                            <mml:mn>0</mml:mn>
                        </mml:math>
                    </inline-formula>), and finally normalizing the foreground abundances by the background to extract the respective enrichment. The higher the enrichment of a given kmer, the higher the likelihood of it being bound by the RBP used in the experiment is thought to be. We computed these enrichments for 6mers, as done in the ENCODE studies. The results, shown in 
                    <xref ref-type="fig" rid="f16">Figure 16</xref>, indicate that only RBFOX2 has a very clear hierarchy of top enriched 6mers, while the other investigated RBPs show a much flatter hierarchy of motif enrichments. An analysis of the Levenshtein distance of these motifs showed no clear difference in the pattern of distances among the leading motifs across the investigated RBPs. This suggests that these motifs correspond to many local minima of comparable depth, which prevents our algorithm from finding clear PWMs representing the binding sites. It is therefore unclear whether the specificity of these RBPs is well represented by weight matrices, or whether another model, e.g. clusters of short, degenerate motifs, may represent it better.</p>
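                <p>The kmer enrichment computation described above can be sketched as follows. This is a minimal illustration and not the code used for the analysis: the function names, the pseudocount, the normalization by total kmer counts, and the toy reads are all assumptions introduced here for the example.</p>
                <preformat>
```python
import math
from collections import Counter
from itertools import product


def count_kmers(reads, k):
    """Count every occurrence of each length-k substring across all reads."""
    counts = Counter()
    for read in reads:
        for start in range(len(read) - k + 1):
            counts[read[start:start + k]] += 1
    return counts


def log_enrichment(foreground, background, k=6, pseudocount=1.0):
    """Natural-log ratio of foreground to background kmer frequencies.

    The pseudocount (an assumption of this sketch) avoids division by
    zero for kmers that are absent from one of the two samples.
    """
    fg = count_kmers(foreground, k)
    bg = count_kmers(background, k)
    kmers = ["".join(letters) for letters in product("ACGT", repeat=k)]
    fg_total = sum(fg[m] + pseudocount for m in kmers)
    bg_total = sum(bg[m] + pseudocount for m in kmers)
    return {
        m: math.log((fg[m] + pseudocount) / fg_total)
        - math.log((bg[m] + pseudocount) / bg_total)
        for m in kmers
    }


# Toy reads, invented purely for illustration.
foreground = ["TTGCATGAA", "ATGCATGCC", "GGTGCATGT"]
background = ["TTACGGTAA", "ATGGATGCC", "GGCTAGCTT"]

scores = log_enrichment(foreground, background)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[0], round(scores[ranked[0]], 3))
```
</preformat>
                <p>Ranking the resulting dictionary by value gives the kind of enrichment hierarchy that is plotted per RBP; a steep drop after the first few kmers corresponds to a clear hierarchy, a slow decay to a flat one.</p>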
                <fig fig-type="figure" id="f16" orientation="portrait" position="float">
                    <label>Figure 16. </label>
                    <caption>
                        <title>Logarithmic (base 
                            <inline-formula>
                                <mml:math display="inline">
                                    <mml:mi>e</mml:mi>
                                </mml:math>
                            </inline-formula>) enrichment (count in foreground reads normalized by count in background) for all 
                            <inline-formula>
                                <mml:math display="inline">
                                    <mml:msup>
                                        <mml:mn>4</mml:mn>
                                        <mml:mn>6</mml:mn>
                                    </mml:msup>
                                    <mml:mo>=</mml:mo>
                                    <mml:mn>4096</mml:mn>
                                </mml:math>
                            </inline-formula> possible 6mers in all datasets (corresponding to different RBPs), ranked by enrichment.</title>
                        <p>The most enriched motifs are for RBFOX2: TGCATG, FUS: GCGCGC, hnRNPL: CACACA, MBNL1: GCTGCT, TAF15: GGGGGG. All except the RBFOX2 motif are repeats of shorter oligomers.</p>
                    </caption>
                    <graphic id="gr16" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/165610/5ece0cb6-470e-446b-ad8a-12de6e730b64_figure16.gif"/>
                </fig>
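                <p>For readers unfamiliar with the Levenshtein distance used to compare motifs, a minimal dynamic-programming sketch is given below. It is applied here to the five most enriched 6mers listed in the caption of Figure 16, only to demonstrate the distance function; the analysis in the text compared the leading motifs within each dataset, and the function name is an assumption of this example.</p>
                <preformat>
```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


# Most enriched 6mers per RBP, as listed in the caption of Figure 16.
top_motifs = ["TGCATG", "GCGCGC", "CACACA", "GCTGCT", "GGGGGG"]
for first, second in combinations(top_motifs, 2):
    print(first, second, levenshtein(first, second))
```
</preformat>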
            </sec>
        </sec>
        <sec id="sec24">
            <title>5. Conclusion</title>
            <p>We constructed a thermodynamic model that can be used to infer characteristic position weight matrices for the binding domains of RBPs from data obtained from affinity-based enrichment of oligonucleotides. Since we directly model the RBP-binding specificity as PWMs, our method bypasses arbitrary choices in the alignment of kmers found to be individually enriched in the data. We evaluate our model on 111 RNA Bind&#x2019;n Seq data sets in the public domain,
                <sup>
                    <xref ref-type="bibr" rid="ref10">10</xref>
                </sup> finding convergent and specific motifs for 82 RBPs. For 48 of these there is complete agreement with previously reported motifs, 14 partly agree, 19 disagree, and one did not have a reported motif (PTBP3). In most cases of disagreement, our model predicts the binding element to be, in fact, poly(A), while in two cases (CPEB1 and EIF4H), we predict more complex motifs than previously reported.</p>
            <p>Many of the RBPs for which we did not recover a PWM tend to have more degenerate motifs than the RBPs for which a motif emerged. For example, ESRP1, EWSR1, FUS and other proteins have been shown to bind G-repeats, which are relatively rare in the Bind&#x2019;n Seq libraries. Identifying such motifs likely requires an accurate background model; how best to construct this model remains to be determined in future work. In addition, it is possible that the binding sites of these proteins are not contiguous, linear motifs, but rather contain variable-length spacers or form structures such as G-quadruplexes. It will thus be important to explore other types of models in the future, e.g. models that allow more flexible spacing of RBP contact points on RNAs.</p>
        </sec>
    </body>
    <back>
        <sec id="sec28" sec-type="data-availability">
            <title>Data availability</title>
            <sec id="sec29">
                <title>Underlying data</title>
                <table-wrap id="T1" orientation="portrait" position="float">
                    <label>Table 1. </label>
                    <caption>
                        <title>ENCODE file accession IDs and data repository DOIs of RNA Bind&#x2019;n Seq samples from Ref. 
                            <xref ref-type="bibr" rid="ref10">10</xref> used for motif prediction.</title>
                        <p>DOIs are doi:10.17989/
                            <inline-formula>

                                <mml:math display="inline">
                                    <mml:mo>&lt;</mml:mo>
                                </mml:math>
</inline-formula>experiment identifier
                            <inline-formula>

                                <mml:math display="inline">
                                    <mml:mo>&gt;</mml:mo>
                                </mml:math>
</inline-formula>.</p>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">RBP name</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">experiment identifier</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">RBP name</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">experiment identifier</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">TARDBP</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR466JPT</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">IGF2BP3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR164XGH</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">HNRNPK</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR368NMO</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">PCBP2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR673FLQ</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SRSF9</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR724HZI</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">SFPQ</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR951YCV</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">TAF15</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR827QYL</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">XRCC6</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR189MAB</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">FUS</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR936LOF</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">IGF2BP1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR928XOW</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">HNRNPL</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR954TYO</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">KHSRP</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR915BDY</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">PCBP1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR539RTM</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">LIN28B</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR369RLA</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBFOX2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR441HLP</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NSUN2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR387CDD</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBM22</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR006TPX</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBM25</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR759QKO</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">HNRNPC</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR569UIU</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">SAFB2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR558RBK</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SRSF4</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR252RIJ</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">EWSR1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR063HQO</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">TIA1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR064NOY</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">FUBP1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR843QMF</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">TRA2A</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR741VUK</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">FUBP3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR697VZN</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">TROVE2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR653ZTY</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">IGF2BP2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR588GYZ</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">XRN2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR382CPM</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">PUM1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR845GNW</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">AKAP8L</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR110GHL</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">SF1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR318HZC</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">APOBEC3C</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR472KKU</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ZRANB2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR927QJQ</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">CELF1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR992NHR</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">DAZAP1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR005ZRL</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">EIF3D</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR488AUU</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">HNRNPA0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR170PBM</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">EIF4G2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR600HIW</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">HNRNPA2B1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR890PDQ</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">HNRNPF</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR376SUZ</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">CSDE1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR443NPK</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">ILF2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR906EKN</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">HNRNPD</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR175OMA</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">MBNL1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR006QKZ</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBM3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR379HWF</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">PPP1R10</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR297UTH</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBM47</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR264RVK</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">PUF60</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR773QCC</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">A1CF</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR934TDK</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBM5</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR345PWR</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">BOLL</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR497LIF</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SRSF5</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR914PGB</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">CNOT4</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR806UCE</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SUCLG1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR474NYR</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">CPEB1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR084YCO</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SYNCRIP</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR642PJQ</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">DAZ3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR449VKY</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">TRNAU1AP</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR419XDN</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">EIF4H</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR945YVY</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">ELAVL4</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR171TTH</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NOVA1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR834CED</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">ESRP1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR082AKW</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">NUPL2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR102MQN</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">EXOSC4</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR432LUH</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">PABPC3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR051WAN</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">HNRNPA1L2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR958WXF</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">PABPN1L</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR334QCK</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">HNRNPCL1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR915CDT</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">PCBP4</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR769AEI</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">HNRNPDL</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR055HDN</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">PRR3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR191PTZ</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">HNRNPH2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR328PGZ</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">PTBP3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR741ZPT</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">KHDRBS2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR575QYE</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">RALYL</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR229VBP</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">KHDRBS3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR583NVI</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBFOX3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR421UDF</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">MSI1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR329RIP</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBM11</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR199MJK</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBM15B</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR655NWZ</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBMS3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR224KSF</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBM20</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR446UHZ</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">RC3H1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR728SXZ</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBM23</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR525PNM</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">SF3B6</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR079FDB</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBM24</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR742AEU</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">SNRPA</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR167ZZB</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBM4</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR331BKR</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">SNRPB2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR606JGJ</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBM41</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR637HFY</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">SRSF10</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR744POX</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBM45</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR626INQ</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">SRSF11</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR073DSH</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBM4B</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR905BJK</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">SRSF2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR275JFN</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBM6</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR548RVM</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">SRSF8</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR929OLV</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">RBMS2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR492CFG</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">TDRD10</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR456IMV</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">THUMPD1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR021IWR</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ZNF326</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR335JQK</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">TRA2B</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR391FEW</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">UNK</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR497VCL</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">YBX2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR919USP</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">ZC3H10</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR605EEO</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">ZC3H18</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR614KXG</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">ZCRB1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR205HMN</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">ZFP36</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR315VQD</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">ZFP36L1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR570AIV</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">ZFP36L2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">ENCSR249GVR</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
            </sec>
        </sec>
        <sec id="sec25">
            <title>Software availability</title>
            <p>Source code available from: 
                <ext-link ext-link-type="uri" xlink:href="https://git.scicore.unibas.ch/zavolan_group/pipelines/bind-n-seq-pwms">https://git.scicore.unibas.ch/zavolan_group/pipelines/bind-n-seq-pwms</ext-link>

                <sup>

                    <xref ref-type="bibr" rid="ref15">15</xref>
</sup>
            </p>
            <p>Archived source code and results of runs at time of publication of version 2: 
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.11072860">https://doi.org/10.5281/zenodo.11072860</ext-link>

                <sup>

                    <xref ref-type="bibr" rid="ref48">48</xref>
</sup>
            </p>
            <p>License: MIT</p>
        </sec>
        <ack>
            <title>Acknowledgements</title>
            <p>We would like to thank Erik van Nimwegen for useful conversations and fruitful suggestions. Calculations were performed at the sciCORE (
                <ext-link ext-link-type="uri" xlink:href="http://scicore.unibas.ch/">http://scicore.unibas.ch/</ext-link>) scientific computing core facility at the University of Basel.</p>
        </ack>
        <ref-list>
            <title>References</title>
            <ref id="ref1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lunde</surname>
                            <given-names>BM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Moore</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Varani</surname>
                            <given-names>G</given-names>
                        </name>
</person-group>:
                    <article-title>RNA-binding proteins: modular design for efficient function.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Rev. Mol. Cell Biol.</italic>
</source>
                    <year>2007</year>;<volume>8</volume>:<fpage>479</fpage>&#x2013;<lpage>490</lpage>.
                    <pub-id pub-id-type="pmid">17473849</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nrm2178</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5507177</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kazan</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ray</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chan</surname>
                            <given-names>ET</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>RNAcontext: a new method for learning the sequence and structure binding preferences of RNA-binding proteins.</article-title>
                    <source>

                        <italic toggle="yes">PLoS Comput. Biol.</italic>
</source>
                    <year>2010</year>;<volume>6</volume>:<fpage>e1000832</fpage>.
                    <pub-id pub-id-type="pmid">20617199</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pcbi.1000832</pub-id>
                    <pub-id pub-id-type="pmcid">PMC2895634</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Weirauch</surname>
                            <given-names>MT</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cote</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Norel</surname>
                            <given-names>R</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Evaluation of methods for modeling transcription factor sequence specificity.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Biotechnol.</italic>
</source>
                    <year>2013</year>;<volume>31</volume>:<fpage>126</fpage>&#x2013;<lpage>134</lpage>.
                    <pub-id pub-id-type="pmid">23354101</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nbt.2486</pub-id>
                    <pub-id pub-id-type="pmcid">PMC3687085</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hentze</surname>
                            <given-names>MW</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Castello</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Schwarzl</surname>
                            <given-names>T</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A brave new world of RNA-binding proteins.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Rev. Mol. Cell Biol.</italic>
</source>
                    <year>2018</year>;<volume>19</volume>:<fpage>327</fpage>&#x2013;<lpage>341</lpage>.
                    <pub-id pub-id-type="pmid">29339797</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nrm.2017.130</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Imig</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Brunschweiger</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Br&#x00fc;mmer</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>miR-CLIP capture of a miRNA targetome uncovers a lincRNA H19-miR-106a interaction.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Chem. Biol.</italic>
</source>
                    <year>2015</year>;<volume>11</volume>:<fpage>107</fpage>&#x2013;<lpage>114</lpage>.
                    <pub-id pub-id-type="pmid">25531890</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nchembio.1713</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hafner</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Katsantoni</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Koester</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>CLIP and complementary methods.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Rev. Meth. Prim.</italic>
</source>
                    <year>2021</year>;<volume>1</volume>.</mixed-citation>
            </ref>
            <ref id="ref7">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lambert</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Robertson</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Jangi</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>RNA Bind-n-Seq: quantitative assessment of the sequence and structural binding specificity of RNA binding proteins.</article-title>
                    <source>

                        <italic toggle="yes">Mol. Cell.</italic>
</source>
                    <year>2014</year>;<volume>54</volume>:<fpage>887</fpage>&#x2013;<lpage>900</lpage>.
                    <pub-id pub-id-type="doi">10.1016/j.molcel.2014.04.016</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Omidi</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zavolan</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pachkov</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Automated incorporation of pairwise dependency in transcription factor binding site prediction using dinucleotide weight tensors.</article-title>
                    <source>

                        <italic toggle="yes">PLoS Comput. Biol.</italic>
</source>
                    <year>2017</year>;<volume>13</volume>:<fpage>1</fpage>.</mixed-citation>
            </ref>
            <ref id="ref9">
                <label>9</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bisswanger</surname>
                            <given-names>H</given-names>
                        </name>
</person-group>:
                    <source>

                        <italic toggle="yes">Enzyme Kinetics: Principles and Methods.</italic>
</source>
                    <publisher-name>Wiley</publisher-name>;<year>2008</year>.</mixed-citation>
            </ref>
            <ref id="ref10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Luo</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hitz</surname>
                            <given-names>BC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gabdank</surname>
                            <given-names>I</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>New developments on the Encyclopedia of DNA Elements (ENCODE) data portal.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2020</year>;<volume>48</volume>:<fpage>D882</fpage>&#x2013;<lpage>D889</lpage>.
                    <pub-id pub-id-type="doi">10.1093/nar/gkz1062</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Shannon</surname>
                            <given-names>CE</given-names>
                        </name>
</person-group>:
                    <article-title>A mathematical theory of communication.</article-title>
                    <source>

                        <italic toggle="yes">Bell Syst. Tech. J.</italic>
</source>
                    <year>1948</year>;<volume>27</volume>:<fpage>379</fpage>&#x2013;<lpage>423</lpage>.
                    <pub-id pub-id-type="doi">10.1002/j.1538-7305.1948.tb01338.x</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Dempster</surname>
                            <given-names>AP</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Laird</surname>
                            <given-names>NM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rubin</surname>
                            <given-names>DB</given-names>
                        </name>
</person-group>:
                    <article-title>Maximum likelihood from incomplete data via the EM algorithm.</article-title>
                    <source>

                        <italic toggle="yes">J. R. Stat. Soc. Series B Methodol.</italic>
</source>
                    <year>1977</year>;<volume>39</volume>:<fpage>1</fpage>&#x2013;<lpage>22</lpage>.
                    <pub-id pub-id-type="doi">10.1111/j.2517-6161.1977.tb01600.x</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Nimwegen</surname>
                            <given-names>E</given-names>
                            <prefix>van</prefix>
                        </name>
</person-group>:
                    <article-title>Finding regulatory elements and regulatory motifs: a general probabilistic framework.</article-title>
                    <source>

                        <italic toggle="yes">BMC Bioinformatics.</italic>
</source>
                    <year>2007</year>;<volume>8 Suppl 6</volume>:<fpage>S4</fpage>.
                    <pub-id pub-id-type="pmid">17903285</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1471-2105-8-S6-S4</pub-id>
                    <pub-id pub-id-type="pmcid">PMC1995539</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref14">
                <label>14</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Galassi</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Davies</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Theiler</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <source>

                        <italic toggle="yes">GNU Scientific Library Reference Manual.</italic>
</source>
                    <year>2021</year>.</mixed-citation>
            </ref>
            <ref id="ref15">
                <label>15</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Schlusser</surname>
                            <given-names>N</given-names>
                        </name>
</person-group>:
                    <article-title>Bind&#x2019;n Seq PWMs.</article-title>
                    <year>2022</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://git.scicore.unibas.ch/zavolan_group/pipelines/bind-n-seq-pwms.git">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ponthier</surname>
                            <given-names>JL</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Schluepen</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chen</surname>
                            <given-names>W</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Fox-2 splicing factor binds to a conserved intron motif to promote inclusion of protein 4.1R alternative exon 16.</article-title>
                    <source>

                        <italic toggle="yes">J. Biol. Chem.</italic>
</source>
                    <year>2006</year>;<volume>281</volume>:<fpage>12468</fpage>&#x2013;<lpage>12474</lpage>.
                    <pub-id pub-id-type="pmid">16537540</pub-id>
                    <pub-id pub-id-type="doi">10.1074/jbc.M511556200</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Van Nostrand</surname>
                            <given-names>EL</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pratt</surname>
                            <given-names>GA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Shishkin</surname>
                            <given-names>AA</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Robust transcriptome-wide discovery of RNA-binding protein binding sites with enhanced CLIP (eCLIP).</article-title>
                    <source>

                        <italic toggle="yes">Nat. Methods.</italic>
</source>
                    <year>2016</year>;<volume>13</volume>:<fpage>508</fpage>&#x2013;<lpage>514</lpage>.
                    <pub-id pub-id-type="pmid">27018577</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.3810</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4887338</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lambert</surname>
                            <given-names>NJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Robertson</surname>
                            <given-names>AD</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Burge</surname>
                            <given-names>CB</given-names>
                        </name>
</person-group>:
                    <article-title>RNA Bind-n-Seq: Measuring the Binding Affinity Landscape of RNA-Binding Proteins.</article-title>
                    <source>

                        <italic toggle="yes">Methods Enzymol.</italic>
</source>
                    <year>2015</year>;<volume>558</volume>:<fpage>465</fpage>.
                    <pub-id pub-id-type="doi">10.1016/bs.mie.2015.02.007</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Buckanovich</surname>
                            <given-names>RJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yang</surname>
                            <given-names>YY</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Darnell</surname>
                            <given-names>RB</given-names>
                        </name>
</person-group>:
                    <article-title>The onconeural antigen Nova-1 is a neuron-specific RNA-binding protein, the activity of which is inhibited by paraneoplastic antibodies.</article-title>
                    <source>

                        <italic toggle="yes">J. Neurosci.</italic>
</source>
                    <year>1996</year>;<volume>16</volume>:<fpage>1114</fpage>&#x2013;<lpage>1122</lpage>.
                    <pub-id pub-id-type="pmid">8558240</pub-id>
                    <pub-id pub-id-type="doi">10.1523/JNEUROSCI.16-03-01114.1996</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6578795</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref20">
                <label>20</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Buckanovich</surname>
                            <given-names>RJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Darnell</surname>
                            <given-names>RB</given-names>
                        </name>
</person-group>:
                    <article-title>The neuronal RNA binding protein Nova-1 recognizes specific RNA targets in vitro and in vivo.</article-title>
                    <source>

                        <italic toggle="yes">Mol. Cell. Biol.</italic>
</source>
                    <year>1997</year>;<volume>17</volume>:<fpage>3194</fpage>&#x2013;<lpage>3201</lpage>.
                    <pub-id pub-id-type="pmid">9154818</pub-id>
                    <pub-id pub-id-type="doi">10.1128/MCB.17.6.3194</pub-id>
                    <pub-id pub-id-type="pmcid">PMC232172</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref21">
                <label>21</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>X</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zamore</surname>
                            <given-names>PD</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hall</surname>
                            <given-names>TM</given-names>
                        </name>
</person-group>:
                    <article-title>Crystal structure of a Pumilio homology domain.</article-title>
                    <source>

                        <italic toggle="yes">Mol. Cell.</italic>
</source>
                    <year>2001</year>;<volume>7</volume>:<fpage>855</fpage>&#x2013;<lpage>865</lpage>.
                    <pub-id pub-id-type="pmid">11336708</pub-id>
                    <pub-id pub-id-type="doi">10.1016/S1097-2765(01)00229-5</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref22">
                <label>22</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Morris</surname>
                            <given-names>AR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Mukherjee</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Keene</surname>
                            <given-names>JD</given-names>
                        </name>
</person-group>:
                    <article-title>Ribonomic analysis of human Pum1 reveals cis-trans conservation across species despite evolution of diverse mRNA target sets.</article-title>
                    <source>

                        <italic toggle="yes">Mol. Cell. Biol.</italic>
</source>
                    <year>2008</year>;<volume>28</volume>:<fpage>4093</fpage>&#x2013;<lpage>4103</lpage>.
                    <pub-id pub-id-type="pmid">18411299</pub-id>
                    <pub-id pub-id-type="doi">10.1128/MCB.00155-08</pub-id>
                    <pub-id pub-id-type="pmcid">PMC2423135</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref23">
                <label>23</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bouffard</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Barbar</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Re</surname>
                            <given-names>F</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Interaction cloning and characterization of RoBPI, a novel protein binding to human Ro ribonucleoproteins.</article-title>
                    <source>

                        <italic toggle="yes">RNA.</italic>
</source>
                    <year>2000</year>;<volume>6</volume>:<fpage>66</fpage>&#x2013;<lpage>78</lpage>.
                    <pub-id pub-id-type="pmid">10668799</pub-id>
                    <pub-id pub-id-type="doi">10.1017/S1355838200990277</pub-id>
                    <pub-id pub-id-type="pmcid">PMC1369894</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref24">
                <label>24</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hastings</surname>
                            <given-names>ML</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Allemand</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Duelli</surname>
                            <given-names>DM</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Control of pre-mRNA splicing by the general splicing factors PUF60 and U2AF(65).</article-title>
                    <source>

                        <italic toggle="yes">PLoS One.</italic>
</source>
                    <year>2007</year>;<volume>2</volume>:<fpage>e538</fpage>.
                    <pub-id pub-id-type="pmid">17579712</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pone.0000538</pub-id>
                    <pub-id pub-id-type="pmcid">PMC1888729</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref25">
                <label>25</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Szabo</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Dalmau</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Manley</surname>
                            <given-names>G</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>HuD, a paraneoplastic encephalomyelitis antigen, contains RNA-binding domains and is homologous to Elav and Sex-lethal.</article-title>
                    <source>

                        <italic toggle="yes">Cell.</italic>
</source>
                    <year>1991</year>;<volume>67</volume>:<fpage>325</fpage>&#x2013;<lpage>333</lpage>.
                    <pub-id pub-id-type="pmid">1655278</pub-id>
                    <pub-id pub-id-type="doi">10.1016/0092-8674(91)90184-Z</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref26">
                <label>26</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ray</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kazan</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chan</surname>
                            <given-names>ET</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Rapid and systematic analysis of the RNA recognition specificities of RNA-binding proteins.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Biotechnol.</italic>
</source>
                    <year>2009</year>;<volume>27</volume>:<fpage>667</fpage>&#x2013;<lpage>670</lpage>.
                    <pub-id pub-id-type="pmid">19561594</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nbt.1550</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref27">
                <label>27</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bolognani</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Contente-Cuomo</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Perrone-Bizzozero</surname>
                            <given-names>NI</given-names>
                        </name>
</person-group>:
                    <article-title>Novel recognition motifs and biological functions of the RNA-binding protein HuD revealed by genome-wide identification of its targets.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2010</year>;<volume>38</volume>:<fpage>117</fpage>&#x2013;<lpage>130</lpage>.
                    <pub-id pub-id-type="pmid">19846595</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkp863</pub-id>
                    <pub-id pub-id-type="pmcid">PMC2800223</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref28">
                <label>28</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Nakagawa</surname>
                            <given-names>TY</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Swanson</surname>
                            <given-names>MS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wold</surname>
                            <given-names>BJ</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Molecular cloning of cDNA for the nuclear ribonucleoprotein particle C proteins: a conserved gene family.</article-title>
                    <source>

                        <italic toggle="yes">Proc. Natl. Acad. Sci. U S A.</italic>
</source>
                    <year>1986</year>;<volume>83</volume>:<fpage>2007</fpage>&#x2013;<lpage>2011</lpage>.
                    <pub-id pub-id-type="pmid">3457372</pub-id>
                    <pub-id pub-id-type="doi">10.1073/pnas.83.7.2007</pub-id>
                    <pub-id pub-id-type="pmcid">PMC323219</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref29">
                <label>29</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Koenig</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zarnack</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rot</surname>
                            <given-names>G</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Struct. Mol. Biol.</italic>
</source>
                    <year>2010</year>;<volume>17</volume>:<fpage>909</fpage>&#x2013;<lpage>915</lpage>.
                    <pub-id pub-id-type="pmid">20601959</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nsmb.1838</pub-id>
                    <pub-id pub-id-type="pmcid">PMC3000544</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref30">
                <label>30</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Buratti</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Brindisi</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Giombi</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>TDP-43 binds heterogeneous nuclear ribonucleoprotein A/B through its C-terminal tail: an important region for the inhibition of cystic fibrosis transmembrane conductance regulator exon 9 splicing.</article-title>
                    <source>

                        <italic toggle="yes">J. Biol. Chem.</italic>
</source>
                    <year>2005</year>;<volume>280</volume>:<fpage>37572</fpage>&#x2013;<lpage>37584</lpage>.
                    <pub-id pub-id-type="pmid">16157593</pub-id>
                    <pub-id pub-id-type="doi">10.1074/jbc.M505557200</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref31">
                <label>31</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Aasheim</surname>
                            <given-names>HC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Loukianova</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Deggerdal</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Tissue specific expression and cDNA structure of a human transcript encoding a nucleic acid binding [oligo (dC)] protein related to the pre-mRNA binding protein K.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>1994</year>;<volume>22</volume>:<fpage>959</fpage>&#x2013;<lpage>964</lpage>.
                    <pub-id pub-id-type="pmid">8152927</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/22.6.959</pub-id>
                    <pub-id pub-id-type="pmcid">PMC307915</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref32">
                <label>32</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Thisted</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lyakhov</surname>
                            <given-names>DL</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Liebhaber</surname>
                            <given-names>SA</given-names>
                        </name>
</person-group>:
                    <article-title>Optimized RNA targets of two closely related triple KH domain proteins, heterogeneous nuclear ribonucleoprotein K and alphaCP-2KL, suggest distinct modes of RNA recognition.</article-title>
                    <source>

                        <italic toggle="yes">J. Biol. Chem.</italic>
</source>
                    <year>2001</year>;<volume>276</volume>:<fpage>17484</fpage>&#x2013;<lpage>17496</lpage>.
                    <pub-id pub-id-type="pmid">11278705</pub-id>
                    <pub-id pub-id-type="doi">10.1074/jbc.M010594200</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref33">
                <label>33</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Timchenko</surname>
                            <given-names>LT</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Miller</surname>
                            <given-names>JW</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Timchenko</surname>
                            <given-names>NA</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Identification of a (CUG)n triplet repeat RNA-binding protein and its expression in myotonic dystrophy.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>1996</year>;<volume>24</volume>:<fpage>4407</fpage>&#x2013;<lpage>4414</lpage>.
                    <pub-id pub-id-type="pmid">8948631</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/24.22.4407</pub-id>
                    <pub-id pub-id-type="pmcid">PMC146274</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref34">
                <label>34</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ladd</surname>
                            <given-names>AN</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Charlet</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cooper</surname>
                            <given-names>TA</given-names>
                        </name>
</person-group>:
                    <article-title>The CELF family of RNA binding proteins is implicated in cell-specific and developmentally regulated alternative splicing.</article-title>
                    <source>

                        <italic toggle="yes">Mol. Cell. Biol.</italic>
</source>
                    <year>2001</year>;<volume>21</volume>:<fpage>1285</fpage>&#x2013;<lpage>1296</lpage>.
                    <pub-id pub-id-type="pmid">11158314</pub-id>
                    <pub-id pub-id-type="doi">10.1128/MCB.21.4.1285-1296.2001</pub-id>
                    <pub-id pub-id-type="pmcid">PMC99581</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref35">
                <label>35</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Dembowski</surname>
                            <given-names>JA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Grabowski</surname>
                            <given-names>PJ</given-names>
                        </name>
</person-group>:
                    <article-title>The CUGBP2 splicing factor regulates an ensemble of branchpoints from perimeter binding sites with implications for autoregulation.</article-title>
                    <source>

                        <italic toggle="yes">PLoS Genet.</italic>
</source>
                    <year>2009</year>;<volume>5</volume>:<fpage>e1000595</fpage>.
                    <pub-id pub-id-type="pmid">19680430</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pgen.1000595</pub-id>
                    <pub-id pub-id-type="pmcid">PMC2715136</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref36">
                <label>36</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Marquis</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Paillard</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Audic</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>CUG-BP1/CELF1 requires UGU-rich sequences for high-affinity binding.</article-title>
                    <source>

                        <italic toggle="yes">Biochem. J.</italic>
</source>
                    <year>2006</year>;<volume>400</volume>:<fpage>291</fpage>&#x2013;<lpage>301</lpage>.
                    <pub-id pub-id-type="pmid">16938098</pub-id>
                    <pub-id pub-id-type="doi">10.1042/BJ20060490</pub-id>
                    <pub-id pub-id-type="pmcid">PMC1652823</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref37">
                <label>37</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ohno</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ouchida</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lee</surname>
                            <given-names>L</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The EWS gene, involved in Ewing family of tumors, malignant melanoma of soft parts and desmoplastic small round cell tumors, codes for an RNA binding protein with novel regulatory domains.</article-title>
                    <source>

                        <italic toggle="yes">Oncogene.</italic>
</source>
                    <year>1994</year>;<volume>9</volume>:<fpage>3087</fpage>&#x2013;<lpage>3097</lpage>.
                    <pub-id pub-id-type="pmid">8084618</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref38">
                <label>38</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Xu</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chen</surname>
                            <given-names>CY</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Shyu</surname>
                            <given-names>AB</given-names>
                        </name>
</person-group>:
                    <article-title>Versatile role for hnRNP D isoforms in the differential regulation of cytoplasmic mRNA turnover.</article-title>
                    <source>

                        <italic toggle="yes">Mol. Cell. Biol.</italic>
</source>
                    <year>2001</year>;<volume>21</volume>:<fpage>6960</fpage>&#x2013;<lpage>6971</lpage>.
                    <pub-id pub-id-type="pmid">11564879</pub-id>
                    <pub-id pub-id-type="doi">10.1128/MCB.21.20.6960-6971.2001</pub-id>
                    <pub-id pub-id-type="pmcid">PMC99872</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref39">
                <label>39</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Swanson</surname>
                            <given-names>MS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Dreyfuss</surname>
                            <given-names>G</given-names>
                        </name>
</person-group>:
                    <article-title>Classification and purification of proteins of heterogeneous nuclear ribonucleoprotein particles by RNA-binding specificities.</article-title>
                    <source>

                        <italic toggle="yes">Mol. Cell. Biol.</italic>
</source>
                    <year>1988</year>;<volume>8</volume>:<fpage>2237</fpage>&#x2013;<lpage>2241</lpage>.
                    <pub-id pub-id-type="pmid">3386636</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref40">
                <label>40</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Welk</surname>
                            <given-names>JF</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Charlesworth</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Smith</surname>
                            <given-names>GD</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Identification and characterization of the gene encoding human cytoplasmic polyadenylation element binding protein.</article-title>
                    <source>

                        <italic toggle="yes">Gene.</italic>
</source>
                    <year>2001</year>;<volume>263</volume>:<fpage>113</fpage>&#x2013;<lpage>120</lpage>.
                    <pub-id pub-id-type="pmid">11223249</pub-id>
                    <pub-id pub-id-type="doi">10.1016/S0378-1119(00)00588-6</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref41">
                <label>41</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Richter-Cook</surname>
                            <given-names>NJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Dever</surname>
                            <given-names>TE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hensold</surname>
                            <given-names>JO</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Purification and characterization of a new eukaryotic protein translation factor. Eukaryotic initiation factor 4H.</article-title>
                    <source>

                        <italic toggle="yes">J. Biol. Chem.</italic>
</source>
                    <year>1998</year>;<volume>273</volume>:<fpage>7579</fpage>&#x2013;<lpage>7587</lpage>.
                    <pub-id pub-id-type="doi">10.1074/jbc.273.13.7579</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref42">
                <label>42</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Valcarcel</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gebauer</surname>
                            <given-names>F</given-names>
                        </name>
</person-group>:
                    <article-title>Post-transcriptional regulation: the dawn of PTB.</article-title>
                    <source>

                        <italic toggle="yes">Curr. Biol.</italic>
</source>
                    <year>1997</year>;<volume>7</volume>:<fpage>R705</fpage>&#x2013;<lpage>R708</lpage>.
                    <pub-id pub-id-type="pmid">9382788</pub-id>
                    <pub-id pub-id-type="doi">10.1016/S0960-9822(06)00361-7</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref43">
                <label>43</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Miller</surname>
                            <given-names>JW</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Urbinati</surname>
                            <given-names>CR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Teng-Umnuay</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Recruitment of human muscleblind proteins to (CUG)(n) expansions associated with myotonic dystrophy.</article-title>
                    <source>

                        <italic toggle="yes">EMBO J.</italic>
</source>
                    <year>2000</year>;<volume>19</volume>:<fpage>4439</fpage>&#x2013;<lpage>4448</lpage>.
                    <pub-id pub-id-type="doi">10.1093/emboj/19.17.4439</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref44">
                <label>44</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hahm</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cho</surname>
                            <given-names>OH</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kim</surname>
                            <given-names>JE</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Polypyrimidine tract-binding protein interacts with HnRNP L.</article-title>
                    <source>

                        <italic toggle="yes">FEBS Lett.</italic>
</source>
                    <year>1998</year>;<volume>425</volume>:<fpage>401</fpage>&#x2013;<lpage>406</lpage>.
                    <pub-id pub-id-type="pmid">9563502</pub-id>
                    <pub-id pub-id-type="doi">10.1016/S0014-5793(98)00269-5</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref45">
                <label>45</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Iko</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kodama</surname>
                            <given-names>TS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kasai</surname>
                            <given-names>N</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Domain architectures and characterization of an RNA-binding protein, TLS.</article-title>
                    <source>

                        <italic toggle="yes">J. Biol. Chem.</italic>
</source>
                    <year>2004</year>;<volume>279</volume>:<fpage>44834</fpage>&#x2013;<lpage>44840</lpage>.
                    <pub-id pub-id-type="pmid">15299008</pub-id>
                    <pub-id pub-id-type="doi">10.1074/jbc.M408552200</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref46">
                <label>46</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>Z</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Morris</surname>
                            <given-names>GF</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rice</surname>
                            <given-names>AP</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Wild-type and transactivation-defective mutants of human immunodeficiency virus type 1 Tat protein bind human TATA-binding protein in vitro.</article-title>
                    <source>

                        <italic toggle="yes">J. Acquir. Immune Defic. Syndr. Hum. Retrovirol.</italic>
</source>
                    <year>1996</year>;<volume>12</volume>:<fpage>128</fpage>&#x2013;<lpage>138</lpage>.
                    <pub-id pub-id-type="pmid">8680883</pub-id>
                    <pub-id pub-id-type="doi">10.1097/00042560-199606010-00005</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref47">
                <label>47</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lerga</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hallier</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Delva</surname>
                            <given-names>L</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Identification of an RNA binding specificity for the potential splicing factor TLS.</article-title>
                    <source>

                        <italic toggle="yes">J. Biol. Chem.</italic>
</source>
                    <year>2001</year>;<volume>276</volume>:<fpage>6807</fpage>&#x2013;<lpage>6816</lpage>.
                    <pub-id pub-id-type="pmid">11098054</pub-id>
                    <pub-id pub-id-type="doi">10.1074/jbc.M008304200</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref48">
                <label>48</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Schlusser</surname>
                            <given-names>N</given-names>
                        </name>
</person-group>:
                    <article-title>PWMs from RNA Bind&#x2019;n&#x2019;Seq data (3.0).</article-title>
                    <year>Apr, 2024</year>.
                    <pub-id pub-id-type="doi">10.5281/zenodo.8028034</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report284353">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.165610.r284353</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Zhang</surname>
                        <given-names>Jun</given-names>
                    </name>
                    <xref ref-type="aff" rid="r284353a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-5842-7424</uri>
                </contrib>
                <aff id="r284353a1">
                    <label>1</label>Texas Tech University Health Science Center El Paso (TTUHSCEP), El Paso, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>24</day>
                <month>7</month>
                <year>2024</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2024 Zhang J</copyright-statement>
                <copyright-year>2024</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport284353" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.135164.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The revision has addressed the issues in the first version.</p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Partly</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Partly</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>No source data required</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>No</p>
            <p>Reviewer Expertise:</p>
            <p>RNA-binding protein, structural biology.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report299542">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.165610.r299542</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Schlundt</surname>
                        <given-names>Andreas</given-names>
                    </name>
                    <xref ref-type="aff" rid="r299542a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <aff id="r299542a1">
                    <label>1</label>Institute for Biochemistry, University of Greifswald, Greifswald, Germany</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>22</day>
                <month>7</month>
                <year>2024</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2024 Schlundt A</copyright-statement>
                <copyright-year>2024</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport299542" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.135164.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This manuscript by Schlusser and Zavolan presents a bioinformatic approach to infer PWM representations of motifs targeted by RNA-binding proteins. It uses available RBnS data from the ENCODE database. Its suggested strength and improvement over the canonical way of inferring consensus motifs from RBnS data is that it bypasses the prior definition of k-mers for the alignment procedure of enriched sequences.</p>
            <p>As a result, the authors present convergent and specific motifs for 82 RBPs (out of 111 in the database). For 48 of these there is complete agreement, and for 14 partial agreement, with previously reported motifs, taking the RBnS database as the correct reference.</p>
            <p>The current manuscript version has undergone a prior round of revision, with visible improvement. Although I am not an expert in the bioinformatic background, I can clearly see the (functionally relevant) motivation of the approach and highly appreciate it in light of the need to exploit the in-depth information of HTS-derived RBP motifs beyond the current standards. Still, I feel the approach fails to address, or at least discuss, the most obvious challenge (which could perhaps be implemented in the longer run): the majority of RBPs use more than one domain to interact with RNA, and individual domains may contribute with orders-of-magnitude differences in affinity to the combined specific target motif, while only this combined motif is the functionally relevant, specific one. While this is an ongoing challenge in the interpretation of all CLIP and RBnS data, I wonder to what extent the approach presented here could address this issue more realistically.</p>
            <p>In addition, here are the concrete points that I suggest the authors address and clarify:</p>
            <p>1) The authors themselves mention the possible role of spaced elements targeted by multiple domains within one RBP. As with RBnS, a proper analysis in this direction is challenging, but it could, for example, include the consideration of multiple motifs. Have the authors thought about this? I suggest a bit more discussion in this direction. Mixed motifs with weighted contributions from the relative affinities of domains will remain a major point to address in the future.</p>
            <p>2) In line with the above: considering multiple binding sites of an RBP on an oligo may mean either more than one RBP binding to it or multiple domains of one RBP binding to one combined element. I may not have fully understood how the estimation of Kd values according to formula (2.6) takes this into consideration.</p>
            <p>3) Why is it assumed (page 5) that RBP binding is usually of low affinity in order to simplify the equation? Many RBPs bind in the low nM range, which is quite strong, be it specific or not.</p>
            <p>4) Is there any particular way to treat motifs that are apparently part of an RNA structure (Rc3h1)? Such a motif should be highly convergent and specific regarding the central 5-mer, but likely not in a larger context (i.e., the stem beneath the 3-5 nt loop). There may be fewer such structured motifs than we expect, but on the other hand there could be more of them hidden in the RBnS data, and we simply neglect their fold context.</p>
            <p>5) For the less convergent/specific RBPs: is there a correlation with the number of domains present? In addition, some of the motifs shown in Fig. 2 seem to show large differences between the low and high K<sub>D</sub>. Could this correlate with the number of RBDs in the RBP, such that the high-affinity motifs come from the strong-binding domains and the others from the subordinate domains? And could the occurrence of polyA also relate to the number of domains and thus to too many parallel motifs (e.g., for the IGF2BPs)? Apparently exactly those cases would need the definition of a motif cluster, while current approaches merely provide a weighted motif, which in the end is not precisely the right one for any of the six domains.</p>
            <p>6) Have the authors specifically taken into consideration the different concentrations of RBPs in the RBnS data, especially when multiple concentrations were tested? Sorry if I have missed this.</p>
            <p>7) Could the authors, in the end, provide a fair discussion of how their inferred motifs should now be seen relative to RBnS motifs? Do they suggest questioning the RBnS logos (which in the end could still be &#x201c;wrong&#x201d;), or do they just claim to have provided a faster, more efficient, and more user-friendly way to derive logos from RBnS data?</p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Partly</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Partly</p>
            <p>Reviewer Expertise:</p>
            <p>Structural biochemistry/biophysics of multi-domain RNA-binding proteins and their target RNAs centered around solution methods NMR and SAXS</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report299537">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.165610.r299537</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Wang</surname>
                        <given-names>Junbai</given-names>
                    </name>
                    <xref ref-type="aff" rid="r299537a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-5505-5613</uri>
                </contrib>
                <aff id="r299537a1">
                    <label>1</label>University of Oslo, Campus AHUS/Oslo, Norway</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>9</day>
                <month>7</month>
                <year>2024</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2024 Wang J</copyright-statement>
                <copyright-year>2024</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport299537" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.135164.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>reject</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The manuscript by Schlusser N and Zavolan M describes a biophysical model for inferring motifs of RNA-binding proteins. The authors describe the derivation, from the LL function to the proposed EM algorithm, for predicting enriched RNA-binding motifs in sequences. The method was tested on RNA Bind&#x2019;n Seq data from ENCODE, succeeding on 7 RNA-binding proteins but failing on the other 4 RNA-binding proteins mentioned in the main text. In total, the authors tested the method on 111 RNA-binding datasets; 82 of them yielded results, of which 48 showed good agreement and 34 showed either marginal or poor agreement. In summary, the success rate of this new method on the 111 datasets is only around 43%, which is very low compared with many public tools for DNA/RNA motif analysis. There are three major problems in the manuscript: 
                <list list-type="order">
                    <list-item>
                        <p>The theory behind the proposed biophysical model for motif finding is very unclear to me. The authors claim that their method is based on a Boltzmann machine, which has been used for many decades in protein-DNA interaction theory: for example, in a classical biophysics paper published in 1987 by Berg OG et al. (Ref 1); in several applications to DNA sequence analysis by Djordjevic M et al., 2003 (Ref 2) and Foat BC et al., 2006 (Ref 3); and in more recent developments of a more advanced Fermi-Dirac form of protein-DNA interactions, which considers protein concentration with a Bayesian solution, by Wang J et al., 2009 (Ref 4) and Yang M et al., 2023 (Ref 5) for DNA motif finding. From my point of view, there is no clear association between the authors&#x2019; sections (2.1, 2.2) and the biophysical theory of the aforementioned works, which have been applied successfully to numerous datasets and applications. In particular, I am completely lost when the authors write &#x201c;combine (2.9) with (2.5), the output of the optimization procedure, and the reference (2.10) allows us to compute the logarithm of the dissociation constant of RBP-RNA binding &#x2026;&#x201d; in section 2.2, because I do not understand the reason for combining these two equations. After going through sections 3.1 and 3.2, I feel that the proposed method is in fact analogous to k-mer motif enrichment tests, such as that of Ghandi M et al., 2014 (Ref 6).</p>
                    </list-item>
                    <list-item>
                        <p>The manuscript describes motif prediction for RNA-binding proteins, but all of the figures show DNA-binding motifs, where T should be replaced by U in the RNA sequence. It is very odd that the paper describes RNA-binding proteins but all sequence logos are DNA-binding motifs. The authors have to correct these errors in all sequence logo figures. As we all know, methods for motif finding in DNA and RNA sequences are almost the same, but the prediction results (e.g., DNA-binding motif, RNA-binding motif) are not the same in biology.</p>
                    </list-item>
                    <list-item>
                        <p>I am not able to compile and run the C++ code that the authors provided on GitHub, nor can I find demo input data to reproduce their results. The authors need to provide compiled binaries (e.g., Linux and Mac versions) as well as relevant demo data so that readers can reproduce some of the predictions in the manuscript.</p>
                    </list-item>
                </list></p>
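            <p>To make the contrast between the two functional forms mentioned in point 1 concrete, the following minimal Python sketch compares a Boltzmann weight with a Fermi-Dirac occupancy. It is an illustration under stated assumptions, not the implementation of any of the cited tools or of the manuscript: energies are expressed in units of kT, and the function names and the parameter mu (a chemical-potential-like term that increases with protein concentration) are hypothetical.</p>
            <code language="python">
```python
import math


def boltzmann_weight(energy):
    """Unnormalized Boltzmann weight exp(-E), with E in units of kT."""
    return math.exp(-energy)


def fermi_dirac_occupancy(energy, mu):
    """Occupancy 1 / (1 + exp(E - mu)), with E and mu in units of kT.

    mu is a hypothetical chemical-potential-like parameter; a higher
    protein concentration corresponds to a larger mu.
    """
    return 1.0 / (1.0 + math.exp(energy - mu))


# Illustrative binding energies (strong, intermediate, weak site); made up.
energies = [-4.0, 0.0, 4.0]
mu = 0.0  # assumed value, for illustration only

for energy in energies:
    weight = boltzmann_weight(energy)
    occupancy = fermi_dirac_occupancy(energy, mu)
    print(f"E={energy:+.1f}  Boltzmann={weight:.4g}  Fermi-Dirac={occupancy:.4g}")
```
            </code>
            <p>When the energy lies well above mu, the occupancy approaches the Boltzmann weight exp(-(E - mu)), whereas for strong sites it saturates at 1, which is the regime in which protein concentration matters.</p>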
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>No</p>
            <p>Is the description of the method technically sound?</p>
            <p>No</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>No</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>No</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>No</p>
            <p>Reviewer Expertise:</p>
            <p>Theoretical Physics, Computational biology, Bioinformatics, Applied Mathematics, Algorithm</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.</p>
        </body>
        <back>
            <ref-list>
                <title>References</title>
                <ref id="rep-ref-299537-1">
                    <label>1</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"/>:
                        <article-title>Selection of DNA binding sites by regulatory proteins. Statistical-mechanical theory and application to operators and promoters.</article-title>
                        <source>
                            <italic>J Mol Biol</italic>
                        </source>.<year>1987</year>;<volume>193</volume>(<issue>4</issue>) :
                        <elocation-id>10.1016/0022-2836(87)90354-8</elocation-id>
                        <fpage>723</fpage>-<lpage>50</lpage>
                        <pub-id pub-id-type="pmid">3612791</pub-id>
                        <pub-id pub-id-type="doi">10.1016/0022-2836(87)90354-8</pub-id>
                    </mixed-citation>
                </ref>
                <ref id="rep-ref-299537-2">
                    <label>2</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"/>:
                        <article-title>A biophysical approach to transcription factor binding site discovery.</article-title>
                        <source>
                            <italic>Genome Res</italic>
                        </source>.<year>2003</year>;<volume>13</volume>(<issue>11</issue>) :
                        <elocation-id>10.1101/gr.1271603</elocation-id>
                        <fpage>2381</fpage>-<lpage>90</lpage>
                        <pub-id pub-id-type="pmid">14597652</pub-id>
                        <pub-id pub-id-type="doi">10.1101/gr.1271603</pub-id>
                    </mixed-citation>
                </ref>
                <ref id="rep-ref-299537-3">
                    <label>3</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"/>:
                        <article-title>Statistical mechanical modeling of genome-wide transcription factor occupancy data by MatrixREDUCE.</article-title>
                        <source>
                            <italic>Bioinformatics</italic>
                        </source>.<year>2006</year>;<volume>22</volume>(<issue>14</issue>) :
                        <elocation-id>10.1093/bioinformatics/btl223</elocation-id>
                        <fpage>e141</fpage>-<lpage>9</lpage>
                        <pub-id pub-id-type="pmid">16873464</pub-id>
                        <pub-id pub-id-type="doi">10.1093/bioinformatics/btl223</pub-id>
                    </mixed-citation>
                </ref>
                <ref id="rep-ref-299537-4">
                    <label>4</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"/>:
                        <article-title>BayesPI - a new model to study protein-DNA interactions: a case study of condition-specific protein binding parameters for Yeast transcription factors.</article-title>
                        <source>
                            <italic>BMC Bioinformatics</italic>
                        </source>.<year>2009</year>;<volume>10</volume>:
                        <elocation-id>10.1186/1471-2105-10-345</elocation-id>
                        <fpage>345</fpage>
                        <pub-id pub-id-type="pmid">19857274</pub-id>
                        <pub-id pub-id-type="doi">10.1186/1471-2105-10-345</pub-id>
                    </mixed-citation>
                </ref>
                <ref id="rep-ref-299537-5">
                    <label>5</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"/>:
                        <article-title>Identifying functional regulatory mutation blocks by integrating genome sequencing and transcriptome data.</article-title>
                        <source>
                            <italic>iScience</italic>
                        </source>.<year>2023</year>;<volume>26</volume>(<issue>8</issue>) :
                        <elocation-id>10.1016/j.isci.2023.107266</elocation-id>
                        <fpage>107266</fpage>
                        <pub-id pub-id-type="pmid">37520692</pub-id>
                        <pub-id pub-id-type="doi">10.1016/j.isci.2023.107266</pub-id>
                    </mixed-citation>
                </ref>
                <ref id="rep-ref-299537-6">
                    <label>6</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"/>:
                        <article-title>Enhanced regulatory sequence prediction using gapped k-mer features.</article-title>
                        <source>
                            <italic>PLoS Comput Biol</italic>
                        </source>.<year>2014</year>;<volume>10</volume>(<issue>7</issue>) :
                        <elocation-id>10.1371/journal.pcbi.1003711</elocation-id>
                        <fpage>e1003711</fpage>
                        <pub-id pub-id-type="pmid">25033408</pub-id>
                        <pub-id pub-id-type="doi">10.1371/journal.pcbi.1003711</pub-id>
                    </mixed-citation>
                </ref>
            </ref-list>
        </back>
        <sub-article article-type="response" id="comment15438-299537">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Schlusser</surname>
                            <given-names>Niels</given-names>
                        </name>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>10</day>
                    <month>2</month>
                    <year>2026</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <bold>1. The theory behind the proposed biophysical model for motif finding is very unclear to me. The authors claim that their method is based on a Boltzmann machine, which has been used for many decades in protein-DNA interaction theory, for example in a classical biophysics paper by Berg OG et al., 1987 (Ref 1), in several applications to DNA sequence analysis by Djordjevic M et al., 2003 (Ref 2) and Foat BC et al., 2006 (Ref 3), and in newer developments of a more advanced Fermi-Dirac form of protein-DNA interactions that considers protein concentration with a Bayesian solution for DNA motif finding, Wang J et al., 2009 (Ref 4) and Yang M et al., 2023 (Ref 5). From my point of view, there is no clear association between the authors&#x2019; sections (2.1, 2.2) and the biophysical theory of the aforementioned works, which have been applied successfully to numerous datasets and applications. In particular, I am completely lost when the authors write &#x201c;combine (2.9) with (2.5), the output of the optimization procedure, and the reference (2.10) allows us to compute the logarithm of the dissociation constant of RBP-RNA binding &#x2026;&#x201d; in section 2.2, because I do not understand the reason for combining these two equations. After going through sections 3.1 and 3.2, I feel that the proposed method is actually analogous to k-mer motif enrichment tests such as that of Ghandi M et al., 2014 (Ref 6).</bold>
                </p>
                <p> </p>
                <p> Indeed, there is a large literature on inferring the sequence specificity of nucleic acid-binding proteins, including with biophysical models. As explained in the text, the motifs that were derived from Bind&#x2019;n&#x2019;Seq data did not result from the application of any of these previously developed approaches; applying such an approach is in fact what we did in our study. We used a Bayesian thermodynamic model that was initially developed to analyze SELEX data and that explicitly models sequence-non-specific interactions between proteins and RNA. We felt that such interactions could play an important role in shaping the Bind&#x2019;n&#x2019;Seq data. Within the framework of our model, we then derived relative KD values that allow us to compare binding sites of different possible lengths. We realized that the sentence the reviewer refers to was somewhat confusing, and we rewrote it to make the derivation clearer.</p>
                <p> </p>
                <p> 
                    <bold>2. The manuscript describes motif prediction for RNA-binding proteins, but all of the figures show DNA-binding motifs, where T should be replaced by U in the RNA sequence. It is very odd that the paper describes RNA-binding proteins but all sequence logos are DNA-binding motifs. The authors have to correct these errors in all sequence logo figures. As we all know, methods for motif finding in DNA and RNA sequences are almost the same, but the prediction results (e.g., DNA-binding motif, RNA-binding motif) are not the same in biology.</bold>
                </p>
                <p> </p>
                <p> On page 11, Section 4, we explain why the logos show Ts and not Us. We do not expect that users familiar with RNA biology will be confused by these logos.</p>
                <p> </p>
                <p> </p>
                <p> 
                    <bold>3. I am not able to compile and run the C++ code that the authors provided on GitHub, nor can I find demo input data to reproduce their results. The authors need to provide compiled binaries (e.g., Linux and Mac versions) as well as relevant demo data so that readers can reproduce some of the predictions in the manuscript.</bold>
                </p>
                <p> </p>
                <p> We have revisited the repository and included demo data as well as a Makefile for ease of compilation. We have not included executables, as they would cause compatibility issues later on. The only remaining dependency is GSL, a long-supported open-source package.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report284354">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.165610.r284354</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>S&#x00f6;ding</surname>
                        <given-names>Johannes</given-names>
                    </name>
                    <xref ref-type="aff" rid="r284354a1">1</xref>
                    <xref ref-type="aff" rid="r284354a2">2</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-9642-8244</uri>
                </contrib>
                <aff id="r284354a1">
                    <label>1</label>Quantitative and Computational Biology, Max Planck Institute for Multidisciplinary Sciences, G&#x00f6;ttingen, Germany</aff>
                <aff id="r284354a2">
                    <label>2</label>Campus-Institut Data Science (CIDAS), Georg-August-Universit&#x00e4;t G&#x00f6;ttingen, G&#x00f6;ttingen, Lower Saxony, Germany</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>5</day>
                <month>7</month>
                <year>2024</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2024 S&#x00f6;ding J</copyright-statement>
                <copyright-year>2024</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport284354" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.135164.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>I am glad to see that the method appears to work well after the adjustment to the background model.</p>
            <p> </p>
            <p> 1. What other methods or software tools exist that can analyze Bind&#x2019;n&#x2019;seq data?</p>
            <p> </p>
            <p> 2. How does your tool compare (quantitatively) with others to analyze Bind&#x2019;n&#x2019;seq data?</p>
            <p> </p>
            <p> </p>
            <p> Minor comments:</p>
            <p> </p>
            <p> 3. Equation (2.4): &#x00a0;P(IG|S,c,M,E_0)&#x00a0; = &#x2026; =&gt;&#x00a0; P(S |&#x00a0; bound, c, M, E&#x2080;) =</p>
            <p> As a simple general rule, when you sum over all possible outcomes of all variables on the left of the conditioning bar |, you have to get 1. Here, summing over all possible sequences S in your library gives 1, thanks to the normalization in the denominator from Bayes&#x2019; rule.</p>
            <p> </p>
            <p> Same notation in the text a few lines below the equation.</p>
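            <p>The normalization rule in comment 3 can be sketched numerically. The minimal Python example below uses a hypothetical three-sequence library with made-up values for P(bound | S) and P(S); the function name and all numbers are assumptions chosen for illustration only and are not taken from the manuscript.</p>
            <code language="python">
```python
def posterior_over_sequences(binding_prob, prior):
    """Return P(S | bound) for every sequence S via Bayes' rule.

    binding_prob[s] is P(bound | S = s); prior[s] is P(S = s).
    """
    # Denominator of Bayes' rule: P(bound), summed over the whole library.
    evidence = sum(binding_prob[s] * prior[s] for s in prior)
    return {s: binding_prob[s] * prior[s] / evidence for s in prior}


# Hypothetical library; every number below is invented for illustration.
prior = {"GCAUG": 0.2, "AAAAA": 0.5, "CUGCU": 0.3}         # P(S), sums to 1
binding_prob = {"GCAUG": 0.9, "AAAAA": 0.1, "CUGCU": 0.4}  # P(bound | S)

posterior = posterior_over_sequences(binding_prob, prior)

# Sum over the variable on the left of the conditioning bar in each case.
sum_posterior = sum(posterior.values())        # P(S | bound) summed over S
sum_binding_prob = sum(binding_prob.values())  # P(bound | S) summed over S

print("sum over S of P(S | bound):", sum_posterior)
print("sum over S of P(bound | S):", sum_binding_prob)
```
            </code>
            <p>The binding probabilities P(bound | S) need not sum to 1 over S, but the posterior P(S | bound) does, because the denominator sums the numerator over all sequences in the library.</p>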
            <p> </p>
            <p> 4. A bit below: P(S) = f_s is *not* a likelihood but a prior probability.</p>
            <p> 5. In the conclusion, you wrote &#x201c;In addition, it is possible that the binding sites of these proteins are not contiguous, linear motifs, but rather contain variable length spacers of form structures such as G-quadruplexes.&#x201d; You could cite these manuscripts here:</p>
            <p> (Jolma A, et al., 2020 [Ref 1]), (Sohrabi-Jahromi S, 2021 [Ref 2])</p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>No</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>No</p>
            <p>Reviewer Expertise:</p>
            <p>Biophysics, statistical modeling</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <back>
            <ref-list>
                <title>References</title>
                <ref id="rep-ref-284354-1">
                    <label>1</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"/>:
                        <article-title>Binding specificities of human RNA-binding proteins toward structured and linear RNA sequences.</article-title>
                        <source>
                            <italic>Genome Res</italic>
                        </source>.<year>2020</year>;<volume>30</volume>(<issue>7</issue>) :
                        <elocation-id>10.1101/gr.258848.119</elocation-id>
                        <fpage>962</fpage>-<lpage>973</lpage>
                        <pub-id pub-id-type="pmid">32703884</pub-id>
                        <pub-id pub-id-type="doi">10.1101/gr.258848.119</pub-id>
                    </mixed-citation>
                </ref>
                <ref id="rep-ref-284354-2">
                    <label>2</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"/>:
                        <article-title>Thermodynamic modeling reveals widespread multivalent binding by RNA-binding proteins.</article-title>
                        <source>
                            <italic>Bioinformatics</italic>
                        </source>.<year>2021</year>;<volume>37</volume>(<issue>Suppl_1</issue>) :
                        <elocation-id>10.1093/bioinformatics/btab300</elocation-id>
                        <fpage>i308</fpage>-<lpage>i316</lpage>
                        <pub-id pub-id-type="pmid">34252974</pub-id>
                        <pub-id pub-id-type="doi">10.1093/bioinformatics/btab300</pub-id>
                    </mixed-citation>
                </ref>
            </ref-list>
        </back>
        <sub-article article-type="response" id="comment15437-284354">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Schlusser</surname>
                            <given-names>Niels</given-names>
                        </name>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>10</day>
                    <month>2</month>
                    <year>2026</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <bold>1. What other methods or software tools exist that can analyze Bind&#x2019;n&#x2019;seq data?</bold>
                </p>
                <p> </p>
                <p> The ENCODE resource of RBP binding motifs is based on the &#x201c;streaming k-mer algorithm&#x201d; (SKA) that was developed along with the experimental method (
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.molcel.2014.04.016">https://doi.org/10.1016/j.molcel.2014.04.016</ext-link>). These motifs are typically used as such in downstream analyses (e.g. 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1093/bib/bbab149">https://doi.org/10.1093/bib/bbab149</ext-link>). Dissociation constants calculated from Bind&#x2019;n&#x2019;Seq data with SKA correlated better with the values measured by surface plasmon resonance for a few control RBPs (RBFOX2, CELF1, MBNL1) than the estimates based on motif enrichment scores. We have not come across many efforts to reanalyse these data. One study adapted the Bind&#x2019;n&#x2019;Seq strategy to determine protein-DNA interactions in bacteria and in the process applied a different type of motif enrichment approach, implemented as an R script and seemingly optimized for bacteria (
                    <ext-link ext-link-type="uri" xlink:href="https://bmcmicrobiol.biomedcentral.com/articles/10.1186/s12866-019-1672-7">https://bmcmicrobiol.biomedcentral.com/articles/10.1186/s12866-019-1672-7</ext-link>). Another study compared motifs and predictions made for RBPs studied with multiple methods (RNAcompete, Bind&#x2019;n&#x2019;Seq, CLIP) and in this context tested various ways to estimate motif enrichment scores (
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1093/bib/bbab149">https://doi.org/10.1093/bib/bbab149</ext-link>). Differences between motif enrichment scores calculated in various ways from Bind&#x2019;n&#x2019;Seq data were negligible, while the differences between the motifs obtained with different experimental techniques were substantial.</p>
                <p> </p>
                <p> 
                    <bold>2. How does your tool compare (quantitatively) with others to analyze Bind&#x2019;n&#x2019;seq data?</bold>
                </p>
                <p> </p>
                <p> The only method previously used to analyze Bind&#x2019;n&#x2019;seq data is SKA. In the absence of &#x201c;ground truth&#x201d; PWMs, which might be obtained, e.g., from biochemical measurements, we could only assess global features of the results of our approach and of SKA: the number/proportion of experiments in which a motif could be detected, the number/proportion of motifs that were similar, and the complexity of the respective motifs when the motifs disagreed. These results are presented at the end of the second paragraph of Section 4: Results (top of page 8): SKA always reports a motif, even when it does not agree with the consensus known from other types of experiments. We identified motifs for 82 of the 111 RBPs (74%). In 48 cases, these motifs agreed with or contained the consensus motif from SKA, and in 14 additional cases the motifs were more distant, though still bearing some similarity. However, in 19 cases our method could not converge on the SKA-reported motif, despite starting from the SKA-reported consensus. Rather, the motifs identified were poly(A). This supports the notion that some experiment-dependent factors obscure the specificity of RBPs.</p>
                <p> </p>
                <p> 
                    <bold>Minor comments:&#x00a0;</bold>
                </p>
                <p> </p>
                <p> 
                    <bold>3. Equation (2.4):&#x00a0; P(IG|S,c,M,E_0)&#x00a0; = &#x2026; =&gt;&#x00a0; P(S |&#x00a0; bound, c, M, E&#x2080;) =</bold>
                </p>
                <p> 
                    <bold>As a simple general rule, when you sum over all possible outcomes of all variables on the left of the conditioning bar |, you have to get 1. Here, summing over all possible sequences S in your library gives 1, thanks to the normalization in the denominator from Bayes&#x2019; rule.</bold>
                </p>
                <p> </p>
                <p> 
                    <bold>Same notation in the text a few lines below the equation.</bold>
                </p>
                <p> </p>
                <p> </p>
                <p> We agree with the referee and have changed the notation as suggested.</p>
                <p> </p>
                <p> 
                    <bold>4. A bit below: P(S) = f_s is *not* a likelihood but a prior probability.</bold>
                </p>
                <p> </p>
                <p> We have changed the wording to &#x201c;probability&#x201d; instead of &#x201c;likelihood.&#x201d;</p>
                <p> </p>
                <p> 
                    <bold>5. In the conclusion, you wrote &#x201c;In addition, it is possible that the binding sites of these proteins are not contiguous, linear motifs, but rather contain variable length spacers of form structures such as G-quadruplexes.&#x201d; You could cite these manuscripts here:</bold>
                </p>
                <p> (Jolma A, et al., 2020 [Ref 1]), (Sohrabi-Jahromi S, 2021 [Ref 2])</p>
                <p> </p>
                <p> We have added the two citations to the discussion.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report215390">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.148267.r215390</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Zhang</surname>
                        <given-names>Jun</given-names>
                    </name>
                    <xref ref-type="aff" rid="r215390a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-5842-7424</uri>
                </contrib>
                <aff id="r215390a1">
                    <label>1</label>Texas Tech University Health Science Center El Paso (TTUHSCEP), El Paso, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>22</day>
                <month>11</month>
                <year>2023</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2023 Zhang J</copyright-statement>
                <copyright-year>2023</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport215390" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.135164.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>reject</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The manuscript presents a thermodynamic model for scrutinizing RNA motifs associated with RNA-binding proteins, utilizing position-specific weight matrices. Although the method effectively characterizes the RNA-binding specificity of RBFOX2, its application to other RNA-binding proteins, such as CELF1, HNRNPD, and HNRNPK, proves unsuccessful. The manuscript, categorized as a method paper, lacks comprehensive details about the model's development and fails to adequately address why the method specifically succeeds for RBFOX2.</p>
            <p> </p>
            <p> Key areas of improvement are outlined below: 
                <list list-type="order">
                    <list-item>
                        <p>
                            <bold>Unclear Definition of Parameter PIP:</bold>
                        </p>
                        <p> &#x00a0; 
                            <list list-type="bullet">
                                <list-item>
                                    <p>The manuscript lacks a clear definition of the parameter PIP. Providing a concise explanation will enhance reader understanding and facilitate the application of the method.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>
                            <bold>Determination of Parameter c and its Relation to Protein Concentrations:</bold>
                        </p>
                        <p> &#x00a0; 
                            <list list-type="bullet">
                                <list-item>
                                    <p>Clarify how the parameter c is determined and elaborate on its relationship with the concentrations of RNA-binding proteins. A more in-depth explanation will contribute to the method's transparency.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>
                            <bold>Thermodynamic Model Explanation:</bold>
                        </p>
                        <p> &#x00a0; 
                            <list list-type="bullet">
                                <list-item>
                                    <p>Offer a more detailed explanation of the thermodynamic model to eliminate the necessity for readers to consult the original reference. This will enhance accessibility and comprehension for a wider audience.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>
                            <bold>Derivation of Equation 2.1:</bold>
                        </p>
                        <p> &#x00a0; 
                            <list list-type="bullet">
                                <list-item>
                                    <p>Clearly outline the derivation of Equation 2.1 to provide readers with insights into the model's foundational principles. This will aid in understanding the method's inner workings.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>
                            <bold>Computation Time Information:</bold>
                        </p>
                        <p>
                            <list list-type="bullet">
                                <list-item>
                                    <p>Include information about the expected computation time, as this is crucial for users assessing the feasibility of implementing the method in their own studies.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>
                            <bold>Evidence for Data Quality Variation:</bold>
                        </p>
                        <p>
                            <list list-type="bullet">
                                <list-item>
                                    <p>While the manuscript attributes the method's failure for CELF1, HNRNPD, and HNRNPK to lower data quality, provide concrete evidence supporting this claim. Offer a detailed analysis of the data quality disparities between RBFOX2 and the other proteins.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>
                            <bold>Consideration of Variable Spacer Length in RNA-Binding Proteins:</bold> 
                            <list list-type="bullet">
                                <list-item>
                                    <p>Acknowledge the fact that many RNA-binding proteins, including CELF1, HNRNPD, and HNRNPK, bind to RNA with variable spacer lengths. The manuscript should discuss why the PWM method, assuming a fixed RNA motif length, may be inadequate for proteins with multiple RNA-binding domains connected by long linkers. This acknowledgment will help users understand the method's limitations and guide appropriate applications.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                </list> In conclusion, addressing these points will significantly enhance the manuscript's clarity, transparency, and utility, providing a more comprehensive understanding of the developed method and its limitations.</p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Partly</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Partly</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>No source data required</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>No</p>
            <p>Reviewer Expertise:</p>
            <p>RNA-binding protein, structural biology.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment12055-215390">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Schlusser</surname>
                            <given-names>Niels</given-names>
                        </name>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>22</day>
                    <month>7</month>
                    <year>2024</year>
                </pub-date>
            </front-stub>
            <body>
                <p>Dear Reviewer,&#x00a0;</p>
                <p> </p>
                <p> We apologize for the delay in submitting the answer to your valuable comments. The delay was caused by a technical issue with the web portal.</p>
                <p> Thank you for the insightful comments and questions related to our paper, which we have taken into careful consideration in revising our manuscript. We believe the manuscript is much improved as a result. Below we comment on the most significant changes.</p>
                <p> </p>
                <p> As a result of the revised terminology and calculation of the frequency priors f_S, some of the comments have become superfluous. Nevertheless, the main comments were addressed as follows: 
                    <list list-type="order">
                        <list-item>
                            <p>We clarified the definition of P_IP in and around Eq. (2.4) and revised the notation.</p>
                        </list-item>
                        <list-item>
                            <p>Around Eq. (2.1), we elaborated on the role of c, the ratio of the concentrations of bound and unbound RBP.</p>
                        </list-item>
                        <list-item>
                            <p>We explained the derivation of the thermodynamic model in greater detail (cf. the comments above), so that the reader no longer needs to consult reference [1].</p>
                        </list-item>
                        <list-item>
                            <p>We also gave more detail around Eq. (2.1) (cf. item 2).</p>
                        </list-item>
                        <list-item>
                            <p>Information about the computation time is given in the first paragraph of Sec. 4, &#x201c;Results&#x201d;.</p>
                        </list-item>
                        <list-item>
                            <p>Regarding the evidence for data quality variation: we have applied the model to the entire set of RBPs assayed by RNA Bind&#x2019;n Seq and, with the revised prior calculation, recovered the consensus motifs for 48 out of 82 RBPs. In response to the comments of reviewer #1, we also tried initialization with the consensus motif or enriched k-mers, but the results did not improve for the 34 proteins for which random initialization failed to recover a meaningful, non-poly(A) motif. Thus, our model is, in principle, able to recover the expected binding motifs.</p>
                        </list-item>
                        <list-item>
                            <p>The reviewer suggested that the PWM model may be inadequate for many RBPs that bind to discontinuous motifs. While a large body of prior work uses the PWM representation of RBP binding sites, there are known examples where structure plays a role. How to appropriately represent structured binding sites is an open problem that we think is beyond the scope of our work. Nevertheless, we added a comment to the discussion noting that this possibility may explain some of our results.</p>
                        </list-item>
                    </list> We hope we have adequately responded to your comments.</p>
                <p> </p>
                <p> Yours sincerely,</p>
                <p> The authors</p>
                <p> </p>
                <p> </p>
                <p> References</p>
                <p> [1] S. Omidi, M. Zavolan, M. Pachkov, J. Breda, S. Berger, and E. van Nimwegen, Automated incorporation of pairwise dependency in transcription factor binding site prediction using dinucleotide weight tensors, PLOS Computational Biology 13 (2017) 1.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report192424">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.148267.r192424</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>S&#x00f6;ding</surname>
                        <given-names>Johannes</given-names>
                    </name>
                    <xref ref-type="aff" rid="r192424a1">1</xref>
                    <xref ref-type="aff" rid="r192424a2">2</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-9642-8244</uri>
                </contrib>
                <aff id="r192424a1">
                    <label>1</label>Quantitative and Computational Biology, Max Planck Institute for Multidisciplinary Sciences, G&#x00f6;ttingen, Germany</aff>
                <aff id="r192424a2">
                    <label>2</label>Campus-Institut Data Science (CIDAS), Georg-August-Universit&#x00e4;t G&#x00f6;ttingen, G&#x00f6;ttingen, Lower Saxony, Germany</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>31</day>
                <month>8</month>
                <year>2023</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2023 S&#x00f6;ding J</copyright-statement>
                <copyright-year>2023</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport192424" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.135164.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The manuscript describes a method to learn position-weight matrices (PWMs) that describe the sequence-specific binding of an RBP, given Bind'n'seq data for this RBP. The method is based on a biophysically motivated statistical model for the likelihood of Bind'n'seq data given the PWM that describes the sequence-specific binding of the RBP to the library RNA oligomers. The model parameters (PWM and non-specific binding energy E&#x2080;) are learned using an expectation maximization algorithm. The method is applied to Bind'n'seq data for 9 RBPs. For some RBPs, correct PWMs are retrieved. For others, the most likely PWM does not describe the known binding motif but is rather an unrelated, low-complexity motif that appears to reflect simple sequence biases in the experiment.</p>
            <p> </p>
            <p> The statistical model is very sound and the manuscript is well written. The results are disappointing, in particular since it is not clear whether the negative results for a good part of the data sets are due to the inadequacy of the statistical model or some bug in the implementation.&#x00a0;</p>
            <p> </p>
            <p> 1. My guess is the following. The method uses a Markov model of order d to model the prior probability P(S) = f_S for a sequence S to be part of the background library (before enrichment). It was found that, the higher the order d, "the better the prediction of likelihood of sequences in the foreground samples." This led the authors to set d=14. A Markov model of order d=14 has 3*4^14 ~ 10^9 parameters, around the same number as 14-mers in the entire sequencing library of around 10^7 reads. This means the Markov model is hopelessly over-parameterized, and the reason that the likelihood was increasing for increasing d is simply that the overlap between the (d+1)-mers in the background and foreground libraries decreases with increasing d. A reasonable choice of d is around 4, certainly not more than 8. The overfitting of the prior probability model could explain the many weird local optima observed by the authors, since it adds enormous noise to the sequence space such that a single mutation can change f_S dramatically, resulting in many spurious PWM minima.</p>
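            <p> To make the parameter count concrete: a Markov model of order d over four nucleotides has 4^d contexts with three free probabilities each, i.e. 3*4^d free parameters. The following minimal Python sketch is an illustration only, not taken from the manuscript or its code; the function name is invented here and the library size of 10^7 reads is the approximate figure quoted above.</p>
            <preformat>
```python
def markov_free_parameters(order, alphabet_size=4):
    """Free parameters of a Markov model of the given order.

    Each of the alphabet_size**order contexts carries a distribution over
    alphabet_size symbols, i.e. (alphabet_size - 1) free probabilities.
    """
    return (alphabet_size - 1) * alphabet_size ** order


# Approximate library size (about 10^7 reads); an assumption used only to
# put the parameter counts into perspective.
LIBRARY_READS = 10 ** 7

for d in (4, 8, 14):
    n = markov_free_parameters(d)
    print(f"d={d}: {n} parameters, {n / LIBRARY_READS:.3g} per read")
```
            </preformat>
            <p> For d=4 the model has 768 free parameters, whereas for d=14 it has 805306368, roughly 80 per read in a library of 10^7 reads.</p>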
            <p> </p>
            <p> 2. The notation in equations (2.1) to (2.4) is quite "unstatistical" and confusing. As (2.1) and (2.3) are written, the left-hand sides would need to sum to one when summing over all sequences s or S, respectively. This is of course not the case. What the authors rather meant to write on the left-hand side of eq. (2.1) is P(bound | s,c,M,E&#x2080;), with a variable bound in {0,1} indicating whether the sequence on the right side of the conditioning is bound by the RBP or not. The left-hand side of eq. (2.3) should read P(bound | S,c,M,E&#x2080;).</p>
            <p> </p>
            <p> With this corrected notation and using f_S = P(S) for the prior probability of finding a sequence S in the background library (with &#x2211;_S P(S) = 1), equation (2.4) turns into the correct Bayes theorem,</p>
            <p> </p>
            <p> &#x00a0; &#x00a0;P(S | bound,c,M,E&#x2080;) = P(bound | S,c,M,E&#x2080;) &#x00a0;P(S) / &#x00a0;&#x2211;_&#x03c3; &#x00a0;P(bound | &#x03c3;,c,M,E&#x2080;) &#x00a0;P(&#x03c3;) .</p>
            <p> </p>
            <p> Please correct the text accordingly:&#x00a0;</p>
            <p> "(2.4) is essentially a formulation of Bayes&#x2019; theorem with conditional probability P_b(S|c,M,E&#x2080;) of having a read S bound by an RBP, the likelihood of finding a read S in the pool washed over the RBP, fS, and an overall normalization (denominator)."&#x00a0;</p>
            <p> =&gt;</p>
            <p> "(2.4) is essentially a formulation of Bayes&#x2019; theorem with the likelihood P(bound | S,c,M,E&#x2080;) of having a read S bound by an RBP, the likelihood of finding a read S in the pool washed over the RBP, P(S)=f_S, and an overall normalization in the denominator."&#x00a0;</p>
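            <p> The Bayes form of eq. (2.4) given above can be checked numerically. The minimal Python sketch below is an illustration under assumptions, not the authors' implementation: the three sequences, all probability values, and the function name are hypothetical. It shows that P(S | bound) sums to one over sequences once the denominator is included, whereas P(bound | S) need not.</p>
            <preformat>
```python
import math


def posterior_over_sequences(p_bound_given_seq, prior):
    """Bayes' theorem: P(S | bound) = P(bound | S) P(S) / evidence,

    where evidence is the sum over all sequences sigma of
    P(bound | sigma) P(sigma).
    """
    evidence = math.fsum(p_bound_given_seq[s] * prior[s] for s in prior)
    return {s: p_bound_given_seq[s] * prior[s] / evidence for s in prior}


# Hypothetical toy values, chosen only for illustration.
prior = {"UGCAUG": 0.2, "AAAAAA": 0.5, "CCGCCG": 0.3}      # P(S), sums to one
p_bound = {"UGCAUG": 0.9, "AAAAAA": 0.1, "CCGCCG": 0.05}   # P(bound | S)

posterior = posterior_over_sequences(p_bound, prior)
for seq, value in posterior.items():
    print(seq, round(value, 4))
print("sum of posterior:", round(sum(posterior.values()), 12))
```
            </preformat>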
            <p> </p>
            <p> 3. To improve the search for the global optimum, the authors could compute the most highly enriched 6-mers and start their PWM optimization from a soft version of each of the top 20 or so 6-mers.&#x00a0;</p>
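            <p> One way such an initialization could look is sketched below in Python. This is a rough illustration, not a prescription and not the authors' code: the function names, the pseudocount, the peak probability of 0.7 used to "soften" a k-mer, and the toy reads are all assumptions introduced here.</p>
            <preformat>
```python
from collections import Counter

BASES = "ACGU"


def kmer_counts(reads, k):
    """Count all overlapping k-mers in a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def top_enriched_kmers(foreground, background, k=6, n_top=20, pseudocount=1.0):
    """Rank k-mers seen in the foreground by foreground/background frequency ratio."""
    fg, bg = kmer_counts(foreground, k), kmer_counts(background, k)
    # The pseudocount (hypothetical value) avoids division by zero for
    # k-mers that are absent from the background.
    fg_norm = sum(fg.values()) + pseudocount * len(BASES) ** k
    bg_norm = sum(bg.values()) + pseudocount * len(BASES) ** k

    def enrichment(kmer):
        return ((fg[kmer] + pseudocount) / fg_norm) / ((bg[kmer] + pseudocount) / bg_norm)

    return sorted(fg, key=enrichment, reverse=True)[:n_top]


def soft_pwm_from_kmer(kmer, peak=0.7):
    """PWM with probability `peak` on the k-mer's base at each position.

    The remaining probability is spread evenly over the other three bases;
    peak=0.7 is an arbitrary illustrative choice.
    """
    rest = (1.0 - peak) / (len(BASES) - 1)
    return [{b: (peak if b == base else rest) for b in BASES} for base in kmer]


# Toy reads, for illustration only.
foreground = ["AAUGCAUGAA", "CCUGCAUGCC"]
background = ["AAAAAAAAAA", "CCCCCCCCCC"]
seeds = top_enriched_kmers(foreground, background, k=6, n_top=3)
initial_pwms = [soft_pwm_from_kmer(kmer) for kmer in seeds]
```
            </preformat>
            <p> Each PWM in initial_pwms would then serve as a separate starting point for the optimization, and the run reaching the highest likelihood would be kept.</p>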
            <p> </p>
            <p> 4. Please derive the EM update equations transparently.</p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>No</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>No</p>
            <p>Reviewer Expertise:</p>
            <p>Biophysics, statistical modeling</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment11479-192424">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Schlusser</surname>
                            <given-names>Niels</given-names>
                        </name>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>26</day>
                    <month>4</month>
                    <year>2024</year>
                </pub-date>
            </front-stub>
            <body>
                <p>Dear reviewer,</p>
                <p> </p>
                <p> Thank you for the insightful comments and questions related to our paper, which we have taken into careful consideration in revising our manuscript. We believe the manuscript is much improved as a result. Below we comment on the most significant changes.</p>
                <p> </p>
                <p> Addressing your feedback:</p>
                <p> </p>
                <p> 1. You suggested that we do not over-parameterize the background model, but rather use a Markov model of order 4&#x2013;8. We followed this suggestion, setting d = 4 and adapting the normalization of the frequency priors accordingly. While implementing this suggestion, we also noticed a mistake in the initial implementation of the normalization, which likely contributed to higher-order background models initially giving improved results. As a result of these changes, we found non-trivial motifs for a larger proportion of the RBPs, which led us to apply the model to all RBP RNA Bind&#x2019;n Seq datasets in ENCODE (111 in total).</p>
                <p> </p>
                <p> 2. You also pointed out some sloppiness in our initial notation, which we have now corrected, along with the text referring to the respective equation (2.4).</p>
                <p> </p>
                <p> 3. We did try various ways of initializing the PWM from enriched k-mers and consensus motifs, but the number of cases in which this led to convergence to the expected motif while random initialization did not was very small.</p>
                <p> </p>
                <p> 4. We elaborated on the derivation of the EM update equations; in particular, we now give the derivative whose root is calculated.</p>
                <p> </p>
                <p> We hope we have adequately responded to your comments.</p>
                <p> </p>
                <p> Yours sincerely,</p>
                <p> Mihaela Zavolan and Niels Schlusser</p>
            </body>
        </sub-article>
    </sub-article>
</article>
