<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="systematic-review" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.151493.2</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Systematic Review</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>(Semi)automated approaches to data extraction for systematic reviews and meta-analyses in social sciences: A living review</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 2; peer review: 2 approved, 2 approved with reservations]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Legate</surname>
                        <given-names>Amanda</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-7763-7630</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Nimon</surname>
                        <given-names>Kim</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-2543-8386</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Noblin</surname>
                        <given-names>Ashlee</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Human Resource Development, The University of Texas at Tyler, Tyler, Texas, 75799, USA</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:alegate@patriots.uttyler.edu">alegate@patriots.uttyler.edu</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>26</day>
                <month>9</month>
                <year>2024</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2024</year>
            </pub-date>
            <volume>13</volume>
            <elocation-id>664</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>23</day>
                    <month>9</month>
                    <year>2024</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2024 Legate A et al.</copyright-statement>
                <copyright-year>2024</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/13-664/pdf"/>
            <abstract>
                <sec>
                    <title>Background</title>
                    <p>An abundance of rapidly accumulating scientific evidence presents novel opportunities for researchers and practitioners alike, yet such advantages are often overshadowed by resource demands associated with finding and aggregating a continually expanding body of scientific information. Data extraction activities associated with evidence synthesis have been described as time-consuming to the point of critically limiting the usefulness of research. Across social science disciplines, the use of automation technologies for timely and accurate knowledge synthesis can enhance research translation value, better inform key policy development, and expand the current understanding of human interactions, organizations, and systems. Ongoing developments surrounding automation are highly concentrated in research for evidence-based medicine, with limited evidence surrounding tools and techniques applied outside of the clinical research community. The goal of the present study is to extend the automation knowledge base by synthesizing current trends in the application of technologies for extracting key data elements of interest to social scientists.</p>
                </sec>
                <sec>
                    <title>Methods</title>
                    <p>We report the baseline results of a living systematic review of automated data extraction techniques supporting systematic reviews and meta-analyses in the social sciences. This review follows PRISMA standards for reporting systematic reviews.</p>
                </sec>
                <sec>
                    <title>Results</title>
                    <p>The baseline review of social science research yielded 23 relevant studies.</p>
                </sec>
                <sec>
                    <title>Conclusions</title>
                    <p>When considering the process of automating systematic review and meta-analysis information extraction, social science research falls short compared to clinical research that focuses on automatic processing of information related to the PICO framework. With a few exceptions, most tools were either in the infancy stage and not accessible to applied researchers, were domain specific, or required substantial manual coding of articles before automation could occur. Additionally, few solutions considered extraction of data from tables, which is where the key data elements that social and behavioral scientists analyze reside.</p>
                </sec>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>Automated data extraction</kwd>
                <kwd>systematic review</kwd>
                <kwd>meta-analysis</kwd>
                <kwd>evidence synthesis</kwd>
                <kwd>social science research</kwd>
                <kwd>APA Journal Article Reporting Standards (JARS)</kwd>
            </kwd-group>
            <funding-group>
                <funding-statement>The author(s) declared that no grants were involved in supporting this work.</funding-statement>
            </funding-group>
        </article-meta>
        <notes>
            <sec sec-type="version-changes">
                <label>Revised</label>
                <title>Amendments from Version 1</title>
                <p>In this revised version, several updates have been made to improve clarity, transparency, and align with peer review feedback: (a) Minor grammatical revisions have been made throughout the manuscript, including the use of passive construction where suggested; (b) Acronyms for databases and search platforms have been spelled out to ensure accessibility for readers; (c) Database search dates have been added to enhance transparency and replicability; (d) Additional references to the research protocol and extended data files have been incorporated to provide greater context and transparency regarding the review methodology; and (e) Search strategy limitations, particularly concerning language restrictions, database selection, and the exclusion of gray literature, have been explicitly addressed in the limitations section.</p>
            </sec>
        </notes>
    </front>
    <body>
        <sec id="sec5" sec-type="intro">
            <title>Introduction</title>
            <p>Across disciplines, systematic reviews and meta-analyses are integral to exploring and explaining phenomena, drawing causal inferences, and supporting evidence-based decision making. The concept of metascience represents an array of evidence synthesis approaches that support combining existing research results to summarize what is known about a specific topic (
                <xref ref-type="bibr" rid="ref18">Davis et al., 2014</xref>; 
                <xref ref-type="bibr" rid="ref30">Gough et al., 2020</xref>). Researchers use a variety of systematic review methodologies to synthesize evidence within their domains or to integrate extant knowledge bases spanning multiple disciplines and contexts. When engaging in quantitative evidence synthesis, researchers often supplement the systematic review with meta-analysis (a principled statistical process for grouping and summarizing quantitative information reported across studies within a research domain). As technology advances, in addition to greater access to data, researchers are presented with new forms and sources of data to support evidence synthesis (
                <xref ref-type="bibr" rid="ref12">Bosco et al., 2017</xref>; 
                <xref ref-type="bibr" rid="ref33">Ip et al., 2012</xref>; 
                <xref ref-type="bibr" rid="ref73">Wagner et al., 2022</xref>).</p>
            <p>Systematic reviews and meta-analyses are fundamental to supporting reproducibility and generalizability of research surrounding social and cultural aspects of human behavior; however, the process of extracting data from primary research is a labor-intensive effort, fraught with the potential for human error (see 
                <xref ref-type="bibr" rid="ref59">Pigott &amp; Polanin, 2020</xref>). Comprehensive data extraction activities associated with evidence synthesis have been described as time-consuming to the point of critically limiting the usefulness of existing approaches (
                <xref ref-type="bibr" rid="ref32">Holub et al., 2021</xref>). Moreover, research indicates that it can take several years for original studies to be included in a new review due to the rapid pace of new evidence generation (
                <xref ref-type="bibr" rid="ref35">Jonnalagadda et al., 2015</xref>).</p>
            <sec id="sec6">
                <title>The need for this review</title>
                <p>In the clinical research domain, particularly in Randomized Controlled Trials (RCTs), automation technologies for data extraction are evolving rapidly (see 
                    <xref ref-type="bibr" rid="ref65">Schmidt et al., 2023</xref>). In contrast with the more defined standards that have evolved throughout clinical research domains, within and across social sciences, substantial variation exists in research designs, reporting protocols, and even publication outlet standards (
                    <xref ref-type="bibr" rid="ref18">Davis et al., 2014</xref>; 
                    <xref ref-type="bibr" rid="ref70">Short et al., 2018</xref>; 
                    <xref ref-type="bibr" rid="ref73">Wagner et al., 2022</xref>). In health intervention research, targeted data elements generally include Population (or Problem), Intervention, Control, and Outcome (i.e., PICO; see 
                    <xref ref-type="bibr" rid="ref24">Eriksen and Frandsen, 2018</xref>; 
                    <xref ref-type="bibr" rid="ref72">Tsafnat et al., 2014</xref>). While experimental designs are considered a gold standard for translational value, many phenomena examined across the social sciences occur within contexts that necessitate research pragmatism in both design and methodological considerations (
                    <xref ref-type="bibr" rid="ref18">Davis et al., 2014</xref>).</p>
                <p>Consider, for example, the field of Human Resource Development (HRD). In HRD, a primary focal hub for research includes outcomes of workplace interventions intended to inform and improve areas such as learning, training, organizational development, and performance improvement (
                    <xref ref-type="bibr" rid="ref68">Shirmohammadi et al., 2021</xref>). While measuring intervention outcomes is a substantial area of discourse, HRD researchers have predominantly relied on cross-sectional survey data, and the most commonly employed quantitative method is Structural Equation Modeling (
                    <xref ref-type="bibr" rid="ref57">Park et al., 2021</xref>). Thus, meta-analyses are increasingly essential for supporting reproducibility and generalizability of research. In these fields, data elements targeted for extraction would rarely align with the PICO framework; rather, meta-analytic endeavors would entail extraction of measures such as effect sizes, model fit indices, or instrument psychometric properties (
                    <xref ref-type="bibr" rid="ref8">Appelbaum et al., 2018</xref>).</p>
            </sec>
            <sec id="sec7">
                <title>Related research</title>
                <p>Serving as a model for the present study, 
                    <xref ref-type="bibr" rid="ref65">Schmidt et al. (2023)</xref> are conducting a living systematic review (LSR) of tools and techniques available for (semi)automated extraction of data elements pertinent to synthesizing the effects of healthcare interventions (see 
                    <xref ref-type="bibr" rid="ref31">Higgins et al., 2022</xref>). Exploring a range of data-mining and text classification methods for systematic reviews, the authors found that frequently employed early approaches (e.g., rule-based extraction) gave way to classical machine learning (e.g., na&#x00ef;ve Bayes and support vector machine classifiers), and that more recent trends indicate increased application of deep learning architectures such as neural networks and word embeddings (for yearly trends in reported systems architectures, see 
                    <xref ref-type="bibr" rid="ref63">Schmidt et al., 2021</xref>, p. 8).</p>
                <p>In social sciences and related disciplines, several related reviews of tools and techniques for automating tasks associated with systematic reviews and meta-analyses have been conducted. 
                    <xref ref-type="table" rid="T1">Table 1</xref> provides a summary of related research.</p>
                <table-wrap id="T1" orientation="portrait" position="float">
                    <label>Table 1. </label>
                    <caption>
                        <title>Related literature.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">Reference</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Research discipline (Sample size)</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Primary focus</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref4">Antons et al. (2020)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Innovation (
                                    <italic toggle="yes">n</italic>=140)</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Text mining methods in innovation research</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref14">Cairo et al. (2019)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Computer Science (
                                    <italic toggle="yes">n</italic>=17)</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">ML techniques for secondary studies</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref21">Dridi et al. (2021)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Management (
                                    <italic toggle="yes">n</italic>=124)</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Scholarly data mining applications</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref28">G&#x00f6;pfert et al. (2022)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Multidisciplinary (
                                    <italic toggle="yes">n</italic>=80)</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Measurement extraction methods using NLP</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref25">Feng et al. (2017)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Software Engineering (
                                    <italic toggle="yes">n</italic>=32)</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Text mining techniques and tools to facilitate SLRs</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref38">Kohl et al. (2018)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Multidisciplinary (
                                    <italic toggle="yes">n</italic>=22)</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Tools for systematic reviews and mapping studies</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref62">Roldan-Baluis et al. (2022)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Multidisciplinary (
                                    <italic toggle="yes">n</italic>=46)</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">NLP and ML for processing unstructured texts in digital format</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref69">Sundaram and Berleant (2023)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Multidisciplinary (
                                    <italic toggle="yes">n</italic>=29)</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Text mining-based automation of SLRs</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref73">Wagner et al. (2022)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Information Systems and related Social Sciences (NR)</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Review of AI in literature reviews</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref76">Yang et al. (2023)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Education (
                                    <italic toggle="yes">n</italic>=161)</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Text mining techniques in educational research</td>
                            </tr>
                        </tbody>
                    </table>
                    <table-wrap-foot>
                        <p>
                            <italic toggle="yes">Note.</italic> AI = Artificial Intelligence, ML = Machine Learning, SLR = Systematic Literature Review, NLP = Natural Language Processing, NR = Not Reported.</p>
                    </table-wrap-foot>
                </table-wrap>
                <p>Based on extant reviews analyzing Artificial Intelligence (AI) technologies for automating Systematic Literature Review (SLR) efforts outside of clinical domains, we noted four trends. First, techniques to facilitate abstraction, generalization, and grouping of primary studies represent the majority of (semi)automated approaches. Second, extant reviews highlight a predominant focus on supporting search and study selection stages, with significant gaps in (semi)automating data extraction. Third, evaluation concerns underscore the importance of performance metrics, validation procedures, benchmark datasets, and improved transparency and reporting standards to ensure the reliability and effectiveness of AI techniques. Finally, challenges in cross-discipline transferability illuminate the need for domain-specific adaptations and infrastructures.</p>
                <p>Existing reviews evidence the widespread application of techniques such as topic modeling, clustering, and classification to support abstraction, generalization, and grouping of primary research studies. Topic modeling, particularly Latent Dirichlet Allocation (LDA), is commonly applied to (semi)automate content analysis, facilitating the distillation of complex information into meaningful insights and identification of overarching trends and patterns across a literature corpus (
                    <xref ref-type="bibr" rid="ref4">Antons et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref21">Dridi et al., 2021</xref>; 
                    <xref ref-type="bibr" rid="ref62">Roldan-Baluis et al., 2022</xref>; 
                    <xref ref-type="bibr" rid="ref76">Yang et al., 2023</xref>). Additionally, classification and clustering techniques are commonly applied for tasks such as mining article metadata and automatically grouping papers by relevance to SLR research questions (
                    <xref ref-type="bibr" rid="ref25">Feng et al., 2017</xref>; 
                    <xref ref-type="bibr" rid="ref69">Sundaram &amp; Berleant, 2023</xref>; 
                    <xref ref-type="bibr" rid="ref73">Wagner et al., 2022</xref>).</p>
                <p>(Semi)automation efforts in social sciences and related disciplines have primarily addressed supporting the search and study selection stages of SLRs (
                    <xref ref-type="bibr" rid="ref14">Cairo et al., 2019</xref>; 
                    <xref ref-type="bibr" rid="ref25">Feng et al., 2017</xref>), with significant gaps in automation techniques for tasks such as data extraction (
                    <xref ref-type="bibr" rid="ref28">G&#x00f6;pfert et al., 2022</xref>; 
                    <xref ref-type="bibr" rid="ref69">Sundaram &amp; Berleant, 2023</xref>). Further, available software tools lack functionality to support activities beyond study selection (
                    <xref ref-type="bibr" rid="ref38">Kohl et al., 2018</xref>). Key findings across these reviews underscore the need for more comprehensive automation solutions, particularly for quantitative data extraction (
                    <xref ref-type="bibr" rid="ref28">G&#x00f6;pfert et al., 2022</xref>).</p>
                <p>Additionally, researchers express transparency concerns regarding AI&#x2019;s reliance on black box models (
                    <xref ref-type="bibr" rid="ref73">Wagner et al., 2022</xref>) and limited visibility into underlying processes and algorithms in proprietary software solutions (
                    <xref ref-type="bibr" rid="ref4">Antons et al., 2020</xref>). Adding to these considerations, 
                    <xref ref-type="bibr" rid="ref4">Antons et al. (2020)</xref> identified substantial reporting gaps, including 35 of 140 articles omitting details about software used. Since metrics alone may not be sufficient to objectively assess AI performance (
                    <xref ref-type="bibr" rid="ref21">Dridi et al., 2021</xref>), strategies for mitigating bias and ensuring transparency and fairness represent a substantial topic of automation discourse.</p>
                <p>Ongoing research on AI tools for clinical studies (
                    <xref ref-type="bibr" rid="ref69">Sundaram &amp; Berleant, 2023</xref>) and on the extraction of PICO data elements from RCTs (
                    <xref ref-type="bibr" rid="ref73">Wagner et al., 2022</xref>) underscores the success of domain-specific adaptation efforts. While the promise of adopting AI-based techniques and tools in social science domains is evident (
                    <xref ref-type="bibr" rid="ref14">Cairo et al., 2019</xref>; 
                    <xref ref-type="bibr" rid="ref25">Feng et al., 2017</xref>), extant research reveals challenges in transferring existing technologies across disciplines. Further, many SLR software applications are tailored specifically for health and medical science research (
                    <xref ref-type="bibr" rid="ref38">Kohl et al., 2018</xref>). Literature suggests that overcoming global obstacles can be facilitated by concentrated efforts to develop domain-specific knowledge representations, such as standardized construct taxonomies and vocabularies (
                    <xref ref-type="bibr" rid="ref25">Feng et al., 2017</xref>; 
                    <xref ref-type="bibr" rid="ref28">G&#x00f6;pfert et al., 2022</xref>; 
                    <xref ref-type="bibr" rid="ref73">Wagner et al., 2022</xref>).</p>
            </sec>
            <sec id="sec8">
                <title>Objectives</title>
                <p>In the present study, we conduct a baseline review of existing and emergent techniques for (semi)automated data extraction that focus on target data entities and elements relevant to evidence synthesis across social science research domains. This review covers data extraction tools for a range of data types, both quantitative and qualitative. Per the research protocol, social sciences categories included in this review were based on the branches of science and academic activity boundaries described by 
                    <xref ref-type="bibr" rid="ref82">Cohen (2021</xref>; Chapter 2). Additional description is available in the project repositories (see the &#x2018;Data availability&#x2019; section). We report findings that supplement the growing body of research dedicated to the automatic extraction of data from clinical and medical research.</p>
            </sec>
        </sec>
        <sec id="sec9" sec-type="methods">
            <title>Methods</title>
            <sec id="sec10">
                <title>Protocol registration</title>
                <p>This LSR was conducted following a pre-registered and published protocol (
                    <xref ref-type="bibr" rid="ref42">Legate &amp; Nimon, 2023b</xref>). For additional details and project repositories, see &#x2018;Data availability&#x2019; section.</p>
            </sec>
            <sec id="sec11">
                <title>Living review</title>
                <p>We adopted the LSR methodology for this study primarily because of the pace of emerging evidence, particularly in light of ongoing technological advancements. The iterative nature of an LSR allows for continuous surveillance, ensuring timely presentation of new information that may influence findings (
                    <xref ref-type="bibr" rid="ref22">Elliott et al., 2014</xref>, 
                    <xref ref-type="bibr" rid="ref23">2017</xref>; 
                    <xref ref-type="bibr" rid="ref37">Khamis et al., 2019</xref>). This baseline review was initiated upon peer approval of the associated protocol (
                    <xref ref-type="bibr" rid="ref42">Legate &amp; Nimon, 2023b</xref>). We intend to update the review continually via living methodological surveys of published research (
                    <xref ref-type="bibr" rid="ref37">Khamis et al., 2019</xref>), following the workflow schedule published in the protocol (see 
                    <xref ref-type="fig" rid="f1">Figure 1</xref>; 
                    <xref ref-type="bibr" rid="ref42">Legate &amp; Nimon, 2023b</xref>). Necessary adjustments to the workflow will be detailed within each subsequent update.</p>
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1. </label>
                    <caption>
                        <title>LSR workflow.</title>
                        <p>This image is reproduced under the terms of a Creative Commons Attribution 4.0 International license (CC-BY 4.0) from 
                            <xref ref-type="bibr" rid="ref42">Legate and Nimon (2023b)</xref>.</p>
                        <p>
                            <italic toggle="yes">Note.</italic> Arrows represent stages involved in a static systematic review; the dotted line (from &#x201c;Publish Report&#x201d; to &#x201c;Search&#x201d;) represents the stage at which the review process is repeated from the beginning while the review remains in living status.</p>
                    </caption>
                    <graphic id="gr1" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/172337/60d35ccf-b47f-4e7f-96dd-2a191f4b4a93_figure1.gif"/>
                </fig>
            </sec>
            <sec id="sec12">
                <title>Eligibility criteria</title>
                <p>As in prior reviews, English-language reports published in 2005 or later were considered for inclusion (
                    <xref ref-type="bibr" rid="ref35">Jonnalagadda et al., 2015</xref>; 
                    <xref ref-type="bibr" rid="ref54">O&#x2019;Mara-Eves et al., 2015</xref>; 
                    <xref ref-type="bibr" rid="ref64">Schmidt et al., 2020</xref>). Eligible studies utilized, presented, and/or evaluated semi-automated approaches to support evidence-synthesis research methods (e.g., systematic reviews, psychometric meta-analyses, meta-analyses of effect sizes). Studies may have reported on any automated technique for data extraction, provided that at least one entity was extracted semi-automatically from the abstracts or full text of a literature corpus and sufficient detail was reported for:
                    <list list-type="alpha-lower">
                        <list-item>
                            <label>a)</label>
                            <p>entity(ies) or data element(s) extracted;</p>
                        </list-item>
                        <list-item>
                            <label>b)</label>
                            <p>location of the extracted entities (e.g., abstract, methods, results sections); and</p>
                        </list-item>
                        <list-item>
                            <label>c)</label>
                            <p>the automation tool and/or technique used to support extraction.</p>
                        </list-item>
                    </list>
                </p>
                <p>Editorials, briefs, and opinion pieces, as well as reports that engaged in narrative discussion without applying automation tools or technologies to extract data from research literature, were not considered eligible. Per the protocol, studies were also considered ineligible if they applied tools or techniques to:
                    <list list-type="alpha-lower">
                        <list-item>
                            <label>a)</label>
                            <p>extract data exclusively from medical, biomedical, clinical (e.g., RCTs), or natural science research;</p>
                        </list-item>
                        <list-item>
                            <label>b)</label>
                            <p>extract metadata only (i.e., citation details) from research articles; or</p>
                        </list-item>
                        <list-item>
                            <label>c)</label>
                            <p>extract data from alternative (i.e., non-research literature) sources (e.g., web scraping, electronic communications, transcripts).</p>
                        </list-item>
                    </list>
                </p>
            </sec>
            <sec id="sec13">
                <title>Search sources</title>
                <p>The search strategy for this review was developed by adapting the search strategy from a related LSR of clinical research (
                    <xref ref-type="bibr" rid="ref64">Schmidt et al., 2020</xref>).</p>
                <p>We initially intended to conduct searches in the same databases used by 
                    <xref ref-type="bibr" rid="ref64">Schmidt et al. (2020</xref>, 
                    <xref ref-type="bibr" rid="ref63">2021</xref>), excluding medical research sources. Because 
                    <italic toggle="yes">IEEE</italic> content is indexed in 
                    <italic toggle="yes">Web of Science</italic> (
                    <xref ref-type="bibr" rid="ref77">Young, 2023</xref>), we did not include 
                    <italic toggle="yes">IEEE Xplore</italic> as a separate source. We added two databases (
                    <italic toggle="yes">ACL</italic> and 
                    <italic toggle="yes">arXiv</italic>) and conducted a search for data extraction tools in the Systematic Review Toolbox (
                    <xref ref-type="bibr" rid="ref46">Marshall et al., 2022</xref>) to capture associated articles. Searches were conducted in the Association for Computational Linguistics (ACL) Anthology, arXiv Research-Sharing Platform (arXiv), and DBLP Computer Science Bibliography (DBLP) on June 15, 2023; in the Web of Science Core Collection (WOS) on June 8, 2023; and in the Systematic Review Toolbox on October 2, 2023.</p>
                <p>The Web of Science search and deduplication followed procedures stated in the protocol (
                    <xref ref-type="bibr" rid="ref42">Legate &amp; Nimon, 2023b</xref>). We adapted source code developed by 
                    <xref ref-type="bibr" rid="ref63">Schmidt et al. (2021)</xref> for automating search, retrieval, and deduplication functions on full database dumps for 
                    <italic toggle="yes">ACL</italic>, 
                    <italic toggle="yes">arXiv</italic>, and 
                    <italic toggle="yes">DBLP</italic> platforms. Complete details, including the citation indices and specific settings applied, search syntax, and adapted source code, are available in the project repository (see &#x2018;Data availability&#x2019; section).</p>
            </sec>
            <sec id="sec14">
                <title>Study selection</title>
                <p>Title, abstract, and full-text screening was conducted using Rayyan (
                    <xref ref-type="bibr" rid="ref55">Ouzzani et al., 2016</xref>; free and subscription accounts available at 
                    <ext-link ext-link-type="uri" xlink:href="https://www.rayyan.ai/">https://www.rayyan.ai/</ext-link>). Three researchers screened all titles and abstracts (1,000 abstracts per week). Researchers met weekly to review, resolve conflicts, and further develop the codebook for this LSR. All conflicts that arose during the title and abstract screening (
                    <italic toggle="yes">n</italic>=103/
                    <italic toggle="yes">N</italic>=10,644) were resolved on a weekly basis. Disagreements related to methods for abstractive text summarization and to the transferability of methods applied to clinical research studies (i.e., RCTs). In cases where the level of abstraction and potential for transferability could not be determined from the abstract alone, full-text articles were reviewed and discussed by all three researchers until consensus was reached.</p>
                <p>For the data extraction stage, a Google Form was developed based on the items of interest described in the protocol. All data extraction tasks were performed independently in triplicate. Researchers met weekly to review and reach a consensus on the coding of extracted items of interest. The extraction form was updated over the course of data extraction to better fit project goals and promote the reliability of future updates.</p>
                <p>We originally intended to conduct Inter-Rater Reliability (IRR) assessments to provide reliability estimates following each stage of the baseline review (
                    <xref ref-type="bibr" rid="ref10">Belur et al., 2018</xref>; 
                    <xref ref-type="bibr" rid="ref79">Zhao et al., 2022</xref>). Given the nascency of our research and the scope of our items of interest, coding forms allowed for input of &#x201c;other&#x201d; responses (e.g., APA data elements) that were not included in extant reviews that focus on medical and clinical data extraction (e.g., PICO elements). Further, data extraction presented opportunities to develop a reporting structure for methods and items of interest that were not reported in prior literature (e.g., NER, open-source tools). Weekly review meetings were used to continually develop the project codebook, promote continuity and structure, and develop an IRR framework for future iterations of this review.</p>
            </sec>
        </sec>
        <sec id="sec15" sec-type="results">
            <title>Results</title>
            <sec id="sec16">
                <title>Search results</title>
                <p>Search results are presented in the PRISMA flowchart (see 
                    <xref ref-type="fig" rid="f2">Figure 2</xref>). A total of 11,336 records were identified through all search sources, including databases and publications available through the Systematic Review Toolbox (
                    <xref ref-type="bibr" rid="ref46">Marshall et al., 2022</xref>). After deduplication, 10,644 articles were included in the title and abstract screening stage. We retrieved 46 articles for full-text screening. One duplicate print was detected during full-text screening and was removed. This iteration of the LSR includes 23 articles. Detailed descriptions of deduplication and preliminary screening procedures are available in the OSF project repository (see &#x2018;Data availability&#x2019; section).</p>
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>Figure 2. </label>
                    <caption>
                        <title>PRISMA diagram.</title>
                        <p>
                            <italic toggle="yes">Note.</italic> ACL=ACL Anthology (
                            <ext-link ext-link-type="uri" xlink:href="https://aclanthology.org/">https://aclanthology.org/</ext-link>), arXiv=arXiv Research-Sharing Platform (
                            <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/">https://arxiv.org/</ext-link>), DBLP=DBLP Computer Science Bibliography (
                            <ext-link ext-link-type="uri" xlink:href="https://dblp.org/">https://dblp.org/</ext-link>), WOS=Web of Science Core Collection, SRTool=Systematic Review Toolbox (
                            <ext-link ext-link-type="uri" xlink:href="http://systematicreviewtools.com/">http://systematicreviewtools.com/</ext-link>).</p>
                    </caption>
                    <graphic id="gr2" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/172337/60d35ccf-b47f-4e7f-96dd-2a191f4b4a93_figure2.gif"/>
                </fig>
                <p>The following sections describe the rationale for exclusions, followed by a brief overview of studies included in the baseline review. These results are presented in 
                    <xref ref-type="fig" rid="f3">Figures 3</xref> and 
                    <xref ref-type="fig" rid="f4">4</xref>, respectively. An overview of included studies is presented in 
                    <xref ref-type="table" rid="T2">Table 2</xref>.</p>
                <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                    <label>Figure 3. </label>
                    <caption>
                        <title>Excluded publications.</title>
                        <p>
                            <italic toggle="yes">Note.</italic> Domain = exclusively medical, biomedical, clinical, or natural science (
                            <italic toggle="yes">n</italic>=2); Target entities = Lack of detail in reporting extracted entities (
                            <italic toggle="yes">n</italic>=7); Application = no application, testing, or extraction conducted manually (
                            <italic toggle="yes">n</italic>=6); Lack of Detail in Reporting Corpus or Wrong Corpus (
                            <italic toggle="yes">n</italic>=7).</p>
                    </caption>
                    <graphic id="gr3" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/172337/60d35ccf-b47f-4e7f-96dd-2a191f4b4a93_figure3.gif"/>
                </fig>
                <fig fig-type="figure" id="f4" orientation="portrait" position="float">
                    <label>Figure 4. </label>
                    <caption>
                        <title>Included publications.</title>
                        <p>
                            <italic toggle="yes">Note.</italic> Presented Tool = Described/demonstrated a software tool, system, or application for data extraction (
                            <italic toggle="yes">n</italic>=12); Developed Method = Developed techniques and/or methods for automated data extraction (
                            <italic toggle="yes">n</italic>=9); Evaluated Techniques = Tested or evaluated the performance of existing tools, techniques, or methods (
                            <italic toggle="yes">n</italic>=2); Applied Tool = Applied automation tools to conduct secondary research (
                            <italic toggle="yes">n</italic>=0).</p>
                    </caption>
                    <graphic id="gr4" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/172337/60d35ccf-b47f-4e7f-96dd-2a191f4b4a93_figure4.gif"/>
                </fig>
                <table-wrap id="T2" orientation="portrait" position="float">
                    <label>Table 2. </label>
                    <caption>
                        <title>Included studies.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">Title</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Reference</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Summary description</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">A model for the identification of the functional structures of unstructured abstracts in the social sciences</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref67">Shen et al. (2022)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Proposed a high-performance model for identifying functional structures of unstructured abstracts in the social sciences.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">A Semi-automatic Data Extraction System for Heterogeneous Data Sources: a Case Study from Cotton Industry</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref49">Nayak et al. (2021)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Proposed a novel data extraction system based on text mining approaches to discover relevant and focused information from diverse unstructured data sources.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">An interactive query-based approach for summarizing scientific documents</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref9">Bayatmakou et al. (2022)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Proposed an interactive multi-document text summarization approach that allows users to specify composition of a summary and refine initial query by user-selected keywords and sentences extracted from retrieved documents.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Automatic results identification in software engineering papers. Is it possible?</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref71">Torres et al. (2012)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Analyzed existing methods for sentence classification in scientific papers and evaluated their feasibility in unstructured papers in the Software Engineering area.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Contextual information retrieval in research articles: Semantic publishing tools for the research community</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref2">Angrosh et al. (2014)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Introduced a conceptual framework (and linked data application) for modeling contexts associated with sentences and converting information extracted from research articles into machine-understandable data.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">CORWA: A Citation-Oriented Related Work Annotation Dataset</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref44">Li et al. (2022)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Presented a new approach to related work generation in academic research papers and introduced an annotation dataset for labeling different types of citation text fragments from various information sources.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">DASyR (IR) - Document Analysis System for Systematic Reviews (in Information Retrieval)</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref60">Piroi et al. (2015)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Introduced a semi-automatic document analysis system/framework for annotating published papers for ontology population, particularly in domains lacking adequate dictionaries.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Detecting In-line Mathematical Expressions in Scientific Documents</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref34">Iwatsuki et al. (2017)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Reported preliminary results applying a method for identifying in-line mathematical expressions in PDF documents utilizing both layout and linguistic features.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Extracting the characteristics of life cycle assessments via data mining</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref20">Diaz-Elsayed and Zhang (2020)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Proposed a method for automatically extracting key characteristics of life cycle assessments (LCAs) from journal articles.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Machine Reading of Hypotheses for Organizational Research Reviews and Pre-trained Models via R Shiny App for Non-Programmers</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref17">Chen et al. (2021)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Introduced NLP models for accelerating the discovery, extraction, and organization of theoretical developments from social science publications.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">MetaSeer.STEM: Towards Automating Meta-Analyses</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref50">Neppalli et al. (2016)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Proposed a machine learning-based system developed to support automated extraction of data pertinent to STEM education meta-analyses.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Mining Social Science Publications for Survey Variables</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref81">Zielinski and Mutschke (2017)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Described a work-in-progress development of new techniques or methods for identifying variables used in social science research.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Ontology-based and User-focused Automatic Text Summarization (OATS): Using COVID-19 Risk Factors as an Example</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref16">Chen et al. (2020)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Proposed an ontology-based system that users could access and utilize for automatically generating text summaries from unstructured text.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Ontology-Driven Information Extraction from Research Publications</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref58">Pertsas and Constantopoulos (2018)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Introduced a system designed to extract information from research articles, associate it with other sources, and infer new knowledge.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Research Method Classification with Deep Transfer Learning for Semi-Automatic Meta-Analysis of Information Systems Papers</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref3">Anisienia et al. (2021)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Presented an artifact that uses deep transfer learning for multi-label classification of research methods for an Information Systems corpus.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Scaling Systematic Literature Reviews with Machine Learning Pipelines</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref27">Goldfarb-Tarrant et al. (2020)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Described a pipeline that automates three stages of a systematic review: searching for documents, selecting relevant documents, and extracting data.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Searching for tables in digital documents</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref45">Liu et al. (2007)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Introduced an automatic table extraction and search engine system.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Section-wise indexing and retrieval of research articles</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref66">Shahid and Afzal (2018)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Described the development and evaluation of a technique for tagging a paper's content with logical sections appearing in scientific documents.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Sysrev: A FAIR Platform for Data Curation and Systematic Evidence Review</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref13">Bozada et al. (2021)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Introduced a platform for aiding in systematic reviews and data extraction by providing access to digital documents and facilitating collaboration in research projects.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Team EP at TAC 2018: Automating data extraction in systematic reviews of environmental agents</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref51">Nowak and Kunstman (2019)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Presented a solution for automating data extraction in systematic reviews of environmental agents.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">The Canonical Model of Structure for Data Extraction in Systematic Reviews of Scientific Research Articles</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref1">Aliyu et al. (2018)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Developed a canonical model of structure approach that identifies sections from documents and extracts the headings and subheadings from the sections.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Towards a Semi-Automated Approach for Systematic Literature Reviews Completed Research</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref19">Denzler et al. (2021)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Presented a flexible and modifiable artifact to support systematic literature review processes holistically.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">UniHD@CL-SciSumm 2020: Citation Extraction as Search</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref7">Aumiller et al. (2020)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Presented a method to identify references from citation text spans and classify citation spans by discourse facets.</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
            </sec>
            <sec id="sec17">
                <title>Excluded publications</title>
                <p>Most studies were excluded due to lack of detail in extracted data entities (
                    <italic toggle="yes">n</italic>=7) and wrong corpus or data source (
                    <italic toggle="yes">n</italic>=7). 
                    <xref ref-type="bibr" rid="ref15">Carri&#x00f3;n-Toro et al. (2022)</xref>, for example, developed a method and software tool supporting researchers with selection of relevant key criteria in a field of study based on term frequencies. While text summarization has proven valuable for evidence synthesis tasks, the primary focus of this LSR involves efforts to extract specific data points from primary research (
                    <xref ref-type="bibr" rid="ref53">O&#x2019;Connor et al., 2019</xref>). We also excluded extraction techniques that were not applied to abstracts or full text of research articles. 
                    <xref ref-type="bibr" rid="ref52">Ochoa-Hern&#x00e1;ndez et al. (2018)</xref>, for instance, presented a method to automatically extract concepts from web blog articles.</p>
                <p>The second most common exclusion category was articles that presented techniques or systems utilizing pre-extracted data (
                    <italic toggle="yes">n</italic>=4). 
                    <xref ref-type="bibr" rid="ref5">Ali and Gravino (2018)</xref>, for example, proposed an ontology-based SLR system with semantic web technologies; however, the data (derived from a prior review conducted by the authors) were added to the ontology system after the manual extraction stage. Finally, articles were excluded due to exclusive application in medical/clinical research (
                    <italic toggle="yes">n</italic>=2) or because the proposed tool had not yet been implemented (
                    <italic toggle="yes">n</italic>=2). 
                    <xref ref-type="bibr" rid="ref29">Goswami et al. (2019)</xref>, for example, described and evaluated a supervised ML framework to identify and extract anxiety outcome measures from clinical trial articles. 
                    <xref ref-type="bibr" rid="ref80">Zhitomirsky-Geffet et al. (2020)</xref> presented a conceptual description of a network-based data model capable of mining quantitative results from social sciences articles, but the system had not been implemented at the time of publication.</p>
            </sec>
            <sec id="sec18">
                <title>Included publications</title>
                <p>The majority of included studies (
                    <italic toggle="yes">n</italic>=12) presented or described a software tool, system, or application to support researchers extracting data from research literature. The second most common inclusion category focused on the development of specialized techniques or methods for automating data extraction tasks (
                    <italic toggle="yes">n</italic>=9). We identified two studies that evaluated or tested the performance of existing tools or methods for (semi)automated data extraction. Unlike related reviews of data extraction methods for healthcare interviews (see 
                    <xref ref-type="bibr" rid="ref65">Schmidt et al., 2023</xref>), we did not identify social science studies applying existing automated data extraction tools to conduct secondary research.</p>
            </sec>
            <sec id="sec19">
                <title>Automated approaches</title>
                <p>To report approaches identified, we organized the extracted data under four overarching categories, including: (1) data preprocessing and feature engineering, (2) model architectures and components, (3) rule-bases, and (4) evaluation metrics. See &#x2018;Data Availability&#x2019; section for labeling and additional descriptions of techniques. We opted to extract and report rule-based techniques separately because the approaches we identified intertwined with various aspects of the data processing and extraction pipeline, spanning data preprocessing to the model architecture itself. This distinction allows for more discussion about the prevalence, scope and utility of these techniques.</p>
            </sec>
            <sec id="sec20">
                <title>Data preprocessing and feature engineering</title>
                <p>The data preprocessing category encompasses methods and techniques used to preprocess raw text and data before it is fed into ML/NLP models. This includes tasks such as tokenization, stemming, lemmatization, stop word removal, and other steps necessary to clean and prepare the text data for analysis. 
                    <xref ref-type="fig" rid="f5">Figure 5</xref> plots the aggregate results of preprocessing techniques identified.</p>
                <fig fig-type="figure" id="f5" orientation="portrait" position="float">
                    <label>Figure 5. </label>
                    <caption>
                        <title>Data preprocessing.</title>
                    </caption>
                    <graphic id="gr5" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/172337/60d35ccf-b47f-4e7f-96dd-2a191f4b4a93_figure5.gif"/>
                </fig>
                <p>Nearly all studies applied tokenization and/or segmentation (83%, 
                    <italic toggle="yes">n</italic>=19) for breaking down text into manageable units. Similarly, PDF parsing/extraction techniques were applied in 65% (
                    <italic toggle="yes">n</italic>=15) of studies; the remaining studies applied extraction to other document formats (e.g., journal articles available online in HTML format; see 
                    <xref ref-type="bibr" rid="ref20">Diaz-Elsayed &amp; Zhang, 2020</xref>). Related methods that additionally take syntactic structure into account, including chunking and dependency parsing, were less frequently applied (
                    <xref ref-type="bibr" rid="ref2">Angrosh et al., 2014</xref>; 
                    <xref ref-type="bibr" rid="ref44">Li et al., 2022</xref>; 
                    <xref ref-type="bibr" rid="ref49">Nayak et al., 2021</xref>; 
                    <xref ref-type="bibr" rid="ref58">Pertsas &amp; Constantopoulos, 2018</xref>). Tagging methods, including PoS tagging (assigning grammatical categories, e.g., noun, verb), followed by concept tagging (e.g., semantic annotation), or sequence tagging, where labels were assigned based on order of appearance, were used in 43% (
                    <italic toggle="yes">n</italic>=15) of studies. Nine studies used manual annotation for training and/or evaluation.</p>
                <p>Among noise reduction approaches, stop-word removal was the most common; stemming, normalization, and lemmatization were also applied, though less frequently. For stemming approaches, the Porter stemmer (
                    <xref ref-type="bibr" rid="ref61">Porter, 1980</xref>), including its extensions (e.g., Porter2, S-stemmer, Snowball stemmer), was as commonly reported as traditional stemmers (see 
                    <xref ref-type="bibr" rid="ref7">Aumiller et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref9">Bayatmakou et al., 2022</xref>; 
                    <xref ref-type="bibr" rid="ref66">Shahid &amp; Afzal, 2018</xref>). Optical Character Recognition (OCR) appeared in three studies; however, 
                    <xref ref-type="bibr" rid="ref34">Iwatsuki et al. (2017)</xref> used OCR only as a benchmark for evaluating their CRF method for detecting math expressions.</p>
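                <p>To make the preprocessing steps named above concrete, the following Python sketch chains tokenization, stop-word removal, and a simplified form of stemming. It is a minimal illustration rather than a reconstruction of any included study: the stop-word list is a small assumed sample, and the hypothetical <monospace>crude_stem</monospace> function strips a handful of suffixes as a toy stand-in for the Porter algorithm, which applies a much richer rule set.</p>
                <code language="python"><![CDATA[
```python
import re

# Small illustrative stop-word list (an assumption); real pipelines use longer lists.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "was", "were", "for"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def crude_stem(token):
    """Strip one common suffix; a toy stand-in, not the Porter algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining token."""
    return [crude_stem(t) for t in remove_stop_words(tokenize(text))]


tokens = preprocess("The models were trained for extracting outcomes")
```
]]></code>
                <p>Because the suffix rules are purely orthographic, the sketch can produce non-words (for example, truncating a final &#x201c;es&#x201d;), which is also a known property of rule-based stemmers in general.</p>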
                <p>Feature engineering (e.g., ranking functions, representation learning and feature extraction techniques) covers a range of methods essential for transforming raw text data into structured, machine-readable representations to facilitate downstream ML/NLP tasks (
                    <xref ref-type="bibr" rid="ref39">Kowsari et al., 2019</xref>). See 
                    <xref ref-type="fig" rid="f6">Figure 6</xref>.</p>
                <fig fig-type="figure" id="f6" orientation="portrait" position="float">
                    <label>Figure 6. </label>
                    <caption>
                        <title>Feature engineering.</title>
                    </caption>
                    <graphic id="gr6" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/172337/60d35ccf-b47f-4e7f-96dd-2a191f4b4a93_figure6.gif"/>
                </fig>
                <p>Word embeddings were the most frequently used technique. We grouped ELMo (word embeddings from language models) with traditional word embeddings such as Word2Vec and GloVe (
                    <xref ref-type="bibr" rid="ref39">Kowsari et al., 2019</xref>; 
                    <xref ref-type="bibr" rid="ref78">Young et al., 2018</xref>, Chapter 6). Of these, GloVe was used in four studies (
                    <xref ref-type="bibr" rid="ref17">Chen et al., 2021</xref>; 
                    <xref ref-type="bibr" rid="ref27">Goldfarb-Tarrant et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref51">Nowak &amp; Kunstman, 2019</xref>; 
                    <xref ref-type="bibr" rid="ref3">Anisienia et al., 2021</xref>) and ELMo in two (
                    <xref ref-type="bibr" rid="ref51">Nowak &amp; Kunstman, 2019</xref>; 
                    <xref ref-type="bibr" rid="ref3">Anisienia et al., 2021</xref>). The most common frequency-based feature representation approaches were Bag-of-Words (BoW, 
                    <italic toggle="yes">n</italic>=5) and Term frequency-Inverse Document Frequency (TF-IDF, 
                    <italic toggle="yes">n</italic>=4). Although less frequently applied in the corpus, methods for representing words or documents as vectors based on semantic properties such as Vector Space Models (VSM) and sentence embeddings were used as early as 2007. Other less commonly reported methods included synonym aggregation/expansion, best match ranking (BM25), shingling, and subject-verb pairings.</p>
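                <p>The two frequency-based representations mentioned above can be sketched in a few lines of Python. The example below is illustrative only: the three toy documents are invented, and the weighting shown (relative term frequency multiplied by the logarithm of the inverse document frequency) is one common TF-IDF variant among several, not necessarily the one used in any included study.</p>
                <code language="python"><![CDATA[
```python
import math
from collections import Counter

# Invented, already-tokenized toy documents (an assumption for illustration).
docs = [
    ["survey", "data", "extraction"],
    ["data", "extraction", "tool"],
    ["survey", "response", "rate"],
]


def bag_of_words(tokens):
    """Bag-of-Words: map each term to its raw count in one document."""
    return Counter(tokens)


def tf_idf(documents):
    """Weight each term by relative frequency times log(N / document frequency)."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = bag_of_words(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


weights = tf_idf(docs)
```
]]></code>
                <p>In this variant, a term that occurs in only one document (such as &#x201c;tool&#x201d;) receives a higher weight than a term shared across documents (such as &#x201c;data&#x201d;), and a term present in every document would receive a weight of zero.</p>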
            </sec>
            <sec id="sec21">
                <title>Model architectures and components</title>
                <p>The model architecture category focuses on the architectures and components of ML/NLP models used for data extraction. Results are shown in 
                    <xref ref-type="fig" rid="f7">Figure 7</xref>. Some approaches overlapped across applications (e.g., semantic web or semantic indexing structures and ontology-pipeline approaches); we grouped these techniques into categories to facilitate reporting. Likewise, all transformer-based approaches were grouped into a single category; however, specific architectures and components are discussed in the sections below, and detailed coding of extracted data is available in the supplemental data files (see &#x2018;Underlying Data&#x2019; section). Where rule-based approaches were considered a part of the system architecture, they are reported under the &#x2018;Rule-bases&#x2019; section.</p>
                <fig fig-type="figure" id="f7" orientation="portrait" position="float">
                    <label>Figure 7. </label>
                    <caption>
                        <title>Model architectures and components.</title>
                    </caption>
                    <graphic id="gr7" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/172337/60d35ccf-b47f-4e7f-96dd-2a191f4b4a93_figure7.gif"/>
                </fig>
                <p>Overall, approaches ranged from straightforward implementations to complex layered architectures. Examples of more straightforward approaches included architectures based entirely on rule-bases (e.g., 
                    <xref ref-type="bibr" rid="ref20">Diaz-Elsayed &amp; Zhang, 2020</xref>), applications based on one classification method (e.g., na&#x00ef;ve Bayes; 
                    <xref ref-type="bibr" rid="ref50">Neppalli et al., 2016</xref>), or those utilizing a single type of probabilistic model (
                    <xref ref-type="bibr" rid="ref2">Angrosh et al., 2014</xref>; 
                    <xref ref-type="bibr" rid="ref34">Iwatsuki et al., 2017</xref>). At the other end of the complexity continuum, 
                    <xref ref-type="bibr" rid="ref51">Nowak and Kunstman (2019)</xref> presented an end-to-end deep learning model based on a BI-LSTM-CRF architecture with interleaved alternating LSTM layers and highway connections. In the following sections, we further elaborate on various approaches identified.</p>
                <p>
                    <bold>
                        <italic toggle="yes">Ontology-based and Semantic Web.</italic>
                    </bold> These pipelines involve leveraging ontologies and semantic web technologies for semantic annotation or knowledge representation. Among included studies, ontology and semantic web capabilities were explored as early as 2014, but the preliminary results from this baseline review suggest an upward trend in recent years. 
                    <xref ref-type="bibr" rid="ref2">Angrosh et al. (2014)</xref>, for example, developed a Sentence Context Ontology (SENTCON) for modeling the contexts of information extracted from research documents. 
                    <xref ref-type="bibr" rid="ref60">Piroi et al. (2015)</xref> developed and presented an annotation system for populating ontologies in domains lacking adequate dictionaries. Some work focused on automatically mapping structures of research documents. For example, using an open source lexical database to develop a canonical model of structure, 
                    <xref ref-type="bibr" rid="ref1">Aliyu et al. (2018)</xref> were able to automatically identify and extract target paper sections from research documents. 
                    <xref ref-type="bibr" rid="ref66">Shahid and Afzal (2018)</xref> utilized specialized ontologies to automatically tag content in research papers by logical sections. 
                    <xref ref-type="bibr" rid="ref16">Chen et al. (2020)</xref> presented a novel framework for text summarization, including ontology-based topic identification and user-focused summarization modules.</p>
                <p>
                    <bold>
                        <italic toggle="yes">Transformer-based Approaches.</italic>
                    </bold> Our results suggested that transformer-based approaches have experienced rapid growth since 2020. Bidirectional Encoder Representations from Transformers (BERT) and other BERT-based language models made up the majority of transformer-based approaches. Specifically, BERT (
                    <xref ref-type="bibr" rid="ref7">Aumiller et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref67">Shen et al., 2022</xref>) and SciBERT (
                    <xref ref-type="bibr" rid="ref27">Goldfarb-Tarrant et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref44">Li et al., 2022</xref>) were the most utilized for tasks relevant to extracting data from research in social sciences. Other language models included BioBERT (
                    <xref ref-type="bibr" rid="ref16">Chen et al., 2020</xref>) and DistilBERT (
                    <xref ref-type="bibr" rid="ref27">Goldfarb-Tarrant et al., 2020</xref>). We identified a recent application of the Hugging Face LED model (
                    <xref ref-type="bibr" rid="ref44">Li et al., 2022</xref>), a pre-trained longformer model developed to address length limitations associated with other transformer-based approaches (see 
                    <xref ref-type="bibr" rid="ref11">Beltagy et al., 2020</xref>).</p>
                <p>
                    <bold>
                        <italic toggle="yes">Named Entity Recognition (NER).</italic>
                    </bold> Six of the included studies applied Named Entity Recognition (NER) techniques. Increasing availability of tools to support the entire SLR pipeline, including data extraction efforts, may be partially to credit for upward trends in NER applications. Based on applications we identified, NER would best be described as versatile. Some studies incorporated NER as an integral component embedded throughout a larger ML/NLP pipeline (e.g., 
                    <xref ref-type="bibr" rid="ref27">Goldfarb-Tarrant et al., 2020</xref>), others included NER subcomponents leveraged primarily for preprocessing and feature representation tasks (e.g., 
                    <xref ref-type="bibr" rid="ref58">Pertsas &amp; Constantopoulos, 2018</xref>), and in one study, authors took advantage of open source NER tools that could be easily integrated into a highly modifiable artifact serving as a platform for future development of holistic approaches to scaling SLR tasks (
                    <xref ref-type="bibr" rid="ref19">Denzler et al., 2021</xref>).</p>
                <p>
                    <bold>
                        <italic toggle="yes">Extractive Question-Answering Models.</italic>
                    </bold> Extractive question-answering models involve tasks where a model answers questions by extracting text spans from a given context. Question-answering models appeared in our dataset as early as 2007 (
                    <xref ref-type="bibr" rid="ref45">Liu et al., 2007</xref>), with the remaining applications published in 2020 or later. Question answering techniques have a range of applications that most readers are likely familiar with, like chatbots and intelligent assistants (e.g., Alexa, Google Assistant, Siri). However, state-of-the-art approaches for question-answering over knowledge bases are also being put to use in the data extraction arena. The study by 
                    <xref ref-type="bibr" rid="ref9">Bayatmakou et al. (2022)</xref>, for example, introduced new methods for interactive multi-document text summarization that allow users to specify summary compositions and interactively refine queries after reviewing complete sentences automatically extracted from documents.</p>
                <p>
                    <bold>
                        <italic toggle="yes">Classifiers.</italic>
                    </bold> For classification approaches, we followed 
                    <xref ref-type="bibr" rid="ref63">Schmidt et al. (2021)</xref> in reporting instances of Support Vector Machines (SVM) separately from other binary classifiers and likewise found a high prevalence of SVM usage, accounting for 50% of all binary classifiers identified (
                    <xref ref-type="bibr" rid="ref27">Goldfarb-Tarrant et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref66">Shahid &amp; Afzal, 2018</xref>; 
                    <xref ref-type="bibr" rid="ref67">Shen et al., 2022</xref>; 
                    <xref ref-type="bibr" rid="ref81">Zielinski &amp; Mutschke, 2017</xref>). Among classifiers that use a linear combination of inputs (
                    <xref ref-type="bibr" rid="ref36">Jurafsky &amp; Martin, 2024</xref>), na&#x00ef;ve Bayes was the most frequent (
                    <xref ref-type="bibr" rid="ref50">Neppalli et al., 2016</xref>; 
                    <xref ref-type="bibr" rid="ref66">Shahid &amp; Afzal, 2018</xref>; 
                    <xref ref-type="bibr" rid="ref71">Torres et al., 2012</xref>; 
                    <xref ref-type="bibr" rid="ref81">Zielinski &amp; Mutschke, 2017</xref>). One study used a Perceptron classifier; however, it was extended (i.e., OvR) to handle multiclass problems (
                    <xref ref-type="bibr" rid="ref7">Aumiller et al., 2020</xref>). Multi-class classifiers were less common with one instance each of k-Nearest Neighbors (aka KNN/kLog; 
                    <xref ref-type="bibr" rid="ref81">Zielinski &amp; Mutschke, 2017</xref>) and the J48 classifier (C4.5 Decision Trees; 
                    <xref ref-type="bibr" rid="ref60">Piroi et al., 2015</xref>).</p>
                <p>
                    <bold>
                        <italic toggle="yes">Neural Networks.</italic>
                    </bold> Overall, there was a variety of neural network applications across the included studies. Most used Long Short-term Memory (LSTM), more specifically, Bidirectional LSTM (BiLSTM). We also identified one application of a Bidirectional Gated Recurrent Unit (BiGRU; 
                    <xref ref-type="bibr" rid="ref67">Shen et al., 2022</xref>). Convolutional Neural Network (CNN) architectures (
                    <xref ref-type="bibr" rid="ref27">Goldfarb-Tarrant et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref51">Nowak &amp; Kunstman, 2019</xref>; 
                    <xref ref-type="bibr" rid="ref3">Anisienia et al., 2021</xref>) were also present. Several studies evaluated state-of-the-art deep learning methods. For example, 
                    <xref ref-type="bibr" rid="ref67">Shen et al. (2022)</xref> compared the performance of deep learning models (TextCNN and BERT) for sentence classification in social sciences abstracts. In another comparative study, 
                    <xref ref-type="bibr" rid="ref3">Anisienia et al. (2021)</xref> compared methods for pretraining deep contextualized word representations for cutting-edge transfer learning techniques based on CNN and LSTM architectures in addition to classifier models (e.g., SVM).</p>
                <p>
                    <bold>
                        <italic toggle="yes">Probabilistic Models.</italic>
                    </bold> Among probabilistic models, Conditional Random Field (CRF) applications were predominant in our dataset. CRF was often applied for sequence labeling tasks, such as named entity recognition (e.g., 
                    <xref ref-type="bibr" rid="ref49">Nayak et al., 2021</xref>), or for classification tasks (e.g., 
                    <xref ref-type="bibr" rid="ref2">Angrosh et al., 2014</xref>). Overall, included studies provided evidence that CRF can form a powerful architecture when combined with RNNs (e.g., bi-GRU-CRF, bi-LSTM-CRF; see 
                    <xref ref-type="bibr" rid="ref51">Nowak &amp; Kunstman, 2019</xref>; 
                    <xref ref-type="bibr" rid="ref67">Shen et al., 2022</xref>). We found a single application of the Maximum Entropy Markov Model (MEMM); however, based on experimental results, the authors ultimately selected CRF for identifying sentence context for extraction from research publications (
                    <xref ref-type="bibr" rid="ref2">Angrosh et al., 2014</xref>).</p>
            </sec>
            <sec id="sec22">
                <title>Rule-bases</title>
                <p>Rule-based techniques involve the application of predefined rules or patterns to extract specific features from the text. Versatile and widely applicable, they offer a robust framework for automating data extraction or for capturing relevant information from large volumes of text. See 
                    <xref ref-type="fig" rid="f8">Figure 8</xref> for rule-based approaches reported across included studies.</p>
                <fig fig-type="figure" id="f8" orientation="portrait" position="float">
                    <label>Figure 8. </label>
                    <caption>
                        <title>Rule-bases.</title>
                    </caption>
                    <graphic id="gr8" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/172337/60d35ccf-b47f-4e7f-96dd-2a191f4b4a93_figure8.gif"/>
                </fig>
                <p>Overall, 70% (
                    <italic toggle="yes">n</italic>=16) of included studies utilized rule- or heuristic-based approaches to support a variety of tasks for data extraction. Of these, nearly half (
                    <italic toggle="yes">n</italic>=7) reported using Regular Expressions (RegEx). For example, based on rules developed from manual inspection, RegEx was used by 
                    <xref ref-type="bibr" rid="ref71">Torres et al. (2012)</xref> to construct patterns for identifying specific types of sentences (e.g., objective, results, conclusions) and by 
                    <xref ref-type="bibr" rid="ref27">Goldfarb-Tarrant et al. (2020)</xref> for splitting papers into specific sections (e.g., abstract, introduction, methods). Alternatively, 
                    <xref ref-type="bibr" rid="ref58">Pertsas and Constantopoulos (2018)</xref> used RegEx to exploit lexico-syntactic patterns derived from an ontology knowledge base (Activities, Goals, and Propositions). Other RegEx uses included modifying datasets to incorporate patterns related to citation mentions (
                    <xref ref-type="bibr" rid="ref3">Anisienia et al., 2021</xref>) or application of rule-based chunking and processing to identify and extract relevant chunks from text (
                    <xref ref-type="bibr" rid="ref49">Nayak et al., 2021</xref>). The remaining six studies described custom rule-based algorithms or other heuristic approaches. 
                    <xref ref-type="bibr" rid="ref44">Li et al. (2022)</xref>, for example, applied rule-based algorithms PrefixSpan and Gap-Bide for the extraction of frequent discourse sequences. RAKE (Rapid Automatic Keyword Extraction) was applied by 
                    <xref ref-type="bibr" rid="ref9">Bayatmakou et al. (2022)</xref> to extract keywords that served as representations of a document&#x2019;s content. Finally, 
                    <xref ref-type="bibr" rid="ref1">Aliyu et al. (2018)</xref> described a rule-based algorithm developed for processing full-text documents to identify and extract section headings.</p>
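                <p>As a hedged illustration of the RegEx-based section splitting described above, the Python sketch below matches lines that consist solely of a known section name and treats the text between consecutive matches as that section&#x2019;s body. The heading list and the hypothetical <monospace>split_sections</monospace> function are assumptions made for illustration; they are not the rules reported by any of the cited studies, which were typically derived from manual inspection of a specific corpus.</p>
                <code language="python"><![CDATA[
```python
import re

# Hypothetical heading pattern: a line containing only a known section name.
HEADING = re.compile(
    r"^(abstract|introduction|methods?|results|discussion|conclusions?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def split_sections(text):
    """Return a dict mapping lowercase section headings to their body text."""
    sections = {}
    matches = list(HEADING.finditer(text))
    for i, match in enumerate(matches):
        # A section body runs until the next heading, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[match.group(1).lower()] = text[match.end():end].strip()
    return sections


paper = """Abstract
We review extraction tools.
Methods
We searched three databases.
Results
Twenty-three studies were included.
"""
sections = split_sections(paper)
```
]]></code>
                <p>A pattern this simple would miss numbered or differently worded headings, which reflects why rule-bases built this way tend to be tailored to the layout of a particular document collection.</p>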
            </sec>
            <sec id="sec23">
                <title>Evaluation metrics</title>
                <p>Evaluation metrics are presented in 
                    <xref ref-type="fig" rid="f9">Figure 9</xref>. Precision, recall, F-scores, and accuracy were predominantly reported across studies, including the earliest published articles. For assessment of model performance, six studies used cross-validation (CV), a process of &#x201c;averaging several hold-out estimators of the risk corresponding to different data splits&#x201d; (
                    <xref ref-type="bibr" rid="ref6">Arlot &amp; Celisse, 2010</xref>, p. 53). 
                    <italic toggle="yes">K</italic>-fold CV (5 or 10 folds) was predominantly applied (
                    <xref ref-type="bibr" rid="ref2">Angrosh et al., 2014</xref>; 
                    <xref ref-type="bibr" rid="ref34">Iwatsuki et al., 2017</xref>; 
                    <xref ref-type="bibr" rid="ref50">Neppalli et al., 2016</xref>; 
                    <xref ref-type="bibr" rid="ref67">Shen et al., 2022</xref>), with one application of leave-one-out or LOOCV (
                    <xref ref-type="bibr" rid="ref60">Piroi et al., 2015</xref>) and one application of document-level CV used as a supplemental technique to 
                    <italic toggle="yes">k</italic>-fold (
                    <xref ref-type="bibr" rid="ref50">Neppalli et al., 2016</xref>). Five studies provided descriptions of user feedback and other ratings. User feedback (among other metrics) was reported by 
                    <xref ref-type="bibr" rid="ref44">Li et al. (2022)</xref> who conducted expert human comparative assessment to assess fluency, relevance, coherence, and overall quality of model citation span/sentence generation outputs. This category also included evaluation metrics not listed in the sources we adapted when developing our protocol (see 
                    <xref ref-type="bibr" rid="ref54">O&#x2019;Mara-Eames et al., 2015</xref>, p. 3, 
                    <xref ref-type="table" rid="T1">Table 1</xref>; 
                    <xref ref-type="bibr" rid="ref63">Schmidt et al., 2021</xref>, pp. 8-9). For example, in assessing their system on values returned for queries of interest, 
                    <xref ref-type="bibr" rid="ref49">Nayak et al. (2021)</xref> reported suitability, adaptability, relevance scores, and data-dependencies. As another example, 
                    <xref ref-type="bibr" rid="ref19">Denzler et al. (2021</xref>, p. 5) evaluated their artifact based on design science aspects (i.e., validity, efficacy, and utility).</p>
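                <p>For readers less familiar with these metrics, the Python sketch below shows how precision, recall, and F1 can be computed for a set of extracted items, and how <italic toggle="yes">k</italic>-fold CV partitions a dataset into train and test splits. It is a simplified illustration under stated assumptions: the gold and predicted item sets are invented, the folds are contiguous rather than shuffled or stratified, and the function names are hypothetical. In a full evaluation, a model would be fitted on each training split and the per-fold scores averaged.</p>
                <code language="python"><![CDATA[
```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall, and F1 for two sets of extracted items."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def k_fold_indices(n_items, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_items) if i < start or i >= start + size]
        yield train, test
        start += size


# Invented example: items a human coder extracted vs. items a tool extracted.
gold = {"n=120", "RCT", "p<.05"}
predicted = {"n=120", "RCT", "survey"}
precision, recall, f1 = precision_recall_f1(gold, predicted)

# Five folds over ten documents: each document is held out exactly once.
folds = list(k_fold_indices(10, 5))
```
]]></code>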
                <fig fig-type="figure" id="f9" orientation="portrait" position="float">
                    <label>Figure 9. </label>
                    <caption>
                        <title>Evaluation metrics.</title>
                    </caption>
                    <graphic id="gr9" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/172337/60d35ccf-b47f-4e7f-96dd-2a191f4b4a93_figure9.gif"/>
                </fig>
                <p>Given the rapid growth of domain-specific ontologies and pre-trained language models, it is not surprising to find Kappa statistics reported for tasks such as evaluating agreement between human annotators when creating gold standard datasets for training and evaluation (Cohen&#x2019;s Kappa, see 
                    <xref ref-type="bibr" rid="ref58">Pertsas &amp; Constantopoulos, 2018</xref>; Mezzich&#x2019;s Kappa or Gwet&#x2019;s AC1, see 
                    <xref ref-type="bibr" rid="ref3">Anisienia et al., 2021</xref>). Semantic similarity scores, which can be used to compare model generated responses against ground truth responses in query-based or question-answering based applications, were reported in two studies (Jaccard Index, 
                    <xref ref-type="bibr" rid="ref9">Bayatmakou et al., 2022</xref>; DKPro Similarity, 
                    <xref ref-type="bibr" rid="ref81">Zielinski &amp; Mutschke, 2017</xref>).</p>
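                <p>Two of the statistics named above have compact definitions that a short Python sketch can make explicit. Cohen&#x2019;s Kappa corrects the observed agreement between two annotators for the agreement expected by chance, and the Jaccard Index divides the size of the intersection of two sets by the size of their union. The annotation labels and token lists below are invented for illustration, and the function names are hypothetical; the sketch does not reproduce the evaluation procedure of any included study.</p>
                <code language="python"><![CDATA[
```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators labeling the same items.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


def jaccard(tokens_a, tokens_b):
    """Jaccard Index: shared unique tokens divided by all unique tokens."""
    a, b = set(tokens_a), set(tokens_b)
    return len(a & b) / len(a | b) if a | b else 1.0


# Invented sentence-type labels from two hypothetical annotators.
annotator_1 = ["goal", "method", "method", "result", "goal", "result"]
annotator_2 = ["goal", "method", "result", "result", "goal", "method"]
kappa = cohens_kappa(annotator_1, annotator_2)

# Invented system response compared against a reference response.
overlap = jaccard(["data", "extraction", "tool"], ["data", "extraction", "method"])
```
]]></code>
                <p>Computed on token sets in this way, the Jaccard Index measures lexical overlap; applications that need to credit paraphrases typically combine it with, or replace it by, embedding-based similarity measures.</p>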
            </sec>
            <sec id="sec24">
                <title>Availability, accessibility and transferability</title>
                <p>While only one study we reviewed presented an existing tool that was accessible to users through an online application (
                    <ext-link ext-link-type="uri" xlink:href="http://sysrev.com">sysrev.com</ext-link>; 
                    <xref ref-type="bibr" rid="ref13">Bozada et al., 2021</xref>) at the time of conducting this baseline review, two other tools were either being prepared or were available through other means. These included the Holistic Modifiable Literature Review tool (
                    <xref ref-type="bibr" rid="ref19">Denzler et al., 2021</xref>), which was listed as &#x201c;currently being prepared&#x201d; (available at 
                    <ext-link ext-link-type="uri" xlink:href="https://holimolirev.github.io/HoliMoLiRev/">https://holimolirev.github.io/HoliMoLiRev/</ext-link>) and HypothesisReader (
                    <xref ref-type="bibr" rid="ref17">Chen et al., 2021</xref>), which was available to users through an R Shiny application. SysRev (
                    <xref ref-type="bibr" rid="ref13">Bozada et al., 2021</xref>) was also the only tool cataloged in the SR Toolbox (
                    <xref ref-type="bibr" rid="ref46">Marshall et al., 2022</xref>). Six of the twenty-three studies (26%) made source code openly available (
                    <xref ref-type="bibr" rid="ref17">Chen et al., 2021</xref>; 
                    <xref ref-type="bibr" rid="ref19">Denzler et al., 2021</xref>; 
                    <xref ref-type="bibr" rid="ref20">Diaz-Elsayed &amp; Zhang, 2020</xref>; 
                    <xref ref-type="bibr" rid="ref27">Goldfarb-Tarrant et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref34">Iwatsuki et al., 2017</xref>; 
                    <xref ref-type="bibr" rid="ref44">Li et al., 2022</xref>). Article references and corresponding repositories are detailed in 
                    <xref ref-type="table" rid="T3">Table 3</xref>. GitHub was the most common repository for code and data sharing (five of the six studies), and one study made source code available online through an open access publisher.</p>
                <table-wrap id="T3" orientation="portrait" position="float">
                    <label>Table 3. </label>
                    <caption>
                        <title>Code repositories.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">Reference</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Code repository</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref17">Chen et al. (2021)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">devtools::install_github("canfielder/HypothesisReader")</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref19">Denzler et al. (2021)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">GitHub Repository: 
                                    <ext-link ext-link-type="uri" xlink:href="https://github.com/HoliMoLiRev/HoliMoLiRev">https://github.com/HoliMoLiRev/HoliMoLiRev</ext-link>
                                </td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref20">Diaz-Elsayed and Zhang (2020)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <ext-link ext-link-type="uri" xlink:href="https://methods-x.com/article/S2215-0161(20)30224-7/fulltext#supplementaryMaterial">https://methods-x.com/article/S2215-0161(20)30224-7/fulltext#supplementaryMaterial</ext-link>
                                </td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref27">Goldfarb-Tarrant et al. (2020)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <ext-link ext-link-type="uri" xlink:href="https://github.com/seraphinatarrant/systematic_reviews">https://github.com/seraphinatarrant/systematic_reviews</ext-link>
                                </td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref34">Iwatsuki et al. (2017)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <ext-link ext-link-type="uri" xlink:href="https://github.com/Alab-NII/inlinemath">https://github.com/Alab-NII/inlinemath</ext-link>
                                </td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref44">Li et al. (2022)</xref>
                                </td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">
                                    <ext-link ext-link-type="uri" xlink:href="https://github.com/jacklxc/CORWA">https://github.com/jacklxc/CORWA</ext-link>
                                </td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
            </sec>
            <sec id="sec25">
                <title>Transferability</title>
                <p>In the evolving landscape of systematic reviews and meta-analyses, the adaptability of tools and technologies to new research domains emerged as a critical factor for research efficiency and scope. Many authors working toward the automation of data extraction offered insights into how well their tools and technologies transfer to research that targets data elements beyond PICO.</p>
                <p>Several authors of reviewed studies specifically addressed transferability in describing the development of their tools, and further subjected these tools to rigorous testing aimed at validating transferable capabilities (
                    <xref ref-type="bibr" rid="ref16">Chen et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref27">Goldfarb-Tarrant et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref50">Neppalli et al., 2016</xref>). For instance, 
                    <xref ref-type="bibr" rid="ref50">Neppalli et al. (2016)</xref> created MetaSeer.STEM with a focus on extraction of data across a range of research domains, including education, management, and health informatics. 
                    <xref ref-type="bibr" rid="ref16">Chen et al. (2020)</xref> highlighted the adaptability of OATS, showcasing its broader application potential to fields beyond the authors&#x2019; COVID-19 specific demonstration. Finally, 
                    <xref ref-type="bibr" rid="ref27">Goldfarb-Tarrant et al. (2020)</xref> affirmed the domain-independent nature of their framework, suggesting its suitability for various systematic reviews.</p>
                <p>Additionally, other studies highlighted the need for transferability and discussed the potential for their research tools and technologies to be extended and adapted across varying domains, stressing the importance of flexible design principles in the development of these tools (
                    <xref ref-type="bibr" rid="ref2">Angrosh et al., 2014</xref>; 
                    <xref ref-type="bibr" rid="ref20">Diaz-Elsayed &amp; Zhang, 2020</xref>). 
                    <xref ref-type="bibr" rid="ref2">Angrosh et al. (2014)</xref> explained how SENTCON&#x2019;s preliminary design was applied to a specific set of articles in computer science but emphasized that the tool was flexible enough to be applied to other domains through the use of the Web Ontology Language (OWL). 
                    <xref ref-type="bibr" rid="ref20">Diaz-Elsayed and Zhang (2020)</xref> presented methods that were initially applied to wastewater-based resource recovery, but likewise emphasized that the tool was capable of evaluating other engineered systems and retrieving different types of data than those initially extracted.</p>
                <p>As noted by 
                    <xref ref-type="bibr" rid="ref17">Chen et al. (2021)</xref>, while efforts are being made to assist the process of conducting systematic reviews, there is often limited generalizability of domain-specific pre-trained language models. Many studies included in our review discussed the generalizability and transferability of tools developed to support the broader research community in (semi)automated data extraction tasks. Collectively, these studies suggest a positive trend toward the development of adaptable, transferable research tools and technologies. However, they also underscore the need for ongoing effort across and between diverse domains to make continued progress toward broader research applications.</p>
            </sec>
            <sec id="sec26">
                <title>Open source tools</title>
                <p>An outcome we did not anticipate was the substantial number of open-source tools, toolkits, and frameworks utilized by our relatively small corpus of articles. Because we were unsure what to expect, we made every effort to capture evidence that might prove useful to social science researchers. We identified 50 different open-source technologies including platforms, software, software suites, packages/libraries, algorithms, pre-trained models, controlled vocabularies/thesauri, lexical databases, knowledge representations, and more. The open-source tools identified are reported in 
                    <xref ref-type="fig" rid="f10">Figure 10</xref>. Of the open-source resources available to researchers, the overwhelming majority were Python tools (
                    <italic toggle="yes">n</italic>=16; see Python Package Index, 
                    <ext-link ext-link-type="uri" xlink:href="https://pypi.org/">https://pypi.org/</ext-link>), and 8 of the 23 studies (35%) used the Python Natural Language Toolkit (NLTK). The full list of open-source tools and license details are available in the OSF repository (see &#x2018;Underlying Data&#x2019; section).</p>
                <fig fig-type="figure" id="f10" orientation="portrait" position="float">
                    <label>Figure 10. </label>
                    <caption>
                        <title>Open source tools.</title>
                    </caption>
                    <graphic id="gr10" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/172337/60d35ccf-b47f-4e7f-96dd-2a191f4b4a93_figure10.gif"/>
                </fig>
            </sec>
            <sec id="sec27">
                <title>APA data elements</title>
                <p>This section discusses the potential for extracting key data elements of interest, as well as the locations (i.e., paper sections), structures, and review tasks addressed by the studies reviewed. We limited this section to reporting tools that users could theoretically access and use to support their own research projects. Twelve studies presented systems or artifacts designed to facilitate various tasks associated with identifying and extracting data from published literature. To avoid speculating as to the future availability of these tools, we included all studies that presented tools or systems in which the authors incorporated user interfaces (UIs), regardless of availability at the time of conducting this baseline review.</p>
                <p>
                    <xref ref-type="table" rid="T4">Table 4</xref> provides an overview of data elements targeted as outlined by JARS (
                    <xref ref-type="bibr" rid="ref8">Appelbaum et al., 2018</xref>, p. 6). Each tool was assessed for potential to extract specific data elements by manuscript section (i.e., methods and results reporting elements pertinent to meta-analytic research; see 
                    <xref ref-type="bibr" rid="ref42">Legate &amp; Nimon, 2023b</xref>). Where the authors did not state a tool name, we used the description of the tool as presented in the paper (e.g., 
                    <xref ref-type="bibr" rid="ref9">Bayatmakou et al., 2022</xref>; 
                    <xref ref-type="bibr" rid="ref49">Nayak et al., 2021</xref>).</p>
                <table-wrap id="T4" orientation="portrait" position="float">
                    <label>Table 4. </label>
                    <caption>
                        <title>APA data elements.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top"/>
                                <th align="left" colspan="1" rowspan="1" valign="top"/>
                                <th align="left" colspan="1" rowspan="1" valign="top"/>
                                <th align="left" colspan="1" rowspan="1" valign="top">CORWA</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">CIRRA</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">DASyR (IR)</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Holistic Modifiable Literature Reviewer</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">HypothesisReader</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Interactive Text Summarization System</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">MetaSeer.STEM</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">OATS</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Research Spotlight</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Semi-automatic Data Extraction System</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">SysRev</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">TableSeer</th>
                            </tr>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">Manuscript Section</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Item</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Example Reporting Elements</th>
                                <th align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref44">Li et al. (2022)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref2">Angrosh et al. (2014)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref60">Piroi et al. (2015)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref19">Denzler et al. (2021)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref17">Chen et al. (2021)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref9">Bayatmakou et al. (2022)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref50">Neppalli et al. (2016)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref16">Chen et al. (2020)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref58">Pertsas and Constantopoulos (2018)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref49">Nayak et al. (2021)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref13">Bozada et al. (2021)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="middle">
                                    <xref ref-type="bibr" rid="ref45">Liu et al. (2007)</xref>
                                </th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="6" valign="middle">Methods</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Criteria, Data Collection &amp; Participants</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Participant selection [setting(s), location(s), date(s), % approached vs. participated], Major/topic-specific demographics [age, sex, ethnicity, achievement level(s), tenure]</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Sample Size, Power &amp; Precision</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Intended vs. actual sample size, Sample size determination [power analysis, parameter estimates]</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Measures &amp; Instrumentation</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Measures [primary, secondary, covariates], Psychometric properties [reliability coefficients, internal consistency reliability, discriminant/convergent validity, test-retest coefficients, time lag intervals]</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Conditions &amp; Design</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Experimental/manipulated [randomized, nonrandomized], Nonexperimental [observational, single- or multi-group], Other [longitudinal, N-of-1, replication]</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Data Diagnostics</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Post-collection inclusion/exclusion criteria [criteria to infer missing data, processing of outliers, data distributions, data transformations]</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Analytic Strategy &amp; Hypotheses</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Strategy for inferential statistics, Protection against error, Hypothesis(es) [primary, secondary, exploratory]</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="4" valign="middle">Results</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Participants &amp; Recruitment</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Participants (by group/stage), Dates [recruitment, repeated measures]</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Statistics &amp; Data Analysis</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Statistical/data-analytic methods, Missing data (frequency or %), Missing data methods [MCAR, MAR, MNAR], Validity issues [assumptions, distributions]</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Complex Analyses</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Analytic approach [SEM, HLM, factor analysis], Model details [fit indices], Software, Estimation Technique, Estimation Issues [e.g., convergence]</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Descriptive &amp; Inferential Statistics</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Descriptive [total sample size, sample size subgroups/cases, means, standard deviations, correlation matrices], Inferential [
                                    <italic toggle="yes">p</italic>-values, degrees of freedom, mean square effects/errors, effect size estimates, confidence intervals]</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
                <p>Unlike ongoing research that focuses on data extraction from clinical literature (e.g., PICO elements/RCTs; see 
                    <xref ref-type="bibr" rid="ref65">Schmidt et al., 2023</xref>), specific reporting guidelines were not a primary focus of the studies we identified. However, authors described target entities and/or research methods of interest with high levels of specificity. Examples include extracting descriptive statistics, sample size, and Likert scale points (
                    <xref ref-type="bibr" rid="ref50">Neppalli et al., 2016</xref>) and extracting research hypotheses from published literature in organizational sciences (
                    <xref ref-type="bibr" rid="ref17">Chen et al., 2021</xref>). Despite the lack of discourse surrounding specific reporting guidelines, many of the tools reviewed incorporated some form of user-prompted, annotation- or query-based approach to (semi)automated data extraction. Thus, the collective body of work gives reason for optimism about customizable, state-of-the-art methods that can support extraction for a wide range of disciplines, research designs, and entities or data elements of interest to social science researchers.</p>
                <p>One example of a highly flexible approach is extractive question-answering based on pre-trained Transformer models. Extractive question-answering models are able to generate direct answers from a knowledge base in response to natural language questions posed by users (
                    <xref ref-type="bibr" rid="ref40">Kwiatkowski et al., 2019</xref>). These tools typically offer enhanced flexibility through user-defined prompts and mechanisms for interactive query refinement. Example tools that incorporated question-answering techniques included CIRRA (
                    <xref ref-type="bibr" rid="ref2">Angrosh et al., 2014</xref>), the Interactive Text Summarization System for Scientific Documents (
                    <xref ref-type="bibr" rid="ref9">Bayatmakou et al., 2022</xref>), and OATS (
                    <xref ref-type="bibr" rid="ref16">Chen et al., 2020</xref>).</p>
                <p>Other types of flexible systems allow users to view excerpts related to specific keywords or queries, supporting expedited identification and labeling of target data elements. For example, several tools supported user labeling of data, followed by predictive classification based on user annotations. Although these tools do not automatically extract data for users, they do augment human effort by (semi)automating time-consuming tasks associated with data annotation and extraction. For instance, Sysrev (
                    <xref ref-type="bibr" rid="ref13">Bozada et al., 2021</xref>) supports researchers in labeling and extracting data by leveraging active learning models developed to replicate user decisions across various review tasks. Likewise, MetaSeer (
                    <xref ref-type="bibr" rid="ref50">Neppalli et al., 2016</xref>) used ML techniques to identify and extract numbers from documents, which were then presented to users for manual annotation. Unlike question-answering models, human-computer interactions in these examples are not based on natural language queries; however, human expertise can be used to &#x2018;train&#x2019; ML models to predict future annotation decisions. Similarly, to overcome the time constraints of open-ended annotation in fields that lack domain-specific dictionaries, DASyR (
                    <xref ref-type="bibr" rid="ref60">Piroi et al., 2015</xref>) utilized a combination of user annotations, classification models, and contextual information for populating ontologies. The authors reported a substantial reduction in annotation time, stating that through the DASyR UI &#x201c;five experts added approximately 30,000 annotations at a speed of 4s/annotation&#x201d; (p. 595).</p>
                <p>Lastly, we note the utility of NER for the advancement of (semi)automated extraction of APA-defined data elements. NER methodologies can be leveraged alongside classification models (
                    <xref ref-type="bibr" rid="ref49">Nayak et al., 2021</xref>), linked to domain-specific ontologies or other knowledge bases (
                    <xref ref-type="bibr" rid="ref60">Piroi et al., 2015</xref>), or incorporated as stand-alone modules integrated into larger modifiable frameworks (
                    <xref ref-type="bibr" rid="ref19">Denzler et al., 2021</xref>). In addition to pre-trained NER models for identification and extraction of named entities, Research Spotlight (
                    <xref ref-type="bibr" rid="ref58">Pertsas &amp; Constantopoulos, 2018</xref>) also exploited lexico-syntactic patterns in the scholarly ontology to identify and extract non-named entities. The Semi-automatic Data Extraction System for Heterogeneous Data Sources (
                    <xref ref-type="bibr" rid="ref49">Nayak et al., 2021</xref>) combined features of NER and rule-based chunking to identify and extract phrases based on regular expressions, as well as named entities contained in the documents. Further, NER can be implemented through open-source tools, as demonstrated by 
                    <xref ref-type="bibr" rid="ref19">Denzler et al. (2021)</xref> and 
                    <xref ref-type="bibr" rid="ref49">Nayak et al. (2021)</xref>.</p>
            </sec>
            <sec id="sec28">
                <title>Structure, location, and review tasks</title>
                <p>
                    <xref ref-type="table" rid="T5">Table 5</xref> provides an overview of the structure and location of extracted data elements, followed by the review tasks supported by the tools identified. The majority of studies developed approaches for (semi)automating extraction of data from any section of full-text research articles. Two studies tested the proposed techniques on specific article sections: titles and abstracts (
                    <xref ref-type="bibr" rid="ref9">Bayatmakou et al., 2022</xref>) and introduction and background sections (
                    <xref ref-type="bibr" rid="ref44">Li et al., 2022</xref>). Regarding the structure from which data were extracted, all tools except one extracted from unstructured text; two of these extracted from both tabular structures (i.e., tables) and text (
                    <xref ref-type="bibr" rid="ref49">Nayak et al., 2021</xref>; 
                    <xref ref-type="bibr" rid="ref58">Pertsas &amp; Constantopoulos, 2018</xref>), and one was designed specifically to extract elements from tables (TableSeer; 
                    <xref ref-type="bibr" rid="ref45">Liu et al., 2007</xref>).</p>
                <table-wrap id="T5" orientation="portrait" position="float">
                    <label>Table 5. </label>
                    <caption>
                        <title>Structure, location, review tasks.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="2" valign="top">Category</th>
                                <th align="left" colspan="1" rowspan="2" valign="top">Item</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">CORWA</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">CIRRA</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">DASyR (IR)</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Holistic Modifiable Literature Reviewer</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Hypothesis Reader</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Interactive Text Summarization System</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">MetaSeer.STEM</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">OATS</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Research Spotlight</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Semi-automatic Data Extraction System</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">SysRev</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">TableSeer</th>
                            </tr>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">
                                    <xref ref-type="bibr" rid="ref44">Li et al. (2022)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="top">
                                    <xref ref-type="bibr" rid="ref2">Angrosh et al. (2014)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="top">
                                    <xref ref-type="bibr" rid="ref60">Piroi et al. (2015)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="top">
                                    <xref ref-type="bibr" rid="ref19">Denzler et al. (2021)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="top">
                                    <xref ref-type="bibr" rid="ref17">Chen et al. (2021)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="top">
                                    <xref ref-type="bibr" rid="ref9">Bayatmakou et al. (2022)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="top">
                                    <xref ref-type="bibr" rid="ref50">Neppalli et al. (2016)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="top">
                                    <xref ref-type="bibr" rid="ref16">Chen et al. (2020)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="top">
                                    <xref ref-type="bibr" rid="ref58">Pertsas &amp; Constantopoulos (2018)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="top">
                                    <xref ref-type="bibr" rid="ref49">Nayak et al. (2021)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="top">
                                    <xref ref-type="bibr" rid="ref13">Bozada et al. (2021)</xref>
                                </th>
                                <th align="left" colspan="1" rowspan="1" valign="top">
                                    <xref ref-type="bibr" rid="ref45">Liu et al. (2007)</xref>
                                </th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="2" valign="middle">Structure</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Extract from Text</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Extract from Tables</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="5" valign="middle">Location</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Title &amp; Abstract</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Introduction &amp; Background</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Methods</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Results</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Discussion</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="15" valign="middle">Review Tasks</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">1 Formulate Review Question</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">2 Find Previous Reviews</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">3 Write Protocol</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">4 Devise Search Strategy</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">5 Search</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">6 De-duplicate</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">7 Screen Abstracts</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">8 Obtain Full Text</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">9 Screen Full Text</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">10 Snowball</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">11 Extract Data</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">12 Synthesize Data</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">13 Re-check Literature</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">14 Meta-Analyze</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">15 Write Up Review</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
                <p>All tools focused heavily on tasks related to data extraction (e.g., identification, labeling/annotation, ontology population), which was anticipated based on our search strategy and inclusion criteria. However, several studies advanced solutions for supporting other SLR tasks or stages (see 
                    <xref ref-type="bibr" rid="ref72">Tsafnat et al., 2014</xref>). The most common task (excluding data extraction) was literature search (
                    <xref ref-type="bibr" rid="ref9">Bayatmakou et al., 2022</xref>; 
                    <xref ref-type="bibr" rid="ref13">Bozada et al., 2021</xref>; 
                    <xref ref-type="bibr" rid="ref16">Chen et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref19">Denzler et al., 2021</xref>; 
                    <xref ref-type="bibr" rid="ref45">Liu et al., 2007</xref>; 
                    <xref ref-type="bibr" rid="ref58">Pertsas &amp; Constantopoulos, 2018</xref>). Many tasks listed are likely supported by a range of computational tools and techniques (e.g., synthesize and meta-analyze results); readers interested in (semi)automating other SLR stages are referred to the Systematic Review Toolbox for an extensive catalogue of tools and methods (
                    <xref ref-type="bibr" rid="ref46">Marshall et al., 2022</xref>).</p>
            </sec>
            <sec id="sec29">
                <title>Challenges</title>
                <p>A number of challenges were reflected within the body of evidence included in this baseline review. These challenges included difficulties in identifying functional structures within unstructured texts (
                    <xref ref-type="bibr" rid="ref67">Shen et al., 2022</xref>), extracting data from PDF file sources (
                    <xref ref-type="bibr" rid="ref49">Nayak et al., 2021</xref>; 
                    <xref ref-type="bibr" rid="ref27">Goldfarb-Tarrant et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref34">Iwatsuki et al., 2017</xref>), and accurately detecting in-line mathematical expressions (
                    <xref ref-type="bibr" rid="ref34">Iwatsuki et al., 2017</xref>). Computational complexity created another significant obstacle for researchers, with issues arising from text vectorization methods, optimization problems, and the computational resources required by neural network frameworks (
                    <xref ref-type="bibr" rid="ref9">Bayatmakou et al., 2022</xref>; 
                    <xref ref-type="bibr" rid="ref3">Anisienia et al., 2021</xref>). Furthermore, challenges associated with annotation, particularly biases introduced through the automated processes and limitations of available datasets, were a topic of discourse (
                    <xref ref-type="bibr" rid="ref44">Li et al., 2022</xref>; 
                    <xref ref-type="bibr" rid="ref51">Nowak &amp; Kunstman, 2019</xref>; 
                    <xref ref-type="bibr" rid="ref71">Torres et al., 2012</xref>).</p>
                <p>Compared with the medical field, the social sciences and related fields posed domain-specific challenges that necessitated tailored approaches, which can become time-consuming because researchers often lack sufficient training data to develop robust classifiers (
                    <xref ref-type="bibr" rid="ref17">Chen et al., 2021</xref>; 
                    <xref ref-type="bibr" rid="ref7">Aumiller et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref81">Zielinski &amp; Mutschke, 2017</xref>). Additionally, meta-analytic methods often face hurdles related to variability in data representation, which significantly limits the use of data extraction tools, and to class imbalance when developing classification models (
                    <xref ref-type="bibr" rid="ref7">Aumiller et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref50">Neppalli et al., 2016</xref>; 
                    <xref ref-type="bibr" rid="ref27">Goldfarb-Tarrant et al., 2020</xref>).</p>
            </sec>
        </sec>
        <sec id="sec30" sec-type="conclusions">
            <title>Conclusions</title>
            <p>The findings of the baseline review indicate that, in automating information extraction for systematic reviews and meta-analyses, social science research lags behind clinical research, which focuses on automatic processing of information related to the PICO framework (i.e., Population, Intervention, Control, and Outcome; 
                <xref ref-type="bibr" rid="ref24">Eriksen &amp; Frandsen, 2018</xref>; 
                <xref ref-type="bibr" rid="ref72">Tsafnat et al., 2014</xref>). For example, while an LSR focusing on clinical research that is based on the PICO framework yielded 76 studies that included original data extraction (
                <xref ref-type="bibr" rid="ref65">Schmidt et al., 2023</xref>), the present review of social science research yielded only 23 relevant studies. This is not necessarily surprising when considering the breadth of social science research and the lack of unifying frameworks and domain-specific ontologies (
                <xref ref-type="bibr" rid="ref28">G&#x00f6;pfert et al., 2022</xref>; 
                <xref ref-type="bibr" rid="ref73">Wagner et al., 2022</xref>).</p>
            <p>With a few exceptions, most tools we identified were in their infancy and not accessible to applied researchers, were domain-specific, or required substantial manual coding of articles before automation could occur. Additionally, few solutions considered extraction of data from tables, which is where many of the elements that social and behavioral scientists analyze (e.g., effect sizes) reside. Further, development appears to have ceased for many of the tools identified.</p>
            <p>We found no evidence indicating hesitation on the part of social science researchers to adopt data extraction tools; on the contrary, abstractive text summarization approaches continue to gain traction across social science domains (
                <xref ref-type="bibr" rid="ref14">Cairo et al., 2019</xref>; 
                <xref ref-type="bibr" rid="ref25">Feng et al., 2017</xref>). While these methods aid researchers in distilling complex information into meaningful insights, there remains a gap in technologies developed to augment human capabilities in the extraction of key data entities of interest for secondary data collection from quantitative empirical reports.</p>
            <p>The impact of time-intensive research activities on translational value is not a new concern for the SLR research community. In many social sciences, emphasis is often placed on practical application and translational value, underscoring the importance of efficient research methodologies (
                <xref ref-type="bibr" rid="ref26">Githens, 2015</xref>). Further development of the identified systems and techniques could mitigate time delays that often result in outdated information as researchers cannot feasibly include all new evidence that may be released throughout the lifetime of a given project (
                <xref ref-type="bibr" rid="ref47">Marshall &amp; Wallace, 2019</xref>).</p>
            <sec id="sec31">
                <title>Limitations</title>
                <p>As with any method that involves subjectivity, results can be influenced by a variety of factors (e.g., study design, publication bias, researcher judgment). We worked diligently to conduct this review and document our procedures in a systematic and transparent manner; however, efforts to replicate our search strategy or screening processes may not result in the same corpus or reach the same conclusions (
                    <xref ref-type="bibr" rid="ref10">Belur et al., 2018</xref>). This baseline review presented an opportunity to better develop our search and screening strategy, methodological procedures, and research goals. Moving forward, we have developed a codebook and assessment procedures to increase the transparency and reliability of our research.</p>
                <p>A second limitation of this study was the omission of snowballing as a search strategy. Though we did not uncover applied secondary research articles utilizing automation tools, several potentially useable tools and systems were discovered in the course of this review. For future iterations of this LSR, we plan to incorporate forward snowballing from relevant articles in previous searches as part of our formalized search strategy (see 
                    <xref ref-type="bibr" rid="ref74">Wohlin et al., 2022</xref>). Additionally, our search strategy has limitations related to its focus on English-language publications, the non-exhaustive list of databases and sources consulted, and the exclusion of grey literature. Addressing these aspects in future updates could enhance the comprehensiveness of findings and provide a broader perspective on the current state of automation tools in secondary research.</p>
                <p>Finally, in this baseline review, we did not capture techniques used for optimization, training, or fine-tuning on specific datasets or tasks. Several techniques surfaced while conducting this review, such as class modifiers (e.g., OvR; 
                    <xref ref-type="bibr" rid="ref7">Aumiller et al., 2020</xref>), genetic algorithms (
                    <xref ref-type="bibr" rid="ref9">Bayatmakou et al., 2022</xref>; 
                    <xref ref-type="bibr" rid="ref71">Torres et al., 2012</xref>), Adam optimizer (
                    <xref ref-type="bibr" rid="ref51">Nowak &amp; Kunstman, 2019</xref>; 
                    <xref ref-type="bibr" rid="ref67">Shen et al., 2022</xref>), cross entropy loss (
                    <xref ref-type="bibr" rid="ref16">Chen et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref44">Li et al., 2022</xref>), Universal Language Model Fine-tuning (ULMFiT; 
                    <xref ref-type="bibr" rid="ref3">Anisienia et al., 2021</xref>), and back-propagation optimizers (
                    <xref ref-type="bibr" rid="ref16">Chen et al., 2020</xref>; 
                    <xref ref-type="bibr" rid="ref3">Anisienia et al., 2021</xref>). With increasing applications of pre-trained language models that can be fine-tuned for specific applications (
                    <xref ref-type="bibr" rid="ref36">Jurafsky &amp; Martin, 2024</xref>), inclusion of training and optimization approaches would provide a more comprehensive framework for reporting findings on ML/NLP approaches to data extraction. We plan to supplement future iterations of this review by capturing various optimization and training methods.</p>
            </sec>
        </sec>
        <sec id="sec34">
            <title>Ethics and consent</title>
            <p>Ethical approval and consent were not required.</p>
        </sec>
    </body>
    <back>
        <sec id="sec35" sec-type="data-availability">
            <title>Data availability</title>
            <sec id="sec36">
                <title>Underlying data</title>
                <p>OSF: (Semi)Automated Approaches to Data Extraction for Systematic Reviews and Meta-Analyses in Social Sciences: A Living Review (
                    <xref ref-type="bibr" rid="ref43">Legate &amp; Nimon, 2024</xref>). Open Science Framework: 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.17605/OSF.IO/C7NSA">https://doi.org/10.17605/OSF.IO/C7NSA</ext-link> (
                    <xref ref-type="bibr" rid="ref43">Legate &amp; Nimon, 2024</xref>).</p>
                <p>This project contains the following underlying data:
                    <list list-type="bullet">
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Baseline Review Underlying Data
                                <list list-type="bullet">
                                    <list-item>
                                        <label>&#x25cb;</label>
                                        <p>Baseline Review Results.xlsx</p>
                                    </list-item>
                                    <list-item>
                                        <label>&#x25cb;</label>
                                        <p>Baseline Search Results Folder (a folder containing results by each search source)</p>
                                    </list-item>
                                    <list-item>
                                        <label>&#x25cb;</label>
                                        <p>Open Source Tools.xlsx</p>
                                    </list-item>
                                </list>
                            </p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Baseline Review Extended Data
                                <list list-type="bullet">
                                    <list-item>
                                        <label>&#x25cb;</label>
                                        <p>Baseline Review Deduplication and Screening.docx</p>
                                    </list-item>
                                    <list-item>
                                        <label>&#x25cb;</label>
                                        <p>Baseline Review Search Strategy.docx</p>
                                    </list-item>
                                    <list-item>
                                        <label>&#x25cb;</label>
                                        <p>Baseline Review PRISMA Checklist.docx</p>
                                    </list-item>
                                    <list-item>
                                        <label>&#x25cb;</label>
                                        <p>LSR Codebook.docx</p>
                                    </list-item>
                                    <list-item>
                                        <label>&#x25cb;</label>
                                        <p>Regex to Boolean Sytnax.xlsx</p>
                                    </list-item>
                                </list>
                            </p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Baseline Review Code
                                <list list-type="bullet">
                                    <list-item>
                                        <label>&#x25cb;</label>
                                        <p>Adapted code files and results for automated search and screening for ACL, arXiv, and DBLP full database dumps.</p>
                                    </list-item>
                                </list>
                            </p>
                        </list-item>
                    </list>
                </p>
                <p>Data are available under the terms of the 
                    <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International Public License</ext-link> (CC-BY 4.0).</p>
            </sec>
            <sec id="sec37">
                <title>Extended data</title>
                <p>Open Science Framework: 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.17605/OSF.IO/C7NSA">https://doi.org/10.17605/OSF.IO/C7NSA</ext-link> (
                    <xref ref-type="bibr" rid="ref41">Legate &amp; Nimon, 2023a</xref>).</p>
                <p>This project contains the following extended data:
                    <list list-type="bullet">
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Extraction Techniques Revised.docx &#x2013; categories and descriptions of data extraction techniques, architecture components, and evaluation metrics of interest</p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Review Classifications.docx &#x2013; review tasks and stages of interest</p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Target Data Elements.docx &#x2013; key elements of interest for targeted data elements</p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Comprehensive List of Eligible Data Elements.xlsx &#x2013; comprehensive list of elements with extraction potential per APA JARS</p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Search Strategy.docx &#x2013; search syntax for preliminary search in Web of Science</p>
                        </list-item>
                        <list-item>
                            <label>&#x2022;</label>
                            <p>APA &amp; Cochrane Data Elements.xlsx &#x2013; tabled data elements for Cochrane reviews, APA Module C (clinical trials), and APA (all study designs)</p>
                        </list-item>
                    </list>
                </p>
                <p>Data are available under the terms of the 
                    <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International Public License</ext-link> (CC-BY 4.0).</p>
            </sec>
            <sec id="sec38">
                <title>Reporting guidelines</title>
                <p>This study follows PRISMA reporting guidelines (
                    <xref ref-type="bibr" rid="ref56">Page et al., 2021</xref>).</p>
                <p>Open Science Framework: PRISMA checklist for &#x2018;(Semi)Automated Approaches to Data Extraction for Systematic Reviews and Meta-Analyses in Social Sciences: A Living Review&#x2019;. 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.17605/OSF.IO/C7NSA">https://doi.org/10.17605/OSF.IO/C7NSA</ext-link> (
                    <xref ref-type="bibr" rid="ref43">Legate &amp; Nimon, 2024</xref>).</p>
                <p>Data are available under the terms of the 
                    <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International Public License</ext-link> (CC-BY 4.0).</p>
            </sec>
        </sec>
        <sec id="sec39">
            <title>Software availability</title>
            <p>

                <list list-type="bullet">
                    <list-item>
                        <label>&#x2022;</label>
                        <p>Source code available from: 
                            <ext-link ext-link-type="uri" xlink:href="https://github.com/mcguinlu/COVID_suicide_living">https://github.com/mcguinlu/COVID_suicide_living</ext-link>
                        </p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>Archived source code: 
                            <ext-link ext-link-type="uri" xlink:href="http://doi.org/10.5281/zenodo.3871366">http://doi.org/10.5281/zenodo.3871366</ext-link> (
                            <xref ref-type="bibr" rid="ref48">McGuinness &amp; Schmidt, 2020</xref>).</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>The adapted version of the source code for automated searching: 
                            <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.17605/OSF.IO/C7NSA">https://doi.org/10.17605/OSF.IO/C7NSA</ext-link>
                        </p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>Archived source code: 
                            <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.17605/OSF.IO/C7NSA">https://doi.org/10.17605/OSF.IO/C7NSA</ext-link> (
                            <xref ref-type="bibr" rid="ref43">Legate &amp; Nimon, 2024</xref>).</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>License: MIT</p>
                    </list-item>
                </list>
            </p>
        </sec>
        <ref-list>
            <title>References</title>
            <ref id="ref1">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Aliyu</surname>
                            <given-names>MB</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Iqbal</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>James</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <chapter-title>The canonical model of structure for data extraction in systematic reviews of scientific research articles.</chapter-title>
                    <source>

                        <italic toggle="yes">2018 Fifth International Conference on Social Networks Analysis, Management and Security (SNAMS).</italic>
</source>
                    <publisher-name>IEEE</publisher-name>;<year>2018, October</year>; pp.<fpage>264</fpage>&#x2013;<lpage>271</lpage>.
                    <pub-id pub-id-type="doi">10.1109/SNAMS.2018.8554896</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref2">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Angrosh</surname>
                            <given-names>MA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cranefield</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Stanger</surname>
                            <given-names>N</given-names>
                        </name>
</person-group>:
                    <article-title>Contextual information retrieval in research articles: Semantic publishing tools for the research community.</article-title>
                    <source>

                        <italic toggle="yes">Semantic Web.</italic>
</source>
                    <year>2014</year>;<volume>5</volume>(<issue>4</issue>):<fpage>261</fpage>&#x2013;<lpage>293</lpage>.
                    <pub-id pub-id-type="doi">10.3233/SW-130097</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref3">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Anisienia</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Mueller</surname>
                            <given-names>RM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kupfer</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Research method classification with deep transfer learning for semi-automatic meta-analysis of information systems papers.</article-title>
                    <source>

                        <italic toggle="yes">Proceedings of the 54th Hawaii International Conference on System Sciences.</italic>
</source>
                    <year>2021</year>; pp.<fpage>6099</fpage>&#x2013;<lpage>6108</lpage>.
                    <ext-link ext-link-type="uri" xlink:href="http://hdl.handle.net/10125/71357">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref4">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Antons</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gr&#x00fc;nwald</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cichy</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The application of text mining methods in innovation research: Current state, evolution patterns, and development priorities.</article-title>
                    <source>

                        <italic toggle="yes">R&amp;D Manag.</italic>
</source>
                    <year>2020</year>;<volume>50</volume>(<issue>3</issue>):<fpage>329</fpage>&#x2013;<lpage>351</lpage>.
                    <pub-id pub-id-type="doi">10.1111/radm.12408</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref5">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ali</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gravino</surname>
                            <given-names>C</given-names>
                        </name>
</person-group>:
                    <chapter-title>An ontology-based approach to semi-automate systematic literature reviews.</chapter-title>
                    <source>

                        <italic toggle="yes">2018 12th International Conference on Open Source Systems and Technologies (ICOSST).</italic>
</source>
                    <publisher-name>IEEE</publisher-name>;<year>2018, December</year>; pp.<fpage>09</fpage>&#x2013;<lpage>16</lpage>.
                    <pub-id pub-id-type="doi">10.1109/ICOSST.2018.8632205</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref6">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Arlot</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Celisse</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>A survey of cross-validation procedures for model selection.</article-title>
                    <source>

                        <italic toggle="yes">Stat. Surv.</italic>
</source>
                    <year>2010</year>;<volume>4</volume>:<fpage>40</fpage>&#x2013;<lpage>79</lpage>.
                    <pub-id pub-id-type="doi">10.1214/09-SS054</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref7">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Aumiller</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Almasian</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hausner</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>UniHD@CL-SciSumm 2020: Citation extraction as search.</article-title>
                    <source>

                        <italic toggle="yes">Proceedings of the First Workshop on Scholarly Document Processing.</italic>
</source>
                    <year>2020, November</year>; pp.<fpage>261</fpage>&#x2013;<lpage>269</lpage>.
                    <pub-id pub-id-type="doi">10.18653/v1/2020.sdp-1.29</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref8">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Appelbaum</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cooper</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kline</surname>
                            <given-names>RB</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Journal Article Reporting Standards for Quantitative Research in Psychology: The APA Publications and Communications Board Task Force report.</article-title>
                    <source>

                        <italic toggle="yes">Am. Psychol.</italic>
</source>
                    <year>2018</year>;<volume>73</volume>(<issue>1</issue>):<fpage>3</fpage>&#x2013;<lpage>25</lpage>.
                    <pub-id pub-id-type="pmid">29345484</pub-id>
                    <pub-id pub-id-type="doi">10.1037/amp0000191</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref9">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bayatmakou</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Mohebi</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ahmadi</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>An interactive query-based approach for summarizing scientific documents.</article-title>
                    <source>

                        <italic toggle="yes">Inf. Discov. Deliv.</italic>
</source>
                    <year>2022</year>;<volume>50</volume>(<issue>2</issue>):<fpage>176</fpage>&#x2013;<lpage>191</lpage>.
                    <pub-id pub-id-type="doi">10.1108/IDD-10-2020-0124</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref10">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Belur</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tompson</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Thornton</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Interrater reliability in systematic review methodology: Exploring variation in coder decision-making.</article-title>
                    <source>

                        <italic toggle="yes">Sociol. Methods Res.</italic>
</source>
                    <year>2018</year>;<volume>50</volume>(<issue>2</issue>):<fpage>837</fpage>&#x2013;<lpage>865</lpage>.
                    <pub-id pub-id-type="doi">10.1177/0049124118799372</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref11">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Beltagy</surname>
                            <given-names>I</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Peters</surname>
                            <given-names>ME</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cohan</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>Longformer: The long-document transformer.</article-title>
                    <source>

                        <italic toggle="yes">arXiv preprint arXiv:2004.05150.</italic>
</source>
                    <year>2020</year>.
                    <pub-id pub-id-type="doi">10.48550/arXiv.2004.05150</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref12">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bosco</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Uggerslev</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Steel</surname>
                            <given-names>P</given-names>
                        </name>
</person-group>:
                    <article-title>MetaBUS as a vehicle for facilitating meta-analysis.</article-title>
                    <source>

                        <italic toggle="yes">Hum. Resour. Manag. Rev.</italic>
</source>
                    <year>2017</year>;<volume>27</volume>(<issue>1</issue>):<fpage>237</fpage>&#x2013;<lpage>254</lpage>.
                    <pub-id pub-id-type="doi">10.1016/j.hrmr.2016.09.013</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref13">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bozada</surname>
                            <given-names>T</given-names>
                            <suffix>Jr</suffix>
                        </name>

                        <name name-style="western">
                            <surname>Borden</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Workman</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Sysrev: A FAIR platform for data curation and systematic evidence review.</article-title>
                    <source>

                        <italic toggle="yes">Front. Artif. Intell.</italic>
</source>
                    <year>2021</year>;<volume>4</volume>:<fpage>1</fpage>&#x2013;<lpage>18</lpage>. Article 685298.
                    <pub-id pub-id-type="pmid">34423285</pub-id>
                    <pub-id pub-id-type="doi">10.3389/frai.2021.685298</pub-id>
                    <pub-id pub-id-type="pmcid">PMC8374944</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref14">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Cairo</surname>
                            <given-names>LS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Figueiredo Carneiro</surname>
                            <given-names>G</given-names>
                            <prefix>de</prefix>
                        </name>

                        <name name-style="western">
                            <surname>Silva</surname>
                            <given-names>BC</given-names>
                            <prefix>da</prefix>
                        </name>
</person-group>:
                    <article-title>Adoption of machine learning techniques to perform secondary studies: A systematic mapping study for the computer science field.</article-title>
                    <source>

                        <italic toggle="yes">ICEIS.</italic>
</source>
                    <year>2019</year>;<volume>2</volume>:<fpage>351</fpage>&#x2013;<lpage>356</lpage>.
                    <pub-id pub-id-type="doi">10.5220/0007780603510356</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref15">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Carri&#x00f3;n-Toro</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Aguilar</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sant&#x00f3;rum</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>iKeyCriteria: A qualitative and quantitative analysis method to infer key criteria since a systematic literature review for the computing domain.</article-title>
                    <source>

                        <italic toggle="yes">Data.</italic>
</source>
                    <year>2022</year>;<volume>7</volume>(<issue>6</issue>):<fpage>70</fpage>.
                    <pub-id pub-id-type="doi">10.3390/data7060070</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref16">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chen</surname>
                            <given-names>PHA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Leibrand</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Vasko</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Ontology-based and user-focused automatic text summarization (OATS): Using COVID-19 risk factors as an example.</article-title>
                    <source>

                        <italic toggle="yes">arXiv preprint arXiv:2012.02028.</italic>
</source>
                    <year>2020</year>.
                    <pub-id pub-id-type="doi">10.48550/arXiv.2012.02028</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref17">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chen</surname>
                            <given-names>VZ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Montano-Campos</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zadrozny</surname>
                            <given-names>W</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Machine reading of hypotheses for organizational research reviews and pre-trained models via R Shiny app for non-programmers.</article-title>
                    <year>2021</year>.
                    <pub-id pub-id-type="doi">10.48550/arXiv.2106.16102</pub-id>
                    <ext-link ext-link-type="uri" xlink:href="http://arXiv.org">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref82">
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Cohen</surname>
                            <given-names>E</given-names>
                        </name>
</person-group>:
                    <chapter-title>The boundary lens: theorising academic activity.</chapter-title> In
                    <source>

                        <italic toggle="yes">The university and its boundaries: Thriving or surviving in the 21st Century.</italic>
</source>
                    <edition>1st ed.</edition>
                    <publisher-name>Routledge</publisher-name>;<year>2021</year>; pp.<fpage>14</fpage>&#x2013;<lpage>41</lpage>.
                    <pub-id pub-id-type="doi">10.4324/9781003102953</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref18">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Davis</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Mengersen</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bennett</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Viewing systematic reviews and meta-analysis in social research through different lenses.</article-title>
                    <source>

                        <italic toggle="yes">Springerplus.</italic>
</source>
                    <year>2014</year>;<volume>3</volume>(<issue>1</issue>):<fpage>1</fpage>&#x2013;<lpage>9</lpage>.
                    <pub-id pub-id-type="pmid">25279303</pub-id>
                    <pub-id pub-id-type="doi">10.1186/2193-1801-3-511</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4167883</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref19">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Denzler</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Enders</surname>
                            <given-names>MR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Akello</surname>
                            <given-names>P</given-names>
                        </name>
</person-group>:
                    <article-title>Towards a semi-automated approach for systematic literature reviews.</article-title>
                    <source>

                        <italic toggle="yes">Twenty-Seventh Americas Conference on Information Systems (AMCIS).</italic>
</source>
                    <year>2021</year>; Vol.<volume>4</volume>: pp.<fpage>1</fpage>&#x2013;<lpage>10</lpage>.
                    <ext-link ext-link-type="uri" xlink:href="https://aisel.aisnet.org/amcis2021/art_intel_sem_tech_intelligent_systems/art_intel_sem_tech_intelligent_systems/4">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref20">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Diaz-Elsayed</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>Q</given-names>
                        </name>
</person-group>:
                    <article-title>Extracting the characteristics of Life Cycle Assessments via data mining.</article-title>
                    <source>

                        <italic toggle="yes">MethodsX.</italic>
</source>
                    <year>2020</year>;<volume>7</volume>(<issue>101004</issue>):<fpage>1</fpage>&#x2013;<lpage>6</lpage>.
                    <pub-id pub-id-type="doi">10.1016/j.mex.2020.101004</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref21">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Dridi</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gaber</surname>
                            <given-names>MM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Azad</surname>
                            <given-names>RMA</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Scholarly data mining: A systematic review of its applications.</article-title>
                    <source>

                        <italic toggle="yes">Wiley Interdiscip. Rev.: Data Min. Knowl. Discov.</italic>
</source>
                    <year>2021</year>;<volume>11</volume>(<issue>2</issue>):<fpage>1</fpage>&#x2013;<lpage>23</lpage>.
                    <pub-id pub-id-type="doi">10.1002/widm.1395</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref22">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Elliott</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Turner</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Clavisi</surname>
                            <given-names>O</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Living systematic reviews: An emerging opportunity to narrow the evidence-practice gap.</article-title>
                    <source>

                        <italic toggle="yes">PLoS Med.</italic>
</source>
                    <year>2014</year>;<volume>11</volume>(<issue>2</issue>):<fpage>e1001603</fpage>.
                    <pub-id pub-id-type="pmid">24558353</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pmed.1001603</pub-id>
                    <pub-id pub-id-type="pmcid">PMC3928029</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref23">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Elliott</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Synnot</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Turner</surname>
                            <given-names>T</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Living systematic review: 1. Introduction&#x2014;the why, what, when, and how.</article-title>
                    <source>

                        <italic toggle="yes">J. Clin. Epidemiol.</italic>
</source>
                    <year>2017</year>;<volume>91</volume>:<fpage>23</fpage>&#x2013;<lpage>30</lpage>.
                    <pub-id pub-id-type="pmid">28912002</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.jclinepi.2017.08.010</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref24">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Eriksen</surname>
                            <given-names>MB</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Frandsen</surname>
                            <given-names>TF</given-names>
                        </name>
</person-group>:
                    <article-title>The impact of patient, intervention, comparison, outcome (PICO) as a search strategy tool on literature search quality: a systematic review.</article-title>
                    <source>

                        <italic toggle="yes">J. Med. Libr. Assoc.</italic>
</source>
                    <year>2018</year>;<volume>106</volume>(<issue>4</issue>):<fpage>420</fpage>&#x2013;<lpage>431</lpage>.
                    <pub-id pub-id-type="pmid">30271283</pub-id>
                    <pub-id pub-id-type="doi">10.5195/jmla.2018.345</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6148624</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref25">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Feng</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chiam</surname>
                            <given-names>YK</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lo</surname>
                            <given-names>SK</given-names>
                        </name>
</person-group>:
                    <article-title>Text-mining techniques and tools for systematic literature reviews: A systematic literature review.</article-title>
                    <source>

                        <italic toggle="yes">2017 24th Asia-Pacific Software Engineering Conference (APSEC).</italic>
</source>
                    <year>2017, December</year>; pp.<fpage>41</fpage>&#x2013;<lpage>50</lpage>.
                    <pub-id pub-id-type="doi">10.1109/APSEC.2017.10</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref26">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Githens</surname>
                            <given-names>RP</given-names>
                        </name>
</person-group>:
                    <article-title>Critical action research in human resource development.</article-title>
                    <source>

                        <italic toggle="yes">Hum. Resour. Dev. Rev.</italic>
</source>
                    <year>2015</year>;<volume>14</volume>(<issue>2</issue>):<fpage>185</fpage>&#x2013;<lpage>204</lpage>.
                    <pub-id pub-id-type="doi">10.1177/1534484315581934</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref27">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Goldfarb-Tarrant</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Robertson</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lazic</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Scaling systematic literature reviews with machine learning pipelines.</article-title>
                    <source>

                        <italic toggle="yes">arXiv preprint arXiv:2010.04665.</italic>
</source>
                    <year>2020</year>.
                    <pub-id pub-id-type="doi">10.48550/arXiv.2010.04665</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref28">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>G&#x00f6;pfert</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kuckertz</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Weinand</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Measurement extraction with natural language processing: A review.</article-title>
                    <source>

                        <italic toggle="yes">Findings of the Association for Computational Linguistics: EMNLP 2022.</italic>
</source>
                    <year>2022</year>;<fpage>2191</fpage>&#x2013;<lpage>2215</lpage>.
                    <pub-id pub-id-type="doi">10.18653/v1/2022.findings-emnlp.161</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref29">
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Goswami</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pal</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Goldsworthy</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <chapter-title>An effective machine learning framework for data elements extraction from the literature of anxiety outcome measures to build systematic review.</chapter-title>
                    <person-group person-group-type="editor">

                        <name name-style="western">
                            <surname>Abramowicz</surname>
                            <given-names>W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Corchuelo</surname>
                            <given-names>R</given-names>
                        </name>
</person-group>, editors.
                    <source>

                        <italic toggle="yes">Business Information Systems. BIS 2019. Lecture Notes in Business Information Processing.</italic>
</source>Vol.<volume>353</volume>.
                    <publisher-loc>Cham</publisher-loc>:
                    <publisher-name>Springer</publisher-name>;<year>2019</year>; pp.<fpage>265</fpage>&#x2013;<lpage>277</lpage>.
                    <pub-id pub-id-type="doi">10.1007/978-3-030-20485-3_19</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref30">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Gough</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Davies</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Jamtvedt</surname>
                            <given-names>G</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Evidence Synthesis International (ESI): Position statement.</article-title>
                    <source>

                        <italic toggle="yes">Syst. Rev.</italic>
</source>
                    <year>2020</year>;<volume>9</volume>(<issue>1</issue>):<fpage>155</fpage>.
                    <pub-id pub-id-type="pmid">32650823</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13643-020-01415-5</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7353688</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref31">
                <mixed-citation publication-type="book">
                    <person-group person-group-type="editor">

                        <name name-style="western">
                            <surname>Higgins</surname>
                            <given-names>JPT</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Thomas</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chandler</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <source>

                        <italic toggle="yes">Cochrane Handbook for Systematic Reviews of Interventions version 6.3 (updated February 2022).</italic>
</source>
                    <publisher-name>Cochrane</publisher-name>;<year>2022</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.training.cochrane.org/handbook">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref32">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Holub</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hardy</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kallmes</surname>
                            <given-names>K</given-names>
                        </name>
</person-group>:
                    <article-title>Toward automated data extraction according to tabular data structure: Cross-sectional pilot survey of the comparative clinical literature.</article-title>
                    <source>

                        <italic toggle="yes">JMIR Form. Res.</italic>
</source>
                    <year>2021</year>;<volume>5</volume>(<issue>11</issue>):<fpage>e33124</fpage>.
                    <pub-id pub-id-type="pmid">34821562</pub-id>
                    <pub-id pub-id-type="doi">10.2196/33124</pub-id>
                    <pub-id pub-id-type="pmcid">PMC8663462</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref33">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ip</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hadar</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Keefe</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A web-based archive of systematic review data.</article-title>
                    <source>

                        <italic toggle="yes">Syst. Rev.</italic>
</source>
                    <year>2012</year>;<volume>1</volume>(<issue>1</issue>):<fpage>15</fpage>.
                    <pub-id pub-id-type="pmid">22588052</pub-id>
                    <pub-id pub-id-type="doi">10.1186/2046-4053-1-15</pub-id>
                    <pub-id pub-id-type="pmcid">PMC3351737</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref34">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Iwatsuki</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sagara</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hara</surname>
                            <given-names>T</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Detecting in-line mathematical expressions in scientific documents.</article-title>
                    <source>

                        <italic toggle="yes">Proceedings of the 2017 ACM Symposium on Document Engineering.</italic>
</source>
                    <year>2017, August</year>; pp.<fpage>141</fpage>&#x2013;<lpage>144</lpage>.
                    <pub-id pub-id-type="doi">10.1145/3103010.3121041</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref35">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jonnalagadda</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Goyal</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Huffman</surname>
                            <given-names>M</given-names>
                        </name>
</person-group>:
                    <article-title>Automating data extraction in systematic reviews: A systematic review.</article-title>
                    <source>

                        <italic toggle="yes">Syst. Rev.</italic>
</source>
                    <year>2015</year>;<volume>4</volume>(<issue>1</issue>):<fpage>78</fpage>.
                    <pub-id pub-id-type="pmid">26073888</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13643-015-0066-7</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4514954</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref36">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jurafsky</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Martin</surname>
                            <given-names>JH</given-names>
                        </name>
</person-group>:
                    <article-title>Speech and language processing [Feb 2024 release].</article-title>
                    <year>2024</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://web.stanford.edu/~jurafsky/slp3/">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref37">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Khamis</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kahale</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pardo-Hernandez</surname>
                            <given-names>H</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Methods of conduct and reporting of living systematic reviews: A protocol for a living methodological survey [version 2; peer review: 2 approved].</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2019</year>;<volume>8</volume>:<fpage>221</fpage>.
                    <pub-id pub-id-type="doi">10.12688/f1000research.18005.2</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref38">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kohl</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>McIntosh</surname>
                            <given-names>EJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Unger</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Online tools supporting the conduct and reporting of systematic reviews and systematic maps: A case study on CADIMA and review of existing tools.</article-title>
                    <source>

                        <italic toggle="yes">Environ. Evid.</italic>
</source>
                    <year>2018</year>;<volume>7</volume>(<issue>1</issue>):<fpage>1</fpage>&#x2013;<lpage>17</lpage>.
                    <pub-id pub-id-type="doi">10.1186/s13750-018-0115-5</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref39">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kowsari</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Meimandi</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Heidarysafa</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Text classification algorithms: A survey.</article-title>
                    <source>

                        <italic toggle="yes">arXiv, abs/1904.08067.</italic>
</source>
                    <year>2019</year>.
                    <pub-id pub-id-type="doi">10.3390/info10040150</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref40">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kwiatkowski</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Palomaki</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Redfield</surname>
                            <given-names>O</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Natural questions: A benchmark for question answering research.</article-title>
                    <source>

                        <italic toggle="yes">Trans. Assoc. Comput. Linguist.</italic>
</source>
                    <year>2019</year>;<volume>7</volume>:<fpage>453</fpage>&#x2013;<lpage>466</lpage>.
                    <pub-id pub-id-type="doi">10.1162/tacl_a_00276</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref41">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Legate</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Nimon</surname>
                            <given-names>K</given-names>
                        </name>
</person-group>:
                    <article-title>Updated supplemental files: (Semi)automated approaches to data extraction for systematic reviews and meta-analyses in social sciences: A living review protocol.</article-title>
                    <year>2023a, January 12</year>.
                    <pub-id pub-id-type="doi">10.17605/OSF.IO/EWFKP</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref42">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Legate</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Nimon</surname>
                            <given-names>K</given-names>
                        </name>
</person-group>:
                    <article-title>(Semi)automated approaches to data extraction for systematic reviews and meta-analyses in social sciences: A living review protocol [version 2; peer review: 2 approved, 1 approved with reservations].</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2023b</year>;<volume>11</volume>:<fpage>1036</fpage>.
                    <pub-id pub-id-type="doi">10.12688/f1000research.125198.2</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref43">
                <mixed-citation publication-type="data">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Legate</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Nimon</surname>
                            <given-names>K</given-names>
                        </name>
</person-group>:
                    <data-title>(Semi) Automated Approaches to Data Extraction for Systematic Reviews and Meta-Analyses in Social Sciences: A Living Review.</data-title>[Data set].
                    <source>

                        <italic toggle="yes">OSF.</italic>
</source>
                    <year>2024, May 5</year>.
                    <pub-id pub-id-type="doi">10.17605/OSF.IO/C7NSA</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref44">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>X</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Mandal</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ouyang</surname>
                            <given-names>J</given-names>
                        </name>
</person-group>:
                    <article-title>CORWA: A citation-oriented related work annotation dataset.</article-title>
                    <source>

                        <italic toggle="yes">arXiv preprint arXiv:2205.03512.</italic>
</source>
                    <year>2022</year>.
                    <pub-id pub-id-type="doi">10.48550/arXiv.2205.03512</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref45">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Liu</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bai</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Mitra</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <chapter-title>Searching for tables in digital documents.</chapter-title>
                    <source>

                        <italic toggle="yes">Ninth International Conference on Document Analysis and Recognition (ICDAR 2007).</italic>
</source>
                    <publisher-name>IEEE</publisher-name>;<year>2007, September</year>; Vol.<volume>2</volume>: pp.<fpage>934</fpage>&#x2013;<lpage>938</lpage>.
                    <pub-id pub-id-type="doi">10.1109/ICDAR.2007.4377052</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref46">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="editor">

                        <name name-style="western">
                            <surname>Marshall</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sutton</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>O&#x2019;Keefe</surname>
                            <given-names>H</given-names>
                        </name>

                        <etal/>
</person-group>, editors.
                    <article-title>The Systematic Review Toolbox.</article-title>
                    <year>2022</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.systematicreviewtools.com/">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref47">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Marshall</surname>
                            <given-names>I</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wallace</surname>
                            <given-names>B</given-names>
                        </name>
</person-group>:
                    <article-title>Toward systematic review automation: A practical guide to using machine learning tools in research synthesis.</article-title>
                    <source>

                        <italic toggle="yes">Syst. Rev.</italic>
</source>
                    <year>2019</year>;<volume>8</volume>(<issue>1</issue>):<fpage>163</fpage>.
                    <pub-id pub-id-type="pmid">31296265</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13643-019-1074-9</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6621996</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref48">
                <mixed-citation publication-type="data">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>McGuinness</surname>
                            <given-names>LA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Schmidt</surname>
                            <given-names>L</given-names>
                        </name>
</person-group>:
                    <data-title>mcguinlu/COVID_suicide_living: Initial Release (v1.0.0).</data-title>[Data set].
                    <source>

                        <italic toggle="yes">Zenodo.</italic>
</source>
                    <year>2020</year>.
                    <pub-id pub-id-type="doi">10.5281/zenodo.3871366</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref49">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Nayak</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Balasubramaniam</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kutty</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <chapter-title>A Semi-automatic Data Extraction System for Heterogeneous Data Sources: A Case Study from Cotton Industry.</chapter-title>
                    <source>

                        <italic toggle="yes">Data Mining: 19th Australasian Conference on Data Mining, AusDM 2021.</italic>
</source>
                    <publisher-loc>Singapore</publisher-loc>:
                    <publisher-name>Springer</publisher-name>;<year>2021, December</year>; pp.<fpage>209</fpage>&#x2013;<lpage>222</lpage>.
                    <pub-id pub-id-type="doi">10.1007/978-981-16-8531-6_15</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref50">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Neppalli</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Caragea</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Mayes</surname>
                            <given-names>R</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>MetaSeer.STEM: Towards automating meta-analyses.</article-title>
                    <source>

                        <italic toggle="yes">Proc. AAAI Conf. Artif. Intell.</italic>
</source>
                    <year>2016, February</year>;<volume>30</volume>(<issue>2</issue>):<fpage>4035</fpage>&#x2013;<lpage>4040</lpage>.
                    <pub-id pub-id-type="doi">10.1609/aaai.v30i2.19081</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref51">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Nowak</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kunstman</surname>
                            <given-names>P</given-names>
                        </name>
</person-group>:
                    <article-title>Team EP at TAC 2018: Automating data extraction in systematic reviews of environmental agents.</article-title>
                    <source>

                        <italic toggle="yes">arXiv preprint arXiv:1901.02081.</italic>
</source>
                    <year>2019</year>.
                    <pub-id pub-id-type="doi">10.48550/arXiv.1901.02081</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref52">
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ochoa-Hern&#x00e1;ndez</surname>
                            <given-names>JL</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Barcelo-Valenzuela</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sanchez-Schmitz</surname>
                            <given-names>G</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <chapter-title>Concept identification from single-documents.</chapter-title>
                    <person-group person-group-type="editor">

                        <name name-style="western">
                            <surname>Valencia-Garc&#x00ed;a</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Alcaraz-M&#x00e1;rmol</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cioppo-Morstadt</surname>
                            <given-names>J</given-names>
                            <prefix>Del</prefix>
                        </name>

                        <etal/>
</person-group>, editors.
                    <source>

                        <italic toggle="yes">Technologies and Innovation. CITI 2018. Communications in Computer and Information Science.</italic>
</source>Vol.<volume>883</volume>.
                    <publisher-loc>Cham</publisher-loc>:
                    <publisher-name>Springer</publisher-name>;<year>2018</year>; pp.<fpage>141</fpage>&#x2013;<lpage>152</lpage>.
                    <pub-id pub-id-type="doi">10.1007/978-3-030-00940-3_12</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref53">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>O&#x2019;Connor</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tsafnat</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Thomas</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A question of trust: Can we build an evidence base to gain trust in systematic review automation technologies?</article-title>
                    <source>

                        <italic toggle="yes">Syst. Rev.</italic>
</source>
                    <year>2019</year>;<volume>8</volume>(<issue>1</issue>):<fpage>143</fpage>.
                    <pub-id pub-id-type="pmid">31215463</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13643-019-1062-0</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6582554</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref54">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>O&#x2019;Mara-Eves</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Thomas</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>McNaught</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Using text mining for study identification in systematic reviews: A systematic review of current approaches.</article-title>
                    <source>

                        <italic toggle="yes">Syst. Rev.</italic>
</source>
                    <year>2015</year>;<volume>4</volume>(<issue>1</issue>):<fpage>5</fpage>.
                    <pub-id pub-id-type="pmid">25588314</pub-id>
                    <pub-id pub-id-type="doi">10.1186/2046-4053-4-5</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4320539</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref55">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ouzzani</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hammady</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Fedorowicz</surname>
                            <given-names>Z</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Rayyan-a web and mobile app for systematic reviews.</article-title>
                    <source>

                        <italic toggle="yes">Syst. Rev.</italic>
</source>
                    <year>2016</year>;<volume>5</volume>(<issue>1</issue>):<fpage>210</fpage>.
                    <pub-id pub-id-type="pmid">27919275</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13643-016-0384-4</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5139140</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref56">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Page</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>McKenzie</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bossuyt</surname>
                            <given-names>P</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The PRISMA 2020 statement: An updated guideline for reporting systematic reviews.</article-title>
                    <source>

                        <italic toggle="yes">Int. J. Surg.</italic>
</source>
                    <year>2021</year>;<volume>88</volume>:<fpage>105906</fpage>.
                    <pub-id pub-id-type="pmid">33789826</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.ijsu.2021.105906</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref57">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Park</surname>
                            <given-names>JJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kim</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Han</surname>
                            <given-names>H</given-names>
                        </name>
</person-group>:
                    <article-title>Research method trends in the field of human resource development [Refereed Extended Abstract].</article-title>
                    <source>

                        <italic toggle="yes">2021 AHRD Virtual Conference.</italic>
</source>
                    <year>2021, February 17-19</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://www.ahrd.org/page/2021-annual-conference">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref58">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Pertsas</surname>
                            <given-names>V</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Constantopoulos</surname>
                            <given-names>P</given-names>
                        </name>
</person-group>:
                    <chapter-title>Ontology-driven information extraction from research publications.</chapter-title>
                    <person-group person-group-type="editor">

                        <name name-style="western">
                            <surname>Aalberg</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Papatheodorou</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Dobreva</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>, editors.
                    <source>

                        <italic toggle="yes">Digital Libraries for Open Knowledge: 22nd International Conference on Theory and Practice of Digital Libraries (TPDL 2018).</italic>
</source>
                    <publisher-name>Springer International Publishing</publisher-name>;<year>2018</year>; pp.<fpage>241</fpage>&#x2013;<lpage>253</lpage>.
                    <pub-id pub-id-type="doi">10.1007/978-3-030-00066-0_21</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref59">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Pigott</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Polanin</surname>
                            <given-names>J</given-names>
                        </name>
</person-group>:
                    <article-title>Methodological guidance paper: High-quality meta-analysis in a systematic review.</article-title>
                    <source>

                        <italic toggle="yes">Rev. Educ. Res.</italic>
</source>
                    <year>2020</year>;<volume>90</volume>(<issue>1</issue>):<fpage>24</fpage>&#x2013;<lpage>46</lpage>.
                    <pub-id pub-id-type="doi">10.3102/0034654319877153</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref60">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Piroi</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lipani</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lupu</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <chapter-title>DASyR (IR)-document analysis system for systematic reviews (in Information Retrieval).</chapter-title>
                    <source>

                        <italic toggle="yes">2015 13th International Conference on Document Analysis and Recognition (ICDAR).</italic>
</source>
                    <publisher-name>IEEE</publisher-name>;<year>2015, August</year>; pp.<fpage>591</fpage>&#x2013;<lpage>595</lpage>.
                    <pub-id pub-id-type="doi">10.1109/ICDAR.2015.7333830</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref61">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Porter</surname>
                            <given-names>MF</given-names>
                        </name>
</person-group>:
                    <article-title>An algorithm for suffix stripping.</article-title>
                    <source>

                        <italic toggle="yes">Program: Electronic Library and Information Systems.</italic>
</source>
                    <year>1980</year>;<volume>14</volume>(<issue>3</issue>):<fpage>130</fpage>&#x2013;<lpage>137</lpage>.
                    <pub-id pub-id-type="doi">10.1108/eb046814</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref62">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Roldan-Baluis</surname>
                            <given-names>WL</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zapata</surname>
                            <given-names>NA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Vasquez</surname>
                            <given-names>MSM</given-names>
                        </name>
</person-group>:
                    <article-title>The effect of natural language processing on the analysis of unstructured text: A systematic review.</article-title>
                    <source>

                        <italic toggle="yes">Int. J. Adv. Comput. Sci. Appl.</italic>
</source>
                    <year>2022</year>;<volume>13</volume>(<issue>5</issue>):<fpage>43</fpage>&#x2013;<lpage>51</lpage>.
                    <pub-id pub-id-type="doi">10.14569/IJACSA.2022.0130507</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref63">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Schmidt</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Olorisade</surname>
                            <given-names>BK</given-names>
                        </name>

                        <name name-style="western">
                            <surname>McGuinness</surname>
                            <given-names>LA</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Data extraction methods for systematic review (semi)automation: A living systematic review [version 1; peer review: 3 approved].</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2021</year>;<volume>10</volume>:<fpage>401</fpage>.
                    <pub-id pub-id-type="doi">10.12688/f1000research.51117.1</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref64">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Schmidt</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Olorisade</surname>
                            <given-names>BK</given-names>
                        </name>

                        <name name-style="western">
                            <surname>McGuinness</surname>
                            <given-names>LA</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Data extraction methods for systematic review (semi)automation: A living review protocol [version 2; peer review: 2 approved].</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2020</year>;<volume>9</volume>:<fpage>210</fpage>.
                    <pub-id pub-id-type="pmid">32724560</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.22781.2</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7338918</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref65">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Schmidt</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Finnerty Mutlu</surname>
                            <given-names>AN</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Elmore</surname>
                            <given-names>R</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Data extraction methods for systematic review (semi)automation: Update of a living systematic review [version 2; peer review: 3 approved].</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2023</year>;<volume>10</volume>:<fpage>401</fpage>.
                    <pub-id pub-id-type="pmid">34408850</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.51117.2</pub-id>
                    <pub-id pub-id-type="pmcid">PMC8361807</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref66">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Shahid</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Afzal</surname>
                            <given-names>MT</given-names>
                        </name>
</person-group>:
                    <article-title>Section-wise indexing and retrieval of research articles.</article-title>
                    <source>

                        <italic toggle="yes">Clust. Comput.</italic>
</source>
                    <year>2018</year>;<volume>21</volume>(<issue>1</issue>):<fpage>481</fpage>&#x2013;<lpage>492</lpage>.
                    <pub-id pub-id-type="doi">10.1007/s10586-017-0914-4</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref67">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Shen</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Jiang</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>H</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A model for the identification of the functional structures of unstructured abstracts in the social sciences.</article-title>
                    <source>

                        <italic toggle="yes">Electron. Libr.</italic>
</source>
                    <year>2022</year>;<volume>40</volume>(<issue>6</issue>):<fpage>680</fpage>&#x2013;<lpage>697</lpage>.
                    <pub-id pub-id-type="doi">10.1108/EL-10-2021-0190</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref68">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Shirmohammadi</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Mehdiabadi</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Beigi</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Mapping human resource development: Visualizing the past, bridging the gaps, and moving toward the future.</article-title>
                    <source>

                        <italic toggle="yes">Hum. Resour. Dev. Q.</italic>
</source>
                    <year>2021</year>;<volume>32</volume>(<issue>2</issue>):<fpage>197</fpage>&#x2013;<lpage>224</lpage>.
                    <pub-id pub-id-type="doi">10.1002/hrdq.21415</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref69">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Sundaram</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Berleant</surname>
                            <given-names>D</given-names>
                        </name>
</person-group>:
                    <chapter-title>Automating systematic literature reviews with natural language processing and text mining: A systematic literature review.</chapter-title>
                    <source>

                        <italic toggle="yes">Eighth International Congress on Information and Communication Technology (ICICT).</italic>
</source>
                    <publisher-loc>Singapore</publisher-loc>:
                    <publisher-name>Springer</publisher-name>;<year>2023</year>; pp.<fpage>73</fpage>&#x2013;<lpage>92</lpage>.
                    <pub-id pub-id-type="doi">10.1007/978-981-99-3243-6_7</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref70">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Short</surname>
                            <given-names>JC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>McKenny</surname>
                            <given-names>AF</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Reid</surname>
                            <given-names>SW</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>More than words? Computer-aided text analysis in organizational behavior and psychology research.</article-title>
                    <source>

                        <italic toggle="yes">Annu. Rev. Organ. Psych. Organ. Behav.</italic>
</source>
                    <year>2018</year>;<volume>5</volume>(<issue>1</issue>):<fpage>415</fpage>&#x2013;<lpage>435</lpage>.
                    <pub-id pub-id-type="doi">10.1146/annurev-orgpsych-032117-104622</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref71">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Torres</surname>
                            <given-names>JAS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cruzes</surname>
                            <given-names>DS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Nascimento Salvador</surname>
                            <given-names>L</given-names>
                            <prefix>do</prefix>
                        </name>
</person-group>:
                    <chapter-title>Automatic results identification in software engineering papers. Is it possible?</chapter-title>
                    <source>

                        <italic toggle="yes">2012 12th International Conference on Computational Science and Its Applications.</italic>
</source>
                    <publisher-name>IEEE</publisher-name>;<year>2012, June</year>; pp.<fpage>108</fpage>&#x2013;<lpage>112</lpage>.
                    <pub-id pub-id-type="doi">10.1109/ICCSA.2012.27</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref72">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tsafnat</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Glasziou</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Choong</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Systematic review automation technologies.</article-title>
                    <source>

                        <italic toggle="yes">Syst. Rev.</italic>
</source>
                    <year>2014</year>;<volume>3</volume>(<issue>1</issue>):<fpage>74</fpage>.
                    <pub-id pub-id-type="pmid">25005128</pub-id>
                    <pub-id pub-id-type="doi">10.1186/2046-4053-3-74</pub-id>
                    <pub-id pub-id-type="pmcid">PMC4100748</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref73">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wagner</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lukyanenko</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Par&#x00e9;</surname>
                            <given-names>G</given-names>
                        </name>
</person-group>:
                    <article-title>Artificial intelligence and the conduct of literature reviews.</article-title>
                    <source>

                        <italic toggle="yes">J. Inf. Technol.</italic>
</source>
                    <year>2022</year>;<volume>37</volume>(<issue>2</issue>):<fpage>209</fpage>&#x2013;<lpage>226</lpage>.
                    <pub-id pub-id-type="doi">10.1177/02683962211048201</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref74">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wohlin</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kalinowski</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Felizardo</surname>
                            <given-names>KR</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Successful combination of database search and snowballing for identification of primary studies in systematic literature studies.</article-title>
                    <source>

                        <italic toggle="yes">Inf. Softw. Technol.</italic>
</source>
                    <year>2022</year>;<volume>147</volume>:<fpage>106908</fpage>.
                    <pub-id pub-id-type="doi">10.1016/j.infsof.2022.106908</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref76">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Yang</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kinshuk</surname>
                        </name>

                        <name name-style="western">
                            <surname>An</surname>
                            <given-names>Y</given-names>
                        </name>
</person-group>:
                    <article-title>A survey of the literature: How scholars use text mining in educational studies?</article-title>
                    <source>

                        <italic toggle="yes">Educ. Inf. Technol.</italic>
</source>
                    <year>2023</year>;<volume>28</volume>(<issue>2</issue>):<fpage>2071</fpage>&#x2013;<lpage>2090</lpage>.
                    <pub-id pub-id-type="doi">10.1007/s10639-022-11193-3</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref77">
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Young</surname>
                            <given-names>S</given-names>
                        </name>
</person-group>:
                    <source>

                        <italic toggle="yes">Abstract and Index and Web Discovery Services IEEE Partners.</italic>
</source>
                    <publisher-name>IEEE Xplore</publisher-name>;<year>2023, January</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://ieeexplore.ieee.org/Xplorehelp/downloads/discovery-services/ieee_indexing_agreements.pd">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref78">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Young</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hazarika</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Poria</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Recent trends in deep learning based natural language processing.</article-title>
                    <year>2018</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1708.02709">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref79">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zhao</surname>
                            <given-names>X</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Feng</surname>
                            <given-names>GC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ao</surname>
                            <given-names>SH</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Interrater reliability estimators tested against true interrater reliabilities.</article-title>
                    <source>

                        <italic toggle="yes">BMC Med. Res. Methodol.</italic>
</source>
                    <year>2022</year>;<volume>22</volume>:<fpage>232</fpage>.
                    <pub-id pub-id-type="pmid">36038846</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s12874-022-01707-5</pub-id>
                    <pub-id pub-id-type="pmcid">PMC9426226</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref80">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zhitomirsky-Geffet</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bergman</surname>
                            <given-names>O</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hilel</surname>
                            <given-names>S</given-names>
                        </name>
</person-group>:
                    <article-title>Towards a wider perspective in the social sciences using a network of variables based on thousands of results.</article-title>
                    <source>

                        <italic toggle="yes">Scientometrics.</italic>
</source>
                    <year>2020</year>;<volume>123</volume>:<fpage>1385</fpage>&#x2013;<lpage>1406</lpage>.
                    <pub-id pub-id-type="doi">10.1007/s11192-020-03446-0</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref81">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zielinski</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Mutschke</surname>
                            <given-names>P</given-names>
                        </name>
</person-group>:
                    <article-title>Mining social science publications for survey variables.</article-title>
                    <source>

                        <italic toggle="yes">Proceedings of the Second Workshop on NLP and Computational Social Science.</italic>
</source>
                    <year>2017, August</year>; pp.<fpage>47</fpage>&#x2013;<lpage>52</lpage>.
                    <pub-id pub-id-type="doi">10.18653/v1/W17-2907</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report349101">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.172337.r349101</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Polanin</surname>
                        <given-names>Joshua</given-names>
                    </name>
                    <xref ref-type="aff" rid="r349101a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-5100-0164</uri>
                </contrib>
                <aff id="r349101a1">
                    <label>1</label>American Institutes for Research, Arlington, Virginia, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>3</day>
                <month>1</month>
                <year>2025</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2025 Polanin J</copyright-statement>
                <copyright-year>2025</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport349101" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.151493.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Thank you for the opportunity to submit this peer-review. I found the manuscript engaging and timely. I have several issues the authors can address ahead of final submission.&#x00a0;</p>
            <p> </p>
            <p> 1. I truly appreciated the lit review of related efforts. However, the organization of the section made it difficult for me to follow what had been done and what was ongoing, and importantly, where the other work potentially overlaps with this manuscript's work. I'm not sure exactly how the authors should restructure this section, but I would strongly urge the authors to consider it.&#x00a0;</p>
            <p> </p>
            <p> 2. Please make more clear that this manuscript is only looking at evaluations of tools/models/applications, and not a roundup of all the available AI tools. I agree that the manuscript's framing is useful, but it took me a while to understand that the authors were only interested in that aspect.&#x00a0;</p>
            <p> </p>
            <p> 3. I thought the methods section is pretty good and clear.&#x00a0;</p>
            <p> </p>
            <p> 4. The results are useful and well organized. Some of the figures are difficult to read and could use something beyond the base ggplot design. Shading or color or plots go a long way.&#x00a0;</p>
            <p> </p>
            <p> 5. I think the authors should re-examine their Conclusions section and really try and outline main findings in a really clear way. Right now it's tough to tell what they are. Relatedly, I think there are more limitations here than those listed. This again goes back to the scope, but I think readers who zoom over the lit review will miss that this manuscript is only interested in evaluations of current tools. This means that many applications making claims about their usefulness (e.g., Elicit and other products) have not been included. Especially given the emphasis on qualitative summaries, the authors need to make clear that those tools have *not* been evaluated in the types of ways that the tools mentioned in this article have.&#x00a0;</p>
            <p> </p>
            <p> 6. I appreciated the transparency and reporting done. Nice work!</p>
            <p> </p>
            <p> Overall this is a great article and will make a strong contribution. But a bit more could be done to clarify. I wish the authors good luck in finalizing!</p>
            <p>Are the rationale for, and objectives of, the Systematic Review clearly stated?</p>
            <p>Yes</p>
            <p>Is the statistical analysis and its interpretation appropriate?</p>
            <p>Yes</p>
            <p>If this is a Living Systematic Review, is the &#x2018;living&#x2019; method appropriate and is the search schedule clearly defined and justified? (&#x2018;Living Systematic Review&#x2019; or a variation of this term should be included in the title.)</p>
            <p>Yes</p>
            <p>Are sufficient details of the methods and analysis provided to allow replication by others?</p>
            <p>Partly</p>
            <p>Are the conclusions drawn adequately supported by the results presented in the review?</p>
            <p>Partly</p>
            <p>Reviewer Expertise:</p>
            <p>I'm an expert in research synthesis methods, designing applications for conducting syntheses. I've worked on both AI-based and non-AI-based architectures.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report298395">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.166140.r298395</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Schmidt</surname>
                        <given-names>Lena</given-names>
                    </name>
                    <xref ref-type="aff" rid="r298395a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-0709-8226</uri>
                </contrib>
                <aff id="r298395a1">
                    <label>1</label>National Institute for Health and Care Research Innovation Observatory, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, England, UK</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>30</day>
                <month>8</month>
                <year>2024</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2024 Schmidt L</copyright-statement>
                <copyright-year>2024</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport298395" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.151493.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Thank you for providing this interesting review about automated data extraction from social science studies and model architectures, evaluation, data processing and more. The information in the tables, especially Tables 3, 4, and 5, adds a lot of value. I am not an expert in social science studies but am familiar with the general field of automated data extraction. Therefore, I have just a few very minor comments and questions for clarification below:</p>
            <p> </p>
            <p> The reference from Yu et al. below is mostly concerning screening automation and not data extraction if I am not missing a major point in the paper. If that is correct then there may exist better works to reference in this context? &#x201c;the process of extracting data from primary research is a labor-intensive effort, fraught with the potential for human error (see&#x00a0;Pigott &amp; Polanin, 2020;&#x00a0;Yu et al., 2018).&#x201d;</p>
            <p> I am not an expert in social science research, but a few included references in Table 2 caught my eye. For example&#x00a0;Iwatsuki et al. (2017)&#x00a0;about detecting in-line mathematical expressions or&#x00a0;Torres et al. (2012)&#x00a0;about software engineering or later&#x00a0;Nayak et al. (2021)&#x00a0;about cotton industry?</p>
            <p> Regarding this sentence in the conclusions, it might be more up-to-date to reference the review update from 2023 with 76 included papers: &#x201c;For example, while an LSR focusing on clinical research that is based on the PICO framework yielded 53 studies that included original data extraction (Schmidt et al., 2021)&#x201d;</p>
            <p> One of the challenges with living updates is to adapt the search whenever there are new developments in a field of research. You may have already considered adapting the search strategy to make sure that new methods relying on large language models (LLM) like GPT or T5 are picked up? There may be relevant articles coming through soon, for example&#x00a0;<ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/2405.14445">https://arxiv.org/abs/2405.14445</ext-link>&#x00a0;may be of interest for a future review update as it looks at social science study data extraction and if it is, then it would be good to make sure that the search can pick up the terminology correctly.</p>
            <p> In the methodology section, could you please state the dates when the search relevant to the baseline review cutoff was conducted (for each data source if different)?</p>
            <p>Are the rationale for, and objectives of, the Systematic Review clearly stated?</p>
            <p>Yes</p>
            <p>Is the statistical analysis and its interpretation appropriate?</p>
            <p>Yes</p>
            <p>If this is a Living Systematic Review, is the &#x2018;living&#x2019; method appropriate and is the search schedule clearly defined and justified? (&#x2018;Living Systematic Review&#x2019; or a variation of this term should be included in the title.)</p>
            <p>Yes</p>
            <p>Are sufficient details of the methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn adequately supported by the results presented in the review?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Systematic review automation, automated data extraction (clinical trials), natural language processing, living reviews</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
        <sub-article article-type="response" id="comment12500-298395">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Legate</surname>
                            <given-names>Amanda</given-names>
                        </name>
                        <aff>University of Texas Tyler</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>22</day>
                    <month>9</month>
                    <year>2024</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We are honored that you agreed to review our research and sincerely appreciate your thoughtful review and feedback. Please find responses to each comment below.</p>
                <p> </p>
                <p> Comment 1</p>
                <p> The reference from Yu et al. below is mostly concerning screening automation and not data extraction if I am not missing a major point in the paper. If that is correct then there may exist better works to reference in this context? &#x201c;the process of extracting data from primary research is a labor-intensive effort, fraught with the potential for human error (see&#x00a0;Pigott &amp; Polanin, 2020;&#x00a0;Yu et al., 2018).&#x201d;</p>
                <p> </p>
                <p> Comment 1: Response</p>
                <p> Thank you for catching this oversight. You are absolutely correct; Yu et al. (2018) primarily focused on screening automation for primary study selection rather than data extraction stages of SLRs. We have removed this reference from the sentence to align with the context.</p>
                <p> </p>
                <p> Comment 2</p>
                <p> I am not an expert in social science research, but a few included references in Table 2 caught my eye. For example&#x00a0;Iwatsuki et al. (2017)&#x00a0;about detecting in-line mathematical expressions or&#x00a0;Torres et al. (2012)&#x00a0;about software engineering or later&#x00a0;Nayak et al. (2021)&#x00a0;about cotton industry?</p>
                <p> </p>
                <p> Comment 2: Response</p>
                <p> Thank you for your insights on the relevance of references in Table 2. Our search strategy was intentionally broad to include studies utilizing (semi)automated data extraction methods across various domains, provided they were not solely focused on clinical research. The goal was to ensure comprehensiveness; however, we understand your concern regarding the ambiguity of some references' relevance to social sciences. As our study is a "living" review, we see this as an excellent opportunity to consider refining our inclusion criteria in future updates. We will explore more targeted approaches that can help streamline the search strategy, potentially focusing on research that more directly applies to social sciences or explicitly demonstrates transferable methodologies that align with the needs of social science researchers. Additionally, we are discussing options for collaborating with experts who specialize in bibliometric analysis or search strategy optimization to ensure that our review remains focused, relevant, and complementary to your work.</p>
                <p> </p>
                <p> Comment 3</p>
                <p> Regarding this sentence in the conclusions, it might be more up-to-date to reference the review update from 2023 with 76 included papers: &#x201c;For example, while an LSR focusing on clinical research that is based on the PICO framework yielded 53 studies that included original data extraction (Schmidt et al., 2021)&#x201d;.</p>
                <p> </p>
                <p> Comment 3: Response</p>
                <p> Thank you for pointing out this important update. We appreciate your diligence in ensuring that references are current and reflective of the most recent findings. The manuscript has been revised to reference the 2023 update of your LSR to accurately reflect the most up-to-date results.</p>
                <p> </p>
                <p> Comment 4</p>
                <p> One of the challenges with living updates is to adapt the search whenever there are new developments in a field of research. You may have already considered adapting the search strategy to make sure that new methods relying on large language models (LLM) like GPT or T5 are picked up? There may be relevant articles coming through soon, for example&#x00a0;
                    <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/2405.14445">https://arxiv.org/abs/2405.14445</ext-link>&#x00a0;&#x00a0;may be of interest for a future review update as it looks at social science study data extraction and if it is, then it would be good to make sure that the search can pick up the terminology correctly.</p>
                <p> </p>
                <p> Comment 4: Response</p>
                <p> Thank you for highlighting the importance of adapting the search strategy to capture emerging developments in automation technologies, particularly those involving large language models (LLMs) like GPT or T5. We completely agree that a key aspect of maintaining the relevance and rigor of a living systematic review is to continuously update the search strategy to reflect the current state-of-the-art in the field. We will incorporate this valuable feedback into future iterations by updating our search terms and strategies to include LLM-related methodologies and terminologies, ensuring the inclusion of new and relevant articles. The paper you referenced (
                    <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/2405.14445">https://arxiv.org/abs/2405.14445</ext-link>) serves as an excellent example, and we will use it to refine our search criteria. This approach will help us stay current with advances in data extraction techniques. Thank you for providing specific references to guide this adaptation.</p>
                <p> </p>
                <p> Comment 5</p>
                <p> In the methodology section, could you please state the dates when the search relevant to the baseline review cutoff was conducted (for each data source if different)?</p>
                <p> </p>
                <p> Comment 5: Response</p>
                <p> Thank you for this thoughtful suggestion. While we reported the search dates in the extended data files housed in OSF, we agree with you that including them directly in the Methods section would add clarity and value for readers. We have updated the section to specify the dates when searches were conducted for each data source, ensuring this information is clear and accessible to readers.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report298402">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.166140.r298402</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Macura</surname>
                        <given-names>Biljana</given-names>
                    </name>
                    <xref ref-type="aff" rid="r298402a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-4253-1390</uri>
                </contrib>
                <aff id="r298402a1">
                    <label>1</label>Stockholm Environment Institute, Stockholm, Sweden</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>19</day>
                <month>8</month>
                <year>2024</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2024 Macura B</copyright-statement>
                <copyright-year>2024</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport298402" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.151493.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This manuscript represents an important contribution to the evidence synthesis methodology. Given the rise of AI technology, a living evidence base on approaches to data extraction will be very useful. However, the manuscript could benefit from improved clarity. Below are my comments: 
                <list list-type="order">
                    <list-item>
                        <p>Title: 
                            <list list-type="order">
                                <list-item>
                                    <p>Clarify the type of data being extracted (qualitative, quantitative, or mixed).</p>
                                </list-item>
                                <list-item>
                                    <p>Since this review does not include any qualitative or quantitative synthesis per se, but rather provides an overview of the field (methods for semi-automated data extraction), I suggest removing "living systematic review" and adding "living systematic map."</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>Abstract: 
                            <list list-type="order">
                                <list-item>
                                    <p>The summary of methods could include more detailed information on searches, screening, critical appraisal, and synthesis. Please specify which standards for review conduct were followed.</p>
                                </list-item>
                                <list-item>
                                    <p>The summary of results could provide more information (briefly) about the included studies.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>Keywords: 
                            <list list-type="order">
                                <list-item>
                                    <p>Avoid repeating terms already present in the title</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>Introduction: 
                            <list list-type="order">
                                <list-item>
                                    <p>The focus of this review&#x2014;extraction tools for quantitative data&#x2014;should be more explicitly stated. This emphasis needs to be clearer in the introduction and reflected in the title, as mentioned earlier. Specifically, the first paragraph of the Introduction should be revised to concentrate on the review topic&#x2014;quantitative data extraction and existing tools&#x2014;rather than a general introduction to meta-science or related areas.</p>
                                </list-item>
                                <list-item>
                                    <p>Additional details are needed on how this review contributes to and complements existing reviews on the topic. This information should be included in the "Related Research" section.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>Objectives: 
                            <list list-type="order">
                                <list-item>
                                    <p>It would be helpful to define what is included under 
                                        <italic>&#x201c;social science research domains&#x201d;.</italic>
                                    </p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>Methods: 
                            <list list-type="order">
                                <list-item>
                                    <p>Authors should be transparent and explicit about the guidelines and standards for both conduct and reporting that were used. Please clarify this at the beginning of the Methods section.</p>
                                </list-item>
                                <list-item>
                                    <p>The methods section should begin by addressing any deviations from the protocol. If there were no deviations, this should be clearly stated as well.</p>
                                </list-item>
                                <list-item>
                                    <p>Did you use any automation technologies to screen or select studies for this review? If yes, please clarify.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>
                            <underline>Methods/Eligibility criteria</underline>: 
                            <list list-type="order">
                                <list-item>
                                    <p>The eligibility criteria should be explicit about the field within which methods for (semi)automated data extraction are applied.</p>
                                </list-item>
                                <list-item>
                                    <p>A definition of &#x201c;(semi)automated&#x201d; is needed. The eligibility criteria currently state that semi-automated approaches will be eligible but then refer to &#x201c;any automated approach to data extraction&#x201d; in the next sentence. This needs to be clarified&#x2014;are the focus and criteria on semi-automated or automated approaches? Be more explicit and precise in the description of the eligibility criteria, and ensure alignment with the protocol.</p>
                                </list-item>
                                <list-item>
                                    <p>Instead of &#x201c;We excluded studies labeled as editorials, briefs&#x2026;&#x201d; you may write &#x201c;Editorials, briefs, &#x2026; were not considered eligible&#x201d; (and similar changes may be applied to the following sentence).</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>
                            <underline>Methods/Searches</underline>: 
                            <list list-type="order">
                                <list-item>
                                    <p>Be explicit about the citation indices included in your Web of Science subscription and note which library was used to access WoS. This will increase transparency and replicability of your searches.</p>
                                </list-item>
                                <list-item>
                                    <p>Clarify why following Schmidt et al.'s search strategy was important, given the different scope of this review. Consider including more social science databases to ensure comprehensive coverage. Did you include the Social Science Citation Index (within WoS)?</p>
                                </list-item>
                                <list-item>
                                    <p>Provide explanations for all abbreviations (IEEE, ACL, etc.) in the text.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>
                            <underline>Methods/Study selection</underline> 
                            <list list-type="order">
                                <list-item>
                                    <p>Clarify if three researchers simultaneously screened titles and abstracts (TA), and whether inter-rater reliability (IRR) was calculated for TA screening. How did you train reviewers to apply eligibility criteria?</p>
                                </list-item>
                                <list-item>
                                    <p>The sentence, &#x201c;<italic>In cases where level of abstraction and potential for transferability could not be determined from the abstract alone, full text articles were reviewed and discussed by all three researchers until consensus was reached</italic>&#x201d;, should more clearly state that there was NO full-text screening of all records (if this is correct), only of a sub-sample where abstracts did not clearly describe AI technology, etc.</p>
                                </list-item>
                                <list-item>
                                    <p>Relatedly, Figure 1 should be adjusted to avoid giving the false impression that all records were screened in full text.</p>
                                </list-item>
                                <list-item>
                                    <p>This review seems to involve META-data extraction rather than DATA extraction. Please adjust the text and figures accordingly.</p>
                                </list-item>
                                <list-item>
                                    <p>It is not clear if IRR assessments were conducted for meta-data extraction. Please clarify/be explicit. If IRR was not done, describe how researchers were trained to use the extraction form.</p>
                                </list-item>
                                <list-item>
                                    <p>The sentence, &#x201c;
                                        <italic>coding forms allowed for input of &#x201c;other&#x201d; responses (e.g., APA data elements) that were not included in extant reviews that focus on medical and clinical data extraction (e.g., PICO elements)</italic>&#x201d; is unclear. &#x00a0;Consider removing or clarifying and linking it better with the rest of the text.</p>
                                </list-item>
                                <list-item>
                                    <p>Describe the procedure for screening and meta-data extraction of studies authored by the review team.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>
                            <underline>Methods/Critical appraisal and Synthesis</underline> 
                            <list list-type="order">
                                <list-item>
                                    <p>These sections are missing. Please state clearly if a critical appraisal of included studies was conducted and, if so, how it was performed. Also, describe how synthesis was conducted.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>Results/Challenges 
                            <list list-type="order">
                                <list-item>
                                    <p>Clarify that the described challenges reflect issues within the body of evidence included in this (baseline) review (otherwise this section can be mixed up with review limitations).</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                    <list-item>
                        <p>Conclusions/Limitations 
                            <list list-type="order">
                                <list-item>
                                    <p>Organize limitations into those related to the methodology used and those related to the evidence base.</p>
                                </list-item>
                                <list-item>
                                    <p>Discuss limitations related to the focus on publications in English, the inexhaustive list of search sources, and the lack of grey literature.</p>
                                </list-item>
                            </list> </p>
                    </list-item>
                </list>
            </p>
            <p>Are the rationale for, and objectives of, the Systematic Review clearly stated?</p>
            <p>Partly</p>
            <p>Is the statistical analysis and its interpretation appropriate?</p>
            <p>Not applicable</p>
            <p>If this is a Living Systematic Review, is the &#x2018;living&#x2019; method appropriate and is the search schedule clearly defined and justified? (&#x2018;Living Systematic Review&#x2019; or a variation of this term should be included in the title.)</p>
            <p>Partly</p>
            <p>Are sufficient details of the methods and analysis provided to allow replication by others?</p>
            <p>Partly</p>
            <p>Are the conclusions drawn adequately supported by the results presented in the review?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Systematic Evidence Synthesis Methodology</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment12505-298402">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Legate</surname>
                            <given-names>Amanda</given-names>
                        </name>
                        <aff>University of Texas Tyler</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>22</day>
                    <month>9</month>
                    <year>2024</year>
                </pub-date>
            </front-stub>
            <body>
                <p>Dear Dr. Macura,</p>
                <p> </p>
                <p> Thank you for your thoughtful and detailed feedback on our manuscript. We appreciate the time and effort you have invested in providing suggestions to enhance our work. We also value rigorous research methods and reporting transparency and would like to clarify several points regarding the reporting guidelines we adhered to and the journal's policies and requirements.</p>
                <p> </p>
                <p> Our manuscript follows the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. As noted in the F1000Research &#x201c;Article Standards of Reporting&#x201d; (https://f1000research.com/about/policies#stofrep), systematic reviews published in this journal must adhere to PRISMA guidelines. We have ensured that our reporting aligns with PRISMA's emphasis on transparency, replicability, and comprehensiveness.</p>
                <p> </p>
                <p> We would like to express our genuine appreciation for the important work you and your colleagues have done in developing the ROSES (Reporting standards for Systematic Evidence Syntheses) guidelines for systematic evidence synthesis in environmental science. Improving transparency and standardization in research reporting is a goal we fully support. While we acknowledge the value of the ROSES guidelines, they were not the reporting standard required or appropriate for our systematic review. We noticed that many of your comments seem to assess our manuscript against the ROSES guidelines (Haddaway et al., 2017a; 2017b; 2018; Haddaway &amp; Macura, 2018). For example, the suggestion to emphasize "meta-data extraction" aligns more with ROSES, whereas PRISMA does not require such differentiation and focuses on clarity in describing the data collection process, whether it involves meta-data or primary data points.</p>
                <p> </p>
                <p> We believe it is essential to assess our work based on the scope and framework provided by PRISMA rather than extend it beyond its current focus to fit an alternative reporting framework. We are committed to making revisions that enhance the clarity and rigor of our research while remaining consistent with the standards required by the journal.</p>
                <p> </p>
                <p> Thank you again for your constructive feedback and for considering our clarifications.&#x00a0;</p>
                <p> </p>
                <p> 
                    <bold>References</bold>
                </p>
                <p> Haddaway, N. R., &amp; 
                    <bold>Macura, B.</bold> (2018). The role of reporting standards in producing robust literature reviews. 
                    <italic>Nature Climate Change, 8</italic>(6), 444&#x2013;447. https://doi.org/10.1038/s41558-018-0180-3</p>
                <p> Haddaway, N. R., 
                    <bold>Macura, B.</bold>, Whaley, P., &amp; Pullin, A. S. (2017a). 
                    <italic>ROSES for systematic map reports </italic>(Version 1.0) [Data file]. https://doi.org/10.6084/m9.figshare.5897299&#x00a0;</p>
                <p> Haddaway, N. R., 
                    <bold>Macura, B.</bold>, Whaley, P., &amp; Pullin, A. S. (2017b). 
                    <italic>ROSES for systematic review reports </italic>(Version 1.0) [Data file]. https://doi.org/10.6084/m9.figshare.5897272</p>
                <p> Haddaway, N. R., 
                    <bold>Macura, B.</bold>, Whaley, P., &amp; Pullin, A. S. (2018). ROSES reporting standards for systematic evidence syntheses: Pro forma, flow-diagram and descriptive summary of the plan and conduct of environmental systematic reviews and systematic maps. 
                    <italic>Environmental Evidence, 7</italic>(7). https://doi.org/10.1186/s13750-018-0121-7</p>
                <p> </p>
                <p> Comment 1</p>
                <p> [Title] Clarify the type of data being extracted (qualitative, quantitative, or mixed).</p>
                <p> </p>
                <p> Comment 2</p>
                <p> [Title] Since this review does not include any qualitative or quantitative synthesis per se, but rather provides an overview of the field (methods for semi-automated data extraction), I suggest removing "living systematic review" and adding "living systematic map."</p>
                <p> </p>
                <p> Comment 1 &amp; 2: Response</p>
                <p> Thank you for these valuable suggestions regarding the title. We have considered (1) specifying the type of data being extracted in the title and (2) changing the title from "living systematic review" to "living systematic map." However, we have retained the original title to ensure consistency with our pre-registered protocol, adhere to PRISMA reporting standards, and comply with F1000Research guidelines.</p>
                <p> </p>
                <p> Comment 3</p>
                <p> [Abstract] The summary of methods could include more detailed information on searches, screening, critical appraisal, and synthesis. Please specify which standards for review conduct were followed.</p>
                <p> </p>
                <p> Comment 3: Response</p>
                <p> Thank you for the suggestion to provide more detailed information on searches, screening, critical appraisal, and synthesis in the abstract to better align with ROSES reporting recommendations. To ensure compliance with the journal's requirements, we followed the PRISMA guidelines for a structured summary, which emphasize conciseness in presenting objectives, eligibility criteria, methods, results, and conclusions. While we understand the desire for additional details, we believe the current abstract aligns with these guidelines but will review it again to ensure optimal clarity.</p>
                <p> </p>
                <p> Comment 4</p>
                <p> [Abstract] The summary of results could provide more information (briefly) about the included studies.</p>
                <p> </p>
                <p> Comment 4: Response</p>
                <p> Thank you for the recommendation to provide more information about the included studies within the abstract. We will incorporate a brief summary of the included studies' key characteristics and findings in future updates to this review to enhance clarity and completeness.</p>
                <p> </p>
                <p> Comment 5</p>
                <p> [Keywords] Avoid repeating terms already present in the title</p>
                <p> </p>
                <p> Comment 5: Response</p>
                <p> Thank you for highlighting ROSES guidance indicating that keywords should not repeat the title but rather provide additional context. Where appropriate, we will revise keywords to avoid redundancy and enhance discoverability.</p>
                <p> </p>
                <p> Comment 6</p>
                <p> [Introduction] The focus of this review&#x2014;extraction tools for quantitative data&#x2014;should be more explicitly stated. This emphasis needs to be clearer in the introduction and reflected in the title, as mentioned earlier. Specifically, the first paragraph of the Introduction should be revised to concentrate on the review topic&#x2014;quantitative data extraction and existing tools&#x2014;rather than a general introduction to meta-science or related areas.</p>
                <p> </p>
                <p> Comment 6: Response</p>
                <p> Thank you for your feedback on clarifying the focus of our review. Our study does not exclusively focus on extraction tools for quantitative data; it encompasses approaches to data extraction for both quantitative and qualitative data elements relevant to evidence synthesis in systematic reviews and meta-analyses within social sciences. To better reflect this broader focus, we have revised the objective section to explicitly state that the review covers data extraction tools for a range of data types. We hope this adjustment will provide clearer insight into the comprehensive scope of our review.</p>
                <p> </p>
                <p> Comment 7</p>
                <p> [Introduction] Additional details are needed on how this review contributes to and complements existing reviews on the topic. This information should be included in the "Related Research" section.</p>
                <p> </p>
                <p> Comment 7: Response</p>
                <p> Thank you for this insightful comment. We agree on the importance of clearly situating our review within the existing literature to highlight its unique contributions. Although we did not adhere to ROSES guidelines for explaining the review's relevance to existing literature, we followed PRISMA guidelines in the "Related Literature" section to identify relevant prior reviews and synthesize their focus, findings, and limitations. We will consider ways to enhance this section to better emphasize our review's distinct contributions moving forward.</p>
                <p> </p>
                <p> Comment 8</p>
                <p> [Objectives] It would be helpful to define what is included under&#x00a0;
                    <italic>&#x201c;social science research domains&#x201d;.</italic>
                </p>
                <p> </p>
                <p> Comment 8: Response</p>
                <p> Thank you for this suggestion. Our pre-registered research protocol and the "Baseline Review Search Strategy" document (available in the project's OSF repository) provide a comprehensive list of over 100 subject categories included under social science research domains, ranging from sociology and political science to interdisciplinary areas such as "Social Sciences Mathematical Methods." To enhance clarity, we have updated the objectives section to include more details and a reference to the extended data file.</p>
                <p> </p>
                <p> Comment 9</p>
                <p> [Methods] Authors should be transparent and explicit about the guidelines and standards for both conduct and reporting that were used. Please clarify this at the beginning of the Methods section.</p>
                <p> </p>
                <p> Comment 9: Response</p>
                <p> We acknowledge that the ROSES guidelines recommend transparency in reporting the guidelines and standards for both conduct and reporting at the beginning of the Methods section. However, in accordance with the journal's article guidelines for living systematic reviews (available from: https://f1000research.com/for-authors/article-guidelines/living-systematic-reviews), this information is provided in the "Reporting Guidelines" section.&#x00a0;</p>
                <p> </p>
                <p> Comment 10</p>
                <p> [Methods] The methods section should begin by addressing any deviations from the protocol. If there were no deviations, this should be clearly stated as well.</p>
                <p> </p>
                <p> Comment 10: Response</p>
                <p> Thank you for highlighting this important aspect. While we did not adopt the ROSES reporting standards for this research, we recognize their guidance on stating any deviations from the protocol at the beginning of the methods section. We have addressed any deviations in the appropriate sections of the paper, and additional descriptions are provided in the extended data files to ensure transparency and replicability.</p>
                <p> </p>
                <p> Comment 11</p>
                <p> [Methods] Did you use any automation technologies to screen or select studies for this review? If yes, please clarify.</p>
                <p> </p>
                <p> Comment 11: Response</p>
                <p> Thank you for your question. The use of automation technologies is detailed in the "Search Sources" and "Study Selection" subsections of the Methods section. Additionally, to ensure transparency and replicability, further details are provided in the "Software Availability" section, as per F1000Research guidelines.</p>
                <p> </p>
                <p> Comment 12</p>
                <p> [Methods/Eligibility criteria] The eligibility criteria should be explicit about the field within which methods for (semi)automated data extraction are applied.</p>
                <p> </p>
                <p> Comment 12: Response</p>
                <p> Thank you for this comment. To ensure clarity, we have referenced the extended data files in the text, which provide comprehensive details and a full list of over 100 research fields. These details are openly available in the project repository, as specified in the protocol (please see response to Comment 8).</p>
                <p> </p>
                <p> Comment 13</p>
                <p> [Methods/Eligibility criteria] A definition of &#x201c;(semi)automated&#x201d; is needed. The eligibility criteria currently state that semi-automated approaches will be eligible but then refer to &#x201c;any automated approach to data extraction&#x201d; in the next sentence. This needs to be clarified&#x2014;are the focus and criteria on semi-automated or automated approaches? Be more explicit and precise in the description of the eligibility criteria and ensure alignment with the protocol.</p>
                <p> </p>
                <p> Comment 13: Response</p>
                <p> Thank you for your suggestion to clarify the phrasing regarding the eligibility criteria. We have revised the description to specify that the focus is on any "technique" applied for extracting data from literature in a semi-automated manner. This adjustment aligns with the study protocol.</p>
                <p> </p>
                <p> Comment 14</p>
                <p> [Methods/Eligibility criteria] Instead of &#x201c;We excluded studies labeled as editorials, briefs..&#x201d; you may write &#x201c;Editorials, briefs, &#x2026;were not considered eligible&#x201d; &#x00a0;(and similar changes may be applied to the following sentence).</p>
                <p> </p>
                <p> Comment 14: Response</p>
                <p> Thank you for the suggestion. We have revised the text to use passive construction, as recommended. We have also applied similar changes to the following sentence for consistency.</p>
                <p> </p>
                <p> Comment 15</p>
                <p> [Methods/ Searches] Be explicit about the citation indices included in your Web of Science subscription and note which library was used to access WoS. This will increase transparency and replicability of your searches.</p>
                <p> </p>
                <p> Comment 15: Response</p>
                <p> Thank you for this suggestion. To avoid redundancy in the manuscript, we have added a statement directing readers to the extended data files, which provide additional detail related to WoS indices and search settings.</p>
                <p> </p>
                <p> Comment 16</p>
                <p> [Methods/ Searches] Clarify why following Schmidt et al.'s search strategy was important, given the different scope of this review.</p>
                <p> </p>
                <p> Comment 16: Response</p>
                <p> Thank you for this comment. To clarify, we followed Schmidt et al.'s search strategy to ensure comprehensive coverage of relevant databases and consistency in methodological rigor, which is important even with a different scope. To avoid redundancy, we have added a reference in the manuscript directing readers to the extended data file and research protocol in our open-access repository, where this rationale is explained in detail.</p>
                <p> </p>
                <p> Comment 17</p>
                <p> [Methods/ Searches] Consider including more social science databases to ensure comprehensive coverage.</p>
                <p> </p>
                <p> Comment 17: Response</p>
                <p> Thank you for this valuable suggestion. We appreciate the importance of comprehensive coverage and will consider including additional social science databases in future updates to further enhance the scope of our review.</p>
                <p> </p>
                <p> Comment 18</p>
                <p> [Methods/ Searches] Did you include the Social Sciences Citation Index (within WoS)?</p>
                <p> </p>
                <p> Comment 18: Response</p>
                <p> Yes, the Social Sciences Citation Index within Web of Science was included. We have updated the text to clarify that all editions, settings, and search syntax used are detailed in the extended data files available in the open-access repository.</p>
                <p> </p>
                <p> Comment 19</p>
                <p> [Methods/ Searches] Provide explanations for all abbreviations (IEEE, ACL, etc.) in the text.</p>
                <p> </p>
                <p> Comment 19: Response</p>
                <p> Thank you for the suggestion. We have added explanations for all source abbreviations (e.g., IEEE, ACL) in the text to improve clarity for readers.</p>
                <p> </p>
                <p> Comment 20</p>
                <p> [Methods/ Study selection] Clarify if three researchers simultaneously screened titles and abstracts (TA), and whether inter-rater reliability (IRR) was calculated for TA screening. How did you train reviewers to apply eligibility criteria?</p>
                <p> </p>
                <p> Comment 20: Response</p>
                <p> Thank you for your question. The "Study Selection" section of the paper details the independent screening procedures, the training process for reviewers on applying eligibility criteria, and inter-rater reliability (IRR) considerations.</p>
                <p> </p>
                <p> Comment 21</p>
                <p> [Methods/ Study selection] The sentence, &#x201c;<italic>In cases where level of abstraction and potential for transferability could not be determined from the abstract alone, full text articles were reviewed and discussed by all three researchers until consensus was reached</italic>&#x201d;, should more clearly state that there was NO full-text screening of all records (if this is correct), only of a sub-sample where abstracts did not clearly describe AI technology, etc.</p>
                <p> </p>
                <p> Comment 21: Response</p>
                <p> Thank you for this observation. While ROSES guidelines provide alternative flowchart formatting and descriptions, we adhered to PRISMA guidelines. According to the PRISMA flowchart (Figure 1), a total of 11,336 records were identified, and after deduplication, 10,644 articles underwent title and abstract screening. As indicated in the flowchart, only 46 articles proceeded to the full-text screening stage, which occurred separately.</p>
                <p> </p>
                <p> Comment 22</p>
                <p> [Methods/ Study selection] Relatedly, Figure 1 should be adjusted to avoid giving the false impression that all records were screened in full text.</p>
                <p> </p>
                <p> Comment 22: Response</p>
                <p> Thank you for highlighting this concern. The current figure indicates that 46 articles were included in the full-text screening stage, making it clear that not all records were screened in full text. However, we will consider expanding the flowchart in future updates to provide additional details that could further enhance transparency and clarity.</p>
                <p> </p>
                <p> Comment 23</p>
                <p> [Methods/ Study selection] This review seems to involve META-data extraction rather than DATA extraction. Please adjust the text and figures accordingly.</p>
                <p> </p>
                <p> Comment 23: Response</p>
                <p> Thank you for your observation. While ROSES emphasizes distinguishing between meta-data extraction and data extraction, PRISMA does not make this distinction as explicitly. Our paper follows PRISMA guidelines, focusing on transparency and completeness in documenting the tools, databases, and criteria used for extraction.</p>
                <p> </p>
                <p> Comment 24</p>
                <p> [Methods/ Study selection] It is not clear if IRR assessments were conducted for meta-data extraction. Please clarify/be explicit. If IRR was not done, describe how researchers were trained to use the extraction form.</p>
                <p> </p>
                <p> Comment 24: Response</p>
                <p> Thank you for raising this point. The "Study Selection" section of the paper details this information and discusses inter-rater reliability (IRR) assessments.</p>
                <p> </p>
                <p> Comment 25</p>
                <p> [Methods/ Study selection] The sentence, &#x201c;
                    <italic>coding forms allowed for input of &#x201c;other&#x201d; responses (e.g., APA data elements) that were not included in extant reviews that focus on medical and clinical data extraction (e.g., PICO elements)</italic>&#x201d; is unclear. &#x00a0;Consider removing or clarifying and linking it better with the rest of the text.</p>
                <p> </p>
                <p> Comment 25: Response</p>
                <p> Thank you for this suggestion. We have refined the statement to improve clarity and ensure it is better linked with the surrounding text.</p>
                <p> </p>
                <p> Comment 26</p>
                <p> [Methods/ Study selection] Describe the procedure for screening and meta-data extraction of studies authored by the review team.</p>
                <p> </p>
                <p> Comment 26: Response</p>
                <p> Thank you for highlighting ROSES guidance surrounding procedures for handling studies authored by the review team. However, no alternative procedures were implemented; therefore, there are no additional procedures to report.</p>
                <p> </p>
                <p> Comment 27</p>
                <p> [Methods/ Critical appraisal and Synthesis] These sections are missing. Please state clearly if a critical appraisal of included studies was conducted and if so, how was it performed. Also, describe how synthesis was conducted.</p>
                <p> </p>
                <p> Comment 27: Response</p>
                <p> Thank you for your comment. These sections are specific to ROSES guidelines and are not required by PRISMA or the journal's reporting standards.</p>
                <p> </p>
                <p> Comment 28</p>
                <p> [Results/Challenges] Clarify that the described challenges reflect issues within the body of evidence included in this (baseline) review (otherwise this section can be mixed up with review limitations).</p>
                <p> </p>
                <p> Comment 28: Response</p>
                <p> To avoid confusion with review limitations, we have revised the first sentence of this section to clarify that the challenges discussed specifically reflect issues within the body of evidence included in this baseline review.</p>
                <p> </p>
                <p> Comment 29</p>
                <p> [Conclusions/Limitations] Organize limitations into those related to the methodology used and those related to the evidence base.</p>
                <p> </p>
                <p> Comment 29: Response</p>
                <p> While our research and protocol were developed following PRISMA guidelines rather than ROSES, which requires a structured discussion of limitations, we appreciate the value of differentiating between methodological constraints and evidence base gaps. We will consider this distinction in future updates to enhance clarity.</p>
                <p> </p>
                <p> Comment 30</p>
                <p> [Conclusions/Limitations] Discuss limitations related to the focus on publications in English, the inexhaustive list of search sources, and the lack of grey literature.</p>
                <p> </p>
                <p> Comment 30: Response</p>
                <p> Thank you for the suggestion. We have updated the limitations section to address the focus on publications in English, the inexhaustive list of search sources, and the lack of grey literature.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report298396">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.166140.r298396</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Oswald</surname>
                        <given-names>Fred</given-names>
                    </name>
                    <xref ref-type="aff" rid="r298396a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-7275-5408</uri>
                </contrib>
                <aff id="r298396a1">
                    <label>1</label>Rice University, Houston, Texas, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>16</day>
                <month>7</month>
                <year>2024</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2024 Oswald F</copyright-statement>
                <copyright-year>2024</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport298396" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.151493.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Overall, this paper is an excellent review of automated data-extraction methods for the purposes of synthetic reviews and meta-analysis. To my knowledge, there is no such review in the literature, and yet given the rise in AI-based technologies, there is a growing need for researchers to have a single resource identifying these extraction methods. This review nicely summarizes the types of tools that are out there, but it might further tie the tools more closely to a checklist that reflects what must typically be accomplished when conducting meta-analysis (e.g., identifying literature, extracting sample sizes and effect sizes, converting effect sizes when necessary, coding effect sizes into variables, associated moderators, associated reliability coefficients, dealing with missing data). This would give the reader a better sense of what gets automated and serves their purposes (e.g., you can take the &#x2018;model architectures and components&#x2019; section and populate the checklist/framework with these AI tools/functions). Also, though it is certainly useful to document when and how humans are compared to automated systems, the level of accuracy reported (e.g., errors of commission and omission by automated systems) would be useful as well (i.e., are these automated systems any good? when systems agree with humans, when are they agreeing in an accurate way vs. a biased way?).</p>
            <p> </p>
            <p> Thank you for the opportunity to review &#x2013; again, this will be a valuable paper to readers.</p>
            <p>Are the rationale for, and objectives of, the Systematic Review clearly stated?</p>
            <p>Yes</p>
            <p>Is the statistical analysis and its interpretation appropriate?</p>
            <p>Yes</p>
            <p>If this is a Living Systematic Review, is the &#x2018;living&#x2019; method appropriate and is the search schedule clearly defined and justified? (&#x2018;Living Systematic Review&#x2019; or a variation of this term should be included in the title.)</p>
            <p>Yes</p>
            <p>Are sufficient details of the methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn adequately supported by the results presented in the review?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
        <sub-article article-type="response" id="comment12499-298396">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Legate</surname>
                            <given-names>Amanda</given-names>
                        </name>
                        <aff>University of Texas Tyler</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>22</day>
                    <month>9</month>
                    <year>2024</year>
                </pub-date>
            </front-stub>
            <body>
                <p>Comment 1</p>
                <p> Also, though it is certainly useful to document when and how humans are compared to automated systems, the level of accuracy reported (e.g., errors of commission and omission by automated systems) would be useful as well (i.e., are these automated systems any good? when systems agree with humans, when are they agreeing in an accurate way vs. a biased way?)</p>
                <p> </p>
                <p> Comment 1: Response</p>
                <p> Thank you for this insightful suggestion. We agree that evaluating the accuracy of automated systems compared to human assessments, particularly regarding errors of commission and omission, would provide valuable insights into their effectiveness and potential biases. Understanding when automated systems align with human judgments accurately is indeed crucial for advancing the field.</p>
                <p> Given the "living" nature of our review, we see this as an important focus for future updates. A comprehensive comparative assessment of these accuracy measures may require additional technical expertise, so we hope to expand our team to include experts in data science and AI evaluation. This addition will enhance the rigor of our review and help address critical questions surrounding the reliability of automated tools.</p>
                <p> </p>
                <p> We appreciate your valuable feedback and are committed to integrating these considerations in future versions of our living review.</p>
            </body>
        </sub-article>
    </sub-article>
</article>
