<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.130245.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Research Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Modeling document labels using Latent Dirichlet allocation for archived documents in Integrated Quality Assurance System (IQAS)</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 1 approved with reservations]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Prianes</surname>
                        <given-names>Freddie</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-0168-6182</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Palaoag</surname>
                        <given-names>Thelma</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-5474-7260</uri>
                    <xref ref-type="corresp" rid="c2">b</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>College of Computer Studies, Camarines Sur Polytechnic Colleges, Nabua, Camarines Sur, 4432, Philippines</aff>
                <aff id="a2">
                    <label>2</label>College of Information Technology and Computer Science, University of the Cordilleras, Baguio City, Benguet, 2600, Philippines</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:fprianes@cspc.edu.ph">fprianes@cspc.edu.ph</email>
                </corresp>
                <corresp id="c2">
                    <label>b</label>
                    <email xlink:href="mailto:tdpalaoag@uc-bcf.edu.ph">tdpalaoag@uc-bcf.edu.ph</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>27</day>
                <month>1</month>
                <year>2023</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2023</year>
            </pub-date>
            <volume>12</volume>
            <elocation-id>105</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>18</day>
                    <month>1</month>
                    <year>2023</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2023 Prianes F and Palaoag T</copyright-statement>
                <copyright-year>2023</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/12-105/pdf"/>
            <abstract>
                <p>
                    <bold>Background:</bold> As part of the transition of higher education institutions in the Philippines into intelligent campuses, the Commission on Higher Education has launched a program for the development of smart campuses in state universities and colleges to improve operational efficiency. In line with the commitment of Camarines Sur Polytechnic Colleges to improve its accreditation operations and to resolve evident problems in the accreditation process, the researchers propose this study as part of an Integrated Quality Assurance System. The study aims to develop an intelligent model for categorizing and automatically tagging archived documents used during accreditation.</p>
                <p>
                    <bold>Methods:</bold> As a guide in modeling the study, the researchers use an agile method as it promotes flexibility, speed, and, most importantly, continuous improvement in developing, testing, documenting, and even after delivery of the software. This method helped the researchers in designing the prototype with the implementation of the said model to aid the process in file searching and label tagging. Moreover, a computational analysis is also included to further understand the result from the devised model.</p>
                <p>
                    <bold>Results:</bold> From the processed sample corpus, the extracted document labels are faculty, activities, library, research, and materials. The labels are selected by relative frequency (0.009884, 0.008825, 0.007413, 0.007413, and 0.006354, respectively), computed as the ratio of the number of times a term occurs in the document to the total word count of the document.</p>
                <p>
                    <bold>Conclusions:</bold> The devised model and prototype support the organization in file storing and categorization of accreditation documents. Through this, it is easier to retrieve and classify the data, which is the main problem for the task group. Further, other patterns in clustering, modeling, and text classification can be integrated in the prototype.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>Latent Dirichlet allocation</kwd>
                <kwd>document labels</kwd>
                <kwd>natural language processing</kwd>
                <kwd>accreditation</kwd>
                <kwd>quality assurance</kwd>
                <kwd>Intelligent model</kwd>
                <kwd>CSPC</kwd>
            </kwd-group>
            <funding-group>
                <funding-statement>The author(s) declared that no grants were involved in supporting this work.</funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec id="sec1" sec-type="intro">
            <title>Introduction</title>
            <p>The creation of a smart campus is a step toward the creation of a smart city. Teaching and learning will be more difficult in the future as a result of the rapid advancements in information and communication technology (
                <xref ref-type="bibr" rid="ref6">Kwok, 2015</xref>). With this rapid advancement, there is already a shift from the &#x201c;smart&#x201d; era to the &#x201c;intelligent&#x201d; era. A &#x201c;smart phone&#x201d;, &#x201c;smart building&#x201d; or &#x201c;smart home&#x201d; is one that is capable of adapting to changing conditions. The term &#x201c;intelligent,&#x201d; on the other hand, refers to more than just being smart; rather, it refers to having the ability to think, reason, and understand, as well as being able to adapt to changing conditions. If you apply this to a device example, &#x201c;smart devices&#x201d; can perform tricks, but &#x201c;intelligent devices&#x201d; can learn new tricks in response to their changing surroundings (
                <xref ref-type="bibr" rid="ref10">Ng 
                    <italic toggle="yes">et al.,</italic> 2010</xref>).</p>
            <p>As part of the transition of every Higher Educational Institution (HEI) to being an intelligent campus, the Commission on Higher Education (CHED) has launched a program under 
                <ext-link ext-link-type="uri" xlink:href="https://ched.gov.ph/wp-content/uploads/CMO-No.-9-s.-2020-Guidelines-on-the-Allocation-of-Financial-assistance-for-State-Universities-and-Colleges-for-the-Development-of-Smart-Campuses-provided-in-Section-10-i-of-RA-11494.pdf">CHED Memorandum Order No. 9 s. 2020</ext-link> for the development of smart campuses for State Universities and Colleges (SUCs). CHED provides a budget to assist SUCs in developing smart campuses, in which HEIs use next-generation digital technologies woven seamlessly into a well-architected infrastructure to develop tools that enhance teaching and learning, research, and extension, and that improve operational efficiency. To maintain the quality of education in HEIs, CHED also delegates accountability and responsibility to accrediting bodies, such as the Accrediting Agency of Chartered Colleges and Universities of the Philippines (AACCUP), the Philippine Association of Colleges and Universities Commission on Accreditation (PACU-COA), the Philippine Accrediting Association of Schools, Colleges, and Universities (PAASCU), and many others, to assess and certify the quality of education in accredited programs and institutions, as stated in 
                <ext-link ext-link-type="uri" xlink:href="https://ched.gov.ph/wp-content/uploads/2017/10/CMO-No.01-s2005.pdf">CHED Memorandum Order No. 1 s. 2005</ext-link>.</p>
            <p>Achieving a smart/intelligent campus requires consideration of different areas by the institution. Based on the study of Ng 
                <italic toggle="yes">et al</italic>., there are six main areas of intelligence, namely (1) iLearning, (2) iManagement, (3) iGovernance, (4) iSocial, (5) iHealth, and (6) iGreen. The accreditation process itself falls under iManagement; however, the overall purpose of accreditation touches all six areas.</p>
            <p>Camarines Sur Polytechnic Colleges (CSPC), as a state college, will be one of the settings for the initial implementation of the system. As part of CSPC&#x2019;s goal to be a center for development and a center of excellence, the institution opted to join the CHED program and become one of the smart campuses in the region. In connection with this, the institution also undergoes continuous accreditation through AACCUP, as depicted in 
                <xref ref-type="fig" rid="f1">Figure 1</xref>, and ISO quality assurance to achieve the goal and gain the university status as the Polytechnic University of Bicol.</p>
            <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                <label>Figure 1. </label>
                <caption>
                    <title>Accrediting Agency of Chartered Colleges and Universities of the Philippines (AACCUP) accreditation process.</title>
                </caption>
                <graphic id="gr1" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/142987/9cc56796-b02a-41cb-806f-2ad69f5d5790_figure1.gif"/>
            </fig>
            <p>The accreditation process, as shown in 
                <xref ref-type="fig" rid="f1">Figure 1</xref>, passes through several phases: (a) Application: an educational institution submits an application to AACCUP for accreditation. (b) Institutional self-survey: after the application has been approved, the applicant institution is expected to conduct an internal evaluation, carried out by its internal accreditors, to determine whether the program is ready for an external review. (c) Preliminary survey visit: external accreditors evaluate the program for the first time. After passing the assessment, the program is eligible for Candidate status, which is valid for two years. (d) First formal survey visit: a program that has obtained Candidate status is reviewed and, if it has met a higher standard of excellence, it is given Level I Accredited status, which is valid for three years. (e) Second survey visit: an accredited program is evaluated and, if it has met a higher standard of quality than in the preceding survey visit, it may be eligible for Level II Re-accredited status, which is valid for five years. (f) Third survey visit: this takes place after a program has held Level II Re-accredited status for five years. The program is reviewed and must perform exceptionally in four areas: instruction and extension, which are mandatory, and two others selected from among research, performance in licensure examinations, faculty development, and linkages. (g) Fourth survey visit: this is a more demanding level that, if passed, may grant the organization institutional accreditation status.</p>
            <p>The tedious accreditation process requires many documents to be produced. In CSPC&#x2019;s current accreditation undertakings, the majority of the tasks are done manually. Although tools for cloud storage and automation are available, such as Google Drive, Dropbox, 
                <italic toggle="yes">etc.</italic>, problems such as repetition of work, invalid instruments, inefficient resource utilization, and inefficient monitoring before, during, and after the accreditation are still experienced by the personnel. With this perceived problem, an integrated system dedicated to quality assurance processes is a must.</p>
            <p>In line with CSPC&#x2019;s goal of becoming a university and a smart/intelligent campus, the researchers propose a centralized system that will cater to the needs of the institution in the accreditation process, which is part of quality assurance. Through this study, CSPC will move toward being a smart/intelligent campus by using the system in the iManagement area while also addressing the problems encountered during the accreditation process.</p>
            <p>Based on the problems identified and the commitment of the institution to be a smart/intelligent campus, the researchers propose this study as a component of the Integrated Quality Assurance System (IQAS) (RRID:SCR_023146). The study focuses on the archive of documents needed for the accreditation process. The system will have a repository of archived documents, which the system will analyze using intelligent modeling. The documents will then be categorized by means of the extracted labels.</p>
            <p>In general, the study aims to create a model in support of the categorization and automated tagging of the archived documents used during accreditation.</p>
        </sec>
        <sec id="sec2">
            <title>Related works</title>
            <p>Unstructured data make it more difficult and time-consuming to find a relevant document due to the exponential growth of electronic documents. Text document classification, which organizes unstructured documents into pre-defined classifications, is crucial to information processing and retrieval (
                <xref ref-type="bibr" rid="ref1">Akhter 
                    <italic toggle="yes">et al.,</italic> 2020</xref>). Text documents pose a number of difficult data-processing problems when retrieving pertinent information. Topic modeling is one of the popular methods for theme-based information retrieval from biomedical documents, but finding the correct topics in biomedical documents is a difficult task. Additionally, redundancy in biomedical text documents has a detrimental effect on text mining quality. As a result, the exponential rise of unstructured documents necessitates the development of machine learning approaches to topic modeling (
                <xref ref-type="bibr" rid="ref11">Rashid 
                    <italic toggle="yes">et al.,</italic> 2019</xref>). In the context of document categorization, another study conducted a comparative analysis of three models for feature representation of text documents: the most popular family of bag-of-words models, the more recently proposed continuous-space models Word2Vec and Doc2Vec, and a model based on representing text documents as language networks (
                <xref ref-type="bibr" rid="ref8">Martin&#x010d;i&#x0107;-Ip&#x0161;i&#x0107; 
                    <italic toggle="yes">et al.,</italic> 2019</xref>).</p>
            <p>In another study, word representation techniques were used to analyze how the similarity between English words is calculated. That work used the Word2Vec model to express words as vectors. A corpus of 320,000 English Wikipedia articles was used to build the model, and the similarity value was calculated using cosine similarity (
                <xref ref-type="bibr" rid="ref5">Jatnika 
                    <italic toggle="yes">et al.,</italic> 2019</xref>). Real-world text categorization problems frequently involve a multitude of closely related categories arranged in a taxonomy or hierarchical structure. When processing huge sets of closely related categories, hierarchical multi-label text categorization has grown more difficult (
                <xref ref-type="bibr" rid="ref7">Ma 
                    <italic toggle="yes">et al.,</italic> 2021</xref>). A popular technique for clustering functional data is the functional k-means clustering algorithm. However, this approach does not take derivative information into account when determining how similar two functional samples are to one another, even though derivative information is crucial for spotting differences in trend characteristics among functional data. By including derivative information, a novel distance was established for comparing functional samples (
                <xref ref-type="bibr" rid="ref9">Meng 
                    <italic toggle="yes">et al.,</italic> 2018</xref>). Due to its capacity to analyze data from numerous sources or views, multi-view clustering has drawn a growing amount of interest in recent years. One study presented a multi-view clustering method called Two-level Weighted Collaborative k-means (TW-Co-k-means), which simultaneously addresses consistency across different views and the weighting of views to improve clustering results. For multi-view clustering, a new objective function was developed that leverages the unique information in each view while also cooperatively utilizing the complementarity and consistency between views (
                <xref ref-type="bibr" rid="ref12">Zhang 
                    <italic toggle="yes">et al.,</italic> 2018</xref>). Pattern matching algorithms are used to locate every instance of a constrained set of patterns inside an input text or document in order to examine the content of the documents. One study utilized four established string matching techniques: the Brute Force approach, the Knuth&#x2013;Morris&#x2013;Pratt algorithm (KMP), the Boyer&#x2013;Moore algorithm, and the Rabin&#x2013;Karp algorithm (
                <xref ref-type="bibr" rid="ref2">Bhagya Sri 
                    <italic toggle="yes">et al.,</italic> 2018</xref>). The literature reviewed shares a focus on text clustering, modeling, and classification. It supports the feasibility of this study and suggests that the proposed intelligent model can be integrated to further assist in the accreditation process of CSPC.</p>
        </sec>
        <sec id="sec3" sec-type="methods">
            <title>Methods</title>
            <p>As a guide in modeling the study, the researchers used the agile method (
                <ext-link ext-link-type="uri" xlink:href="https://dx.doi.org/10.17504/protocols.io.n2bvj82mxgk5/v2">https://dx.doi.org/10.17504/protocols.io.n2bvj82mxgk5/v2</ext-link>) as it promotes flexibility, speed, and, most importantly, continuous improvement in developing, testing, and documenting the software, and even after its delivery. Because the phases of this model are lightweight, teams are not bound by a rigid process with pre-set constraints and restrictions, as in some other models such as the waterfall model, and can make changes whenever they are needed. This flexibility at every stage fosters creativity and freedom within processes. Furthermore, development teams can modify and re-prioritize the backlog, allowing for speedy implementation (
                <xref ref-type="bibr" rid="ref4">Trivedi, 2021</xref>).</p>
            <p>Following the agile methodology, the researchers adapted the stages, as presented in 
                <xref ref-type="fig" rid="f2">Figure 2</xref>. These are: (1) Plan: the researchers collected previous documents involved in the accreditation process, such as compliance reports under the areas of student, faculty, facility, library, and administration, and studied the existing problems in tracking, tagging, and duplication of these documents during the accreditation process. (2) Design: the requirement specifications were identified in relation to the HEI&#x2019;s existing problems in tracking, tagging, and duplication of documents for accreditation and quality assurance. The researchers also designed the process of the intelligent model, which is the basis of document labelling. (3) Develop: this stage focuses on creating the prototype, which processes the documents in order to identify the proper label for each one. (4) Deploy: the prototype undergoes a test run. (5) Review: the researchers conduct a checklist-based function review to check that each component runs properly. (6) Launch: the prototype is embedded in the local system of the HEI.</p>
            <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                <label>Figure 2. </label>
                <caption>
                    <title>Agile methodology.</title>
                </caption>
                <graphic id="gr2" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/142987/9cc56796-b02a-41cb-806f-2ad69f5d5790_figure2.gif"/>
            </fig>
        </sec>
        <sec id="sec4" sec-type="results|discussion">
            <title>Results and discussion</title>
            <sec id="sec5">
                <title>Intelligent model</title>
                <p>The results of the intelligent model are visualized as a super word vector and a histogram. The super word vector is presented as a word cloud that visualizes the frequency of the words in the corpus, and the histogram presents the relationship of the words per sentence in the form of line graphs. The extracted labels and the generated word vector and histogram are tagged and linked to the uploaded document, following the process shown in 
                    <xref ref-type="fig" rid="f3">Figure 3</xref>. This model is implemented in the IQAS to assist in categorization and searching in the file repository of accreditation documents.</p>
                <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                    <label>Figure 3. </label>
                    <caption>
                        <title>Process of the intelligent model.</title>
                    </caption>
                    <graphic id="gr3" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/142987/9cc56796-b02a-41cb-806f-2ad69f5d5790_figure3.gif"/>
                </fig>
            </sec>
            <sec id="sec6">
                <title>Prototype</title>
                <p>The design prototype presented in this section is focused on the label extraction feature for automatic tagging of the archive documents used in accreditation.</p>
                <p>
                    <italic toggle="yes">Upload and clean</italic>
                </p>
                <p>As shown in 
                    <xref ref-type="fig" rid="f4">Figure 4</xref>, this phase allows the user to upload and clean the document through tokenization. Once uploaded, the user may set the configuration in cleaning the document. The options are removing numbers, symbols, and duplicates, adding and uploading additional stopwords, and showing and downloading the pre-processed data. There are other useful features particularly in managing the stopwords, such as showing the list of default stopwords and deleting the added and uploaded stopwords.</p>
                <fig fig-type="figure" id="f4" orientation="portrait" position="float">
                    <label>Figure 4. </label>
                    <caption>
                        <title>Phase I&#x2014;upload and clean snapshot.</title>
                    </caption>
                    <graphic id="gr4" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/142987/9cc56796-b02a-41cb-806f-2ad69f5d5790_figure4.gif"/>
                </fig>
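                <p>The cleaning options described above can be sketched in Python as follows. This is a minimal illustration, not the prototype&#x2019;s actual implementation: the function name <monospace>clean_document</monospace>, its parameters, the whitespace tokenization, and the short default stopword list are all assumptions made for the example.</p>
                <code language="python" xml:space="preserve"><![CDATA[
```python
import re

# Hypothetical default stopword list; the prototype's actual list is not given.
DEFAULT_STOPWORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}


def clean_document(text, remove_numbers=True, remove_symbols=True,
                   remove_duplicates=False, extra_stopwords=()):
    """Tokenize text on whitespace and apply the selected cleaning options."""
    stopwords = DEFAULT_STOPWORDS | {w.lower() for w in extra_stopwords}
    cleaned = []
    seen = set()
    for token in text.lower().split():
        if remove_symbols:
            # Strip punctuation, symbols, and underscores.
            token = re.sub(r"[^\w\s]|_", "", token)
        if remove_numbers:
            # Strip digits.
            token = re.sub(r"\d", "", token)
        if not token or token in stopwords:
            continue
        if remove_duplicates and token in seen:
            continue
        seen.add(token)
        cleaned.append(token)
    return cleaned
```
]]></code>
                <p>For example, <monospace>clean_document("The faculty of 2023, and the Library!")</monospace> yields the tokens faculty and library, because the stopwords, digits, and punctuation are removed.</p>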
                <p>
                    <italic toggle="yes">Setting up parameters</italic>
                </p>
                <p>Phase II is intended for setting up the parameters for topic modeling, as presented in 
                    <xref ref-type="fig" rid="f5">Figure 5</xref>. Right after uploading and cleaning the document, the user can set the topic modeling parameters that will be used in identifying and extracting the labels. The parameters are the desired number of topics, the number of iterations, the number of words to be generated per topic, the optimization interval, and the model&#x2019;s name. These parameters are the primary factors in modeling the topics and identifying labels for automatic tagging.</p>
                <fig fig-type="figure" id="f5" orientation="portrait" position="float">
                    <label>Figure 5. </label>
                    <caption>
                        <title>Phase II&#x2014;setting up parameters snapshot.</title>
                    </caption>
                    <graphic id="gr5" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/142987/9cc56796-b02a-41cb-806f-2ad69f5d5790_figure5.gif"/>
                </fig>
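                <p>To illustrate how the parameters above drive topic modeling, the following sketch fits a small LDA model by collapsed Gibbs sampling. It is an illustration under stated assumptions rather than the prototype&#x2019;s implementation: the function name <monospace>fit_lda</monospace>, the Dirichlet hyperparameters <monospace>alpha</monospace> and <monospace>beta</monospace>, and the random seed are hypothetical, and the optimization interval and model name are omitted for brevity.</p>
                <code language="python" xml:space="preserve"><![CDATA[
```python
import random


def fit_lda(docs, num_topics=2, iterations=200, words_per_topic=3,
            alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return the top words per topic."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]                # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []
    # Assign a random initial topic to every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    topics = []
    for t in range(num_topics):
        ranked = sorted(range(vocab_size),
                        key=lambda v: topic_word[t][v], reverse=True)
        topics.append([vocab[v] for v in ranked[:words_per_topic]])
    return topics
```
]]></code>
                <p>Here <monospace>num_topics</monospace>, <monospace>iterations</monospace>, and <monospace>words_per_topic</monospace> correspond to the number of topics, the number of iterations, and the number of words per topic set in this phase. Each input document is a list of cleaned tokens.</p>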
                <p>
                    <italic toggle="yes">Extract label</italic>
                </p>
                <p>This phase, as shown in 
                    <xref ref-type="fig" rid="f6">Figure 6</xref>, provides the results for the corpus, obtained by processing the pre-processed document with the parameters set in the previous phase. It shows the number of documents uploaded, the total number of words in the document, the number of unique words, the vocabulary density, the readability index, the average number of words per sentence, and, most importantly, the most frequent words in the corpus. These frequent words are extracted as the labels for automatic tagging later on. The user can also set how many of the most frequent words are shown.</p>
                <fig fig-type="figure" id="f6" orientation="portrait" position="float">
                    <label>Figure 6. </label>
                    <caption>
                        <title>Phase III&#x2014;extract label snapshot.</title>
                    </caption>
                    <graphic id="gr6" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/142987/9cc56796-b02a-41cb-806f-2ad69f5d5790_figure6.gif"/>
                </fig>
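                <p>The corpus statistics and label extraction described above can be sketched as follows, assuming the document has already been cleaned into a list of tokens. The function name <monospace>corpus_summary</monospace> and its <monospace>top_n</monospace> parameter are hypothetical. Relative frequency is taken as a term&#x2019;s count divided by the total word count, and vocabulary density as the number of unique words divided by the total number of words.</p>
                <code language="python" xml:space="preserve"><![CDATA[
```python
from collections import Counter


def corpus_summary(tokens, top_n=5):
    """Return basic corpus statistics and the top_n terms as candidate labels."""
    total = len(tokens)
    if total == 0:
        return {"total_words": 0, "unique_words": 0,
                "vocabulary_density": 0.0, "labels": []}
    counts = Counter(tokens)
    # Each label is paired with its relative frequency in the corpus.
    labels = [(term, count / total) for term, count in counts.most_common(top_n)]
    return {
        "total_words": total,
        "unique_words": len(counts),
        "vocabulary_density": len(counts) / total,
        "labels": labels,
    }
```
]]></code>
                <p>The <monospace>top_n</monospace> argument plays the role of the user setting that controls how many of the most frequent words are shown.</p>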
                <p>
                    <italic toggle="yes">Word cloud</italic>
                </p>
                <p>Along with the results of phase III, a word cloud is also generated. Phase IV, as depicted in 
                    <xref ref-type="fig" rid="f7">Figure 7</xref>, is a super word vector view of the frequent words in the processed corpus. The most evident words in the word cloud are the frequently used words from the previous phase, which are 
                    <italic toggle="yes">faculty</italic>, 
                    <italic toggle="yes">activities</italic>, 
                    <italic toggle="yes">library</italic>, 
                    <italic toggle="yes">research</italic>, and 
                    <italic toggle="yes">materials.</italic> The font size of the word is based on how many times this word is used in the corpus.</p>
                <fig fig-type="figure" id="f7" orientation="portrait" position="float">
                    <label>Figure 7. </label>
                    <caption>
                        <title>Phase IV&#x2014;word cloud snapshot.</title>
                    </caption>
                    <graphic id="gr7" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/142987/9cc56796-b02a-41cb-806f-2ad69f5d5790_figure7.gif"/>
                </fig>
                <p>
                    <italic toggle="yes">LDA visualization</italic>
                </p>
                <p>With the result generated during phase III, this phase provides the histogram presentation of the sample processed corpus with the support of the LDA visualization, as shown in 
                    <xref ref-type="fig" rid="f8">Figure 8</xref>. The line graph provides the relative frequencies of each generated label per document segment.</p>
                <fig fig-type="figure" id="f8" orientation="portrait" position="float">
                    <label>Figure 8. </label>
                    <caption>
                        <title>Phase V&#x2014;LDA visualization snapshot.</title>
                    </caption>
                    <graphic id="gr8" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/142987/9cc56796-b02a-41cb-806f-2ad69f5d5790_figure8.gif"/>
                </fig>
                <p>
                    <italic toggle="yes">Auto-tagging to uploaded document</italic>
                </p>
                <p>After the five phases, the generated labels are automatically attached to the document as tags, as shown in 
                    <xref ref-type="fig" rid="f9">Figure 9</xref>. The document is then stored in the file repository of the IQAS, together with its metadata: filename, file size, user, date created, tags, and the link to the processed model. Users can later rename the file and add or remove tags.</p>
                <fig fig-type="figure" id="f9" orientation="portrait" position="float">
                    <label>Figure 9. </label>
                    <caption>
                        <title>Phase VI&#x2014;auto-tagging of labels in the uploaded document snapshot.</title>
                    </caption>
                    <graphic id="gr9" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/142987/9cc56796-b02a-41cb-806f-2ad69f5d5790_figure9.gif"/>
                </fig>
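                <p>The metadata record and the tag operations described above can be pictured with a small Python sketch. The class name <monospace>UploadedDocument</monospace>, its field names, and the sample values are hypothetical and chosen only for illustration; they are not taken from the IQAS source code.</p>
                <code language="python" xml:space="preserve"><![CDATA[
```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class UploadedDocument:
    """Hypothetical metadata record; field names are illustrative only."""
    filename: str
    file_size: int              # assumed to be in bytes
    user: str
    date_created: datetime
    tags: set = field(default_factory=set)
    model_link: str = ""        # link to the processed model

    def rename(self, new_filename):
        """Update the stored filename."""
        self.filename = new_filename

    def add_tag(self, tag):
        """Attach a tag; adding an existing tag has no effect."""
        self.tags.add(tag)

    def remove_tag(self, tag):
        """Detach a tag; removing a missing tag has no effect."""
        self.tags.discard(tag)


# Hypothetical sample values.
doc = UploadedDocument("report.pdf", 2048, "task_group_user", datetime(2023, 1, 1))

# Auto-tagging step: attach the labels generated for the sample document.
for label in ["faculty", "activities", "library", "research", "materials"]:
    doc.add_tag(label)

# Manual corrections after upload.
doc.remove_tag("materials")
doc.rename("accreditation_report.pdf")
```
]]></code>
                <p>Storing the tags as a set is one possible design choice: it keeps each label unique, so tagging the same document twice with the same label leaves the record unchanged.</p>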
                <p>
                    <italic toggle="yes">Computational analysis</italic>
                </p>
                <p>This section walks through the computations behind the results obtained from the processed document.</p>
                <p>Phase III produces four significant results, shown in 
                    <xref ref-type="fig" rid="f6">Figure 6</xref>: vocabulary density, readability index, average words per sentence, and frequently used words. Vocabulary density is the ratio of the number of unique words to the total number of words in the corpus, obtained by dividing the number of unique words by the total word count; for the sample computation, see Equation 1.
                    <disp-formula id="e1">
                        <mml:math display="block">
                            <mml:mtext>Vocabulary density</mml:mtext>
                            <mml:mspace width="0.25em"/>
                            <mml:mfenced close=")" open="(">
                                <mml:mi>VD</mml:mi>
                            </mml:mfenced>
                            <mml:mo>=</mml:mo>
                            <mml:mfrac>
                                <mml:mrow>
                                    <mml:mtext>Number of unique words</mml:mtext>
                                    <mml:mspace width="0.25em"/>
                                    <mml:mfenced close=")" open="(">
                                        <mml:mi>UW</mml:mi>
                                    </mml:mfenced>
                                </mml:mrow>
                                <mml:mrow>
                                    <mml:mtext>Total word count</mml:mtext>
                                    <mml:mspace width="0.25em"/>
                                    <mml:mfenced close=")" open="(">
                                        <mml:mi>WC</mml:mi>
                                    </mml:mfenced>
                                </mml:mrow>
                            </mml:mfrac>
                        </mml:math>
                    </disp-formula>
                    <disp-formula id="e2">
                        <mml:math display="block">
                            <mml:mi>VD</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:mfrac>
                                <mml:mn>720</mml:mn>
                                <mml:mn>2,833</mml:mn>
                            </mml:mfrac>
                        </mml:math>
                    </disp-formula>
                    <disp-formula id="e3">
                        <mml:math display="block">
                            <mml:mi>VD</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:mn>0.254</mml:mn>
                        </mml:math>
                    </disp-formula>
                </p>
                <p>
                    <bold>Equation 1.</bold> Vocabulary density computation.</p>
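                <p>As a minimal illustration of Equation 1, the following Python sketch computes the vocabulary density of a list of tokens. The function name <monospace>vocabulary_density</monospace>, the toy sentence, and the lowercase, whitespace-based tokenization are assumptions made for this example and do not describe how the prototype tokenizes text.</p>
                <code language="python" xml:space="preserve"><![CDATA[
```python
def vocabulary_density(tokens):
    """Return the unique-word count divided by the total word count."""
    if not tokens:
        raise ValueError("the corpus must contain at least one word")
    return len(set(tokens)) / len(tokens)


# Hypothetical toy corpus; lowercasing and whitespace splitting are assumptions.
sample_tokens = "the faculty and the library support the research".lower().split()
sample_vd = vocabulary_density(sample_tokens)   # 6 unique words out of 8

# The figures reported in the text: 720 unique words out of 2,833 words.
reported_vd = round(720 / 2833, 3)
```
]]></code>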
                <p>The vocabulary density of the processed corpus is 0.254, which implies that the corpus contains complex text with many unique words. The readability index and the average words per sentence are computed with Java's BreakIterator, a locale-sensitive class that maintains an imaginary cursor pointing to the current boundary in a string of natural-language text. It supports several kinds of boundaries: characters, words, sentences, and potential line breaks. These boundaries are the basis for the readability index and the average words per sentence, which are 16.106 and 21.5, respectively. Frequently used words are identified by counting the occurrences of each word in the processed corpus.</p>
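                <p>The idea of deriving the average words per sentence from text boundaries can be sketched in Python. This sketch does not use or port Java's break iteration: it swaps in regular expressions as a simplified approximation of sentence and word boundaries, and the function name and both boundary rules are assumptions. It will therefore not necessarily reproduce the values reported by the prototype.</p>
                <code language="python" xml:space="preserve"><![CDATA[
```python
import re


def average_words_per_sentence(text):
    """Approximate sentence and word boundaries with regular expressions.

    This is a simplified stand-in for locale-aware boundary analysis,
    not a port of java.text.BreakIterator.
    """
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        raise ValueError("the text must contain at least one sentence")
    # Assumption: a word is a run of letters, digits, or apostrophes.
    word_counts = [len(re.findall(r"[A-Za-z0-9']+", s)) for s in sentences]
    return sum(word_counts) / len(sentences)


# Hypothetical two-sentence sample: 4 words and 7 words.
sample_text = (
    "The library keeps records. "
    "Faculty members submit research materials every term."
)
sample_average = average_words_per_sentence(sample_text)
```
]]></code>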
                <p>The LDA visualization plots the relative frequency of each word per document segment, as shown in 
                    <xref ref-type="fig" rid="f8">Figure 8</xref>. Computing the relative frequency first requires choosing the number of document segments; for this study, the researchers used 10. The words are grouped into segments according to the total word count. The prototype then counts how many times a particular word occurs in each segment and divides that count by the total word count. For the sample computations, see Equations 2 and 3.
                    <disp-formula id="e4">
                        <mml:math display="block">
                            <mml:mtext>Words</mml:mtext>
                            <mml:mspace width="0.25em"/>
                            <mml:mi>per</mml:mi>
                            <mml:mspace width="0.25em"/>
                            <mml:mtext>segment</mml:mtext>
                            <mml:mspace width="0.25em"/>
                            <mml:mfenced close=")" open="(">
                                <mml:mi>WS</mml:mi>
                            </mml:mfenced>
                            <mml:mo>=</mml:mo>
                            <mml:mfrac>
                                <mml:mrow>
                                    <mml:mtext>Total word count</mml:mtext>
                                    <mml:mspace width="0.25em"/>
                                    <mml:mfenced close=")" open="(">
                                        <mml:mi>WC</mml:mi>
                                    </mml:mfenced>
                                </mml:mrow>
                                <mml:mrow>
                                    <mml:mtext>Desired number of segments</mml:mtext>
                                    <mml:mspace width="0.25em"/>
                                    <mml:mfenced close=")" open="(">
                                        <mml:mi>DNS</mml:mi>
                                    </mml:mfenced>
                                </mml:mrow>
                            </mml:mfrac>
                        </mml:math>
                    </disp-formula>
                    <disp-formula id="e5">
                        <mml:math display="block">
                            <mml:mi>WS</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:mfrac>
                                <mml:mn>2,833</mml:mn>
                                <mml:mn>10</mml:mn>
                            </mml:mfrac>
                        </mml:math>
                    </disp-formula>
                    <disp-formula id="e9">
                        <mml:math display="block">
                            <mml:mi>WS</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:msup>
                                <mml:mn>283.3</mml:mn>
                                <mml:mo>*</mml:mo>
                            </mml:msup>
                        </mml:math>
                    </disp-formula>
                </p>
                <p>* The first seven segments contain 283 words each, while the last three segments contain 284 words each.</p>
                <p>
                    <bold>Equation 2.</bold> Words per segment computation.
                    <disp-formula id="e6">
                        <mml:math display="block">
                            <mml:mtext>Relative frequency</mml:mtext>
                            <mml:mspace width="0.25em"/>
                            <mml:mfenced close=")" open="(">
                                <mml:mi>RF</mml:mi>
                            </mml:mfenced>
                            <mml:mo>=</mml:mo>
                            <mml:mfrac>
                                <mml:mrow>
                                    <mml:mtext>Word count</mml:mtext>
                                    <mml:mspace width="0.25em"/>
                                    <mml:mi>per</mml:mi>
                                    <mml:mspace width="0.25em"/>
                                    <mml:mtext>segment</mml:mtext>
                                    <mml:mspace width="0.25em"/>
                                    <mml:mfenced close=")" open="(">
                                        <mml:mi>WCS</mml:mi>
                                    </mml:mfenced>
                                </mml:mrow>
                                <mml:mrow>
                                    <mml:mtext>Total word count</mml:mtext>
                                    <mml:mspace width="0.25em"/>
                                    <mml:mfenced close=")" open="(">
                                        <mml:mi>WC</mml:mi>
                                    </mml:mfenced>
                                </mml:mrow>
                            </mml:mfrac>
                        </mml:math>
                    </disp-formula>
                    <disp-formula id="e7">
                        <mml:math display="block">
                            <mml:mi>RF</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:mfrac>
                                <mml:mn>2</mml:mn>
                                <mml:mn>2,833</mml:mn>
                            </mml:mfrac>
                        </mml:math>
                    </disp-formula>
                    <disp-formula id="e8">
                        <mml:math display="block">
                            <mml:mi>RF</mml:mi>
                            <mml:mo>=</mml:mo>
                            <mml:mn>0.0007060</mml:mn>
                        </mml:math>
                    </disp-formula>
                </p>
                <p>
                    <bold>Equation 3.</bold> Sample computation for relative frequency (word: research, 2
                    <sup>nd</sup> segment).</p>
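                <p>The segmentation and relative frequency computations can be sketched in Python as follows. The function names are hypothetical, and placing the larger segments last is an assumption chosen to match the note that the first seven segments hold 283 words and the last three hold 284; the prototype may distribute the remainder differently.</p>
                <code language="python" xml:space="preserve"><![CDATA[
```python
def segment_sizes(total_words, segments):
    """Split total_words into near-equal segment sizes, larger segments last."""
    if segments < 1:
        raise ValueError("at least one segment is required")
    base, remainder = divmod(total_words, segments)
    return [base] * (segments - remainder) + [base + 1] * remainder


def relative_frequencies(tokens, word, segments=10):
    """Return the count of `word` in each segment divided by the total word count."""
    if not tokens:
        raise ValueError("the corpus must contain at least one word")
    result, start = [], 0
    for size in segment_sizes(len(tokens), segments):
        chunk = tokens[start:start + size]
        result.append(chunk.count(word) / len(tokens))
        start += size
    return result


# Total word count and segment count reported in the text.
sizes = segment_sizes(2833, 10)

# "research" occurs twice in the 2nd segment (Equation 3).
sample_rf = round(2 / 2833, 6)
```
]]></code>
                <p>Because every segment count is divided by the total word count rather than by the segment length, the per-segment values for a word add up to that word's overall relative frequency in the corpus.</p>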
                <p>For the overall results of the histogram, 
                    <xref ref-type="table" rid="T1">Tables 1</xref> and 
                    <xref ref-type="table" rid="T2">2</xref> present, respectively, the word count and the relative frequency of each label per document segment.</p>
                <table-wrap id="T1" orientation="portrait" position="float">
                    <label>Table 1. </label>
                    <caption>
                        <title>Word count of labels per document segment.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="2" valign="top">Labels</th>
                                <th align="left" colspan="10" rowspan="1" valign="top">Word count per document segment</th>
                                <th align="left" colspan="1" rowspan="2" valign="top">Total count</th>
                            </tr>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">1</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">2</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">3</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">4</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">5</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">6</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">7</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">8</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">9</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">10</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">Faculty</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">13</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">5</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">28</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">Activities</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">5</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">6</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">6</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">25</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">Library</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">14</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">5</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">21</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">Research</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">12</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">4</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">21</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">Materials</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">6</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">18</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
                <table-wrap id="T2" orientation="portrait" position="float">
                    <label>Table 2. </label>
                    <caption>
                        <title>Relative frequency of labels per document segment.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="2" valign="top">Labels</th>
                                <th align="left" colspan="10" rowspan="1" valign="top">Relative frequency per document segment</th>
                                <th align="left" colspan="1" rowspan="2" valign="top">Total relative frequency</th>
                            </tr>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">1</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">2</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">3</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">4</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">5</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">6</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">7</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">8</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">9</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">10</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Faculty</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.000353</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.004589</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.001059</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.001765</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.000353</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.000706</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.000353</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.000353</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.000353</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.009884</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Activities</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.001059</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.001765</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.000353</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.002118</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.000706</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.000706</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.002118</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.008825</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Library</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.000353</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.004942</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.001765</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.000353</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.007413</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Research</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.000706</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.000706</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.004236</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.000353</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.001412</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.007413</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="middle">Materials</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.001059</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.002118</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.001059</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.001059</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.001059</td>
                                <td align="left" colspan="1" rowspan="1" valign="middle">0.006354</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
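                <p>The relationship between the two tables can be checked with a short Python sketch that divides each word count in Table 1 by the total word count of 2,833. The variable names and the rounding to six decimal places are assumptions made for this illustration.</p>
                <code language="python" xml:space="preserve"><![CDATA[
```python
TOTAL_WORDS = 2833  # total word count reported in the text

# Word counts per document segment, copied from Table 1.
word_counts = {
    "Faculty":    [1, 13, 3, 5, 1, 2, 1, 1, 0, 1],
    "Activities": [0, 3, 5, 1, 6, 0, 0, 2, 2, 6],
    "Library":    [0, 0, 0, 0, 1, 14, 5, 1, 0, 0],
    "Research":   [0, 2, 2, 12, 0, 1, 0, 0, 4, 0],
    "Materials":  [3, 6, 0, 3, 0, 0, 0, 0, 3, 3],
}

# Relative frequency of each label per segment (count / total word count).
relative = {
    label: [round(count / TOTAL_WORDS, 6) for count in counts]
    for label, counts in word_counts.items()
}

# Overall relative frequency of each label across all ten segments.
totals = {
    label: round(sum(counts) / TOTAL_WORDS, 6)
    for label, counts in word_counts.items()
}
```
]]></code>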
            </sec>
        </sec>
        <sec id="sec7" sec-type="conclusions">
            <title>Conclusions</title>
            <p>CSPC is still at an exploratory stage in solving its accreditation-related problems, and the organization evidently encounters difficulties in the accreditation process. The researchers therefore devised a model that supports the organization during accreditation and designed a prototype that implements the model. As a result, retrieving and classifying the data, which is the main problem of the task group, becomes easier. In future work, other text classification approaches may also be integrated into the system and their results compared using given parameters.</p>
        </sec>
        <sec id="sec8">
            <title>Software availability</title>
            <p>Software available from: 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/CraigList056/iqas/tree/v1.0.0-alpha">https://github.com/CraigList056/iqas/tree/v1.0.0-alpha</ext-link>
            </p>
            <p>Source code available from: 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/CraigList056/iqas">https://github.com/CraigList056/iqas</ext-link>
            </p>
            <p>Archived source code at time of publication: 
                <ext-link ext-link-type="uri" xlink:href="https://www.doi.org/10.5281/zenodo.7507492">https://www.doi.org/10.5281/zenodo.7507492</ext-link>
            </p>
            <p>
                <bold>License:</bold> 
                <ext-link ext-link-type="uri" xlink:href="https://opensource.org/licenses/MIT">MIT License</ext-link>
            </p>
        </sec>
    </body>
    <back>
        <ack>
            <title>Acknowledgements</title>
            <p>We would like to express our great appreciation to our colleagues and friends for their undeniable support and for uplifting our spirits to make this research paper possible. We would like to also extend our appreciation to our respective institutions (Camarines Sur Polytechnic Colleges and University of the Cordilleras), which have been our second home and witness of our efforts during the research process.</p>
        </ack>
        <ref-list>
            <title>References</title>
            <ref id="ref1">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Akhter</surname>
                            <given-names>MP</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Jiangbin</surname>
                            <given-names>Z</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Naqvi</surname>
                            <given-names>IR</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Document-Level Text Classification Using Single-Layer Multisize Filters Convolutional Neural Network.</article-title>
                    <source>

                        <italic toggle="yes">IEEE Access.</italic>
</source>
                    <year>2020</year>;<volume>8</volume>(<issue>Ml</issue>):<fpage>42689</fpage>&#x2013;<lpage>42707</lpage>.
                    <pub-id pub-id-type="doi">10.1109/ACCESS.2020.2976744</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref2">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bhagya Sri</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bhavsar</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Narooka</surname>
                            <given-names>P</given-names>
                        </name>
</person-group>:
                    <article-title>String Matching Algorithms.</article-title>
                    <source>

                        <italic toggle="yes">International Journal Of Engineering And Computer Science.</italic>
</source>
                    <year>2018</year>;<volume>7</volume>(<issue>03</issue>):<fpage>23769</fpage>&#x2013;<lpage>23772</lpage>.
                    <pub-id pub-id-type="doi">10.18535/ijecs/v7i3.19</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref3">
                <mixed-citation publication-type="other">
                    <collab>CraigList056</collab>:
                    <article-title>CraigList056/iqas: Initial Release (v1.0.0-alpha).</article-title>
                    <source>

                        <italic toggle="yes">Zenodo.</italic>
</source>
                    <year>2023</year>.
                    <pub-id pub-id-type="doi">10.5281/zenodo.7507492</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref4">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Trivedi</surname>
                            <given-names>D</given-names>
                        </name>
</person-group>:
                    <article-title>Agile Methodologies.</article-title>
                    <source>

                        <italic toggle="yes">International Journal of Computer Science &amp; Communication.</italic>
</source>
                    <year>2021</year>;<volume>12</volume>(<issue>2</issue>):<fpage>91</fpage>&#x2013;<lpage>100</lpage>.</mixed-citation>
            </ref>
            <ref id="ref5">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jatnika</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bijaksana</surname>
                            <given-names>MA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Suryani</surname>
                            <given-names>AA</given-names>
                        </name>
</person-group>:
                    <article-title>Word2vec model analysis for semantic similarities in English words.</article-title>
                    <source>

                        <italic toggle="yes">Procedia Computer Science.</italic>
</source>
                    <year>2019</year>;<volume>157</volume>:<fpage>160</fpage>&#x2013;<lpage>167</lpage>.
                    <pub-id pub-id-type="doi">10.1016/j.procs.2019.08.153</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref6">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kwok</surname>
                            <given-names>L</given-names>
                        </name>
</person-group>:
                    <article-title>A vision for the development of i-campus.</article-title>
                    <source>

                        <italic toggle="yes">Smart Learning Environments.</italic>
</source>
                    <year>2015</year>;<volume>2</volume>(<issue>1</issue>):<fpage>1</fpage>&#x2013;<lpage>12</lpage>.
                    <pub-id pub-id-type="doi">10.1186/s40561-015-0009-8</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref7">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ma</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Liu</surname>
                            <given-names>X</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zhao</surname>
                            <given-names>L</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Hybrid embedding-based text representation for hierarchical multi-label text classification.</article-title>
                    <source>

                        <italic toggle="yes">Expert Systems with Applications.</italic>
</source>
                    <year>2021</year>;<volume>187</volume>(<issue>July 2020</issue>):<fpage>115905</fpage>.
                    <pub-id pub-id-type="doi">10.1016/j.eswa.2021.115905</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref8">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Martin&#x010d;i&#x0107;-Ip&#x0161;i&#x0107;</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Mili&#x010d;i&#x0107;</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Todorovski</surname>
                            <given-names>L</given-names>
                        </name>
</person-group>:
                    <article-title>The influence of feature representation of text on the performance of document classification.</article-title>
                    <source>

                        <italic toggle="yes">Applied Sciences (Switzerland).</italic>
</source>
                    <year>2019</year>;<volume>9</volume>(<issue>4</issue>).
                    <pub-id pub-id-type="doi">10.3390/app9040743</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref9">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Meng</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Liang</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cao</surname>
                            <given-names>F</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A new distance with derivative information for functional k-means clustering algorithm.</article-title>
                    <source>

                        <italic toggle="yes">Information Sciences.</italic>
</source>
                    <year>2018</year>;<volume>463-464</volume>:<fpage>166</fpage>&#x2013;<lpage>185</lpage>.
                    <pub-id pub-id-type="doi">10.1016/j.ins.2018.06.035</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref10">
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ng</surname>
                            <given-names>JWP</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Azarmi</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Leida</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The intelligent campus (iCampus): End-to-end learning lifecycle of a knowledge ecosystem.</article-title>
                    <source>

                        <italic toggle="yes">Proceedings - 2010 6th International Conference on Intelligent Environments, IE 2010.</italic>
</source>
                    <year>2010</year>;<fpage>332</fpage>&#x2013;<lpage>337</lpage>.
                    <pub-id pub-id-type="doi">10.1109/IE.2010.68</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref11">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rashid</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Adnan Shah</surname>
                            <given-names>SM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Irtaza</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Topic Modeling Technique for Text Mining over Biomedical Text Corpora through Hybrid Inverse Documents Frequency and Fuzzy K-Means Clustering.</article-title>
                    <source>

                        <italic toggle="yes">IEEE Access.</italic>
</source>
                    <year>2019</year>;<volume>7</volume>:<fpage>146070</fpage>&#x2013;<lpage>146080</lpage>.
                    <pub-id pub-id-type="doi">10.1109/ACCESS.2019.2944973</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref12">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>GY</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>CD</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Huang</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>TW-Co-k-means: Two-level weighted collaborative k-means for multi-view clustering.</article-title>
                    <source>

                        <italic toggle="yes">Knowledge-Based Systems.</italic>
</source>
                    <year>2018</year>;<volume>150</volume>:<fpage>127</fpage>&#x2013;<lpage>138</lpage>.
                    <pub-id pub-id-type="doi">10.1016/j.knosys.2018.03.009</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report165836">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.142987.r165836</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Naseem</surname>
                        <given-names>Shahid</given-names>
                    </name>
                    <xref ref-type="aff" rid="r165836a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-0791-541X</uri>
                </contrib>
                <aff id="r165836a1">
                    <label>1</label>Department of Information Sciences, Division of Science &amp; Technology, University of Education, Lahore, Pakistan</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>29</day>
                <month>3</month>
                <year>2023</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2023 Naseem S</copyright-statement>
                <copyright-year>2023</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport165836" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.130245.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Considering the paper's overall structure, presentation and, above all, its content, I would say the paper requires minor changes before it can be accepted for indexing.
                <list list-type="order">
                    <list-item>
                        <p>In this study, the indexing of the tags/titles and sub-titles is missing.</p>
                    </list-item>
                    <list-item>
                        <p>There are a number of sentence and grammar mistakes in different sections of the paper.</p>
                    </list-item>
                    <list-item>
                        <p>In the results section, the number of students in the batch to be accredited, financial statements, and infrastructure documents must also be included, because these documents are also required in the accreditation process.</p>
                    </list-item>
                    <list-item>
                        <p>In the related work section, there should be structured or labelled data instead of pre-defined data items.</p>
                    </list-item>
                    <list-item>
                        <p>In the second paragraph of the related work section, the authors define four types of machine learning techniques but do not explain for what purpose these four techniques were used in this study.</p>
                    </list-item>
                    <list-item>
                        <p>In Figure 2, one more step, i.e. maintenance, should be included.</p>
                    </list-item>
                    <list-item>
                        <p>All the equations used in this study must be numbered.</p>
                    </list-item>
                    <list-item>
                        <p>In Equation 1, explain the procedure for calculating VD and where the values used to calculate VD come from.</p>
                    </list-item>
                    <list-item>
                        <p>The number of references used to validate this study is too small. More literature should be reviewed to support the approach used in this research.</p>
                    </list-item>
                </list>
            </p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Yes</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Yes</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Partly</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Yes</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Artificial Intelligence, Machine Learning, and Deep learning for analyzing healthcare data</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment10895-165836">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Prianes</surname>
                            <given-names>Freddie</given-names>
                        </name>
                        <aff>Camarines Sur Polytechnic Colleges, Philippines</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>14</day>
                    <month>1</month>
                    <year>2024</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We would like to express our sincere appreciation for your review of our paper. Your comments and suggestions are highly valued. Our responses are as follows:</p>
                <p>1. In this study, the indexing of the tags/titles and sub-titles is missing. 
                    <list list-type="bullet">
                        <list-item>
                            <p>This comment is somewhat unclear to us. If it pertains to the generated tags/titles or sub-titles for indexing the documents, these are shown in Fig. 9 &#x2013; Phase VI.</p>
                        </list-item>
                    </list> 2. There are a number of sentence and grammar mistakes in different sections of the paper. 
                    <list list-type="bullet">
                        <list-item>
                            <p>Accomplished</p>
                        </list-item>
                    </list> 3. In the results section, the number of students in the batch to be accredited, financial statements, and infrastructure documents must also be included, because these documents are also required in the accreditation process. 
                    <list list-type="bullet">
                        <list-item>
                            <p>Since the study is an exploratory analysis, we focused first on the areas of Faculty and Library. Upon implementation of the prototype, we will include the other areas, i.e. Students, Finance, and Infrastructure.</p>
                        </list-item>
                    </list> 4. In the related work section, there should be structured or labelled data instead of pre-defined data items. 
                    <list list-type="bullet">
                        <list-item>
                            <p>Accomplished</p>
                        </list-item>
                    </list> 5. In the second paragraph of the related work section, the authors define four types of machine learning techniques but do not explain for what purpose these four techniques were used in this study. 
                    <list list-type="bullet">
                        <list-item>
                            <p>Accomplished</p>
                        </list-item>
                    </list> 6. In Figure 2, one more step, i.e. maintenance, should be included. 
                    <list list-type="bullet">
                        <list-item>
                            <p>We include maintenance as a sub-phase of launch. We did not elaborate on this phase because we are conducting a separate study on prototype testing and implementation, which will cover the maintenance procedure.</p>
                        </list-item>
                    </list> 7. All the equations used in this study must be numbered. 
                    <list list-type="bullet">
                        <list-item>
                            <p>We believe that all the equations in this study have values and have been numbered.</p>
                        </list-item>
                    </list> 8. In Equation 1, explain the procedure for calculating VD and where the values used to calculate VD come from. 
                    <list list-type="bullet">
                        <list-item>
                            <p>Accomplished</p>
                        </list-item>
                    </list> 9. The number of references used to validate this study is too small. More literature should be reviewed to support the approach used in this research. 
                    <list list-type="bullet">
                        <list-item>
                            <p>Accomplished</p>
                        </list-item>
                    </list> We have already submitted version 2 of our paper. Thank you.</p>
            </body>
        </sub-article>
    </sub-article>
</article>
