Research Article
Revised

Developing an Application for Document Analysis with Latent Dirichlet Allocation: A Case Study in Integrated Quality Assurance System

[version 3; peer review: 1 approved, 1 approved with reservations, 2 not approved]
Previously titled: Modeling document labels using Latent Dirichlet allocation for archived documents in Integrated Quality Assurance System (IQAS)
PUBLISHED 09 Apr 2024

This article is included in the Artificial Intelligence and Machine Learning gateway.

Abstract

Background

As part of the transition of every higher education institution in the Philippines into an intelligent campus, the Commission on Higher Education has launched a program for the development of smart campuses in state universities and colleges to improve operational efficiency in the country. In line with the commitment of Camarines Sur Polytechnic Colleges to improve its accreditation operations and to resolve the evident problems in the accreditation process, the researchers propose this study as part of an Integrated Quality Assurance System that aims to develop an intelligent model for categorizing and automating the tagging of archived documents used during accreditation.

Methods

As a guide in modeling the study, the researchers used an agile method, as it promotes flexibility, speed, and, most importantly, continuous improvement in developing, testing, and documenting the software, even after delivery. This method helped the researchers design a prototype that implements the model to aid file searching and label tagging. A computational analysis is also included to further explain the results of the devised model.

Results

From the processed sample corpus, the extracted document labels are faculty, activities, library, research, and materials. These labels are based on their total relative frequencies of 0.009884, 0.008825, 0.007413, 0.007413, and 0.006354, respectively, each computed as the ratio of the number of times the term is used in the document to the total word count of the whole document.

Conclusions

The devised model and prototype support the organization in storing and categorizing accreditation documents. This makes retrieving and classifying the data easier, which was the main problem of the task group. Further, other clustering, modeling, and text classification techniques can be integrated into the prototype.

Keywords

Latent Dirichlet allocation, document labels, natural language processing, accreditation, quality assurance, intelligent model, CSPC

Revised Amendments from Version 2

Based on the feedback of the reviewer, we have adopted the new title as suggested.

See the authors' detailed response to the review by Zbigniew H. Gontar
See the authors' detailed response to the review by Shahid Naseem

Introduction

The creation of a smart campus is a step toward the creation of a smart city. Teaching and learning will become more challenging in the future due to the rapid advancements in information and communication technology (Kwok, 2015). With this rapid advancement, there is already a shift from the “smart” era to the “intelligent” era. A “smartphone,” “smart building,” or “smart home” is capable of adapting to changing conditions. The term “intelligent,” on the other hand, refers to more than just being smart: it refers to the ability to think, reason, understand, and adapt to changing conditions. Applied to devices, “smart devices” can perform tricks, but “intelligent devices” can learn new tricks in response to their changing surroundings (Ng et al., 2010).

As part of every Higher Education Institution (HEI)’s transition to an intelligent campus, the Commission on Higher Education (CHED) has launched a program under CHED Memorandum Order No. 9 s. 2020 for developing smart campuses in State Universities and Colleges (SUCs). In fact, CHED releases a budget to assist SUCs in developing smart campuses in which HEIs weave next-generation digital technologies seamlessly into a well-architected infrastructure, developing tools to enhance teaching and learning, research, and extension, as well as to improve operational efficiency. At the same time, to maintain the quality of education in HEIs, CHED assigns accountability and responsibility to accrediting bodies, such as the Accrediting Agency of Chartered Colleges and Universities of the Philippines (AACCUP), the Philippine Association of Colleges and Universities Commission on Accreditation (PACU-COA), the Philippine Accrediting Association of Schools, Colleges, and Universities (PAASCU), and many others, to assess and certify the quality of education in accredited programs/institutions, as stated in CHED Memorandum Order No. 1 s. 2005.

Achieving a smart/intelligent campus requires the institution to consider different areas. Based on the study of Ng et al., there are six main areas of intelligence, namely (1) iLearning, (2) iManagement, (3) iGovernance, (4) iSocial, (5) iHealth, and (6) iGreen. The accreditation process alone falls under iManagement; however, the overall aspect and purpose of accreditation cut across all the areas.

As a state college, Camarines Sur Polytechnic Colleges (CSPC) will be one setting for the initial implementation of the system. As part of CSPC’s goal to be a center for development and a center of excellence, the institution opted to go along with the launch of the CHED program to become one of the smart campuses in the region. In connection with this, the institution also undergoes continuous accreditation through AACCUP, as depicted in Figure 1, and ISO quality assurance to achieve this goal and gain university status as the Polytechnic University of Bicol.


Figure 1. Accrediting Agency of Chartered Colleges and Universities of the Philippines (AACCUP) accreditation process.

As shown in Figure 1, the accreditation process passes through various phases or actions: (a) Application: an educational institution submits an application to AACCUP for accreditation. (b) Institutional self-survey: after the application has been approved, the applicant institution is expected to conduct an internal evaluation by its internal accreditors to determine whether the program is ready for an external review. (c) Preliminary survey visit: external accreditors evaluate the program for the first time; after passing the assessment, the program is eligible for Candidate status, which is good for two years. (d) First formal survey visit: the program that has obtained Candidate status is reviewed, and if it has met a higher standard of excellence, it is given Level I Accredited status, valid for three years. (e) Second survey visit: an accredited program is evaluated, and if it has met the standards for a greater degree of quality than in the preceding survey visit, it may be eligible for Level II Re-accreditation status, valid for five years. (f) Third survey visit: after five years of holding Level II Re-accreditation status, the program completes the accreditation level; it is reviewed and must perform exceptionally in four categories, namely instruction and extension, which are essential, and two other areas selected from among research, performance in licensure examinations, faculty development, and linkages. (g) Fourth survey visit: a more difficult level that, if passed, may grant the organization institutional accreditation status.

The tedious accreditation process is accompanied by the many documents that need to be produced. In CSPC’s current accreditation undertakings, most of the tasks have been done manually. Though tools are available for cloud storage and automation, like Google Drive and Dropbox, personnel still experience problems such as repetition of work, invalid instruments, inefficient resource utilization, and inefficient monitoring before, during, and after the accreditation. Given these problems, an integrated system dedicated to quality assurance processes is a must.

In pursuit of CSPC’s goal of becoming a university and a smart/intelligent campus, the researchers propose a centralized system that will cater to the institution’s needs in the accreditation process, which is part of quality assurance. Through this study, CSPC will benefit as a smart/intelligent campus by using the system in the iManagement area while, at the same time, addressing the problems encountered during the accreditation process.

Based on the problems identified and the institution’s commitment to becoming a smart/intelligent campus, the researchers propose this study as a component of the Integrated Quality Assurance System (IQAS) (RRID: SCR_023146). The study focuses on the documents required for the accreditation process. The system will maintain a repository of archived documents and analyze them through intelligent modeling, categorizing the documents by the extracted labels.

The study aims to create a model supporting the categorization and automated tagging of archived documents used during accreditation.

Related works

The exponential growth of electronic documents makes finding a relevant document in unstructured data more difficult and time-consuming. Text document classification, which organizes unstructured documents into pre-defined classes, is crucial to information processing and retrieval (Akhter et al., 2020). Text documents pose several difficult data processing problems for retrieving pertinent data. One popular method for information retrieval based on themes in biomedical documents is topic modeling, where finding the correct subjects in the documents is difficult; additionally, redundancy in biomedical text documents has a detrimental effect on text mining quality. As a result, the exponential rise of unstructured documents necessitates developing topic-modeling machine-learning approaches (Rashid et al., 2019). In the framework of document categorization, Martinčić-Ipšić et al. (2019) conducted a comparative analysis of three models for feature representation of text documents: the most popular family of bag-of-words models, the recently proposed continuous-space models Word2Vec and Doc2Vec, and a model based on representing text documents as language networks.

Based on the previous articles, unstructured text data refers to textual information that lacks a predefined structure, making it challenging to analyze and extract meaningful insights. Latent Dirichlet Allocation (LDA) is a probabilistic generative model that can be used for topic modeling, a technique that helps uncover latent topics within a collection of documents (Curiskis et al., 2020). When applied to document labeling, LDA can assist in organizing unstructured text into structured representations based on underlying topics (Maier et al., n.d.).
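
A minimal sketch of this idea, using the gensim library on a toy corpus, is shown below; the documents, topic count, and training settings are illustrative assumptions, not the configuration used in the IQAS prototype.

```python
# Minimal LDA topic-modeling sketch with gensim (illustrative only).
from gensim import corpora
from gensim.models import LdaModel

documents = [
    "faculty development seminar on research and instruction",
    "library materials and resources for accreditation activities",
    "research publications of the faculty in the institution",
]

# Tokenize, then build the dictionary and bag-of-words corpus.
texts = [doc.lower().split() for doc in documents]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]

# Fit a small LDA model and inspect the discovered topics;
# the top words per topic are candidate document labels.
lda = LdaModel(corpus=corpus, id2word=dictionary,
               num_topics=2, passes=10, random_state=1)
for topic_id, words in lda.print_topics(num_words=3):
    print(topic_id, words)
```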

The LDA algorithm is well suited for text mining: it enables the user to extract important content from text data sets (Tong & Zhang, 2016). Apart from this, it converts inaccessible data into a structured format that can be used for further analysis, and it surfaces facts and relationships from large data sets (Yehia et al., 2016). This information is extracted and converted into structured data for visualization, analysis, and integration, and is refined using machine-learning methods (Gnanavel et al., 2022).

The output of LDA is structured data that organizes documents into topics, allowing for the identification of the most significant topics in the corpus and their associated words. This structured data provides insights into the underlying structure and themes of the corpus, enabling further analysis and interpretation (Liu et al., 2023).

By employing LDA, unstructured text data is transformed into structured representations through the identification of latent topics, facilitating improved organization, retrieval, and analysis of textual information (Camilleri & Miah, 2021). The labeled documents provide a meaningful and interpretable way to understand the content and themes within the corpus (Markowitz, 2021).

One study used word representation techniques to analyze how the similarity between English words is calculated. A similar work used the Word2Vec paradigm to express words as vectors; the 320,000 English Wikipedia articles included in that study’s model served as the corpus, and the similarity value was calculated using cosine similarity (Jatnika et al., 2019). Real-world text categorization problems frequently involve a multitude of closely related categories arranged in a taxonomy or hierarchical structure, and hierarchical multi-label text categorization has grown more difficult when processing huge sets of closely related categories (Ma et al., 2021).

A popular technique for clustering functional data is the functional k-means algorithm, which does not further consider derivative information when determining how similar two functional samples are, even though derivative information is crucial for spotting variances in trend characteristics among functional data; one paper establishes a novel distance that compares functional samples by including their derivative information (Meng et al., 2018). Due to its capacity to analyze data from numerous sources or views, multi-view clustering has drawn growing interest in recent years. One study presented a multi-view clustering method called Two-level Weighted Collaborative k-means (TW-Co-k-means) to simultaneously address consistency across different views and weigh the views to improve clustering results; its objective function leverages the unique information in each view while cooperatively exploiting the complementarity and consistency between views (Zhang et al., 2018).

Pattern matching algorithms locate every instance of a constrained set of patterns inside an input text or document to examine its content. One study compared four string matching techniques now in use: the Brute Force approach, the Knuth–Morris–Pratt (KMP) algorithm, the Boyer–Moore algorithm, and the Rabin–Karp algorithm (Bhagya Sri et al., 2018). Analogous to the way the researcher explores all possible combinations using the functionality of LDA, the Brute Force approach exhaustively considers all possible topics and their distribution in the document (Robinson & Quinn, 2018); however, it is inefficient, much like considering every possible combination of words as potential topics (Murray et al., 2022). The KMP algorithm’s efficiency in skipping unnecessary comparisons (Lu, 2019) parallels how LDA efficiently identifies topics in documents by leveraging models, optimizing the search for meaningful patterns (topics) in text (Rawat et al., 2022). In terms of skipping portions of the text based on information gathered during preprocessing, LDA skips irrelevant words and focuses on key terms that contribute to topic identification (Hwang et al., 2023), similar to the Boyer–Moore algorithm skipping portions of text during the matching process (Danvy & Rohde, 2006). Moreover, Rabin–Karp’s hashing for efficient matching (Siahaan, 2018) is akin to LDA’s modeling to identify relevant topics in documents while quickly bypassing irrelevant information (Asmussen & Møller, 2019).

All the literature listed shares similarities in text clustering, modeling, and classification. It supports the feasibility of the study and shows that the proposed intelligent model can be integrated to further assist in the accreditation process at CSPC.
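
As an illustration of the exact-matching side of these comparisons, the sketch below is a textbook Knuth–Morris–Pratt implementation in Python; it is provided for reference only and is not code from the IQAS prototype.

```python
def kmp_search(text, pattern):
    """Return the start indices of every occurrence of pattern in text."""
    if not pattern:
        return []
    # Failure table: length of the longest proper prefix of the pattern
    # that is also a suffix of pattern[:i + 1].
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    # Scan the text, reusing the table to skip redundant comparisons.
    matches, k = [], 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]
    return matches

print(kmp_search("accreditation documentation", "at"))  # [8, 22]
```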

Methods

As a guide in modeling the study, the researchers used the agile method (https://dx.doi.org/10.17504/protocols.io.n2bvj82mxgk5/v2), as it promotes flexibility, speed, and, most importantly, continuous improvement in developing, testing, and documenting the software, even after delivery. Since the phases of this model are light, teams are not bound to a rigid, systematic process with pre-set constraints and restrictions, as in some other models like the waterfall model, and can adjust to changes whenever needed. This flexibility at every stage propagates creativity and freedom within processes. Furthermore, development teams can modify and re-prioritize the backlog, allowing for speedy implementation (Trivedi, 2021).

Following the agile methodology, the researchers adopted the stages presented in Figure 2. These are (1) Plan: the researchers collected previous documents involved in the accreditation process, such as compliance reports for students, faculty, facilities, the library, and administration, and studied the existing problems in tracking, tagging, and duplicating these documents during the accreditation process. (2) Design: the requirement specifications were identified in this stage based on the HEI’s existing problems in tracking, tagging, and duplicating the documents for accreditation and quality assurance; the researchers also designed the process of the intelligent model, which serves as the basis for document labeling. (3) Develop: this stage is intended for the creation of the prototype, which involves processing the documents to identify the proper label for each document. (4) Deploy: the prototype undergoes a test run during this stage. (5) Review: the researchers conduct a checklist function review to check that each component runs properly. Lastly, (6) Launch: the prototype is embedded in the local system of the HEI, together with the maintenance procedure, upon full implementation of the system.


Figure 2. Agile methodology.

Results and discussion

Intelligent model

The results from this intelligent model are used for visualization in the super word vector and the histogram. The super word vector is presented as a word cloud map to visualize the frequency of the words in the corpus, and the histogram presents the relationships of the words per sentence in the form of line graphs. The extracted labels and the generated word vector and histogram are tagged and linked to the uploaded document, following the process shown in Figure 3. This model is implemented in the IQAS to assist categorization and searching in the file repository of accreditation documents.


Figure 3. Process of the intelligent model.

Prototype

The design prototype presented in this section is focused on the label extraction feature for automatic tagging of the archived documents used in accreditation.

Upload and clean

As shown in Figure 4, this phase allows the user to upload and clean the document through tokenization. Once the document is uploaded, the user may set the configuration for cleaning it. The options are removing numbers, symbols, and duplicates; adding and uploading additional stopwords; and showing and downloading the pre-processed data. Other useful features help manage the stopwords, such as showing the list of default stopwords and deleting the added and uploaded stopwords.


Figure 4. Phase I—upload and clean snapshot.
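
A minimal sketch of this cleaning step is given below, assuming regex-based tokenization and a small illustrative stopword list; the prototype’s actual tokenizer, default stopwords, and options may differ.

```python
# Sketch of Phase I cleaning: tokenize, drop numbers and symbols,
# remove stopwords, and optionally de-duplicate tokens.
import re

DEFAULT_STOPWORDS = {"the", "a", "an", "of", "and", "in", "for", "to"}

def clean(text, extra_stopwords=(), remove_duplicates=False):
    tokens = re.findall(r"[a-z]+", text.lower())  # keeps letters only
    stop = DEFAULT_STOPWORDS | set(extra_stopwords)
    tokens = [t for t in tokens if t not in stop]
    if remove_duplicates:                         # keep first occurrence
        tokens = list(dict.fromkeys(tokens))
    return tokens

print(clean("The faculty conducted 3 research activities in the library."))
# ['faculty', 'conducted', 'research', 'activities', 'library']
```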

Setting up parameters

Phase II is intended to set up the parameters for topic modeling, as presented in Figure 5. Right after uploading and cleaning the document, the user can set the topic modeling parameters to identify and extract the labels. The parameters included are the desired number of topics, the frequency of iteration, the number of words per topic to be generated, the optimization interval, and the model’s name. These parameters are the primary factors in modeling the topics and identifying labels for automatic tagging.


Figure 5. Phase II—setting up parameters snapshot.
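
The parameter set can be pictured as in the sketch below. The field names mirror the UI described above; mapping them onto gensim’s LdaModel is an assumption on our part, and the optimization interval (a MALLET-style option with no direct gensim equivalent) is carried only as metadata.

```python
# Sketch of the Phase II parameters feeding the topic model.
from gensim.models import LdaModel

params = {
    "num_topics": 5,              # desired number of topics
    "iterations": 1000,           # frequency of iteration
    "words_per_topic": 10,        # words to generate per topic
    "optimization_interval": 10,  # MALLET-style option, metadata here
    "model_name": "iqas-accreditation-v1",  # hypothetical name
}

def train(corpus, dictionary, p=params):
    """Train an LDA model from a bag-of-words corpus and dictionary."""
    model = LdaModel(corpus=corpus, id2word=dictionary,
                     num_topics=p["num_topics"],
                     iterations=p["iterations"])
    return model.print_topics(num_words=p["words_per_topic"])
```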

Extract label

This phase, as shown in Figure 6, provides the results for the corpus produced from the pre-processed document and the parameters set up in the previous phase. It shows the number of documents uploaded, the total number of words in the document, the number of unique words, the vocabulary density, the readability index, the average words per sentence, and, most importantly, the frequent words in the corpus. These frequently used words are extracted to become the labels for automatic tagging later. The user can also set how many of the most frequent words are shown.


Figure 6. Phase III—extract label snapshot.
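
The counting side of these statistics can be sketched as follows, working from the token list produced in Phase I; the readability index, which relies on sentence boundaries, is discussed separately in the computational analysis section.

```python
# Sketch of Phase III corpus statistics: totals, unique words,
# vocabulary density, and the most frequent words (candidate labels).
from collections import Counter

def corpus_stats(tokens, top_n=5):
    total = len(tokens)
    unique = len(set(tokens))
    return {
        "total_words": total,
        "unique_words": unique,
        "vocabulary_density": round(unique / total, 3),
        "frequent_words": Counter(tokens).most_common(top_n),
    }

print(corpus_stats(["faculty", "research", "faculty", "library"]))
# {'total_words': 4, 'unique_words': 3, 'vocabulary_density': 0.75,
#  'frequent_words': [('faculty', 2), ('research', 1), ('library', 1)]}
```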

Word cloud

A word cloud is also generated from the results of phase III. Phase IV, as depicted in Figure 7, is a super word vector view of the frequent words in the processed corpus. The most evident words in the word cloud are the frequently used words from the previous phase: faculty, activities, library, research, and materials. The font size of each word is based on how many times the word is used in the corpus.


Figure 7. Phase IV—word cloud snapshot.
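
A view like Figure 7 could be produced as sketched below, assuming the third-party wordcloud package, which sizes each word by its frequency; the prototype may render its super word vector differently.

```python
# Sketch of a Phase IV word cloud; font size tracks term frequency.
from wordcloud import WordCloud

def render_cloud(tokens, path="cloud.png"):
    wc = WordCloud(width=800, height=400, background_color="white")
    wc.generate(" ".join(tokens))  # frequencies are counted internally
    wc.to_file(path)
```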

LDA visualization

With the results generated during phase III, this phase provides the histogram presentation of the sample processed corpus with the support of the LDA visualization, as shown in Figure 8. The line graph provides the relative frequencies of each generated label per document segment.


Figure 8. Phase V—LDA visualization snapshot.
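
A line graph of this kind can be reproduced with matplotlib, as sketched below using the faculty row of relative frequencies reported later in Table 2.

```python
# Sketch of the Phase V view: relative frequency per document segment.
import matplotlib.pyplot as plt

segments = range(1, 11)
faculty = [0.000353, 0.004589, 0.001059, 0.001765, 0.000353,
           0.000706, 0.000353, 0.000353, 0.0, 0.000353]  # Table 2 row

plt.plot(segments, faculty, marker="o", label="faculty")
plt.xlabel("Document segment")
plt.ylabel("Relative frequency")
plt.legend()
plt.show()
```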

Auto-tagging to uploaded document

After the five phases, the generated labels (faculty, activities, library, research, and materials) are automatically tagged to the document, as shown in Figure 9. The document is then stored in the IQAS file repository. The uploaded document carries corresponding metadata such as the filename, file size, user, date created, tags, and a link to the processed model. The filename can also be updated, and tags can be added or removed.


Figure 9. Phase VI—auto-tagging of labels in the uploaded document snapshot.
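
The metadata attached to a stored document might look like the sketch below; the field names follow the description above, but the record structure and the example values are illustrative assumptions rather than the IQAS storage schema.

```python
# Sketch of the metadata record kept with an auto-tagged document.
from datetime import date

record = {
    "filename": "area-iv-compliance-report.pdf",  # hypothetical file
    "file_size_kb": 512,
    "user": "accreditation-task-group",
    "date_created": date.today().isoformat(),
    "tags": ["faculty", "activities", "library", "research", "materials"],
    "model_link": "models/iqas-accreditation-v1",  # processed model
}

record["tags"].append("instruction")  # tags stay editable after upload
record["tags"].remove("instruction")
```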

Computational analysis

This section provides a computational analysis of the actual results from the processed document for better understanding.

In reference to the results of phase III, four significant results are evident in Figure 6. Vocabulary density is the ratio of the number of unique words to the total number of words in the corpus (Crane, 2023). To obtain the vocabulary density, the total number of unique words is divided by the total word count; the values used in the sample computation are derived from the results of the LDA algorithm embedded in the prototype, as seen in Figure 4. For the sample computation, see Equation 1.

$$\text{Vocabulary density}\ (VD) = \frac{\text{Number of unique words}\ (UW)}{\text{Total word count}\ (WC)}$$
$$VD = \frac{720}{2{,}833} = 0.254$$

Equation 1. Vocabulary density computation.

The vocabulary density of the processed corpus is 0.254, which implies that the corpus contains complex text with many unique words. Moreover, the readability index and the average words per sentence use Java’s BreakIterator, a locale-sensitive class with an imaginary cursor pointing to the current boundary in a string of natural language text. It handles different kinds of boundaries, such as characters, words, sentences, and potential line breaks. These boundaries are the basis for the readability index and the average words per sentence, which are 16.106 and 21.5, respectively. Frequently used words are identified based on the number of times each word is used in the processed corpus.
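
As a rough stand-in for the BreakIterator step, the sketch below approximates the average words per sentence by splitting on terminal punctuation; unlike BreakIterator, it is not locale-sensitive, so its figures will only approximate the prototype’s.

```python
# Naive average-words-per-sentence computation (BreakIterator analogue).
import re

def avg_words_per_sentence(text):
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return sum(len(s.split()) for s in sentences) / len(sentences)

print(avg_words_per_sentence("Faculty met today. The library added materials."))
# 2 sentences with 3 and 4 words -> 3.5
```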

The LDA visualization is presented by relating the relative frequency of each word to the document segments, as shown in Figure 8. To identify the relative frequency, it is necessary to decide on the number of document segments; for this study, the researchers used ten (10) segments. The grouping of words per segment is based on the total word count. The prototype then determines how often a particular word is used per segment, and the identified count is divided by the total word count. For the sample computations (Crane, 2023), see Equations 2 and 3.

$$\text{Words per segment}\ (WS) = \frac{\text{Total word count}\ (WC)}{\text{Desired number of segments}\ (DNS)}$$
$$WS = \frac{2{,}833}{10} = 283.3^{*}$$

* The first seven segments contain 283 words, while the last three segments contain 284 words.

Equation 2. Words per segment computation.

$$\text{Relative frequency}\ (RF) = \frac{\text{Word count per segment}\ (WCS)}{\text{Total word count}\ (WC)}$$
$$RF = \frac{2}{2{,}833} = 0.000706$$

Equation 3. Sample computation for relative frequency (Word: research|2nd Segment).
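
These two computations can be checked directly; the short script below reproduces Equations 2 and 3 from the reported corpus figures.

```python
# Reproducing Equations 2 and 3: 2,833 total words, 10 segments,
# and the word "research" occurring twice in the second segment.
total_words = 2833
segments = 10

words_per_segment = total_words / segments  # Equation 2
rf_research_seg2 = 2 / total_words          # Equation 3

print(round(words_per_segment, 1))  # 283.3
print(round(rf_research_seg2, 6))   # 0.000706
```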

For the overall results of the histogram, Tables 1 and 2 present the word counts and the relative frequencies of the labels per document segment.

Table 1. Word count of labels per document segment.

| Labels     | 1 | 2  | 3 | 4  | 5 | 6  | 7 | 8 | 9 | 10 | Total count |
|------------|---|----|---|----|---|----|---|---|---|----|-------------|
| Faculty    | 1 | 13 | 3 | 5  | 1 | 2  | 1 | 1 | 0 | 1  | 28          |
| Activities | 0 | 3  | 5 | 1  | 6 | 0  | 0 | 2 | 2 | 6  | 25          |
| Library    | 0 | 0  | 0 | 0  | 1 | 14 | 5 | 1 | 0 | 0  | 21          |
| Research   | 0 | 2  | 2 | 12 | 0 | 1  | 0 | 0 | 4 | 0  | 21          |
| Materials  | 3 | 6  | 0 | 3  | 0 | 0  | 0 | 0 | 3 | 3  | 18          |

Table 2. Relative frequency of labels per document segment.

| Labels     | 1        | 2        | 3        | 4        | 5        | 6        | 7        | 8        | 9        | 10       | Total    |
|------------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|
| Faculty    | 0.000353 | 0.004589 | 0.001059 | 0.001765 | 0.000353 | 0.000706 | 0.000353 | 0.000353 | 0        | 0.000353 | 0.009884 |
| Activities | 0        | 0.001059 | 0.001765 | 0.000353 | 0.002118 | 0        | 0        | 0.000706 | 0.000706 | 0.002118 | 0.008825 |
| Library    | 0        | 0        | 0        | 0        | 0.000353 | 0.004942 | 0.001765 | 0.000353 | 0        | 0        | 0.007413 |
| Research   | 0        | 0.000706 | 0.000706 | 0.004236 | 0        | 0.000353 | 0        | 0        | 0.001412 | 0        | 0.007413 |
| Materials  | 0.001059 | 0.002118 | 0        | 0.001059 | 0        | 0        | 0        | 0        | 0.001059 | 0.001059 | 0.006354 |

Conclusions

CSPC is in an exploratory phase when it comes to solving this particular accreditation problem. It is evident that the organization encountered problems pertaining to the accreditation process. Therefore, the researchers devised a model that supports the organization’s accreditation. In addition, the researchers designed a prototype implementing the model to help the organization through the process. As a result, retrieving and classifying the data is easier, which was the main problem of the task group. Furthermore, other text classification patterns may also be integrated into the system, and the results may be compared under given parameters.

Software availability

Software available from: https://github.com/CraigList056/iqas/tree/v1.0.0-alpha

Source code available from: https://github.com/CraigList056/iqas

Archived source code at the time of publication: https://www.doi.org/10.5281/zenodo.7507492

License: MIT License

How to cite this article
Prianes F and Palaoag T. Developing an Application for Document Analysis with Latent Dirichlet Allocation: A Case Study in Integrated Quality Assurance System [version 3; peer review: 1 approved, 1 approved with reservations, 2 not approved]. F1000Research 2024, 12:105 (https://doi.org/10.12688/f1000research.130245.3)

Open Peer Review

Version 3 (published 09 Apr 2024, revised)

Reviewer Report, 29 Jul 2024 (https://doi.org/10.5256/f1000research.164614.r301532)
Sandhya Avasthi, CSE, ABES Engineering College, Ghaziabad, Uttar Pradesh, India
Status: Not Approved

1. The problem is stated clearly, but the use of topic modelling in it is not clear. For example, the research paper didn’t include how many documents were taken for processing.
2. As shown in results, the …

Reviewer Report, 11 Jul 2024 (https://doi.org/10.5256/f1000research.164614.r301531)
Tianbo Ji, Nantong University, Nantong, China
Status: Approved with Reservations

This paper proposes an application of leveraging LDA for document classification and labelling. It is a well-motivated paper as it is specifically designed for the accreditation of a certain academic institution. However, this paper looks like the introduction to a …

Version 2 (published 26 Jan 2024, revised)

Reviewer Report, 28 Mar 2024 (https://doi.org/10.5256/f1000research.161414.r254578)
Zbigniew H. Gontar, SGH Warsaw School of Economics, Warszawa, Poland
Status: Not Approved

The title of the article, "Modeling document labels using Latent Dirichlet Allocation for archived documents in Integrated Quality Assurance System," suggests that it would address the issue of modeling within a defined domain using the Latent Dirichlet Allocation (LDA) method. …

Author Response, 19 Jun 2024
Freddie Prianes, College of Computer Studies, Camarines Sur Polytechnic Colleges, Nabua, 4432, Philippines
Thank you for the feedback. The suggested title has been adopted for this research paper.
Competing Interests: No competing interests were disclosed.

Reviewer Report, 26 Mar 2024 (https://doi.org/10.5256/f1000research.161414.r241034)
Shahid Naseem, Department of Information Sciences, Division of Science & Technology, University of Education, Lahore, Pakistan
Status: Approved

The authors of the said paper have incorporated …

Version 1 (published 27 Jan 2023)

Reviewer Report, 29 Mar 2023 (https://doi.org/10.5256/f1000research.142987.r165836)
Shahid Naseem, Department of Information Sciences, Division of Science & Technology, University of Education, Lahore, Pakistan
Status: Approved with Reservations

By looking at the paper’s overall structure, presentation, and above all the provided contents, I would say the paper requires minor changes to be accepted for indexing.
1. In this study, the indexing of the …

Author Response, 13 Apr 2024
Freddie Prianes, College of Computer Studies, Camarines Sur Polytechnic Colleges, Nabua, 4432, Philippines
We would like to express our sincere appreciation for reviewing our paper. Your comments and suggestions are highly valued. These are our responses:
1. In this study, the indexing of …
