Data Note

CFM: a database of experimentally validated protocols for chemical compound-based direct reprogramming and transdifferentiation

[version 1; peer review: 2 approved with reservations]
PUBLISHED 16 Apr 2021

This article is included in the Cheminformatics gateway.

This article is included in the Bioinformatics gateway.

Abstract

Cell fate engineering technologies are critically important for basic and applied science, yet many protocols for direct cell conversion are still unstable, have a low yield, and require improvement. There is an increasing need for a data aggregator containing a structured collection of protocols that are preprocessed, verified, and represented in a standardized manner to facilitate their comparison, and that provides a platform for researchers to evaluate and improve the protocols.
We developed CFM (cell fate mastering), a database of experimentally validated protocols for chemical compound-based direct reprogramming and direct cell conversion. The current version of CFM contains 169 distinct protocols, 113 types of cell conversions, and 158 small molecules capable of inducing cell conversion. CFM allows stem cell biologists to compare protocols and choose the most efficient and reliable one for their needs. The protocol representation contains PubChem CIDs and mechanisms of action (MOA) for chemicals, protocol duration, media, and yield with a comment on the measurement strategy. Ratings of the protocols and feedback from the community will help to promote high-quality and reproducible protocols. We are committed to a long-term database maintenance strategy. The database is currently available at https://cfm.mipt.ru

Keywords

Cell Fate Engineering, Chemical reprogramming, Direct cell conversion, Transdifferentiation

Introduction

Cell engineering technologies hold tremendous potential for basic and applied science. Recent discoveries in this field have advanced regenerative and personalized medicine, drug discovery, and toxicology. In the future, cell engineering technologies may enable the regeneration of altered tissues and extend the human lifespan; at present, they serve as a platform for drug testing and toxicology experiments. Furthermore, these approaches help to reveal the molecular mechanisms that drive normal development and differentiation. Transcription factors (TFs) have been used to manipulate cell fate in the majority of published protocols.1 Low efficiency and safety concerns restrict the application of cell fate engineering approaches in clinical practice.2 Small molecules that target signaling pathways or regulate the epigenetic state offer powerful tools for refined manipulation of cell fate toward the desired outcome.3 A growing number of protocols use small molecules to induce lineage differentiation or to facilitate transdifferentiation, either by increasing efficiency or by replacing genetic reprogramming factors.

Small-molecule-mediated approaches have more potential for clinical applications since:3

  • 1. The biological effects of small molecules are typically rapid, reversible, dose-dependent and predictable, allowing precise control over specific outcomes and a close resemblance to the target cell type;

  • 2. Compared with genetic interventions, the relative ease of handling and administering small molecules makes them more practical for further therapeutic development;

  • 3. Unlike genetic interventions, small molecules do not raise safety concerns about possible genome integration.

Chemical cell conversion protocols vary not only in the chemicals used for reprogramming but also in media, making it difficult to choose the optimal protocol in practice. A database integrating cellular reprogramming protocols has already been published.4 However, its protocol representation is not standardized and the database has not been updated recently. This creates a need to aggregate protocols in a standardized form that captures all of their relevant and specific information. Such aggregators should also provide user-friendly access so that information can be retrieved quickly.

In this work, we present a protocol aggregator, CFM (cell fate mastering), with a user-friendly interface that allows stem cell biologists to compare protocols, select the most efficient and reliable one for their needs, and leave feedback on their experience with a protocol for the research community. Additionally, we provide information about the potential mechanism of action of each compound and the pathways involved in cell conversion. Data on the molecular mechanisms affected by specific compounds could help to improve the protocols. We use expert opinions to develop an original standard for protocol representation, provide public ratings for protocols calculated from users’ evaluations, and maintain a discussion forum. Protocol ratings as well as detailed feedback from the community will help to promote high-quality and reproducible protocols. Standardized and detailed protocol representation will facilitate the development and validation of systematic computational approaches for cell fate engineering (e.g. DECODE5). The current version of CFM contains 169 distinct protocols, 113 types of cell conversions, and 158 small molecules. Summary statistics are available in Figure 1.


Figure 1. Overview of the database contents.

(A) Number of types of cell conversions. (B) Number of cell type conversion pairs in CFM (the bar for “other 106 cell conversions” represents the average number of protocols per conversion type among the remaining cell conversions). (C) Frequencies of chemical compounds used in the protocols. (D) Frequencies of initial cell types used in protocols.

We are committed to a strong database maintenance strategy, which means:

  • 1. We manually check data in order to provide comprehensive information. Public rating is conducive to promoting reproducible protocols;

  • 2. We communicate with experimental biologists and continuously improve and extend the functionality of the database to meet their needs. To this end, we created a dedicated feedback form where users can tell us which features they would like to see added;

  • 3. We support protocol updating by users (after manual curation of the suggested protocol);

  • 4. We also plan to update the content of the database on a regular basis (at least bi-annually).

Methods

Implementation

We collected protocols from PubMed and Google Scholar using the keywords ‘direct reprogramming’ AND ‘chemicals OR small molecules’, ‘transdifferentiation’ AND ‘chemicals OR small molecules’, and ‘direct cell conversion’ AND ‘chemicals OR small molecules’. This search returned more than 1000 papers. Experts prefiltered the papers based on the content of key sections (abstract, introduction, methods), discarding irrelevant ones. We manually extracted information from each relevant article and, as a postprocessing step, added PubChem data on the small molecules used in each protocol. An overview of the whole workflow is shown in Figure 2.
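The three keyword combinations can also be written out programmatically. The following is a minimal sketch only: the function name build_queries, the double quoting, and the parentheses around the OR group are illustrative assumptions, since the text does not give the literal query strings submitted to PubMed or Google Scholar.

```python
# Topic phrases and compound terms as quoted in the Methods text.
TOPICS = ["direct reprogramming", "transdifferentiation", "direct cell conversion"]
COMPOUND_TERMS = ["chemicals", "small molecules"]


def build_queries(topics=TOPICS, compound_terms=COMPOUND_TERMS):
    """Return one Boolean query string per topic phrase.

    Quoting and parenthesization are assumptions made for illustration;
    each search engine may require its own syntax.
    """
    compounds = " OR ".join(f'"{term}"' for term in compound_terms)
    return [f'"{topic}" AND ({compounds})' for topic in topics]


if __name__ == "__main__":
    for query in build_queries():
        print(query)
```

Keeping the topic phrases and compound terms in separate lists makes it straightforward to rerun the same search with additional terms when the database content is updated.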

The database can be queried on the cfm.mipt.ru/query page (Figure 3). By default, all protocols are listed. If no protocols match a query, the whole database is displayed (default view). Search is case-insensitive. The first column contains a link to the protocol description in the cfm.mipt.ru/viewProtocol/<protocolId> format, where the protocol ID is an internal identifier. The protocol description page contains article information (DOI, article title, and authors), source and target cell lines, a description of the initial cell culture, the species from which the source cell line was obtained, the total protocol duration, the media in which the protocol was implemented, the protocol yield (with comments on how it was calculated), chemicals (with their mechanisms of action, if known), transcription factors, stress factors and growth factors used during the procedure, and general comments on the protocol. We also provide a simple interface through which registered participants can rate protocols. After curation, the rating becomes available on the protocol page. All protocols rated by a user are shown in their personal account (cfm.mipt.ru/login). A researcher can add a protocol (and the related published paper) to the CFM database by submitting it through cfm.mipt.ru/add after their email address has been verified by a CFM administrator. Detailed information about verification can be found on the Add Protocol page.
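To make the protocol representation concrete, the fields listed above can be pictured as a simple record type. This is a hypothetical sketch and not the schema used by CFM: the class names, field names, and types are assumptions, and only the viewProtocol URL pattern and the list of fields come from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional

BASE_URL = "https://cfm.mipt.ru"


@dataclass
class Chemical:
    """A small molecule used in a protocol (hypothetical field names)."""
    name: str
    pubchem_cid: Optional[int] = None
    mechanism_of_action: Optional[str] = None  # only filled in if known


@dataclass
class ProtocolRecord:
    """One protocol entry mirroring the fields named in the text."""
    protocol_id: str            # internal identifier; its real type is not stated
    doi: str
    title: str
    authors: List[str]
    source_cell_line: str
    target_cell_line: str
    initial_culture: str
    species: str
    duration: str
    media: str
    protocol_yield: str
    yield_comment: str = ""     # how the yield was calculated
    chemicals: List[Chemical] = field(default_factory=list)
    transcription_factors: List[str] = field(default_factory=list)
    stress_factors: List[str] = field(default_factory=list)
    growth_factors: List[str] = field(default_factory=list)
    comments: str = ""

    def url(self) -> str:
        """Build the protocol page link in the viewProtocol/<protocolId> format."""
        return f"{BASE_URL}/viewProtocol/{self.protocol_id}"
```

A record built with placeholder values (not taken from the database) would expose its page link through url(), for example ProtocolRecord(protocol_id="1", ...).url().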


Figure 2. Data processing workflow diagram.


Figure 3. Database interface screenshot.

We strongly believe that the development and maintenance of such a database will encourage researchers to keep improving protocols for cell conversion. Among all cell conversions, reprogramming protocols still dominate over transdifferentiation (Figure 1). As is clear from Figure 1B and C, fibroblasts are widely used as an initial cell type in various protocols, while other cell types are underrepresented. Although fibroblasts might be an easy cell type to obtain in the clinic, other initial cell types such as blood cells could also be of use. As can be seen in Figure 1D, a small group of chemicals is effective in several protocols, suggesting that they target a general mechanism. Other chemicals from the same functional class, such as epigenetic drugs, should possibly also be tested for their cell conversion potential. Further research on the relations between cell conversions and chemicals, media, transcription factors, etc. should lead to the development of a computational framework that accelerates the search for conditions conducive to new cell conversions that are highly applicable in medicine (for example,6-8).

Operation

We perform all computations on the server side, so the user only needs a web browser to use our service. However, we recommend against accessing the service from smartphones because our interface is not optimized for them.

Use cases

Querying the database

Figures 3 and 4 show examples of querying the database. New search fields can be added using the green ADD button; unnecessary fields can be removed using the red DELETE button.


Figure 4. Query to the database.
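The query behaviour described in the Implementation section (case-insensitive search, with the whole database shown when nothing matches) can be sketched as follows. This is an illustration under stated assumptions, not the service's actual code: the function name query_protocols, the dictionary-based records, substring matching, and combining several search fields with AND are all assumptions.

```python
def query_protocols(protocols, criteria):
    """Filter protocol records by several search fields.

    protocols: list of dicts (hypothetical record layout)
    criteria:  dict mapping a field name to the text searched for
    """
    # Ignore empty search fields and normalise case for matching.
    active = {
        name: text.strip().lower()
        for name, text in criteria.items()
        if text and text.strip()
    }
    if not active:
        return list(protocols)  # default view: every protocol is listed

    matches = [
        record
        for record in protocols
        if all(
            needle in str(record.get(name, "")).lower()
            for name, needle in active.items()
        )
    ]
    # When nothing matches, fall back to the default view.
    return matches if matches else list(protocols)


if __name__ == "__main__":
    # Toy records with made-up values, for illustration only.
    toy = [
        {"source": "Fibroblast", "target": "Neuron"},
        {"source": "Astrocyte", "target": "Neuron"},
    ]
    print(query_protocols(toy, {"source": "FIBROBLAST"}))
```

Each key in criteria plays the role of one search field added with the ADD button; removing a key corresponds to the DELETE button.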

Molecular similarity calculation

We calculate the similarity of molecules from their SMILES representations: Morgan fingerprints are derived from the SMILES, and the Tanimoto similarity between the fingerprints is reported to the user. The interface can be seen in Figure 5.


Figure 5. SMILES similarity interface.
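For readers unfamiliar with the measure, the Tanimoto similarity of two binary fingerprints is the number of bits set in both divided by the number of bits set in either. The sketch below shows only this final step, in plain Python. It assumes each fingerprint is already available as a set of on-bit indices; generating Morgan fingerprints from SMILES requires a cheminformatics toolkit, which the text does not name, and the bit indices used here are made up.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as on-bit indices.

    Returns |A & B| / |A | B|. Returning 0.0 for two empty fingerprints
    is a convention chosen for this sketch.
    """
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Hypothetical on-bit sets standing in for two Morgan fingerprints.
    fp_query = {1, 2, 3}
    fp_candidate = {2, 3, 4}
    print(tanimoto(fp_query, fp_candidate))
```

The value ranges from 0 (no shared bits) to 1 (identical fingerprints), so candidate molecules can be ranked by their similarity to a query compound.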

Data availability

All data were gathered from open sources such as PubMed; links to the source articles can be found on the protocol pages accessible from the cfm.mipt.ru/query page.

We collected protocols from PubMed and Google Scholar, and downloaded SMILES for chemical compounds from PubChem. All data can be downloaded from our GitHub repository.

Software availability

Our service is available at: https://cfm.mipt.ru

Source code is available from: https://github.com/lubitelpospat/CFM-source

Archived source code as at time of publication: http://doi.org/10.5281/zenodo.4500125 (reference 9)

License: MIT

How to cite this article
Sizykh A, Murtazalieva K, Vyshkvorkina Y et al. CFM: a database of experimentally validated protocols for chemical compound-based direct reprogramming and transdifferentiation [version 1; peer review: 2 approved with reservations]. F1000Research 2021, 10:295 (https://doi.org/10.12688/f1000research.28439.1)
Open Peer Review

Reviewer Report 28 Jun 2021
Stephanie M. Willerth, Department of Mechanical Engineering, University of Victoria, Victoria, BC, Canada 
Approved with Reservations
The paper details the CFM database that enables users to explore, compare and evaluate different methods for chemical reprogramming cells from one phenotype into another. Cellular reprogramming is becoming an increasingly popular technique used for both understanding biology and for ...
HOW TO CITE THIS REPORT
Willerth SM. Reviewer Report For: CFM: a database of experimentally validated protocols for chemical compound-based direct reprogramming and transdifferentiation [version 1; peer review: 2 approved with reservations]. F1000Research 2021, 10:295 (https://doi.org/10.5256/f1000research.31475.r88121)
Reviewer Report 11 May 2021
Erdem B. Dashinimaev, Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Pirogov Russian National Research Medical University, Moscow, Russian Federation 
Approved with Reservations
This work is devoted to creating a database of direct reprogramming protocols for human and mammalian cells. As a researcher working in this field, I welcome such initiatives to streamline and make sense of the large amounts of data emerging ...
HOW TO CITE THIS REPORT
Dashinimaev EB. Reviewer Report For: CFM: a database of experimentally validated protocols for chemical compound-based direct reprogramming and transdifferentiation [version 1; peer review: 2 approved with reservations]. F1000Research 2021, 10:295 (https://doi.org/10.5256/f1000research.31475.r83456)