Keywords
SPARC tutorial, biological data, 3D mapping, FAIR data, Codeathon
The Stimulating Peripheral Activity to Relieve Conditions (SPARC) data portal is a platform that allows users to access large sets of curated data and tools. Data on the portal is sourced from researchers working in a range of disciplines applied to peripheral nerve activity, including imaging, computational biology, electrophysiology and clinical research. However, many existing tutorials and examples for SPARC do not document interfaces between tools within the SPARC ecosystem. Thus, we see an opportunity to lower the barrier to entry into the SPARC ecosystem by creating accessible guides for building workflows that use multiple SPARC tools. To this end, we have developed an online hub of tutorials that guide new users through the entire workflow of acquiring data, processing it, and visualizing it, using common, open-source scientific computing tools and, crucially, spanning the repertoire of currently available SPARC tools. Our approach encapsulates the workflow within a Jupyter Notebook, which is designed to be readily accessible to new users of SPARC. A hub of easily understood tutorials will not only accelerate the growth of the SPARC community but also increase the uptake and usage of tools by members of the community.
This publication is prepared as the outcome of the 2022 SPARC (Stimulating Peripheral Activity to Relieve Conditions) FAIR (Findable, Accessible, Interoperable, and Reusable) Codeathon, held 6-8 August 2022. The first edition of this event was held in July 2021, and the results of the participating groups are collected on the Codeathon resource page. Five groups participated in the 2022 SPARC FAIR Codeathon.
The goal of the SPARC FAIR Codeathons is to find novel ways of using the SPARC resources, data, or tools to demonstrate and enhance the FAIRness of the data.
The SPARC program is a platform supported by the National Institutes of Health (NIH) Common Fund to advance bioelectronic medicine devices.1 This is done by identifying the spatial distribution of neurons on the surface of organs using flatmaps.2 The SPARC Portal hosts over a hundred datasets and resources (continuously being updated) to assist researchers in learning about organ functions, devising medical tools, and ultimately treating diseases.
The SPARC Portal emphasizes a FAIR (Findable, Accessible, Interoperable and Reusable) repository of curated resources available to researchers globally. The FAIR Data Principles were proposed in 2016 by Wilkinson et al. in order to increase the reusability of data.3 The SPARC project incorporates multiple open-source tools and platforms, each with its own documentation and tutorials. However, well-documented tutorials on how to properly utilize these tools in conjunction with SPARC data are lacking; such tutorials could significantly aid users in seeing the advantages of using FAIR data.
In this Codeathon, we aimed to practically improve the reusability of the SPARC Portal resources by giving visual and written instructions on how the available datasets and models can be employed, integrated, and visualized. One of the core platforms of the Data Resource Center (DRC) of the SPARC program is o²S²PARC (Open Online Simulations for Stimulating Peripheral Activity to Relieve Conditions).4 Researchers can, for example, use o²S²PARC to simulate the peripheral nervous system and instantly observe its impact on the pertaining organs. However, using o²S²PARC is not necessarily straightforward for general users. Therefore, to increase the FAIRness of the SPARC Portal resources, we aimed to show how one can search and use the SPARC datasets with the available SPARC tools, in conjunction with other open-access tools, in a structured and comprehensible manner. We lower the barrier for general users, who might be unfamiliar with the SPARC integrated platforms (like o²S²PARC), to acquire data, process it, and visualize it using common, open-source scientific computing tools.
The purpose of this article is to present the tutorials that were developed during the 2022 SPARC FAIR Codeathon on the procedure of utilizing the SPARC Portal tools and resources. These tutorials demonstrate the ease with which a range of tools in the SPARC ecosystem can be used in concert to access, manipulate and interpret SPARC datasets. As a case study, we provide an example of generating a 3D scaffold of the rat stomach and projecting the coordinate data of the vagal afferent and efferent neurons onto it. We discuss how this tutorial differs from the existing tutorials and documentation on the SPARC Portal. Finally, we describe how our tutorial style can impact the FAIRness of the SPARC Portal and possible future developments.
This section discusses the improvement of tutorials and documentation on the utilization of the SPARC Portal tools along with the available open-access software packages such as SciPy. Figure 1 demonstrates the general workflow for the tutorials. The first step is to obtain SPARC datasets from the portal using Pennsieve. This free tool is a scalable, graph-based, and cloud-based data management platform that is used for managing scientific datasets on the SPARC Portal. It is used here to download .xlsx files. Once the files are downloaded from a project, the data can be manipulated as needed in order to perform computational analysis on it by employing a programming language such as Python or open-source tools from the SPARC Portal. After the processing step, the data can be visualised in 2D or 3D graphs. In the following, we explain these steps and the cases demonstrated in the tutorial.
The .xlsx data files are downloaded using Pennsieve and fed to a program written in Python which uses open-source tools, like those available from SPARC, for preprocessing and analysing the data. After being processed, the data can be visualised in different ways.
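As a sketch of this processing step, the downloaded spreadsheet can be loaded and cleaned with pandas. The column names below are illustrative assumptions, not the actual dataset schema, and an in-memory table stands in for the downloaded file:

```python
import pandas as pd

# Real usage after downloading with Pennsieve would be:
# df = pd.read_excel("IGLE_data.xlsx")
# Here a small in-memory table stands in for the spreadsheet; the column
# names are hypothetical and may differ from the actual files.
df = pd.DataFrame({
    "x_pct": [12.5, 48.0, 73.2, None],    # % distance along one axis
    "y_pct": [30.1, 55.9, 10.4, 62.0],    # % distance along the other axis
    "area":  [110.0, 95.5, 130.2, 80.1],  # innervation area
})

# Basic cleaning: drop rows with missing coordinates
clean = df.dropna(subset=["x_pct", "y_pct"]).reset_index(drop=True)
print(len(clean))  # 3
```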
Figure 2 illustrates the above-mentioned workflow using a previously published model of the rat stomach.5 We illustrate two applications of the acquired datasets and give detailed instructions on how they can be processed for different applications. Three datasets constitute the basis of these tutorials. They provide the 2D coordinates for different neuronal populations in the rat stomach, namely the efferent vagal neurons and afferent vagal neuron sub-populations of intraganglionic laminar endings (IGLEs) and intramuscular arrays (IMAs). These data are derived from biomedical imaging of stained rat stomach sections. The information regarding these neurons is contained in .xlsx files that are accessible on the SPARC Portal from the following links (see also Underlying data):
• Spatial distribution and morphometric characterization of vagal afferents associated with the myenteric plexus of the rat stomach: IGLE_data.xlsx
• Spatial distribution and morphometric characterization of vagal efferents associated with the myenteric plexus of the rat stomach: Efferent_data.xlsx
• Spatial distribution and morphometric characterization of vagal afferents (intramuscular arrays (IMAs)) within the longitudinal and circular muscle layers of the rat stomach: IMA_analyzed_data.xlsx
The upper panel (the general approach): The .xlsx data files are downloaded from Pennsieve and fed to Python for preprocessing (data extraction and cleaning). Using the available Scaffold Mapping Tools on the SPARC Portal, a geometric 3D scaffold of an organ is obtained. The preprocessed data can be projected onto the 3D scaffold to create a data-enriched 3D image of the organ. The lower panel (the rat stomach example): Three different datasets of the spatial distribution of neurons in the rat stomach are downloaded from Pennsieve. The required 2D data (spatial coordinates x and y) are extracted using Python. A 3D scaffold of the rat stomach is generated using the SPARC Scaffold Mapping Tools. The preprocessed data of the spatial 2D distribution of neurons is projected onto the rat stomach scaffold. The output is a 3D image of the rat stomach in which the neurons are mapped onto the surface. The dots correspond to the spatial distribution of neurons and the three colors represent the three different datasets.
In this tutorial, we demonstrate a dataset acquisition mechanism that includes searching for datasets by keyword, e.g., ‘vagal’, using the search API provided by Pennsieve, which returns each dataset’s id, version, name, and tags. Using the search results, we can then download datasets manually or automatically; the automatic method is facilitated by a script utilizing the discovery API, which can be accessed freely. This mechanism is also helpful for finding other datasets to be used for different analysis purposes.
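A minimal sketch of this search step is shown below. The endpoint URL and the response shape are assumptions about the public Pennsieve discovery API, so the live request is left commented out and a response stub stands in for it:

```python
# Hypothetical live request (endpoint and parameters are assumptions):
# import requests
# r = requests.get("https://api.pennsieve.io/discover/search/datasets",
#                  params={"query": "vagal", "limit": 10})
# response = r.json()

# A response stub carrying the fields the tutorial extracts:
response = {"datasets": [
    {"id": 10, "version": 3,
     "name": "Vagal afferents (IGLEs) in the rat stomach",
     "tags": ["vagal", "stomach"]},
    {"id": 12, "version": 2,
     "name": "Vagal efferents in the rat stomach",
     "tags": ["vagal"]},
]}

def summarize(resp):
    """Pull out id, version, name and tags for each search hit."""
    return [(d["id"], d["version"], d["name"], d["tags"])
            for d in resp["datasets"]]

for ds_id, version, name, tags in summarize(response):
    print(ds_id, version, name, tags)
```

With the list of ids and versions in hand, the same script can fetch each dataset's files automatically instead of clicking through the portal.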
The datasets obtained can be in various formats, for example, JSON, dat, knowledge-based formats, and xlsx; in the cases considered here, the files are xlsx. These datasets record each neuron’s location, innervation area, and anatomical position (dorsal or ventral). The neuron’s location is a 2D feature representing the percentage relative distance from an origin situated at the pyloric end of the stomach along the y-axis (left-to-right direction) and near the esophagus along the z-axis (bottom-to-top direction). To plot these locations, we first define the boundaries of the z-axis and y-axis based on the 3D stomach scaffold generated using the Scaffold Mapping Tools of SPARC; in our cases we set them to [0, 36.7] and [0, 24.6] respectively, where each element is the minimum or maximum point in millimetres. With these boundaries, the approximate neuron location in millimetres can be calculated.
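The percentage-to-millimetre conversion described above amounts to a simple rescaling against the scaffold boundaries:

```python
# Scaffold extents from the text: z in [0, 36.7] mm, y in [0, 24.6] mm
Z_MAX_MM = 36.7
Y_MAX_MM = 24.6

def percent_to_mm(z_pct, y_pct):
    """Map a neuron location given as percentages onto the scaffold extents."""
    return z_pct / 100.0 * Z_MAX_MM, y_pct / 100.0 * Y_MAX_MM

z_mm, y_mm = percent_to_mm(50.0, 100.0)
print(round(z_mm, 2), round(y_mm, 2))  # 18.35 24.6
```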
After data manipulation, the location data can be used to estimate the probability density of the spatial distribution of neurons in 2D, as shown in the upper panel of Figure 2. For this visualization, we first create a 2D mesh grid within the z-axis and y-axis boundaries consisting of 100 points each. Then, for the neuron locations, the probability density is estimated using Gaussian KDE (Kernel Density Estimation)6,7 and resampled with 1000 points. We estimate all three datasets and visualize them in 2D plots. Note that kernels other than the Gaussian, such as the Cauchy and Laplacian kernels, can be used depending on the dataset and the analysis needs.
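This estimation step can be sketched with SciPy's `gaussian_kde`; the neuron locations below are synthetic stand-ins for the extracted data, while the grid size (100x100) and the 1000 resampled points follow the text:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# Synthetic stand-in for the extracted (z, y) neuron locations in mm
points = rng.uniform([0, 0], [36.7, 24.6], size=(200, 2)).T  # shape (2, N)

# Fit a Gaussian KDE to the 2D locations
kde = gaussian_kde(points)

# Evaluate on a 100x100 mesh over the scaffold boundaries
zz, yy = np.meshgrid(np.linspace(0, 36.7, 100), np.linspace(0, 24.6, 100))
density = kde(np.vstack([zz.ravel(), yy.ravel()])).reshape(zz.shape)

# Resample 1000 synthetic locations from the fitted density
samples = kde.resample(1000, seed=1)
print(density.shape, samples.shape)  # (100, 100) (2, 1000)
```

The `density` array can be passed directly to a filled-contour plot to reproduce the dark-red-to-light background of Figure 4.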
Moreover, one can project the 2D points representing the neuron locations onto a scaffold of the organ generated by the Scaffold Mapping Tools of SPARC, as demonstrated in the lower panel of Figure 2. The organ scaffold is a 3D mesh exported as an STL file so that it can easily be handled with a Python-based package. Figure 3 shows the 2D dataset that we map to the 3D organ scaffold. The 3D mapping is possible because the datasets specify the anatomical location of each neuron, dorsal or ventral.
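The projection idea can be sketched as a nearest-vertex lookup. Here a random point cloud stands in for the scaffold's STL vertices (in practice the STL could be loaded with a package such as numpy-stl via `mesh.Mesh.from_file`), and a median split along the depth axis is a simplifying assumption for separating dorsal from ventral; the tutorial's actual mapping may differ:

```python
import numpy as np

# Random point cloud standing in for the scaffold's surface vertices:
# columns are (z, y, depth) in mm; extents match the scaffold boundaries.
rng = np.random.default_rng(2)
vertices = rng.uniform([0, 0, 0], [36.7, 24.6, 15.0], size=(500, 3))

def project_to_surface(zy, vertices, dorsal=True):
    """Snap a (z, y) location in mm onto the nearest scaffold vertex,
    restricted to one half of the mesh (median split along depth is a
    simplifying assumption for the dorsal/ventral distinction)."""
    side = vertices[:, 2] >= np.median(vertices[:, 2])
    cand = vertices[side] if dorsal else vertices[~side]
    d = np.linalg.norm(cand[:, :2] - np.asarray(zy), axis=1)
    return cand[np.argmin(d)]

v = project_to_surface((18.0, 12.0), vertices, dorsal=True)
print(v)  # nearest dorsal vertex as (z, y, depth)
```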
The color difference indicates the type of neuron, and the color intensity indicates the area of the neuron (retrieved from [1]).
The step-by-step instructions on how to use the Python code are available on GitHub (see Software availability).
In this section, we demonstrate the graphical results from the two tutorials that were developed during the Codeathon.
Figure 4 illustrates the results of the first tutorial: data resampling for density estimation using KDE. The simulation includes all three datasets: vagal efferents, IGLEs, and IMAs. Within the plots, the green dots mark the resampled vagal neuron locations, while the background colour denotes the density of the data points: dark red areas indicate high density, while lighter ones represent lower density. The highest concentration of vagal efferents lies at 50% and above along the x-axis, approaching the forestomach. For the IGLEs, the distribution is relatively even, with the highest density between 10% and 50% of the x-axis. Meanwhile, the IMAs are concentrated in the antrum and scattered in the proximal stomach. These plots confirm the distribution of the stomach’s vagal afferent and efferent populations5 and could serve as a more effective and efficient alternative for visualizing the distribution. Here, we demonstrate that the datasets and tools provided by SPARC can easily be used and combined with other open-source tools. This tutorial module does not distinguish dorsal and ventral positions, which would naturally allow a more precise visualization; we leave this extension to learners as an exercise.
(a) efferent (b) IGLE (c) IMA. Statistical processes are used to compute a probability density estimation of neurons. This density function can be used to generate and simulate neuronal activity in a network.
Figure 5 illustrates the 3D mapping of the spatial distribution data onto the rat stomach scaffold. The datasets for this module are the same as for the previous module, using the data features for the ventral and dorsal positions as well as the area of innervation. With this visualization, we can see the extent of innervation and later compare it with the 2D density plot. Because the innervation areas are small, plotting their magnitudes directly is ineffective; instead, they are represented using a colour scale from white to dark green, where dark green corresponds to the largest area. We used the matplotlib widget module8 for interactivity, so that the 3D plot can be rotated to observe the distribution of the vagal neurons and their innervation areas in more detail.
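The white-to-dark-green area scale can be reproduced with matplotlib's colormap utilities; the area values below are illustrative stand-ins for the innervation areas:

```python
import numpy as np
from matplotlib.colors import LinearSegmentedColormap, Normalize

# Illustrative innervation areas (units arbitrary for this sketch)
areas = np.array([5.0, 40.0, 120.0, 300.0])

# White-to-dark-green scale: smallest area -> white, largest -> dark green
cmap = LinearSegmentedColormap.from_list("white_green", ["white", "darkgreen"])
norm = Normalize(vmin=areas.min(), vmax=areas.max())

colors = cmap(norm(areas))  # one RGBA row per neuron
print(colors.shape)  # (4, 4)
```

These RGBA values can then be passed as the `c` argument of a 3D scatter plot of the projected neuron locations.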
The result of our tutorial is the visualization of the neurons in an interactive figure in Jupyter Notebook. The user can select which type of neurons to visualise on the surface of the stomach from three categories: vagal afferent (IGLE), vagal afferent (IMA), and vagal efferent.
Our tutorials differ substantially from the current tutorials on the SPARC Portal because they guide users through understanding and using multiple SPARC tools to fulfil data manipulation and visualisation needs. In contrast to the dense documentation and tutorials on the SPARC Portal, which mostly target a single tool and are appropriate for users already familiar with the SPARC program, we have developed tutorials, along with examples, for a broader audience. We demonstrated the feasibility of using SPARC resources to drive our proposed workflow using a rat stomach scaffold as an example. A central benefit of this approach is that it leverages the FAIR principles underlying these SPARC resources to ensure that the integrated models that are produced also follow the FAIR Data Principles. Our approach is novel in that it encapsulates the workflow within an online Jupyter Notebook, which is designed to be readily accessible to new users of SPARC. Having tutorials that can be easily understood will not only accelerate the growth of the SPARC community but also increase the uptake and usage of tools by members of the community.
The data for the rat stomach example presented as a use case for our tutorials is available from SPARC at the following links:
https://sparc.science/datasets/10?type=dataset&datasetDetailsTab=abstract&path=files%2Fderivative
https://sparc.science/datasets/12?type=dataset&datasetDetailsTab=files&path=files%2Fderivative
https://sparc.science/datasets/11?type=dataset&datasetDetailsTab=files&path=files%2Fderivative
Source code available from: https://github.com/SPARC-FAIR-Codeathon/QuiltedTutorials
Archived source code at the time of publication: https://doi.org/10.5281/zenodo.8323579.9
License: MIT
Visit our website to find an expanding collection of tutorials: https://quilted-tutorial.github.io/SPARC-guru
All authors were part of the team QuiltedTutorials in the 2022 SPARC FAIR Codeathon. We would like to extend our special thanks to the SPARC Program of the NIH Common Fund and to the organisers of the 2022 SPARC FAIR Codeathon for their support during the planning and development of this project.
Is the rationale for developing the new method (or application) clearly explained?
Yes
Is the description of the method technically sound?
Yes
Are sufficient details provided to allow replication of the method development and its use by others?
Yes
If any results are presented, are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions about the method and its performance adequately supported by the findings presented in the article?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: calcium imaging in different models and cell types, data analysis, reproducible data analysis and training in python-based data processing
Is the rationale for developing the new method (or application) clearly explained?
Yes
Is the description of the method technically sound?
Partly
Are sufficient details provided to allow replication of the method development and its use by others?
Partly
If any results are presented, are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions about the method and its performance adequately supported by the findings presented in the article?
Partly
References
1. Bandrowski A, Grethe J, Pilko A, Gillespie T, et al.: SPARC Data Structure: Rationale and Design of a FAIR Standard for Biomedical Research Data. bioRxiv. 2021.
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Data standards, FAIR practices, Research software
Version 1: 08 Jan 24