Software Tool Article

BISCUIT: An Open-Source Platform for Visual Comparison of Segmentation Models in Bioimage Analysis

[version 1; peer review: awaiting peer review]
PUBLISHED 18 Nov 2025

This article is included in the NEUBIAS - the Bioimage Analysts Network gateway.

Abstract

Background

Segmentation in microscopy images is a critical task in bioimage analysis, with many deep learning models available (e.g., Cellpose, Omnipose, StarDist, SAM-based models). However, researchers often face challenges in choosing the most suitable model for their data, as quantitative metrics do not always reflect the biological relevance of segmentation results.

Methods

We developed BISCUIT (BioImage Segmentation Comparison Utility and Interactive Tool), an open-source platform that enables users to run multiple state-of-the-art segmentation algorithms on the same images and visually compare their outputs side-by-side. BISCUIT is implemented as an interactive Jupyter Notebook pipeline, leveraging existing segmentation libraries, and can be executed either via a zero-installation cloud environment (Google Colab) or on local high-performance computing resources.

Results

Using BISCUIT, we demonstrate how visual inspection of segmentation outputs can reveal qualitative differences between algorithms that may be overlooked by abstract performance metrics. For example, in a fluorescence microscopy image dataset, BISCUIT allowed direct comparison of segmentations from Cellpose, Omnipose, and StarDist, highlighting differences in how each algorithm delineated cell boundaries. This visual approach helped identify the model that produced the most biologically plausible segmentation for the dataset.

Conclusions

BISCUIT provides an intuitive platform for bioimage analysts and life scientists to evaluate and “see what really works” on their data. The platform is openly available and extensible, lowering the barrier for researchers to perform rapid, interactive benchmarking of segmentation models on their own microscopy data.

Keywords

Bioimage segmentation, Deep learning, Model comparison, Microscopy, Cellpose, StarDist, Visual assessment, Open source tool

Introduction

Accurate segmentation in microscopy images is a foundational step in many biological studies,1,2 enabling quantitative analysis of morphology, distribution, and behavior. In recent years, deep learning methods have achieved state-of-the-art performance in cell segmentation.3–11 Notably, generalist frameworks like Cellpose12,13 and its variants (e.g., Omnipose14,15 for complex bacterial shapes) can segment a wide range of cell types without retraining, and specialized methods like StarDist excel at nuclei segmentation by representing objects as star-convex polygons.16–19 With such a diverse toolkit of algorithms available, a new challenge arises20,21: how to determine which segmentation method works best for a given dataset or experimental context, in a fast, user-friendly, and easily repeatable manner. Traditionally, researchers compare algorithms using quantitative metrics, such as Intersection-over-Union (IoU) or Dice scores, against a ground-truth segmentation.3,22–28 However, this approach requires ground truth, which is often difficult and time-consuming to obtain in practice.20 Different tools may also expect ground truth in different formats, adding further complexity. As a result, despite the lack of formal rigor, many researchers rely on visual inspection of segmentation results to assess quality. In everyday practice, this “looking at the images” approach becomes the decisive step, as it directly reflects biological plausibility. BISCUIT29 is designed precisely around this idea: instead of requiring ground truth, it enables side-by-side visual comparison of outputs from multiple segmentation methods.

Currently, there is a lack of user-friendly tools for directly comparing multiple segmentation algorithms on the same images in a visual, interactive manner. Researchers often need to run each algorithm separately and manually overlay or juxtapose results, which is time-consuming and requires technical scripting skills. To address this gap, we present BISCUIT, the BioImage Segmentation Comparison Utility and Interactive Tool.29 BISCUIT is an open-source platform for visually comparing state-of-the-art segmentation models on microscopy images. It was designed with bioimage analysts and life scientists in mind, providing an intuitive way to evaluate segmentation quality across different algorithms without deep expertise in each algorithm’s code or parameters. By facilitating side-by-side visualization of segmentation outputs, BISCUIT enables users to leverage their domain knowledge and visual intuition when selecting a model, rather than relying solely on summary statistics.

By enabling rapid, visual benchmarking, BISCUIT complements existing evaluation methods and lowers the barrier for scientists to adopt the most suitable segmentation tools.

Methods

Implementation

BISCUIT is implemented as an interactive Jupyter Notebook pipeline written in Python. The core functionality of BISCUIT centers on running multiple segmentation algorithms on the same input images and aggregating their outputs for side-by-side visualization. To achieve this, BISCUIT interfaces with open-source segmentation libraries and pretrained models. In the current version, three state-of-the-art families of cell segmentation models are integrated by default: Cellpose, Omnipose, and StarDist. These models were chosen because they represent widely used and complementary approaches to segmentation: Cellpose for general-purpose cell and nucleus segmentation,12 Omnipose as an extension14 of Cellpose tailored to handle challenging morphologies like elongated or branched cells,15 and StarDist for precise nuclear segmentation using shape priors.16 In addition, several models from the BioImage Model Zoo, a community repository of pretrained models for bioimage analysis,30 have been included, further broadening the range of available approaches. BISCUIT’s modular architecture enables the addition of new models, so the platform can evolve as new methods become available.

Under the hood, BISCUIT applies each selected model to the input dataset and collects the results. Users provide microscopy images (e.g., TIFF, PNG, or any other common format supported by the Pillow Python library), which are processed through each model’s Python API with the pre-trained weights. For example, Cellpose and StarDist are invoked via their respective Python libraries to segment images. Most of the computations can leverage GPU acceleration, keeping prediction times short even when multiple models are applied to multiple images.
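To make this concrete, the following is a minimal sketch, not taken from the BISCUIT code base, of how two of the integrated models can be invoked on a single-channel image through their public Python APIs; the file name is hypothetical, and the exact class names, pretrained model identifiers, and arguments vary between library versions.

```python
# Minimal sketch of applying two pretrained models to one image and collecting
# the label masks in a dictionary, similar in spirit to how BISCUIT processes
# each selected model. Exact class names and arguments differ between
# Cellpose/StarDist versions.
from skimage.io import imread
from csbdeep.utils import normalize
from cellpose import models as cp_models
from stardist.models import StarDist2D

img = imread("example_image.tif")  # hypothetical input file

# Cellpose: generalist cytoplasm model with pretrained weights
cp = cp_models.Cellpose(gpu=True, model_type="cyto")
cp_masks, flows, styles, diams = cp.eval(img, diameter=None, channels=[0, 0])

# StarDist: pretrained 2D fluorescence nuclei model
sd = StarDist2D.from_pretrained("2D_versatile_fluo")
sd_labels, _ = sd.predict_instances(normalize(img))

# Collect outputs per model for downstream visualization and comparison
results = {"cellpose_cyto": cp_masks, "stardist_2D_versatile_fluo": sd_labels}
```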

After segmentation, BISCUIT focuses on the visualization and comparison of results. Segmentation outputs from each model are typically instance labels, binary masks, or probability maps. BISCUIT renders these outputs in an interactive manner, for example by overlaying colored segmentation masks on the original images, or by showing side-by-side panels (original image next to the segmentation result from each model). The interactive notebook allows users to scroll through image sets. This design was informed by the principle of “visual first” evaluation, prioritizing clear visualization of segmentation boundaries, differences in object counts, and other qualitative features across models.
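As a simple illustration of this side-by-side rendering, the sketch below assumes a raw image and a dictionary mapping model names to instance-label masks (such as the `results` dictionary from the previous snippet) and overlays each model's labels on the original image with matplotlib and scikit-image; BISCUIT's interactive widgets for scrolling through image sets are not reproduced here.

```python
# Minimal sketch of a side-by-side comparison panel: raw image next to a
# colored label overlay (with object outlines) for each model.
import matplotlib.pyplot as plt
from skimage.color import label2rgb
from skimage.segmentation import find_boundaries

def show_comparison(img, results):
    """Display the raw image next to a colored overlay for each model."""
    n = len(results)
    fig, axes = plt.subplots(1, n + 1, figsize=(4 * (n + 1), 4))
    axes[0].imshow(img, cmap="gray")
    axes[0].set_title("raw image")
    for ax, (name, labels) in zip(axes[1:], results.items()):
        ax.imshow(label2rgb(labels, image=img, bg_label=0, alpha=0.3))
        ax.contour(find_boundaries(labels), colors="yellow", linewidths=0.5)
        ax.set_title(name)
    for ax in axes:
        ax.axis("off")
    plt.tight_layout()
    plt.show()

# Example usage, assuming `img` and `results` from the previous sketch:
# show_comparison(img, results)
```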

From a technical standpoint, BISCUIT requires a Python 3 environment with several key libraries installed. These include the deep learning frameworks and model-specific dependencies (e.g., TensorFlow or PyTorch for StarDist and Cellpose, respectively, as well as the Cellpose/Omnipose and StarDist packages themselves), and common image processing libraries such as NumPy, OpenCV, and scikit-image. The Jupyter notebook environment also uses matplotlib for plotting and image display. We have provided an environment configuration (e.g., a requirements.txt and Conda environment file in the repository) to ensure that users can install all necessary packages. Because deep learning models are computationally intensive, a machine with a modern GPU and sufficient memory is recommended for local execution of BISCUIT, especially on large collections of images.
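As an illustration, the snippet below (not part of the BISCUIT notebooks themselves) shows one way to sanity-check that the key dependencies listed above are importable and that a GPU is visible to PyTorch before launching an analysis; package names follow their usual import names, and the exact pinned versions in the repository's environment files may differ.

```python
# Hedged sketch: verify that the main dependencies described above are
# installed and that GPU acceleration is available. The package list is based
# on the dependencies named in the text, not on BISCUIT's requirements.txt.
import importlib

required = ["numpy", "cv2", "skimage", "matplotlib",
            "torch", "tensorflow", "cellpose", "stardist"]

for name in required:
    try:
        mod = importlib.import_module(name)
        version = getattr(mod, "__version__", "unknown version")
        print(f"{name:12s} OK ({version})")
    except ImportError:
        print(f"{name:12s} MISSING - install it before running the notebook")

# GPU check (recommended for large image collections)
try:
    import torch
    print("CUDA available for PyTorch:", torch.cuda.is_available())
except ImportError:
    pass
```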

Operation

The operation of BISCUIT is designed to be straightforward for end-users, requiring minimal software installation or configuration. We offer two primary modes of use:

  • 1. Zero-Installation via Browser

Users can run BISCUIT directly in their web browser using Google Colab. A one-click link (Run BISCUIT Now!) is provided on the project website,29 which opens the BISCUIT Google Colab Notebook. In this mode, all necessary dependencies and model weights are automatically fetched within the Colab environment. No local installation is needed, and the user is only required to have a Google account and an internet connection. Once the notebook is open, the BISCUIT interface guides the user through each step. The workflow in the notebook is as follows:

  • Setup: The notebook will first install required libraries (such as the segmentation model packages) in the Colab session.

  • Input Data: Example microscopy images are provided within the notebook, allowing users to get started immediately. Users can upload their own images (Colab provides an upload widget). In subsequent steps, users specify the channel to be analyzed and define the region of interest within the images.

  • Model Selection: The notebook interface allows users to select segmentation models from a searchable table (see Figure 1) that provides details such as model family, architecture, version, target, modality, dimensionality, training data, strengths/limitations, expected channels, and documentation links. Users can currently choose from 11 available models.

  • Running Segmentation: After selection, the user executes the notebook cell to run the segmentation. BISCUIT will process the images with each selected model sequentially and store the results.

  • Visualization of Results: Once processing is complete, BISCUIT provides an interactive interface to inspect and compare model outputs (see Figure 2). For each selected image, the notebook displays a panel that includes the raw image, an overlap map highlighting agreements and disagreements between models, individual instance masks, and outline overlays on the raw data. Models are compared in pairs, and a bar plot summarizes per-model disagreement scores (mean ± SD), offering a quantitative complement to the visual inspection. The mean disagreement score for a given model is the average, over all analysed images and all pairings with the other models, of the pixel-based differences between the two segmentations (a minimal code sketch of this computation is given at the end of this section). Assuming that model prediction inaccuracies are uncorrelated between models, the model with the lowest score yields predictions closest to the ground truth.31–35 Users can switch between models and images using dropdown menus and sliders, enabling fast, side-by-side evaluation of segmentation performance.

  • Segmentation output: Based on the comparison plots and visual inspection, the user selects the best-performing model from an interactive list and then applies it to segment the entire image stack. Users may also upload additional files for processing, and the resulting segmented images are saved for downstream analysis or storage.

Figure 1. Model selection interface in BISCUIT.

The searchable table enables users to filter models by various parameters (e.g., target, modality, or model family) and find the segmentation tools best suited to their data. Once selected, models can be applied directly within the notebook environment.

Figure 2. Example of pairwise comparison of segmentation models in BISCUIT.

For a selected image, the interface displays the raw input, an overlap map indicating agreements and disagreements between two models, individual instance segmentations, and outline overlays on the raw data. A summary bar plot further shows mean semantic disagreement (±SD) across all models, enabling both qualitative and quantitative comparison. This figure compares Model 1 (nuclei, left; lowest mean semantic difference) with Model 2 (worm_omni, right; highest mean semantic difference).

  • 2. Local or HPC Installation

For users requiring more control or aiming to run large-scale analyses, BISCUIT can be installed on local machines or HPC clusters. The software is open-source and available in a GitHub repository, which includes documentation for installation. Installing BISCUIT involves setting up the Python environment with the required libraries and downloading the pre-trained weights for the segmentation models (the repository provides instructions to fetch these assets). The minimal system requirements for running BISCUIT locally include a Python 3.8+ environment, approximately 4–8 GB of RAM (depending on image sizes), and a CUDA-compatible GPU with at least 16 GB of GPU memory to accelerate model inference. The package dependencies include the main deep learning frameworks (TensorFlow 2.x for StarDist and PyTorch for Cellpose/Omnipose).

When running locally, users can either launch the Jupyter Notebook interface or integrate BISCUIT’s components into their own pipelines. For instance, an imaging core facility might deploy BISCUIT on a server and run the notebook for various user projects, possibly connecting to a web interface for uploading images.

The workflow overview remains similar in the local scenario: load images, run the selected models, and then review the outputs. On an HPC cluster, one might use JupyterLab to provide the same notebook experience to users. A key feature of BISCUIT is its scalability, following a ‘prototype then scale’ approach. Users can rapidly test models on a few images in the browser and then move to an HPC deployment to process hundreds or thousands. Because the same model versions are used across environments, results remain consistent, enabling researchers to progress seamlessly from exploration to full dataset analysis without switching tools.
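The per-model disagreement score introduced in the Visualization of Results step can be summarized with a short sketch; the implementation below follows the verbal description in this article (pixel-wise semantic differences averaged over all model pairings and analysed images) rather than BISCUIT's actual code.

```python
# Sketch of the per-model mean disagreement score (mean ± SD), following the
# description above: for each model, the pixel-based (semantic) difference to
# every other model is averaged over all analysed images.
import numpy as np

def semantic_difference(labels_a, labels_b):
    """Fraction of pixels where the binarized foreground masks disagree."""
    return np.mean((labels_a > 0) != (labels_b > 0))

def disagreement_scores(segmentations):
    """segmentations: dict mapping model name -> list of label images,
    with all lists ordered by the same analysed images."""
    models = list(segmentations)
    scores = {}
    for m in models:
        diffs = [
            semantic_difference(a, b)
            for other in models if other != m
            for a, b in zip(segmentations[m], segmentations[other])
        ]
        scores[m] = (np.mean(diffs), np.std(diffs))  # mean ± SD per model
    return scores
```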

Conclusions/Discussion

We have introduced BISCUIT, an open-source interactive platform to compare and evaluate bioimage segmentation models visually, and illustrated how it can assist researchers in selecting the most appropriate segmentation model for their bioimage analysis needs. In doing so, we address a critical gap in the bioimage analysis workflow: the ability to benchmark segmentation algorithms based on qualitative output characteristics and biological plausibility, not only numerical performance metrics.

Side-by-side visualization of segmentation results can reveal strengths and weaknesses of algorithms that aggregate metrics might hide. BISCUIT puts the expert “in the loop” by enabling direct visual inspection, thus empowering users to apply their biological knowledge when evaluating models. This approach aligns with the way many image scientists inherently validate results - by looking at overlays and pictures - and BISCUIT formalizes and streamlines that process.

Benefits and Unique Features: The advantages of BISCUIT can be summarized in three main points, echoing the design principles outlined on the project’s website: Zero Setup, Scalable by Design, and Visual-First. Zero Setup refers to the ease of use via a web browser with no installation, which lowers the entry barrier for non-technical users. Scalable by Design means that BISCUIT can be run on modest datasets in the cloud or scaled to large datasets on HPC, providing a continuum from quick testing to large-scale application. Visual-First emphasizes the focus on qualitative, image-level assessment, which is the core of what BISCUIT offers. To our knowledge, BISCUIT is one of the first tools specifically catering to interactive model output comparison in the context of bioimage segmentation. While some existing software, such as the image analysis platforms Ilastik,36 Napari,37 and Fiji,38 allows running multiple algorithms or plugins on images, these tools often do not provide an integrated side-by-side comparison workflow or require substantial user setup. BISCUIT’s contribution is in unifying multiple segmentation approaches under one roof.

Limitations: Despite its utility, BISCUIT has some limitations that we aim to address in future work. First, the platform currently supports a defined set of models (11 in total). If users need to compare other algorithms (for example, Ilastik classical segmentation, or proprietary software outputs), they may need to invest some effort to integrate those into the BISCUIT framework. Second, complementary scores could be implemented on top of the mean model disagreement: for instance, reporting standard metric scores for each model when ground truth is available or, when it is not, asking the user to flag the preferred segmentation in a subset of images and tallying a “preference count”.
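Such a ground-truth-based complementary score could be as simple as a semantic Intersection-over-Union per model; the snippet below is a minimal sketch of this idea, not an existing BISCUIT feature.

```python
# Sketch of a complementary score that could be reported when ground truth is
# available: semantic IoU between a model's mask and the reference mask.
# Illustrative only; not part of the current BISCUIT code base.
import numpy as np

def semantic_iou(pred_labels, gt_labels):
    """Intersection-over-Union of the binarized foreground masks."""
    pred = pred_labels > 0
    gt = gt_labels > 0
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return np.logical_and(pred, gt).sum() / union
```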

The development of BISCUIT also opens up community-driven possibilities. For example, the platform can be extended to compare other classes of image-processing models, such as object detection, denoising, or image classification models. Numerous models are being developed in each of these classes, and visual inspection can bring benefits similar to those demonstrated here for segmentation.

Another exciting direction is using BISCUIT in educational settings: for teaching microscopy image analysis, instructors could use BISCUIT to demonstrate how different algorithms behave on the same data, helping students visually grasp concepts like under- vs over-segmentation, false positives vs false negatives, etc.

In conclusion, BISCUIT addresses an important need in the era of diverse AI-driven image analysis methods: it helps bridge the gap between algorithm developers and end-users by providing a simple yet powerful means for straightforward comparison of segmentation approaches. We believe this approach will contribute to more reliable and reproducible image analyses, as the human expert remains engaged in the validation loop rather than deferring entirely to automated metrics. As bioimage informatics advances, tools like BISCUIT will be essential for helping researchers leverage computational methods to extract accurate biological insights.

Software availability

Declaration of AI-assisted writing

During the preparation of this manuscript, the authors used ChatGPT (GPT-5, OpenAI, 2025) to assist in improving phrasing, grammar, and clarity, as well as for help with summarizing/shortening and rewording text sections. All scientific content, interpretations, and conclusions were written, reviewed, and approved by the authors, who take full responsibility for the final manuscript.
