Keywords
bioinformatics; open source; open science; open data; machine learning
This article is included in the Bioinformatics Open Source Conference (BOSC) collection.
The 25th annual Bioinformatics Open Source Conference (BOSC 2024, open-bio.org/events/bosc-2024) was part of the 2024 conference on Intelligent Systems for Molecular Biology (ISMB 2024). Launched in 2000 and held yearly since, BOSC is the premier meeting covering open-source bioinformatics and open science.
ISMB 2024 was held in Montréal, Canada, with an online participation option. A total of nearly 2000 people attended; about 200 people participated in BOSC sessions.
Over the course of two days, BOSC covered a wide range of topics in open science and open source bioinformatics, including Data Analysis, Open Data, Visualization, Developer Tools and Libraries, Standards and Frameworks for Open Science, and Open AI/ML. Mélanie Courtot delivered an impactful first keynote with a perspective on how “The Data Shows We Need Better Data”. The second keynote speaker, Andrew Su, discussed “Open Data, Knowledge Graphs, and Large Language Models.” BOSC ended with a panel, “Open Source AI/ML: A Game Changer for Bioinformatics?,” in which Lawrence Hunter and Thomas Hervé Mboa Nkoudou joined BOSC’s keynote speakers as panelists.
Immediately following BOSC, the CollaborationFest was held at Montréal’s University of Québec campus. First launched in 2010, CoFest is a collaborative work event held yearly around BOSC. This year’s CoFest included 42 participants who worked together on 10 projects.
The 25th annual Bioinformatics Open Source Conference, BOSC 2024 (open-bio.org/events/bosc-2024), took place in Montréal, Canada as part of the 2024 conference on Intelligent Systems for Molecular Biology (ISMB 2024). A total of nearly 2000 people attended ISMB 2024; about 200 people participated in BOSC sessions (Figure 1). Since 2000, BOSC has consistently championed the interests of the Open Bioinformatics Foundation (OBF) – its parent organization – in the promotion of open-source software and Open Science practices, and it has been part of ISMB on all but two occasions. Our website has a timeline of all the BOSCs: https://www.open-bio.org/events/about/.
BOSC is entirely volunteer-run. Nomi Harris, a program manager at the Lawrence Berkeley National Laboratory, has led BOSC since 2012. The organizing committee (Figure 2) and a dedicated review panel of 25 experts ensured a rigorous abstract selection process, with each submission receiving three independent reviews. Notably, BOSC’s commitment to transparency extends to its review process and rubric, documented and publicly available since 2020.
The conference began with a welcome address from chair Nomi Harris (Figure 3), followed by an introduction to the Open Bioinformatics Foundation by OBF Treasurer Heather Wiencko (Figure 4). She described some of OBF’s undertakings, including the OBF Event Fellowship, which offers funds to help promote diverse participation at events related to open-source bioinformatics software development and open science in the biological research community. The opening session also featured CollaborationFest updates (see section below) and sponsor videos from Seqera and the NIH Office of Data Science Strategy (ODSS).
BOSC 2024 featured two thought-provoking keynote talks. Mélanie Courtot’s talk (Figure 5), “The Data Shows We Need Better Data,” began with a brief overview of her career, with the underlying message that changes bring opportunity, and emphasized the importance of data quality and open standards for tackling global challenges. Dr. Courtot noted that there is an extensive ecosystem of open data, open standards, and open source software that researchers can leverage to help free more time to focus on the interesting science. She introduced the TRUE principles (Tracked, Reasonable, Understandable, Ethical) as crucial aspects of preparing data for AI applications, building upon the established FAIR (Findable/Accessible/Interoperable/Reusable) principles. More details and slides from Dr. Courtot’s keynote at BOSC are available on her lab website.
Andrew Su’s keynote (Figure 6), “Open Data, Knowledge Graphs, and Large Language Models,” asked us, “Have LLMs obviated the need for structured knowledge?” (Spoiler alert: No!). Dr. Su discussed strategies to mitigate hallucinations in LLMs using Retrieval-Augmented Generation (RAG) and tool augmentation, as well as benchmarks for evaluating AI-generated answers and explanations. Dr. Su then engaged the audience (both in person and virtual) in several interactive, future-thinking exercises to explore our community’s perspectives on LLM integration within biomedical informatics. The results, presented in the form of word clouds and polls, revealed strong feelings reflecting both excitement and concern.
In addition to the two keynotes, BOSC 2024 showcased a range of topics in open source bioinformatics and open science in 36 talks chosen from submitted abstracts (Figure 7) and 23 posters presented in person or online (Figure 8). The complete lists of talks and posters can be found at www.open-bio.org/events/bosc-2024/bosc-2024-schedule.
Beatrice was one of 23 poster presenters at BOSC 2024.
This year’s session topics were as follows: Data Analysis, Open Data, Visualization, Developer Tools and Libraries, Standards and Frameworks for Open Science, and AI/ML.
The first session of BOSC 2024, Data Analysis, covered a range of biological topics and domains (antimicrobial resistance, nontraditional model organisms), data types (gene expression, clinical data), and techniques (3D and functional genomics, RNA-seq alignment), with the unifying theme of open source approaches. The Open Data session focused on data portals, platforms, and databases. The Visualization session (which we may rename next year to be more inclusive of those with visual challenges) included talks about the latest additions to decades-old genome browsers (JBrowse and Integrated Genome Browser) as well as a newer tool that focuses on 3D structures. The last session on the first day, Developer Tools and Libraries, featured an array of lightning talks ranging from Python libraries to a GitHub app to help make software FAIR.
Day 2 started with a session on Standards and Frameworks for Open Science. The talks focused on software and approaches to supporting reproducible, reusable, sustainable software. The next session, Open Approaches to AI/ML, included two talks about projects applying ML to problems in biology, and one “meta” talk about how ML results are reported. This session was followed immediately by the panel about Open Source AI/ML (described in the next section).
Our panel discussion, “Open Source AI/ML: A Game Changer for Bioinformatics?”, offered an open-source spin on a topic that appeared prominently in many tracks at ISMB: artificial intelligence (AI) and machine learning (ML). Featuring prominent bioinformatics researchers Lawrence Hunter, Thomas Hervé Mboa Nkoudou, Mélanie Courtot, and Andrew Su, and moderated by Monica Munoz-Torres, the panel explored the benefits and challenges of open-source AI/ML tools (Figure 9). Mélanie Courtot, who focuses on translational informatics, is applying AI/ML methods to gain new insights from shared data to help improve health for all. Andrew Su’s lab builds and applies bioinformatics infrastructure for biomedical discovery, with a recent focus on constructing and mining knowledge graphs. Thomas Mboa is a social scientist focused on responsible AI, open science, and knowledge transfer. Lawrence Hunter is widely recognized as one of the founders of bioinformatics and one of the pioneers in applying AI and ML to problems in biology, such as predicting molecular function.
The panelists started by briefly presenting their thoughts on the benefits and challenges of open source AI/ML, and then answered questions posed by the audience. On the subject of whether we should switch to using open models, one panelist answered emphatically, “Absolutely; it’s difficult to investigate sources of bias in the training data if you can’t see the data. Not only do we not know what’s in closed models, but we can be pretty sure they’re not acting in our best interests as researchers since the payoff for commercial AI is going to involve advertising (for example, advertisers pay to have their content used in the training data), which may degrade their potential in science.” There was also a spirited discussion about data privacy vs. openness; at least one panelist pointed out that we don’t yet have a clear understanding of the benefit vs. harm that may be incurred by sharing data such as personal medical information with an LLM. Another panelist noted that bioinformaticians who were surveyed varied widely in their level of comfort with sharing their health information that way. Moderator Monica Munoz-Torres closed by asking the panelists, “Do you think open-source models are a game changer for bioinformatics?” Responses were divided, but most of the panelists agreed that, yes, open source language models are going to be important going forward. A recording of the panel is available on our YouTube channel: https://youtu.be/jNgOhlDi-BI.
Immediately following BOSC, the two-day CollaborationFest, hosted by the Université du Québec À Montréal, under the sponsorship of Abdoulaye Diallo’s Laboratoire de Bioinformatique, provided a collaborative space for developers and researchers to tackle real-world bioinformatics challenges. Organized by Hervé Ménager, Tazro Ohta, Christopher Fields, and Peter Cock, the event attracted 42 participants (Figure 10), including 15 online attendees, representing a diverse array of geographic regions across Africa, Asia, Europe, and the Americas, with a notable number of participants from Canada.
A wide range of projects were undertaken, addressing topics such as workflows, data processing, databases, web development, visualization, genome browsers, and protein structures. A summary of the ten CoFest projects is provided in Table 1. Many of these projects emerged spontaneously from onsite discussions and interactions among participants, with notable collaborations including those on Common Workflow Language and Nextflow, Common Workflow Language and codefair, and JBrowse2 and Tripal.
Project title | Keywords | Links |
---|---|---|
CWL v1.3 | Workflows | https://github.com/common-workflow-language/cwl-v1.3/, https://github.com/common-workflow-language/cwltool/ |
Taking over the world with data frames | Data processing | |
JBrowse2 Tripal Integration & Embedding | Database, Genome Browser | https://github.com/tripal/tripal_jbrowse |
Project: Workflow Benchmarking | Workflows | http://workflows.community/groups/benchmarking |
Combining JBrowse2 and iCn3D | Visualization, Structure, Genome Browser | https://github.com/ncbi/icn3, https://github.com/GMOD/jbrowse-components |
Displaying Protein-Ligand Interactions in iCn3D | Visualization, Structure, Chemoinformatics | https://github.com/ncbi/icn3d |
End-to-End-Open Biomedical AI | Large Language Models | |
Describing/Scanning workflows using LLMs | Large Language Models | https://github.com/david4096/llama-cli-cwl |
Tataki and the nightmare of file formats | Data processing | https://github.com/sapporo-wes/tataki |
Codefair.io | FAIR, best practices | https://codefair.io/ |
Although the BOSC meeting happens only once a year, its impact extends throughout the year. BOSC is a “Community of Special Interest” (COSI) under the umbrella of the International Society for Computational Biology (ISCB, which organizes ISMB), a status that fosters ongoing engagement. OBF and BOSC maintain active social media channels (Slack, LinkedIn, and Mastodon) to connect with the community year-round. (OBF and BOSC transitioned away from X/Twitter in 2023 because it had become inconsistent with our values.)
BOSC actively participates in ISCBacademy, a free webinar series organized by the ISCB. Over the past year, we sponsored webinars (Figure 11) by Sierra Moxon, who presented on LinkML (an open data modeling framework, grounded with ontologies), and Gemma Turon, who presented Ersilia (open source AI/ML for (antimicrobial) drug discovery). Recordings of these talks are available on our YouTube channel, https://www.youtube.com/@OBFBOSC/videos.
The BOSC organizing committee, which consists entirely of volunteers, devotes substantial effort and resources to making BOSC more diverse and accessible (learn more on our DEI page). The BOSC abstract submission form allows authors to discreetly request financial assistance for conference registration fees. These requests are not considered during the review process. This initiative, funded by generous sponsors (see https://www.open-bio.org/events/bosc-2024/bosc-2024-sponsors/ and the Acknowledgements below), has enabled us to offer free ISMB registration to dozens of BOSC participants, primarily from underrepresented groups in the field. In 2024 alone, thanks to our sponsors, we provided free registration to 14 attendees, 13 of whom belong to underrepresented groups in our community.
We continuously strive for inclusivity and transparency in our keynote speaker selection process. In the first phase of the process, we invite as broad a community as possible to nominate potential speakers. Shortlisted candidates are then evaluated using our established Invited Speaker Rubric, followed by a call for community feedback regarding any potential concerns. After reviewing the feedback, we make the final selection, ensuring that our keynote speakers represent a diverse spectrum of perspectives and backgrounds.
We are always pleased to hear feedback from BOSC participants about how our efforts made them feel included. This year, UCLA undergraduate Bernice Mihalache, who was one of the 14 people granted free registration, wrote a blog post entitled “My Amazing BOSC 2024 Experience” in which she commented, “My research lab at UCLA allocates conference participation funds only for graduate students, therefore I also included a request for fee waiver with my conference submission. I was excited when a few weeks later I received notification that not only I was accepted as a poster presenter, but that I also got free registration to both BOSC and to the entire ISMB (Intelligent Systems for Molecular Biology) conference!” Additionally, a BOSC 2024 participant with visual impairments described BOSC as the most inclusive scientific conference they had attended.
All photos in this report are shared under a CC-BY-SA license. All identifiable subjects in the photos were contacted, and they consented to have their photos used in this report.
We are grateful to all of those who helped make BOSC 2024 a success: the organizers, reviewers, volunteers, presenters, attendees, and the generous sponsors who enabled us to offset some meeting expenses and provide free registration to 14 participants. We had two Platinum sponsors this year: Seqera and the Chan-Zuckerberg Initiative. The NIH Office of Data Science Strategy (ODSS) returned as a Gold sponsor, as did Silver sponsor GigaScience.
We thank ISCB staff, particularly Seth Mulholland and Bel Hanson, for helping ISMB 2024 run smoothly.
We are grateful to our CoFest sponsor, Abdoulaye Diallo’s Laboratoire de Bioinformatique, and host, Université du Québec À Montréal, as well as Karen Reynard for organizing the hosting of the event.
This article is an Editorial and has not been subject to external peer review.