Opinion Article

Hackathon-driven tutorial development

[version 1; peer review: 1 approved, 1 approved with reservations]
PUBLISHED 24 Dec 2018

This article is included in the Hackathons collection.

Abstract

Software is essential for data science. However, several software tools remain out of reach for many users due to a lack of documentation, thus limiting progress in the field. Tutorial development by authors and users can greatly improve a tool's accessibility and accelerate its adoption. In this article, we explore hackathons such as hackseq as a venue for authors and users to develop tutorials to address the lack of documented software. We describe four advantages of hackathon-driven tutorial development as well as three challenges that we faced. We also discuss our experience with remote participation. In short, if properly prepared, hackathons can provide a productive venue for assembling a group of passionate people, including remote participants, to develop a suite of related tutorials and address the growing need for accessible software.

Keywords

software, hackathon, documentation, tutorial, vignettes, programming, data science

Data science is an interdisciplinary field that relies heavily on software tools. These tools require advanced domain-specific knowledge, which is often difficult to acquire and keep up to date given the rate at which new methods become available and best practices evolve. This difficulty primarily stems from incomplete or unclear documentation. For software packages (e.g. in R and Python), minimal documentation consists of describing the inputs and outputs of individual functions. The Comprehensive R Archive Network (CRAN) is the de facto package repository for the R programming language and requires that submitted packages include at least this degree of documentation. CRAN also offers a framework for package developers to include additional documentation in the form of vignettes, which typically demonstrate real-world use cases. However, it appears that the majority of CRAN packages do not have vignettes [1]. On the other hand, vignettes are required for submission to Bioconductor, a package repository geared towards computational biology. This requirement has made Bioconductor and its numerous packages much more accessible to both new and experienced users. Unfortunately, the practice of including user-friendly vignettes or tutorials remains uncommon. To address this problem, we experimented with tutorial development in a hackathon project at hackseq 2017 [2]. We describe our experience in this article.

The aim of our project was to organize the collective knowledge of a group of computational biologists into modular tutorials that leveraged the same dataset. Tutorial topics were proposed by and then assigned to members of the team. The tutorials were designed to be independent from each other, but they can readily be combined to form workshop lessons that use the same dataset. In this paper, we explore the benefits and challenges associated with hackathon-driven tutorial development, including the trade-offs of remote hackathon participation. Briefly, we believe that hackathons are an excellent venue for tutorial development and are particularly suitable for remote participation.

We found at least four benefits of collaboratively developing tutorials in a hackathon setting. First, interest in a given topic can be assessed based on the voluntary participation of hackathon attendees. The assembly of people interested in a topic can further motivate tutorial development. Second, once a common dataset is selected and processed, team members can efficiently work in parallel. Third, although team members do not have to rely on one another, they may draw on the collective knowledge of the team. The various perspectives and ideas from different research specialities can guide tutorial design, resulting in higher-quality material. In more practical terms, developing tutorials during a hackathon allows problems to be resolved more readily, and team members can perform peer review, leading to more polished tutorials. Fourth, these hackathon projects often bring together community members who have not yet worked together and can thus catalyze new collaborations.

That said, there are some challenges or considerations to keep in mind when developing tutorials in the context of hackathons. First, the hackathon project should feature a theme or topic that is focused in scope so that team members can assist one another. The skill level of the target audience should also be determined beforehand. Second, once a theme is decided, any existing tutorials should be identified beforehand to avoid repetition. As mentioned previously, vignettes exist for many software tools, and time is best invested in developing new material that is not yet available. Alternatively, one could build on the work of others by adapting or improving existing open-source tutorials. Third, we suggest that you identify a dataset that can be used in all proposed tutorials and meets the following criteria: openly accessible; properly formatted (i.e. little to no missing or malformed data); relevant to the target audience; and ideally sized (i.e. large enough to be interesting but small enough to fit on a personal computer). For example, in the case of tutorials geared towards computational biologists, there are several interesting human genomic datasets, but access is often restricted for privacy. Accordingly, it may be more practical to select a dataset first and then determine a theme and set of tutorial topics that can be developed using this dataset.
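Two of the dataset criteria above (properly formatted and ideally sized) lend themselves to a quick automated check before the hackathon. The following is a minimal sketch, assuming the candidate dataset is a CSV table. The function names, the missing-value tokens, and both thresholds are hypothetical choices for illustration and are not taken from our project. Open accessibility and relevance to the target audience still need to be judged by the team.

```python
import csv
import io

# Hypothetical thresholds; the article gives no numeric cut-offs.
MAX_MISSING_FRACTION = 0.05       # "little to no missing or malformed data"
MAX_SIZE_BYTES = 2 * 1024**3      # "fits on a personal computer", assumed ~2 GiB

# Tokens treated as missing values (an assumption; adjust per dataset).
MISSING_TOKENS = {"", "NA", "NaN", "null"}


def missing_fraction(csv_text: str) -> float:
    """Return the fraction of data cells (header excluded) that look missing."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return 0.0
    header, data = rows[0], rows[1:]
    total = len(header) * len(data)
    if total == 0:
        return 0.0
    missing = 0
    for row in data:
        # Pad short rows so that absent trailing cells count as missing.
        padded = row + [""] * (len(header) - len(row))
        missing += sum(
            cell.strip() in MISSING_TOKENS for cell in padded[: len(header)]
        )
    return missing / total


def screen_dataset(csv_text: str, size_bytes: int) -> dict:
    """Summarize whether a candidate dataset passes the two automated checks."""
    frac = missing_fraction(csv_text)
    return {
        "missing_fraction": frac,
        "well_formatted": frac <= MAX_MISSING_FRACTION,
        "fits_locally": size_bytes <= MAX_SIZE_BYTES,
    }


if __name__ == "__main__":
    example = "sample,gene,count\ns1,TP53,10\ns2,,NA\n"
    print(screen_dataset(example, size_bytes=len(example.encode())))
```

A screen like this only flags obvious problems. A dataset that passes should still be inspected by hand before tutorial topics are assigned.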

It is often cost-prohibitive to travel for short conferences, especially when travel awards do not cover non-traditional meetings such as hackathons. Fortunately, remote participation is not only possible for hackathons, but relatively straightforward. Several tools exist to support collaborative projects while eliminating the need for co-location. For instance, GitHub offers decentralized code sharing, Skype enables face-to-face team discussions, and Slack is a popular platform for asynchronous communication, which is essential when team members live in distant time zones. For example, we successfully managed our project even though the local team was in Vancouver, BC and our remote participants were in China and France. We argue that remote participation is especially straightforward for tutorial development because material can be developed independently and thus asynchronously. On the other hand, virtual attendance precludes any participation in social or networking events. There is also additional work involved in ensuring that every local and remote team member has assigned tasks at any given time. Overall though, we believe that allowing remote participation is a net benefit for a hackathon project, especially for tutorial development.

In conclusion, our experience with developing tutorials at a hackathon with remote participants was positive. We believe that we were able to achieve more together than separately, mostly because we gained access to immediate peer review of our tutorials. We also learned lessons that will help ensure success for future hackathon projects geared toward tutorial development. We believe this approach to be generalizable to other fields and a model for assembling passionate data scientists with similar interests and organizing their collective knowledge into modular tutorials. In turn, these tutorials can greatly benefit the field of data science by facilitating the adoption of powerful tools and accelerating the training of future data scientists.

Data availability

No data are associated with this article.

How to cite this article
Grande BM, Baghela A, Cavalla A et al. Hackathon-driven tutorial development [version 1; peer review: 1 approved, 1 approved with reservations]. F1000Research 2018, 7:1974 (https://doi.org/10.12688/f1000research.16959.1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Key to Reviewer Statuses
Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested.
Approved with reservations: A number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.
Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions.
Reviewer Report 04 Feb 2019
Brad A. Chapman, Bioinformatics Core, Harvard T.H. Chan School of Public Health, Boston, MA, USA 
Approved
The authors describe their experience with tutorial development in a collaborative event. These events bring together volunteer community participants, either locally or remotely, for a defined period of time to accomplish specific tasks. In this case, the motivation is to ...
HOW TO CITE THIS REPORT
Chapman BA. Reviewer Report For: Hackathon-driven tutorial development [version 1; peer review: 1 approved, 1 approved with reservations]. F1000Research 2018, 7:1974 (https://doi.org/10.5256/f1000research.18543.r43146)
Reviewer Report 11 Jan 2019
Ming Tang, Harvard FAS informatics (Faculty of Arts and Sciences), Harvard University, Cambridge, MA, USA 
Approved with Reservations
Software is the driving force for data science. However, lacking documentation hinders the adoption and usage of certain software tools. In this article, Bruno et al. described the advantages and challenges of hackathon-based tutorial development to promote software usage. Additionally, the ...
HOW TO CITE THIS REPORT
Tang M. Reviewer Report For: Hackathon-driven tutorial development [version 1; peer review: 1 approved, 1 approved with reservations]. F1000Research 2018, 7:1974 (https://doi.org/10.5256/f1000research.18543.r42213)